Dependently Typed Dates

2022-09-16

After so many years of programming without dependent types, I honestly wasn't sure I'd ever reach a point where I'd stop, look at my code and think: "Oh yeah, I could use dependent types here." But that moment came sooner than I expected, when I realized I could write a correct-by-construction type for dates in Idris.

Before I get into the code, I'll say that as a caveat, it's only correct by the specification of "date" that I have in my head -- I know there are leap seconds which are added in a noncomputable fashion. For simplicity's sake, I'll be sticking to YMD format(ish) dates.

We can treat years as just integers, so there's nothing interesting going on there. So we'll start by defining months, indexed by an ordinal number:

data Month : Nat -> Type where
  Jan : Month 1
  Feb : Month 2
  Mar : Month 3
  Apr : Month 4
  May : Month 5
  Jun : Month 6
  Jul : Month 7
  Aug : Month 8
  Sep : Month 9
  Oct : Month 10
  Nov : Month 11
  Dec : Month 12

This makes it easy to define functions of type Month n -> Nat and (n : Nat) -> Maybe (Month n).

natToMonth : (m : Nat) -> Maybe (Month m)
natToMonth 1 = Just Jan
natToMonth 2 = Just Feb
natToMonth 3 = Just Mar
natToMonth 4 = Just Apr
natToMonth 5 = Just May
natToMonth 6 = Just Jun
natToMonth 7 = Just Jul
natToMonth 8 = Just Aug
natToMonth 9 = Just Sep
natToMonth 10 = Just Oct
natToMonth 11 = Just Nov
natToMonth 12 = Just Dec
natToMonth _ = Nothing

monthToNat : {m : _} -> Month m -> Nat
monthToNat _ = m

Note that in monthToNat we need to make sure the index doesn't get erased; by default, it has multiplicity 0. Alternatively if you still wish for the index to get erased, you can merely case on the month and explicitly give back the correct number.

Now we get into the interesting bit: each month has its own number of days, and that number is kind of arbitrary. We want our dates to be ill-typed if the day of the month is invalid, so let's start by writing a function that says how many days are in each month:

daysInMonth : Month m -> Nat
daysInMonth Jan = 31
daysInMonth Feb = 28
daysInMonth Mar = 31
daysInMonth Apr = 30
daysInMonth May = 31
daysInMonth Jun = 30
daysInMonth Jul = 31
daysInMonth Aug = 31
daysInMonth Sep = 30
daysInMonth Oct = 31
daysInMonth Nov = 30
daysInMonth Dec = 31

Ah, but this isn't quite right either. Depending on whether it's a leap year or not, February can have 29 or 28 days. So let's edit this function to take in an additional parameter for the year, and ignore it for every month except February:

daysInMonth : (year : Integer) -> Month m -> Nat
daysInMonth _ Jan = 31
daysInMonth year Feb = if year `mod` 4 == 0 then 29 else 28
daysInMonth _ Mar = 31
daysInMonth _ Apr = 30
daysInMonth _ May = 31
daysInMonth _ Jun = 30
daysInMonth _ Jul = 31
daysInMonth _ Aug = 31
daysInMonth _ Sep = 30
daysInMonth _ Oct = 31
daysInMonth _ Nov = 30
daysInMonth _ Dec = 31

When we ultimately define our date type, we'll need to make sure that the day of month falls between a range: the lower bound is 1, and the upper bound is the number of days in the month. So let's define an auxiliary function:

between : (lb : Nat) -> (ub : Nat) -> Nat -> Bool
between lb ub k = k >= lb && k <= ub

Now we're ready to give a definition for our YMD date type:

record Date where
  constructor MkDate
  year : Integer
  month : (m ** Month m)
  dayOfMonth : (d ** So (between 1 (daysInMonth year (snd month)) d))

Here we're using dependent pairs for the month and day, however, the use for days is much more interesting. The first part of the day type is a natural number (we don't need to annotate it because it's determined by the second part), and the second is a proof. The proof says that the day is between 1 and however-many-days-are-in-the-month. We use the So type to lift a boolean check into the type level. If we need to do this check at runtime instead of compile time, we can use choose : (b : Bool) -> Either (So b) (So (not b)).

Before we test to see whether our type correctly rejects invalid dates, let's write a parser so we can test it with strings instead of a record literal. Since we're using the YYYY-MM-DD format, we'll want to use String.split : (Char -> Bool) -> String -> List1 String. Note the non-empty return type. This is different from PureScript, where split :: Pattern -> String -> Array String. This difference may not seem like a big deal, but something to consider is what happens if the needle isn't in the haystack. Since Idris is guaranteeing that you get a nonempty list, the only way to make sure this happens if the predicate fails is to give the input string back unchanged. After all, given any value v : a you can always create a nonempty list v ::: [] : List1 a. This behavior is the same in PureScript, but the more precise type in Idris gives more information about how the function works. In particular, there's no way to have an empty list as output, even if the input string is empty.

To write the full parser, it will help to have some auxiliary functions first:

parseDayOfMonth :
    (year : Integer) ->
    (month : Month m) ->
    Nat ->
    Maybe (d ** So (between 1 (daysInMonth year month) d))
parseDayOfMonth year month d = do
  prf <- getLeft $ choose $ between 1 (daysInMonth year month) d
  pure (d ** prf)

parseNat : String -> Maybe Nat
parseNat str = cast <$> parseInteger str

Notice the use of choose which is required for runtime verification. Now we can write the complete parser:

parseDate : String -> Maybe Date
parseDate str = do
  let
    year ::: [month, day]
      = String.split (== '-') str
      | _ => Nothing
  year <- parseInteger year
  m <- parseNat month
  day <- parseNat day
  month <- natToMonth m
  day <- parseDayOfMonth year month day
  pure $ MkDate
    { year = year
    , month = (m ** month)
    , dayOfMonth = day
    }

And we can check whether it accepts valid dates and rejects invalid ones in the repl:

Main> parseDate "2022-09-16"
Just (MkDate 2022 (9 ** Sep) (16 ** Oh))
Main> parseDate "2022-09-31"
Nothing