Dates Are Hard

Photograph of Prague's astronomical clock

No, I’m not talking about a meeting with a lover or potential lover. While those can be stressful, the calendar math used to determine the precise date and time on which such a meeting might occur is infinitely more difficult to perform. To software programmers, this isn’t news, but I recently encountered an issue when calculating the time for an RFC 4122 UUID that had me questioning the accuracy of our modern, accepted calendars, especially with regard to the days of the week on which our dates fall.

I was working on a simple bug fix for my rhumsaa/uuid PHP library. All tests passed locally, so I assumed they would also pass in Travis CI after I pushed my changes to the repository. After all, I hadn’t made any changes to the library code itself; I had just moved a few things in the composer.json file.

But then I received a broken build email from Travis CI. I clicked on the build link to see what had happened, and I saw this:

uuid build 84

Notice how the tests passed in PHP 5.3 and HHVM, but they failed in PHP versions 5.4 and 5.5. I was doubly confused, since I was running my local tests against PHP 5.5.4, but they were failing on Travis CI in 5.5!

The confusion didn’t stop there. I took a look at the test failures. There were three of them. Each was some variation of this:

2) Rhumsaa\Uuid\UuidTest::testGetDateTime
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'Sun, 16 Oct 1582 16:34:04 +0000'
+'Sat, 16 Oct 1582 16:34:04 +0000'

The test expected Sunday but got Saturday for the very same date. How could that be?

At this point, I should explain why I’m checking for this specific date. It’s not an arbitrary choice. Version 1 UUIDs are based on timestamps that count the number of 100-nanosecond intervals since 00:00:00 UTC on October 15, 1582. Again, RFC 4122 doesn’t use this date arbitrarily. It’s an important date in history: the first day on which the Gregorian calendar was in effect.
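The timestamp arithmetic can be sketched in a few lines. This is only an illustration, written in Python rather than PHP because Python's `datetime` is documented as a proleptic Gregorian calendar, so the result does not depend on the build. The helper names `uuid1_timestamp` and `uuid1_datetime` are made up for this sketch and are not part of rhumsaa/uuid; the UUID string is the one used in the unit test shown later in this post.

```python
from datetime import datetime, timedelta, timezone

# Start of the UUID timestamp epoch: 00:00:00 UTC on October 15, 1582.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_timestamp(uuid_string: str) -> int:
    """Return the 60-bit count of 100-nanosecond intervals in a version 1 UUID."""
    time_low, time_mid, time_hi_and_version = uuid_string.split("-")[:3]
    # The top 4 bits of the third field hold the version, not time.
    time_hi = int(time_hi_and_version, 16) & 0x0FFF
    return (time_hi << 48) | (int(time_mid, 16) << 32) | int(time_low, 16)


def uuid1_datetime(uuid_string: str) -> datetime:
    """Convert the UUID timestamp to a UTC datetime (truncated to microseconds)."""
    ticks = uuid1_timestamp(uuid_string)
    return GREGORIAN_EPOCH + timedelta(microseconds=ticks // 10)


example = "0901e600-0154-1000-9b21-0800200c9a66"
print(uuid1_timestamp(example))             # 1460440000000
print(uuid1_datetime(example).isoformat())  # 1582-10-16T16:34:04+00:00
```

The count 1,460,440,000,000 is 146,044 seconds, or one day, 16 hours, 34 minutes, and 4 seconds after the epoch.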

For my unit tests, I chose to test a few of the earliest dates that could possibly be used to create UUIDs. I chose to use static date strings, since I didn’t expect the dates to change. I used PHP to generate the date strings in RFC 2822 format:

php > var_dump(gmdate('r', strtotime('1582-10-16T16:34:04+00:00')));
string(31) "Sun, 16 Oct 1582 16:34:04 +0000"

And my tests included code that looked like this:

$uuid = Uuid::fromString('0901e600-0154-1000-9b21-0800200c9a66');
$this->assertInstanceOf('\DateTime', $uuid->getDateTime());
$this->assertEquals('Sun, 16 Oct 1582 16:34:04 +0000', $uuid->getDateTime()->format('r'));

Using these date strings, I was under the mistaken impression that systems know on what day of the week any particular date is supposed to fall. The system on which I ran this PHP code was convinced that the 16th of October in 1582 was a Sunday, so I trusted this.

The 16th of October in 1582 was not a Sunday, however. It was, in fact, a Saturday. And the 15th of October in 1582 was not a Saturday (as these same systems reported) but, rather, a Friday. When Travis CI reported two of my builds as broken, it was because the PHP 5.4 and 5.5 builds there were accurately reporting the day of the week, and the strings my tests expected were wrong.

It gets stranger, though. The Unix cal program doesn’t seem to know the correct day of the week for these dates, either:

$ cal 10 1582
    October 1582
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

What’s going on here? To make a long story even longer, Pope Gregory XIII’s calendar reform sought to correct a drift in the date of the vernal equinox. To move the equinox back to March 21st, ten days were removed from the calendar when the Gregorian calendar was adopted. Thus, October 4, 1582 falls on a Thursday, and the very next day is October 15, 1582, which is a Friday.
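One way to check this is to convert both dates to Julian Day Numbers, a continuous day count that does not care which calendar is in use. The sketch below uses the widely published integer formulas for the Julian and Gregorian calendars; the function names are made up for illustration, and Python is used only because the arithmetic is easy to verify there.

```python
WEEKDAYS = ("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")


def julian_day_number(year: int, month: int, day: int, gregorian: bool) -> int:
    """Return the Julian Day Number of a date in the Julian or Gregorian calendar."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a             # years since the formula's epoch
    m = month + 12 * a - 3          # month index with March = 0
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if gregorian:
        # Gregorian leap rule: skip century years not divisible by 400.
        return jdn - y // 100 + y // 400 - 32045
    return jdn - 32083


def weekday_name(jdn: int) -> str:
    """Return the weekday for a Julian Day Number."""
    return WEEKDAYS[(jdn + 1) % 7]


last_julian = julian_day_number(1582, 10, 4, gregorian=False)
first_gregorian = julian_day_number(1582, 10, 15, gregorian=True)

print(last_julian, weekday_name(last_julian))          # 2299160 Thu
print(first_gregorian, weekday_name(first_gregorian))  # 2299161 Fri
```

The two day numbers are consecutive: the dates jump by ten days, but the cycle of weekdays continues without interruption.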

The Unix cal program doesn’t show this removal of dates in October 1582, so the 5th through the 14th are still in place. If those ten days are never dropped, how can our current days of the week fall on the correct dates? Something is clearly off. Where does Unix cal make the correction?

It turns out, the Unix cal program follows Great Britain’s adoption of the Gregorian calendar. Great Britain (and its American colonies at the time) adopted the Gregorian calendar in September 1752, and the cal program shows this:

$ cal 9 1752
   September 1752
Su Mo Tu We Th Fr Sa
       1  2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30

Still, while cal uses Great Britain’s adoption date, the Unix date command appears to use Gregory’s adoption date, but it doesn’t remove the dates 5-14 in October 1582. Therefore, while the 15th falls on a Friday, the 4th falls on a Monday, eleven days earlier, rather than on the Thursday that history records. October 14, 1582 shouldn’t exist, but it does:

$ TZ=UTC date -d "1582-10-15T00:00:00.00Z"
Fri Oct 15 00:00:00 UTC 1582
$ TZ=UTC date -d "1582-10-04T00:00:00.00Z"
Mon Oct 4 00:00:00 UTC 1582
$ TZ=UTC date -d "1582-10-14T00:00:00.00Z"
Thu Oct 14 00:00:00 UTC 1582
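Extending the Gregorian rules backward with no gap is known as a proleptic Gregorian calendar, and the output above is consistent with one. Python's `datetime` module is documented to work this way, so a short sketch (the helper name is made up for illustration) gives the same weekdays:

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def proleptic_weekday(year: int, month: int, day: int) -> str:
    """Weekday of a date in Python's proleptic Gregorian calendar."""
    return WEEKDAYS[date(year, month, day).weekday()]


for day in (4, 14, 15):
    print(date(1582, 10, day).isoformat(), proleptic_weekday(1582, 10, day))
# 1582-10-04 Mon
# 1582-10-14 Thu
# 1582-10-15 Fri
```

Only the last of these lines matches a date that actually occurred under that name: the historical October 4, 1582 was a Julian date, and October 14, 1582 never existed in the countries that switched in 1582.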

So, the mystery is solved, and it makes sense why this happens, but it means that it’s tricky to determine the day of the week for dates in the distant past. As for my tests, I dropped the use of the RFC 2822 date format. I didn’t need to test the day of the week. I just needed to test the date. Switching to the ISO 8601 format eliminated the problem for me.
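The difference between the two formats is easy to see side by side. In this sketch (Python again, with a hand-rolled `rfc2822_utc` helper that is purely illustrative and only handles UTC), the RFC 2822 string embeds a weekday that depends on the calendar rules in play, while the ISO 8601 string carries only the date, time, and offset:

```python
from datetime import datetime, timezone

WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def rfc2822_utc(dt: datetime) -> str:
    """Format a UTC datetime in RFC 2822 style, weekday included."""
    return "{}, {:02d} {} {:04d} {:02d}:{:02d}:{:02d} +0000".format(
        WEEKDAYS[dt.weekday()], dt.day, MONTHS[dt.month - 1], dt.year,
        dt.hour, dt.minute, dt.second,
    )


moment = datetime(1582, 10, 16, 16, 34, 4, tzinfo=timezone.utc)

print(rfc2822_utc(moment))  # Sat, 16 Oct 1582 16:34:04 +0000
print(moment.isoformat())   # 1582-10-16T16:34:04+00:00
```

An assertion against the second string cannot break over a weekday disagreement, because there is no weekday in it to disagree about.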

However, this still doesn’t answer why some builds of PHP report October 15, 1582 as occurring on a Friday, while others report it as being a Saturday. Perhaps Derick can help answer that. :-)