Category Archives: Noda Time

What’s up with TimeZoneInfo on .NET 6? (Part 2)

In part 1, we ended up with a lot of test data specified in a text file, but without working tests – and with a conundrum as to how we’d test the .NET Core 3.1 data which requires additional information about the “hidden” AdjustmentRule.BaseUtcOffsetDelta property.

As with the previous blog post, this one is fairly stream-of-consciousness – you’ll see me changing my mind and spotting earlier mistakes as I go along. It’s not about trying to give advice – it’s intended for anyone who is interested in my thought process. If that’s not the kind of material you enjoy, I suggest you skip this post.

Abstracting platform differences

Well, time does wonders, and an answer to making most of the code agnostic to whether it’s running under .NET 6 or not now seems obvious: I can introduce my own class which is closer to the .NET 6 AdjustmentRule class, in terms of accessible properties. As it happens, a property of StandardOffset for the rule makes my code simpler than adding TimeZoneInfo.BaseUtcOffset and AdjustmentRule.BaseUtcOffsetDelta together every time I need it. But fundamentally I can put all the information I need into one class, and populate that class appropriately (without even needing a TimeZoneInfo) for tests, and use the TimeZoneInfo where necessary in the production code. (We have plenty of tests that use the actual TimeZoneInfo – but using just data from adjustment rules makes it easy to experiment with the Unix representation while on Windows.)

That means adding some derived data to our text file for .NET Core 3.1 – basically working out what the AdjustmentRule.BaseUtcOffsetDelta would be for that rule. We can do that by finding one instant in time within the rule, asking the TimeZoneInfo whether that instant is in daylight savings, and then comparing the prediction of “zone base UTC offset and maybe rule daylight savings” with the actual UTC offset.

With that in place, we get most of the way to passing tests – at least for the fixed data we’re loading from the text file.

Hidden data

There’s one problem though: Europe/Dublin in 1960. We have this adjustment rule in both .NET Core 3.1 and .NET 6:

1960-04-10 - 1960-10-02: Daylight delta: +00; DST starts April 10 at 03:00:00 and ends October 02 at 02:59:59.999

Now I know that should actually be treated as “daylight savings of 1 hour, standard offset of UTC+0”. The TimeZoneInfo knows that as well – if you ask for the UTC offset at (say) June 1st 1960, you the correct answer of UTC+1, and if you ask the TimeZoneInfo whether it’s in daylight savings or not, it returns true… but in most rules, a “daylight delta” of 0 means “this isn’t in daylight time”.

I believe this is basically a problem with the conversion from the internal rules to the publicly-available rules which loses some “hidden” bits of information. But basically it means that when I create my standardized adjustment rule, I need to provide some extra information. That’s annoying in terms of specifying yet more data in the text file, but it’s doable.

Given that the Unix rules and the Windows rules are really quite different, and I’ve already got separate paths for them (and everything still seems to be working on Windows), at this point I think it’s worth only using the “enhanced” adjustment rule code for Unix. That has the helpful property that we don’t need different ways of identifying the first instant in an adjustment rule: for Unix, you always use the daylight saving transition time of day; for Windows you never do.

At this point, I’m actually rethinking my strategy of “introduce a new type”. It’s got so much more than I really wanted it to have, I think I’m just going to split what was originally one method into two:

// Hard to test: needs a time zone
internal static BclAdjustmentRule FromUnixAdjustmentRule(TimeZoneInfo zone, TimeZoneInfo.AdjustmentRule rule)

... becomes ...

// Same signature as before, but now it just extracts appropriate information and calls the one below.
internal static BclAdjustmentRule FromUnixAdjustmentRule(TimeZoneInfo zone, TimeZoneInfo.AdjustmentRule rule)

// Easier to test with data from a text file
internal static BclAdjustmentRule FromUnixAdjustmentRule(TimeZoneInfo.AdjustmentRule rule,
    string standardName, string daylightName, TimeSpan zoneStandardOffset, TimeSpan ruleStandardOffset,
    bool forceDaylightSavings)

The unit tests that I’m trying to get to pass with just rule data just need to call into the second method. The production code (tested in other unit tests, but only on the “right” system) will call the first method.

(In the end, it turns out it’s easier to make the second method return a ZoneInterval rather than a BclAdjustmentRule, but the idea is the same.)

Are we nearly there yet?

At this point, I wish I’d paid slightly more attention while changing the code… because the code that did work for America/Sao_Paulo in 2018 is now failing for .NET 6. I can see why it’s failing now – I’m not quite so sure why it worked a few hours before.

The problem is in this pair of adjustment rules:

2017-10-15 - 2017-12-31: Daylight delta: +01; DST starts October 15 at 00:00:00 and ends December 31 at 23:59:59.999
2018-01-01 - 2018-02-17: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends February 17 at 23:59:59.999

These should effectively be merged: the end of the first rule should be the start of the second rule. In most rules, we can treat the rule as starting at the combination of “start date and start transition time” with a UTC offset of “the base offset of the zone” (not the rule). We then treat the rule as ending as the combination of “end date and end transition time” with a UTC offset of “the base offset of the zone + daylight delta of the rule”. But that doesn’t work in the example above: we end up with a gap of an hour between the two rules.

There’s a horrible hack that might fix this: if the end date is on December 31st with an end transition time of 23:59:59 (with any subsecond value), we could ignore daylight savings.

In implementing that, I found a commented out piece of code which did it, which was effectively abandoned in the refactoring described above – so that explains my confusion about why the code had only just stopped working.

With that in place, the data-based unit tests are green.

Now to run the main set of unit tests…

TimeZoneInfo-based unit tests

Just to reiterate, I have two types of tests for this code:

  • Unit tests based on rules described in a text file. These can be run on any implementation, and always represent “the rule data we’d expect to see on Unix”. There are currently 16 tests here – specific periods of history for specific time zones.
  • Unit tests based on the TimeZoneInfo objects exposed by the BCL on the system we’re running on. These include checking every time zone, and ensuring that every transition between 1950 and either 2037 or 2050 (depending on the system, for reasons I won’t go into now) is the same between the TimeZoneInfo representation, and the Noda Time representation of the TimeZoneInfo.

The first set of tests is what we’ve been checking so far – I now need to get the second set of tests working in at least four contexts: .NET Core 3.1 and .NET 6, on both Windows and Linux.

When I started this journey, the tests were working on Windows (both .NET Core 3.1 and .NET 6) and on Linux on .NET Core 3.1. It was only .NET 6 that was failing. Let’s start by checking that we haven’t broken anything on Windows… yes, everything is still working there. That’s pretty unsurprising, given that I’ve aimed to keep the Windows code path exactly the same as it was before. (There’s a potential optimization I can apply using the new BaseUtcOffsetDelta property on .NET 6, but I can apply that later.)

Next is testing .NET Core 3.1 on Linux – this used to work, but I wouldn’t be surprised to see problems introduced by the changes… and indeed, the final change I made broke lots of time zones due to trying to add daylight savings to DateTime.MaxValue. That’s easily fixed… and with that fix in place there are two errors. That’s fine – I’ll check those later and add text-file-data-based tests for those two time zones. Let’s check .NET 6 first though, which had large numbers of problems before… now we have 14. Definite progress! Those 14 failures seem to fall into two categories, so I’ll address those first.

Adding more test data and fixing one problem

First, I’m going to commit the code I’ve got. It definitely needs changing, but if I try something that doesn’t help, I want to be able to get back to where I was.

Next, let’s improve the exception messages thrown by the .NET 6 code. There are two exceptions, that currently look like this:

(For Pacific/Wallis and others)
System.InvalidOperationException : Zone recurrence rules have identical transitions. This time zone is broken.

(For America/Creston and others)
System.ArgumentException : The start Instant must be less than the end Instant (Parameter 'start')

Both have stack traces of code, but that doesn’t help me know what the invalid values are, which means I can’t easily find the relevant rules to work on.

After a few minutes of work, this is fixed and the output is much more informative:

(For Pacific/Wallis and others)
System.InvalidOperationException : Zone recurrence rules have identical transitions. This time zone is broken. Transition time: 0002-12-31T11:45:00Z

(For America/Creston and others)
System.ArgumentException : The start Instant must be less than the end Instant. start: 1944-01-01T07:00:00Z; end: 1944-01-01T06:01:00Z (Parameter 'start')

The broken rule for Pacific/Wallis is particularly interesting – year 2AD! So let’s see what the rules look like in textual form. First let’s look at Pacific Wallis

Pacific/Wallis

.NET Core 3.1:
Base offset = 12
0001-01-01 - 1900-12-31: Base UTC offset delta: +00:15; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:44:39
1900-12-31 - 2038-01-19: Daylight delta: +00; DST starts December 31 at 23:44:40 and ends January 19 at 15:14:06
2038-01-19 - 9999-12-31: Daylight delta: +00; DST starts January 19 at 15:14:07 and ends December 31 at 23:59:59

.NET 6.0:
Base offset = 12
0001-01-01 - 0001-12-31: Base UTC offset delta: +00:15; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
0002-01-01 - 1899-12-31: Base UTC offset delta: +00:15; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
1900-01-01 - 1900-12-31: Base UTC offset delta: +00:15; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:44:39.999

Noda Time zone intervals:
0001-01-01T00:00:00Z - 1900-12-31T11:44:40Z, +12:15:20, +0
1900-12-31T11:44:40Z - 9999-12-31T23:59:59Z, +12, +0

The first Noda Time zone interval extends from the start of time, and the second one extends to the end of time. I haven’t yet decided whether I’ll actually try to represent all of this in the “regular” kind of test. The offsets shown as +00:15 should actually be +00:15:20, but it looks like .NET doesn’t handle sub-minute offsets. That’s interesting… I can easily change the Noda Time data to round to expect 12:15 of course.

Both .NET Core 3.1 and .NET 6 have pretty “interesting” representations here:

  • Why does .NET Core 3.1 have a new rule in 2038? It’s no coincidence that the instant being represented is 231 seconds after the Unix epoch, I’m sure… but there’s no need for a new rule.
  • Why does .NET 6 have one rule for year 1AD and a separate rule for years 2 to 1899 inclusive?
  • Why use an offset rounded to the nearest minute, but keep the time zone transition at 1900-12-31T11:44:40Z?

It’s not clear to me just from inspection why this would cause the Noda Time conversion to fail, admittedly. But that’ll be fun to dig into. Before we do, let’s find the test data for America/Creston, around 1944:

America/Creston

.NET Core 3.1:
Base offset = -7
1942-02-09 - 1944-01-01: Daylight delta: +01; DST starts February 09 at 02:00:00 and ends January 01 at 00:00:59
1943-12-31 - 1944-04-01: Daylight delta: +00; DST starts December 31 at 23:01:00 and ends April 01 at 00:00:59
1944-04-01 - 1944-10-01: Daylight delta: +01; DST starts April 01 at 00:01:00 and ends October 01 at 00:00:59
1944-09-30 - 1967-04-30: Daylight delta: +00; DST starts September 30 at 23:01:00 and ends April 30 at 01:59:59

.NET 6.0:
Base offset = -7
1942-02-09 - 1942-12-31: Daylight delta: +01; DST starts February 09 at 02:00:00 and ends December 31 at 23:59:59.999
1943-01-01 - 1943-12-31: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
1944-01-01 - 1944-01-01: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends January 01 at 00:00:59.999
1944-04-01 - 1944-10-01: Daylight delta: +01; DST starts April 01 at 00:01:00 and ends October 01 at 00:00:59.999
1967-04-30 - 1967-10-29: Daylight delta: +01; DST starts April 30 at 02:00:00 and ends October 29 at 01:59:59.999

Noda Time zone intervals:

1942-02-09T09:00:00Z - 1944-01-01T06:01:00Z, -7, +1
1944-01-01T06:01:00Z - 1944-04-01T07:01:00Z, -7, +0
1944-04-01T07:01:00Z - 1944-10-01T06:01:00Z, -7, +1
1944-10-01T06:01:00Z - 1967-04-30T09:00:00Z, -7, +0

Well those are certainly “interesting” rules – and I can immediately see why Noda Time has rejected the .NET 6 rules. The third rule starts at 1944-01-01T07:00:00Z (assuming that the 1944-01-01T00:00:00 is in zone standard time of UTC-7) and finishes at 1944-01-01T06:01:00Z (assuming that the 1944-01-01T00:00:59.999 is in daylight time).

Part of the problem is that we’ve said before that if a rule ends on December 31st at 23:59:59, we’ll interpret that as being in standard time instead of being in daylight time… which means that the second rule would finish at 1944-01-01T07:00:00Z – but we genuinely want it to finish at 06:00:00Z, and maybe understand the third rule to mean 1944-01-01T06:00:00Z to 19:44-01-01T06:01:00Z for that last minute of daylight time before we observe standard time until April 1st.

We could do that by adding special rules:

  • If a rule appears to end before it starts, assume that the start should be treated as daylight time instead of standard time. (That would make the third rule valid, covering 1944-01-01T06:00:00Z to 19:44-01-01T06:01:00Z.)
  • If one rule ends after the next one starts, change its end point to the start of the next one. I currently have the reverse logic to this, changing the start point of the next one instead. That wouldn’t help us here. I can’t remember exactly why I’ve got the current logic: I need to add some comments on this bit of code…

Astonishingly, this works, getting us down to 8 errors on .NET 6. Of these, 6 are the same kind of error as Pacific/Wallis, but 2 are unfortunately of the form “you created a time zone successfully, but it doesn’t give the same results as the original one”. Hmm.

Still, let’s commit this change and move on to Pacific/Wallis.

Handling Pacific/Wallis

Once I’d added the Pacific/Wallis data, those tests passed – which means the problem must lie somewhere in how the results of the rules are interpreted in order to build up a DateTimeZone from the converted rules. That’s logic’s already in a BuildMap method, which I just need to make internal instead of private. That also contains some code which we’re essentially duplicating in the test code (around inserting standard zone intervals if they’re missing, and coalescing some zone intervals together). At some point I want to refactor both the production and test code to remove the duplication – but I want to get to working code first.

I’ll add a new test, just for Pacific/Wallis (as that’s the only test case we’ve got which is complete from the start to the end of time), and which just constructs the map. I expect it will throw an exception, so I’m not actually going top assert anything about the result yet.

Hmm. It doesn’t throw. That’s weird. Let’s rerun the full time zone tests to make sure we still have a problem at all… yes, it’s still failing.

At this point, my suspicion is that some of the code that is “duplicated” between production and test code really isn’t quite duplicated at all. Debugging the code on Linux is going to be annoying, so let’s go about this the old-fashioned way: logging.

I’d expected to be able to log the zone interval for each part of the map… but PartialZoneIntervalMap.GetZoneInterval fails, which is really, really weird. What’s even weirder is that the stack trace includes StandardDaylightAlternatingMap – which is only used in the Windows rules.

All my unit tests assume we’ve recognized that the adjustment rules are from Unix… but the ones for Pacific/Wallis actually look like Windows ones: on .NET 6, they start on January 1st, and end on December 31st.

Let’s add a grotty hack: I happen to know that Windows time zone data never has any “really old” rules other than ones that start at the beginning of time – if we see anything with a start year that’s not 1 and isn’t after (say) 1600, that’s going to be a Unix rule.

Put that extra condition in, and lo and behold, Pacific/Wallis starts working – hooray!

Let’s rerun everything…

So, running all the tests on every framework/platform pair that I can easily test, we get:

  • Linux, .NET 6: 2 failures – Australia/Broken_Hill and Antarctica/Macquarie
  • Linux, .NET Core 3.1: 3 failures – Asia/Dhaka, Australia/Broken_Hill and Antarctica/Macquarie
  • Windows, .NET 6: all tests pass
  • Windows, .NET Core 3.1: all tests pass

All the remaining failures are of the form “the offset is wrong at instant X”.

So, joy of joys, we’re back to collecting more data and adding more test cases.

First though, I’ll undo making a few things internal that didn’t actually help in the end. I might redo them later, but I don’t need them now. Basically the extra test to create the full map didn’t give me any more insight. (There’s a bunch of refactoring to do when I’ve got all the tests passing, but I might as well avoid having more changes than are known to be helpful.)

Going for green

At this point, I should reveal that I have a hunch. One bit of code I deleted when rewriting the “Unix rule to zone interval” conversion code was the opposite of the issue I described earlier of rules which are in daylight savings, but have a DaylightDelta of zero. The code I deleted basicaly said “If the time zone says it’s not in daylight savings, but DaylightDelta is non-zero, then treat it as zero anyway.” So I’m hoping that’s the issue, but I want to get the test data first. I’m hoping that it’s the same for all three time zones that are having problems though. We’ll start with Australia/Broken_Hill, which is failing during 1999. Dumping the rules in .NET 6 and .NET Core 3.1 under Linux, and looking at the Noda Time tzvalidate page, I get:

Base UTC offset: 09:30

.NET Core 3.1:
1998-10-25 - 1999-03-28: Daylight delta: +01; DST starts October 25 at 02:00:00 and ends March 28 at 02:59:59
1999-03-28 - 1999-10-31: Daylight delta: +00; DST starts March 28 at 02:00:00 and ends October 31 at 01:59:59
1999-10-31 - 1999-12-31: Daylight delta: +01; DST starts October 31 at 02:00:00 and ends December 31 at 23:59:59
1999-12-31 - 2000-03-26: Daylight delta: +01; DST starts December 31 at 23:00:00 and ends March 26 at 02:59:59

.NET 6:

1999-01-01 - 1999-03-28: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends March 28 at 02:59:59.999
1999-10-31 - 1999-12-31: Daylight delta: +01; DST starts October 31 at 02:00:00 and ends December 31 at 23:59:59.999
1999-12-31 - 1999-12-31: Daylight delta: +01; DST starts December 31 at 23:00:00 and ends December 31 at 23:59:59.999
2000-01-01 - 2000-03-26: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends March 26 at 02:59:59.999

Noda Time:
1998-10-24T16:30:00Z - 1999-03-27T16:30:00Z, +09:30, +01
1999-03-27T16:30:00Z - 1999-10-30T16:30:00Z, +09:30, +00
1999-10-30T16:30:00Z - 2000-03-25T16:30:00Z, +10:30, +01

Annoyingly, my test data parser doesn’t handle partial-hour base UTC offsets at the moment, so I’m going to move on to Antarctica/Macquarie – I’ll put Broken Hill back in later if I need to. The text format of Broken Hill would be:

Australia/Broken_Hill
Base offset = 09:30
---
1998-10-25 - 1999-03-28: Daylight delta: +01; DST starts October 25 at 02:00:00 and ends March 28 at 02:59:59
1999-03-28 - 1999-10-31: Daylight delta: +00; DST starts March 28 at 02:00:00 and ends October 31 at 01:59:59
1999-10-31 - 1999-12-31: Daylight delta: +01; DST starts October 31 at 02:00:00 and ends December 31 at 23:59:59
1999-12-31 - 2000-03-26: Daylight delta: +01; DST starts December 31 at 23:00:00 and ends March 26 at 02:59:59
---
1999-01-01 - 1999-03-28: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends March 28 at 02:59:59.999
1999-10-31 - 1999-12-31: Daylight delta: +01; DST starts October 31 at 02:00:00 and ends December 31 at 23:59:59.999
1999-12-31 - 1999-12-31: Daylight delta: +01; DST starts December 31 at 23:00:00 and ends December 31 at 23:59:59.999
2000-01-01 - 2000-03-26: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends March 26 at 02:59:59.999
---
1998-10-24T16:30:00Z - 1999-03-27T16:30:00Z, +09:30, +01
1999-03-27T16:30:00Z - 1999-10-30T16:30:00Z, +09:30, +00
1999-10-30T16:30:00Z - 2000-03-25T16:30:00Z, +10:30, +01

However, let’s have a look at Antarctica/Macquarie, which is broken in 2009. Here’s the data:

Antarctica/Macquarie
Base UTC offset: +10

.NET Core 3.1:
2008-10-05 - 2009-04-05: Daylight delta: +01; DST starts October 05 at 02:00:00 and ends April 05 at 02:59:59
2009-04-05 - 2009-10-04: Daylight delta: +00; DST starts April 05 at 02:00:00 and ends October 04 at 01:59:59
2009-10-04 - 2009-12-31: Daylight delta: +01; DST starts October 04 at 02:00:00 and ends December 31 at 23:59:59
2009-12-31 - 2011-04-03: Daylight delta: +01; DST starts December 31 at 23:00:00 and ends April 03 at 02:59:59

.NET 6.0:
2008-10-05 - 2008-12-31: Daylight delta: +01; DST starts October 05 at 02:00:00 and ends December 31 at 23:59:59.999
2009-01-01 - 2009-04-05: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends April 05 at 02:59:59.999
2009-10-04 - 2009-12-31: Daylight delta: +01; DST starts October 04 at 02:00:00 and ends December 31 at 23:59:59.999
2009-12-31 - 2009-12-31: Daylight delta: +01; DST starts December 31 at 23:00:00 and ends December 31 at 23:59:59.999
2010-01-01 - 2010-12-31: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
2011-01-01 - 2011-04-03: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends April 03 at 02:59:59.999

Noda Time:
2008-10-04T16:00:00Z - 2009-04-04T16:00:00Z +10, +01
2009-04-04T16:00:00Z - 2009-10-03T16:00:00Z +10, +00
2009-10-03T16:00:00Z - 2011-04-02T16:00:00Z +10, +01

(The .NET 6 data needs to go as far as 2011 in order to include all the zone intervals for 2009, because there were no changes in 2010.)

Good news! The test fails… but only in .NET Core 3.1.

(A few days later… this blog post is being written in sporadic bits of spare time.)

Okay, let’s check I can still reproduce this – in .NET 6 on Linux, BclDateTimeZone test that converts a TimeZoneInfo to a BclDateTimeZone fails for Antarctica/Macquarie because it gives the wrong offset at 2009-10-04T00:00:00Z – the TimeZoneInfo reports +11, and the BclDateTimeZone reports +10. But the unit test for the .NET 6 data apparently gives the correct ZoneInterval. Odd.

Again, this is tricky to debug, so I’ll add some logging. Just a simple “new” test that logs all of the zone intervals in the relevant period. The results are:

Australian Eastern Standard Time: [2008-04-05T16:00:00Z, 2008-10-04T16:00:00Z) +10 (+00)
Australian Eastern Daylight Time: [2008-10-04T16:00:00Z, 2009-04-04T16:00:00Z) +11 (+01)
Australian Eastern Standard Time: [2009-04-04T16:00:00Z, 2009-12-31T13:00:00Z) +10 (+00)
Australian Eastern Daylight Time: [2009-12-31T13:00:00Z, 2011-04-02T16:00:00Z) +11 (+01)
Australian Eastern Standard Time: [2011-04-02T16:00:00Z, 2011-10-01T16:00:00Z) +10 (+00)
Australian Eastern Daylight Time: [2011-10-01T16:00:00Z, 2012-03-31T16:00:00Z) +11 (+01)

It looks like we’re not adding in the implicit standard time zone interval between April and October 2009. This is code that I’m aiming to deduplicate between the production code and the rule-data-oriented unit tests – it looks like the unit test code is doing the right thing, but the production code isn’t.

(In the process of doing this, I’ve decided to suppress warning CA1303 – it’s completely useless for me, and actively hinders simple debug-via-console-logging.)

Adding extra logging to the BuildMap method, it looks like we’ve already lost the information by the time we get there: there’s no sign of the October 2009 date anywhere. Better look further back in the code…

… and immediately spot the problem. This code, intended to handle overlapping rules:

convertedRules[i - 1] = convertedRules[i].WithEnd(convertedRules[i].Start);

… should be:

convertedRules[i - 1] = convertedRules[i - 1].WithEnd(convertedRules[i].Start);

Again, that’s code which isn’t covered by the rule-data-oriented unit tests. I’m really looking forward to removing that duplication. Anyway, let’s see what that fix leaves us with… oh. It hasn’t actually fixed it. Hmm.

Ha. I fixed it in this blog post but not in the actual code. So it’s not exactly a surprise that the tests were still broken!

Having actually fixed it, now the only failing test in .NET 6 on Linux is the one testing the .NET Core 3.1 data for Antarctica/Macquarie. Hooray! Running the tests for .NET Core 3.1, that’s also the only failing test. The real time zone seems to be okay. That’s odd… and suggests that my test data was incorrectly transcribed. Time to check it again… no, it really is that test data. Hmm. Maybe this time there’s another bug in the code that’s intended to be duplicated between production and tests, but this time with the bug in the test code.

Aha… the .NET Core 3.1 test code didn’t have the “first fix up overlapping rules” code that’s in the .NET 6 tests. The circumstances in which that fix-up is needed happen much more rarely when using the .NET Core 3.1 rules – this is the first time we’d needed it for the rule-data-oriented tests, but it was happening unconditionally in the production code. So that makes sense.

Copy that code (which now occurs three times!) and the tests pass.

All the tests are green, across Windows and Linux, .NET Core 3.1 and .NET 6.0 – wahoo!

Time for refactoring

First things first: commit the code that’s working. I’m pretty reasonable at refactoring, but I wouldn’t put it past myself to mess things up.

Okay, let’s try to remove as much of the code in the tests as possible. They should really be pretty simple. It’s pretty easy to extract the code that fixes up overlapping adjustment rules – with a TODO comment that says I should really add a test to make sure the overlap is an expected kind (i.e. by exactly the amount of daylight savings, which should be the same for both rules). I’ll add that later.

The part about coalescing adjacent intervals is trickier though – that’s part of a process creating a “full” zone interval map, extending to the start and end of time. It’s useful to do that, but it leaves us with a couple of awkward aspects of the existing test data. Sometimes we have DST adjustment rules at the start or end of the .NET 6 data just so that the desired standard time rule can be generated.

In the tests we just committed, we accounted for that separately by removing those “extra” rules before validating the results. It’s harder to do that in the new tests, and it goes against the aim of making the tests as simple as possible. Additionally, if the first or last zone interval is a standard time one, the “create full map” code will extend those intervals to the start of time, which isn’t what we want either.

Rather than adding more code to handle this, I’ll just document that all test data must start and end with a daylight zone interval – or the start/end of time. Most of the test data already complies with this – I just need to add a bit more information for the others.

Interestingly, while doing this, I found that there’s an odd discrepancy between data sources for Europe/Prague in 1945 in terms of when daylight savings started:

  • TimeZoneInfo says April 2nd in one rule, and then May 8th in another
  • Noda Time (and tzvalidate, and zdump) says April 2nd
  • timeanddate.com says April 8th

Fortunately, specifying both of the rules from TimeZoneInfo ends up with the tests passing, so I’ll take that.

With those data changes in, everything’s nicer and green – so let’s commit again.

Next up, there’s a bit of rearranging of the test code itself. Literally just moving code around, mostly moving it into the two nested helper classes.

Again, run all the tests – still green, so let’s commit again. (I’m planning on squashing all of these commits together by the end, of course.)

Next there’s a simplification option which I’d noted before but never found time to implement – just removing a bit of redundancy. While fixing that, I’ve noticed we’ve got IZoneIntervalMap and IZoneIntervalMapWithMinMax… if we could add min/max offset properties to IZoneIntervalMap, we could simplify things further. I’ve just added a TODO for this though, as it’s a larger change than I want to include immediately. Run tests, commit on green.

When comments become more than comments

Now for some more comments. The code I’ve got works, but the code to handle the corner cases in BclAdjustmentRule.ConvertUnixRuleToBclAdjustmentRule isn’t commented clearly. For every case, I’m going to have a comment that:

  • Explains what the data looks like
  • Explains what it’s intended to mean
  • Gives a sample zone + framework (ideally with a date) so I can look at the data again later on

Additionally, the code that goes over a set of converted Unix rules and fixes up the end instant for overlapping rules needs:

  • More comments including an example
  • Validation to ensure the fix-up only occurs in an expected scenario
  • A test for that validation (with deliberately broken rules)

When trying to do this, I’ve found it hard to justify some of the code. It’s all just a bit too hacky. With tests in place that can be green, I’ve tried to improve things – in particular, there was some code to handle a rule ending at the very end of the year, which changed the end point from daylight time to standard time. That feels like it’s better handled in the “fix-up” code… but even that’s really hard to justify.

What I can do is leave the code there with some examples of what fails. I’m still hoping I can simplify it more later though.

A new fly in the ointment

In the course of doing this, I’ve discovered one additional tricky aspect: in .NET 6, the last adjustment rule can be a genuine alternating one rather than a fixed one. For example, Europe/London finishes with:

2037-10-25 - 9999-12-31: Daylight delta: +01; DST starts Last Sunday of March; 01:00:00 and ends Last Sunday of October; 02:00:00

Currently we don’t take that into account, and that will make life trickier. Sigh. That rule isn’t too hard to convert, but it means I’m unlikely to get there today after all.

It shouldn’t be too hard to test this though: we currently have this line of code in BclDateTimeZoneTest:

// Currently .NET Core doesn't expose the information we need to determine any DST recurrence
// after the final tzif rule. For the moment, limit how far we check.
// See https://github.com/dotnet/corefx/issues/17117
int endYear = TestHelper.IsRunningOnDotNetCoreUnix ? 2037 : 2050;

That now needs to be “on .NET Core 3.1, use 2037; on .NET 6 use 2050”. With that change in place, I expect the tests to fail. I’ve decided I won’t actually implement that in the first PR; let’s get all the existing tests working first, then extend them later.

Let’s get it merged…

Even though there’s more work to do, this is much better than it was.

It’s time to get it merged, and then finish up the leftover work. I might also file an issue asking for Microsoft to improve the documentation and see if they’re able to provide sample code that makes sense of all of this…

What’s up with TimeZoneInfo on .NET 6? (Part 1)

.NET 6 was released in November 2021, and includes two new types which are of interest to date/time folks: DateOnly and TimeOnly. (Please don’t add comments saying you don’t like the names.) We want to support these types in Noda Time, with conversions between DateOnly and LocalDate, and TimeOnly and LocalTime. To do so, we’ll need a .NET-6-specific target.

Even as a starting point, this is slightly frustrating – we had conditional code differentiating between .NET Framework and PCLs for years, and we finally removed it in 2018. Now we’re having to reintroduce some. Never mind – it can’t be helped, and this is at least simple conditional code.

Targeting .NET 6 requires the .NET 6 SDK of course – upgrading to that was overdue anyway. That wasn’t particularly hard, although it revealed some additional nullable reference type warnings that needed fixing, almost all in tests (or in IComparable.Compare implementations).

Once everything was working locally, and I’d updated CI to use .NET 6 as well, I figured I’d be ready to start on the DateOnly and TimeOnly support. I was wrong. The pull request intending just to support .NET 6 with absolutely minimal changes failed its unit tests in CI running on Linux. There were 419 failures out of a total of 19334. Ouch! It looked like all of them were in BclDateTimeZoneTest – and fixing those issues is what this post is all about.

Yesterday (at the time of writing – by the time this post is finished it may be in the more distant past) I started trying to look into what was going on. After a little while, I decided that this would be worth blogging about – so most of this post is actually written as I discover more information. (I’m hoping that folks find my diagnostic process interesting, basically.)

Time zones in Noda Time and .NET

Let’s start with a bit of background about how time zones are represented in .NET and Noda Time.

In .NET, time zones are represented by the TimeZoneInfo class (ignoring the legacy TimeZone class). The data used to perform the underlying calculation of “what’s the UTC offset at a given instant in this time zone” are exposed via the GetAdjustmentRules() method, returning an array of the nested TimeZoneInfo.AdjustmentRule class. TimeZoneInfo instances are usually acquired via either TimeZoneInfo.FindSystemTimeZoneById(), TimeZoneInfo.GetSystemTimeZones(), or the TimeZoneInfo.Local static property. On Windows the information is populated from the Windows time zone database (which I believe is in the registry); on Linux it’s populated from files, typically in the /usr/share/zoneinfo directory. For example, the file /usr/share/zoneinfo/Europe/London file contains information about the time zone with the ID “Europe/London”.

In Noda Time, we separate time zones from their providers. There’s an abstract DateTimeZone class, with one public derived class (BclDateTimeZone) and various internal derived classes (FixedDateTimeZone, CachedDateTimeZone, PrecalculatedDateTimeZone) in the main NodaTime package. There are also two public implementations in the NodaTime.Testing package. Most code shouldn’t need to use anything other than DateTimeZone – the only reason BclDateTimeZone is public is to allow users to obtain the original TimeZoneInfo instance that any given BclDateTimeZone was created from.

Separately, there’s an IDateTimeZoneProvider interface. This only has a single implementation normally: DateTimeZoneCache. That cache makes that underlying provider code simpler, as it only has to implement IDateTimeZoneSource (which most users never need to touch). There are two implementations of IDateTimeZoneSource: BclDateTimeZoneSource and TzdbDateTimeZoneSource. The BCL source is for interop with .NET: it uses TimeZoneInfo as a data source, and basically adapts it into a Noda Time representation. The TZDB source implements the IANA time zone database – there’s a “default” set of data built into Noda Time, but you can also load specific data should you need to. (Noda Time uses the term “TZDB” everywhere for historical reasons – when the project started in 2009, IANA wasn’t involved at all. In retrospect, it would have been good to change the name immediately when IANA did get involved in 2011 – that was before the 1.0 release in 2012.)

This post is all about how BclDateTimeZone handles the adjustment rules in TimeZoneInfo. Unfortunately the details of TimeZoneInfo.AdjustmentRule have never been very clearly documented (although it’s better now – see later), and I’ve blogged before about their strange behaviour. The source code for BclDateTimeZone has quite a few comments explaining “unusual” code that basically tries to make up for this. Over the course of writing this post, I’ll be adding some more.

Announced TimeZoneInfo changes in .NET 6

I was already aware that there might be some trouble brewing in .NET 6 when it came to Noda Time, due to enhancements announced when .NET 6 was released. To be clear, I’m not complaining about these enhancements. They’re great for the vast majority of users: you can call TimeZoneInfo.FindSystemTimeZoneById with either an IANA time zone ID (e.g. “Europe/London”) or a Windows time zone ID (e.g. “GMT Standard Time” for the UK, even when it’s not on standard time) and it will return you the “right” time zone, converting the ID if necessary. I already knew I’d need to check what Noda Time was doing and exactly how .NET 6 behaved, to avoid problems.

I suspect that the subject of this post is actually caused by this change though:

Two other minor improvements were made to how adjustment rules are populated from IANA data internally on non-Windows operating systems. They don’t affect external behavior, other than to ensure correctness in some edge cases.

Ensure correctness, eh? They don’t affect external behavior? Hmm. Given what I’ve already seen, I’m pretty sure I’m going to disagree with that assessment. Still, let’s plough on.

Getting started

The test errors in CI (via GitHub actions) seemed to fall into two main buckets, on a very brief inspection:

  • Failure to convert a TimeZoneInfo into a BclDateTimeZone at all (BclDateTimeZone.FromTimeZoneInfo() throwing an exception)
  • Incorrect results when using a BclDateTimeZone that has been converted. (We validate that BclDateTimeZone gives the same UTC offsets as TimeZoneInfo around all the transitions that we’ve detected, and we check once a week for about 100 years as well, just in case we missed any transitions.)

The number of failures didn’t bother me – this is the sort of thing where a one-line change can fix hundreds of tests. But without being confident of where the problem was, I didn’t want to start a “debugging via CI” cycle – that’s just awful.

I do have a machine that can dual boot into Linux, but it’s only accessible when I’m in my shed (as opposed to my living room or kitchen), making it slightly less convenient for debugging than my laptop. But that’s not the only option – there’s WSL 2 which I hadn’t previously looked at. This seemed like the perfect opportunity.

Installing WSL 2 was a breeze, including getting the .NET 6 SDK installed. There’s one choice I’ve made which may or may not be the right one: I’ve cloned the Noda Time repo within Linux, so that when I’m running the tests there it’s as close to being on a “regular” Linux system as normal. I can still use Visual Studio to edit the files (via the WSL mount point of \\wsl.localhost), but it’ll be slightly fiddly to manage. The alternative would be to avoid cloning any of the source code within the Linux file system, instead running the tests from WSL against the source code on the Windows file system. I may change my mind over the best approach half way through…

First, the good news: running the tests against the netcoreapp3.1 target within WSL 2, everything passed first time. Hooray!

Now the bad news: I didn’t get the same errors in WSL 2 that I’d seen in CI. Instead of 419, there were 1688! Yikes. They were still all within BclDateTimeZoneTest though, so I didn’t investigate that discrepancy any further – it may well be a difference in terms of precise SDK versions, or Linux versions. We clearly want everything to work on WSL 2, so let’s get that working first and see what happens in CI. (Part of me does want to understand the differences, to avoid a situation where the tests could pass in CI but not in WSL 2. I may come back to that later, when I understand everything more.)

First issue: abutting maps

The first exception reported in WSL 2 – accounting for the majority of errors – was a conversion failure:

NodaTime.Utility.DebugPreconditionException : Maps must abut (parameter name: maps)

The “map” in question is a PartialZoneIntervalMap, which maps instants to offsets over some interval of time. A time zone (at least for BclDateTimeZone) is created from a sequence of PartialZoneIntervalMaps, where the end of map n is the start of map n+1. The sequence has to cover the whole of time.

As it happens, by the time I’m writing this, I know what the immediate problem is here (because I fixed it last night, before starting to write this blog post) but in the interests of simplicity I’m going to effectively ignore what I did last night, beyond this simplified list:

  • I filtered the tests down to a single time zone (to get a smaller log)
  • I added more information to the exception (showing the exact start/end that were expected to be the same)
  • I added Console.WriteLine logging to BclDateTimeZone construction to dump a view of the adjustment rules
  • I observed and worked around an oddity that we’ll look at shortly

Looking at this now, the fact that it’s a DebugPreconditionException makes me wonder whether this is the difference between CI and local failures: for CI, we run in release mode. Let’s try running the tests in release mode… and yes, we’re down to 419 failures, the same as for CI! That’s encouraging, although it suggests that I might want CI to run tests in debug as well as release mode – at least when the main branch has been updated.

Even before the above list of steps, it seemed likely that the problems would be due changes in the adjustment rule representation in TimeZoneInfo. So at this point, let’s take a steps back and look at what’s meant to be in an adjustment rule, and what we observe in both .NET Core 3.1 and .NET 6.

What’s in an AdjustmentRule?

An adjustment rule covers an interval of time, and describes how the time zone behaves during that interval. (A bit like the PartialZoneIntervalMap mentioned above.)

Let’s start with some good news: it looks like the documentation for TimeZoneInfo.AdjustmentRule has been improved since I last looked at it. It has 6 properties:

  • BaseUtcOffsetDelta: this is only present in .NET 6, and indicates the difference between “the UTC offset of Standard Time returned by TimeZoneInfo.BaseUtcOffset” and “the UTC offset of Standard Time when this adjustment rule is active”. Effectively this makes up for Windows time zones historically not being able to represent the concept of a zone’s standard time changing.
  • DateStart/DateEnd: the date interval during which the rule applies.
  • DaylightDelta: the delta between standard time and daylight time during this rule. This is typically one hour.
  • DaylightTransitionStart/DaylightTransitionEnd: the information about when the time zone starts and ends daylight saving time (DST) while this rule is in force.

Before we go into the details of DST, there are two “interesting” aspects to DateStart/DateEnd:

Firstly, the documentation doesn’t say whether the rule applies between those UTC dates, or those local dates. I believe they’re local – but that’s an awkard way of specifying things, as local date/time values can be skipped or ambiguous. I really wish this has been set to UTC, and documented as such. Additionally, although you’d expect the transition from one rule to the next to be at midnight (given that only the start/end are only dates), the comments in my existing BclDateTimeZone code suggest that it’s actually at a time of day that depends on the DST transitions times. (It’s very possible that my code is wrong, of course. We’ll look at that in a bit.)

Secondly, the documentation includes this interesting warning (with an example which I’ve snipped out):

Unless there is a compelling reason to do otherwise, you should define the adjustment rule’s start date to occur within the time interval during which the time zone observes standard time. Unless there is a compelling reason to do so, you should not define the adjustment rule’s start date to occur within the time interval during which the time zone observes daylight saving time.

Why? What is likely to go wrong if you violate this? This sort of “here be dragons, but only vaguely specified ones” documentation always feels unhelpful to me. (And yes, I’ve probably written things like that too…)

Anyway, let’s look at the TimeZoneInfo.TransitionTime struct, which is the type of DaylightTransitionStart and DaylightTransitionEnd. The intention is to be able to represent ideas like “3am on February 25th” or “2am on the third Sunday in October”. The first of these is a fixed date rule; the second is a floating date rule (because the day-of-month of “the third Sunday in October” depends on the year). TransitionTime is a struct with 6 properties:

  • IsFixedDateRule: true for fixed date rules; false for floating date rules
  • Day (only relevant for fixed date rules): the day-of-month on which the transition occurs
  • DayOfWeek (only relevant for floating date rules): the day-of-week on which the transition occurs
  • Week (only relevant for floating date rules): confusingly, this isn’t really “the week of the month” on which the transition occurs; it’s “the occurrence of DayOfWeek on which the transition occurs”. (The idea of a “Monday to Sunday” or “Sunday to Saturday” week is irrelevant here; it’s just “the first Sunday” or “the second Sunday” etc.) If this has a value of 5, it means “last” regardless of whether that’s the fourth or fifth occurrence.
  • Month the month of year in which the transition occurs
  • TimeOfDay: the local time of day prior to the transition, at which the transition occurs. (So for a transition that skips forward from 1am to 2am for example, this would be 1am. For a transition that skips back from 2am to 1am, this would be 2am.)

Let’s look at the data

From here on, I’m writing and debugging at the same time – any stupid mistakes I make along the way will be documented. (I may go back to indicate that it turned out an idea was stupid at the start of that idea, just to avoid anyone else following it.)

Rather than trying to get bogged down in what the existing Noda Time implementation does, I think it would be useful to compare the data for the same time zone in Windows and Linux, .NET Core 3.1 and .NET 6.

Aha! It looks like I’ve had this idea before! The tool already exists as NodaTime.Tools.DumpTimeZoneInfo. I just need to target it for .NET 6 as well, and add the .NET-6-only BaseUtcOffsetDelta property the completeness.

Interlude: WSL 2 root file issues

Urgh. For some reason, something (I suspect it’s Visual Studio or a background process launched by it, but I’m not sure) keeps on creating files (or modifying existing files) so they’re owned by the root user on the Linux file system. Rather than spending ages investigating this, I’m just going to switch to the alternative mode: use my existing git repo on the Windows file system, and run the code that’s there from WSL when I need to.

(I’m sure this is all configurable and feasible; I just don’t have the energy right now.)

Back to the data…

I’m going to use London as my test time zone, mostly because that’s the time zone I live in, but also because I know it has an interesting oddity between 1968 and 1971, where the UK was on “British Standard Time” – an offset of UTC+1, like “British Summer Time” usually is, but this was “permanent standard time”. In other words, for a few years, our standard UTC offset changed. I’m expecting that to show up in the BaseUtcOffsetDelta property.

So, let’s dump some of the data for the Europe/London time zone, with both .NET Core 3.1 and .NET 6. The full data is very long (due to how the data is represented in the IANA binary format) but here are interesting portions of it, including the start, the British Standard Time experiment, this year (2022) and the last few lines:

.NET Core 3.1:

Zone ID: Europe/London
Display name: (UTC+00:00) GMT
Standard name: GMT
Daylight name: GMT+01:00
Base offset: 00:00:00
Supports DST: True
Rules:
0001-01-01 - 1847-12-01: Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 01 at 00:01:14
1847-12-01 - 1916-05-21: Daylight delta: +00; DST starts December 01 at 00:01:15 and ends May 21 at 01:59:59
1916-05-21 - 1916-10-01: Daylight delta: +01; DST starts May 21 at 02:00:00 and ends October 01 at 02:59:59
1916-10-01 - 1917-04-08: Daylight delta: +00; DST starts October 01 at 02:00:00 and ends April 08 at 01:59:59
...
1967-03-19 - 1967-10-29: Daylight delta: +01; DST starts March 19 at 02:00:00 and ends October 29 at 02:59:59
1967-10-29 - 1968-02-18: Daylight delta: +00; DST starts October 29 at 02:00:00 and ends February 18 at 01:59:59
1968-02-18 - 1968-10-26: Daylight delta: +01; DST starts February 18 at 02:00:00 and ends October 26 at 23:59:59
1968-10-26 - 1971-10-31: Daylight delta: +00; DST starts October 26 at 23:00:00 and ends October 31 at 01:59:59
1971-10-31 - 1972-03-19: Daylight delta: +00; DST starts October 31 at 02:00:00 and ends March 19 at 01:59:59
1972-03-19 - 1972-10-29: Daylight delta: +01; DST starts March 19 at 02:00:00 and ends October 29 at 02:59:59
1972-10-29 - 1973-03-18: Daylight delta: +00; DST starts October 29 at 02:00:00 and ends March 18 at 01:59:59
...
2022-03-27 - 2022-10-30: Daylight delta: +01; DST starts March 27 at 01:00:00 and ends October 30 at 01:59:59
2022-10-30 - 2023-03-26: Daylight delta: +00; DST starts October 30 at 01:00:00 and ends March 26 at 00:59:59
...
2036-03-30 - 2036-10-26: Daylight delta: +01; DST starts March 30 at 01:00:00 and ends October 26 at 01:59:59
2036-10-26 - 2037-03-29: Daylight delta: +00; DST starts October 26 at 01:00:00 and ends March 29 at 00:59:59
2037-03-29 - 2037-10-25: Daylight delta: +01; DST starts March 29 at 01:00:00 and ends October 25 at 01:59:59
2037-10-25 - 9999-12-31: Daylight delta: +01; DST starts October 25 at 01:00:00 and ends December 31 at 23:59:59

.NET 6:

Zone ID: Europe/London
Display name: (UTC+00:00) United Kingdom Time
Standard name: Greenwich Mean Time
Daylight name: British Summer Time
Base offset: 00:00:00
Supports DST: True
Rules:
0001-01-01 - 0001-12-31: Base UTC offset delta: -00:01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
0002-01-01 - 1846-12-31: Base UTC offset delta: -00:01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
1847-01-01 - 1847-12-01: Base UTC offset delta: -00:01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 01 at 00:01:14.999
1916-05-21 - 1916-10-01: Daylight delta: +01; DST starts May 21 at 02:00:00 and ends October 01 at 02:59:59.999
1917-04-08 - 1917-09-17: Daylight delta: +01; DST starts April 08 at 02:00:00 and ends September 17 at 02:59:59.999
1918-03-24 - 1918-09-30: Daylight delta: +01; DST starts March 24 at 02:00:00 and ends September 30 at 02:59:59.999
...
1967-03-19 - 1967-10-29: Daylight delta: +01; DST starts March 19 at 02:00:00 and ends October 29 at 02:59:59.999
1968-02-18 - 1968-10-26: Daylight delta: +01; DST starts February 18 at 02:00:00 and ends October 26 at 23:59:59.999
1968-10-26 - 1968-12-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts October 26 at 23:00:00 and ends December 31 at 23:59:59.999
1969-01-01 - 1970-12-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
1971-01-01 - 1971-10-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends October 31 at 01:59:59.999
1972-03-19 - 1972-10-29: Daylight delta: +01; DST starts March 19 at 02:00:00 and ends October 29 at 02:59:59.999
1973-03-18 - 1973-10-28: Daylight delta: +01; DST starts March 18 at 02:00:00 and ends October 28 at 02:59:59.999
...
2022-03-27 - 2022-10-30: Daylight delta: +01; DST starts March 27 at 01:00:00 and ends October 30 at 01:59:59.999
...
2037-03-29 - 2037-10-25: Daylight delta: +01; DST starts March 29 at 01:00:00 and ends October 25 at 01:59:59.999
2037-10-25 - 9999-12-31: Daylight delta: +01; DST starts Last Sunday of March; 01:00:00 and ends Last Sunday of October; 02:00:00

Wow… that’s quite a difference. Let’s see:

  • The names (display/standard/daylight) are all different – definitely better in .NET 6.
  • .NET 6 appears to have one rule for the year 1, and then another (but identical) for years 2 to 1846
  • .NET 6 doesn’t have any rules between 1847 and 1916
  • .NET 6 only uses one rule per year, starting and ending at the DST boundaries; .NET Core 3.1 had one rule for each transition
  • The .NET Core 3.1 rules end at 59 minutes past the hour (e.g. 01:59:59) whereas the .NET 6 rules finish 999 milliseconds later

Fixing the code

So my task is to “interpret” all of this rule data in Noda Time, bearing in mind that:

  • It needs to work with Windows data as well (which has its own quirks)
  • It probably shouldn’t change logic based on which target framework it was built against, as I suspect it’s entirely possible
    for the DLL targeting .NET Standard 2.0 to end up running in .NET 6.

We do already have code that behaves differently based on whether it believes
the rule data comes from Windows or Unix – Windows rules always start on January 1st and end on December 31st, so if all
the rules in a zone follow that pattern, we assume we’re dealing with Windows data. That makes it slightly easier.

Likewise, we already have code that assumes any gaps between rules are in standard time – so actually the fact that .NET 6 only reports half as many rules probably won’t cause a problem.

Let’s start by handling the difference of transitions finishing at x:59:59 vs x:59:59.999. The existing code always adds 1 second to the end time, to account for x:59:59. It’s easy enough to adjust that to add either 1 second or 1 millisecond. This error was what caused our maps to have problems, I suspect. (We’d have a very weird situation in a few cases where one map started after the previous one ended.)

// This is added to the declared end time, so that it becomes an exclusive upper bound.
var endTimeCompensation = Duration.FromSeconds(1) - Duration.FromMilliseconds(bclLocalEnd.Millisecond);

Let’s try it: dotnet test -f net6.0

Good grief. Everything passed. Better try it with 3.1 as well: dotnet test -f netcoreapp3.1

Yup, everything passed there, too. And on Windows, although that didn’t surprise me much, given that we have separate paths.

This surprises me for two reasons:

  • Last night, when just experimenting, I made a change to just subtract bclLocalEnd.Millisecond milliseconds from bclLocalEnd (i.e. truncate it down). That helped a lot, but didn’t fix everything.
  • The data has changed really quite substantially, so I’m surprised that there aren’t extra issues. Do we get the “standard offset” correct during the British Standard Time experiment, for example?

I’m somewhat suspicious of the first bullet point… so I’m going to stash the fix, and try to reproduce last night.

Testing an earlier partial fix (or not…)

First, I remember that I did something I definitely wanted to keep last night. When adjacent maps don’t abut, let’s throw a better exception.

So before I do anything else, let’s reproduce the original errors: dotnet test -f net6.0

Ah. It still passes. Doh! When I thought I was running the .NET 6 tests under Linux, it turned out I was still in a Windows tab in Windows Terminal. (I use bash in all my terminals, so there’s not quite as much distinction as you might expect.) Well, that at least explains why the small fix worked rather better than expected. Sigh.

Okay, let’s rerun the tests… and they fail as expected. Now let’s add more details to the exception before reapplying the fix… done.

The resulting exception is clearer, and makes it obvious that the error is due to the 999ms discrepancy:

NodaTime.Utility.DebugPreconditionException : Maps must abut: 0002-01-01T00:00:00.999 != 0002-01-01T00:00:00

Let’s reapply the fix from earlier, which we expect to solve that problem but not everything. Retest… and we’re down to 109 failures rather than 1688. Much better, but not great.

Let’s understand one new error

We’re still getting errors of non-abutting maps, but now they’re (mostly) an hour out, rather than 999ms. Here’s one from Europe/Prague:

NodaTime.Utility.DebugPreconditionException : Maps must abut: 1947-01-01T00:00:00 != 1946-12-31T23:00:00

Most errors are in the 20th century, although there are some in 2038 and 2088, which is odd. Let’s have a look at the raw data for Prague around the time that’s causing problems, and we can see whether fixing just Prague helps with anything else.

.NET 6 data:

1944-04-03 - 1944-10-02: Daylight delta: +01; DST starts April 03 at 02:00:00 and ends October 02 at 02:59:59.999
1945-04-02 - 1945-05-08: Daylight delta: +01; DST starts April 02 at 02:00:00 and ends May 08 at 23:59:59.999
1945-05-08 - 1945-10-01: Daylight delta: +01; DST starts May 08 at 23:00:00 and ends October 01 at 02:59:59.999
1946-05-06 - 1946-10-06: Daylight delta: +01; DST starts May 06 at 02:00:00 and ends October 06 at 02:59:59.999
1946-12-01 - 1946-12-31: Daylight delta: -01; DST starts December 01 at 03:00:00 and ends December 31 at 23:59:59.999
1947-01-01 - 1947-02-23: Daylight delta: -01; DST starts January 01 at 00:00:00 and ends February 23 at 01:59:59.999
1947-04-20 - 1947-10-05: Daylight delta: +01; DST starts April 20 at 02:00:00 and ends October 05 at 02:59:59.999
1948-04-18 - 1948-10-03: Daylight delta: +01; DST starts April 18 at 02:00:00 and ends October 03 at 02:59:59.999
1949-04-09 - 1949-10-02: Daylight delta: +01; DST starts April 09 at 02:00:00 and ends October 02 at 02:59:59.999
1979-04-01 - 1979-09-30: Daylight delta: +01; DST starts April 01 at 02:00:00 and ends September 30 at 02:59:59.999

This is interesting – most years have just one rule, but the three years of 1945-1947 have two rules each.

Let’s look at the .NET Core 3.1 representation – which comes from the same underlying file, as far as I’m aware:

1944-10-02 - 1945-04-02: Daylight delta: +00; DST starts October 02 at 02:00:00 and ends April 02 at 01:59:59
1945-04-02 - 1945-05-08: Daylight delta: +01; DST starts April 02 at 02:00:00 and ends May 08 at 23:59:59
1945-05-08 - 1945-10-01: Daylight delta: +01; DST starts May 08 at 23:00:00 and ends October 01 at 02:59:59
1945-10-01 - 1946-05-06: Daylight delta: +00; DST starts October 01 at 02:00:00 and ends May 06 at 01:59:59
1946-05-06 - 1946-10-06: Daylight delta: +01; DST starts May 06 at 02:00:00 and ends October 06 at 02:59:59
1946-10-06 - 1946-12-01: Daylight delta: +00; DST starts October 06 at 02:00:00 and ends December 01 at 02:59:59
1946-12-01 - 1947-02-23: Daylight delta: -01; DST starts December 01 at 03:00:00 and ends February 23 at 01:59:59
1947-02-23 - 1947-04-20: Daylight delta: +00; DST starts February 23 at 03:00:00 and ends April 20 at 01:59:59
1947-04-20 - 1947-10-05: Daylight delta: +01; DST starts April 20 at 02:00:00 and ends October 05 at 02:59:59
1947-10-05 - 1948-04-18: Daylight delta: +00; DST starts October 05 at 02:00:00 and ends April 18 at 01:59:59
1948-04-18 - 1948-10-03: Daylight delta: +01; DST starts April 18 at 02:00:00 and ends October 03 at 02:59:59
1948-10-03 - 1949-04-09: Daylight delta: +00; DST starts October 03 at 02:00:00 and ends April 09 at 01:59:59
1949-04-09 - 1949-10-02: Daylight delta: +01; DST starts April 09 at 02:00:00 and ends October 02 at 02:59:59
1949-10-02 - 1978-12-31: Daylight delta: +00; DST starts October 02 at 02:00:00 and ends December 31 at 23:59:59
1979-01-01 - 1979-04-01: Daylight delta: +00; DST starts January 01 at 00:00:00 and ends April 01 at 01:59:59

Okay, so that makes a certain amount of sense – it definitely shows that there was something unusual happening in the Europe/Prague time zone. Just as one extra point of data, let’s look at the nodatime.org tzvalidate results – this shows all transitions. (tzvalidate is a format designed to allow authors of time zone library code to validate that they’re interpreting the IANA data the same way as each other.)

Europe/Prague
Initially:           +01:00:00 standard CET
1944-04-03 01:00:00Z +02:00:00 daylight CEST
1944-10-02 01:00:00Z +01:00:00 standard CET
1945-04-02 01:00:00Z +02:00:00 daylight CEST
1945-10-01 01:00:00Z +01:00:00 standard CET
1946-05-06 01:00:00Z +02:00:00 daylight CEST
1946-10-06 01:00:00Z +01:00:00 standard CET
1946-12-01 02:00:00Z +00:00:00 daylight GMT
1947-02-23 02:00:00Z +01:00:00 standard CET
1947-04-20 01:00:00Z +02:00:00 daylight CEST
1947-10-05 01:00:00Z +01:00:00 standard CET
1948-04-18 01:00:00Z +02:00:00 daylight CEST
1948-10-03 01:00:00Z +01:00:00 standard CET
1949-04-09 01:00:00Z +02:00:00 daylight CEST
1949-10-02 01:00:00Z +01:00:00 standard CET

Again there’s that odd period from December 1946 to near the end of February 1947 where there’s daylight savings of -1 hour. I’m not interested in the history of that right now – I’m interested in why the code is failing.

In this particular case, it looks like the problem is we’ve got two adjacent rules in .NET 6 (one at the end of 1946 and the other at the start of 1947) which both just describe periods of daylight saving.

If we can construct the maps to give the right results, Noda Time already has code in to work out “that’s okay, there’s no transition at the end of 1946”. But we need to get the maps right to start with.

Unfortunately, BclDateTimeZone already has complicated code to handle the previously-known corner cases. That makes the whole thing feel quite precarious – I could easily end up breaking other things by trying to fix this one specific aspect. Still, that’s what unit tests are for.

Looking at the code, I suspect the problem is with the start time of the first rule of 1947, which I’d expect to start at 1947-01-01T00:00:00Z, but is actually deemed to start at 1946-12-31T23:00:00Z. (In the course of writing that out, I notice that my improved-abutting-error exception doesn’t include the “Z”. Fix that now…)

Ah… but the UTC start of the rule is currently expected to be “the start date + the transition start time – base UTC offset”. That does give 1946-12-31T23:00:00Z. We want to apply the daylight savings (of -1 hour) in this case, because the start of the rule is during daylight savings. Again, there’s no documentation to say exactly what is meant by “start date” for the rule, and hopefully you can see why it’s really frustrating to have to try to reverse-engineer this in a version-agnostic way. Hmm.

Seeking an unambiguous and independent interpretation of AdjustmentRule

It’s relatively easy to avoid the “maps don’t abut” issue if we don’t care about really doing the job properly. After converting each AdjustmentRule to its Noda Time equivalent, we can look at rule pair of adjacent rules in the sequence: if the start of the “next” rule is earlier than the end of the “previous” rule, we can just adjust the start point. But that’s really just brushing the issue under the carpet – and as it happens, it just moves the exception to a different point.

That approach also requires knowledge of surrounding adjustment rules in order to completely understand one adjustment rule. That really doesn’t feel right to me. We should be able to understand the adjustment rule purely from the data exposed by that rule and the properties for the TimeZoneInfo itself. The code is already slightly grubby by calling TimeZoneInfo.IsDaylightSavingTime(). If I could work out how to remove that call too, that would be great. (It may prove infeasible to remove it for .NET Core 3.1, but feasible in 6. That’s not too bad. Interesting question: if the “grubby” code still works in .NET 6, is it better to use conditional code so that only the “clean” code is used in .NET 6, or avoid the conditional code? Hmm. We’ll see.)

Given that the rules in both .NET Core 3.1 and .NET 6 effectively mean that the start and end points are exactly the start and end points of DST (or other) transitions, I should be able to gather a number of examples of source data and expected results, and try to work out rules from that. In particular, this source data should include:

  • “Simple” situations (partly as a warm-up…)
  • Negative standard time offset (e.g. US time zones)
  • Negative savings (e.g. Prague above, and Europe/Dublin from 1971 onwards)
  • DST periods that cross year boundaries (primarily the southern hemisphere, e.g. America/Sao_Paulo)
  • Zero savings, but still in DST (Europe/Dublin before 1968)
  • Standard UTC offset changes (e.g. Europe/London 1968-1971, Europe/Moscow from March 2011 to October 2014)
  • All of the above for both .NET Core 3.1 and .NET 6, including the rules which represent standard time in .NET Core 3.1 but which are omitted in .NET 6

It looks like daylight periods which cross year boundaries are represented as single rules in .NET Core 3.1 and dual rules in .NET 6, so we’ll need to take that into account. In those cases we’ll need to map to two Noda Time rules, and we don’t mind where the transition between them is, so long as they abut. In general, working out the zone intervals that are relevant for a single year may require multiple lines of data from each source. (But we must be able to infer some of that from gaps, and other parts from individual source rules.)

Fortunately we’re not trying to construct “full rules” within Noda Time – just ZoneInterval values, effectively. All we need to be able to determine is:

  • Start instant
  • End instant
  • Standard offset
  • Daylight savings (if any)

When gathering the data, I’m going to assume that using the existing Noda Time interpretation of the IANA data is okay. That could be dangerous if either .NET interprets the data incorrectly, or if the Linux data isn’t the same as the IANA 2021e data I’m working from. There are ways to mitigate those risks, but they would be longwinded and I don’t think the risk justifies the extra work.

What’s absolutely vital is that the data is gathered carefully. If I mess this up (looking at the wrong time zone, or the wrong year, or running some code on Windows that I meant to run on Linux – like the earlier tests) it could several hours of work. This will be tedious.

Let’s gather some data…

Europe/Paris in 2020:
.NET Core 3.1:
Base offset = 1
2019-10-27 - 2020-03-29: Daylight delta: +00; DST starts October 27 at 02:00:00 and ends March 29 at 01:59:59
2020-03-29 - 2020-10-25: Daylight delta: +01; DST starts March 29 at 02:00:00 and ends October 25 at 02:59:59
2020-10-25 - 2021-03-28: Daylight delta: +00; DST starts October 25 at 02:00:00 and ends March 28 at 01:59:59

.NET 6:
Base offset = 1
2019-03-31 - 2019-10-27: Daylight delta: +01; DST starts March 31 at 02:00:00 and ends October 27 at 02:59:59.999
2020-03-29 - 2020-10-25: Daylight delta: +01; DST starts March 29 at 02:00:00 and ends October 25 at 02:59:59.999
2021-03-28 - 2021-10-31: Daylight delta: +01; DST starts March 28 at 02:00:00 and ends October 31 at 02:59:59.999

Noda Time zone intervals (start - end, standard, savings):
2019-10-27T01:00:00Z - 2020-03-29T01:00:00Z, +1, +0
2020-03-29T01:00:00Z - 2020-10-25T01:00:00Z, +1, +1
2020-10-25T01:00:00Z - 2021-03-28T01:00:00Z, +1, +0


America/Los_Angeles in 2020:

.NET Core 3.1:
Base offset = -8
2019-11-03 - 2020-03-08: Daylight delta: +00; DST starts November 03 at 01:00:00 and ends March 08 at 01:59:59
2020-03-08 - 2020-11-01: Daylight delta: +01; DST starts March 08 at 02:00:00 and ends November 01 at 01:59:59
2020-11-01 - 2021-03-14: Daylight delta: +00; DST starts November 01 at 01:00:00 and ends March 14 at 01:59:59

.NET 6:
Base offset = -8
2019-03-10 - 2019-11-03: Daylight delta: +01; DST starts March 10 at 02:00:00 and ends November 03 at 01:59:59.999
2020-03-08 - 2020-11-01: Daylight delta: +01; DST starts March 08 at 02:00:00 and ends November 01 at 01:59:59.999
2021-03-14 - 2021-11-07: Daylight delta: +01; DST starts March 14 at 02:00:00 and ends November 07 at 01:59:59.999

Noda Time zone intervals:
2019-11-03T09:00:00Z - 2020-03-08T10:00:00Z, -8, +0
2020-03-08T10:00:00Z - 2020-11-01T09:00:00Z, -8, +1
2020-11-01T09:00:00Z - 2021-03-14T10:00:00Z, -8, +0


Europe/Prague in 1946/1947:

.NET Core 3.1:
Base offset = 1
1945-10-01 - 1946-05-06: Daylight delta: +00; DST starts October 01 at 02:00:00 and ends May 06 at 01:59:59
1946-05-06 - 1946-10-06: Daylight delta: +01; DST starts May 06 at 02:00:00 and ends October 06 at 02:59:59
1946-10-06 - 1946-12-01: Daylight delta: +00; DST starts October 06 at 02:00:00 and ends December 01 at 02:59:59
1946-12-01 - 1947-02-23: Daylight delta: -01; DST starts December 01 at 03:00:00 and ends February 23 at 01:59:59
1947-02-23 - 1947-04-20: Daylight delta: +00; DST starts February 23 at 03:00:00 and ends April 20 at 01:59:59
1947-04-20 - 1947-10-05: Daylight delta: +01; DST starts April 20 at 02:00:00 and ends October 05 at 02:59:59
1947-10-05 - 1948-04-18: Daylight delta: +00; DST starts October 05 at 02:00:00 and ends April 18 at 01:59:59
1948-04-18 - 1948-10-03: Daylight delta: +01; DST starts April 18 at 02:00:00 and ends October 03 at 02:59:59

.NET 6:
Base offset = 1
1945-05-08 - 1945-10-01: Daylight delta: +01; DST starts May 08 at 23:00:00 and ends October 01 at 02:59:59.999
1946-05-06 - 1946-10-06: Daylight delta: +01; DST starts May 06 at 02:00:00 and ends October 06 at 02:59:59.999
1946-12-01 - 1946-12-31: Daylight delta: -01; DST starts December 01 at 03:00:00 and ends December 31 at 23:59:59.999
1947-01-01 - 1947-02-23: Daylight delta: -01; DST starts January 01 at 00:00:00 and ends February 23 at 01:59:59.999
1947-04-20 - 1947-10-05: Daylight delta: +01; DST starts April 20 at 02:00:00 and ends October 05 at 02:59:59.999
1948-04-18 - 1948-10-03: Daylight delta: +01; DST starts April 18 at 02:00:00 and ends October 03 at 02:59:59.999

Noda Time zone intervals:
1945-10-01T01:00:00Z - 1946-05-06T01:00:00Z, +1, +0
1946-05-06T01:00:00Z - 1946-10-06T01:00:00Z, +1, +1
1946-10-06T01:00:00Z - 1946-12-01T02:00:00Z, +1, +0
1946-12-01T02:00:00Z - 1947-02-23T02:00:00Z, +1, -1
1947-02-23T02:00:00Z - 1947-04-20T01:00:00Z, +1, +0
1947-04-20T01:00:00Z - 1947-10-05T01:00:00Z, +1, +1
1947-10-05T01:00:00Z - 1948-04-18T01:00:00Z, +1, +0


Europe/Dublin in 2020:

.NET Core 3.1:
Base offset = 1
2019-10-27 - 2020-03-29: Daylight delta: -01; DST starts October 27 at 02:00:00 and ends March 29 at 00:59:59
2020-03-29 - 2020-10-25: Daylight delta: +00; DST starts March 29 at 02:00:00 and ends October 25 at 01:59:59
2020-10-25 - 2021-03-28: Daylight delta: -01; DST starts October 25 at 02:00:00 and ends March 28 at 00:59:59

.NET 6.0:
Base offset = 1
2019-10-27 - 2019-12-31: Daylight delta: -01; DST starts October 27 at 02:00:00 and ends December 31 at 23:59:59.999
2020-01-01 - 2020-03-29: Daylight delta: -01; DST starts January 01 at 00:00:00 and ends March 29 at 00:59:59.999
2020-10-25 - 2020-12-31: Daylight delta: -01; DST starts October 25 at 02:00:00 and ends December 31 at 23:59:59.999
2021-01-01 - 2021-03-28: Daylight delta: -01; DST starts January 01 at 00:00:00 and ends March 28 at 00:59:59.999

Noda Time zone intervals:
2019-10-27T01:00:00Z - 2020-03-29T01:00:00Z, +1, -1
2020-03-29T01:00:00Z - 2020-10-25T01:00:00Z, +1, +0
2020-10-25T01:00:00Z - 2021-03-28T01:00:00Z, +1, -1


Europe/Dublin in 1960:

.NET Core 3.1:
Base offset = 1
1959-10-04 - 1960-04-10: Daylight delta: +00; DST starts October 04 at 03:00:00 and ends April 10 at 02:59:59
1960-04-10 - 1960-10-02: Daylight delta: +00; DST starts April 10 at 03:00:00 and ends October 02 at 02:59:59

.NET 6.0:
Base offset = 1
1959-10-04 - 1959-12-31: Base UTC offset delta: -01; Daylight delta: +00; DST starts October 04 at 03:00:00 and ends December 31 at 23:59:59.999
1960-01-01 - 1960-04-10: Base UTC offset delta: -01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends April 10 at 02:59:59.999
1960-04-10 - 1960-10-02: Daylight delta: +00; DST starts April 10 at 03:00:00 and ends October 02 at 02:59:59.999
1960-10-02 - 1960-12-31: Base UTC offset delta: -01; Daylight delta: +00; DST starts October 02 at 03:00:00 and ends December 31 at 23:59:59.999
1961-01-01 - 1961-03-26: Base UTC offset delta: -01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends March 26 at 02:59:59.999

Noda Time zone intervals:
1959-10-04T02:00:00Z - 1960-04-10T02:00:00Z, +0, +0
1960-04-10T02:00:00Z - 1960-10-02T02:00:00Z, +0, +1
1960-10-02T02:00:00Z - 1961-03-26T02:00:00Z, +0, +0


America/Sao_Paulo in 2018 (not 2020, as Brazil stopped observing daylight savings in 2019):

.NET Core 3.1:
Base offset = -3
2017-10-15 - 2018-02-17: Daylight delta: +01; DST starts October 15 at 00:00:00 and ends February 17 at 23:59:59
2018-02-17 - 2018-11-03: Daylight delta: +00; DST starts February 17 at 23:00:00 and ends November 03 at 23:59:59
2018-11-04 - 2019-02-16: Daylight delta: +01; DST starts November 04 at 00:00:00 and ends February 16 at 23:59:59

.NET 6.0:
Base offset = -3
2017-10-15 - 2017-12-31: Daylight delta: +01; DST starts October 15 at 00:00:00 and ends December 31 at 23:59:59.999
2018-01-01 - 2018-02-17: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends February 17 at 23:59:59.999
2018-11-04 - 2018-12-31: Daylight delta: +01; DST starts November 04 at 00:00:00 and ends December 31 at 23:59:59.999
2019-01-01 - 2019-02-16: Daylight delta: +01; DST starts January 01 at 00:00:00 and ends February 16 at 23:59:59.999

Noda Time zone intervals:
2017-10-15T03:00:00Z - 2018-02-18T02:00:00Z, -3, +1
2018-02-18T02:00:00Z - 2018-11-04T03:00:00Z, -3, +0
2018-11-04T03:00:00Z - 2019-02-17T02:00:00Z, -3, +1


Europe/London in 1968-1971

.NET Core 3.1:
Base offset = 0
1968-10-26 - 1971-10-31: Daylight delta: +00; DST starts October 26 at 23:00:00 and ends October 31 at 01:59:59

.NET 6:
Base offset = 0
1968-10-26 - 1968-12-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts October 26 at 23:00:00 and ends December 31 at 23:59:59.999
1969-01-01 - 1970-12-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
1971-01-01 - 1971-10-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends October 31 at 01:59:59.999

Noda Time zone intervals:
1968-10-26T23:00:00Z - 1971-10-31T02:00:00Z, +1, +0


Europe/Moscow in 2011-2014

.NET Core 3.1:
Base offset = 3
2011-03-27 - 2014-10-26: Daylight delta: +00; DST starts March 27 at 02:00:00 and ends October 26 at 00:59:59

.NET 6:
Base offset = 3
2011-03-27 - 2011-12-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts March 27 at 02:00:00 and ends December 31 at 23:59:59.999
2012-01-01 - 2013-12-31: Base UTC offset delta: +01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends December 31 at 23:59:59.999
2014-01-01 - 2014-10-26: Base UTC offset delta: +01; Daylight delta: +00; DST starts January 01 at 00:00:00 and ends October 26 at 00:59:59.999

Noda Time zone intervals:
2011-03-26T23:00:00Z - 2014-10-25T22:00:00Z, +4, +0

I think that forcing myself to collect these small bits of data and write them down will be a bit of a game-changer. Previously I’ve taken handwritten notes for individual issues, relying on the “global” unit tests (check every transition in every time zone) to catch any problems after I’d implemented them. But with the data above, I can write unit tests. And those unit tests don’t need to depend on whether we’re running on Windows and Linux, which will make the whole thing much simpler. We’re not testing an actual time zone – we’re testing “adjustment rule to Noda Time representation” with adjustment rules as they would show up on Linux.

There’s one slightly fiddly bit: I suspect that detecting “base UTC offset delta” for .NET Core 3.1 will require the time zone itself (as we can’t get to the rule data). I might get all the rest of the unit tests working first (and even the non-zero-delta ones for .NET 6) and come back to that.

That’s all for now…

I’ve now implemented the above test data in uncommitted code. After starting to include strings directly into the code, I’ve decided to put all the test data in a text file, pretty much as it’s specified above (just with very minor formatting changes). This is going to be really handy in terms of having readable test cases; I’m already glad I’ve put the effort into it.

However, I’ve discovered that it’s incomplete, as we need test cases for offset changes across the international date line (in both directions). It’s also possible that the choice of America/Sao_Paulo is unfortunate, as Brazil changed clocks at midnight. We might want an example in Australia as well. (Potentially even two: one with whole hour offsets and one with half hour offsets.)

Even without that additional test, there are issues. I can get all but “Europe/Dublin in 1968” to work in .NET 6. I haven’t yet worked out how to handle changing standard offsets in .NET Core 3.1 in a testable way. Even the fact that standard offsets can change is a pain, in terms of working out the transition times in .NET 6, as it appears to be something like “Assume the start is in standard time and the end is in daylight time, except don’t take any standard time deltas into account when calculating that” – which is very weird indeed. (And I don’t understand how the Europe/Dublin data in .NET 6 is meant to convey the expected data. It’s very odd.)

This post is quite long enough though, so I’m going to post it now and take a break from time zones for a bit. Hopefully I’ll post a “part 2” when I’ve actually got everything working.

Just as a reminder, supposedly these changes in .NET 6: “don’t affect external behavior, other than to ensure correctness in some edge cases”. Mmm. Really.

Using “git bash” from AppVeyor

Update: I don’t know whether it was partially due to this blog post or not, but AppVeyor has fixed things so that you don’t (currently, 20th October 2019) need to use the fix in this post. You may want to include it anyway, for the sake of future-proofing.


TL;DR: If your AppVeyor build starts breaking because it’s started using WSL bash, change the path in your YAML file – see the end of the post for an example.

For years now, I’ve used bash scripts for all kinds of automation in Windows projects. The version of bash I use is the one that comes with Git for Windows – I believe its origins include Cygwin, MSYS2, and MinGW-w64. (I don’t know enough about the differences between those projects or which builds on which to say more. Fortunately, I don’t need to.) This version of bash is installed by default AppVeyor, the CI system I typically use for Windows builds, so I don’t need to do anything else.

Importantly, I don’t want to use Windows Subsystem for Linux (WSL) on Windows builds. The point of doing the build is to use the Windows tool chains. I use Travis for doing Linux builds.

On October 11th 2019, my Noda Time AppVeyor build failed with this error:

build/appveyor.sh: line 11: dotnet: command not found

It turns out this is because AppVeyor has started shipping WSL with its Visual Studio 2019 images. The bash from WSL is earlier in the path than the version of bash from git, so that one is used, and everything starts failing.

It took a little while to diagnose this, but the fix is pretty easy – you just need to put git bash earlier in your path. I chose to do this in the “install” part of appveyor.yml:

install:
  # Make sure we get the bash that comes with git, not WSL bash
  - ps: $env:Path = "C:\Program Files\Git\bin;$env:Path"

Using just that change, the build started working again. Hooray!

Storing UTC is not a silver bullet

Note: this is a pretty long post. If you’re not interested in the details, the conclusion at the bottom is intended to be read in a standalone fashion. There’s also a related blog post by Lau Taarnskov – if you find this one difficult to read for whatever reason, maybe give that a try.

When I read Stack Overflow questions involving time zones, there’s almost always someone giving the advice to only ever store UTC. Convert to UTC as soon as you can, and convert back to a target time zone as late as you can, for display purposes, and you’ll never have a time zone issue again, they say.

This blog post is intended to provide a counterpoint to that advice. I’m certainly not saying storing UTC is always the wrong thing to do, but it’s not always the right thing to do either.

Note on simplifications: this blog post does not go into supporting non-Gregorian calendar systems, or leap seconds. Hopefully developers writing applications which need to support either of those are already aware of their requirements.

Background: EU time zone rule changes

The timing of this blog post is due to recent European Parliament proceedings that look like they will probably end the clocks changing twice a year into “summer time” or “winter time” within EU member states. The precise details are yet to be finalized and are unimportant to the bigger point, but for the purpose of this blog post I’ll assume that each member state has to decide whether they will “spring forward” one last time on March 28th 2021, then staying in permanent “summer time”, or “fall back” one last time on October 31st 2021, then staying in permanent “winter time”. So from November 1st 2021 onwards, the UTC offset of each country will be fixed – but there may be countries which currently always have the same offset as each other, and will have different offsets from some point in 2021. (For example, France could use winter time and Germany could use summer time.)

The larger point is that time zone rules change, and that applications should expect that they will change. This isn’t a corner case, it’s the normal way things work. There are usually multiple sets of rule changes (as released by IANA) each year. At least in the European changes, we’re likely to have a long notice period. That often isn’t the case – sometimes we don’t find out about rule changes until a few days before they happen.

Application example

For the sake of making everything concrete, I’m going to imagine that we’re writing an application to help conference organizers. A conference organizer can create a conference within the application, specifying when and where it’s happening, and (amongst other things) the application will display a countdown timer of “the number of hours left before the start of the conference”. Obviously a real application would have a lot more going on than this, but that’s enough to examine the implementation options available.

To get even more concrete, we’ll assume that a conference organizer has registered a conference called “KindConf” and has said that it will start at 9am in Amsterdam, on July 10th 2022. They perform this registration on March 27th 2019, when the most recently published IANA time zone database is 2019a, which predicts that the offset observed in Amsterdam on July 10th 2022 will be UTC+2.

For the sake of this example, we’ll assume that the Netherlands decides to fall back on October 31st 2021 for one final time, leaving them on a permanent offset of UTC+1. Just to complete the picture, we’ll assume that this decision is taken on February 1st 2020, and that IANA publishes the changes on March 14th 2020, as part of release 2020c.

So, what can the application developer do? In all the options below, I have not gone into details of the database support for different date/time types. This is important, of course, but probably deserves a separate blog post in its own right, on a per-database basis. I’ll just assume we can represent the information we want to represent, somehow.

Interlude: requirements

Before we get to the implementations, I’ll just mention a topic that’s been brought up a few times in the comments and on Twitter. I’ve been assuming that the conference does still occur at 9am on July 10th 2022… in other words, that the “instant in time at which the conference starts” changes when the rules change.

It’s unlikely that this would ever show up in a requirements document. I don’t remember ever being in a meeting with a product manager where they’d done this type of contingency planning. If you’re lucky, someone would work out that there’s going to be a problem long before the rules actually change. At that point, you’d need to go through the requirements and do the implementation work. I’d argue that this isn’t a new requirement – it’s a sort of latent, undiscovered requirement you’ve always had, but you hadn’t known about before.

Now, back to the options…

Option 1: convert to UTC and just use that forever

The schema for the Conferences table in the database might look like this:

  • ID: auto-incremented integer
  • Name: string
  • Start: date/time in UTC
  • Address: string

The entry for KindConf would look like this:

  • ID: 1
  • Name: KindConf
  • Start: 2022-07-10T07:00:00Z
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands

That entry is then preserved forever, without change. So what happens to our countdown timer?

Result

The good news is that anyone observing the timer will see it smoothly count down towards 0, with no jumps. The bad news is that when it reaches 0, the conference won’t actually start – there’ll be another hour left. This is not good.

Option 2: convert to UTC immediately, but reconvert after rule changes

The schema for the Conferences table would preserve the time zone ID. (I’m using the IANA ID for simplicity, but it could be the Windows system time zone ID, if absolutely necessary.) Alternatively, the time zone ID could be derived each time it’s required – more on that later.

  • ID: auto-incremented integer
  • Name: string
  • Start: date/time in UTC
  • Address: string
  • Time zone ID: string

The initial entry for KindConf would look like this:

  • ID: 1
  • Name: KindConf
  • Start: 2022-07-10T07:00:00Z
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam

On March 14th 2020, when the new time zone database is released, that entry could be changed to make the start time accurate again:

  • ID: 1
  • Name: KindConf
  • Start: 2022-07-10T08:00:00Z
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam

But what does that “change” procedure look like? We need to convert the UTC value back to the local time, and then convert back to UTC using different rules. So which rules were in force when that entry was created? It looks like we actually need an extra field in the schema somewhere: TimeZoneRulesVersion. This could potentially be a database-wide value, although that’s only going to be reasonable if you can update all entries and that value atomically. Allowing a value per entry (even if you usually expect all entries to be updated at roughly the same time) is likely to make things simpler.

So our original entry was actually:

  • ID: 1
  • Name: KindConf
  • Start: 2022-07-10T07:00:00Z
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam
  • TimeZoneRules: 2019a

And the modified entry is:

  • ID: 1
  • Name: KindConf
  • Start: 2022-07-10T08:00:00Z
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam
  • TimeZoneRules: 2020c

Of course, the entry could have been updated many times over the course of time, for 2019b, 2019c, …, 2020a, 2020b. Or maybe we only actually update the entry if the start time changes. Either way works.

Result

Now, anyone refreshing the countdown timer for the event will see the counter increase by an hour when the entry is updated. That may look a little odd – but it means that when the countdown timer reaches 0, the conference is ready to start. I’m assuming this is the desired behaviour.

Implementation

Let’s look at roughly what would be needed to perform this update in C# code. I’ll assume the use of Noda Time to start with, but then we’ll consider what happens if you’re not using Noda Time.

public class Conference
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public Instant Start { get; set; }
    public string TimeZoneId { get; set; }
    public string TimeZoneRules { get; set; }
}

// In other code... some parameters might be fields in the class.
public void UpdateStartTime(
    Conference conference,
    Dictionary<string, IDateTimeZoneProvider> timeZoneProvidersByVersion,
    string latestRules)
{
    // Map the start instant into the time zone using the old rules
    IDateTimeZoneProvider oldProvider = timeZoneProvidersByVersion[conference.TimeZoneRules];
    DateTimeZone oldZone = oldProvider[conference.TimeZoneId];
    ZonedDateTime oldZonedStart = conference.Start.InZone(oldZone);   

    IDateTimeZoneProvider newProvider = timeZoneProvidersByVersion[latestRules];
    DateTimeZone newZone = newProvider[conference.TimeZoneId];
    // Preserve the local time, but with the new time zone rules
    ZonedDateTime newZonedStart = oldZonedStart.LocalDateTime.InZoneLeniently(newZone);

    // Update the conference entry with the new information
    conference.Start = newZonedStart.ToInstant();
    conference.TimeZoneRules = latestRules;
}

The InZoneLeniently call is going to be a common issue – we’ll look at that later (“Ambiguous and skipped times”).

This code would work, and Noda Time would make it reasonably straightforward to build that dictionary of time zone providers, as we publish all the “NZD files” we’ve ever created from 2013 onwards on the project web site. If the code is being updated with the latest stable version of the NodaTime NuGet package, the latestRules parameter wouldn’t be required – DateTimeZoneProviders.Tzdb could be used instead. (And IDateTimeZoneProvider.VersionId could obtain the current version.)

However, this approach has three important requirements:

  • The concept of “version of time zone rules” has to be available to you
  • You have to be able to load a specific version of the time zone rules
  • You have to be able to use multiple versions of the time zone rules in the same application

If you’re using C# but relying on TimeZoneInfo then… good luck with any of those three. (It’s no doubt feasible, but far from simple out of the box, and it may require an external service providing historical data.)

I can’t easily comment on other platforms in any useful way, but I suspect that dealing with multiple versions of time zone data is not something that most developers come across.

Option 3: preserve local time, using UTC as derived data to be recomputed

Spoiler alert: this is my preferred option.

In this approach, the information that the conference organizer supplied (“9am on July 10th 2022”) is preserved and never changed. There is additional information in the entry that is changed when the time zone database is updated: the converted UTC instant. We can also preserve the version of the time zone rules used for that computation, as a way of allowing the process of updating entries to be restarted after a failure without starting from scratch, but it’s not strictly required. (It’s also probably useful as diagnostic information, too.)

The UTC instant is only stored at all for convenience. Having a UTC representation makes it easier to provide total orderings of when things happen, and also to compute the time between “right now” and the given instant, for the countdown timer. Unless it’s actually useful to you, you could easily omit it entirely. (My Noda Time benchmarks suggest it’s unlikely that doing the conversion on every request wouldn’t cause a bottleneck. A single local-to-UTC conversion on my not-terribly-fast benchmark machine only takes ~150ns. In most environments that’s close to noise. But for cases where it’s relevant, it’s fine to store the UTC as described below.)

So the schema would have:

  • ID: auto-incremented integer
  • Name: string
  • Local start: date/time in the specified time zone
  • Address: string
  • Time zone ID: string
  • UTC start: derived field for convenience
  • Time zone rules version: for optimization purposes

So our original entry is:

  • ID: 1
  • Name: KindConf
  • LocalStart: 2022-07-10T09:00:00
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam
  • UtcStart: 2022-07-10T07:00:00Z
  • TimeZoneRules: 2019a

On March 14th 2020, when the time zone database 2020c is released, this is modified to:

  • ID: 1
  • Name: KindConf
  • LocalStart: 2022-07-10T09:00:00
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam
  • UtcStart: 2022-07-10T08:00:00Z
  • TimeZoneRules: 2020c

Result

This is the same as option 2: after the update, there’s a jump of an hour, but when it reaches 0, the conference starts.

Implementation

This time, we don’t need to convert our old UTC value back to a local value: the “old” time zone rules version and “old” UTC start time are irrelevant. That simplifies matter significantly:

public class Conference
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public LocalDateTime LocalStart { get; set; }
    public string TimeZoneId { get; set; }
    public Instant UtcStart { get; set; }
    public string TimeZoneRules { get; set; }
}

// In other code... some parameters might be fields in the class.
public void UpdateUtcStart(
    Conference conference,
    IDateTimeZoneProvider latestZoneProvider)
{
    DateTimeZone newZone = latestZoneProvider[conference.TimeZoneId];
    // Preserve the local time, but with the new time zone rules
    ZonedDateTime newZonedStart = conference.LocalStart.InZoneLeniently(newZone);

    // Update the conference entry with the new information
    conference.UtcStart = newZonedStart.ToInstant();
    conference.TimeZoneRules = latestZoneProvider.VersionId;
}

As the time zone rules version is now optional, this code could be ported to use TimeZoneInfo instead. Obviously from my biased perspective the code wouldn’t be as pleasant, but it would be at least reasonable. The same is probably true on other platforms.

So I prefer option 3, but is it really so different from option 2? We’re still storing the UTC value, right? That’s true, but I believe the difference is important because the UTC value is an optimization, effectively.

Principle of preserving supplied data

For me, the key difference between the options is that in option 3, we store and never change what the conference organizer entered. The organizer told us that the event would start at the given address in Amsterdam, at 9am on July 10th 2022. That’s what we stored, and that information never needs to change (unless the organizer wants to change it, of course). The UTC value is derived from that “golden” information, but can be re-derived if the context changes – such as when time zone rules change.

In option 2, we don’t store the original information – we only store derived information (the UTC instant). We need to store information to tell us all the context about how we derived it (the old time zone rules version) and when updating the entry, we need to get back to the original information before we can re-derive the UTC instant using the new rules.

If you’re going to need the original information anyway, why not just store that? The implementation ends up being simpler, and it means it doesn’t matter whether or not we even have the old time zone rules.

Representation vs information

It’s important to note that I’m only talking about preserving the core information that the organizer entered. For the purposes of this example at least, we don’t need to care about the representation they happened to use. Did they enter it as “July 10 2022 09:00” and we then parsed that? Did they use a calendar control that provided us with “2022-07-10T09:00”? I don’t think that’s important, as it’s not part of the core information.

It’s often a useful exercise to consider what aspects of the data you’re using are “core” and which are incidental. If you’re receiving data from another system as text for example, you probably don’t want to store the complete XML or JSON, as that choice between XML and JSON isn’t relevant – the same data could be represented by an XML file and a JSON file, and it’s unlikely that anything later will need to know or care.

A possible option 4?

I’ve omitted a fourth option which could be useful here, which is a mixture of 2 and 3. If you store a “date/time with UTC offset” then you’ve effectively got both the local start time and the UTC instant in a single field. To show the values again, you’d start off with:

  • ID: 1
  • Name: KindConf
  • Start: 2022-07-10T09:00:00+02:00
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam
  • TimeZoneRules: 2019a

On March 14th 2020, when the time zone database 2020c is released, this is modified to:

  • ID: 1
  • Name: KindConf
  • Start: 2022-07-10T09:00:00+01:00
  • Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
  • TimeZoneId: Europe/Amsterdam
  • TimeZoneRules: 2020c

In systems that support “date/time with UTC offset” well in both the database and the languages using it, this might be an attractive solution. It’s important to note that the time zone ID is still required (unless you derive it from the address whenever you need it) – there’s a huge difference between knowing the time zone that’s applied, and knowing the UTC offset in one specific situation.

Personally I’m not sure I’m a big fan of this option, as it combines original and derived data in a single field – the local part is the original data, and the offset is derived. I like the separation between original and derived data in option 3.

With all those options presented, let’s look at a few of the corner cases I’ve mentioned in the course of the post.

Ambiguous and skipped times

In both of the implementations I’ve shown, I’ve used the InZoneLeniently method from Noda Time. While the mapping from UTC instant to local time is always completely unambiguous for a single time zone, the reverse mapping (from local time to UTC instant) is not always unambiguous.

As an example, let’s take the Europe/London time zone. On March 31st 2019, at 1am local time, we will “spring forward” to 2am, changing offset from UTC+0 to UTC+1. On October 27th 2019, at 2am local time, we will “fall back” to 1am, changing offset from UTC+1 to UTC+0. That means that 2019-03-31T01:30 does not happen at all in the Europe/London time zone, and 2019-10-27T01:30 occurs twice.

Now it’s reasonable to validate this when a conference organizer specifies the starting time of a conference, either prohibiting it if the given time is skipped, or asking for more information if the given time is ambiguous. I should point out that this is highly unlikely for a conference, as transitions are generally done in the middle of the night – but other scenarios (e.g. when to schedule an automated backup) may well fall into this.

That’s fine at the point of the first registration, but it’s also possible that a previously-unambiguous local time could become ambiguous under new time zone rules. InZoneLeniently handles that in a way documented in the Resolvers.LenientResolver. That may well not be the appropriate choice for any given application, and developers should consider it carefully, and write tests.

Recurrent events

The example I’ve given so far is for a single event. Recurrent events – such as weekly meetings – end up being trickier still, as a change to time zone rules can change the offsets for some instances but not others. Likewise meetings may well be attended by people from more than a single time zone – so it’s vital that the recurrence would have a single coordinating time zone, but offsets may need to be recomputed for every time zone involved, and for every occurrence. Application developers have to think about how this can be achieved within performance requirements.

Time zone boundary changes and splits

So far we’ve only considered time zone rules changing. In options 2-4, we stored a time zone ID within the entry. That assumes that the time zone associated with the event will not change over time. That assumption may not be valid.

As far as I’m aware, time zone rules change more often than changes to which time zone any given location is in – but it’s entirely possible for things to change over time. Suppose the conference wasn’t in Amsterdam itself, but Rotterdam. Currently Rotterdam uses the Europe/Amsterdam time zone, but what if the Netherlands splits into two countries between 2019 and 2022? It’s feasible that by the time the conference occurs, there could be a Europe/Rotterdam time zone, or something equivalent.

To that end, a truly diligent application developer might treat the time zone ID as derived data based on the address of the conference. As part of checking each entry when the time zone database is updated, they might want to find the time zone ID of the address of the conference, in case that’s changed. There are multiple services that provide this information, although it may need to be a multi-step process, first converting the address into a latitude/longitude position, and then finding the time zone for that latitude/longitude.

Past vs recent past

This post has all been about future date/time values. In Twitter threads discussing time zone rule changes, there’s been a general assertion that it’s safe to only store the UTC instant related to an event in the past. I would broadly agree with that, but with one big caveat: as I mentioned earlier, sometimes governments adopt time zone rule changes with almost no notice at all. Additionally, there can be a significant delay between the changes being published and them being available within applications. (That delay can vary massively based on your platform.)

This means that while a conversion to UTC for a value more than (say) a year ago will probably stay valid, if you’re recording a date and time of “yesterday”, it’s quite possible that you’re using incorrect rules without knowing it. (Even very old rules can change, but that’s rarer in my experience.)

Do you need to account for this? That depends on your application, like so many other things. I’d at least consider the principle described above – and unless it’s much harder for you to maintain the real source information for some reason, I’d default to doing that.

Conclusion

The general advice of “just convert all local date/time data to UTC and store that” is overly broad in my view. For future and near-past events, it doesn’t take into account that time zone rules change, making the initial conversion potentially inaccurate. Part of the point of writing this blog post is to raise awareness, so that even if people do still recommend storing UTC, they can add appropriate caveats rather than treating it as a universal silver bullet.

I should explicitly bring up timestamps at this point. Machine-generated timestamps are naturally instants in time, recording “the instant at which something occurred” in an unambiguous way. Storing those in UTC is entirely reasonable – potentially with an offset or time zone if the location at which the timestamp was generated is relevant. Note that in this case the source of the data isn’t “a local time to be converted”.

That’s the bigger point, that goes beyond dates and times and time zones: choosing what information to store, and how. Any time you discard information, that should be a conscious choice. Are you happy discarding the input format that was used to enter a date? Probably – but it’s still a decision to make. Defaulting to “convert to UTC” is a default to discarding information which in some cases is valid, but not all. Make it a conscious choice, and ensure you store all the information you think may be needed later. You might also want to consider whether and how you separate “source” information from “derived” information – this is particularly relevant when it comes to archiving, when you may want to discard all the derived data to save space. That’s much easier to do if you’re already very aware of which data is derived.

My experience is that developers either don’t think about date/time details nearly enough when coding, or are aware of some of the pitfalls but decide that means it’s just too hard to contemplate. Hopefully this worked example of real life complexity shows that it can be done: it takes a certain amount of conscious thought, but it’s not rocket science.

Hosting ASP.NET Core behind https in Google Kubernetes Engine

Side-note: this may be one of the clumsiest titles I’ve ever written for a blog post. But it does what it says on the tin. Oh, and the space after “ASP” in “ASP .NET Core” everywhere it to avoid auto-linking. While I could use a different dot or a zero-width non-breaking space to avoid it, I’m not sure I trust WordPress to do the right thing with those…

Background

Over the past few weeks, I’ve moved nodatime.org, csharpindepth.com and jonskeet.uk over to Google Kubernetes Engine. (They all used to be hosted on Azure.)

I’ve done this for a few reasons:

  • As my job is primarily making .NET developers more productive on Google Cloud Platform, it feels natural to run my own code there. I want to see where there are friction points, so I can help fix them.
  • I wanted more redundancy, particularly for nodatime.org; Kubernetes felt a simple way of managing that at a reasonable cost
  • HTTPS certificate management (via Let’s Encrypt) has been a bit painful for me on Azure; I could have automated more, but that would have taken extra time I don’t have. (It may also have improved since I last looked.)

The first of these is the most important, by a long way. But the HTTPS management part – and then the knock-on effects – is what I’m looking at in this blog post.

Basic hosting

Hosting an ASP .NET Core application in Google Kubernetes Engine (GKE from now on) is really simple, at least once you’ve understood the Kubernetes concepts. I have:

In each case, the ASP .NET Core application is built with a vanilla Dockerfile which would not look unusual to anyone who’s hosted ASP .NET Core in Docker anywhere else.

I happen to use Google Cloud Build to build the Docker images, and Google Container Registry to host the images, but neither of those are required. (For csharpindepth.com and jonskeet.uk there are simple triggers in Google Cloud Build to build and deploy on GitHub pushes; for nodatime.org it’s a bit more complicated as the documentation build currently has some Windows dependencies. I have a machine at home that polls GitHub every half hour, and pushes the built application to Google Cloud Build for packaging when necessary.)

So, that gets HTTP hosting sorted. I dare say there are some aspects I’ve not configured as well as I could have done, but it was reasonably straightforward to get going.

HTTPS with Google-managed certificates

With HTTP working, it’s time to shoot for HTTPS. It’s important to note that the apps I’m talking about are all hobby projects, not commercial ones – I’m already paying for hosting, so I don’t want to have to pay for SSL certificates as well. Enter Let’s Encrypt, of course.

A while ago I used Let’s Encrypt to set up HTTPS on Azure, and while it was free and I didn’t have to write any code, it wasn’t exactly painless. I followed two guides at the same time, because neither of them exactly matched the Azure portal I was looking at. There were lots of bits of information to grab from various different bits of the portal, and it took a couple of attempts to get right… but I got there. I also set up a web job to renew the certificates, but didn’t go through the hoops required to run those web jobs periodically. (There were instructions, but it looked like they’d take a while to work through compared with just manually running the web job every couple of months or so. I decided to take the pragmatic approach, knowing that I was expecting to move to GKE anyway. If Azure had been the expected permanent home for the apps, I’d have gone through the steps and I’m sure they’d have worked fine.) I don’t know which guide I worked through at the time, but if I were starting today I’d probably try Scott Hanselman’s guide.

So, what can I do on Google Cloud Platform instead? I decided to terminate the SSL connection at the load balancer, using Google-managed certificates. To be really clear, these are currently in beta, but have worked well for me so far. Terminating the SSL connection at the load balancer means that the load balancer forwards the request to the Kubernetes service as an HTTP request, not HTTPS. The ASP .NET Core app itself only exposes an HTTP port, so it doesn’t need to know any details of certificates.

The steps to achieve this are simple, assuming you have the Google Cloud SDK (gcloud) installed already:

  • Create the certificate, e.g.
    gcloud beta compute ssl-certificates create nodatime-org --domains nodatime.org
  • Attach the certificate to the load balancer, via the Kubernetes ingress in my case, with an annotation in the ingress metadata:
    ingress.gcp.kubernetes.io/pre-shared-cert: "nodatime-org"
  • Apply the modifications to the ingress:
    kubectl apply -f ingress.yaml
  • Wait for the certificate to become valid (the provisioning procedure takes a little while, and I’ve seen some interesting errors while that’s taking place)
  • Enjoy HTTPS support, with auto-renewing certificates!

There are only two downsides to this that I’ve experienced so far:

  • Currently each certificate can only be associated with a single domain. For example, I have different certificates for nodatime.org, http://www.nodatime.org and test.nodatime.org. (More about the last of these later.) This is a minor annoyance, but the ingress supports multiple pre-shared certificates, so it has no practical implications for me.
  • I had to accept some downtime on HTTPS when transferring from Azure to GKE, while the certificate was provisioning after I’d transferred the DNS entry. This was a one-time issue of course, and one that wouldn’t affect most users.

Beyond the basics

At this point I had working HTTPS URLs – but any visitor using HTTP would stay that way. (At the time of writing this is still true for csharpindepth.com and jonskeet.uk.) Obviously I’d like to encourage secure browsing, so I’d like to use the two pieces of functionality provided by ASP .NET Core:

  • Redirection of HTTP requests via app.UseHttpsRedirection()
  • HSTS support via app.UseHsts()

I should note here that the Microsoft documentation was fabulously useful throughout. It didn’t quite get me all the way, but it was really close.

Now, I could have just added those calls into the code and deployed straight to production. Local testing would have worked – it would have redirected from localhost:5000 on HTTP to localhost:5001 on HTTPS with no problems. It would also have failed massively for reasons we’ll look at in a minute. Just for a change, I happened to do the right thing…

For hosting changes, always use a test deployment first

In Azure, I had a separate AppService I could deploy to, called nodatimetest. It didn’t have a fancy URL, but it worked okay. That’s where I tested Azure-specific changes before deploying to the real AppService. Unfortunately, it wouldn’t have helped in this situation, as it didn’t have a certificate.

Fortunately, creating a new service in Kubernetes, adding it to the ingress, and creating a managed certificate is so easy that I did do this for the new hosting – and I’m so glad I did so. I use a small script to publish the local ASP .NET Core build to Google Cloud Build which does the Docker packaging, pushes it to Google Container Registry and updates the Kubernetes deployment. As part of that script, I add a small text file containing the current timestamp so I can check that I’m really looking at the deployment I expect. It takes just under two minutes to build, push, package, deploy – not a tight loop you’d want for every day development, but pretty good for the kind of change that can’t be tested locally.

So, I made the changes to use HTTPS redirection and HSTS, deployed, and… there was no obvious change.

Issue 1: No HTTPS port to redirect to

Remember how the ASP .NET Core app in Kubernetes is only listening on HTTP? That means it doesn’t know which port to redirect users to for HTTPS. Oops. While I guess it would be reasonable to guess 443 if it didn’t know any better, the default of “don’t redirect if you haven’t been told a port” means that your application doesn’t stop working if you get things wrong – it just doesn’t redirect.

This is easily fixed in ConfigureServices:

services.AddHttpsRedirection(options => options.HttpsPort = 443);

… but I’ve added conditional code so it doesn’t do that in development environment, as otherwise it would try to redirect from localhost:5000 to localhost:443, which wouldn’t work. This is a bit hacky at the moment, which is a common theme – I want to clean up all the configuration at some point quite soon (moving things into appsettings.json as far as possible) but it’s just about hanging together for now.

So, make the change, redeploy to test, and… observed infinite redirection in the browser. What?

Issue 2: Forwarding proxied headers

Remember again how the ASP .NET Core app is only listening on HTTP? We want it to behave differently depending on whether the end user made a request to the load balancer on HTTP or HTTPS. That means using headers forwarded from the proxy (in our case the load balancer) to determine the original request scheme. Fortunately, again there’s documentation on hand for this.

There are two parts to configuring this:

  • Configuring the ForwardedHeadersOptions in ConfigureServices
  • Calling app.UseForwardedHeaders() in Configure

(At least, that’s the way that’s documented. I’m sure there are myriad alternatives, but my experience level of ASP .NET Core is such that I’m still in “follow the examples verbatim, changing as little as possible” at the moment.)

I won’t go into the gory details of exactly how many times I messed up the forwarded headers options, but I will say:

  • The example which just changes options.ForwardedHeaders is probably fine if your proxy server is local to the application, but otherwise you will need do to extra work
  • The troubleshooting part of the documentation is spectacularly useful
  • There are warnings logged if you get things wrong, and those logs will help you – but they’re at a debug log level, so you may need to update your logging settings. (I only realized this after I’d fixed the problem, partly thanks to Twitter.)

Lesson to learn: when debugging a problem, turn on debugging logs. Who’d have thought?

Configuring this properly is an area where you really need to understand your deployment and how a request reaches you. In my case, the steps are:

  • The user’s HTTPS request is terminated by the load balancer
  • The load balancer makes a request to the Kubernetes service
  • The Kubernetes service makes a request to the application running on one of the suitable nodes

This leads to a relatively complex configuration, as there are two networks to trust (Google Cloud load balancers, and my internal Kubernetes network) and we need to allow two “hops” of proxying. So my configuration code looks like this:

services.Configure<ForwardedHeadersOptions>(options =>
{
    options.KnownNetworks.Clear();
    // Google Cloud Platform load balancers
    options.KnownNetworks.Add(new IPNetwork(IPAddress.Parse("130.211.0.0"), 22));
    options.KnownNetworks.Add(new IPNetwork(IPAddress.Parse("35.191.0.0"), 16));
    // GKE service which proxies the request as well.
    options.KnownNetworks.Add(new IPNetwork(IPAddress.Parse("10.0.0.0"), 8));
    options.ForwardedHeaders = ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto;
    options.ForwardLimit = 2;
});

(The call to KnownNetworks.Clear() probably isn’t necessary. The default is to include the loopback, which is safe enough to leave in the list.)

Okay, deploy that to the test environment. Everything will work now, right? Well, sort of…

Issue 3: make sure health checks are healthy!

As it happens, when I’d finished fixing issue 2, I needed to help at a birthday party for a family we’re friends with. Still, I went to the party happily knowing everything was fine.

I then came home and found the test deployment was broken. Really broken. “502 Bad Gateway” broken. For both HTTP and HTTPS. This is not good.

I tried adding more logging, but it looked like none of my requests were getting through to the application. I could see in the logs (thank you, Stackdriver!) that requests were being made, always to just “/” on HTTP. They were all being redirected to HTTPS via a 307, as I’d expect.

This confused me for a while. I honestly can’t remember what gave me the lightbulb moment of “Ah, these are load balancer health checks, and it thinks they’re failing!” but I checked with the load balancer in the Google Cloud Console and sure enough, I had multiple working backends, and one broken one – my test backend. The reason I hadn’t seen this before was that I’d only checked the test deployment for a few minutes – not long enough for the load balancer to deem the backend unhealthy.

I was stuck at this point for a little while. I considered reconfiguring the load balancer to make the health check over HTTPS, but I don’t think that could work as the app isn’t serving HTTPS itself – I’d need to persuade it to make the request as if it were a user-generated HTTPS request, with appropriate X-Forwarded-Proto etc headers. However, I saw that I could change which URL the load balancer would check. So how about we add a /healthz URL that would be served directly without being redirected? (The “z” at the end is a bit of Googler heritage. Just /health would be fine too, of course.)

I started thinking about adding custom inline middleware to do this, but fortunately didn’t get too far before realizing that ASP .NET Core provides health checking already… so all I needed to do was add the health check middleware before the HTTPS redirection middleware, and all would be well.

So in ConfigureServices, I added a no-op health check service:

services.AddHealthChecks();

And in Configure I added the middleware at an appropriate spot:

app.UseHealthChecks("/healthz");

After reconfiguring the health check on the load balancer, I could see /healthz requests coming in and receiving 200 (OK) responses… and the load balancer was then happy to use the backend again. Hooray!

After giving the test service long enough to fail, I deployed to production, changed the load balancer health check URL, and all was well. I did the two parts of this quickly enough so that it never failed – a safer approach would have been to add the health check handler but without the HTTPS redirection first, deploy that, change the health check URL, then turn on HTTPS.

But the end result is, all is working! Hooray!

Conclusion

Moving the service in the first place has been a long process, mostly due to a lack of time to spend on it, but the HTTPS redirection has been its own interesting bit of simultaneous pleasure and frustration. I’ve learned a number of lessons along the way:

  • The combination of Google Kubernetes Engine, Google Cloud Load Balancing, Google Cloud Build and Google Container registry is pretty sweet.
  • Managed SSL certificates are wonderfully easy to use, even if there is a bit of a worrying delay while provisioning
  • It’s really, really important to be able to test deployment changes (such as HTTPS redirection) in an environment which is very similar to production, but which no-one is depending on. (Obviously if you have a site which few people care about anyway, there’s less risk. But as it’s easy to set up a test deployment on GKE, why not do it anyway?)
  • HTTPS redirection caused me three headaches, all predictable:
    • ASP .NET Core needs to know the HTTPS port to redirect to.
    • You need to configure forwarded headers really carefully, and know your deployment model thoroughly .
    • Be aware of health checks! Make sure you leave a test deployment “live” for long enough for the health checks to mark it as unhealthy if you’ve done something wrong, before you deploy to production.
  • When debugging, turn on debug logging. Sounds obvious in retrospect, doesn’t it? (Don’t start trying to copy middleware source code into your own application so you can add logging, rather than using the logging already there…)

I also have some future work to do:

  • There’s a particular URL (http://nodatime.org/tzdb/latest.txt) which is polled by applications in order to spot time zone information changes. That’s the bulk of the traffic to the site. It currently redirects to HTTPS along with everything else, which leads to the total traffic being nearly double what it was before, for no real benefit. I’ve encouraged app authors to use HTTPS instead, but I’ve also filed a feature request against myself to consider serving that particular URL without the redirect. It looks like that’s non-trivial though.
  • I have a bunch of hard-coded information which should really be in appsettings.json. I want to move all of that, but I need to learn more about the best way of doing it first.

All in all, this has been a very positive experience – I hope the details above are useful to anyone else hosting ASP .NET Core apps in Google Kubernetes Engine.

NullableAttribute and C# 8

Background: Noda Time and C# 8

Note: this blog post was written based on experimentation with Visual Studio 2019 preview 2.2. It’s possible that some of the details here will change over time.

C# 8 is nearly here. At least, it’s close enough to being “here” that there are preview builds of Visual Studio 2019 available that support it. Unsurprisingly, I’ve been playing with it quite a bit.

In particular, I’ve been porting the Noda Time source code to use the new C# 8 features. The master branch of the repo is currently the code for Noda Time 3.0, which won’t be shipping (as a GA release) until after C# 8 and Visual Studio 2019 have fully shipped, so it’s a safe environment in which to experiment.

While it’s possible that I’ll use other C# 8 features in the future, the two C# 8 features that impact Noda Time most are nullable reference types and switch expressions. Both sets of changes are merged into master now, but the pull requests are still available so you can see just the changes:

The switch expressions PR is much simpler than the nullable reference types one. It’s entirely an implementation detail… although admittedly one that confused docfx, requiring a few of those switch expressions to be backed out or moved in a later PR.

Nullable reference types are a much, much bigger deal. They affect the public API, so they need to be treated much more carefully, and the changes end up being spread far wide throughout the codebase. That’s why the switch expression PR is a single commit, whereas nullable reference types is split into 14 commits – mostly broken up by project.

Reviewing the public API of a nullable reference type change

So I’m now in a situation where I’ve got nullable reference type support in Noda Time. Anyone consuming the 3.0 build (and there’s an alpha available for experimentation purposes) from C# 8 will benefit from the extra information that can now be expressed about parameters and return values. Great!

But how can I be confident in the changes to the API? My process for making the change in the first place was to enable nullable reference types and see what warnings were created. That’s a great starting point, but it doesn’t necessarily catch everything. In particular, although I started with the main project (the one that creates NodaTime.dll), I found that I needed to make more changes later on, as I modified other projects.

Just because your code compiles without any warnings with nullable reference types enabled doesn’t mean it’s “correct” in terms of the API you want to expose.

For example, consider this method:

public static string Identity(string input) => input;

That’s entirely valid C# 7 code, and doesn’t require any changes to compile, warning-free, in C# 8 with nullable reference types enabled. But it may not be what you actually want to expose. I’d argue that it should look like one of these:

// Allowing null input, and producing nullable output
public static string? Identity(string? input) => input;

// Preventing null input, and producing non-nullable output
public static string Identity(string input)
{
    // Convenience method for nullity checking.
    Preconditions.CheckNotNull(input, nameof(input));
    return input;
}

If you were completely diligent when writing tests for the code before C# 8, it should be obvious which is required – because you’d presumably have something like:

[Test]
public void Identity_AcceptsNull()
{
    Assert.IsNull(Identity(null));
}

That test would have produced a warning in C# 8, and would have suggested that the null-permissive API is the one you wanted. But maybe you forgot to write that test. Maybe the test you would have written was one that would have shown up a need to put that precondition in. It’s entirely possible that you write much more comprehensive tests than I do, but I suspect most of us have some code that isn’t explicitly tested in terms of its null handling.

The important part take-away here is that even code that hasn’t changed in appearance can change meaning in C# 8… so you really need to review any public APIs. How do you do that? Well, you could review the entire public API surface you’re exposing, of course. For many libraries that would be the simplest approach to take, as a “belt and braces” attitude to review. For Noda Time that’s less appropriate, as so much of the API only deals in value types. While a full API review would no doubt be useful in itself, I just don’t have the time to do it right now.

Instead, what I want to review is any API element which is impacted by the C# 8 change – even if the code itself hasn’t changed. Fortunately, that’s relatively easy to do.

Enter NullableAttribute

The C# 8 compiler applies a new attribute to every API element which is affected by nullability. As an example of what I mean by this, consider the following code which uses the #nullable directive to control the nullable context of the code.

public class Test
{
#nullable enable
    public void X(string input) {}

    public void Y(string? input) {}
#nullable restore

#nullable disable
    public void Z(string input) {}
#nullable restore
}

The C# 8 compiler creates an internal NullableAttribute class within the assembly (which I assume it wouldn’t if we were targeting a framework that already includes such an attribute) and applies the attribute anywhere it’s relevant. So the above code compiles to the same IL as this:

using System.Runtime.CompilerServices;

public class Test
{
    public void X([Nullable((byte) 1)] string input) {}    

    public void Y([Nullable((byte) 2)] string input) {}

    public void Z(string input) {}}
}

Note how the parameter for Z doesn’t have the attribute at all, because that code is still oblivious to nullable reference types. But both X and Y have the attribute applied to their parameters – just with different arguments to describe the nullability. 1 is used for not-null; 2 is used for nullable.

That makes it relatively easy to write a tool to display every part of a library’s API that relates to nullable reference types – just find all the members that refer to NullableAttribute, and filter down to public and protected members. It’s slightly annoying that NullableAttribute doesn’t have any properties; code to analyze an assembly needs to find the appropriate CustomAttributeData and examine the constructor arguments. It’s awkward, but not insurmountable.

I’ve started doing exactly that in the Noda Time repository, and got it to the state where it’s fine for Noda Time’s API review. It’s a bit quick and dirty at the moment. It doesn’t show protected members, or setter-only properties, or handle arrays, and there are probably other things I’ve forgotten about. I intend to improve the code over time and probably move it to my Demo Code repository at some point, but I didn’t want to wait until then to write about NullableAttribute.

But hey, I’m all done, right? I’ve just explained how NullableAttribute works, so what’s left? Well, it’s not quite as simple as I’ve shown so far.

NullableAttribute in more complex scenarios

It would be oh-so-simple if each parameter or return type could just be nullable or non-nullable. But life gets more complicated than that, with both generics and arrays. Consider a method called GetNames() returning a list of strings. All of these are valid:

// Return value is non-null, and elements aren't null
List<string> GetNames()

// Return value is non-null, but elements may be null
List<string?> GetNames()

// Return value may be null, but elements aren't null
List<string>? GetNames()

// Return value may be null, and elements may be null
List<string?>? GetNames()

So how are those represented in IL? Well, NullableAttribute has one constructor accepting a single byte for simple situations, but another one accepting byte[] for more complex ones like this. Of course, List<string> is still relatively simple – it’s just a single top-level generic type with a single type argument. For a more complex example, imagine Dictionary<List<string?>, string[]?> . (A non-nullable reference to a dictionary where each key is a not-null list of nullable strings, and each value is a possibly-null array of non-nullable elements. Ouch.)

The layout of NullableAttribute in these cases can be thought of in terms of a pre-order traversal of a tree representing the type, where generic type arguments and array element types are leaves in the tree. The above example could be thought of as this tree:

         Dictionary<,> (not null)
            /               \
           /                 \
 List<> (not null)      Array (nullable)
        |                     |
        |                     |
 string (nullable)      string (not null)

The pre-order traversal of that tree gives us these values:

  • Not null (dictionary)
  • Not null (list)
  • Nullable (string)
  • Nullable (array)
  • Not null (string)

So a parameter declared with that type would be decorated like this:

[Nullable(new byte[] { 1, 1, 2, 2, 1 })]

But wait, there’s more!

NullableAttribute in simultaneously-complex-and-simple scenarios

The compiler has one more trick up its sleeve. When all the elements in the tree are “not null” or all elements in the tree are “nullable”, it simply uses the constructor with the single-byte parameter instead. So Dictionary<List<string>, string[]> would be decorated with Nullable[(byte) 1] and Dictionary<List<string?>?, string?[]?>? would be decorated with Nullable[(byte) 2].

(Admittedly, Dictionary<,> doesn’t permit null keys anyway, but that’s an implementation detail.)

Conclusion

The C# 8 feature of nullable reference types is a really complicated one. I don’t think we’ve seen anything like this since async/await. This post has just touched on one interesting implementation detail. I’m sure there’ll be more posts on nullability over the next few months…

First steps with nullable reference types

This blog post is effectively a log of my experience with the preview of the C# 8 nullable reference types feature.

There are lots of caveats here: it’s mostly “as I go along” so there may well be backtracking. I’m not advising the right thing to do, as I’m still investigating that myself. And of course the feature is still changing. Oh, and this blog post is inconsistent about its tense. Sometimes I write in the present tense as I go along, sometimes I wrote in the past tense afterwards without worrying about it. I hope this isn’t/wasn’t/won’t be too annoying.

I decided that the best way of exploring the feature would be to try to use it with Noda Time. In particular:

  • Does it find any existing bugs?
  • Do my existing attributes match what Roslyn expects?
  • Does the feature get in the way, or improve my productivity?

Installation

I started at the preview page on GitHub. There are two really important points to note:

  • Do this on a non-production machine. I used an old laptop, but presumably you can use a VM instead.
  • Uninstall all versions of Visual Studio other than VS2017 first

I ended up getting my installation into a bad state, and had to reinstall VS2017 (and then the preview) before it would work again. Fortunately that takes a lot less time than it used to.

Check it works

The preview does not work with .NET Core projects or the command-line csc

It’s only for old-style MSBuild projects targeting the .NET framework, and only from Visual Studio.

So to test your installation:

  • Create a new .NET Framework desktop console app
  • Edit Program.cs to include: string? x = null;
  • Build

If you get an error CS0453 (“The type ‘string’ must be a non-nullable value type…”) then it’s not working. If it builds with maybe a warning about the variable not being used, you’re good to go.

First steps with Noda Time

The first thing I needed to do was convert Noda Time to a desktop project. This didn’t require the preview to be installed, so I was able to do it on my main machine.

I created a new solution with three desktop projects (NodaTime, NodaTime.Test and NodaTime.Testing), and added the dependencies between the projects and external ones. I then copied these project files over the regular Noda Time ones.

Handy tip: if you add <Compile Include="**\*.cs" /> in an MSBuild file and open it in Visual Studio, VS will replace it with all the C# files it finds. No need for tedious “Add existing” all over the place.

A small amount of fiddling was required for signing and resources, and then I had a working copy of Noda Time targeting just .NET 4.5. All tests passed :)

For anyone wanting to follow my progress, the code is in a branch of my fork of Noda Time although I don’t know how long I’ll keep it for.

Building with the preview

After fetching that branch onto my older laptop, it built first time – with 228 warnings, most of which were “CS8600: Cannot convert null to non-nullable reference.” Hooray – this is exactly what we want. Bear in mind that this is before I’ve made any changes to the code.

The warnings were split between the three projects like this:

  • NodaTime: 94
  • NodaTime.Testing: 0
  • NodaTime.Test: 134

Follow the annotations

Noda Time already uses [CanBeNull] and [NotNull] attributes for both parameters and return types to indicate expectations. The first obvious step is to visit each application of [CanBeNull] and use a nullable reference type there.

To make it easier to see what’s going on, I first unloaded the NodaTime.Test project. This was so that I could concentrate on making NodaTime self-consistent to start with.

Just doing that actually raised the number of warnings from 94 to 110. Clearly I’m not as consistent as I’d like to be. I suspect I’ve got plenty of parameters which can actually be null but which I didn’t apply the annotation to. It’s time to start actually looking at the warnings.

Actually fix things

I did this in a completely haphazard fashion: fix one warning, go onto another.

I’ve noticed a pattern that was already feasible before, but has extra benefits in the nullable reference type world. Instead of this:

// Old code
string id = SomeMethodThatCanReturnNull();
if (id == null)
{
    throw new SomeException();
}
// Use id knowing it's not null

… I can use the ?? operator with the C# 7 feature of throw expressions:

// Old code
string id = SomeMethodThatCanReturnNull() ??
    throw new SomeException();
// Use id knowing it's not null

That avoids having a separate local variable of type string?, which can be very handy.

I did find a few places where the compiler could do better at working out nullity. For example:

// This is fine
string? x = SomeExpressionThatCanReturnNull();
if (x == null)
{
    return;
}
string y = x;

// This creates a warning: the compiler doesn't know that x
// can't be null on the last line
string? x = SomeExpressionThatCanReturnNull();
if (ReferenceEquals(x, null))
{
    return;
}
string y = x;

The preview doc talks about this in the context of string.IsNullOrEmpty; the ReferenceEquals version is a little more subtle as we can’t determine nullity just from the output – it’s only relevant if the other argument is a constant null. On the other hand, that’s such a fundamental method that I’m hopeful it’ll get fixed.

Fixing these warnings didn’t take very long, but it was definitely like playing Whackamole. You fix one warning, and that causes another. For example, you might make a return type nullable to make
a return null; statement work – and that affects all the callers.

I found that rebuilding would sometimes find more warnings, too. At one point I thought I was done (for the time being) – after rebuilding, I had 26 warnings.

I ran into one very common problem: implementing IEquatable<T> (for a concrete reference type T). In every case, I ended up making it implement IEquatable<T?>. I think that’s the right thing to do… (I didn’t do it consistently though, as I found out later on. And IEqualityComparer<T> is trickier, as I’ll talk about later.)

Reload the test project

So, after about an hour of fixing warnings in the main project, what would happen when I reload the test project? We previously had 134 warnings in the test project. After reloading… I was down to 123.

Fixing the test project involved fixing a lot more of the production code, interestingly enough. And that led me to find a limitation not mentioned on the preview page:

public static NodaFormatInfo GetInstance(IFormatProvider? provider)
{
    switch (provider)
    {
        case null:
            return ...;
        case CultureInfo cultureInfo:
            return ...;
        case DateTimeFormatInfo dateTimeFormatInfo;
            return ...;
        default:
            throw new ArgumentException($"Cannot use provider of type {provider.GetType()}");
    }
}

This causes a warning of a possible dereference in the default case – despite the fact that provider clearly can’t be null, as otherwise it would match the null case. Will try to provide a full example in a bug report.

The more repetitive part is fixing all the tests that ensure a method throws an ArgumentNullException if called with a null argument. As there’s no compile-time checking as well, the argument
needs to be null!, meaning “I know it’s really null, but pretend it isn’t.” It makes me chuckle in terms of syntax, but it’s tedious to fix every occurrence.

IEqualityComparer<T>

I have discovered an interesting problem. It’s hard to implement IEqualityComparer<T> properly. The signatures on the interface are pretty trivial:

public interface IEqualityComparer<in T>
{
    bool Equals(T x, T y);
    int GetHashCode(T obj);
}

But problems lurk beneath the surface. The documentation for the Equals() method doesn’t state what should happen if x or y is null. I’ve typically treated this as valid, and just used the normal equality rules (two null references are equal to each other, but nothing else is equal to a null reference.) Compare that with GetHashCode(), where it’s explicitly documented that the method should throw ArgumentNullException if obj is null.

Now think about a type I’d like to implement an equality comparer for – Period for example. Should I write:

public class PeriodComparer : IEqualityComparer<Period?>

This allows x and y to be null – but also allows obj to be null, causing an ArgumentNullException, which this language feature is trying to eradicate as far as possible.

I could implement the non-nullable version instead:

public class PeriodComparer : IEqualityComparer<Period>

Now the compiler will check that you’re not passing a possibly-null value to GetHashCode(), but will also check that for Equals, despite it being okay.

This feels like it’s a natural but somewhat unwelcome result of the feature arriving so much later than the rest of the type system. I’ve chosen to implement the nullable form, but still throw the
exception in GetHashCode(). I’m not sure that’s the right solution, but I’d be interested to hear what others think.

Found bugs in Noda Time!

One of the things I was interested in finding out with all of this was how consistent Noda Time is in terms of its nullity handling. Until you have a tool like this, it’s hard to tell. I’m very pleased to say that most of it hangs together nicely – although so far that’s only the result of getting down to no warnings, rather than a file-by-file check through the code, which I suspect I’ll want to do eventually.

I did find two bugs, however. Noda Time tries to handle the case where TimeZoneInfo.Local returns a null reference, because we’ve seen that happen in the wild. (Hopefully that’s been fixed now in Mono, but even so it’s nice to know we can cope.) It turns out that we have code to cope with it in one place, but there are two places where we don’t… and the C# 8 tooling found that. Yay!

Found a bug in the preview!

To be clear, I didn’t expect the preview code to be perfect. As noted earlier, there are a few places I think it can be smarter. But I found a nasty bug that would hang Visual Studio and cause csc.exe to fail when building. It turns out that if you have a type parameter T with a constraint of T : class, IEquatable<T?>, that causes a stack overflow. I’ve reported the bug (now filed on GitHub thanks to diligent Microsoft folks) so hopefully it’ll be fixed long before the final version.

Admittedly the constraint is interesting in itself – it’s not necessarily clear what it means, if T is already a nullable reference type. I’ll let smarter people than myself work that out.

Conclusion

Well, that was a jolly exercise. My first impressions are:

  • We really need class library authors to embrace this as soon as C# 8 comes out, in order to make it as useful as possible early. Noda Time has no further dependencies, fortunately.
  • It didn’t take as long as I feared it might to do a first pass at annotating Noda Time, although I’m sure I missed some things while doing it.
  • A few bugs aside, the tooling is generally in very good shape; the warnings it produced were relevant and easy to understand.
  • It’s going to take me a while to get used to things like IList<string?>? for a nullable list of nullable strings.

Overall I’m very excited by all of this – I’m really looking forward to the final release. I suspect more blog posts will come over time…

Implementing IXmlSerializable in readonly structs

Background

There are three things you need to know to start with:

Operations on read-only variables which are value types copy the variable value first. I’ve written about this before on this blog. C# 7.2 addresses this by introducing the readonly modifier for structs. See the language proposal for more details. I was touched to see that the proposal references my blog post :)

The ref readonly local variables and in parameter features in C# 7.2 mean that “read-only variables” are likely to be more common in C# than they have been in the past.

Noda Time includes many value types which implement IXmlSerializable. Noda Time implements IXmlSerializable.ReadXml by assigning to this: fundamentally IXmlSerializable assumes a mutable type. I use explicit interface implementation to make this less likely to be used directly on an unboxed variable. With a generic method using an interface constraint you can observe a simple method call mutating a non-read-only variable, but that’s generally harmless.

Adding the readonly modifier to Noda Time structs

I relished news of the readonly modifier for struct declarations. At last, I can remove my ReadWriteForEfficiency attribute! Hmm. Not so much. To be fair, some of the structs (7 out of 18) were fine. But every struct that implements IXmlSerializable gave me this error:

Cannot assign to ‘this’ because it is read-only

That’s reasonable: in members of a readonly struct, this effectively becomes an in parameter instead of a ref parameter. But how can we fix that? Like any sensible developer, I turned to Stack Overflow, which has precisely the question I’m interested in. It even has an answer! Unfortunately, it amounts to a workaround: assigning to this via unsafe code.

Violating readonly with unsafe code

To give a concrete example of the answer from Stack Overflow, here’s my current LocalTime code:

void IXmlSerializable.ReadXml([NotNull] XmlReader reader)
{
    Preconditions.CheckNotNull(reader, nameof(reader));
    var pattern = LocalTimePattern.ExtendedIso;
    string text = reader.ReadElementContentAsString();
    this = pattern.Parse(text).Value;
}

Here’s the code that compiles when LocalTime is marked as readonly, after enabling unsafe blocks:

unsafe void IXmlSerializable.ReadXml([NotNull] XmlReader reader)
{
    Preconditions.CheckNotNull(reader, nameof(reader));
    var pattern = LocalTimePattern.ExtendedIso;
    string text = reader.ReadElementContentAsString();
    fixed (LocalTime* thisAddr = &this)
    {
        *thisAddr = pattern.Parse(text).Value;
    }
}

Essentially, the unsafe code is bypassing the controls around read-only structs. Just for kicks, let’s apply the same change
throughout Noda Time, and think about what would happen…

As it happens, that fix doesn’t work universally: ZonedDateTime is a “managed type” because it contains a reference (to a DateTimeZone) which means you can’t create a pointer to it. That’s a pity, but if we can make everything else readonly, that’s a good start. Now let’s look at the knock-on effects…

Safe code trying to abuse the unsafe code

Let’s try to abuse our “slightly dodgy” implementations. Here’s a class we’ll put in a “friendly” assembly which is trying to be as helpful as possible:

using NodaTime;

public class AccessToken
{
    private readonly Instant expiry;
    public ref readonly Instant Expiry => ref expiry;

    public AccessToken(Instant expiry) => this.expiry = expiry;
}

Great – it lets you get at the expiry time of an access token without even copying the value.

The big test is: can we break this friendly’s code’s assumptions about expiry really not changing its value?

Here’s code I expected to mutate the access token:

static void MutateAccessToken(AccessToken accessToken)
{
    ref readonly Instant expiry = ref accessToken.Expiry;
    string xml = "<evil>2100-01-01T00:00:00Z</evil>";
    EvilReadXml(in expiry, xml);
}

static void EvilReadXml<T>(in T value, string xml) where T : IXmlSerializable
{
    var reader = XmlReader.Create(new StringReader(xml));
    reader.Read();
    value.ReadXml(reader);
}

We have an in parameter in EvilReadXml, so the expiry variable is being passed by reference, and then we’re calling ReadXml on that parameter… so doesn’t that mean we’ll modify the parameter, and thus the underlying expiry field in the object?

Nope. Thinking about it, the compiler doesn’t know when it compiles EvilReadXml that T is a readonly struct – it could be a regular struct. So it has to create a copy of the value before calling ReadXml.

Looking the the spec proposal, there’s one interesting point in the section on ref extension methods:

However any use of an in T parameter will have to be done through an interface member. Since all interface members are considered mutating, any such use would require a copy.

Hooray! That suggests it’s safe – at least for now. I’m still worried though: what if the C# team adds a readonly struct generic constraint in C# 8? Would that allow a small modification of the above code to break? And if interface methods are considered mutating anyway, why doesn’t the language know that when I’m trying to implement the interface?

But hey, now that we know unsafe code is able to work around this in our XML implementation, what’s to stop nasty code from using the same trick directly?

Unsafe code abusing safe code

Imagine we didn’t support XML serialization at all. Could unsafe code mutate AccessToken? It turns out it can, very easily:

static void MutateAccessToken(AccessToken accessToken)
{
    ref readonly Instant expiry = ref accessToken.Expiry;
    unsafe
    {
        fixed (Instant* expiryAddr = &expiry)
        {
            *expiryAddr = Instant.FromUtc(2100, 1, 1, 0, 0);
        }
    }
}

This isn’t too surprising – unsafe code is, after all, unsafe. I readily admit I’m not an expert on .NET security, and I know the
landscape has changed quite a bit over time. These days I believe the idea of a sandbox has been somewhat abandoned – if you care about executing code you don’t really trust, you do it in a container and use that as your security boundary. That’s about the limit of my knowledge though, which could be incorrect anyway.

Where do we go from here?

At this point, I’m stuck making a choice between several unpleasant ones:

  • Leave Noda Time public structs as non-read-only. That prevents users from writing efficient code to use them without copying.
  • Remove XML serialization from Noda Time 3.0. We’re probably going to remove binary serialization anyway on the grounds that Microsoft is somewhat-discouraging it going forward, and it’s generally considered a security problem. However, I don’t think XML serialization is considered to be as bad, so long as you use it carefully, preventing arbitrary type loading. (See this paper for more information.)
  • Implement XML serialization using unsafe code as shown above. Even though my attempt at abusing it failed, that doesn’t provide very much comfort. I don’t know what other ramifications there might be for including unsafe code. Does that limit where the code can be deployed? It also doesn’t help with my concern about a future readonly struct constraint.

Thoughts very welcome. Reassurances from the C# team that “readonly struct” constraints won’t happen particularly welcome… along with alternative ways of implementing XML serialization.

Using .NET Core 2.0 SDK on Travis

This is just a brief post that I’m hoping may help some people migrate to use .NET Core 2.0 SDK on Travis. TL;DR: see the end of the post for a sample configuration.

Yesterday (August 15th), .NET Core 2.0 was fully released. Wonderfully, Travis already supports it. You just need dotnet: 2.0.0 in your YAML file.

I decided to experiment with upgrading the Noda Time build to require .NET Core 2.0 SDK. To be clear, I’m not doing anything in the code that requires 2.0, but it simplifies my build scripts:

Additionally, supporting netcoreapp2.0 means I’ll be able to run my benchmarks against that as well, which is going to be very interesting. However, my tests still target netcoreapp1.0, and that’s where I ran into problems.

Having done the bare minimum to try using 2.0 (edit global.json and .travis.yml) I ran into this error:

The specified framework 'Microsoft.NETCore.App', version '1.0.5' was not found.
  - Check application dependencies and target a framework version installed at:
      /
  - Alternatively, install the framework version '1.0.5'.

That makes sense. Although netcoreapp2.0 is compatible with netstandard1.0 (i.e. you can use libraries targeted to netstandard1.0 in a 2.0 environment) an application targeting netcoreapp1.0 really needs a 1.0 runtime.

So, we need to install just the runtime as well. I’d expected this to be potentially painful, but it’s really not. You just need an addons section in the YAML file:

addons:
  apt:
    sources:
    - sourceline: 'deb [arch=amd64] https://packages.microsoft.com/repos/microsoft-ubuntu-trusty-prod trusty main'
      key_url: 'https://packages.microsoft.com/keys/microsoft.asc'
    packages:
    - dotnet-sharedframework-microsoft.netcore.app-1.0.5

Note that in my case, I want netcoreapp1.0 – if you need netcoreapp1.1, you’d probably install dotnet-sharedframework-microsoft.netcore.app-1.1.2.

So, aside from comments etc, my new Travis configuration will look like this:

language: csharp
mono: none
dotnet: 2.0.0
dist: trusty

addons:
  apt:
    sources:
    - sourceline: 'deb [arch=amd64] https://packages.microsoft.com/repos/microsoft-ubuntu-trusty-prod trusty main'
      key_url: 'https://packages.microsoft.com/keys/microsoft.asc'
    packages:
    - dotnet-hostfxr-1.0.1
    - dotnet-sharedframework-microsoft.netcore.app-1.0.5

script:
  - build/travis.sh

I can now build with the 2.0 SDK, and run tests under both netcoreapp1.0 and netcoreapp2.0.

I’m hoping it’s just as simple on AppVeyor when that supports 2.0 as well…

Versioning conundrum for Noda Time – help requested

Obviously I’d normally ask developer questions on Stack Overflow but in this case, it feels like the answers may be at least somewhat opinion-based. If it turns out that it’s sufficiently straightforward that a Stack Overflow question and answer would be useful, I can always repost it there later.

The Facts

Noda Time 1.x exists “in production”, and the latest version is 1.3.1. This targets .NET 3.5 Client profile, .NET 4.0, and PCL Profile 328 (in a directory of lib\portable-net4+sl5+netcore45+wpa81+wp8+MonoAndroid1+MonoTouch1+XamariniOS1)

Noda Time currently includes the IANA time zone data (“TZDB”) – each released version of Noda Time contains the TZDB version that was “most recent” at the time that the Noda Time release was built. This gets out of date quite quickly, as there are multiple releases of TZDB every year. Those releases are named 2016a, 2016b etc. Noda Time also provides the ability to read .nzd files (Noda Zone Data – a custom format) and every time there’s a new release of TZDB, I build a .nzd file and upload it to nodatime.org, updating http://nodatime.org/tzdb/latest.txt to point to the latest version.

Noda Time 2.0 has not been released yet. When I do release it, I expect to target .NET 4.5 and netstandard1.0.

Each Noda Time 1.x release has an AssemblyVersion just based on major/minor, i.e. 1.0, 1.1, 1.2 etc. Based on this blog post, this may have been a mistake – it should quite possibly have been 1.0 for all versions. Obviously I can’t fix that now, but I can make the 2.x releases use 2.0 everywhere.

When 2.0 is “pretty much ready” we’re going to cut a 1.4 release which deprecates things that are removed in 2.0 and provides the new approaches as far as possible. For example, the IClock.Now property from 1.x is removed in 2.0, and replaced by IClock.GetCurrentInstant(). We’ll deprecate the Now property and introduce a GetCurrentInstant() extension method which delegates to it. This shouldn’t break any 1.x users, but should allow them to move over to the new API as far as possible before upgrading to 2.0. The intention is that users wouldn’t stay on 1.4 for very long. (Obviously they could do so, but there’s not a lot of benefit. 1.4 won’t have new features – it’s really just a transition version.)

So far, that’s just the way of the world. Now I want to make it easier for users to stay up-to-date with TZDB – including if nodatime.org goes down. (That’s considerably more likely than nuget.org going down, for example.)

The plan is to introduce a new nearly-data-only assembly, packaged as NodaTime.Tzdb. The aim is to allow users to update their data dependency at build time, in a controlled fashion. If you only want to specify an exact version to depend on, you can do so. If you want to pick up the latest version every time you build, that should be possible too.

The tricky bits come in terms of the versioning.

Some options

Firstly, the versioning scheme for the package ignoring everything else. I plan to use something like this:

  • 2016a => 1.2016.1
  • 2016b => 1.2016.2
  • 2016c => 1.2016.3
  • 2017a => 1.2017.1

This should make it reasonably easy to tell the TZDB version just from the package version.

However, I’m considering a few options within this idea:

  • I could create a single package per TZDB release, targeting .NET 3.5 client profile, .NET 4.0, the Profile 328 PCL, .NET 4.5, and .NET Standard 1.0. The first four of these could depend on Noda Time 1.1, and the last one would have to depend on Noda Time 2.0.
  • I could do the above, but depend on 1.3.1 instead of 1.1.
  • I could create one package with two versions per TZDB release – a 1.x depending on Noda Time 1.1, and a 2.x depending on Noda Time 2.0. For example, when TZDB 2016d is released, I could create 1.2016.4 and 2.2016.4.
  • I could create one package version depending on 1.1, one depending on 1.2, one depending on 1.3, one depending on 1.4 (when that exists) and one depending on 2.0.
  • I could create two separate packages, i.e. include the Noda Time major version number in the package name. I don’t like this idea, but it’s on the table.

Some concerns and questions

There are various aspects to this which cause me a few worries. I’m not sure how well I can really structure or segregate those, so I’ll just list them.

  • Can a non-prerelease package depend on a prerelease package for some frameworks? If not, that possibly blows the “single version” idea out of the water, as I can’t depend on NodaTime v2.0 yet – it’s not out.
  • Even if that’s feasible, is it sane to depend on different major versions of the NodaTime package from within a single version of the NodaTime.Tzdb package, or is that going to cause massive confusion?
  • Should I depend on NodaTime v1.1 or v1.3.1? They have different AssemblyVersion numbers, which I believe means an assembly binding redirect will be required if I depend on 1.1 but users depend on 1.3.1. To be clear, I don’t expect many users to still be on versions older than 1.3.1.
  • Likewise, is it going to cause issues for .NET 4.5 users who use NodaTime 2.0 (eventually) if they depend on a version of NodaTime.Tzdb that depends on NodaTime 1.3.1? Again, presumably assembly binding redirects are needed.
  • If I go with the “two-version” scheme (i.e. 1.2016.4 and 2.2016.4 etc) how careful would NodaTime 1.3.1 users have to be? I wouldn’t want them to accidentally get upgraded to NodaTime 2.0 when that’s released, by accidentally taking the 2.x line of NodaTime.Tzdb.
  • Does dotnet cli support the nuget allowedVersions feature at all? I haven’t found any support for it in DNX, but really it’s vital for this scheme to work at all – basically I’d expect a NodaTime 1.3.1 user to specify an allowed version range for NodaTime.Tzdb of [1,2)
  • Is my scheme of 1.2016.4 (etc) sensible? It’s somewhat abusing major/minor/patch, in that there’s no real difference in meaning between a minor version bump (“it’s the new year”) and a patch bump (“there’s been another release in the same year”). Neither kind of change will be breaking (unless you depend on specific time zones behaving in specific ways, of course), and it’s handy to be able to give a simple mapping between TZDB version and package version, but there may be consequences I’m unaware of.

Please feel free to ask clarifying questions in comments. Will look forward to getting some answers :)