(Note that this is deliberately not posted in the Noda Time blog. I reckon it’s of wider interest from a design perspective, and I won’t be posting any of the equivalent Noda Time code. I’ll just say now that we don’t have this sort of craziness in Noda Time, and leave it at that…)
A few weeks ago, I was answering a Stack Overflow question when I noticed that an operation on dates and times which should have been losing information apparently wasn’t doing so. I investigated further, and discovered some "interesting" aspects of both DateTime and TimeZoneInfo. In an effort to keep this post down to a readable length (at least for most readers; certain WebDriver developers who shall remain nameless have probably given up by now already) I’ll save the TimeZoneInfo bits for another post.
Background: daylight saving transitions and ambiguous times
There’s one piece of inherent date/time complexity you’ll need to understand for this post to make sense: sometimes, a local date/time occurs twice. For the purposes of this post, I’m going to assume you’re in the UK time zone. On October 28th 2012, at 2am local time (1am UTC), UK clocks will go back to 1am local time. So 1:20am local time occurs twice – once at 12:20am UTC (in daylight saving time, BST), and once at 1:20am UTC (in standard time, GMT).
If you want to run any of the code in this post and you’re not in the UK, please adjust the dates and times used to a similar ambiguity for when your clocks go back. If you happen to be in a time zone which doesn’t observe daylight savings, I’m afraid you’ll have to adjust your system time zone in order to see the effect for yourself.
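If you only want to see the ambiguity itself, without touching your system settings, a hand-built zone is one option. The following is a rough sketch, not part of the original code: CreateUkLikeZone and the "UK-like" zone ID are hypothetical names, and the rules are only an approximation of the UK ones. Note that it will not reproduce the DateTime behaviour described later in this post, because ToLocalTime and ToUniversalTime always use the system time zone.

```csharp
using System;

public static class AmbiguityDemo
{
    // Hypothetical stand-in for the UK zone: UTC+0 in winter, UTC+1 from the
    // last Sunday of March (1am local) to the last Sunday of October (2am local).
    public static TimeZoneInfo CreateUkLikeZone()
    {
        var start = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(
            new DateTime(1, 1, 1, 1, 0, 0), 3, 5, DayOfWeek.Sunday);
        var end = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(
            new DateTime(1, 1, 1, 2, 0, 0), 10, 5, DayOfWeek.Sunday);
        var rule = TimeZoneInfo.AdjustmentRule.CreateAdjustmentRule(
            DateTime.MinValue.Date, DateTime.MaxValue.Date,
            TimeSpan.FromHours(1), start, end);
        return TimeZoneInfo.CreateCustomTimeZone(
            "UK-like", TimeSpan.Zero, "UK-like", "GMT", "BST", new[] { rule });
    }

    public static void Main()
    {
        var zone = CreateUkLikeZone();
        var local = new DateTime(2012, 10, 28, 1, 20, 0); // Kind = Unspecified

        // 1:20am occurs twice on this date, so the zone reports it as ambiguous.
        Console.WriteLine(zone.IsAmbiguousTime(local)); // True

        // Two different instants map to the same local time.
        var utc1 = new DateTime(2012, 10, 28, 0, 20, 0, DateTimeKind.Utc);
        var utc2 = new DateTime(2012, 10, 28, 1, 20, 0, DateTimeKind.Utc);
        Console.WriteLine(TimeZoneInfo.ConvertTimeFromUtc(utc1, zone) == local); // True
        Console.WriteLine(TimeZoneInfo.ConvertTimeFromUtc(utc2, zone) == local); // True
    }
}
```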
DateTime.Kind and conversions
As you may already know, as of .NET 2.0, DateTime has a Kind property, of type DateTimeKind – an enum with the following values:
- Local: The DateTime is considered to be in the system time zone. Not an arbitrary "local time in some time zone", but in the specific current system time zone.
- Utc: The DateTime is considered to be in UTC (corollary: it always unambiguously represents an instant in time)
- Unspecified: This means different things in different contexts, but it’s a sort of "don’t know" kind; this is closer to "local time in some time zone" which is represented as LocalDateTime in Noda Time.
DateTime provides three methods to convert between the kinds:
- ToUniversalTime: if the original kind is Local or Unspecified, convert it from local time to universal time in the system time zone. If the original kind is Utc, this is a no-op.
- ToLocalTime: if the original kind is Utc or Unspecified, convert it from UTC to local time. If the original kind is Local, this is a no-op.
- SpecifyKind: keep the existing date/time, but just change the kind. (So 7am stays as 7am, but it changes the meaning of that 7am effectively.)
(Prior to .NET 2.0, ToUniversalTime and ToLocalTime were already present, but always assumed the original value needed conversion – so if you called x.ToLocalTime().ToLocalTime().ToLocalTime() the result would probably end up with the appropriate offset from UTC being applied three times!)
Of course, none of these methods change the existing value – DateTime is immutable, and a value type – instead, they return a new value.
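The parts of this that don’t depend on your time zone can be checked with a small sketch like the one below. The Describe helper is a hypothetical name introduced only for this illustration; everything else is the documented DateTime API. It shows SpecifyKind keeping the 7am value while changing the kind, and the two documented no-op conversions.

```csharp
using System;
using System.Globalization;

public static class KindDemo
{
    // Hypothetical helper: formats a value together with its Kind.
    public static string Describe(DateTime value) =>
        value.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture)
        + " (" + value.Kind + ")";

    public static void Main()
    {
        var sevenAm = new DateTime(2012, 6, 1, 7, 0, 0); // Kind defaults to Unspecified
        var asUtc = DateTime.SpecifyKind(sevenAm, DateTimeKind.Utc);
        var asLocal = DateTime.SpecifyKind(sevenAm, DateTimeKind.Local);

        // SpecifyKind returns a new value: same date/time, different kind.
        Console.WriteLine(Describe(sevenAm)); // 2012-06-01 07:00 (Unspecified)
        Console.WriteLine(Describe(asUtc));   // 2012-06-01 07:00 (Utc)
        Console.WriteLine(Describe(asLocal)); // 2012-06-01 07:00 (Local)

        // Converting a value to the kind it already has is a no-op.
        Console.WriteLine(Describe(asUtc.ToUniversalTime())); // 2012-06-01 07:00 (Utc)
        Console.WriteLine(Describe(asLocal.ToLocalTime()));   // 2012-06-01 07:00 (Local)
    }
}
```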
DateTime’s Deep Dark Secret
(The code in this section is presented in several chunks, but it forms a single complete piece of code – later chunks refer to variables in earlier chunks. Put it all together in a Main method to run it.)
Armed with the information in the previous sections, we should be able to make DateTime lose data. If we start with 12:20am UTC and 1:20am UTC on October 28th as DateTimes with a kind of Utc, when we convert them to local time (on a system in the UK time zone) we should get 1:20am in both cases due to the daylight saving transition. Indeed, that works:
var original1 = new DateTime(2012, 10, 28, 0, 20, 0, DateTimeKind.Utc);
var original2 = new DateTime(2012, 10, 28, 1, 20, 0, DateTimeKind.Utc);
// Convert to local time
var local1 = original1.ToLocalTime();
var local2 = original2.ToLocalTime();
// Result is the same for both values. Information loss?
var expected = new DateTime(2012, 10, 28, 1, 20, 0, DateTimeKind.Local);
Console.WriteLine(local1 == expected); // True
Console.WriteLine(local2 == expected); // True
Console.WriteLine(local1 == local2); // True
If we’ve started with two different values, applied the same operation to both, and ended up with equal values, then we must have lost information, right? That doesn’t mean that operation is "bad" any more than "dividing by 2" is bad. You ought to be aware of that information loss, that’s all.
So, we ought to be able to demonstrate that information loss further by converting back from local time to universal time. Here we have the opposite problem: from our local time of 1:20am, we have two valid universal times we could convert to – either 12:20am UTC or 1:20am UTC. Both answers would be correct – they are universal times at which the local time would be 1:20am. So which one will get picked? Well… here’s the surprising bit:
var roundTrip1 = local1.ToUniversalTime();
var roundTrip2 = local2.ToUniversalTime();
// Values round-trip correctly! Information has been recovered…
Console.WriteLine(roundTrip1 == original1); // True
Console.WriteLine(roundTrip2 == original2); // True
Console.WriteLine(roundTrip1 == roundTrip2); // False
Somehow, each of the local values knows which universal value it came from. The information has been recovered, so the reverse conversion round-trips each value back to its original one. How is that possible?
It turns out that DateTime actually has four potential kinds: Local, Utc, Unspecified, and "local but treat it as the earlier option when resolving ambiguity". A DateTime is really just a 64-bit number of ticks, but the range of DateTime is only January 1st 0001 to December 31st 9999. That range can be represented in 62 bits, leaving 2 bits "spare" to represent the kind. 2 bits gives 4 possible values… the three documented ones and the shadowy extra one.
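The arithmetic behind the "62 bits" claim is easy to check using only the public Ticks property. This is a minimal sketch; BitsNeeded is a hypothetical helper written for this illustration, and nothing here inspects how DateTime actually stores its data.

```csharp
using System;

public static class TickBudget
{
    // Hypothetical helper: number of bits needed to hold a non-negative value.
    public static int BitsNeeded(long value)
    {
        int bits = 0;
        while (value > 0)
        {
            bits++;
            value >>= 1;
        }
        return bits;
    }

    public static void Main()
    {
        long maxTicks = DateTime.MaxValue.Ticks;

        Console.WriteLine(maxTicks);                  // 3155378975999999999
        Console.WriteLine(BitsNeeded(maxTicks));      // 62
        Console.WriteLine(64 - BitsNeeded(maxTicks)); // 2 bits left over
    }
}
```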
Through experimentation, I’ve discovered that the kind is preserved if you perform arithmetic on the value, too… so if you go to another "fall back" DST transition such as October 30th 2011, the ambiguity resolution works the same way as before:
var local3 = local1.AddYears(-1).AddDays(2);
var local4 = local2.AddYears(-1).AddDays(2);
Console.WriteLine(local3.ToUniversalTime().Hour); // 0
Console.WriteLine(local4.ToUniversalTime().Hour); // 1
If you use DateTime.SpecifyKind with DateTimeKind.Local, however, it goes back to the "normal" kind, even though it looks like it should be a no-op:
var local5 = DateTime.SpecifyKind(local1, local1.Kind);
Console.WriteLine(local5.ToUniversalTime().Hour); // 1
Is this correct behaviour? Or should it be a no-op, just like calling ToLocalTime on a "local" DateTime is? (Yes, I’ve checked – that doesn’t lose the information.) It’s hard to say, really, as this whole business appears to be undocumented… at least, I haven’t seen anything in MSDN about it. (Please add a link in the comments if you find something. The behaviour actually goes against what’s documented, as far as I can tell.)
I haven’t looked into whether various forms of serialization preserve values like this faithfully, by the way – but you’d have to work hard to reproduce it in non-framework code. You can’t explicitly construct a DateTime with the "extra" kind; the only ways I know of to create such a value are via a conversion to local time or through arithmetic on a value which already has the kind. (Admittedly if you’re serializing a DateTime with a Kind of Local, you’re already on potentially shaky ground, given that you could be deserializing it on a machine with a different system time zone.)
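If you want to peek at the two spare bits yourself, one approach is to overlay a DateTime with a 64-bit integer. Treat the following as a sketch resting on an undocumented assumption: that the runtime stores a DateTime as a single 64-bit field with the kind in the top two bits. Overlay and TopTwoBits are hypothetical names, and a different runtime may lay the struct out differently.

```csharp
using System;
using System.Runtime.InteropServices;

public static class KindFlagPeek
{
    // Overlays the DateTime with a raw 64-bit view of the same memory.
    // ASSUMPTION: DateTime is exactly one 64-bit field (undocumented).
    [StructLayout(LayoutKind.Explicit)]
    private struct Overlay
    {
        [FieldOffset(0)] public DateTime Value;
        [FieldOffset(0)] public ulong Raw;
    }

    // Returns the top two bits of the raw representation as a number 0..3.
    public static int TopTwoBits(DateTime value)
    {
        var overlay = new Overlay { Value = value };
        return (int)(overlay.Raw >> 62);
    }

    public static void Main()
    {
        var utc = new DateTime(2012, 10, 28, 0, 20, 0, DateTimeKind.Utc);

        Console.WriteLine(TopTwoBits(DateTime.SpecifyKind(utc, DateTimeKind.Unspecified)));
        Console.WriteLine(TopTwoBits(utc));
        Console.WriteLine(TopTwoBits(DateTime.SpecifyKind(utc, DateTimeKind.Local)));

        // On a system in the UK time zone this is the first 1:20am, so under the
        // assumption above it is where the fourth bit pattern would show up.
        Console.WriteLine(TopTwoBits(utc.ToLocalTime()));
    }
}
```

The last line depends on the system time zone, so no expected output is given for it.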
I’ve misled you a little, I have to admit. In the code above, when I compared the "expected" value with the results of the first conversions, I deliberately specified DateTimeKind.Local in the constructor call. After all, that’s the kind we do expect. Well, yes – but I then printed the result of comparing this value with local1 and local2… and those comparisons would have been the same regardless of the kind I’d specified in the constructor.
All comparisons between DateTimes ignore the Kind property. It’s not just restricted to equality. So for example, consider this comparison:
var dt1 = new DateTime(2012, 6, 1, 8, 0, 0, DateTimeKind.Utc);
var dt2 = new DateTime(2012, 6, 1, 8, 30, 0, DateTimeKind.Local);
Console.WriteLine(dt1 < dt2); // True
When viewed in terms of "what instants in time do these both represent?" the answer here is wrong – when you convert both values into the same time zone (in either direction), dt1 occurs after dt2. But a simple look at the properties tells a different story. In practice, I suspect that most comparisons between DateTime values of different kinds involve code which is at best sloppy and is quite possibly broken in a meaningful way.
Of course, if you bring Kind=Unspecified into the picture, it becomes impossible to compare meaningfully in a kind-sensitive way. Is 12am UTC before or after 1am Unspecified? It depends what time zone you later use.
To be clear, it is a hard-to-resolve issue, and one that we don’t do terribly well at in Noda Time at the moment for ZonedDateTime. (And even with just LocalDateTime you’ve got issues between calendars.) This is a situation where providing separate Comparer<T> implementations works nicely – so you can explicitly say what kind of comparison you want.
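As a rough illustration of the explicit-comparer idea for DateTime itself, here is a sketch of a comparer that orders values by the instant they represent. InstantComparer is a hypothetical name, not something in the framework or in Noda Time, and refusing Unspecified values is just one possible design choice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders DateTime values by instant rather than by
// their raw date/time properties.
public sealed class InstantComparer : Comparer<DateTime>
{
    public override int Compare(DateTime x, DateTime y)
    {
        return ToInstantTicks(x).CompareTo(ToInstantTicks(y));
    }

    private static long ToInstantTicks(DateTime value)
    {
        // An Unspecified value has no well-defined instant, so refuse it
        // rather than silently guessing a time zone.
        if (value.Kind == DateTimeKind.Unspecified)
        {
            throw new ArgumentException("Cannot compare a DateTime with Kind=Unspecified by instant.");
        }
        return value.ToUniversalTime().Ticks;
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        var dt1 = new DateTime(2012, 6, 1, 8, 0, 0, DateTimeKind.Utc);
        var dt2 = new DateTime(2012, 6, 1, 8, 30, 0, DateTimeKind.Local);

        // Built-in comparison ignores Kind.
        Console.WriteLine(dt1 < dt2); // True

        // Instant-based comparison: the sign depends on the system time zone.
        // On a system in the UK time zone, dt2 is 7:30am UTC, so this prints 1.
        Console.WriteLine(Math.Sign(new InstantComparer().Compare(dt1, dt2)));
    }
}
```

Callers then pick the comparison they mean, for example by passing new InstantComparer() to List<DateTime>.Sort.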
There’s more fun to be had with a similar situation when we look at TimeZoneInfo, but for now, a few lessons:
- Giving a type different "modes" which make it mean fairly significantly different things is likely to cause headaches
- Keeping one of those modes secret (and preventing users from even constructing a value in that mode directly) leads to even more fun and games
- If two instances of your type are considered "equal" but behave differently, you should at least consider whether there’s something smelly going on
- There’s always more fun to be had with DateTime…
20 thoughts on “More fun with DateTime”
I think you should have noted something else here – 1:20am occurs once in UTC(/GMT), and then once again in BST. It is a problem with plain DateTime, since it doesn’t know about this, but if the time zones are preserved then there is no ambiguity.
(Not that it influences the rest of your post, but I think it should be stated explicitly…)
Interesting… so ‘Local’ represents ‘System time and treat as the later if it’s ambiguous’? Presumably using DateTimeOffset fixes all these issues.
I’d put it as “There’s always another timezone bug in your code” :(.
Is there an advantage to representing the DateTime internally as the local time, instead of always storing the number for the UTC time and only paying attention to the Kind when you’re accessing .Hour, .Day, etc.? That way equality and comparisons would be intuitive and work as expected in your last case, and you wouldn’t need any sort of hack to deal with the “2nd 1:20”.
@Porges: I’ve updated the post with the first part of your comment, but not the second. Local time + time zone isn’t enough to know, because “GMT or BST” isn’t the time zone: “the UK time zone” is. If you kept the *part* of the time zone you were in (zone interval in Noda speak), that would work… but another alternative is to keep the local time and offset (like DateTimeOffset does) or local time, offset and time zone (like Noda’s ZonedDateTime does).
@Simon: Yes – although the “special” kind is also reported as Local.
@David: It depends on what you’re trying to represent. If you’re trying to represent the first occurrence of a meeting which repeats every week, then you really do mean the local time – as that can change UTC offset from week to week.
There’s a section in the Noda Time user guide which gives more details about my feelings on this – basically, if you’ve got a decent set of types to work with, use the one which most accurately models the information you’ve *really* got.
Thanks for the post Jon, you’re very good at explaining non-trivial things as if they are very simple
@Gleb: Thanks, but in some ways I think I’m actually good at taking something simple, then making it *sound* like it’s really complicated, but I’m doing you a favour and simplifying it.
It’s like a mischievous salesman selling you something cheap for a “reasonable” price – after convincing you that it’s normally expensive ;)
With a bit of StructLayout hackery, you can observe these hidden flags directly. I’ve adapted your program here:
Here are my results, on Mono and .NET:
.NET does show the behaviour and implementation you describe. Mono 2.10 shows the behaviour you might expect, and no fourth state in its flags.
BTW the IsDaylightSavingTime() method returns different results for local1 and local2:
Console.WriteLine(local1.IsDaylightSavingTime()); // True
Console.WriteLine(local2.IsDaylightSavingTime()); // False
Luckily .NET 3.5 and newer have DateTimeOffset & TimeZoneInfo, so the only oddity left in .NET’s date handling is the inexplicable fact that DateTime has not been deprecated, even in 4.5. Using DateTime instead of DateTimeOffset is a code smell, as is using a 3rd party library that duplicates the functionality, since those libraries were also obsoleted by v3.5.
@John Meyer: I disagree strongly. There are times when using DateTimeOffset is entirely incorrect, and you should use DateTime instead – if you ever find yourself having to *invent* the offset (because you really don’t know it) then you should be using DateTime instead.
MSDN has a good article about choosing between DateTimeOffset and DateTime: http://msdn.microsoft.com/en-us/library/bb384267.aspx
Likewise, I’m not sure which 3rd party library you’re talking about that “duplicates the functionality” – but if you mean Noda Time, I can only assume you haven’t actually looked at the project, as it does *so* much more than duplicate DateTimeOffset/DateTime.
Interesting article, and thanks for all your posts. It certainly does get tricky, comparing DateTimes of different locales.
As I was reading, I was thinking there must be something to determine DST in the DateTime, and according to Thorn’s post there is. Possibly included in the same bits as DST is something used to determine which locale the DateTime was created in. That way, even on deserialization the correct date time “could” be determined (whether it actually does or not is another story). Since there are only around ~24, it wouldn’t take much memory either.
@Robert: There are a few misunderstandings here.
Firstly, a “locale” isn’t the same as a time zone. Much of the US would use the “en-US” locale, but there are lots of different time zones in the US.
Secondly, a DateTime *doesn’t* contain DST information any more than a string contains information about whether it’s a valid URL for a working web server – DateTime only contains information which can be *interpreted* in the context of a time zone. The single bit of “I know that in the case of ambiguity, you should regard me as the earlier occurrence” is the only DST-related information in a DateTime.
Thirdly, there are far more than 24 cultures, and also far more than 24 time zones. Even the Windows time zone database (which is relatively sparse compared with tzdb) contains 101 zones. DateTime certainly doesn’t know which time zone it’s in, beyond the “unspecified, UTC or local” distinction described in the post.
I stand corrected on the definition of timezone vs locale. Thanks for clarifying.
I haven’t delved into how .Net or Mono stores DateTime under the hood. I mean I learned from MSDN about it storing Ticks and that was about it, most the rest was an “educated” guess.
Maybe part of the problem that you describe with the current process is that it doesn’t know where it came from. Where that really started to stick out was your talk about (de)serialization into a different timezone.
I’m curious to hear your thoughts on adding something, possibly a couple of enums and/or booleans (more interested in the concept, not specifically what to use), something that could hold this extended information, and whether it could be used to hold us off until we start colonizing other planets.
@Robert: Probably the best thing to do is read the “concepts” guide for Noda Time – given that I’ve designed that to hold what I think is appropriate :)
Saw a fun “time piece” by Bart De Smet on C9 on Rx that I thought you might find interesting. First 1/2 hour talks about absolute vs system time w/ interesting corner cases for “tombstoned” apps and NTP system clock synchronization:
@David: Thanks, will have a look. Bart’s always fun to listen to :)