Storing UTC is not a silver bullet

March 27, 2019 jonskeet 117 Comments

Note: this is a pretty long post. If you’re not interested in the details, the conclusion at the bottom is intended to be read in a standalone fashion. There’s also a related blog post by Lau Taarnskov – if you find this one difficult to read for whatever reason, maybe give that a try.

When I read Stack Overflow questions involving time zones, there’s almost always someone giving the advice to only ever store UTC. Convert to UTC as soon as you can, and convert back to a target time zone as late as you can, for display purposes, and you’ll never have a time zone issue again, they say.

This blog post is intended to provide a counterpoint to that advice. I’m certainly not saying storing UTC is always the wrong thing to do, but it’s not always the right thing to do either.

Note on simplifications: this blog post does not go into supporting non-Gregorian calendar systems, or leap seconds. Hopefully developers writing applications which need to support either of those are already aware of their requirements.

Background: EU time zone rule changes

The timing of this blog post is due to recent European Parliament proceedings that look like they will probably end the clocks changing twice a year into “summer time” or “winter time” within EU member states. The precise details are yet to be finalized and are unimportant to the bigger point, but for the purpose of this blog post I’ll assume that each member state has to decide whether they will “spring forward” one last time on March 28th 2021, then staying in permanent “summer time”, or “fall back” one last time on October 31st 2021, then staying in permanent “winter time”. So from November 1st 2021 onwards, the UTC offset of each country will be fixed – but there may be countries which currently always have the same offset as each other, and will have different offsets from some point in 2021. (For example, France could use winter time and Germany could use summer time.)

The larger point is that time zone rules change, and that applications should expect that they will change. This isn’t a corner case, it’s the normal way things work. There are usually multiple sets of rule changes (as released by IANA) each year. At least in the European changes, we’re likely to have a long notice period. That often isn’t the case – sometimes we don’t find out about rule changes until a few days before they happen.

Application example

For the sake of making everything concrete, I’m going to imagine that we’re writing an application to help conference organizers. A conference organizer can create a conference within the application, specifying when and where it’s happening, and (amongst other things) the application will display a countdown timer of “the number of hours left before the start of the conference”. Obviously a real application would have a lot more going on than this, but that’s enough to examine the implementation options available.

To get even more concrete, we’ll assume that a conference organizer has registered a conference called “KindConf” and has said that it will start at 9am in Amsterdam, on July 10th 2022. They perform this registration on March 27th 2019, when the most recently published IANA time zone database is 2019a, which predicts that the offset observed in Amsterdam on July 10th 2022 will be UTC+2.

For the sake of this example, we’ll assume that the Netherlands decides to fall back on October 31st 2021 for one final time, leaving them on a permanent offset of UTC+1. Just to complete the picture, we’ll assume that this decision is taken on February 1st 2020, and that IANA publishes the changes on March 14th 2020, as part of release 2020c.

So, what can the application developer do? In all the options below, I have not gone into details of the database support for different date/time types. This is important, of course, but probably deserves a separate blog post in its own right, on a per-database basis. I’ll just assume we can represent the information we want to represent, somehow.

Interlude: requirements

Before we get to the implementations, I’ll just mention a topic that’s been brought up a few times in the comments and on Twitter. I’ve been assuming that the conference does still occur at 9am on July 10th 2022… in other words, that the “instant in time at which the conference starts” changes when the rules change.

It’s unlikely that this would ever show up in a requirements document. I don’t remember ever being in a meeting with a product manager where they’d done this type of contingency planning. If you’re lucky, someone would work out that there’s going to be a problem long before the rules actually change. At that point, you’d need to go through the requirements and do the implementation work. I’d argue that this isn’t a new requirement – it’s a sort of latent, undiscovered requirement you’ve always had, but you hadn’t known about before.

Now, back to the options…

Option 1: convert to UTC and just use that forever

The schema for the Conferences table in the database might look like this:

ID: auto-incremented integer
Name: string
Start: date/time in UTC
Address: string

The entry for KindConf would look like this:

ID: 1
Name: KindConf
Start: 2022-07-10T07:00:00Z
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands

That entry is then preserved forever, without change. So what happens to our countdown timer?

Result

The good news is that anyone observing the timer will see it smoothly count down towards 0, with no jumps. The bad news is that when it reaches 0, the conference won’t actually start – there’ll be another hour left. This is not good.

Option 2: convert to UTC immediately, but reconvert after rule changes

The schema for the Conferences table would preserve the time zone ID. (I’m using the IANA ID for simplicity, but it could be the Windows system time zone ID, if absolutely necessary.) Alternatively, the time zone ID could be derived each time it’s required – more on that later.

ID: auto-incremented integer
Name: string
Start: date/time in UTC
Address: string
Time zone ID: string

The initial entry for KindConf would look like this:

ID: 1
Name: KindConf
Start: 2022-07-10T07:00:00Z
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam

On March 14th 2020, when the new time zone database is released, that entry could be changed to make the start time accurate again:

ID: 1
Name: KindConf
Start: 2022-07-10T08:00:00Z
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam

But what does that “change” procedure look like? We need to convert the UTC value back to the local time, and then convert back to UTC using different rules. So which rules were in force when that entry was created? It looks like we actually need an extra field in the schema somewhere: TimeZoneRulesVersion. This could potentially be a database-wide value, although that’s only going to be reasonable if you can update all entries and that value atomically. Allowing a value per entry (even if you usually expect all entries to be updated at roughly the same time) is likely to make things simpler.

So our original entry was actually:

ID: 1
Name: KindConf
Start: 2022-07-10T07:00:00Z
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam
TimeZoneRules: 2019a

And the modified entry is:

ID: 1
Name: KindConf
Start: 2022-07-10T08:00:00Z
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam
TimeZoneRules: 2020c

Of course, the entry could have been updated many times over the course of time, for 2019b, 2019c, …, 2020a, 2020b. Or maybe we only actually update the entry if the start time changes. Either way works.

Result

Now, anyone refreshing the countdown timer for the event will see the counter increase by an hour when the entry is updated. That may look a little odd – but it means that when the countdown timer reaches 0, the conference is ready to start. I’m assuming this is the desired behaviour.

Implementation

Let’s look at roughly what would be needed to perform this update in C# code. I’ll assume the use of Noda Time to start with, but then we’ll consider what happens if you’re not using Noda Time.

public class Conference
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public Instant Start { get; set; }
    public string TimeZoneId { get; set; }
    public string TimeZoneRules { get; set; }
}

// In other code... some parameters might be fields in the class.
public void UpdateStartTime(
    Conference conference,
    Dictionary<string, IDateTimeZoneProvider> timeZoneProvidersByVersion,
    string latestRules)
{
    // Map the start instant into the time zone using the old rules
    IDateTimeZoneProvider oldProvider = timeZoneProvidersByVersion[conference.TimeZoneRules];
    DateTimeZone oldZone = oldProvider[conference.TimeZoneId];
    ZonedDateTime oldZonedStart = conference.Start.InZone(oldZone);   

    IDateTimeZoneProvider newProvider = timeZoneProvidersByVersion[latestRules];
    DateTimeZone newZone = newProvider[conference.TimeZoneId];
    // Preserve the local time, but with the new time zone rules
    ZonedDateTime newZonedStart = oldZonedStart.LocalDateTime.InZoneLeniently(newZone);

    // Update the conference entry with the new information
    conference.Start = newZonedStart.ToInstant();
    conference.TimeZoneRules = latestRules;
}

The InZoneLeniently call is going to be a common issue – we’ll look at that later (“Ambiguous and skipped times”).

This code would work, and Noda Time would make it reasonably straightforward to build that dictionary of time zone providers, as we publish all the “NZD files” we’ve ever created from 2013 onwards on the project web site. If the code is being updated with the latest stable version of the NodaTime NuGet package, the latestRules parameter wouldn’t be required – DateTimeZoneProviders.Tzdb could be used instead. (And IDateTimeZoneProvider.VersionId could obtain the current version.)

However, this approach has three important requirements:

The concept of “version of time zone rules” has to be available to you
You have to be able to load a specific version of the time zone rules
You have to be able to use multiple versions of the time zone rules in the same application

If you’re using C# but relying on TimeZoneInfo then… good luck with any of those three. (It’s no doubt feasible, but far from simple out of the box, and it may require an external service providing historical data.)

I can’t easily comment on other platforms in any useful way, but I suspect that dealing with multiple versions of time zone data is not something that most developers come across.

Option 3: preserve local time, using UTC as derived data to be recomputed

Spoiler alert: this is my preferred option.

In this approach, the information that the conference organizer supplied (“9am on July 10th 2022”) is preserved and never changed. There is additional information in the entry that is changed when the time zone database is updated: the converted UTC instant. We can also preserve the version of the time zone rules used for that computation, as a way of allowing the process of updating entries to be restarted after a failure without starting from scratch, but it’s not strictly required. (It’s also probably useful as diagnostic information, too.)

The UTC instant is only stored at all for convenience. Having a UTC representation makes it easier to provide total orderings of when things happen, and also to compute the time between “right now” and the given instant, for the countdown timer. Unless it’s actually useful to you, you could easily omit it entirely. (My Noda Time benchmarks suggest it’s unlikely that doing the conversion on every request wouldn’t cause a bottleneck. A single local-to-UTC conversion on my not-terribly-fast benchmark machine only takes ~150ns. In most environments that’s close to noise. But for cases where it’s relevant, it’s fine to store the UTC as described below.)

So the schema would have:

ID: auto-incremented integer
Name: string
Local start: date/time in the specified time zone
Address: string
Time zone ID: string
UTC start: derived field for convenience
Time zone rules version: for optimization purposes

So our original entry is:

ID: 1
Name: KindConf
LocalStart: 2022-07-10T09:00:00
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam
UtcStart: 2022-07-10T07:00:00Z
TimeZoneRules: 2019a

On March 14th 2020, when the time zone database 2020c is released, this is modified to:

ID: 1
Name: KindConf
LocalStart: 2022-07-10T09:00:00
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam
UtcStart: 2022-07-10T08:00:00Z
TimeZoneRules: 2020c

Result

This is the same as option 2: after the update, there’s a jump of an hour, but when it reaches 0, the conference starts.

Implementation

This time, we don’t need to convert our old UTC value back to a local value: the “old” time zone rules version and “old” UTC start time are irrelevant. That simplifies matter significantly:

public class Conference
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public LocalDateTime LocalStart { get; set; }
    public string TimeZoneId { get; set; }
    public Instant UtcStart { get; set; }
    public string TimeZoneRules { get; set; }
}

// In other code... some parameters might be fields in the class.
public void UpdateUtcStart(
    Conference conference,
    IDateTimeZoneProvider latestZoneProvider)
{
    DateTimeZone newZone = latestZoneProvider[conference.TimeZoneId];
    // Preserve the local time, but with the new time zone rules
    ZonedDateTime newZonedStart = conference.LocalStart.InZoneLeniently(newZone);

    // Update the conference entry with the new information
    conference.UtcStart = newZonedStart.ToInstant();
    conference.TimeZoneRules = latestZoneProvider.VersionId;
}

As the time zone rules version is now optional, this code could be ported to use TimeZoneInfo instead. Obviously from my biased perspective the code wouldn’t be as pleasant, but it would be at least reasonable. The same is probably true on other platforms.

So I prefer option 3, but is it really so different from option 2? We’re still storing the UTC value, right? That’s true, but I believe the difference is important because the UTC value is an optimization, effectively.

Principle of preserving supplied data

For me, the key difference between the options is that in option 3, we store and never change what the conference organizer entered. The organizer told us that the event would start at the given address in Amsterdam, at 9am on July 10th 2022. That’s what we stored, and that information never needs to change (unless the organizer wants to change it, of course). The UTC value is derived from that “golden” information, but can be re-derived if the context changes – such as when time zone rules change.

In option 2, we don’t store the original information – we only store derived information (the UTC instant). We need to store information to tell us all the context about how we derived it (the old time zone rules version) and when updating the entry, we need to get back to the original information before we can re-derive the UTC instant using the new rules.

If you’re going to need the original information anyway, why not just store that? The implementation ends up being simpler, and it means it doesn’t matter whether or not we even have the old time zone rules.

Representation vs information

It’s important to note that I’m only talking about preserving the core information that the organizer entered. For the purposes of this example at least, we don’t need to care about the representation they happened to use. Did they enter it as “July 10 2022 09:00” and we then parsed that? Did they use a calendar control that provided us with “2022-07-10T09:00”? I don’t think that’s important, as it’s not part of the core information.

It’s often a useful exercise to consider what aspects of the data you’re using are “core” and which are incidental. If you’re receiving data from another system as text for example, you probably don’t want to store the complete XML or JSON, as that choice between XML and JSON isn’t relevant – the same data could be represented by an XML file and a JSON file, and it’s unlikely that anything later will need to know or care.

A possible option 4?

I’ve omitted a fourth option which could be useful here, which is a mixture of 2 and 3. If you store a “date/time with UTC offset” then you’ve effectively got both the local start time and the UTC instant in a single field. To show the values again, you’d start off with:

ID: 1
Name: KindConf
Start: 2022-07-10T09:00:00+02:00
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam
TimeZoneRules: 2019a

On March 14th 2020, when the time zone database 2020c is released, this is modified to:

ID: 1
Name: KindConf
Start: 2022-07-10T09:00:00+01:00
Address: Europaplein 24, 1078 GZ Amsterdam, Netherlands
TimeZoneId: Europe/Amsterdam
TimeZoneRules: 2020c

In systems that support “date/time with UTC offset” well in both the database and the languages using it, this might be an attractive solution. It’s important to note that the time zone ID is still required (unless you derive it from the address whenever you need it) – there’s a huge difference between knowing the time zone that’s applied, and knowing the UTC offset in one specific situation.

Personally I’m not sure I’m a big fan of this option, as it combines original and derived data in a single field – the local part is the original data, and the offset is derived. I like the separation between original and derived data in option 3.

With all those options presented, let’s look at a few of the corner cases I’ve mentioned in the course of the post.

Ambiguous and skipped times

In both of the implementations I’ve shown, I’ve used the InZoneLeniently method from Noda Time. While the mapping from UTC instant to local time is always completely unambiguous for a single time zone, the reverse mapping (from local time to UTC instant) is not always unambiguous.

As an example, let’s take the Europe/London time zone. On March 31st 2019, at 1am local time, we will “spring forward” to 2am, changing offset from UTC+0 to UTC+1. On October 27th 2019, at 2am local time, we will “fall back” to 1am, changing offset from UTC+1 to UTC+0. That means that 2019-03-31T01:30 does not happen at all in the Europe/London time zone, and 2019-10-27T01:30 occurs twice.

Now it’s reasonable to validate this when a conference organizer specifies the starting time of a conference, either prohibiting it if the given time is skipped, or asking for more information if the given time is ambiguous. I should point out that this is highly unlikely for a conference, as transitions are generally done in the middle of the night – but other scenarios (e.g. when to schedule an automated backup) may well fall into this.

That’s fine at the point of the first registration, but it’s also possible that a previously-unambiguous local time could become ambiguous under new time zone rules. InZoneLeniently handles that in a way documented in the Resolvers.LenientResolver. That may well not be the appropriate choice for any given application, and developers should consider it carefully, and write tests.

Recurrent events

The example I’ve given so far is for a single event. Recurrent events – such as weekly meetings – end up being trickier still, as a change to time zone rules can change the offsets for some instances but not others. Likewise meetings may well be attended by people from more than a single time zone – so it’s vital that the recurrence would have a single coordinating time zone, but offsets may need to be recomputed for every time zone involved, and for every occurrence. Application developers have to think about how this can be achieved within performance requirements.

Time zone boundary changes and splits

So far we’ve only considered time zone rules changing. In options 2-4, we stored a time zone ID within the entry. That assumes that the time zone associated with the event will not change over time. That assumption may not be valid.

As far as I’m aware, time zone rules change more often than changes to which time zone any given location is in – but it’s entirely possible for things to change over time. Suppose the conference wasn’t in Amsterdam itself, but Rotterdam. Currently Rotterdam uses the Europe/Amsterdam time zone, but what if the Netherlands splits into two countries between 2019 and 2022? It’s feasible that by the time the conference occurs, there could be a Europe/Rotterdam time zone, or something equivalent.

To that end, a truly diligent application developer might treat the time zone ID as derived data based on the address of the conference. As part of checking each entry when the time zone database is updated, they might want to find the time zone ID of the address of the conference, in case that’s changed. There are multiple services that provide this information, although it may need to be a multi-step process, first converting the address into a latitude/longitude position, and then finding the time zone for that latitude/longitude.

Past vs recent past

This post has all been about future date/time values. In Twitter threads discussing time zone rule changes, there’s been a general assertion that it’s safe to only store the UTC instant related to an event in the past. I would broadly agree with that, but with one big caveat: as I mentioned earlier, sometimes governments adopt time zone rule changes with almost no notice at all. Additionally, there can be a significant delay between the changes being published and them being available within applications. (That delay can vary massively based on your platform.)

This means that while a conversion to UTC for a value more than (say) a year ago will probably stay valid, if you’re recording a date and time of “yesterday”, it’s quite possible that you’re using incorrect rules without knowing it. (Even very old rules can change, but that’s rarer in my experience.)

Do you need to account for this? That depends on your application, like so many other things. I’d at least consider the principle described above – and unless it’s much harder for you to maintain the real source information for some reason, I’d default to doing that.

Conclusion

The general advice of “just convert all local date/time data to UTC and store that” is overly broad in my view. For future and near-past events, it doesn’t take into account that time zone rules change, making the initial conversion potentially inaccurate. Part of the point of writing this blog post is to raise awareness, so that even if people do still recommend storing UTC, they can add appropriate caveats rather than treating it as a universal silver bullet.

I should explicitly bring up timestamps at this point. Machine-generated timestamps are naturally instants in time, recording “the instant at which something occurred” in an unambiguous way. Storing those in UTC is entirely reasonable – potentially with an offset or time zone if the location at which the timestamp was generated is relevant. Note that in this case the source of the data isn’t “a local time to be converted”.

That’s the bigger point, that goes beyond dates and times and time zones: choosing what information to store, and how. Any time you discard information, that should be a conscious choice. Are you happy discarding the input format that was used to enter a date? Probably – but it’s still a decision to make. Defaulting to “convert to UTC” is a default to discarding information which in some cases is valid, but not all. Make it a conscious choice, and ensure you store all the information you think may be needed later. You might also want to consider whether and how you separate “source” information from “derived” information – this is particularly relevant when it comes to archiving, when you may want to discard all the derived data to save space. That’s much easier to do if you’re already very aware of which data is derived.

My experience is that developers either don’t think about date/time details nearly enough when coding, or are aware of some of the pitfalls but decide that means it’s just too hard to contemplate. Hopefully this worked example of real life complexity shows that it can be done: it takes a certain amount of conscious thought, but it’s not rocket science.

117 thoughts on “Storing UTC is not a silver bullet”

Pingback: Дата и время: трудно, но возможно / Блог компании JUG Ru Group / Хабр
Pingback: difficult, but possible / JUG Ru Group blog / Habr – Developers
hortonolite says:

December 3, 2020 at 7:58 am

Hi,
I check your last presentation on DOTNEXT 2020 PITER. Thank you that makes difficult things simpler.

But I do not agree with the example of the Paris meeting. IMHO, you are trying to store in the time field information that does not belong to it. The meeting is a complex object, it has many attributes and one of them is time. Another is the location.

I think we should store the time in UTC format and present it to the participants according to the actual TZ (on the presentation moment) in the meeting’s locale if the user attends OFFLINE, or using user TZ if the user attends the meeting ONLINE.

Best,
Stas

LikeLike

Reply
1. jonskeet says:
  
  December 3, 2020 at 8:05 am
  
  “I think we should store the time in UTC format” – so you want to ignore what the user actually said (“I want it to start at 9am in Paris”) and instead store the results of “I want it to start at the time you inaccurately predicted would be 9am in Paris, when the event was created”? That doesn’t sound like a good thing to me.
  
  If the user said they wanted it to start at 9am in Paris, how can it possibly be correct for you to present any time other than 10am to someone in Paris?
  
  Sure, how you adjust the value for other time zones is a different matter, but the fact is that the user asked you to make it start at 9am in Paris, so if you then give any other time of day, that’s surely just wrong.
  
  LikeLiked by 1 person
  
  Reply
  1. jonskeet says:
    
    December 3, 2020 at 8:10 am
    
    Sorry, I meant “how can it possibly be correct for you to present a time of 10am to someone in Paris” in the middle paragraph.
    
    LikeLiked by 1 person
    
    Reply
  2. hortonolite says:
    
    December 3, 2020 at 8:16 am
    
    Sorry, I was slightly wrong. We should store in the database the local time and location. For OFFLINE user present the local time, for ONLINE users presents the time based on actual TZ in the meeting location and online user TZ.
    
    ((localTime + (locale -> actualMeetingTz)) -> actualUtcTime;
    (actualUtcTime + onlineUserTz) -> actualOnlineUserTime.
    
    LikeLiked by 1 person
    
    Reply
    1. jonskeet says:
      
      December 3, 2020 at 8:30 am
      
      We should store in the database the local time and location.
      
      Isn’t that effectively what I’ve said? That’s the whole point of both this blog post and the example in the talk – don’t convert to UTC too early, because you’re losing information. The aspect of “how you present this to offline users vs online users” is an orthogonal matter.
      
      Whether you store the location or the time zone ID could definitely be debated, but the point is that “just store UTC and all will be well” is a flawed approach.
      
      LikeLiked by 1 person
      
      Reply
      1. hortonolite says:
        
        December 3, 2020 at 9:21 am
        
        I think it is wrong to talk about ZonedTime at all in the context of this example. There are two inputs – the local moment of time and meeting location. And IMHO, we never should store the time zone in this case (for what?).
        
        I believe that we should store UTC in all cases when we need to store zoned time, but the case above shouldn’t store the zoned time info at all – that’s the point :)
        
        And yes – you are right in the general point of the post. Any moment/event in the future related to some location can’t be converted to UTC/AnyOtherTz “with confidence”.
        
        LikeLike
        
        Reply
  3. Gerwin says:
    
    April 12, 2022 at 12:17 pm
    
    In general (and the most common use case) store timestamps with a timezone, preferably UTC to save on conversions going in and out and dealing with timezone differences in comparisons. This covers all historical and fixed point in time events.
    
    For calendar events, special rules apply.
    1. Is it a Future and Absolute time? store it with the timezone, don’t convert to UTC. Timezone changes will not affect the event time.
    2. Is it a Floating time for a future (possibly recurring) event such as “Lunch is 1hr long every day from noon”, the occurrence cannot be pinned down until it can be resolved down to having a timezone.
    3. Is it a recurring event? Don’t store the timestamp for each event in the future, store a recurrence rule using the logic provided in RFC5545 so that you can easily generate future events. This means storing the Start time (DTSTART) of the recurring event, recurrence rule(s), and optional end time (DTEND). These combined allow for the generation of any events between the start and end time, and these will either be as an Absolute time if the applicable timezone is known (so treat as case 1), or Floating events if the timezone is not known (so treat as case 2).
    
    LikeLike
    
    Reply
Dan says:

February 7, 2021 at 12:18 pm

Great article. I’m currently storing local time + timezoneid. (and also lat/long for address where the physical service offered takes place). But what I don’t understand is why you are persisting TimeZoneRules. Is this just a failsafe in case everything changes in the future? What do you mean “for optimization purposes”? I thought IANA tracked all historic changes to timezone rules, so that if a country decided to drop daylight saving then this would be reflected?

LikeLike

Reply
1. jonskeet says:
  
  February 7, 2021 at 12:41 pm
  
  It’s often useful to be able to store the result of a UTC conversion under current expected rules – so that you can (for example) perform a complete ordering of events without doing the conversion for every event, every time you need to see the results of a complete ordering.
  
  However, if you’re going to store the UTC conversion, you need to know what version of the data it’s derived from, so you can then perform the conversion again when the rules change. So it’s not about history – it’s about documenting the assumptions made when deriving data.
  
  LikeLiked by 2 people
  
  Reply
talanath says:

February 7, 2021 at 12:30 pm

Hi. First of all, great article. I am currently storing local time as well as timezoneID. (Also, lat / long against a business address where physical service is located). What I am unsure of is why store the timezone rules? What “optimization” are you referring to?

LikeLike

Reply
1. jonskeet says:
  
  February 7, 2021 at 12:42 pm
  
  I’ve approved and replied to your previous comment – at least, I assume it was yours given that it’s got similar content. In future, please allow a bit of time for moderation before posting another comment.
  
  LikeLike
  
  Reply
Pingback: Awesome Falsehood – Massive Collection of Resources – Learn Practice & Share
Pingback: Care este un caz de utilizare valid pentru utilizarea TIMESTAMP FĂRĂ ZONA ORARĂ?
David says:

April 15, 2021 at 5:32 pm

I enjoyed your article. I’m worse than those that insist on using UTC – I store everything in the DB via local time via .Net DateTime with no regard to client time vs server. All these time concepts are a new rabbit hole for me.

I am not interested in storing future dates for the what I’m working on – but storing the correct date / time for instances that have already occurred, and doing so regardless of client time vs server time.

Since I’m not going to be storing future dates, would I be better off storing in UTC?

LikeLike

Reply
1. jonskeet says:
  
  April 15, 2021 at 5:35 pm
  
  If you’re recording the timestamp at which a computer observed something happened (e.g. in logs) then yes, I suspect UTC is best.
  
  If you’re recording the theoretical time of something (e.g. “We had a meeting this morning; it occurred at 9am”) then it’s slightly more complex… you may wish to store UTC and a time zone, or maybe local time, a time zone and an offset… it depends on how you’ll use things. But either way you can at least be reasonably confident you’re not losing data.
  
  LikeLike
  
  Reply
  1. David says:
    
    April 16, 2021 at 4:55 pm
    
    Thanks for the reply.
    I’d say 90-95% of the fields involving date / time in are ‘the timestamp at which a computer observed something happened’. There are a couple of Date inputs that the user can enter (but not time), so I’ve started making those fields in the code as LocalDate.
    
    There’s only one place I can think of that’s going to give me pause on how best to handle it: there is a table with a ‘Created’ column – that’s just computer observed timestamp, but there is also an ‘Original’ field and that one can be one two cases:
    1. The record type is something we created, so Created == Original, and Original is immutable at that point.
    2. The record type is something the user imported from an external source, so Original starts as Created, but then the user can specify a new date / time via a DateTimePicker control. They can edit this value at any time, not just when the record is first brought into the system, and because of the record type, they can edit this value as many times as they want.
    
    Case #1 is by far the most common. But basically, because of the two cases Original may just be the computer timestamp and might not be.
    
    I’ll have to give it more thought before implementation, but I’m thinking I might have Original be its own thing purely for the Case #2 and store it as a ZonedDateTime (and a DateTimeOffset field in MS SQL) with a new field to store which time zone it belongs to. Then in the code add a 3rd read only property that determines if it needs to convert Created to a ZonedDateTime based on record type (converted from UTC to user’s PC timezone), or if it’s other record types, then it returns the Original field value as is.
    
    Reason for that line of thinking is the old way of handling things, Created is never actually visible anywhere in the UI to the end user, and Original is what is displayed to the end user where needed. This new read only property would take over the display responsibility.
    
    LikeLike
    
    Reply
matt says:

November 2, 2021 at 4:25 pm

Not sure why we need to store the original timezone information. I mean it’s obviously nice for display purposes but I’m not sure what it buys you. As an implementation is concerned it seems like time remaining could be updated each morning just after midnight and the timer would adjust then to any timezone rule changes. Am I missing something?

LikeLike

Reply
1. jonskeet says:
  
  November 2, 2021 at 4:27 pm
  
  Um, it buys you “being able to get back to the originally-intended value when the time zone rules change”.
  
  If a user wants to store “9am in Paris” on a particular future date, and you can’t reliably know what the UTC offset will be on that future date, how would you know when they actually mean, if you discard the “in Paris” part?
  
  LikeLiked by 1 person
  
  Reply
Pingback: Should MySQL have its timezone set to UTC? - ErrorsFixing
Pingback: Should MySQL have its timezone set to UTC?
Pingback: Storing UTC is not a silver bullet - The web development company Lzo Media - Senior Backend Developer
Pingback: Storing UTC is not a silver bullet – WONDERFUL PORTAL
Brad Mace says:

August 31, 2022 at 5:18 pm

Seems like this is all completely overlooking the fact that when timezone rules change, you just get the updated rule file and everything is fine.

LikeLike

Reply
1. jonskeet says:
  
  August 31, 2022 at 5:20 pm
  
  No, not at all – the whole point is that yes, you get updated rule files… but any computations you performed with earlier rule files are no longer valid.
  
  LikeLiked by 1 person
  
  Reply
Pingback: Uso de Python datetime para trabajar con fechas y horas - Python super fácil
Pingback: Handling times for an EV charger | Jon Skeet's coding blog
Gunnar Dalsnes says:

August 7, 2023 at 8:18 pm

Great article. I go back and read this once a year and learn something new every time:-) What I learnt this time is the weakness of TimeZoneInfo and that is, it has no version\last updated info, I guess because it uses various sources of information both in the OS itself and various other files. This make it harder to recalculate UTC times… Thou I guess the conversion time could be stored (in UTC) instead of the tzdb version, then all future UTC times could be recalculated every X days elapsed. But yes, heck of a lot easier when you can store the tzdb version 4 sure.

LikeLike

Reply
Pedro Faustino says:

October 11, 2023 at 10:49 pm

Hello Jon,

First off, kudos on the insightful article! I’d like to delve a bit deeper into a specific point. How can we verify that the DateTimeOffset a client sends is consistent with the provided time zone identifier? For instance:

A client from New York (usually UTC-5 or UTC-4 during daylight saving) sends a DateTimeOffset of 2023-01-01T12:00:00+02:00 with a time zone identifier of America/New_York. This offset clearly doesn’t match the time zone.
Similarly, a client from London (typically UTC or UTC+1 during daylight saving) sends 2023-06-01T15:00:00+05:00 but tags it as Europe/London.

How can we ensure the integrity of such data and rectify any discrepancies?

LikeLike

Reply
1. jonskeet says:
  
  October 12, 2023 at 6:32 am
  
  Validating is pretty easy: the DateTimeOffset represents an instant in time, which unambiguously should map to a single UTC offset in a given time zone. So just compute what the offset should be for that instant in time, and compare it with the offset in the DateTimeOffset.
  
  The only tricky aspect about validation is that the source might be using a different source of time zone information from your system. Egregiously wrong data is easy to spot – but what if someone sends -05:00 in New York for an instant in time that your system thinks should be in daylight saving time? Do you reject it, or assume that it’s due to time zone source discrepancies? That’s very much an application-specific decision.
  
  LikeLike
  
  Reply
  1. Pedro Faustino says:
    
    October 12, 2023 at 4:13 pm
    
    Thank you for your answer!
    
    The tricky aspect is well thought, I guess we’re always going to have that problem in a multitude of situations if the client and our system are using different TZDB versions.
    
    LikeLike
    
    Reply