Versioning conundrum for Noda Time – help requested

Obviously I’d normally ask developer questions on Stack Overflow but in this case, it feels like the answers may be at least somewhat opinion-based. If it turns out that it’s sufficiently straightforward that a Stack Overflow question and answer would be useful, I can always repost it there later.

The Facts

Noda Time 1.x exists “in production”, and the latest version is 1.3.1. This targets the .NET 3.5 Client profile, .NET 4.0, and PCL Profile 328 (in a directory of lib\portable-net4+sl5+netcore45+wpa81+wp8+MonoAndroid1+MonoTouch1+XamariniOS1).

Noda Time currently includes the IANA time zone data (“TZDB”) – each released version of Noda Time contains the TZDB version that was “most recent” at the time that the Noda Time release was built. This gets out of date quite quickly, as there are multiple releases of TZDB every year. Those releases are named 2016a, 2016b etc. Noda Time also provides the ability to read .nzd files (Noda Zone Data – a custom format) and every time there’s a new release of TZDB, I build a .nzd file and upload it to nodatime.org, updating http://nodatime.org/tzdb/latest.txt to point to the latest version.
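To illustrate the client side of this arrangement, here is a minimal sketch in Python (chosen for brevity; the logic ports directly to C#). It assumes, as described in the comments on this post, that latest.txt contains nothing but the URL of the newest .nzd file. The function name `fetch_if_newer` and the injected `fetch` callable are hypothetical, not part of Noda Time.

```python
from typing import Callable, Optional, Tuple

LATEST_TXT_URL = "http://nodatime.org/tzdb/latest.txt"


def fetch_if_newer(
    current_nzd_url: Optional[str],
    fetch: Callable[[str], bytes],
    latest_txt_url: str = LATEST_TXT_URL,
) -> Optional[Tuple[str, bytes]]:
    """Return (url, data) for the latest .nzd file, or None if already current.

    Assumption: latest.txt holds only the URL of the newest .nzd file.
    `fetch` is any callable that downloads a URL and returns its bytes.
    """
    latest_nzd_url = fetch(latest_txt_url).decode("utf-8").strip()
    if latest_nzd_url == current_nzd_url:
        return None  # Already using the newest data; nothing to download.
    return latest_nzd_url, fetch(latest_nzd_url)
```

In a real application, `fetch` could be something like `lambda url: urllib.request.urlopen(url).read()`. When to call this, how to cache the result, and what to do on failure are deliberately left to the caller.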

Noda Time 2.0 has not been released yet. When I do release it, I expect to target .NET 4.5 and netstandard1.0.

Each Noda Time 1.x release has an AssemblyVersion just based on major/minor, i.e. 1.0, 1.1, 1.2 etc. Based on this blog post, this may have been a mistake – it should quite possibly have been 1.0 for all versions. Obviously I can’t fix that now, but I can make the 2.x releases use 2.0 everywhere.

When 2.0 is “pretty much ready” we’re going to cut a 1.4 release which deprecates things that are removed in 2.0 and provides the new approaches as far as possible. For example, the IClock.Now property from 1.x is removed in 2.0, and replaced by IClock.GetCurrentInstant(). We’ll deprecate the Now property and introduce a GetCurrentInstant() extension method which delegates to it. This shouldn’t break any 1.x users, but should allow them to move over to the new API as far as possible before upgrading to 2.0. The intention is that users wouldn’t stay on 1.4 for very long. (Obviously they could do so, but there’s not a lot of benefit. 1.4 won’t have new features – it’s really just a transition version.)

So far, that’s just the way of the world. Now I want to make it easier for users to stay up-to-date with TZDB – including if nodatime.org goes down. (That’s considerably more likely than nuget.org going down, for example.)

The plan is to introduce a new nearly-data-only assembly, packaged as NodaTime.Tzdb. The aim is to allow users to update their data dependency at build time, in a controlled fashion. If you only want to specify an exact version to depend on, you can do so. If you want to pick up the latest version every time you build, that should be possible too.

The tricky bits come in terms of the versioning.

Some options

Firstly, the versioning scheme for the package ignoring everything else. I plan to use something like this:

  • 2016a => 1.2016.1
  • 2016b => 1.2016.2
  • 2016c => 1.2016.3
  • 2017a => 1.2017.1

This should make it reasonably easy to tell the TZDB version just from the package version.
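The mapping is mechanical enough to sketch in a few lines. The following Python is only an illustration: the function name and the `major` parameter are made up here, not part of any Noda Time tooling, and it assumes release names are always a four-digit year followed by a single lowercase letter.

```python
import re


def tzdb_to_package_version(tzdb_release: str, major: int = 1) -> str:
    """Map a TZDB release name such as '2016a' to a package version such as '1.2016.1'."""
    match = re.fullmatch(r"(\d{4})([a-z])", tzdb_release)
    if match is None:
        raise ValueError(f"Unrecognised TZDB release name: {tzdb_release!r}")
    year, letter = match.groups()
    # 'a' is the first release of the year, 'b' the second, and so on.
    patch = ord(letter) - ord("a") + 1
    return f"{major}.{year}.{patch}"
```

For example, `tzdb_to_package_version("2016c")` gives `"1.2016.3"`, and passing `major=2` for the same release gives `"2.2016.3"`.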

However, I’m considering a few options within this idea:

  • I could create a single package per TZDB release, targeting .NET 3.5 client profile, .NET 4.0, the Profile 328 PCL, .NET 4.5, and .NET Standard 1.0. The first four of these could depend on Noda Time 1.1, and the last one would have to depend on Noda Time 2.0.
  • I could do the above, but depend on 1.3.1 instead of 1.1.
  • I could create one package with two versions per TZDB release – a 1.x depending on Noda Time 1.1, and a 2.x depending on Noda Time 2.0. For example, when TZDB 2016d is released, I could create 1.2016.4 and 2.2016.4.
  • I could create one package version depending on 1.1, one depending on 1.2, one depending on 1.3, one depending on 1.4 (when that exists) and one depending on 2.0.
  • I could create two separate packages, i.e. include the Noda Time major version number in the package name. I don’t like this idea, but it’s on the table.

Some concerns and questions

There are various aspects to this which cause me a few worries. I’m not sure how well I can really structure or segregate those, so I’ll just list them.

  • Can a non-prerelease package depend on a prerelease package for some frameworks? If not, that possibly blows the “single version” idea out of the water, as I can’t depend on NodaTime v2.0 yet – it’s not out.
  • Even if that’s feasible, is it sane to depend on different major versions of the NodaTime package from within a single version of the NodaTime.Tzdb package, or is that going to cause massive confusion?
  • Should I depend on NodaTime v1.1 or v1.3.1? They have different AssemblyVersion numbers, which I believe means an assembly binding redirect will be required if I depend on 1.1 but users depend on 1.3.1. To be clear, I don’t expect many users to still be on versions older than 1.3.1.
  • Likewise, is it going to cause issues for .NET 4.5 users who use NodaTime 2.0 (eventually) if they depend on a version of NodaTime.Tzdb that depends on NodaTime 1.3.1? Again, presumably assembly binding redirects are needed.
  • If I go with the “two-version” scheme (i.e. 1.2016.4 and 2.2016.4 etc) how careful would NodaTime 1.3.1 users have to be? I wouldn’t want them to accidentally get upgraded to NodaTime 2.0 when that’s released, by accidentally taking the 2.x line of NodaTime.Tzdb.
  • Does dotnet cli support the nuget allowedVersions feature at all? I haven’t found any support for it in DNX, but really it’s vital for this scheme to work at all – basically I’d expect a NodaTime 1.3.1 user to specify an allowed version range for NodaTime.Tzdb of [1,2)
  • Is my scheme of 1.2016.4 (etc) sensible? It’s somewhat abusing major/minor/patch, in that there’s no real difference in meaning between a minor version bump (“it’s the new year”) and a patch bump (“there’s been another release in the same year”). Neither kind of change will be breaking (unless you depend on specific time zones behaving in specific ways, of course), and it’s handy to be able to give a simple mapping between TZDB version and package version, but there may be consequences I’m unaware of.
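To make the [1,2) range concrete, here is a small Python sketch of the interval arithmetic involved. It is not NuGet's resolver, and the helper names are hypothetical; it only shows how a half-open range would keep a Noda Time 1.x user on the 1.x line of NodaTime.Tzdb under the “two-version” scheme.

```python
from typing import Iterable, Optional, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse '1', '1.2016' or '1.2016.4' into a three-part tuple, padding with zeros."""
    parts = [int(part) for part in text.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"Expected one to three numeric parts: {text!r}")
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def in_range(version: str, lower_inclusive: str, upper_exclusive: str) -> bool:
    """True if lower_inclusive <= version < upper_exclusive, i.e. the range [lower,upper)."""
    return parse_version(lower_inclusive) <= parse_version(version) < parse_version(upper_exclusive)


def latest_allowed(
    available: Iterable[str], lower_inclusive: str, upper_exclusive: str
) -> Optional[str]:
    """Pick the highest available version inside the range, or None if there is none."""
    candidates = [v for v in available if in_range(v, lower_inclusive, upper_exclusive)]
    return max(candidates, key=parse_version, default=None)
```

With `["1.2016.3", "1.2016.4", "2.2016.4"]` available, `latest_allowed(..., "1", "2")` picks `"1.2016.4"` and never the 2.x package.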

Please feel free to ask clarifying questions in comments. Will look forward to getting some answers :)

42 thoughts on “Versioning conundrum for Noda Time – help requested”

    1. Ah, I think I may have worked out what you mean. The idea is there’ll be a single class in the NodaTime.Tzdb package which is able to load the data from that package. That makes it simpler than trying to get code in the NodaTime package to load data from the NodaTime.Tzdb package. User code will use the class to load the data, but that single class depends on NodaTime to do the actual loading – it’s really just a convenient way of getting the data. Now I could potentially make the class just return a stream, without depending on the NodaTime package at all, but I suspect that wouldn’t end up being as convenient for user code.

      1. I was thinking that NodaTime would just depend on Tzdb > 1.0.0, so it could use whichever version is installed. I think this makes most of your awkward questions go away, and I guess you should be able to update Tzdb without rebuilding.

        1. My thought was running along the same lines — this is data, and it should have as small a dependency footprint as possible. Having NodaTime depend on Tzdb >= 1.0.0 means “I know how to load this data.” Putting the loader in Tzdb just enforces that programmatically — but then the loader interface had better never change.

        2. Well, you don’t have to use TZDB with Noda Time at all. For 2.0, we’re planning on taking out the embedded data entirely. There are plenty of times where apps wouldn’t need TZDB – it would be odd for them to have a dependency on another package.

      2. I’m thinking that any loader functionality should be either
        1. self contained in the tzdb package
        2. All in the Noda time package
        3. Split between the packages with a generic loader in the tzdb package and a user friendly parser in the Noda time package.
        Ideally there would be no hard coupling between the two in nuget, not sure if it’s possible to specify a suggested package?
        This would free people to use the data package separately or ignore it if they have a better way to get the data.

        1. Not really sure what you mean here. All the real logic is in Noda Time – the bits to understand the data. There’s no way it could be entirely in NodaTime.Tzdb without a reference to NodaTime, as it needs to expose implementations of a NodaTime interface. I don’t see how I could sensibly do this without any dependencies either way – not in a way that would actually be useful to users, which is the point…

  1. Is my scheme of 1.2016.4 (etc) sensible?

    I would recommend sticking with semver. Coupling your versioning to the versioning of a third-party library which you have no control over is always a bad idea. If they change their versioning, you (or in this case your users) will be left with a mess. Also, what are you going to do if you release 1.2016.4 and all of a sudden you realise that there was a mistake? How are you going to tag the fix for it? 1.2016.5 would have a different meaning. I can see how it might seem attractive to squeeze the TZDB release name into the version of your NuGet package, but I think this is a bad idea and will cause confusion in the beginning and trouble at some point later.

    Especially if “If you want to pick up the latest version every time you build, that should be possible too” is likely to be the default use of the new package, then no one will really care about the version number. My take is that it is better to document the changes in good release notes.

    A lot of your questions are related to the idea of adding support for the NodaTime.TZDB to the 1.x users. My advice would be to not do this, at least not now.

    The intention is that users wouldn’t stay on 1.4 for very long

    If that is the intention, then why bother with 1.4 at all? If I am a NodaTime user and I have to go through the hassle of updating my code, then why would I want to do it twice (once for 1.4 and later for 2.0)? I would rather wait for the 2.0 release, upgrade my code once, and never look back at 1.x.

    Also if I were you I would only release 2.0 and NodaTime.TZDB for 2.0 and tell users to upgrade to 2.0. I wouldn’t go through the hassle of supporting 1.x from NodaTime.TZDB unless there will be users pushing back hard enough with good reasons. If everyone happily upgrades to 2.0 then you are trying to solve a problem now which doesn’t need to be solved.

    Not sure if I was of any help, but I thought I’d kick off the comments with some thoughts of my own :)

    Happy Easter Monday!

    1. Re 1.4: The point of 1.4 is that you’ll be able to upgrade your code in pieces. Depend on 1.4, confident that your code will still work, and gradually fix warnings. Then upgrade to 2.0, having read the breaking changes list carefully.

      Obviously if you don’t want to use 1.4, you could go straight to 2.0.

      As for only supporting 2.0 for the NodaTime.Tzdb package – if 2.0 were really imminent, I’d go with that… but I expect it to be at least 6 months before I’m ready, and there’s at least some demand for this package now.

      Will think further about the points in the first part of your comment :)

  2. Assuming Tzdb only changes a few times per year, maybe you can automatically pull the relevant data directly from IANA on an interval? like, check once a month for the new data directly from the source and process it?

    1. Anyone can do that if they want to, but the code to process the raw TZDB data isn’t part of the Noda Time package. It doesn’t make much sense to put it there, and occasionally there are incompatible changes so I need to change the “compiler” code. Client code could poll nodatime.org if they wanted to – but expressing a policy of how often to do that, what to do on failure etc means it’s probably simpler for client code to just do that itself. The idea of having it in nuget.org would be to make it build-time rather than execution-time (removing network requirements etc), and to make it really, really easy.

  3. I must agree with Dustin.

    First, semantic versioning with detailed release notes is the way to go.

    On the point of only NodaTime 2.0 supporting NodaTime.Tzdb. It seems you already have a mechanism for updating the data for your 1.x versions: downloading from nodatime.org. Why not just introduce the TZDB package as part of the 2.0 release? Sure, nodatime.org may not be as reliable as nuget.org, but that’s the experience your users have now. A “more reliable experience” could be a feature of the 2.0 release.

  4. Seems to me you can solve this by avoiding a direct dependency between Nodatime and the TZDB package, and introducing a new package which depends on BOTH. Call that Nodatime.TZDB.

    TZDB updates a minor rev whenever the timezone data changes, and a major rev whenever it changes the format in which it exposes that data (ideally that would be never, so it remains a 1.x package forever)

    Nodatime updates a minor rev whenever it adds new functionality and a major rev whenever it significantly changes its API – so, 1.3 – 1.4 and 2.0 and beyond all work as you currently plan.

    Nodatime.TZDB depends on a particular Nodatime API version and a particular TZDB format (assume 1.0), and updates a major rev whenever one of those changes (and a minor rev only whenever some internal detail of its implementation changes). So, v1 of Nodatime.TZDB depends on Nodatime 1.0+ and TZDB 1.0+; v2 depends on Nodatime 2.0+ and TZDB 1.0+.

    That decouples revving TZDB from nodatime, and even permits non nodatime clients to make use of the TZDB data if they so choose, and would also allow projects that for whatever reason have isolated parts which are dependent on both nodatime 1 and nodatime 2 to at least standardize both on the exact same TZDB version.

    1. Non-NodaTime clients can’t make use of the data directly – its data format is basically defined by Noda Time. (TZDB itself is a different matter, but that’s not what NodaTime uses – it uses a “compiled” version.)

      We already have fairly crude versioning in the nzd format – including optional data which older clients ignore automatically.

      While I can see some theoretical benefits to this system, I suspect it would end up being rather more complex than it needs to, with no actual benefits in the end.

  5. Assuming the TZDB file format is fairly consistent, I would think that it can be its own nuget package, which is just loaded/updated by the NodaTime code (at runtime, possibly cached locally to minimize latency and hits to nuget). Developers can provide a separate “nuget source” feed for local caches (or gated access to new versions, controlled internet access, etc).

    In this case, TZDB would simply be a repackaging of the data using a nuget naming convention (I like the 2016.a -> 2016.1 format).

    There is also no need for dependencies whatsoever, if the NodaTime code can just pull the latest from nuget directly.

    in terms of versioning, I agree w/ the blogpost about using AssemblyVersion / FileVersion in accordance w/ MS GAC best practices (AssemblyVersion as versioned in the GAC and referenced by code, FileVersion as versioned by bugfixes/etc), and using the Nuget version to mimic the FileVersion (for the same reason as MS uses FileVersion). Also, continue using semantic versioning for assemblyversion (since it addresses API compatibility, which is what assembly version references are concerned with).

    I would also consider supplementing this w/ another common practice of Microsoft’s, which is to provide binding redirections for MAJOR VERSION increments… SharePoint does this extensively, by redirecting bindings of the old (assembly) version to the current version (SP does this primarily to allow custom extensions to continue using the new version, on the assumption that the SP API didn’t break the custom code)… I could easily see a nuget upgrade adding redirections (from 0.0.0.0 through vcurrent -> vcurrent) to the app/web config.

    1. The TZDB file format is converted into my own format which is specific to Noda Time. I definitely don’t want to start adding Nuget code to Noda Time – basically, I don’t want to add any networking code into something which is intended to just be a date/time API. It feels like a much better idea to let the app developer use the package version they want, and have that as a dependency. If they want to add fetching code themselves, that’s easy enough to do – it only needs to read all the text of nodatime.org/tzdb/latest.txt to find the URL of the latest NZD file, and if it’s not the version they’re already using, download it. It’s a few lines of code, but the policy around when to check, how to cache the results etc don’t feel like the business of Noda Time.

  6. I was thinking along the same lines as Tony Finch and John Costello. If the TZDB package could be written not to depend on the particular NodaTime package, then a lot of these issues go away.

    As for versioning, Dustin Moris Gorski brings up some good points about going with semver, but if the TZDB only depends on the IANA database versioning, then I don’t see a problem with matching their versioning scheme. After all, the IANA, being a standards organization, is not likely to change their scheme (although if they do, you may need to branch from their scheme to continue your own to minimize problems). It’s not like they base their versioning scheme off of, say, Microsoft operating systems. ;-)

    1. The tricky bit comes in terms of getting at the data contained in the package, without there being any dependencies either way.
      NodaTime could depend on NodaTime.Tzdb, but that would be annoying for anyone not using TZDB in their app – it would be a dependency for no reason.
      I could start using reflection to find the references in the AppDomain, but that starts getting tricky in terms of portability.

      I’ve just had one idea, which is to have a method in Noda Time that accepts a type – and that then loads the data in from the assembly owning that type. It would make for a bit of a messy API though, compared with the one I was considering (something like PackagedTzdb.Provider or PackagedTzdb.Source depending on what you wanted – and ideally with a better name than PackagedTzdb.)

      1. I was thinking along similar lines of a dependency registration method that would have to be called by the developer using both packages at the time of application startup, such as NodaTime.RegisterTzdb(PackagedTzdb.Constructor), but it does fall short of the “it just works” ideology, and would still be tricky to do while avoiding dependencies.

        This might be a case where it would be nice for .NET to support duck-typing.

        Alternately, inverting James Hart’s idea of a third package that depends on both NodaTime and the TZDB, they could instead both depend on a third package which is merely the defined interface between the two. It’s still a little messy (and as I understand it, the interface should belong with the consumer, which would be NodaTime), but if the interface doesn’t change often (or at all) then this might be a viable option.

      2. It would seem like a “soft” dependency would provide greater flexibility than a “hard” dependency, from the perspective of consumers. If the assembly is there, load it (via your type-accepting method, or via catching an assembly load exception, or some other way); otherwise, use the default settings. Then it really is up to the consumer to choose whatever TZDB dll they want, if any.

  7. You could have NodaTime (Core) import MEF NodaTime (Data) extensions. The core could then choose the most appropriate or most recent dataset to use.

    1. Separated out into three distinct parts.
      1) NodaTime Interfaces
      2) NodaTime (Core)
      3) NodaTime (Data)

      NodaTime (Core) and NodaTime (Data) reference the Noda Time Interfaces, they don’t reference each other.

  8. Can a non-prerelease package depend on a prerelease package for some frameworks?

    AFAIK, this is not possible. Only prerelease packages can have prerelease dependencies. I can’t find the official documentation that proves it, but from experience and this question, you cannot have a “release” package depending on pre-release package.

    1. No, I’m not planning on doing that. We’ve looked at it, but there are so many things to consider in terms of scheduling, how to expose the updated version, error handling etc, that I think it’s simpler for applications to do it themselves. The bit of common code is very simple: 1) fetch latest.txt to check whether there’s a new version; 2) fetch the latest version; 3) use that to build a TzdbDateTimeZoneSource; 4) use DateTimeZoneCache to create a provider. That’s literally about 10 lines of code – but how and when it gets invoked is complex, and different for each app.

      1. NodaTime.Tzdb exposes its information in a “known format” that is extendable and backwards compatible, right?

        So why not have NodaTime expose a function that allows you to give it a provider for that data?

        The provider and data stay in NodaTime.Tzdb and both NodaTime 1.x and NodaTime 2.x can live without a dependency on that project.

        If you don’t want to use NodaTime.Tzdb you don’t add the package to your project and you don’t register the provider. That would make it both opt-in and give you the option of having providers that behave differently (i.e. download data off the internet every now and then).

        Wouldn’t this work?

        1. Yes, NodaTime.Tzdb would expose the data in a known format. The difficulty is getting NodaTime to access that data in a portable way.

          • We could try to load an assembly using reflection – but I dread to think what portability (and security) issues that might have.
          • We could avoid having an assembly at all, but assemblies are the simplest things to work with in terms of making sure that the data is available in the right way, again portably. (I don’t want to rely on FileStream, for example.)

          Were you thinking of one of those options, or something else?

          Note that you can already construct a TzdbDateTimeZoneSource from a stream – that ability has been present forever. It’s making it easy to get at the data that is trickier.

          1. The assembly would be referenced by the project, same as NodaTime, then as a developer you would obtain some instance of that provider and you register it with NodaTime.

            You could use dependency injection for this, or plain old hard-coding.

            1. I don’t know what you mean by “some instance of that provider”. What provider? Are you suggesting there is code in the NodaTime.Tzdb package or not? Side issue – if you have a reference to an assembly but don’t use anything in it, I think the reference used to be ignored. Haven’t checked whether or not that’s still the case.

              1. So, NodaTime.Tzdb contains the Tzdb provider, let’s call it TzdbProvider, and it also contains the data. TzdbProvider would be an implementation of the ITzdbProvider interface.

                NodaTime exposes a way to register that provider, say SystemClock.RegisterTzdbProvider. You call this function and give it the provider, SystemClock.RegisterTzdbProvider(TzdbProvider.GetNewInstance()).

                So in order to have this working as a user of these libraries, in my project I would reference both the NodaTime and NodaTime.Tzdb projects (they don’t have dependencies between them) and then add this line somewhere in the code.

                SystemClock.RegisterTzdbProvider(TzdbProvider.GetNewInstance());

              2. If NodaTime.Tzdb contains the provider code, it has to depend on NodaTime… Otherwise what is the type that RegisterTzdbProvider takes and TzdbProvider.GetNewInstance returns?

              3. Oh, I see what you mean. This could be solved by doing something like “public static void RegisterTzdbProvider(dynamic provider)” and skipping the interface since that wouldn’t be an issue anymore (if you don’t want to have a dependency between packages).

                Or you can put the interface in the NodaTime package and the NodaTimeTzdb package has a dependency on it. This would solve the performance hit you would have from late binding.

                I don’t know what was in my head when I thought you could pass in an object without that object’s class/interface being somewhere in NodaTime.

              4. Right – we’re now basically back to the subject of the blog post, where the Tzdb package has a dependency on NodaTime – I really don’t want to make it dynamic. Another possible option would be to make the Tzdb package just expose a stream… But I don’t want the new code to be any more friction than the existing code, where it’s just a property.

  9. I don’t see how you would do this without that dependency or using dynamic. You would have to have 2 packages, one for NodaTime 1.x and one for NodaTime 2.x. They would have different build targets but I’m sure this can be automated so you only have one project for NodaTimeTzdb.

    1. Well, we could potentially do it by making NodaTime.Tzdb just expose something non-NodaTime specific, e.g. a class with a Stream GetTzdbData() method. But that would be more painful to use. From the user’s perspective, it should be as easy as it is in 1.x, just using a property.

      I’m currently considering whether it might be better to just make it easier to release new point releases, and do that whenever TZDB changes…

  10. One more vote for reversing the dependency. The proposal from Tony Finch fixes almost all of the issues, and it’s easy to track the versions via the VS Manage Packages window or with a deploy script on the build server.

    If you don’t want to ship any TZDB data by default, just make a dummy 1.0.0 package that does not contain real data.

    If it’s possible to move TZDB parsing logic into NodaTime package, the TZDB 1.0.0 may contain no data at all.

  11. I like the separation of data and code very much.

    Since nzd files are not version specific, as I understand it, I would go with your proposed versioning scheme.

    BTW if you target .NET Core, you may find that Linux distros keep tzdb updated through their repositories, so the data could be picked up from there, although unfortunately only in raw tzdb format.

    Maybe it makes sense to create a
    * NodaTime.Core which contains everything that is data provider agnostic
    * NodaTime.BCL for people that use this :(
    * NodaTime.Tzdb for the rest of us :)

    You get the BCL out of the way, which (correct me if I am wrong) is Windows specific.
    So anyone who targets .NET Core on Linux only needs the Core and Tzdb nugets; for the BCL you need Core and BCL; and for the adventurers amongst us, Core, BCL and Tzdb.

    Core should provide an interface to BCL and Tzdb for loading any format needed into an agnostic model, in order to cover the generation of a new format. Setting the default provider could be a start-up thing (configuration). I liked the ASP.NET Core configuration very much, where you opt in to things via extension methods.

    my 2 cents

  12. I’m not qualified to answer the question, sorry Jon. And it would be off topic on Stack Overflow. I feel the very best place to get answers to this particular question would be http://programmers.stackexchange.com/ which of course has quite a different feel to it from Stack Overflow; and presumably a much higher readership than your blog (undeservedly).
