Category Archives: Design

Versioning limitations in .NET

June 30, 2019 jonskeet 74 Comments

This is a blog post I’ve intended to write for a very long time. (Other blog posts in that category include a recipe for tiramisu ice cream, and “knights and allies”.) It’s one of those things that’s grown in my mind over time, becoming harder and harder to start. However, there have been three recent incidents that have brought it back into focus:

Azure Functions and Grpc.Core versioning conflict
Grpc.Core considering a major version bump
Recent internal discussions that would involve a major version bump for all the Google Cloud libraries I maintain

TL;DR: Versioning is inherently hard, but the way that .NET infrastructure is set up makes it harder than it needs to be, I suspect.

The sample code for this blog post is available on GitHub.

Refresher: SemVer

NuGet is the de facto standard for distribution of packages now, and it supports semantic versioning, also known as SemVer for short. SemVer version strings (ignoring pre-release versions) are of the form major.minor.patch.

The rules of SemVer sound straightforward from the perspective of a package producer:

If you make a breaking change, you need to bump the major version
If you make backward compatible additions, you need to bump the minor version
If you make backward and forward compatible changes (basically internal implementation changes or documentation changes) you bump the patch version

It also sounds straightforward from the perspective of a package consumer, considering moving from one version to another of a package:

If you move to a different major version, your existing code may not work (because everything can change between major versions)
If you move to a later minor version within the same major version, your code should still work
If you move to an earlier minor version within the same major version, your existing code may not work (because you may be using something that was introduced in the latest minor version)
If you move to a later or earlier patch version within the same major/minor version, your code should still work

Things aren’t quite as clear as they sound though. What counts as a breaking change? What kind of bug fix can go into just a patch version? If a change can be detected, it can break someone, in theory at least.

The .NET Core team has a set of rules about what’s considered breaking or not. That set of rules may not be appropriate for every project. I’d love to see:

Tooling to tell you what kind of changes you’ve made between two commits
A standard format for rules so that the tool from the first bullet can then suggest what your next version number should be; your project can then advertise that it’s following those rules
A standard format to record the kinds of changes made between versions
Tooling to check for “probable compatibility” of the new version of a library you’re consuming, given your codebase and the record of changes

With all that in place, we would all hopefully be able to follow SemVer reliably.

Importantly, this makes the version number a purely technical decision, not a marketing one. If the current version of your package is (say) 2.3.0, and you add a bunch of features in a backward-compatible way, you should release the new version as 2.4.0, even if it’s a “major” version in terms of the work you’ve put in. Use whatever other means you have to communicate marketing messages: keep the version number technical.

Even with packages that follow SemVer predictably and clearly, that’s not enough for peace and harmony in the .NET ecosystem, unfortunately.

The diamond dependency problem

The diamond dependency problem is not new to .NET, and most of the time we manage to ignore it – but it’s still real, and is likely to become more of an issue over time.

The canonical example of a diamond dependency is where an application depends on two libraries, each of which depends on a common third library, like this:

Common diamond dependency

(I’m using NodaTime as an example so I can refer to specific versions in a moment.)

It doesn’t actually need to be this complicated – we don’t need Lib2 here. All we need is two dependencies on the same library, and one of those can be from the application:

Simplified diamond dependency

Multiple dependencies on the same library are fine, so long as they depend on compatible versions. For example, from our discussion of SemVer above, it should be fine for Lib1 to depend on NodaTime 1.3.0, and App to depend on NodaTime 1.2.0. We expect the tooling to resolve all the dependencies and determine that 1.3.0 is the version to use, and the App code should be fine with that – after all, 1.3.0 is meant to be backward-compatible with 1.2.0. The same is true the other way round, if App depends on later version than Lib1, so long as they’re using the same major version.

(Note: there are potential problems even within a minor version number – if App depends on 1.3.0 and Lib1 depends on 1.3.1 which contains a bug fix, but App has a workaround for the bug which then fails under 1.3.1 when the bug is no longer present. Things like that can definitely happen, but I’ll ignore that kind of problem for the rest of this post, and assume that everything conforms to idealized SemVer.)

Diamond dependencies become a problem under SemVer when the dependencies are two different major versions of the same library. To give a concrete example from the NodaTime package, consider the IClock interface. The 1.4.x version contains a single property, Now. The 2.0.x version has the same functionality, but as a method, GetCurrentInstant(). (This was basically a design failing on my part in v1 – I followed the BCL example of DateTime.Now without thinking clearly enough about whether it should have been a property.)

Now suppose App is built with the .NET Core SDK, and depends on NodaTime 2.0.0, and Lib1 depends on NodaTime 1.3.1 – and let’s imagine a world where that was the only breaking change in NodaTime 2.x. (It wasn’t.) When we build the application, we’d expect 2.0 to be used at execution time. If Lib1 never calls IClock.Now, all is well. Under .NET Core tooling, assembly binding redirects are handled automatically so when Lib1 “requests” NodaTime 1.3.1, it gets NodaTime 2.0.0. (The precise way in which this is done depends on the runtime executing the application. In .NET Core, there’s an App.deps.json file; in desktop .NET it’s App.exe.config. Fortunately this doesn’t matter much at the level of this blog post. It may well make a big difference to what’s viable in the future though.)

If Lib1 does call IClock.Now, the runtime will throw a MissingMethodException. Ouch. (Sample code.)

The upshot is that if the transitive set of “package + version” tuples for your entire application contains more than one major version for the same package, it’s entirely possible that you’ll get an exception at execution time such as MissingMethodException, MissingFieldException, TypeNotFoundException or similar.

If that doesn’t sound too likely, please consider that the Newtonsoft.Json package (Json .NET) has 12 major versions as I’m writing this blog post. I suspect that James Newton-King has kept the breaking changes to an absolute minimum, but even so, it’s pretty scary.

Non-proposals

I’d like to propose some enhancements to tooling that might help to address the issue. Before we look at what I am suggesting, I’d like to mention a few options that I’m not suggesting.

Ignore the problem

I’m surprised that few people seem as worried about versioning as I am. I’ve presented talks on versioning a couple of times, but I don’t remember seeing anyone else do so – and certainly not in a .NET-specific deep-dive way. (My talk isn’t that, either.) It’s possible that there are lots of people who are worried, and they’re just being quiet about it.

This blog post is just part of me trying to agitate the community – including but not limited to Microsoft – into taking this problem seriously. If it turns out that there are already smart people working on this, that’s great. It’s also possible that we can live on the edge of versioning anarchy forever and it will always be a potential nightmare, but only cause a small enough number of failures that we decide we can live with it. That feels like a risk we should at least take consciously though.

Build at head, globally

In 2017, Titus Winters presented C++ as a live at head language at CppCon. It’s a great talk; go watch it. (Yes, it’s an hour and a half long. It’s still worth it. It also states a bunch of what I’ve said above in a slightly different way, so it may be helpful in that sense.) The idea is for everyone to build their application based on full source code, and provide tooling to automatically update consumer code based on library changes.

To go back to the Noda Time IClock example, if I build all the code for my application locally (App, Lib1 and NodaTime) then when NodaTime changes from the IClock.Now property to IClock.GetCurrentInstant(), the code in Lib1 that uses IClock.Now can automatically be changed to use IClock.GetCurrentInstant(), and everyone is happy with the same version. The Abseil project is a library (or collection of libraries) for C++ that embrace this concept.

It’s possible that this could eventually be a good solution for .NET. I don’t know of any technical aspects that mean it could work for C++ but not for .NET. However, it’s so far from our current position that I don’t believe it’s a practical choice at the moment, and I think it makes sense to try this experiment in one language first for a few years, then let other languages consider whether it makes sense for them.

I want to make it very clear that I’m not disagreeing with anything Titus said. He’s a very smart guy, and I explicitly do agree with almost everything I’ve heard him say. If I ever decide that I disagree with some aspect and want to make a public debate about it, I’ll be a lot more specific. Vague arguments are irritating for everyone. But the .NET ecosystem does depend on binary distribution of packages at the moment, and that’s an environment Titus deliberately doesn’t try to address. If someone wants to think about all the practical implications of all the world’s .NET consumers living at head in a source-driven (rather than binary-driven) world, I’d be interested in reading the results of that thinking. It’s certainly more feasible now than it was before .NET Core. But I’m not going there right now.

Never make breaking changes in any library

If we never make any changes that will break anyone, none of this is a problem.

I gave the example of Newtonsoft.Json earlier, and that it’s on major version 12. My guess is that that means there really have been 11 sets of breaking changes, but that they’re sufficiently unlikely to cause real failure that we’ve survived.

In the NodaTime package, I know I have made real breaking changes – it’s currently at version 2.4.x, and I’m planning on a 3.0 release some time after C# 8 comes out. I’ve made (or I’m considering) breaking changes in at least three different ways:

Adding members to public interfaces. If you implement those interfaces yourself (which is relatively unlikely) your code will be broken. On the other hand, everyone who wants the functionality I’ve added gets to use it in a clean way.
Removing functionality which is either no longer desirable (binary serialization) or shouldn’t have been present to start with. If you still want that functionality, I can only recommend that you stay on old versions.
Refactoring existing functionality, e.g. the IClock.Now => IClock.GetCurrentInstant() change, or fixing a typo in a method name. It’s annoying for existing consumers, but better for future consumers.

I want to be able to make all of these changes. They’re all good things in the long run, I believe.

So, those are options I don’t want to take. Let’s look at a few that I think we should pursue.

Proposals

Firstly, well done and thank you for making it this far. Before any editing, we’re about 2000 words into the post at this point. A smarter person might have got this far quicker without any loss of important information, but I hope the background has been useful.

Prerequisite: multi-version support

My proposals require that the runtime support loading multiple assemblies with the same name at the same time. Obviously I want to support .NET Core, so this mustn’t require the use of multiple AppDomains. As far as I’m aware, this is already the case, and I have a small demo of this, running with both net471 and netcoreapp2.0 targets:

// Call SystemClock.Instance.Now in NodaTime 1.3.1
string path131 = Path.GetFullPath("NodaTime-1.3.1.dll");
Assembly nodaTime131 = Assembly.LoadFile(path131);
dynamic clock131 = nodaTime131
    .GetType("NodaTime.SystemClock")
    // Instance is a field 1.x
    .GetField("Instance")
    .GetValue(null);
Console.WriteLine(clock131.Now);

// Call SystemClock.Instance.GetCurrentInstant() in NodaTime 2.0.0
string path200 = Path.GetFullPath("NodaTime-2.0.0.dll");
Assembly nodaTime200 = Assembly.LoadFile(path200);
dynamic clock200 = nodaTime200
    .GetType("NodaTime.SystemClock")
    // Instance is a property in 2.x
    .GetProperty("Instance")
    .GetValue(null);
Console.WriteLine(clock200.GetCurrentInstant());

I’ve used dynamic typing here to avoid having to call the Now property or GetCurrentInstant() method using hand-written reflection, but we have to obtain the clock with reflection as it’s accessed via a static member. This is in a project that doesn’t depend on Noda Time at all in a compile-time sense. It’s possible that introducing a compile-time dependency could lead to some interesting problems, but I suspect those are fixable with the rest of the work below.

On brief inspection, it looks like it’s also possible to load two independent copies of the same version of the same assembly, so long as they’re stored in different files. That may be important later on, as we’ll see.

Proposal: execute with the expected major version

The first part of my proposal underlies all the rest. We should ensure that each library ends up executing against a dependency version that has the same major version it requested. If Lib1 depends on Noda Time 1.3.1, tooling should make sure it always gets >= 1.3.1 and = 1.3.1″ which appears to be the default at the moment, but I don’t mind too much if I have to be explicit. The main point is that when different dependencies require different major versions, the result needs to be multiple assemblies present at execution time, rather than either a build error or the approach of “let’s just hope that Lib1 doesn’t use anything removed in 2.0”. (Of course, Lib1 should be able to declare that it is compatible with both NodaTime 1.x and NodaTime 2.x. It would be good to make that ease to validate, too.)

If the rest of the application already depends on NodaTime 1.4.0 (for example) then it should be fine to stick to the simple situation of loading a single copy of the NodaTime assembly. But if the rest of the application is using 2.0.0 but Lib1 depends on 1.3.1, we should make that work by loading both major versions 1 and 2.

This proposal then leads to other problems in terms of how libraries communicate with each other; the remaining proposals attempt to address that.

Proposal: private dependencies

When describing the diamond dependency problem, there’s one aspect I didn’t go into. Sometimes a library will take a dependency as a pure implementation detail. For example, Lib1 could use NodaTime internally, but expose an API that’s purely in terms of DateTime. On the other hand, Lib1 could expose its use of NodaTime via its public (and protected) API, using NodaTime types for some properties, method parameters, method return types, generic type arguments, base types and so on.

Both scenarios are entirely reasonable, but they have different versioning concerns. If Lib1 uses NodaTime as a “private dependency” then App shouldn’t (in an ideal world) need to care which version of NodaTime Lib1 uses.

However, if Lib1 exposes method with an IClock parameter, the method caller really needs to know that it’s using a 1.3.1. They’ll need to have a “1.3.1 IClock” to pass in. That means App needs to be aware of the version of NodaTime that Lib1 depends on.

I propose that the author of Lib1 should be able to make a decision about whether NodaTime is a “public” or “private” dependency, and express that decision within the NuGet package.

The compiler should be able to validate that a private dependency really isn’t exposed in the public API anywhere. Ideally, I’d like this to be part of the C# language eventually; I think versioning is important enough to be a language concern. It’s reasonable to assert that that ship has sailed, however, and that it’s reasonable to just have a Roslyn analyzer for this. Careful thought is required in terms of transitive dependencies, by the way. How should the compiler/analyzer treat a situation where Lib1 privately depends on NodaTime 1.3.1, but publicly depends on Lib2 that publicly depends on NodaTime 2.0.0? I confess I haven’t thought this through in detail; I first want to get enough people interested that the detailed work is worth doing.

Extern aliases for packages

Private dependencies are relatively simple to think about, I believe. They’re implementation details that should – modulo a bunch of caveats – not impact consumers of the library that has the private dependencies.

Public dependencies are trickier. If App wants to use NodaTime 2.0.0 for almost everything, but needs to pass in a 1.3.1 clock to a method in Lib1, then App effectively needs to depend on both 1.3.1 and 2.0.0. Currently, as far as I’m aware, there’s no way of representing this in a project file. C# as a language supports the idea of multiple assemblies exposing the same types, via extern aliases… but we’re missing a way of expressing that in project files.

There’s already a GitHub issue requesting this, so I know I’m not alone in wanting it. We might have something like:

<ProjectReference Include="NodaTime" Version="1.3.1" ExternAlias="noda1" />
<ProjectReference Include="NodaTime" Version="2.0.0" ExternAlias="noda2" />

then in the C# code you might use:

using noda2::NodaTime;
// Use NodaTime types as normal, using NodaTime 2.0.0

// Then pass a 1.3.1 clock into a Lib1 method:
TypeFromLib1.Method(noda1::NodaTime.SystemClock.Instance);

There’s an assumption here: that each package contains a single assembly. That definitely doesn’t have to be true, and a full solution would probably need to address that, allowing more complex syntax for per-assembly aliasing.

It’s worth noting that it would be feasible for library authors to providing “bridging” packages too. For example, I could provide a NodaTime.Bridging package which allowed you to convert between NodaTime 1.x and NodaTime 2.x types. Sometimes those conversions may be lossy, but they’re at least feasible. The visible immutability of almost every type in Noda Time is a big help here, admittedly – but packages like this could really help consumers.

Here be dragons: shared state

So far I’ve thought of two significant problems with the above proposals, and both involve shared state – but in opposite directions.

Firstly, consider singletons that we really want to be singletons. SystemClock.Instance is a singleton in Noda Time. But if multiple assemblies are loaded, one per major version, then it’s really “singleton per major version.” For SystemClock that’s fine, but imagine if your library decided that it would use grab a process-wide resource in its singleton, assuming that it was okay to do so because after all there’s only be one of them. Maybe you’d have an ID generator which would guarantee uniqueness by incrementing a counter. That doesn’t work if there are multiple instances.

Secondly, we need to consider mutable shared state, such as some sort of service locator that code registered implementations in. Two different libraries with supposedly private dependencies on the same service locator package might each want to register the same type in the service locator. At that point, things work fine if they depend on different major versions of the service locator package, but start to conflict if the implementations happen to depend on the same major version, and end up using the same assembly. Our isolation of the private dependency isn’t very isolated after all.

While it’s reasonable to argue that we should avoid this sort of shared state as far as possible, it’s unreasonable to assume that it doesn’t exist, or that it shouldn’t be considered as part of this kind of versioning proposal. At the very least, we need to consider how users can diagnose issues stemming from this with some ease, even if I suspect it’ll always be slightly tricky.

As noted earlier, it’s possible to introduce more isolation by loading the same assembly multiple times, so potentially each private dependency could really be private. That helps in the second case above, but hurts more in the first case. It also has a performance impact in terms of duplication of code etc.

Here be unknown dragons

I’m aware that versioning is really complicated. I’ve probably thought about it more than most developers, but I know there’s a lot I’m unaware of. I don’t expect my proposals to be “ready to go” without serious amounts of detailed analysis and work. While I would like to help with that work, I suspect it will mostly be done by others.

I suspect that even this detailed analysis won’t be enough to get things right – I’d expect that when there’s a prototype, exposing it to real world dependencies will find a bunch more issues.

Conclusion

I believe the .NET ecosystem has a versioning problem that’s currently not being recognized and addressed.

The intention isn’t that these proposals are final, concrete design docs – the intention is that they help either start the ball rolling, or give an already-rolling-slightly ball a little more momentum. I want the community to openly discuss the problems we’re seeing now, so we get a better handle on the problem, and then work together to alleviate those problems as best we can, while recognizing that perfection is unlikely to be possible.

C#, Design

Casting vs “as” – embracing exceptions

September 19, 2013 jonskeet 27 Comments

Note: this blog post has now been turned into a video by Webucator, to go alongside their C# classes.

(I’ve ended up commenting on this issue on Stack Overflow quite a few times, so I figured it would be worth writing a blog post to refer to in the future.)

There are lots of ways of converting values from one type to another – either changing the compile-time type but actually keeping the value the same, or actually changing the value (for example converting int to double). This post will not go into all of those – it would be enormous – just two of them, in one specific situation.

The situation we’re interested in here is where you have an expression (typically a variable) of one reference type, and you want an expression with a different compile-time type, without changing the actual value. You just want a different "view" on a reference. The two options we’ll look at are casting and using the "as" operator. So for example:

object x = "foo";
string cast = (string) x;
string asOperator = x as string;

The major differences between these are pretty well-understood:

Casting is also used for other conversions (e.g. between value types); "as" is only valid for reference type expressions (although the target type can be a nullable value type)
Casting can invoke user-defined conversions (if they’re applicable at compile-time); "as" only ever performs a reference conversion
If the actual value of the expression is a non-null reference to an incompatible type, casting will throw an InvalidCastException whereas the "as" operator will result in a null value instead

The last point is the one I’m interested in for this post, because it’s a highly visible symptom of many developers’ allergic reaction to exceptions. Let’s look at a few examples.

Use case 1: Unconditional dereferencing

First, let’s suppose we have a number of buttons all with the same event handler attached to them. The event handler should just change the text of the button to "Clicked". There are two simple approaches to this:

void HandleUsingCast(object sender, EventArgs e)
{
Button button = (Button) sender;
button.Text = "Clicked";
}

void HandleUsingAs(object sender, EventArgs e)
{
Button button = sender as Button;
button.Text = "Clicked";
}

(Obviously these aren’t the method names I’d use in real code – but they’re handy for differentiating between the two approaches within this post.)

In both cases, when the value of "sender" genuinely refers to a Button instance, the code will function correctly. Likewise when the value of "sender" is null, both will fail with a NullReferenceException on the second line. However, when the value of "sender" is a reference to an instance which isn’t compatible with Button, the two behave differently:

HandleUsingCast will fail on the first line, throwing a InvalidCastException which includes information about the actual type of the object
HandleUsingAs will fail on the second line with a NullReferenceException

Which of these is the more useful behaviour? It seems pretty unambiguous to me that the HandleUsingCast option provides significantly more information, but still I see the code from HandleUsingAs in examples on Stack Overflow… sometimes with the rationale of "I prefer to use as instead of a cast to avoid exceptions." There’s going to be an exception anyway, so there’s really no good reason to use "as" here.

Use case 2: Silently ignoring invalid input

Sometimes a slight change is proposed to the above code, checking for the result of the "as" operator not being null:

void HandleUsingAs(object sender, EventArgs e)
{
    Button button = sender as Button;
    if (button != null)
    {
        button.Text = "Clicked";
    }
}

Now we really don’t have an exception. We can do this with the cast as well, using the "is" operator:

void HandleUsingCast(object sender, EventArgs e)
{
    if (sender is Button)
    {
        Button button = (Button) sender;
        button.Text = "Clicked";
    }
}

These two methods now behave the same way, but here I genuinely prefer the "as" approach. Aside from anything else, it’s only performing a single execution-time type check, rather than checking once with "is" and then once again with the cast. There are potential performance implications here, but in most cases they’d be insignificant – it’s the principle of the thing that really bothers me. Indeed, this is basically the situation that the "as" operator was designed for. But is it an appropriate design to start with?

In this particular case, it’s very unlikely that we’ll have a non-Button sender unless there’s been a coding error somewhere. For example, it’s unlikely that bad user input or network resource issues would lead to entering this method with a sender of (say) a TextBox. So do you really want to silently ignore the problem? My personal view is that the response to a detected coding error should almost always be an exception which either goes completely uncaught or which is caught at some "top level", abandoning the current operation as cleanly as possible. (For example, in a client application it may well be appropriate to terminate the app; in a web application we wouldn’t want to terminate the server process, but we’d want to abort the processing of the problematic request.) Fundamentally, the world is not in a state which we’ve really considered: continuing regardless could easily make things worse, potentially losing or corrupting data.

If you are going to ignore a requested operation involving an unexpected type, at least clearly log it – and then ensure that any such logs are monitored appropriately:

void HandleUsingAs(object sender, EventArgs e)
{
    Button button = sender as Button;
    if (button != null)
    {
        button.Text = "Clicked";
    }
    else
    {
        // Log an error, potentially differentiating between
        // a null sender and input of a non-Button sender.
    }
}

Use case 3: consciously handling input of an unhelpful type

Despite the general thrust of the previous use case, there certainly are cases where it makes perfect sense to use "as" to handle input of a type other than the one you’re really hoping for. The simplest example of this is probably equality testing:

public sealed class Foo : IEquatable<Foo>
{
// Fields, of course

    public override bool Equals(object obj)
    {
        return Equals(obj as Foo);
    }

    public bool Equals(Foo other)
    {
        // Note: use ReferenceEquals if you overload the == operator
        if (other == null)
        {
            return false;
        }
        // Now compare the fields of this and other for equality appropriately.
    }

// GetHashCode etc
}

(I’ve deliberately sealed Foo to avoid having to worry about equality between subclasses.)

Here we know that we really do want to deal with both null and "non-null, non-Foo" references in the same way from Equals(object) – we want to return false. The simplest way of handling that is to delegate to the Equals(Foo) method which needs to handle nullity but doesn’t need to worry about non-Foo reference.

We’re knowingly anticipating the possibility of Equals(object) being called with a non-Foo reference. The documentation for the method explicitly states what we’re meant to do; this does not necessarily indicate a programming error. We could implement Equals with a cast, of course:

public override bool Equals(object obj)
{
return obj is Foo && Equals((Foo) obj);
}

… but I dislike that for the same reasons as I disliked the cast in use case 2.

Use case 4: deferring or delegating the decision

This is the case where we pass the converted value on to another method or constructor, which is likely to store the value for later use. For example:

public Person CreatePersonWithCast(object firstName, object lastName)
{
return new Person((string) firstName, (string) lastName);
}

public Person CreatePersonWithAs(object firstName, object lastName)
{
return new Person(firstName as string, lastName as string);
}

In some ways use case 3 was a special case of this, where we knew what the Equals(Foo) method would do with a null reference. In general, however, there can be a significant delay between the conversion and some definite impact. It may be valid to use null for one or both arguments to the Person constructor – but is that really what you want to achieve? Is some later piece of code going to assume they’re non-null?

If the constructor is going to validate its parameters and check they’re non-null, we’re essentially back to use case 1, just with ArgumentNullException replacing NullReferenceException: again, it’s cleaner to use the cast and end up with InvalidCastException before we have the chance for anything else to go wrong.

In the worst scenario, it’s really not expected that the caller will pass null arguments to the Person constructor, but due to sloppy validation the Person is constructed with no errors. The code may then proceed to do any number of things (some of them irreversible) before the problem is spotted. In this case, there may be lasting data loss or corruption and if an exception is eventually thrown, it may be very hard to trace the problem to the original CreatePersonWithAs parameter value not being a string reference.

Use case 5: taking advantage of "optional" functionality

This was suggested by Stephen Cleary in comments, and is an interesting reversal of use case 3. The idea is basically that if you know an input implements a particular interface, you can take a different – usually optimized – route to implement your desired behaviour. LINQ to Objects does this a lot, taking advantage of the fact that while IEnumerable<T> itself doesn’t provide much functionality, many collections implement other interfaces such as ICollection<T>. So the implementation of Count() might include something like this:

ICollection<T> collection = source as ICollection<T>;
if (collection != null)
{
return collection.Count;
}
// Otherwise do it the long way (GetEnumerator / MoveNext)

Again, I’m fine with using "as" here.

Conclusion

I have nothing against the "as" operator, when used carefully. What I dislike is the assumption that it’s "safer" than a cast, simply because in error cases it doesn’t throw an exception. That’s more dangerous behaviour, as it allows problems to propagate. In short: whenever you have a reference conversion, consider the possibility that it might fail. What sort of situation might cause that to occur, and how to you want to proceed?

If everything about the system design reassures you that it really can’t fail, then use a cast: if it turns out that your understanding of the system is wrong, then throwing an exception is usually preferable to proceeding in a context you didn’t anticipate. Bear in mind that a null reference can be successfully cast to any nullable type, so a cast can never replace a null check.
If it’s expected that you really might receive a reference of the "wrong" type, then think how you want to handle it. Use the "as" operator and then test whether the result was null. Bear in mind that a null result doesn’t always mean the original value was a reference of a different type – it could have been a null reference to start with. Consider whether or not you need to differentiate those two situations.
If you really can’t be bothered to really think things through (and I hope none of my readers are this lazy), default to using a cast: at least you’ll notice if something’s wrong, and have a useful stack trace.

As a side note, writing this post has made me consider (yet again) the various types of "exception" situations we run into. At some point I may put enough thought into how we could express our intentions with regards to these situations more clearly – until then I’d recommend reading Eric Lippert’s taxonomy of exceptions, which has certainly influenced my thinking.

Benchmarking, C#, Design, Wacky Ideas

Array covariance: not just ugly, but slow too

June 22, 2013 jonskeet 26 Comments

It seems to be quite a long time since I’ve written a genuine "code" blog post. Time to fix that.

This material may well be covered elsewhere – it’s certainly not terrifically original, and I’ve been meaning to post about it for a long time. In particular, I remember mentioning it at CodeMash in 2012. Anyway, the time has now come.

Refresher on array covariance

Just as a bit of background before we delve into the performance aspect, let me remind you what array covariance is, and when it applies. The basic idea is that C# allows a reference conversion from type TDerived[] to type TBase[], so long as:

TDerived and TBase are both reference types (potentially interfaces)
There’s a reference conversion from TDerived to TBase (so either TDerived is the same as TBase, or a subclass, or an implementing class etc)

Just to remind you about reference conversions, those are conversions from one reference type to another, where the result (on success) is never a reference to a different object. To quote section 6.1.6 of the C# 5 spec:

Reference conversions, implicit or explicit, never change the referential identity of the object being converted. In other words, while a reference conversion may change the type of the reference, it never changes the type or value of the object being referred to.

So as a simple example, there’s a reference conversion from string to object, therefore there’s a reference conversion from string[] to object[]:

string[] strings = new string[10];
object[] objects = strings;

// strings and objects now refer to the same object

There is not a reference conversion between value type arrays, so you can’t use the same code to conver from int[] to object[].

The nasty part is that every store operation into a reference type array now has to be checked at execution time for type safety. So to extend our sample code very slightly:

string[] strings = new string[10];
object[] objects = strings;

objects[0] = "string"; // This is fine
objects[0] = new Button(); // This will fail

The last line here will fail with an ArrayTypeMismatchException, to avoid storing a Button reference in a String[] object. When I said that every store operation has to be checked, that’s a slight exaggeration: in theory, if the compile-time type is an array with an element type which is a sealed class, the check can be avoided as it can’t fail.

Avoiding array covariance

I would rather arrays weren’t covariant in the first place, but there’s not a lot that can be done about that now. However, we can work around this, if we really need to. We know that value type arrays are not covariant… so how about we use a value type array instead, even if we want to store reference types?

All we need is a value type which can store the reference type we need – which is dead easy with a wrapper type:

public struct Wrapper<T> where T : class
{
    private readonly T value;
    public T Value { get { return value; } }

    public Wrapper(T value)
    {
        this.value = value;
    }

    public static implicit operator Wrapper<T>(T value)
    {
        return new Wrapper<T>(value);
    }
}

Now if we have a Wrapper<string>[], we can’t assign that to a Wrapper<object>[] variable – the types are incompatible. If that feels a bit clunky, we can put the array into its own type:

public sealed class InvariantArray<T> where T : class
{
    private readonly Wrapper<T>[] array;

    public InvariantArray(int size)
    {
        array = new Wrapper<T>[size];
    }

    public T this[int index]
    {
        get { return array[index].Value; }
        set { array[index] = value; }
    }
}

Just to clarify, we now only have value type arrays, but ones where each value is a plain wrapper for a reference. We now can’t accidentally violate type-safety at compile-time, and the CLR doesn’t need to validate write operations.

There’s no memory overhead here – aside from the type information at the start, I’d actually expect the contents of a Wrapper<T>[] to be indistinguishable from a T[] in memory.

Benchmarking

So, how does it perform? I’ve written a small console app to test it. You can download the full code, but the gist of it is that we use a stopwatch to measure how long it takes to either repeatedly write to an array, or repeatedly read from an array (validating that the value read is non-null, just to prove that we’ve really read something). I’m hoping I haven’t fallen foul of any of the various mistakes in benchmarking which are so easy to make.

The test tries four scenarios:

object[] (but still storing strings)
string[]
Wrapper<string>[]
InvariantArray<string>

Running against an array size of 100, with 100 million iterations per test, I get the following results on my Thinkpad Twist :

Array type	Read time (ms)	Write time
object[]	11842	44827
string[]	12000	40865
Wrapper<string>[]	11843	29338
InvariantArray<string>	11825	32973

That’s just one run, but the results are fairly consistent across runs. The one interesting deviation is the write time for object[] – I’ve observed it sometimes being the same as for string[], but not consistently. I don’t understand this, but it does seem that the JIT isn’t performing the optimization for string[] that it could if it spotted that string is sealed.

Both of the workarounds to avoid array covariance make a noticeable difference to the performance of writing to the array, without affecting read performance. Hooray!

Conclusion

I think it would be a very rare application which noticed a significant performance boost here, but I do like the fact that this is one of those situations where a cleaner design also leads to better performance, without many obvious technical downsides.

That said, I doubt that I’ll actually be using this in real code any time soon – the fact that it’s just "different" to normal C# code is a big downside in itself. Hope you found it interesting though :)

C#, Design, Noda Time

The perils of conditional mutability

May 6, 2012 jonskeet 6 Comments

This morning I was wrestling with trying to make some Noda Time unit tests faster. For some reason, the continuous integration host we’re using is really slow at loading resources under .NET 4. The unit tests which run in 10 seconds on my home laptop take over three hours on the continuous integration system. Taking stack traces at regular intervals showed the problem was with the NodaFormatInfo constructor, which reads some resources.

I may look into streamlining the resource access later, but before we get to that point, I wanted to try to reduce the number of times we call that constructor in the first place. NodaFormatInfo is meant to be cached, so I wouldn’t have expected thousands of instances to be created – but it’s only cached when the System.Globalization.CultureInfo it’s based on is read-only. This is where the problems start…

CultureInfo is conditionally mutable (not an official term, just one I’ve coined for the purposes of this post). You can ask whether or not it’s read-only with the IsReadOnly property, and obviously if it’s read-only you can’t change it. Additionally, CultureInfo is composed of other conditionally mutable objects – DateTimeFormatInfo, NumberFormatInfo etc. There’s a static ReadOnly method on CultureInfo to create a read-only wrapper around a mutable CultureInfo. It’s not clearly documented whether that’s expected to take a deep copy (so that callers can really rely on it not changing) or whether it’s expected to reflect any further changes made to the culture info it’s based on. To go in the other direction, you can call Clone on a CultureInfo to create a mutable copy of any existing culture.

Further complications are introduced by the properties on the composite objects – we have properties such as DateTimeFormatInfo.MonthNames which returns a string array. Remember, arrays are always mutable. So it’s really important to know whether the array reference returned from the property refers to a copy of the underlying data, or whether it refers to the array that’s used internally by the type. Obviously for read-only DateTimeFormatInfo objects, I’d expect a copy to be returned – but for a mutable DateTimeFormatInfo, it would potentially make sense to return the underlying array reference. Unfortunately, the documentation doesn’t make this clear – but in practice, it always returns a copy. If you need to change the month names, you need to clone the array, mutate the clone, and then set the MonthNames property.

All of this makes CultureInfo hard to work with. The caching decision earlier on only really works if a "read-only" culture genuinely won’t change behind the scenes. The type system gives you no help to catch subtle bugs at compile-time. Making any of this robust but efficient (in terms of taking minimal copies) is tricky to say the least.

Not only does it make it hard to work with from a client’s point of view, but apparently it’s hard to implement correctly too…

First bug: Mono’s invariant culture isn’t terribly invariant…

(Broken in 2.10.8; apparently fixed later.)

I discovered this while getting Noda Time’s unit tests to pass on Mono. Unfortunately there are some I’ve had to effectively disable at the moment, due to deficiencies in Mono (some of which are being fixed, of course).

Here’s a program which builds a clone of the invariant culture, changes the clone’s genitive month names, and then prints out the first non-genitive name from the plain invariant culture:

using System;
using System.Globalization;

class Test
{
    static void Main()
    {
        CultureInfo clone = (CultureInfo) CultureInfo.InvariantCulture.Clone();
        // Note: not even deliberately changing MonthNames for this culture!
        clone.DateTimeFormat.MonthGenitiveNames[0] = "Changed";

        // Prints Changed
        Console.WriteLine(CultureInfo.InvariantCulture.DateTimeFormat.MonthNames[0]);
    }
}

I believe this bug is really due to the lack of support for genitive month names in Mono at the moment – the MonthGenitiveNames property always just returns a reference to the month names for the invariant culture – without taking a copy first. (It always returns the invariant culture’s month names, even if you’re using a different culture entirely.) The code above shows an "innocent" attempt to change a mutable clone – but in reality we could have used any culture (including an immutable one) to make the change.

Note that in the .NET implementation, the change would only have been to a copy of the underlying data, so even the clone wouldn’t have reflected change anywhere.

Second bug: ReadOnly losing changes

The second bug is the one I found this morning. It looks like it’s fixed in .NET 4, but it’s present in .NET 3.5, which is where it bit me this morning. When you try to make a read-only wrapper around a mutated culture, some of the properties are preserved… but some aren’t:

using System;
using System.Globalization;

class Test
{
    static void Main()
    {
        CultureInfo clone = (CultureInfo) CultureInfo.InvariantCulture.Clone();
        clone.DateTimeFormat.AMDesignator = "ChangedAm";

        // The array is recreated on each call to MonthNames, so changing the
        // value within the array itself doesn’t work :(
        string[] months = (string[]) clone.DateTimeFormat.MonthNames;
        months[0] = "ChangedMonth";
        clone.DateTimeFormat.MonthNames = months;

        CultureInfo readOnlyCopy = CultureInfo.ReadOnly(clone);
        Console.WriteLine(clone.DateTimeFormat.AMDesignator); // ChangedAm
        Console.WriteLine(clone.DateTimeFormat.MonthNames[0]); // ChangedMonth

        Console.WriteLine(readOnlyCopy.DateTimeFormat.AMDesignator); // ChangedAm
        Console.WriteLine(readOnlyCopy.DateTimeFormat.MonthNames[0]); // January (!)
    }
}

I don’t know what’s going on here. In the end I just left the test code using the mutable clone, having added a comment explaining why it wasn’t created a read-only wrapper.

I’ve experimented with a few different options here, including setting the MonthNames property on the clone after creating the wrapper. No joy – I simply can’t make the new month names stick in the read-only copy. <sigh>

Conclusion

I’ve been frustrated by the approach we’ve taken to cultures in Noda Time for a while. I haven’t worked out exactly what we should do about it yet, so it’s likely to stay somewhat annoying for v1, but I may well revisit it significantly for v2. Unfortunately, there’s nothing I can do about CultureInfo itself.

What I would have preferred in all of this is the builder pattern: make CultureInfo, DateTimeFormatInfo etc all immutable, but give each of them mutable builder types, with the ability to create a mutable builder based on an existing immutable instance, and obviously to create a new immutable instance from a builder. That would make all kinds of things simpler – including caching.

For the moment though, I hope we can all learn lessons from this – or have old lessons reinforced, at least:

Making a single type behave in different ways based on different "modes" makes it hard to use correctly. (Yes, this is the same first conclusion as with DateTime in the previous post. Funny, that.)
Immutability has to be deep to be meaningful: it’s not much use having a supposedly read-only object which composes a StringBuilder…
Arrays should be considered somewhat harmful. If you’re going to return an array from a method, make sure you document whether this is a copy of the underlying data, or a "live" reference. (The latter should be very rare, particularly for a public API.) The exception here is if you return an empty array – that’s effectively immutable, so you can always return it with no problems.
The builder pattern rocks – use it!

In my next post I’ll try to get back to the TimeZoneInfo oddities I mentioned last time.

C#, Design, Noda Time

Subtleties in API design – member placement

February 29, 2012 jonskeet 38 Comments

Noda Time is nearing v1.0, which means I’m spending more time writing documentation than code. It also means reviewing the APIs we’ve got with a critical eye – whether that’s removing extraneous members, adding useful ones, or moving things around. (In particular, writing documentation often suggests where a change would make calling code read more naturally.)

This post is about one particular section of the API, and the choices available. Although I do go into some detail around the specific calls involved, that’s just for context… the underlying choices are ones which could be faced when designing any API. I’ve rarely spent as much time thinking about API decisions as I have with Noda Time, so hopefully this will prove interesting to you even if you really don’t care about Noda Time itself as a project.

Introduction: time zones, local date/times and zoned date/times – oh my!

(Okay, so that’s not quite as snappy as the Judy Garland version, but hey…)

The area of API we’re going to focus on is time zones, and converting between "local" date/time values and "zoned" ones. The three types involved are:

LocalDateTime: a "local" date and time, with no specific time zone. So, something like "7:30 in the evening on February 27th 2012". This means different instants in time in different time zones, of course: if you’re arranging a meeting, it’s good enough when the attendees are in the same time zone, but not good enough if you’re meeting with someone on the other side of the world. (A LocalDateTime also has an associated calendar system, but I’m not going to talk about that much in this post.)
DateTimeZone: a time zone. At its core, this maps any "instant" in time to an offset – the difference between UTC and local time in that time zone. The offset changes over time, typically (but not always) due to daylight saving changes.
ZonedDateTime: a date and time in a particular time zone, with an offset from UTC to avoid ambiguity in some cases (and for efficiency). This identifies a specific instant in time (simply by subtracting the offset from the local date/time). Conceptually this is equivalent to just maintaining the "instant" value, the time zone, and the calendar system – but it’s generally cleaner to think of it as a "local" value with an offset from UTC.

If those brief descriptions don’t make sense for you at the moment (this sort of thing is quite hard to describe concisely and precisely) you may want to see whether the Noda Time user guide "concepts" page helps.

The API task: mapping from { LocalDateTime, DateTimeZone } to ZonedDateTime

It’s easy to get from a ZonedDateTime to a LocalDateTime – you can just use the LocalDateTime property. The difficult bit is the other way round. We obviously want to be able to create a ZonedDateTime from the combination of a LocalDateTime and a DateTimeZone, but the question is where to put this functionality. Three options suggest themselves:

A static method (or constructor) in ZonedDateTime which takes both the time zone and the local date/time as arguments
An instance method on LocalDateTime which just takes the time zone as an argument
An instance method on DateTimeZone which just takes the local date/time as an argument

It gets more complicated though – we’re not really talking about one operation here, but potentially several. Although the mapping from instant to offset is unambiguous in DateTimeZone, the mapping from LocalDateTime to offset is not straightforward. There can be 0, 1 or 2 possible results. For example, in the America/Los_Angeles time zone the clocks go forward from 2am to 3am on Sunday March 11th 2012, and go back from 2am to 1am on Sunday 4th November 2012. That means:

The mapping from local date/time to offset at 7.30pm on February 27th 2012 is unambiguous: it’s definitely -8 hours (L.A. is 8 hours behind UTC).
The mapping at 2.30am on March 11th 2012 is impossible: at 2am the clocks were put forward to 3am, so 2.30am simply never occurs.
The mapping at 2.30am on November 4th 2012 is ambiguous: it happens once before the clocks go back at 3am, and once afterwards. The offset is either -7 or -8 hours, depending on which occurrence you mean.

When mapping a local time to "global" time, this is something you should really think about. Most APIs obscure the issue, but one of the purposes of Noda Time is to force developers to think about issues which they should be aware of. This one is particularly insidious in that it’s the kind of problem which is much more likely to arise when you’re asleep than during daylight hours – so it’s unlikely to be found during manual testing. (Ditto the day of week – most time zones have daylight saving transitions early on a Sunday morning.)

So, Noda Time exposes four ways of mapping a LocalDateTime and DateTimeZone to a ZonedDateTime:

Exact: if there’s a single mapping, return it. Otherwise, throw an exception.
Earlier: if there’s a single mapping, return it. If there’s more than one, return the earlier one. If the time is skipped, throw an exception.
Later: if there’s a single mapping, return it. If there’s more than one, return the later one. If the time is skipped, throw an exception.
All information: find out all the information relevant to mapping the given local value – how many matches there are, what they would be, what the time zone information is for each mapping, etc. The caller can then do what they want.

Options available

The question is how we expose these operations. Let’s look at some options, then discuss the pros and cons.

Option 1: methods on LocalDateTime

A lot of Noda Time is designed to be "fluent" so it makes a certain amount of sense to be able to take a LocalDateTime, perform some arithmetic on it, then convert it to a ZonedDateTime, then (say) format it. So we could have something like:

var zoned = local.InZone(zone); // implicitly exact
var zoned = local.InZoneOrEarlier(zone);
var zoned = local.InZoneOrLater(zone);
var mapping = local.MapToZone(zone);

Option 2: methods on DateTimeZone

All the calculations involved are really about the time zone – the local date/time value is just a simple value as far as most of this is concerned. So we can put the methods on DateTimeZone instead:

var zoned = zone.AtExactly(local);
var zoned = zone.AtEarlier(local);
var zoned = zone.AtLater(local);
var mapping = zone.MapLocalDateTime(local);

Option 3: methods (or constructors) on ZonedDateTime

Maybe we consider the two inputs to be fairly equal, but the result type is more important:

var zoned = ZonedDateTime.FromLocal(zone, local);
var zoned = ZonedDateTime.FromLocalOrEarlier(zone, local);
var zoned = ZonedDateTime.FromLocalOrLater(zone, local);
var mapping = ZoneLocalMapping.FromLocal(local)

(I’m not terribly happy about the names here; there could be better ones of course.)

Variation a: any of the above options, but with an enum for ambiguity resolution

We don’t really need four methods on any of these APIs; the first three only differ by how they handle ambiguity (the situation where a particular local date/time occurs twice). We could use an enum to represent that choice instead:

var zoned = local.InZone(zone, ZoneAmbiguityResolver.Error);
var zoned = local.InZone(zone, ZoneAmbiguityResolver.Earlier);
var zoned = local.InZone(zone, ZoneAmbiguityResolver.Later);

(Or a "smart enum" with behaviour, if we wanted. A normal class type with methods etc, but only a restricted set of valid values.)

Variation b: always go via the mapping result

Given that we already have the idea of getting the full mapping results, we can reduce the API to just one method to return the mapping information, and then put extra methods on that:

var zoned = local.MapInZone(zone).SingleMatch;
var zoned = local.MapInZone(zone).SingleOrEarlier;
var zoned = local.MapInZone(zone).SingleOrLater;

(Again, the names aren’t fixed in stone, and the second part could be methods instead of properties if we wanted.)

Variation c: return a sequence of results

If we return a sequence with 0, 1 or 2 ZonedDateTime values, the user can just use LINQ to get the one they want. Again, this can apply wherever we decide to put the method:

var zoned = zone.At(local).Single();
var zoned = zone.At(local).First();
var zoned = zone.At(local).Last();

So, it looks like we effectively have two mostly-orthogonal decisions here:

Where to "start" the conversion – the target type for the method call
How to represent the multiple options

We’ll consider them separately.

Regarding the "source" type

To start with, I’ll reveal my bias: the existing implementation is option 2 (four methods on DateTimeZone). This was after a small amount of debate on the Noda Time mailing list, and this was the most relevant bit of the discussion:

Me (before going with the current implementation):

It feels a little odd to me to use the zone as the principal class here – just in terms of usability. It makes total sense in terms of the logic, but I tend to think of having a LocalDateTime first, and then converting that to use a particular zone – it’s not an operation which feels like it acts on the zone itself.

David N:

I actually feel the opposite: asking a DateTimeZone how a particular LocalDateTime would be represented in that zone feels natural, while asking the LocalDateTime how it would be represented in a zone feels odd. The zone is a first-class entity, with identity and behavior; the LocalDateTime is just a set of values. Why would the LocalDateTime be expected to know how it is represented in a particular zone?

Even though I replied to that in a "maybe" kind of tone, the argument basically convinced me. The trouble is, a colleague was then surprised when he read the documentation around calendar arithmetic and conversions. Surprising users is pretty much a cardinal sin when it comes to API design – and although in this case it was the relatively harmless sort of surprise ("I can’t find the member I want in A; oh, it turns out it’s in B") rather than a behavioural surprise ("I thought it would do X, but instead it does Y") it’s still bad news. I should reveal my colleague’s bias too – he has experience of Joda Time, where the relevant call is LocalDateTime.toDateTime(DateTimeZone). (There are calls in DateTimeZone itself, but they’re lower-level.)

We’ve discussed this a fair amount (leading to this blog post) and so far we’ve concluded that it really depends on how you think about time zones. As a Noda Time user, would you consider them to be rich objects with complex behaviour, or would you think of them as mere "tokens" to be passed around and used without much direct interaction? The two ways of viewing the type aren’t necessarily in conflict – I’ve deliberately designed CalendarSystem to hide its (fairly ugly) innards. There are a few public instance members, but most are internal. But what about time zones?

There’s an argument to be made for educating Noda Time users to think about time zones as more complex beasts than just tokens, and I’m happy to do that in other areas (such as choosing which type to use in the first place) but here it feels like it’s one step too far. On the other hand, I don’t want to stifle users who are thinking of DateTimeZone in that way. In the mailing list thread, David also expressed a dislike for the approach of including functionality in multiple places – and to a certain extent I agree (one of the things I dislike about its API is that it lets you do just about anything with anything)… but in this case it feels like it’s justified.

Regardless of how you’re thinking about DateTimeZone, it’s more likely that you’re going to want to use a LocalDateTime value which is the result of some other expression, and then apply some "constant" zone to it, then potentially keep going. If you think about a LINQ-style pipeline of operations, the part that varies in the conversion is much more likely to be the LocalDateTime than the time zone. As such, a method on LocalDateTime allows for a more fluent calling style:

var zoned = parseResult.Value
.PlusMonths(1)
.InZone(LondonTimeZone);

versus:

var zoned = LondonTimeZone.AtExactly(parseResult.Value.PlusMonths(1));

Or to keep the code order the same as the execution order:

var local = parseResult.Value.PlusMonths(1);
var zoned = LondonTimeZone.AtExactly(local);

Obviously the effects become more noticeable the more operations you perform. Personally I’m happy with either the first or third approach – although it’s worth being aware that either of the first two have the advantage of being one expression, and therefore easy to use when assigning a static readonly field or something similar.

I’m reasonably happy with having one method on each type, or possibly two (MapLocalDateTime and At*) on DateTimeZone and one (just InZone) on LocalDateTime. I really don’t like the idea of having four methods on DateTimeZone and three methods on LocalDateTime. So, let’s consider the different variations which cut down the number of methods required.

Expressing "exactly," "earlier," and "later" in one method

This is essentially a discussion of the "variations" above. Just to recap, the possibilities we’ve come up with are:

Add another parameter to the method to indicate how to handle ambiguities (or impossibilities) – just return a ZonedDateTime
Return a value of a different type (e.g. ZoneLocalMapping) which can be used to get at all the information you could want
Return a sequence of possible ZonedDateTime values, expecting the caller to probably use LINQ’s First/Last/Single/FirstOrDefault etc to get at the value they want

The last of these is the only one which gives an easy way of handling the extreme corner case of a local time occurring more than twice – for example, a time zone which goes back one hour at 2am (to 1am) and then goes back another two hours at 3am. I think it’s reasonable to dismiss this corner case; however mad time zones can be, I haven’t seen anything quite that crazy yet.

At the time of the original discussion, Noda Time was targeting .NET 2.0, which was one reason for not going with the final option here – we couldn’t guarantee that LINQ would be available. Now, Noda Time is targeting .NET 3.5 in order to use TimeZoneInfo, but it still doesn’t feel like an ideal fit:

Returning a sequence doesn’t give information about (say) the two zone intervals "surrounding" a gap
A sequence may be surprising to users who expect just a single value
The exceptions thrown by First, Single etc when their expectations aren’t met are very domain-neutral; we can give more information
FirstOrDefault will return the default value for ZonedDateTime in the case of ambiguity. That would be unfortunate, as ZonedDateTime is a value type, and its default value is actually somewhat unhelpful. (It has a null calendar system, effectively. There’s not a lot we can do about this, but that’s a post for another day.) We could make it a sequence of Nullable<ZonedDateTime> and guarantee that any values in it are actually non-null, but that’s really straining things.

Putting these together, there are enough negative points to this idea that I’m happy to rule it out. But what about the first two?

The first has the advantage that the caller only needs to make a single method call, effectively passing in a "magic token" (the ambiguity resolver) which they don’t really need to understand. On the other hand, if they want more information, they’ll have to call a different method – and I’m not really sure we want to encourage too much of this "magic token" behaviour.

The second has three disadvantages, all fairly slight:

The user may initially expect the result of a method mapping a LocalDateTime to a ZonedDateTime to be a ZonedDateTime… what’s this extra intermediate result doing? This is "only" a matter of user education, and it’s pretty discoverable. It’s an extra concept the user needs to understand, but it’s a good concept to understand.
Calling two methods or a method and a property (e.g. zone.MapLocalDateTime(localDateTime).Earlier) may end up being slightly more long-winded than a single method call. I can’t get excited about this.
We have to allocate an extra object for the mapping, even when we know it’s unique. Usually, this object will become eligible for garbage collection immediately. We could make it a struct, but I don’t think it’s a natural value type – I’d rather trust that allocating objects in gen0 is pretty cheap.

With the second method, we can replace all the existing methods in DateTimeZone with a single one (or rather, just remove the AtXyz methods, leaving MapLocalDateTime). We can then create pleasantly-named methods on ZoneLocalMapping (which isn’t quite right for this purpose at the moment).

Conclusion

This has been an interesting thought experiment for me, and it’s suggested some changes I will be applying before v1. We’ll see how they pan out. If you want to follow them, look for relevant source code changes.

The important points I’ve been thinking about are:

What would a new user expect to be available? If they haven’t read any documentation, what are they likely to try?
What should the user know about? Where there are important decisions to make, how can we provide guidance?
What would an experienced user (who is already thinking about the Noda Time concepts in the way that we want to encourage) expect to be available?
Where does the balance lie between providing a "too crowded" API (with lots of different ways of achieving the same thing) and a "sparse" API (where there’s always one way of achieving a goal, but it may not be the most convenient one for your situation)
How does our choice fit in with other technologies? For example, the final "variation" seems like it plays nicely with LINQ at first – but a few subtleties make it a worse fit than it might seem.
How does this affect performance? (Performance is important in Noda Time – but there would have to be a significant performance problem for me to deviate from an elegant solution.)

So, any other thoughts? Did we miss some options? What other factors should we have taken into consideration?