Category Archives: C#

Reimplementing LINQ to Objects: Part 5 – Empty

Continuing with the non-extension methods, it’s time for possibly the simplest LINQ operator around: "Empty".

What is it?

"Empty" is a generic, static method with just a single signature and no parameters:

public static IEnumerable<TResult> Empty<TResult>()

It returns an empty sequence of the appropriate type. That’s all it does.

There’s only one bit of interesting behaviour: Empty is documented to cache an empty sequence. In other words, it returns a reference to the same empty sequence every time you call it (for the same type argument, of course).
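You can see the caching for yourself with the framework implementation – a quick standalone demonstration, not part of the Edulinq code:

using System;
using System.Linq;

class EmptyCachingDemo
{
    static void Main()
    {
        // Same type argument: the same cached instance every time.
        Console.WriteLine(ReferenceEquals(Enumerable.Empty<string>(),
                                          Enumerable.Empty<string>())); // True

        // Different type arguments: different (but still cached) instances.
        Console.WriteLine(ReferenceEquals(Enumerable.Empty<string>(),
                                          Enumerable.Empty<object>())); // False
    }
}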

What are we going to test?

There are really only two things we can test here:

  • The returned sequence is empty
  • The returned sequence is cached on a per type argument basis

I’m using the same approach as for Range to call the static method, but this time with an alias of EmptyClass. Here are the tests:

[Test]
public void EmptyContainsNoElements()
{
    using (var empty = EmptyClass.Empty<int>().GetEnumerator())
    {
        Assert.IsFalse(empty.MoveNext());
    }
}

[Test]
public void EmptyIsASingletonPerElementType()
{
    Assert.AreSame(EmptyClass.Empty<int>(), EmptyClass.Empty<int>());
    Assert.AreSame(EmptyClass.Empty<long>(), EmptyClass.Empty<long>());
    Assert.AreSame(EmptyClass.Empty<string>(), EmptyClass.Empty<string>());
    Assert.AreSame(EmptyClass.Empty<object>(), EmptyClass.Empty<object>());

    Assert.AreNotSame(EmptyClass.Empty<long>(), EmptyClass.Empty<int>());
    Assert.AreNotSame(EmptyClass.Empty<string>(), EmptyClass.Empty<object>());
}

Of course, that doesn’t verify that the cache isn’t per-thread, or something like that… but it’ll do.

Let’s implement it!

The implementation is actually slightly more interesting than the description so far may suggest. If it weren’t for the caching aspect, we could just implement it like this:

// Doesn’t cache the empty sequence
public static IEnumerable<TResult> Empty<TResult>()
{
    yield break;
}

… but we want to obey the (somewhat vaguely) documented caching aspect too. It’s not really hard, in the end. There’s a very handy fact that we can use: empty arrays are immutable. Arrays always have a fixed size, but normally there’s no way of making an array read-only… you can always change the value of any element. But an empty array doesn’t have any elements, so there’s nothing to change. So, we can reuse the same array over and over again, returning it directly to the caller… but only if we have an empty array of the right type.

At this point you may be expecting a Dictionary<Type, Array> or something similar… but there’s another useful trick we can take advantage of. If you need a per-type cache and the type is specified as a type argument, you can use static variables in a generic class, because each constructed type has a distinct set of static variables.
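If that’s unfamiliar, here’s a tiny standalone demonstration of the principle – nothing to do with Edulinq itself:

using System;

class PerTypeCounter<T>
{
    // Each constructed type (PerTypeCounter<int>, PerTypeCounter<string> etc.)
    // gets its own independent copy of this field.
    public static int Count;
}

class Demo
{
    static void Main()
    {
        PerTypeCounter<int>.Count = 10;
        PerTypeCounter<string>.Count = 20;
        Console.WriteLine(PerTypeCounter<int>.Count);    // 10
        Console.WriteLine(PerTypeCounter<string>.Count); // 20
    }
}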

Unfortunately, Empty is a generic method rather than a non-generic method in a generic type… so we’ve got to create a separate generic type to act as our cache for the empty array. That’s easy to do though, and the CLR takes care of initializing the type in a thread-safe way, too. So our final implementation looks like this:

public static IEnumerable<TResult> Empty<TResult>()
{
    return EmptyHolder<TResult>.Array;
}
        
private static class EmptyHolder<T>
{
    internal static readonly T[] Array = new T[0];       
}

That obeys all the caching we need, and is really simple in terms of lines of code… but it does mean you need to understand how generics work in .NET reasonably well. In some ways this is the opposite of the situation in the previous post – this is a sneaky implementation instead of the slower but arguably simpler dictionary-based one. In this case, I’m happy with the trade-off, because once you do understand how generic types and static variables work, this is simple code. It’s a case where simplicity is in the eye of the beholder.

Conclusion

So, that’s Empty. The next operator – Repeat – is likely to be even simpler, although it’ll have to be another split implementation…

Addendum

Due to the minor revolt over returning an array (which I still think is fine), here’s an alternative implementation:

public static IEnumerable<TResult> Empty<TResult>()
{
    return EmptyEnumerable<TResult>.Instance;
}

#if AVOID_RETURNING_ARRAYS
private class EmptyEnumerable<T> : IEnumerable<T>, IEnumerator<T>
{
    internal static IEnumerable<T> Instance = new EmptyEnumerable<T>();

    // Prevent construction elsewhere
    private EmptyEnumerable()
    {
    }

    public IEnumerator<T> GetEnumerator()
    {
        return this;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this;
    }

    public T Current
    {
        get { throw new InvalidOperationException(); }
    }

    object IEnumerator.Current
    {
        get { throw new InvalidOperationException(); }
    }

    public void Dispose()
    {
        // No-op
    }

    public bool MoveNext()
    {
        return false; // There’s never a next entry
    }

    public void Reset()
    {
        // No-op
    }
}

#else
private static class EmptyEnumerable<T>
{
    internal static readonly T[] Instance = new T[0];       
}
#endif

Hopefully now everyone can build a version they’re happy with :)

Reimplementing LINQ to Objects: Part 4 – Range

This will be a short post, and there’ll probably be some more short ones coming up too. I think it makes sense to only cover multiple operators in a single post where they’re really similar. (Count and LongCount spring to mind.) I’m in your hands though – if you would prefer "chunkier" posts, please say so in the comments.

This post will deal with the Range generation operator.

What is it?

Range only has a single signature:

public static IEnumerable<int> Range(
    int start,
    int count)

Unlike most of LINQ, this isn’t an extension method – it’s a plain old static method. It returns an iterable object which will yield "count" integers, starting from "start" and incrementing each time – so a call to Enumerable.Range(6, 3) would yield 6, then 7, then 8.

As it doesn’t operate on an input sequence, there’s no sense in which it could stream or buffer its input, but:

  • The arguments need to be validated eagerly; the count can’t be negative, and it can’t be such that any element of the range could overflow Int32.
  • The values will be yielded lazily – Range should be cheap, rather than creating (say) an array of "count" elements and returning that.

How are we going to test it?

Testing a plain static method brings us a new challenge in terms of switching between the "normal" LINQ implementation and the Edulinq one. This is an artefact of the namespaces I’m using – the tests are in Edulinq.Tests, and the implementation is in Edulinq. "Parent" namespaces are always considered when the compiler tries to find a type, and they take priority over anything in using directives – even a using directive which tries to explicitly alias a type name.

The (slightly ugly) solution to this that I’ve chosen is to include a using directive to create an alias which couldn’t otherwise be resolved – in this case, RangeClass. The using directive will either alias RangeClass to System.Linq.Enumerable or Edulinq.Enumerable. The tests then all use RangeClass.Range. I’ve also changed how I’m switching between implementations – I now have two project configurations, one of which defines the NORMAL_LINQ preprocessor symbol, and the other of which doesn’t. The RangeTest class therefore contains:

#if NORMAL_LINQ
using RangeClass = System.Linq.Enumerable;
#else
using RangeClass = Edulinq.Enumerable;
#endif

There are alternatives to this approach, of course:

  • I could move the tests to a different namespace
  • I could make the project references depend on the configuration… so the "Normal LINQ" configuration wouldn’t reference the Edulinq implementation project, and the "Edulinq implementation" configuration wouldn’t reference System.Core. I could then just use Enumerable.Range with an appropriate using directive for System.Linq conditional on the NORMAL_LINQ preprocessor directive, as per the other tests.

I like the idea of the second approach, but it means manually tinkering with the test project file – Visual Studio doesn’t expose any way of conditionally including a reference. I may do this at a later date… thoughts welcome.

What are we going to test?

There isn’t much we can really test for ranges – I only have eight tests, none of which are particularly exciting:

  • A simple valid range should look right when tested with AssertSequenceEqual
  • The start value should be allowed to be negative
  • Range(Int32.MinValue, 0) is an empty range
  • Range(Int32.MaxValue, 1) yields just Int32.MaxValue
  • The count can’t be negative
  • The count can be zero
  • start+count-1 can’t exceed Int32.MaxValue (so Range(Int32.MaxValue, 2) isn’t valid)
  • start+count-1 can be Int32.MaxValue (so Range(Int32.MaxValue, 1) is valid)

The last two are tested with a few different examples each – a large start and a small count, a small start and a large count, and "fairly large" values for both start and count.

Note that I don’t have any tests for lazy evaluation – while I could test that the returned value doesn’t implement any of the other collection interfaces, it would be a little odd to do so. On the other hand, we do have tests which have an enormous count – such that anything which really tried to allocate a collection of that size would almost certainly fail…

Let’s implement it!

It will surely be no surprise by now that we’re going to use a split implementation, with a public method which performs argument validation eagerly and then uses a private method with an iterator block to perform the actual iteration.

Having validated the arguments, we know that we’ll never overflow the bounds of Int32, so we can be pretty casual in the main part of the implementation.

public static IEnumerable<int> Range(int start, int count)
{
    if (count < 0)
    {
        throw new ArgumentOutOfRangeException("count");
    }
    // Convert everything to long to avoid overflows. There are other ways of checking
    // for overflow, but this way makes the code correct in the most obvious way.
    if ((long)start + (long)count - 1L > int.MaxValue)
    {
        throw new ArgumentOutOfRangeException("count");
    }
    return RangeImpl(start, count);
}

private static IEnumerable<int> RangeImpl(int start, int count)
{
    for (int i = 0; i < count; i++)
    {
        yield return start + i;
    }
}

Just a few points to note here:

  • Arguably it’s the combination of "start" and "count" which is invalid in the second check, rather than just count. It would possibly be nice to allow ArgumentOutOfRangeException (or ArgumentException in general) to blame multiple arguments rather than just one. However, using "count" here matches the framework implementation.
  • There are other ways of performing the second check, and I certainly didn’t have to make all the operands in the expression longs. However, I think this is the simplest code which is clearly correct based on the documentation. I don’t need to think about all kinds of different situations and check that they all work. The arithmetic will clearly be valid when using the Int64 range of values, so I don’t need to worry about overflow, and I don’t need to consider whether to use a checked or unchecked context.
  • There are also other ways of looping in the private iterator block method, but I think this is the simplest. Another obvious and easy alternative is to keep two values, one for the count of yielded values and the other for the next value to yield, and increment them both on each iteration (see the sketch after this list). A more complex approach would be to use just one loop variable – but you can’t use "value < start + count" in case the final value is exactly Int32.MaxValue, and you can’t use "value <= start + count - 1" in case the arguments are (int.MinValue, 0). Rather than consider all the border cases, I’ve gone for an obviously-correct solution. If you really, really cared about the performance of Range, you’d want to investigate various other options.
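Just to make the "two values" alternative concrete, here’s a sketch of what that version of the private method might look like:

private static IEnumerable<int> RangeImpl(int start, int count)
{
    // Track the next value to yield and how many values we've yielded so far.
    int value = start;
    for (int yielded = 0; yielded < count; yielded++)
    {
        yield return value;
        // In the default unchecked context this is harmless even when the
        // final value is exactly int.MaxValue: the wrapped value is never yielded.
        value++;
    }
}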

Prior to writing up this post, I didn’t have good tests for Range(Int32.MaxValue, 1) and Range(Int32.MinValue, 0)… but as they could easily go wrong as mentioned above, I’ve now included them. I find it interesting how considering alternative implementations suggests extra tests.

Conclusion

"Range" was a useful method to implement in order to test some other operators – "Count" in particular. Now that I’ve started on the non-extension methods though, I might as well do the other two (Empty and Repeat). I’ve already implemented "Empty", and will hopefully be able to write it up today. "Repeat" shouldn’t take much longer, and then we can move on to "Count" and "LongCount".

I think this code is a good example of situations where it’s worth writing "dumb" code which looks like the documentation, rather than trying to write possibly shorter, possibly slightly more efficient code which is harder to think about. No doubt there’ll be more of that in later posts…

Reimplementing LINQ to Objects: Part 3 – “Select” (and a rename…)

It’s been a long time since I wrote part 1 and part 2 of this blog series, but hopefully things will move a bit more quickly now.

The main step forward is that the project now has a source repository on Google Code instead of just being a zip file on each blog post. I had to give the project a title at that point, and I’ve chosen Edulinq, hopefully for obvious reasons. I’ve changed the namespaces etc in the code, and the blog tag for the series is now Edulinq too. Anyway, enough of the preamble… let’s get on with reimplementing LINQ, this time with the Select operator.

What is it?

Like Where, Select has two overloads:

public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TResult> selector)

public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, int, TResult> selector)

Again, they both operate the same way – but the second overload allows the index into the sequence to be used as part of the projection.

Simple stuff first: the method projects one sequence to another: the "selector" delegate is applied to each input element in turn, to yield an output element. Behaviour notes, which are exactly the same as Where (to the extent that I cut and paste these from the previous blog post, and just tweaked them):

  • The input sequence is not modified in any way.
  • The method uses deferred execution – until you start trying to fetch items from the output sequence, it won’t start fetching items from the input sequence.
  • Despite deferred execution, it will validate that the parameters aren’t null immediately.
  • It streams its results: it only ever needs to look at one result at a time.
  • It will iterate over the input sequence exactly once each time you iterate over the output sequence.
  • The "selector" function is called exactly once per yielded value.
  • Disposing of an iterator over the output sequence will dispose of the corresponding iterator over the input sequence.

What are we going to test?

The tests are very much like those for Where – except that in cases where we tested the filtering aspect of Where, we’re now testing the projection aspect of Select.

There are a few tests of some interest. Firstly, you can tell that the method is generic with 2 type parameters instead of 1 – it has type parameters of TSource and TResult. They’re fairly self-explanatory, but it means it’s worth having a test for the case where the type arguments are different – such as converting an int to a string:

[Test]
public void SimpleProjectionToDifferentType()
{
    int[] source = { 1, 5, 2 };
    var result = source.Select(x => x.ToString());
    result.AssertSequenceEqual("1", "5", "2");
}

Secondly, I have a test that shows what sort of bizarre situations you can get into if you include side effects in your query. We could have done this with Where as well of course, but it’s clearer with Select:

[Test]
public void SideEffectsInProjection()
{
    int[] source = new int[3]; // Actual values won’t be relevant
    int count = 0;
    var query = source.Select(x => count++);
    query.AssertSequenceEqual(0, 1, 2);
    query.AssertSequenceEqual(3, 4, 5);
    count = 10;
    query.AssertSequenceEqual(10, 11, 12);
}

Notice how we’re only calling Select once, but the results of iterating over the results change each time – because the "count" variable has been captured, and is being modified within the projection. Please don’t do things like this.

Thirdly, we can now write query expressions which include both "select" and "where" clauses:

[Test]
public void WhereAndSelect()
{
    int[] source = { 1, 3, 4, 2, 8, 1 };
    var result = from x in source
                 where x < 4
                 select x * 2;
    result.AssertSequenceEqual(2, 6, 4, 2);
}

There’s nothing mind-blowing about any of this, of course – hopefully if you’ve used LINQ to Objects at all, this should all feel very comfortable and familiar.

Let’s implement it!

Surprise surprise, we go about implementing Select in much the same way as Where. Again, I simply copied the implementation file and tweaked it a little – the two methods really are that similar. In particular:

  • We’re using iterator blocks to make it easy to return sequences
  • The semantics of iterator blocks mean that we have to separate the argument validation from the real work. (Since I wrote the previous post, I’ve learned that VB11 will have anonymous iterators, which will avoid this problem. Sigh. It just feels wrong to envy VB users, but I’ll learn to live with it.)
  • We’re using foreach within the iterator blocks to make sure that we dispose of the input sequence iterator appropriately – so long as our output sequence iterator is disposed or we run out of input elements, of course.

I’ll skip straight to the code, as it’s all so similar to Where. It’s also not worth showing you the version with an index – because it’s such a trivial difference.

public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TResult> selector)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (selector == null)
    {
        throw new ArgumentNullException("selector");
    }
    return SelectImpl(source, selector);
}

private static IEnumerable<TResult> SelectImpl<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TResult> selector)
{
    foreach (TSource item in source)
    {
        yield return selector(item);
    }
}

Simple, isn’t it? Again, the real "work" method is even shorter than the argument validation.

Conclusion

While I don’t generally like boring my readers (which may come as a surprise to some of you), this was a pretty humdrum post, I’ll admit. I’ve emphasized "just like Where" several times to the point of tedium very deliberately though – because it makes it abundantly clear that there aren’t really as many tricky bits to understand as you might expect.

Something slightly different next time (which I hope will be in the next few days). I’m not quite sure what yet, but there’s an awful lot of methods still to choose from…

Creative freedom, control, and the balance of power

Stephen Colebourne’s comment on my last blog post (adding 1 month -1 day to January 29th) has knocked me for six. To avoid thinking about how I might implement the preferred behaviour in Noda Time while still using Joda Time’s "engine", I’ve decided to write about something else which has been niggling at me.

For a long time, I’ve espoused the idea of "design for inheritance or prohibit it" – in other words, default to sealing classes and making methods non-virtual unless you have a good reason to do otherwise. I’ve usually attributed this phrase to Josh Bloch writing in Effective Java, but it could well have been around for a long time.

Whenever I’ve espoused this in public, it’s caused disagreement. Not universal disagreement (which would be easy to cope with; if everyone else thinks I’m wrong, that’s a very strong indicator that I’m really missing something) but a fairly even distribution of "absolutely" and "hell no". Most people seem to feel passionately one way or the other. This has led me to water down my line of "the C# designers should have made classes sealed by default" to "the C# designers shouldn’t have included a default for classes being sealed or not – make the developer specify it".

One thing I’ve found interesting is that the split between "make everything non-virtual" and "make everything virtual" isn’t one of "smart" vs "dumb". There are plenty of publicly admired developers on both sides of the fence, and my colleagues are reasonably evenly split too. However, I have detected a correlation in terms of programming preferences around type systems: those who are in favour of making everything virtual by default are generally more likely to also be enthusiastic about dynamic typing. That won’t be universally true of course, but I think one is likely to be a reasonably good predictor of the other.

Ultimately I think it’s about a balance, and different people place different amounts of value on the various pros and cons. It’s also about the relationship between different parties. Different pros and cons affect different parties in different ways.

A relatively inflexible API with a flexible implementation

I’m happy when I know everything that is going on in my code. I interact with other code through obvious dependencies: they are provided to me explicitly. You’re welcome to modify my code’s visible behaviour by implementing those dependencies in different ways, but my code should be fine as long as you abide by the contracts expressed in the dependencies (typically interfaces).

If I call one of my own non-virtual methods from within my code, I know what it’s going to do. If I have two non-virtual methods which could be implemented by one calling the other either way round, then it doesn’t matter which option I pick. I can change my mind later on, and no-one will be any the wiser. All the externally visible behaviour will be exactly the same. I don’t need to document which method calls which – just what the final results are.

If I create an immutable type and seal it, then all the immutability is within my control. If I’ve picked immutable types for my member variables, have Object as a base class, and make sure I don’t mutate anything myself, I’m fine. I can rely on my values being unchanging, and so can my callers. They can cache a value with impunity.

Basically, everything is simple… unless you want to make one of my types behave slightly differently.

Flexibility with the risk of breakage

The above section sounds all nice and rosy… but what if you want to just tweak my type slightly? You only want to override one method – how much harm can that do? You’ve looked at the implementation and seen that nothing actually relies on it working exactly the way it does… and it doesn’t call any other public members. If my type implements an interface including the member you want to tweak, then you could potentially implement the interface and delegate almost all calls to an instance of the original type, but implement that one call differently. Of course, delegation is great in many cases – but can fail when there are complex relationships involved (such as when the delegated instance passes itself to something else). Basically there are identity issues.

It would be much simpler in this case if you could override my method. That might help your testing, or allow you to use my type in a way I’d never anticipated, achieving fabulous things. As Stroustrup wrote, "I wouldn’t like to build a tool that could only do what I had been able to imagine for it." Now I believe there’s a big difference between imagining a "big picture" which my component may be some tiny part of, and imagining a crazy use for the type itself, but the sentiment is still worth considering. Creative freedom is a nice thing to have, after all… who am I to stop you from achieving greatness?

The downside is that you’re opening yourself to the risk of your code breaking if I change my implementation details in another release. Maybe it would only mean your tests breaking – maybe it’s your production code. (While I don’t want to put too many words in the mouths of those who hold the opposite opinion to me, I believe a lot of their reason for wanting to be able to override methods is to make testing easier. Personally I prefer to construct test doubles which implement interfaces directly, but I do understand that’s not always feasible – especially if the component in question hasn’t been designed with testability in mind to start with.)

In many cases there’s genuinely little risk of that actually happening… but how tolerant are you of that risk?

Risk evaluation and propagation

When I wrote about what might break if the library producer changes their code, I mentioned your production code and your test code. There’s a much nastier risk though: you break someone else’s code which is relying on yours.

Suppose I produce library X, and you use it in library Y. Now Joe Developer uses both of our libraries in his application… except he uses a new version of X. Maybe it’s a bug-fix version, which is meant to have no externally-visible changes… except it changes how it uses its own methods, in a way which will break if you’ve overridden some methods in a particular way… and you’ve done so in library Y. As far as Joe Developer is concerned, his combination of X + Y is broken. Whose fault is it?

  • Mine for changing the behaviour of X in a seemingly sensible way?
  • Yours for overriding a member of X in Y in a way I hadn’t anticipated?
  • Joe’s for using a version of X which you hadn’t developed Y against?

Maybe all three. The trouble is, as the developer of Y you have no way of knowing how likely it is that I’ll change the details of my implementation in "seemingly harmless" ways. Indeed, you may even have performed some testing of Y against the version of X that Joe is using… but maybe Joe’s overriding some other members of the types from X and Y in ways that neither you nor I expected… and the combination could be complex to work out.

Now this all sounds like doom and gloom – but you need to remember that there must have been reasons for overriding those members to start with. Achieving the same goals without using inheritance could certainly have been considerably more complex, and introduced other bugs along the way. Using inheritance could have been a big win all round, right up until the point where everything screwed up… at which point it may still be recoverable, or it could be a knot which can’t be untied. You probably won’t know until the breakage happens, and you probably can’t accurately gauge the likelihood of it happening in the first place. It may well never happen.

Three options as library providers and consumers

It seems to me that when you’re building an API, there are three broad options available (obviously with nuanced positions between them):

  • Make every type unsealed, and every method virtual – but don’t make any guarantees about what will happen if someone overrides methods.
  • Make every type unsealed and every method virtual – but document/guarantee every internal interaction, so that anyone deriving from your class can predict how it will behave.
  • Make everything sealed or non-virtual unless you can foresee a reason for overriding it. Document what sort of overriding you expect to handle, and where the overridden methods will be called.

As the consumer of an API, you have various choices too:

  • Rely on undocumented behaviour, betting that you’ll save more time by doing so than you’ll lose by fixing breakage later
  • Only rely on documented behaviour when calling code, but rely on undocumented behaviour when overriding code, as this is typically less well documented anyway (very few APIs will specify exactly what’s deemed acceptable)
  • Only rely on documented behaviour

While these options are reasonably easy to describe, they again miss the oh-so-common situation: I’m consuming someone else’s types, but providing my own types to other developers. This mixed behaviour is where a lot of the complexity comes in, increasing the risk of breakage and increasing the cost of fixing the problem.

Conclusion

I still believe that designing for inheritance or prohibiting it makes sense if you want to provide a robust library which makes it hard for the consumer to abuse. However, I appreciate that others want the ability to abuse a library – and are willing to pay the price for that down the line. I’m concerned by the "3 party" scenario though – where another developer can shoot your foot off by abusing my code.

Sadly, I can’t see this long-running argument coming any closer to resolution. Better mix-in support within C# would at least help, I believe – but delegation is no panacea either.

I want to be a pragmatic developer: I dislike the dogmatic prohibition of convenient practices for the sake of purely theoretical reasons as much as the next person… and I genuinely can see where it can be a pain not being able to override behaviour at times. However, I have yet to be convinced that a gung-ho, "It probably won’t break, honest!" attitude is really a good option in the long term. I hope I’m gaining an increasingly open mind though – and I hope that at least by discussing this from slightly less religious viewpoints than normal, both camps can learn something from each other.

The importance of context, and a question of explicitness

(Just to be clear: I hate the word "explicitness". It reminds me of Rowan Atkinson as Marcus Browning MP, saying we should be purposelessnessless. But I can’t think of anything better here.)

For the last few days, I’ve been thinking about context – in the context of C# 5’s proposed asynchronous functionality. Now many of the examples which have been presented have been around user interfaces, with the natural restriction of doing all UI operations on the UI thread. While that’s all very well, I’m more interested in server-side programming, which has a slightly different set of restrictions and emphases.

Back in my post about configuring waiting, I talked about the context of where execution would take place – both for tasks which require their own thread and for the continuations. However, thinking about it further, I suspect we could do with richer context.

What might be included in a context?

We’re already used to the idea of using context, but we’re not always aware of it. When trying to service a request on a server, some or all of the following may be part of our context:

  • Authentication information: who are we acting as? (This may not be the end user, of course. It may be another service who we trust in some way.)
  • Cultural information: how should text destined for an end user be rendered? What other regional information is relevant?
  • Threading information: as mentioned before, what threads should be used both for "extra" tasks and continuations? Are we dealing with thread affinity?
  • Deadlines and cancellation: the overall operation we’re trying to service may have a deadline, and operations we create may have their own deadlines too. Cancellation tokens in TPL can perform this role for us pretty easily.
  • Logging information: if the logs need to tie everything together, there may be some ID generated which should be propagated.
  • Other request information: very much dependent on what you’re doing, of course…

We’re used to some of this being available via properties such as CultureInfo.CurrentCulture and HttpContext.Current – but those are tied to a particular thread. Will they be propagated to threads used for new tasks or continuations? Historically I’ve found that documentation has been very poor around this area. It can be very difficult to work out what’s going to happen, even if you’re aware that there’s a potential problem in the first place.
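Here’s a quick sketch of the sort of question I mean – whether the task body below still sees "fr-FR" is exactly the kind of detail which is hard to find documented:

using System;
using System.Globalization;
using System.Threading;
using System.Threading.Tasks;

class CultureContextDemo
{
    static void Main()
    {
        Thread.CurrentThread.CurrentCulture = new CultureInfo("fr-FR");
        Task task = Task.Factory.StartNew(() =>
        {
            // This runs on a thread pool thread - does it see "fr-FR",
            // or the pool thread's own default culture?
            Console.WriteLine(CultureInfo.CurrentCulture.Name);
        });
        task.Wait();
    }
}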

Explicit or implicit?

It’s worth considering what the above items have in common. Why did I include those particular pieces of information but not others? How can we avoid treating them as ambient context in the first place?

Well, fairly obviously we can pass all the information we need along via method arguments. C# 5’s async feature actually makes this easier than it was before (and much easier than it would have been without anonymous functions) because the control flow is simpler. There should be fewer method calls, each of which would require decoration with all the contextual information required.

However, in my experience that becomes quite problematic in terms of separation of concerns. If you imagine the request as a tree of asynchronous operations working down from a top node (whatever code initially handles the request), each node has to provide all the information required for all the nodes within its subtree. If some piece of information is only required 5 levels down, it still needs to be handed on at each level above that.

The alternative is to use an implicit context – typically via static methods or properties which have to do the right thing, typically based on something thread-local. The context code itself (in conjunction with whatever is distributing the work between threads) is responsible for keeping track of everything.

It’s easy to point out pros and cons to both approaches:

  • Passing everything through methods makes the dependencies very obvious
  • Changes to "lower" tasks (even for seemingly innocuous reasons such as logging) end up causing chains of changes higher up the task tree – possibly to developers working on completely different projects, depending on how your components work
  • It feels like there’s a lot of work for very little benefit in passing everything explicitly through many layers of tasks
  • Implicit context can be harder to unit test elegantly – as is true of so many things using static calls
  • Implicit context requires everyone to use the same context. It’s no good high level code indicating which thread pool to use in one setting when some lower level code is going to use a different context

Ultimately it feels like a battle between purity and pragmatism: being explicit helps to keep your code purer, but it can mean a lot of fluff around your real logic, just to maintain the required information to pass onward. Different developers will have different approaches to this, but I suspect we want to at least keep the door open to both designs.

The place of Task/Task<T>

Even if Task/Task<T> can pass on the context for scheduling, what do we do about other information (authentication etc)? We have types like ThreadLocal<T> – in a world where threads are more likely to be reused, and aren’t really our unit of asynchrony, do we effectively need a TaskLocal<T>? Can context within a task be pushed and automatically popped, to allow one subtree to "override" the context for its nodes, while another subtree works with the original context?
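To make the idea concrete, here’s the kind of shape I’m imagining – entirely hypothetical, of course; nothing like this exists in the TPL:

// Purely hypothetical sketch - not a real TPL type.
public class TaskLocal<T>
{
    // The value as seen by the logical task we're currently running in,
    // rather than by the physical thread.
    public T Value { get; set; }

    // Overrides the value for the current subtree of tasks; disposing the
    // returned token would pop the override and restore the previous value.
    public IDisposable Push(T newValue)
    {
        throw new NotImplementedException();
    }
}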

I’ve been trying to think about whether this can be provided in "userland" code instead of in the TPL itself, but I’m not sure it can, easily… at least not without reinventing a lot of the existing code, which is never a good idea when it’s tricky parallelization code.

Should this be general support, or would it be okay to stick to just TaskScheduler.Current, leaving developers to pass other context explicitly?

Conclusion

These are thoughts which I’m hoping will be part of a bigger discussion. I think it’s something the community should think about and give feedback to Microsoft on well before C# 5 (and whatever framework it comes with) ships. I have lots of contradictory feelings about the right way to go, and I’m fully expecting comments to have mixed opinions too.

I’m sure I’ll be returning to this topic as time goes on.

Addendum (March 27th 2012)

Lucian Wischik recently mailed me about this post, to mention that F#’s support for async has had the ability to retain explicit context from the start. It’s also more flexible than the C# async support – effectively, it allows you to swap out AsyncTaskMethodBuilder etc for your own types, so you don’t always have to go via Task/Task<T>. I’ll take Lucian’s word for that, not knowing much about F# myself. One day…

Multiple exceptions yet again… this time with a resolution

I’ve had a wonderful day with Mads Torgersen, and amongst other things, we discussed multiple exceptions and the way that the default awaiter for Task<T> handles an AggregateException by taking the first exception and discarding the rest.

I now have a much clearer understanding of why this is the case, and also a workaround for the cases where you really want to avoid that truncation.

Why truncate in the first place?

(I’ll use the term "truncate" throughout this post to mean "when an AggregateException with at least one nested exception is caught by EndAwait, throw the first nested exception instead". It’s just a shorthand.)

Yesterday’s post on multiple exceptions showed what you got if you called Wait() on a task returned from an async method. You still get an AggregateException, so why bother to truncate it?

Let’s consider a slightly different situation: where we’re awaiting an async method that throws an exception, and you want to be able to catch some specific exception that will be thrown by that asynchronous method. Imagine we used my NaiveAwaiter class. That would mean we would have to catch AggregateException, check whether one of those exceptions was actually present, and then handle that. There’d then be an open question about what to do if there were other exceptions as well… but that would be a relatively rare case. (Remember, we’re talking about multiple "top level" exceptions within the AggregateException – not just one exception nested in another, nested in another etc.)
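To spell that out, here’s roughly what the FrobAsync method from the example below would have to look like – just a sketch, and one which assumes a using directive for System.Linq:

private static async Task FrobAsync()
{
    Task fuse = DelayedThrow(500);
    try
    {
        await fuse;
    }
    catch (AggregateException aggregate)
    {
        // Hunt for the exception we actually know how to handle...
        BangException bang = aggregate.InnerExceptions
                                      .OfType<BangException>()
                                      .FirstOrDefault();
        if (bang == null)
        {
            throw;
        }
        Console.WriteLine("Caught it! ({0})", bang.Message);
        // ... and what about any other exceptions alongside it?
    }
}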

With the current awaiter behaviour, you can catch the exception exactly as you would have done in synchronous code. Here’s an example:

using System;
using System.Threading.Tasks;
using System.Collections.Generic;

public class BangException : Exception 
{
    public BangException(string message) : base(message) {}
}

public class Test
{
    public static void Main()
    {
        FrobAsync().Wait();
    }
    
    private static async Task FrobAsync()
    {
        Task fuse = DelayedThrow(500);
        try
        {
            await fuse;
        }
        catch (BangException e)
        {
            Console.WriteLine("Caught it! ({0})", e.Message);
        }
    }
    
    static async Task DelayedThrow(int delayMillis) 
    { 
        await TaskEx.Delay(delayMillis);
        throw new BangException("Went bang after " + delayMillis + "ms");
    }
}

Nice and clean exception handling… assuming that the task we awaited asynchronously didn’t have multiple exceptions. (Note the improved DelayedThrow method, by the way. Definitely cleaner than my previous version.)

This aspect of "the async code looks like the synchronous code" is the important bit. One of the key aims of the language feature is to make it easy to write asynchronous code as if it were synchronous – because that’s what we’re used to, and what we know how to reason about. We’re fairly used to the idea of catching one exception… not so much on the "multiple things can go wrong at the same time" front.

So that handles the primary case where we really expect to only have one exception (if any) because we’re only performing one job.

What about cases where multiple exceptions are somewhat expected?

Let’s go back to the case where we really want to propagate multiple exceptions. I think it’s reasonable that this should be an explicit opt-in, so let’s think about an extension method. For the sake of simplicity I’ll use Task – in real life we’d want Task<T> as well, of course. So for example, this line:

await TaskEx.WhenAll(t1, t2);

would become this:

await TaskEx.WhenAll(t1, t2).PreserveMultipleExceptions();

(Yes, the name is too long… but you get the idea.)

Now, there are two ways we could make this work:

  • We could make the extension method return something which had a GetAwaiter method, returning something which in turn had BeginAwait and EndAwait methods. This means making sure we get all of the awaiter code right, of course – and the returned value has little meaning outside an await expression.
  • We could wrap the task in another task, and use the existing awaiter code. We know that the EndAwait extension method associated with Task (and Task<T>) will go into a single level of AggregateException – but I don’t believe it will do any more than that. So if it’s going to strip one level of exception aggregation off, all we need to do is add another level.

According to Mads, the latter of these is easier. Let’s see if he’s right.

We need an extension method on Task, and we’re going to return Task too. How can we implement that?

  • We can’t await the task, because that will strip the exception before we get to it.
  • We can’t write an async task but call Wait() on the original task, because that will block immediately – we still want to be async.
  • We can use a TaskCompletionSource<T> to build a task. We don’t care about the actual result, so we’ll use TaskCompletionSource<object>. This will actually build a Task<object>, but we’ll return it as a Task anyway, and use a null result if it completes with no exception. (This was Mads’ suggestion.)

So, we know how to build a Task, and we’ve been given a Task – how do we hook the two together? The answer is to ask the original task to call us back when it completes, via the ContinueWith method. We can then set the result of our task accordingly. Without further ado, here’s the code:

public static Task PreserveMultipleExceptions(this Task originalTask)
{
    var tcs = new TaskCompletionSource<object>();
    originalTask.ContinueWith(t => {
        switch (t.Status) {
            case TaskStatus.Canceled:
                tcs.SetCanceled();
                break;
            case TaskStatus.RanToCompletion:
                tcs.SetResult(null);
                break;
            case TaskStatus.Faulted:
                tcs.SetException(originalTask.Exception);
                break;
        }
    }, TaskContinuationOptions.ExecuteSynchronously);
    return tcs.Task;
}

This was thrown together in 5 minutes (in the middle of a user group talk by Mads) so it’s probably not as robust as it might be… but the idea is that when the original task completes, we just piggy-back on the same thread very briefly to make our own task respond appropriately. Now when some code awaits our returned task, we’ll add an extra wrapper of AggregateException on top, ready to be unwrapped by the normal awaiter.

Note that the extra wrapper is actually added for us really, really easily – we just call TaskCompletionSource<T>.SetException with the original task’s AggregateException. Usually we’d call SetException with a single exception (like a BangException) and the method automatically wraps it in an AggregateException – which is exactly what we want.

So, how do we use it? Here’s a complete sample (just add the extension method above):

using System;
using System.Threading.Tasks;

public class BangException : Exception
{
    public BangException(string message) : base(message) {}
}

public class Test
{
    public static void Main()
    {
        FrobAsync().Wait();
    }
    
    public static async Task FrobAsync()
    {
        try
        {
            Task t1 = DelayedThrow(500);
            Task t2 = DelayedThrow(1000);
            Task t3 = DelayedThrow(1500);
            
            await TaskEx.WhenAll(t1, t2, t3).PreserveMultipleExceptions();
        }
        catch (AggregateException e)
        {
            Console.WriteLine("Caught {0} aggregated exceptions", e.InnerExceptions.Count);
        }
        catch (Exception e)
        {
            Console.WriteLine("Caught non-aggregated exception: {0}", e.Message);
        }
    }
    
    static async Task DelayedThrow(int delayMillis)  
    {  
        await TaskEx.Delay(delayMillis); 
        throw new BangException("Went bang after " + delayMillis + "ms"); 
    }
}

The result is what we were after:

Caught 3 aggregated exceptions

The blanket catch (Exception e) block is there so you can experiment with what happens if you remove the call to PreserveMultipleExceptions – in that case we get the original behaviour of a single BangException being caught, and the others discarded.

Conclusion

So, we now have answers to both of my big questions around multiple exceptions with async:

  • Why is the default awaiter truncating exceptions? To make asynchronous exception handling look like synchronous exception handling in the common case.
  • What can we do if that’s not the behaviour we want? Either write our own awaiter (whether that’s invoked explicitly or implicitly via "extension method overriding" as shown yesterday) or wrap the task in another one to wrap exceptions.

I’m happy again. Thanks Mads :)

Propagating multiple async exceptions (or not)

In an earlier post, I mentioned that in the CTP, an asynchronous method will throw away anything other than the first exception in an AggregateException thrown by one of the tasks it’s waiting for. Reading the TAP documentation, it seems this is partly expected behaviour and partly not. TAP claims (in a section about how "await" is achieved by the compiler):

It is possible for a Task to fault due to multiple exceptions, in which case only one of these exceptions will be propagated; however, the Task’s Exception property will return an AggregateException containing all of the errors.

Unfortunately, that appears not to be the case. Here’s a test program demonstrating the difference between an async method and a somewhat-similar manually written method. The full code is slightly long, but here are the important methods:

static async Task ThrowMultipleAsync()
{
    Task t1 = DelayedThrow(500);
    Task t2 = DelayedThrow(1000);
    await TaskEx.WhenAll(t1, t2);
}

static Task ThrowMultipleManually()
{
    Task t1 = DelayedThrow(500);
    Task t2 = DelayedThrow(1000);
    return TaskEx.WhenAll(t1, t2);
}

static Task DelayedThrow(int delayMillis)
{
    return TaskEx.Run(delegate {
        Thread.Sleep(delayMillis);
        throw new Exception("Went bang after " + delayMillis);
    });
}
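The full code includes a reporting helper which I haven’t shown; it could look something like this – a sketch using my own made-up WaitAndReport method rather than the actual code:

static void WaitAndReport(string description, Task task)
{
    Console.WriteLine("Waiting for " + description);
    try
    {
        task.Wait();
    }
    catch (AggregateException e)
    {
        Console.WriteLine("Thrown exception: {0} error(s):", e.InnerExceptions.Count);
        foreach (Exception nested in e.InnerExceptions)
        {
            Console.WriteLine(nested.Message);
        }
    }
    Console.WriteLine();
    // Assumes the task actually faulted; Exception would be null otherwise.
    Console.WriteLine("Task exception: {0} error(s):", task.Exception.InnerExceptions.Count);
    foreach (Exception nested in task.Exception.InnerExceptions)
    {
        Console.WriteLine(nested.Message);
    }
}

Calling WaitAndReport("From async method", ThrowMultipleAsync()) and WaitAndReport("From manual method", ThrowMultipleManually()) would produce output of the shape shown below.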

The difference is that the async method is generating an extra task, instead of returning the task from TaskEx.WhenAll. It’s waiting for the result of WhenAll itself (via EndAwait). The results show one exception being swallowed:

Waiting for From async method
Thrown exception: 1 error(s):
Went bang after 500

Task exception: 1 error(s):
Went bang after 500

Waiting for From manual method
Thrown exception: 2 error(s):
Went bang after 500
Went bang after 1000

Task exception: 2 error(s):
Went bang after 500
Went bang after 1000

The fact that the "manual" method still shows two exceptions means we can’t blame WhenAll – it must be something to do with the async code. Given the description in the TAP documentation, I’d expect (although not desire) the thrown exception to just be a single exception, but the returned task’s exception should have both in there. That’s clearly not the case at the moment.

Waiter! There’s an exception in my soup!

I can think of one reason why we’d perhaps want to trim down the exception to a single one: if we wanted to remove the aggregation aspect entirely. Given that the async method always returns a Task (or void), I can’t see how that’s feasible anyway… a Task will always throw an AggregateException if its underlying operation fails. If it’s already throwing an AggregateException, why restrict it to just one?

My guess is that this makes it easier to avoid the situation where one AggregateException would contain another, which would contain another, etc.

To demonstrate this, let’s try to write our own awaiting mechanism, instead of using the one built into the async CTP. GetAwaiter() is an extension method, so we can just make our own extension method which has priority over the original one. I’ll go into more detail about that in another post, but here’s the code:

public static class TaskExtensions
{
    public static NaiveAwaiter GetAwaiter(this Task task)
    {
        return new NaiveAwaiter(task);
    }
}

public class NaiveAwaiter
{
    private readonly Task task;

    public NaiveAwaiter(Task task)
    {
        this.task = task;
    }

    public bool BeginAwait(Action continuation)
    {
        if (task.IsCompleted)
        {
            return false;
        }
        task.ContinueWith(_ => continuation());
        return true;
    }

    public void EndAwait()
    {
        task.Wait();
    }
}

Yes, it’s almost the simplest implementation you could come up with. (Hey, we do check whether the task is already completed…) There’s no scheduler or SynchronizationContext magic… and importantly, EndAwait does nothing with any exceptions. If the task throws an AggregateException when we wait for it, that exception is propagated to the generated code responsible for the async method.

So, what happens if we run exactly the same client code with these classes present? Well, the results for the first part are different:

Waiting for From async method
Thrown exception: 1 error(s):
One or more errors occurred.

Task exception: 1 error(s):
One or more errors occurred.

We have to change the formatting somewhat to see exactly what’s going on – because we now have an AggregateException containing an AggregateException. The previous formatting code simply printed out how many exceptions there were, and their messages. That wasn’t an issue because we immediately got to the exceptions we were throwing. Now we’ve got an actual tree. Just printing out the exception itself results in huge gobbets of text which are unreadable, so here’s a quick and dirty hack to provide a bit more formatting:

static string FormatAggregate(AggregateException e)
{
    StringBuilder builder = new StringBuilder();
    FormatAggregate(e, builder, 0);
    return builder.ToString();
}

static void FormatAggregate(AggregateException e, StringBuilder builder, int level)
{
    string padding = new string(' ', level);
    builder.AppendFormat("{0}AggregateException with {1} nested exception(s):", padding, e.InnerExceptions.Count);
    builder.AppendLine();
    foreach (Exception nested in e.InnerExceptions)
    {
        AggregateException nestedAggregate = nested as AggregateException;
        if (nestedAggregate != null)
        {
            FormatAggregate(nestedAggregate, builder, level + 1);
            builder.AppendLine();
        }
        else
        {
            builder.AppendFormat("{0} {1}: {2}", padding, nested.GetType().Name, nested.Message);
            builder.AppendLine();
        }
    }
}

Now we can see what’s going on better:

AggregateException with 1 nested exception(s):
 AggregateException with 2 nested exception(s):
  Exception: Went bang after 500
  Exception: Went bang after 1000

Hooray – we actually have all our exceptions, eventually… but they’re nested. Now if we introduce another level of nesting – for example by creating an async method which just waits on the task created by ThrowMultipleAsync – we end up with something like this:

AggregateException with 1 nested exception(s):
 AggregateException with 1 nested exception(s):
  AggregateException with 2 nested exception(s):
   Exception: Went bang after 500
   Exception: Went bang after 1000
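
For concreteness, that extra level of nesting could be introduced with something as simple as this (a sketch – the post doesn’t show it):

static async Task ThrowMultipleNestedAsync()
{
    // Just adds one more async frame on top of ThrowMultipleAsync.
    await ThrowMultipleAsync();
}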

You can imagine that for a deep stack trace of async methods, this could get messy really quickly.

However, I don’t think that losing the information is really the answer. There’s already the Flatten method in AggregateException which will flatten the tree appropriately. I’d be reasonably happy for the exceptions to be flattened at any stage, but I really don’t like the behaviour of losing them.
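For example, here’s one way Flatten could be used when reporting failures – ReportAll is just a sketch of my own, not code from the async CTP:

static void ReportAll(Task task)
{
    try
    {
        task.Wait();
    }
    catch (AggregateException e)
    {
        // Flatten collapses any nested AggregateExceptions into a single
        // level, so every "real" exception shows up once, however deep
        // the chain of async methods was.
        foreach (Exception nested in e.Flatten().InnerExceptions)
        {
            Console.WriteLine("{0}: {1}", nested.GetType().Name, nested.Message);
        }
    }
}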

It does get complicated by how the async language feature has to handle exceptions, however. Only one exception can ever be thrown at a time, even though a task can have multiple exceptions set on it. One option would be for the autogenerated code to handle AggregateException differently, setting all the nested exceptions separately (in the single task which has been returned) rather than either setting the AggregateException which causes nesting (as we’ve seen above) or relying on the awaiter picking just one exception (as is currently the case). It’s definitely a decision I think the community should get involved with.

Conclusion

As we’ve seen, the current behaviour of async methods doesn’t match the TAP documentation or what I’d personally like.

This isn’t down to the language features, but it’s the default behaviour of the extension methods which provide the "awaiter" for Task. That doesn’t mean the language aspect can’t be changed, however – some responsibility could be moved from awaiters to the generated code. I’m sure there are pros and cons each way – but I don’t think losing information is the right approach.

Next up: using extension method resolution rules to add diagnostics to task awaiters.

Configuring waiting

One of the great things about working at Google is that almost all of my colleagues are smarter than me. True, they don’t generally know as much about C#, but they know about language design, and they sure as heck know about distributed/parallel/async computing.

One of the great things about having occasional contact with the C# team is that when Mads Torgersen visits London later in the week, I can introduce him to these smart colleagues. So, I’ve been spreading the word about C# 5’s async support and generally encouraging folks to think about the current proposal so they can give feedback to Mads.

One particularly insightful colleague has persistently expressed a deep concern over who gets to control how the asynchronous execution works. This afternoon, I found some extra information which looks like it hasn’t been covered much so far which may allay his fears somewhat. It’s detailed in the Task-based Asynchronous Pattern documentation, which I strongly recommend you download and read right now.

More than ever, this post is based on documentation rather than experimentation. Please take with an appropriately large grain of salt.

What’s the problem?

In a complex server handling multiple types of request and processing them asynchronously – with some local CPU-bound tasks and other IO-bound tasks – you may well not want everything to be treated equally. Some operations (health monitoring, for example) may require high priority and a dedicated thread pool, some may be latency sensitive but support load balancing easily (so it’s fine to have a small pool of medium/high priority tasks, trusting load balancing to avoid overloading the server) and some may be latency-insensitive but be server-specific – a pool of low-priority threads with a large queue may be suitable here, perhaps.

If all of this is going to work, you need to know for each asynchronous operation:

  • Whether it will take up a thread
  • What thread will be chosen, if one is required (a new one? one from a thread pool? which pool?)
  • Where the continuation will run (on the thread which initiated the asynchronous operation? the thread the asynchronous operation ran on? a thread from a particular pool?)

In many cases reusable low-level code doesn’t know this context… but in the async model, it’s that low-level code which is responsible for actually starting the task. How can we reconcile the two requirements?

Controlling execution flow from the top down

Putting the points above into the concrete context of the async features of C# 5:

  • When an async method is called, it will start on the caller’s thread
  • When it creates a task (possibly as the target of an await expression) that task has control over how it will execute
  • The awaiter created by an await expression has control (or at the very least significant influence) over how and where the next part of the async method (the continuation) is executed
  • The caller gets to decide what they will do with the returned task (assuming there is one) – it may be the target of another await expression, or it may be used more directly without any further use of the new language features

Whether a task requires an extra thread really is pretty much up to the task. A task will be either IO-bound, CPU-bound, or a mixture (perhaps IO-bound to fetch data, and then CPU-bound to process it). As far as I can tell, it’s assumed that IO-bound asynchronous tasks will all use IO completion ports, leaving no real choice available. On other platforms there may be other choices – there may be multiple IO channels, for example, some reserved for higher-priority traffic than others. Although the TAP doesn’t explicitly call this out, I suspect that other platforms could create a similar concept of context to the one described below, but for IO-specific operations.

The two concepts that TAP appears to rely on (and I should be absolutely clear that I could be misreading things; I don’t know as much about the TPL that all of this is based on as I’d like) are a SynchronizationContext and a TaskScheduler. The exact difference between the two remains slightly hazy to me, as both give control over which thread delegates are executed on – but I get the feeling that SynchronizationContext is aimed at describing the thread you should return to for callbacks (continuations) and TaskScheduler is aimed at describing the thread you should run work on – whether that’s new work or getting back for a continuation. (In other words, TaskScheduler is more general than SynchronizationContext – so you can use it for continuations, but you can also use it for other things.)
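
To make the distinction concrete, here’s a rough sketch of my own – FetchResult and DisplayResult are hypothetical methods, and I’m assuming we start on a thread which actually has a SynchronizationContext (a UI thread, say):

// SynchronizationContext: "this is where to come back to for callbacks".
// FetchResult and DisplayResult are hypothetical methods.
SynchronizationContext context = SynchronizationContext.Current;
ThreadPool.QueueUserWorkItem(ignored =>
{
    string result = FetchResult();
    // Post the continuation back to the captured context
    context.Post(state => DisplayResult((string) state), result);
});

// TaskScheduler: "this is where to run work" - usable for continuations too
TaskScheduler uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();
Task.Factory.StartNew<string>(FetchResult)
            .ContinueWith(t => DisplayResult(t.Result), uiScheduler);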

One vital point is that although these aren’t enforced, they are designed to be the easiest way to carry out work. If there are any places where that isn’t true, that probably represents an issue. For example, the TaskEx.Run method (which will be Task.Run eventually) always uses the default TaskScheduler rather than the current TaskScheduler – so tasks started in that way will always run on the system thread pool. I have doubts about that decision, although it fits in with the general approach of TPL to use a single thread pool.
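
A quick sketch of the difference, using the eventual Task.Run name (DoWork here is a hypothetical CPU-bound method):

// Always schedules on TaskScheduler.Default - the system thread pool -
// whatever scheduler we're currently running under
Task.Run(() => DoWork());

// Uses TaskScheduler.Current, so it honours a custom "current" scheduler
// if there is one
Task.Factory.StartNew(() => DoWork());

// Opting back into the thread pool explicitly
Task.Factory.StartNew(() => DoWork(), CancellationToken.None,
                      TaskCreationOptions.None, TaskScheduler.Default);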

If everything representing an async operation follows the TAP, it should be possible to control how things are scheduled "from this point downwards" in async methods.

ConfigureAwait, SwitchTo, Yield

Various "plain static" and extension methods have been provided to make it easy to change your context within an async method.

SwitchTo allows you to change your context to the ThreadPool or a particular TaskScheduler or Dispatcher. You may not need to do any more work on a particular high-priority thread until you’ve actually got your final result – so you’re happy with the continuations being executed either "inline" with the asynchronous tasks you’re executing, or on a random thread pool thread (perhaps from some specific pool). This may also allow the new CPU-bound tasks to be scheduled appropriately too (I thought it did, but I’m no longer sure). Once you’ve got all your ducks in a row, then you can switch back for the final continuation which needs to provide the results on your original thread.

ConfigureAwait takes an existing task and returns a TaskAwaiter – essentially allowing you to control just the continuation part.

Yield does exactly what it sounds like – yields control temporarily, basically allowing for cooperative multitasking by allowing other work to make progress before continuing. I’m not sure that this one will be particularly useful, personally – it feels a little too much like Application.DoEvents. I dare say there are specialist uses though – in particular, it’s cleaner than Application.DoEvents because it really is yielding, rather than running the message pump in the current stack.

All of these are likely to be used in conjunction with await. For example (these are not expected to all be in the same method, of course!):

// Continue in custom context (may affect where CPU-bound tasks are run too)
await customScheduler.SwitchTo();

// Now get back to the dispatcher thread to manipulate the UI
await control.Dispatcher.SwitchTo();

var task = new WebClient().DownloadStringTaskAsync(url);
// Don’t bother continuing on this thread after downloading; we don’t
// care for the next bit.
await task.ConfigureAwait(flowContext: false);

foreach (Job job in jobs)
{
    // Do some work that has to be done in this thread
    job.Process();

    // Let someone else have a turn – we may have a lot to
    // get through.
    // This will be Task.Yield eventually
    await TaskEx.Yield();
}

Is this enough?

My gut feeling is that this will give enough control over the flow of the application if:

  • The defaults in TAP are chosen appropriately so that the easiest way of starting a computation is also an easily "top-down-configurable" one
  • The top-level application programmer pays attention to what they’re doing, and configures things appropriately
  • Each component programmer lower down pays attention to the TAP and doesn’t do silly things like starting arbitrary threads themselves

In other words, everyone has to play nicely. Is that feasible in a complex system? I suspect it has to be, really. If you have any "rogue" elements, they’ll manage to screw things up in any system which is flexible enough to meet real-world requirements.

My colleague’s concern is (I think – I may be misrepresenting him) largely that the language shouldn’t be neutral about how the task and continuation are executed. It should allow or even force the caller to provide context. That would make the context hard to ignore lower down. The route I believe Microsoft has chosen is to do this implicitly by propagating context through the "current" SynchronizationContext and TaskScheduler, in the belief that developers will honour them.

We’ll see.

Conclusion

A complex asynchronous system is like a concerto played by an orchestra. Each musician is responsible for keeping time, but they are given direction from the conductor. It only takes one viola player who wants to play fast and loud to ruin the whole effect – so everyone has to behave. How do you force the musicians to watch the conductor? How much do you trust them? How easy is it to conduct in the first place? These are the questions which are hard to judge from documentation, frankly. I’m currently optimistic that by the time C# 5 is actually released, the appropriate balance will have been struck, the default tempo will be appropriate, and we can all listen to some beautiful music. In the key of C#, of course.

Evil code – overload resolution workaround

Another quick break from asynchrony, because I can’t resist blogging about this thoroughly evil idea which came to me on the train.

Your task: to write three static methods such that this C# 4 code:

static void Main() 
{ 
    Foo<int>(); 
    Foo<string>(); 
    Foo<int?>(); 
}

resolves one call to each of them – and will act appropriately for any non-nullable value type, reference type, and nullable value type respectively.

You’re not allowed to change anything in the Main method above, and they have to just be methods – no tricks using delegate-type fields, for example. (I don’t know whether such tricks would help you or not, admittedly. I suspect not.) It can’t just call one method which then determines other methods to call at execution time – we want to resolve this at compile time.

If you want to try this for yourself, look away now. I’ve deliberately included an attempt which won’t work below, so that hopefully you won’t see the working solution accidentally.

The simple (but failed) attempt

You might initially want to try this:

class Test 
{ 
    static void Foo<T>() where T : class {} 

    static void Foo<T>() where T : struct {} 

    // Let's hope the compiler thinks this is "worse"
    // than the others because it has no constraints 
    static void Foo<T>() {} 

    static void Main() 
    { 
        Foo<int>();
        Foo<string>();
        Foo<int?>();
    }  
}

That’s no good at all. I wrote about why it’s no good in this very blog, last week. The compiler only checks generic constraints on the type parameters after overload resolution.

Fail.

First steps towards a solution

You may remember that the compiler does check that parameter types make sense when working out the candidate set. That gives us some hope… all we’ve got to do is propagate our desired constraints into parameters.

Ah… but we’re calling a method with no arguments. So there can’t be any parameters, right?

Wrong. We can have an optional parameter. Okay, now we’re getting somewhere. What type of parameter can we use to make the method valid only if the generic type parameter T is a non-nullable value type? The simplest option which occurs to me is to use Nullable<T> – that has an appropriate constraint. So, we end up with a method like this:

static void Foo<T>(T? ignored = default(T?)) where T : struct {}

Okay, so that’s the first call sorted out – it’s valid for the above method, but neither of the other calls is.

What about the reference type parameter? That’s slightly trickier – I can’t think of any common generic types in the framework which require their type parameters to be reference types. There may well be one, of course – it just doesn’t spring to mind. Fortunately, it’s easy to declare such a type ourselves, and then use it in another method:

class ClassConstraint<T> where T : class {} 

static void Foo<T>(ClassConstraint<T> ignored = default(ClassConstraint<T>))
    where T : class {}

Great. Just one more to go. Unfortunately, there’s no constraint which only satisfies nullable value types… Hmm.

The awkwardness of nullable value types

We want to effectively say, “Use this method if neither of the other two work – but use the other methods in preference.” Now if we weren’t already using optional parameters, we could potentially do it that way – by introducing a single optional parameter, we could have a method which was still valid for the other calls, but would be deemed “worse” by overload resolution. Unfortunately, overload resolution takes a binary view of optional parameters: either the compiler is having to fill in some parameters itself, or it’s not. It doesn’t think that filling in two parameters is “worse” than only filling in one.
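
To see that binary view in action, here’s a small standalone example of my own. Both candidates below need at least one default argument filled in, so neither is deemed better than the other:

class BinaryView
{
    static void Bar(int x = 0) {}

    static void Bar(int x = 0, int y = 0) {}

    static void Main()
    {
        // Filling in two defaults isn't considered "worse" than filling
        // in one: error CS0121 (the call is ambiguous)
        Bar();
    }
}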

Luckily, there’s a way out… inheritance to the rescue! (It’s not often you’ll hear me say that.)

The compiler will always prefer applicable methods in a derived class to applicable methods in a base class, even if they’d otherwise be better. So we can write a parameterless method with no type constraints at all in a base class. We can even keep it as a private method, so long as we make the derived class a nested type within its own base class.

Final solution

This leads to the final code – this time with diagnostics to prove it works:

using System;
class Base 
{ 
    static void Foo<T>() 
    { 
        Console.WriteLine("nullable value type"); 
    }

    class Test : Base 
    { 
        static void Foo<T>(T? ignored = default(T?)) 
            where T : struct 
        { 
            Console.WriteLine("non-nullable value type"); 
        } 

        class ClassConstraint<T> where T : class {} 

        static void Foo<T>(ClassConstraint<T> ignored = default(ClassConstraint<T>))
            where T : class 
        { 
            Console.WriteLine("reference type"); 
        } 

        static void Main() 
        { 
            Foo<int>(); 
            Foo<string>(); 
            Foo<int?>(); 
        } 
    } 
}

And the output…

non-nullable value type 
reference type 
nullable value type

Conclusion

This is possibly the most horrible code I’ve ever written.

Please, please don’t use it in real life. Use different method names or something like that.

Still, it’s a fun little puzzle, isn’t it?

Dreaming of multiple tasks again… with occasional exceptions

Yesterday I wrote about waiting for multiple tasks to complete. We had three asynchronous tasks running in parallel, fetching a user’s settings, reputation and recent activity. Broadly speaking, there were two approaches. First we could use TaskEx.WhenAll (which will almost certainly be folded into the Task class for release):

var settingsTask = service.GetUserSettingsAsync(userId); 
var reputationTask = service.GetReputationAsync(userId); 
var activityTask = service.GetRecentActivityAsync(userId); 

await TaskEx.WhenAll(settingsTask, reputationTask, activityTask); 

UserSettings settings = settingsTask.Result; 
int reputation = reputationTask.Result; 
RecentActivity activity = activityTask.Result;

Second we could just wait for each result in turn:

var settingsTask = service.GetUserSettingsAsync(userId);  
var reputationTask = service.GetReputationAsync(userId);  
var activityTask = service.GetRecentActivityAsync(userId);  
      
UserSettings settings = await settingsTask; 
int reputation = await reputationTask; 
RecentActivity activity = await activityTask;

These look very similar, but actually they behave differently if any of the tasks fails:

  • In the first form we will always wait for all the tasks to complete; if the settings task fails within a millisecond but the recent activity task takes 5 minutes, we’ll be waiting 5 minutes. In the second form we only wait for one at a time, so if one task fails, we won’t wait for any currently-unawaited ones to complete. (Of course if the first two tasks both succeed and the last one fails, the total waiting time will be the same either way.)
  • In the first form we should probably get to find out about the errors from all the asynchronous tasks (see the sketch after this list); in the second form we only see the errors from whichever task fails first.
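
For the second point, the first form does at least let you dig out all of the failures afterwards. Here’s a sketch, assuming the awaited task still exposes the full AggregateException via its Exception property:

var whenAll = TaskEx.WhenAll(settingsTask, reputationTask, activityTask);
try
{
    await whenAll;
}
catch (Exception)
{
    // Awaiting rethrows just one exception, but the task itself may
    // still carry the full set
    if (whenAll.Exception != null)
    {
        foreach (Exception inner in whenAll.Exception.InnerExceptions)
        {
            Console.WriteLine(inner.Message);
        }
    }
}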

The second point is interesting, because in fact it looks like the CTP will throw away all but the first inner exception of an aggregated exception thrown by a Task that’s being awaited. That feels like a mistake to me, but I don’t know whether it’s by design or just due to the implementation not being finished yet. I’m pretty sure this is the same bit of code (in EndAwait for Task and Task<T>) which makes sure that we don’t get multiple levels of AggregateException wrapping the original exception as it bubbles up. Personally I’d like to at least be able to find all the errors that occurred in an asynchronous operation. Occasionally, that would be useful…

… but actually, in most cases I’d really like to just abort the whole operation as soon as any task fails. I think we’re missing a method – something like WhenAllSuccessful. If any operation is cancelled or faulted, the whole lot should end up being cancelled – with that cancellation propagating down the potential tree of async tasks involved, ideally. Now I still haven’t investigated cancellation properly, but I believe that the cancellation tokens of Parallel Extensions should make this all possible. In many cases we really need success for all of the operations – and we would like to communicate any failures back to our caller as soon as possible.

I believe we could write this now – somewhat inefficiently. We could keep a collection of tasks which still haven’t completed, and wait for any of them to complete. At that point, look for all the completed ones in the set (because two could complete at the same time) and see whether any of them have faulted or been cancelled. If so, cancel the remaining operations and rethrow the exception (i.e. set our own task as faulted). If we ever get to the stage where all the tasks have completed – successfully – we just return so that the results can be fetched.
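
Here’s a rough sketch along those lines. WhenAllSuccessful and its CancellationTokenSource parameter are entirely my own invention, I’m using the eventual Task.WhenAny name (TaskEx.WhenAny in the CTP), and I’m assuming all the tasks passed in observe the token from the supplied source:

using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class TaskExtras
{
    public static async Task WhenAllSuccessful(CancellationTokenSource cts,
                                               params Task[] tasks)
    {
        var remaining = new List<Task>(tasks);
        while (remaining.Count > 0)
        {
            // Wait for at least one task to finish...
            await Task.WhenAny(remaining);
            // ... then sweep up everything that's completed, as more than
            // one task may have finished at the same time
            foreach (Task task in remaining.Where(t => t.IsCompleted).ToList())
            {
                if (task.IsFaulted || task.IsCanceled)
                {
                    // Tell the outstanding operations to stop, then
                    // propagate the failure by awaiting the offending task
                    cts.Cancel();
                    await task;
                }
                remaining.Remove(task);
            }
        }
    }
}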

My guess is that this could be written more efficiently by the PFX team though. I’m actually surprised that there isn’t anything in the framework that does this. That usually means that either it’s there and I’ve missed it, or it’s not there for some terribly good reason that I’m too dim to spot. Either way, I’d really like to know.

Of course, all of this could still be implemented as extension methods on tuples of tasks, if we ever get language support for tuples. Hint hint.

Conclusion

It’s often easy to concentrate on the success path and ignore possible failure in code. Asynchronous operations make this even more of a problem, as different things could be succeeding and failing at the same time.

If you do need to write code like the second option above, consider ordering the various "await" statements so that the expected time taken in the failure case is minimized. Always consider whether you really need all the results in all cases… or whether any failure is enough to mess up your whole operation.

Oh, and if you know the reason for the lack of something like WhenAllSuccessful, please enlighten me in the comments :)