Eduasync part 14: Data passing in coroutines

(This post covers project 19 in the source code.)

Last time we looked at independent coroutines running in a round-robin fashion. This time we’ll keep the round-robin scheduling, but add in the idea of passing data from one coroutine to another. Each coroutine will act on data of the same type, which is necessary for the scheme to work when one coroutine could "drop out" of the chain by returning.

Designing the data flow

It took me a while to get to the stage where I was happy with the design of how data flowed around these coroutines. I knew I wanted a coordinator as before, and that it should have a Yield method taking the value to pass to the next coroutine and returning an awaitable which would provide the next value when it completed. The tricky part was working out what to do at the start of each method and the end. If the method just took a Coordinator parameter, we wouldn’t have anything to do with the value yielded by the first coroutine, because the second coroutine wouldn’t be ready to accept it yet. Likewise when a coroutine completed, we wouldn’t have another value to pass to the next coroutine.

Writing these dilemmas out in this post, the solution seems blindingly obvious of course: each coroutine should accept a data value on entry, and return one at the end. At any point where we transfer control, we supply a value, and there is always something ready to consume it. The final twist is to make the coordinator’s Start method take an initial value and return the value returned by the last coroutine to complete.

So, that’s the theory… let’s look at the implementation.

Initialization

I’ve changed the coordinator to take all the coroutines as a constructor parameter (of the somewhat fearsome declaration "params Func<Coordinator<T>, T, Task<T>>[] coroutines") which means we don’t need to implement IEnumerable pointlessly any more.

This leads to a code skeleton of this form:

private static void Main(string[] args)
{
    var coordinator = new Coordinator<string>(FirstCoroutine,
                                              SecondCoroutine,
                                              ThirdCoroutine);
    string finalResult = coordinator.Start("m1");
    Console.WriteLine("Final result: {0}", finalResult);
}

private static async Task<string> FirstCoroutine(
    Coordinator<string> coordinator,
    string initialValue)
{
    …
}

// Same signature for SecondCoroutine and ThirdCoroutine

Last time we simply had a Queue<Action> internally in the coordinator as the actions to invoke. You might be expecting a Queue<Func<T, T>> this time – after all, we’re passing in data and returning data at each point. However, the mechanism for that data transfer is "out of band" so to speak. The only time we really "return" an item is when we reach the end of a coroutine. Usually we’ll be providing data to the next step using a method. Likewise the only time a coroutine is given data directly is in the first call – after that, it will have to fetch the value by calling GetResult() on the awaiter which it uses to yield control.

All of this is leading to a requirement for our constructor to convert each coroutine delegate into a simple Action. The trick is working out how to deal with the data flow. I’m going to include SupplyValue() and ConsumeValue() methods within the coordinator for the awaiter to use, so it’s just a case of calling those appropriately from our action. In particular:

  • When the action is called, it should consume the current value.
  • It should then call the coroutine passing in the coordinator ("this") and the initial value.
  • When the task returned by the coroutine has completed, the result of that task should be used to supply a new value.

The only tricky part here is the last bullet – and it’s not that hard really, so long as we remember that we’re absolutely not trying to start any new threads. We just want to hook onto the end of the task, getting a chance to supply the value before the next coroutine tries to pick it up. We can do that using Task.ContinueWith, but passing in TaskContinuationOptions.ExecuteSynchronously so that we use the same thread that the task completes on to execute the continuation.

At this point we can implement the initialization part of the coordinator, assuming the presence of SupplyValue() and ConsumeValue():

public sealed class Coordinator<T>
{
    private readonly Queue<Action> actions;
    private readonly Awaitable awaitable;

    public Coordinator(params Func<Coordinator<T>, T, Task<T>>[] coroutines)
    {
        // We can’t refer to "this" in the variable initializer. We can use
        // the same awaitable for all yield calls.
        this.awaitable = new Awaitable(this);
        actions = new Queue<Action>(coroutines.Select(ConvertCoroutine));
    }

    // Converts a coroutine into an action which consumes the current value,
    // calls the coroutine, and attaches a continuation to it so that the return
    // value is used as the new value.
    private Action ConvertCoroutine(Func<Coordinator<T>, T, Task<T>> coroutine)
    {
        return () =>
        {
            Task<T> task = coroutine(this, ConsumeValue());
            task.ContinueWith(ignored => SupplyValue(task.Result),
                TaskContinuationOptions.ExecuteSynchronously);
        };
    }
}

I’ve broken ConvertCoroutine into a separate method so that we can use it as the projection for the Select call within the constructor. I did initially have it within a lambda expression within the constructor, but it was utterly hideous in terms of readability.

One suggestion I’ve received is that I could declare a new delegate type instead of using Func<Coordinator<T>, T, Task<T>> to represent a coroutine. This could either be a non-generic delegate nested in the generic coordinator class, or a generic stand-alone delegate:

public delegate Task<T> Coroutine<T>(Coordinator<T> coordinator, T initialValue);

// Or nested…
public sealed class Coordinator<T>
{
    public delegate Task<T> Coroutine(Coordinator<T> coordinator, T initialValue);
}

Both of these would work perfectly well. I haven’t made the change at the moment, but it’s certainly worth considering. The debate about whether to use custom delegate types or Func/Action is one for another blog post, I think :)

The one bit of the initialization I haven’t explained yet is the "awaitable" field and the Awaitable type. They’re to do with yielding – so let’s look at them now.

Yielding and transferring data

Next we need to work out how we’re going to transfer data and control between the coroutines. As I’ve mentioned, we’re going to use a method within the coordinator, called from the coroutines, to accomplish this. The coroutines have this sort of code:

private static async Task<string> FirstCoroutine(
    Coordinator<string> coordinator,
    string initialValue)
{
    Console.WriteLine("Starting FirstCoroutine with initial value {0}",
                      initialValue);            
    …
    string received = await coordinator.Yield("x1");
    Console.WriteLine("Returned to FirstCoroutine with value {0}", received);
    …
    return "x3";
}

The method name "Yield" here is a double-edged sword. The word has two meanings – yielding a value to be used elsewhere, and yielding control until we’re called back. Normally it’s not ideal to use a name that can mean subtly different things – but in this case we actually want both of these meanings.

So, what does Yield need to do? Well, the flow control should look something like this:

  • Coroutine calls Yield()
  • Yield() calls SupplyValue() internally to remember the new value to be consumed by the next coroutine
  • Yield() returns an awaitable to the coroutine
  • Due to the await expression, the coroutine calls GetAwaiter() on the awaitable to get an awaiter
  • The coroutine checks IsCompleted on the awaiter, which must return false (to prompt the remaining behaviour)
  • The coroutine calls OnCompleted() passing in the continuation for the rest of the method
  • The coroutine returns to its caller
  • The coordinator proceeds with the next coroutine
  • When we eventually get back to this coroutine, it will call GetResult() to get the "current value" to assign to the "received" variable.

Now you’ll see that Yield() needs to return some kind of awaitable type – in other words, one with a GetAwaiter() method. Previously we put this directly on the Coordinator type, and we could have done that here – but I don’t really want anyone to just "await coordinator" accidentally. You should really need to call Yield() in order to get an awaitable. So we have an Awaitable type, nested in Coordinator.

We then need to decide what the awaiter type is – the result of calling GetAwaiter() on the awaitable. This time I decided to use the Coordinator itself. That means people could accidentally call IsCompleted, OnCompleted() or GetResult(), but I figured that wasn’t too bad. If we were to go to the extreme, we’d create another type just for the Awaiter as well. It would need to have a reference to the coordinator of course, in order to actually do its job. As it is, we can make the Awaitable just return the Coordinator that created it. (Awaitable is nested within Coordinator<T>, which is how it can refer to T without being generic itself.)

public sealed class Awaitable
{
    private readonly Coordinator<T> coordinator;

    internal Awaitable(Coordinator<T> coordinator)
    {
        this.coordinator = coordinator;
    }

    public Coordinator<T> GetAwaiter()
    {
        return coordinator;
    }
}

The only state here is the coordinator, which is why we create an instance of Awaitable on the construction of the Coordinator, and keep it around.

Now Yield() is really simple:

public Awaitable Yield(T value)
{
    SupplyValue(value);
    return awaitable;
}

So to recap, we now just need the awaiter members, SupplyValue() and ConsumeValue(). Let’s look at the awaiter members (in Coordinator) to start with. We already know that IsCompleted will just return false. OnCompleted() just needs to stash the continuation in the queue, and GetResult() just needs to consume the "current" value and return it:

public bool IsCompleted { get { return false; } }

public void OnCompleted(Action continuation)
{
    actions.Enqueue(continuation);
}

public T GetResult()
{
    return ConsumeValue();
}

Simple, huh? Finally, consuming and supplying values:

private T currentValue;
private bool valuePresent;

private void SupplyValue(T value)
{
    if (valuePresent)
    {
        throw new InvalidOperationException
            ("Attempt to supply value when one is already present");
    }
    currentValue = value;
    valuePresent = true;
}

private T ConsumeValue()
{
    if (!valuePresent)
    {
        throw new InvalidOperationException
            ("Attempt to consume value when it isn’t present");
    }
    T oldValue = currentValue;
    valuePresent = false;
    currentValue = default(T);
    return oldValue;
}

These are relatively long methods (compared with the other ones I’ve shown) but pretty simple. Hopefully they don’t need explanation :)
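One member I haven’t shown is Start itself. I won’t reproduce the real code here, but given the description earlier – take an initial value, return the value from the last coroutine to complete – a minimal sketch might look like this, reusing the queue-draining loop from part 13:

public T Start(T initialValue)
{
    // The first coroutine's action will consume this value.
    SupplyValue(initialValue);
    while (actions.Count > 0)
    {
        actions.Dequeue().Invoke();
    }
    // The continuation attached to the last coroutine to complete
    // will have supplied its return value.
    return ConsumeValue();
}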

The results

Now that everything’s in place, we can run it. I haven’t posted the full code of the coroutines, but you can see it on Google Code. Hopefully the results speak for themselves though – you can see the relevant values passing from one coroutine to another (and in and out of the Start method).
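To give a flavour of the coroutines themselves, here’s a sketch of SecondCoroutine reconstructed from the output below – the real code is in the repository, so treat this as illustrative rather than authoritative:

private static async Task<string> SecondCoroutine(
    Coordinator<string> coordinator,
    string initialValue)
{
    Console.WriteLine("    Starting SecondCoroutine with initial value {0}",
                      initialValue);
    Console.WriteLine("    Yielding 'y1' from SecondCoroutine...");
    string received = await coordinator.Yield("y1");
    Console.WriteLine("    Returned to SecondCoroutine with value {0}", received);

    Console.WriteLine("    Yielding 'y2' from SecondCoroutine...");
    received = await coordinator.Yield("y2");
    Console.WriteLine("    Returned to SecondCoroutine with value {0}", received);

    Console.WriteLine("    Yielding 'y3' from SecondCoroutine...");
    received = await coordinator.Yield("y3");
    Console.WriteLine("    Returned to SecondCoroutine with value {0}", received);

    Console.WriteLine("    Finished SecondCoroutine");
    return "y4";
}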

Starting FirstCoroutine with initial value m1
Yielding ‘x1’ from FirstCoroutine…
    Starting SecondCoroutine with initial value x1
    Yielding ‘y1’ from SecondCoroutine…
        Starting ThirdCoroutine with initial value y1
        Yielding ‘z1’ from ThirdCoroutine…
Returned to FirstCoroutine with value z1
Yielding ‘x2’ from FirstCoroutine…
    Returned to SecondCoroutine with value x2
    Yielding ‘y2’ from SecondCoroutine…
        Returned to ThirdCoroutine with value y2
        Finished ThirdCoroutine…
Returned to FirstCoroutine with value z2
Finished FirstCoroutine
    Returned to SecondCoroutine with value x3
    Yielding ‘y3’ from SecondCoroutine…
    Returned to SecondCoroutine with value y3
    Finished SecondCoroutine
Final result: y4

Conclusion

I’m not going to claim this is the world’s most useful coroutine model – or indeed useful at all. As ever, I’m more interested in thinking about how data and control flow can be modelled than actual usefulness.

In this case, it was the realization that everything should accept and return a value of the same type which really made it all work. After that, the actual code is pretty straightforward. (At least, I think it is – please let me know if any bits are confusing, and I’ll try to elaborate on them.)

Next time we’ll look at something more like a pipeline model – something remarkably reminiscent of LINQ, but without taking up as much stack space (and with vastly worse readability, of course). Unfortunately the current code reaches the limits of my ability to understand why it works, which means it far exceeds my ability to explain why it works. Hopefully I can simplify it a bit over the next few days.

Eduasync part 13: first look at coroutines with async

(This part covers project 18 in the source code.)

As I mentioned in earlier parts, the "awaiting" part of async methods is in no way limited to tasks. So long as we have a suitable GetAwaiter() method which returns a value of a type which in turn has suitable methods on it, the compiler doesn’t really care what’s going on. It’s time to exploit that to implement some form of coroutines in C#.

Introduction to coroutines

The fundamental idea of coroutines is to have multiple methods executing cooperatively, each of them maintaining their position within the coroutine when they yield to another. You can almost think of them as executing in multiple threads, with only one thread actually running at a time, and signalling between the different threads to control flow. However, we don’t really need multiple threads once we’ve got continuations – we can have a single thread with a complex flow of continuations, and still only a very short "real" stack. (The control flow is stored in normal collections instead of being implicit on the thread’s stack.)

Coroutines were already feasible in C# through the use of iterator blocks, but the async feature of C# allows a slightly more natural way of expressing them, in my view. (The linked Wikipedia page gives a sketch of how coroutines can be built on top of generators, which is the general concept that iterator blocks implement in C#.)

I have implemented various flavours of coroutines in Eduasync. It’s possible that some (all?) of them shouldn’t strictly be called coroutines, but they’re close enough to the real thing in feeling. This is far from an exhaustive set of approaches. Once you’ve got the basic idea of what I’m doing, you may well want to experiment with your own implementations.

I’m not going to claim that the use of coroutines in any of my examples really makes any sense in terms of making real tasks easier. This is purely for the sake of interest and twisting the async feature for fun.

Round-robin independent coroutines

Our first implementation of coroutines is relatively simple. A coordinator effectively "schedules" the coroutines it’s set up with in a round-robin fashion: when one of the coroutines yields control to the coordinator, the coordinator remembers where the coroutine had got to, and then starts the next one. When each coroutine has executed its first piece of code and yielded control, the coordinator will go back to the first coroutine to continue execution, and so on until all coroutines have completed.

The coroutines don’t know about each other, and no data is being passed between them.

Hopefully it’s reasonably obvious that the coordinator contains all the smarts here – the coroutines themselves can be relatively dumb. Let’s look at what the client code looks like (along with the results) before we get to the coordinator code.

Client code

The sample code contains three coroutines, all of which take a Coordinator parameter and have a void return type. These are passed to a new coordinator using a collection initializer and method group conversions; the coordinator is then started. Here’s the entry point code for this:

private static void Main(string[] args)
{
    var coordinator = new Coordinator { 
        FirstCoroutine,
        SecondCoroutine,
        ThirdCoroutine
    };
    coordinator.Start();
}

When each coroutine is initially started, the coordinator passes a reference to itself as the argument to the coroutine. That’s how we solve the chicken-and-egg problem of the coroutine and coordinator having to know about each other. The way a coroutine yields control is simply by awaiting the coordinator. The result type of this await expression is void – it’s just a way of "pausing" the coroutine.

We’re not doing anything interesting in the actual coroutines – just tracing the execution flow. Of course we could do anything we wanted, within reason. We could even await a genuinely asynchronous task such as fetching a web page asynchronously. In that case the whole coroutine collection would be "paused" until the fetch returned.

Here’s the code for the first coroutine – the second and third ones are similar, but use different indentation for clarity. The third coroutine is also shorter, just for fun – it only awaits the coordinator once.

private static async void FirstCoroutine(Coordinator coordinator)
{
    Console.WriteLine("Starting FirstCoroutine");
    Console.WriteLine("Yielding from FirstCoroutine…");

    await coordinator;

    Console.WriteLine("Returned to FirstCoroutine");
    Console.WriteLine("Yielding from FirstCoroutine again…");

    await coordinator;

    Console.WriteLine("Returned to FirstCoroutine again");
    Console.WriteLine("Finished FirstCoroutine");
}

And here’s the output…

Starting FirstCoroutine
Yielding from FirstCoroutine…
    Starting SecondCoroutine
    Yielding from SecondCoroutine…
        Starting ThirdCoroutine
        Yielding from ThirdCoroutine…
Returned to FirstCoroutine
Yielding from FirstCoroutine again…
    Returned to SecondCoroutine
    Yielding from SecondCoroutine again…
        Returned to ThirdCoroutine
        Finished ThirdCoroutine…
Returned to FirstCoroutine again
Finished FirstCoroutine
    Returned to SecondCoroutine again
    Finished SecondCoroutine

Hopefully that’s the output you expected, given the earlier description. Again it may help if you think of the coroutines as running in separate pseudo-threads: the execution within each pseudo-thread is just linear, and the timing is controlled by our explicit "await" expressions. All of this would actually be pretty easy to implement using multiple threads which really did just block on each await expression – but the fun part is keeping it all in one real thread. Let’s have a look at the coordinator.

The Coordinator class

Some of the later coroutine examples end up being slightly brainbusting, at least for me. This one is relatively straightforward though, once you’ve got the basic idea. All we need is a queue of actions to execute. After initialization, we want our queue to contain the coroutine starting points.

When a coroutine yields control, we just need to add the remainder of it to the end of the queue, and move on to the next item. Obviously the async infrastructure will provide "the remainder of the coroutine" as a continuation via the OnCompleted method.

When a coroutine just returns, we continue with the next item in the queue as before – it’s just that we won’t add a continuation to the end of the queue. Eventually (well, hopefully) we’ll end up with an empty queue, at which point we can stop.

Initialization and a choice of data structures

We’ll represent our queue using Queue<T> where the T is a delegate type. We have two choices here though, because we have two kinds of delegate – one which takes the Coordinator as a parameter (for the initial coroutine setup) and one which has no parameters (for the continuations). Fortunately we can convert between the two in either direction very simply, bearing in mind that all of this is within the context of a coordinator. For example:

// If we’re given a coroutine and want a plain Action
Action<Coordinator> coroutine = …; 
Action action = () => coroutine(this);

// If we’re given a plain Action and want an Action<Coordinator>:
Action continuation = …; 
Action<Coordinator> coroutine = ignored => continuation();

I’ve arbitrarily chosen to use the first option, so there’s a Queue<Action> internally.

Now we need to get the collection initializer working. The C# compiler requires an appropriate Add method (which is easy) and also checks that the type implements IEnumerable. We don’t really need to be able to iterate over the queue of actions, so I’ve used explicit interface implementation to reduce the likelihood of GetEnumerator() being called inappropriately, and made the method throw an exception for good measure. That gives us the skeleton of the class required for setting up:

public sealed class Coordinator : IEnumerable
{
    private readonly Queue<Action> actions = new Queue<Action>();

    // Used by collection initializer to specify the coroutines to run
    public void Add(Action<Coordinator> coroutine)
    {
        actions.Enqueue(() => coroutine(this));
    }

    // Required for collection initializers, but we don’t really want
    // to expose anything.
    IEnumerator IEnumerable.GetEnumerator()
    {
        throw new NotSupportedException("IEnumerable only supported to enable collection initializers");
    }
}

(Note that I haven’t used XML documentation anywhere here – it’s great for real code, but adds clutter in blog posts.)

For production code I’d probably prevent Add from being called after the coordinator had been started, but there’s no need to do it in our well-behaved sample code. We’re only going to add extra actions to the queue via continuations, which will be added due to await expressions.
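If you did want that guard, a minimal sketch might look like this, assuming a simple started flag (hypothetical – it’s not in the sample code):

private bool started;

public void Add(Action<Coordinator> coroutine)
{
    if (started)
    {
        throw new InvalidOperationException("Cannot add coroutines after Start has been called");
    }
    actions.Enqueue(() => coroutine(this));
}

Start would then just set started to true before draining the queue.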

The main execution loop and async infrastructure

So far we’ve got code to register coroutines in the queue – so now we need to execute them. Bearing in mind that the actions themselves will be responsible for adding continuations, the main loop of the coordinator is embarrassingly simple:

// Execute actions in the queue until it’s empty. Actions add *more*
// actions (continuations) to the queue by awaiting this coordinator.
public void Start()
{
    while (actions.Count > 0)
    {
        actions.Dequeue().Invoke();
    }
}

Of course, the interesting bit is the code which supports the async methods and await expressions. We know we need to provide a GetAwaiter() method, but what should that return? Well, we’re just going to use the awaiter to add a continuation to the coordinator’s queue. It’s got no other state than that – so we might as well return the coordinator itself, and put the other infrastructure methods directly in the coordinator.

Again, this is slightly ugly, as the extra methods don’t really make sense on the coordinator – we wouldn’t want to call them directly from client code, for example. However, they’re fairly irrelevant – we could always create a nested type which just had a reference to its "parent" coordinator if we wanted to. For simplicity, I haven’t bothered with this – I’ve just implemented GetAwaiter() trivially:

// Used by await expressions to get an awaiter
public Coordinator GetAwaiter()
{
    return this;
}

So, that leaves just three members still to implement: IsCompleted, OnCompleted and GetResult. We always want the IsCompleted property to return false, as otherwise the coroutine will just continue executing immediately without returning to cede control; the await expression would be pointless. OnCompleted just needs to add the continuation to the end of the queue – we don’t need to attach it to a task, or anything like that. Finally, GetResult is a no-op – we have no results, no exceptions, and basically nothing to do. You might want to add a bit of logging here, if you were so inclined, but there’s no real need.

So, here are the final three members of Coordinator:

// Force await to yield control
public bool IsCompleted { get { return false; } }

public void OnCompleted(Action continuation)
{
    // Put the continuation at the end of the queue, ready to
    // execute when the other coroutines have had a go.
    actions.Enqueue(continuation);
}

public void GetResult()
{
    // Our await expressions are void, and we never need to throw
    // an exception, so this is a no-op.
}

And that’s it! Fewer than 50 lines of code required, and nothing complicated at all. The interesting behaviour is all due to the way the C# compiler uses the coordinator when awaiting it.

We need AsyncVoidMethodBuilder as before, as we have some async void methods – but that doesn’t need to do anything significant. That’s basically all the code required to implement these basic round-robin coroutines.

Conclusion

Our first foray into the weird and wonderful world of coroutines was relatively tame. The basic idea of a coordinator keeping track of the state of all the different coroutines in one sense or another will keep coming back to us, but with different ways of controlling the execution flow.

Next time we’ll see some coroutines which can pass data to each other.

Eduasync part 12: Observing all exceptions

(This post covers projects 16 and 17 in the source code.)

Last time we looked at unwrapping an AggregateException when we await a result. While there are potentially other interesting things we could look at with respect to exceptions (particularly around cancellation) I’m just going to touch on one extra twist that the async CTP implements before I move on to some weird ways of using async.

TPL and unobserved exceptions

The Task Parallel Library (TPL) on which the async support is based has some interesting behaviour around exceptions. Just as it’s entirely possible for more than one thing to go wrong with a particular task, it’s also quite easy to miss some errors, if you’re not careful.

Here’s a simple example of an async method in C# 5 where we create two tasks, both of which will throw exceptions:

private static async Task<int> CauseTwoFailures()
{
    Task<int> firstTask = Task<int>.Factory.StartNew(() => {
        throw new InvalidOperationException();
    });
    Task<int> secondTask = Task<int>.Factory.StartNew(() => {
        throw new InvalidOperationException();
    });

    int firstValue = await firstTask;
    int secondValue = await secondTask;

    return firstValue + secondValue;
}

Now the timing of the two tasks is actually irrelevant here. The first task will always throw an exception, which means we’re never going to await the second task. That means there’s never any code which asks for the second task’s result, or adds a continuation to it. It’s alone and unloved in a cruel world, with no-one to observe the exception it throws.

If we call this method from the Eduasync code we’ve got at the moment, and wait for long enough (I’ve got a call to GC.WaitForPendingFinalizers in the same code) the program will abort, with this error:

Unhandled Exception: System.AggregateException: A Task’s exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. ---> System.InvalidOperationException: Operation is not valid due to the current state of the object.

Ouch. The TPL takes a hard line on unobserved exceptions. They indicate failures (presumably) which you’ll never find out about until you start caring about the result of a task. Basically there are various ways of "observing" a task’s failure, whether by performing some act which causes it to be thrown (usually as part of an AggregateException) or just asking for the exception for a task which is known to be faulted. A task whose exception is never observed will rethrow it (wrapped in an AggregateException) on the finalizer thread, usually causing the process to exit.
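(Just to make "observing" concrete, either of these is enough to count – both techniques are named in the error message above. A minimal sketch:)

Task<int> task = Task<int>.Factory.StartNew(() => {
    throw new InvalidOperationException();
});

// Option 1: wait for the task and catch the AggregateException
try
{
    task.Wait();
}
catch (AggregateException)
{
    // The failure has now been observed.
}

// Option 2: for a task which is known to be faulted, just ask
// for its exception.
AggregateException error = task.Exception;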

That works well in "normal" TPL code, where you’re explicitly managing tasks – but it’s not so handy in async, where perfectly reasonable looking code which starts a few tasks and then awaits them one at a time (possibly doing some processing in between) might hide an unobserved exception.

Observing all exceptions

Fortunately TPL provides a way of us to get out of the normal task behaviour. There’s an event TaskScheduler.UnobservedTaskException which is fired by the finalizer before it goes bang. The handlers of the event are allowed to observe the exception using UnobservedTaskExceptionEventArgs.SetObserved and can also check whether it’s already been observed.

So all we have to do is add a handler for the event and our program doesn’t crash any more:

TaskScheduler.UnobservedTaskException += (sender, e) =>
{
    Console.WriteLine("Saving the day! This exception would have been unobserved: {0}",
                      e.Exception);
    e.SetObserved();
};

In Eduasync this is currently only performed explicitly, in project 17. In the async CTP something like this is performed as part of the type initializer for AsyncTaskMethodBuilder<T>, which you can unfortunately tell because that type initializer crashes when running under medium trust. (That issue will be fixed before the final release.)

Global changes

This approach has a very significant effect: it changes the global behaviour of the system. If you have a system which uses the TPL and you want the existing .NET 4 behaviour of the process terminating when you have unobserved exceptions, you basically can’t use async at all – and if you use any code which does, you’ll see the more permissive behaviour.

You could potentially add your own event handler which aborted the application forcibly, but that’s not terribly nice either. You should quite possibly add a handler to at least log these exceptions, so you can find out what’s been going wrong that you haven’t noticed.

Of course, this only affects unobserved exceptions – anything you’re already observing will not be affected. Still, it’s a pretty big change. I wouldn’t be surprised if this aspect of the behaviour of async in C# 5 changed before release; it feels to me like it isn’t quite right yet. Admittedly I’m not sure how I would suggest changing it, but effectively reversing the existing behaviour goes against Microsoft’s past behaviour when it comes to backwards compatibility. Watch this space.

Conclusion

It’s worth pondering this whole issue yourself (and the one from last time), and making your feelings known to the team. I think it’s symptomatic of a wider problem in software engineering: we’re not really very good at handling errors. Java’s approach of checked exceptions didn’t turn out too well in my view, but the "anything goes" approach of C# has problems too… and introducing alternative models like the one in TPL makes things even more complicated. I don’t have any smart answers here – just that it’s something I’d like wiser people than myself to think about further.

Next, we’re going to move a bit further away from the "normal" ways of using async, into the realm of coroutines. This series is going to get increasingly obscure and silly, all in the name of really getting a feeling for the underlying execution model of async, before it possibly returns to more sensible topics such as task composition.

Eduasync part 11: More sophisticated (but lossy) exception handling

(This post covers projects 13-15 in the source code.)

Long-time readers of this blog may not learn much from this post – it’s mostly going over what I’ve covered before. Still, it’s new to Eduasync.

Why isn’t my exception being caught properly?

Exceptions are inherently problematic in C# 5. There are two conflicting aspects:

  • The point of the async feature in C# 5 is that you can write code which mostly looks like its synchronous equivalent. We expect to be able to catch specific exception types as normal.
  • Asynchronous code may potentially have multiple exceptions "at the same time". The language simply isn’t designed to deal with that in the synchronous case.

Now if the language had been designed for asynchrony to start with, perhaps exception flow would have been designed differently – but we are where we are, and we all expect exceptions to work in a certain way.

Let’s make all of this concrete with a sample:

private static void Main(string[] args)
{
    Task<int> task = FetchOrDefaultAsync();
    Console.WriteLine("Result: {0}", task.Result);
}

private static async Task<int> FetchOrDefaultAsync()
{
    // Nothing special about IOException here
    try
    {
        Task<int> fetcher = Task<int>.Factory.StartNew(() => { throw new IOException(); });
        return await fetcher;
    }
    catch (IOException e)
    {
        Console.WriteLine("Caught IOException: {0}", e);
        return 5;
    }
    catch (Exception e)
    {
        Console.WriteLine("Caught arbitrary exception: {0}", e);
        return 10;
    }
}

Here we have a task which will throw an IOException, and some code which awaits that task – and has a catch block for IOException.

So, what would you expect this to print? With the code we’ve got in Eduasync so far, we get this:

Caught arbitrary exception: System.AggregateException: One or more errors occurred. ---> System.IO.IOException: I/O error occurred.

Result: 10

If you run the same code against the async CTP, you get this:

Caught IOException: System.IO.IOException: I/O error occurred.

Result: 5

Hmm… we’re not behaving as per the CTP, and we’re not behaving as we’d really expect the normal synchronous code to behave.

The first thing to work out is which boundary we should be fixing. In this case, the problem is between the async method and the task we’re awaiting, so the code we need to fix is TaskAwaiter<T>.

Handling AggregateException in TaskAwaiter<T>

Before we can fix it, we need to work out what’s going on. The stack trace I hid before actually shows this reasonably clearly, with this section:

at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.get_Result()
at Eduasync.TaskAwaiter`1.GetResult() …TaskAwaiter.cs:line 40
at Eduasync.Program.<FetchOrDefaultAsync>d__2.MoveNext() …Program.cs:line 37

So AggregateException is being thrown by Task<T>.Result, which we’re calling from TaskAwaiter<T>.GetResult(). The documentation for Task<T>.Result isn’t actually terribly revealing here, but the fact that Task<T>.Exception is of type AggregateException is fairly revealing.

Basically, the Task Parallel Library is built with the idea of multiple exceptions in mind – whereas our async method isn’t.

Now the team at Microsoft could have decided that really you should catch AggregateException and iterate over all the exceptions contained inside the exception, handling each of them separately. However, in most cases that isn’t really practical – because in most cases there will only be one exception (if any) and all that looping is relatively painful. They decided to simply extract the first exception from the AggregateException within a task, and throw that instead.

We can do that ourselves in TaskAwaiter<T>, like this:

try
{
    return task.Result;
}
catch (AggregateException aggregate)
{
    if (aggregate.InnerExceptions.Count > 0)
    {
        // Loses the proper stack trace. Oops. For workarounds, see
        // http://bradwilson.typepad.com/blog/2008/04/small-decisions.html
        throw aggregate.InnerExceptions[0];
    }
    else
    {
        // Nothing better to do, really…
        throw;
    }
}

As you can tell from the comment, we end up losing the stack trace using this code. I don’t know exactly how the stack trace is preserved in the real Async CTP, but it is. I suspect this is done in a relatively obscure way at the moment – it’s possible that for .NET 5, there’ll be a cleaner way that all code can take advantage of.

This code is also pretty ugly, catching the exception only to rethrow it. We can check whether or not the task has faulted using Task<T>.Status and extract the AggregateException using Task<T>.Exception instead of forcing it to be thrown and catching it. We’ll see an example of that in a minute.

With our new code in place, we can catch the IOException in our async code very easily.

What if I want all the exceptions?

In certain circumstances it really makes sense to collect multiple exceptions. This is particularly true when you’re waiting for multiple tasks to complete, e.g. with TaskEx.WhenAll in the Async CTP. This caused me a certain amount of concern for a while, but when Mads came to visit in late 2010 and we talked it over, we realized we could use the compositional nature of Task<T> and the convenience of TaskCompletionSource to implement an extension method preserving all the exceptions.

As we’ve seen, when a task is awaited, its AggregateException is unwrapped, and the first exception rethrown. So if we create a new task which adds an extra layer of wrapping and make our code await that instead, only the extra layer will be unwrapped by the task awaiter, leaving the original AggregateException. To come up with a new task which "looks like" an existing task, we can simply create a TaskCompletionSource, add a continuation to the original task, and return the completion source’s task as the wrapper. When the continuation fires we’ll set the appropriate result on the completion source – cancellation, an exception, or the successful result.

You may expect that we’d have to create a new AggregateException ourselves – but TaskCompletionSource.SetException will already do this for us. This makes it look like the code below isn’t performing any wrapping at all, but remember that Task<T>.Exception is already an AggregateException, and calling TaskCompletionSource.SetException will wrap it in another AggregateException. Here’s the extension method in question:

public static Task<T> WithAllExceptions<T>(this Task<T> task)
{
    TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();

    task.ContinueWith(ignored =>
    {
        switch (task.Status)
        {
            case TaskStatus.Canceled:
                tcs.SetCanceled();
                break;
            case TaskStatus.RanToCompletion:
                tcs.SetResult(task.Result);
                break;
            case TaskStatus.Faulted:
                // SetException will automatically wrap the original AggregateException
                // in another one. The new wrapper will be removed in TaskAwaiter, leaving
                // the original intact.
                tcs.SetException(task.Exception);
                break;
            default:
                tcs.SetException(new InvalidOperationException("Continuation called illegally."));
                break;
        }
    });

    return tcs.Task;
}

Here you can see the cleaner way of reacting to a task’s status – we don’t just try to fetch the result and catch any exceptions; we handle each status individually.

I don’t know offhand what task scheduler is used for this continuation – it may be that we’d really want to specify the current task scheduler for a production-ready version of this code. However, the core idea is sound.
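If we did want to be explicit about it, ContinueWith has an overload taking a TaskScheduler. Here’s a hedged sketch, assuming the switch statement above were extracted into a hypothetical PropagateResult helper:

// Pin the continuation to the scheduler which is current when
// WithAllExceptions is called.
task.ContinueWith(ignored => PropagateResult(task, tcs),
                  CancellationToken.None,
                  TaskContinuationOptions.None,
                  TaskScheduler.Current);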

It’s easy to use this extension method within an async method, as shown here:

private static async Task<int> AwaitMultipleFailures()
{
    try
    {
        await CauseMultipleFailures().WithAllExceptions();
    }
    catch (AggregateException e)
    {
        Console.WriteLine("Caught arbitrary exception: {0}", e);
        return e.InnerExceptions.Count;
    }
    // Nothing went wrong, remarkably!
    return 0;
}

private static Task<int> CauseMultipleFailures()
{
    // Simplest way of inducing multiple exceptions
    Exception[] exceptions = { new IOException(), new ArgumentException() };
    TaskCompletionSource<int> tcs = new TaskCompletionSource<int>();
    tcs.SetException(exceptions);
    return tcs.Task;
}

Note that this will work perfectly well with the Async CTP and should be fine with the full release as well. I wouldn’t be entirely surprised to find something similar provided by the framework itself by release time, too.

Conclusion

It’s worth being aware of the impedance mismatch between the TPL and async methods in C# 5, as well as how this mismatch is handled. I dislike the idea of data loss, but I can see why it’s being handled in this way. It’s very much in line with the approach of trying to make asynchronous methods look like synchronous ones as far as possible.

We’ll probably look at the compositional nature of tasks again later in the series, but this was one simple example of how transparent it can be – a simple extension method can change the behaviour to avoid the risk of losing exception information when you’re expecting that multiple things can go wrong.

It’s worth remembering that this behaviour is very specific to Task and Task<T>, and the awaiter types associated with them. If you’re awaiting other types of expressions, they may behave differently with respect to exceptions.

Before we leave the topic of exceptions, there’s one other aspect we need to look at – what happens when an exception isn’t observed.

LINQ To Objects and the performance of nested “Where” calls

This post came out of this Stack Overflow question, which essentially boils down to which is better out of these two options:

var oneBigPredicate = collection.Where(x => Condition1(x)
                                         && Condition2(x)
                                         && Condition3(x));

var multiplePredicates = collection.Where(x => Condition1(x))
                                   .Where(x => Condition2(x))
                                   .Where(x => Condition3(x));

The first case is logically a single "wrapper" sequence around the original collection, with a filter which checks all three conditions (but applying short-circuiting logic, of course) before the wrapper will yield an item from the original sequence.

The second case is logically a set of concentric wrapper sequences, each applying a filter which checks a single condition. Asking the "outer" wrapper for the next item involves that one asking the "middle" wrapper for the next item, which asks the "inner" wrapper for the next item, which asks the original collection for the next item… the item is then passed "outwards" through the wrappers, with the filters being applied as we go, of course.

Now the two will achieve the same result, and I should say up-front that in most realistic cases, it won’t make a significant difference which you use. But realistic cases aren’t nearly as interesting to investigate as pathological cases, so I decided to benchmark a few different options for the second case. In particular, I wanted to find out how long it took to iterate over a query in the cases where the condition was either "always true" or "always false" – and vary the depth of nesting. Note that I’m not actually testing the first kind of query shown above… I suspect it wouldn’t be terribly interesting, at least compared with the results of the second query.

The simplest way of creating a "collection" of a large size is to use Enumerable.Repeat(), and the simplest way of iterating over the whole sequence is to just call Count() on it… so that’s what I do, timing how long the Count() call takes. (I don’t actually print out the results of Count(), but with an "always false" predicate I’ll get 0, and with an "always true" predicate I’ll get the size of the input collection.)

Here’s the sample code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Test
{
    static void Main()
    {
        int size = 10000000;
        Console.WriteLine("Always false");
        RunTests(size, x => false);
        
        Console.WriteLine("Always true");
        RunTests(size, x => true);
        
    }
    
    static void RunTests(int size, Func<string, bool> predicate)
    {
        for (int i = 1; i <= 10; i++)
        {
            RunTest(i, size, predicate);
        }
    }
    
    static void RunTest(int depth, int size, Func<string, bool> predicate)
    {
        IEnumerable<string> input = Enumerable.Repeat("value", size);
        
        for (int i = 0; i < depth; i++)
        {
            input = input.Where(predicate);
        }
        
        Stopwatch sw = Stopwatch.StartNew();
        input.Count();
        sw.Stop();
        Console.WriteLine("Depth: {0} Size: {1} Time: {2}ms",
                          depth, size, sw.ElapsedMilliseconds);
    }
}

You may notice there’s no JIT warm-up here – but I’ve tried that, and it doesn’t alter the results much. Likewise I’ve also tried with a larger size of collection, and the trend is the same – and the trend is the interesting bit.

Time to guess the results…

Before I include the results, I’ll explain what I thought would happen. I had the mental model I described before, with multiple sequences feeding each other.

When the condition is always false, I’d expected the call to MoveNext() from Count() to get all the way to the innermost filtering sequence, which would then iterate over all of the input collection, applying the filter on each item and never yielding a result. That innermost MoveNext() call returns false, and that propagates right back to Count(). All the (significant) time is spent in the innermost loop, testing and rejecting items. That shouldn’t depend on the depth of the nesting we’ve got, right? I’d expect it to be linear in terms of the size of the collection, but constant in terms of nesting. We’ll see.

When the condition is always true, we’re in a much worse situation. We need to propagate every item from the collection through all the filters, out to Count(), checking the condition and accepting the item on each check. Each MoveNext() call has to go all the way to the innermost filter, then the result has to propagate out again. That sounds like it should be roughly linear in the depth of the nesting, as well as still being linear in the size of the collection – assuming a very simplistic execution model, of course.

Before you look at the results, check that you understand my logic, and see if you can think why it might not be the case.

The actual results

Here’s what I’ve actually observed, with all times in milliseconds. The size of the collection is constant – we’re only varying the depth.

Depth   Time for "always false"   Time for "always true"
1       182                       305
2       219                       376
3       246                       452
4       305                       548
5       350                       650
6       488                       709
7       480                       795
8       526                       880
9       583                       996
10      882                       1849

There are a few things to note here:

  • The "always true" time is going up broadly linearly, as we’d expected
  • The "always false" time is also going going up broadly linearly, even though we’d expected it to be roughly constant
  • The "always false" time is still significantly better than the "always true" time, which is somewhat reassuring
  • The performance at a depth of 10 is significantly worse than at a depth of 9 in both cases

Okay, so that’s bizarre. What can possibly be making the time taken by the "inner" filtering sequence go up with the depth of the nesting? It shouldn’t know about the rest of the calls to Where – that’s not its job. Time to investigate…

Reimplementing Where

As you may have seen before, Where isn’t very hard to implement – at least it’s not hard to implement if you’re not trying to be clever. In order to mess around with things to check that my mental model was correct, I decided to run exactly the same tests again, but against the Edulinq implementation. This time, the results are very different:

Depth   Time for "always false"   Time for "always true"
1       232                       502
2       235                       950
3       256                       1946
4       229                       2571
5       227                       3103
6       226                       3535
7       226                       3901
8       232                       4365
9       229                       4838
10      226                       5219

Well look at that… suddenly our "always false" query has the expected characteristics. The "always true" query is basically linear except for the jump in time taken between a depth of 2 and a depth of 3. This may well be the same sort of discontinuity present in the earlier results, but at a different depth. (I’ll do more research on that another time, I think.)
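For reference, the naive implementation I’m comparing against is essentially just an iterator block, along these lines (a sketch in the style of Edulinq, with argument validation omitted) – each call to Where simply adds one more wrapper:

public static IEnumerable<T> Where<T>(this IEnumerable<T> source,
                                      Func<T, bool> predicate)
{
    foreach (T item in source)
    {
        if (predicate(item))
        {
            yield return item;
        }
    }
}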

So if the naive implementation actually works better in some cases, what’s going wrong in the "always false" case in the real LINQ to Objects? We can work this out by taking a stack trace from a predicate. Here’s a sample query which will throw an exception:

var query = Enumerable.Repeat("value", 1)
                      .Where(x => { throw new Exception("Bang!"); })
                      .Where(x => true)
                      .Where(x => true)
                      .Where(x => true)
                      .Where(x => true)
                      .Where(x => true)
                      .Where(x => true)
                      .Where(x => true);

When you try to count that query (or do anything else which will iterate over it) you get a stack trace like this:

Unhandled Exception: System.Exception: Bang!
   at Test.<Main>b__0(String x)
   at System.Linq.Enumerable.<>c__DisplayClassf`1.<CombinePredicates>b__e(TSource x)
   at System.Linq.Enumerable.<>c__DisplayClassf`1.<CombinePredicates>b__e(TSource x)
   at System.Linq.Enumerable.<>c__DisplayClassf`1.<CombinePredicates>b__e(TSource x)
   at System.Linq.Enumerable.<>c__DisplayClassf`1.<CombinePredicates>b__e(TSource x)
   at System.Linq.Enumerable.<>c__DisplayClassf`1.<CombinePredicates>b__e(TSource x)
   at System.Linq.Enumerable.<>c__DisplayClassf`1.<CombinePredicates>b__e(TSource x)
   at System.Linq.Enumerable.<>c__DisplayClassf`1.<CombinePredicates>b__e(TSource x)
   at System.Linq.Enumerable.WhereEnumerableIterator`1.MoveNext()
   at System.Linq.Enumerable.Count[TSource](IEnumerable`1 source)
   at Test.Main()

Ooh, look at that stack – we’ve got 8 Where clauses, and 7 levels of DisplayClassf`1 – which looks like a generated class, perhaps for a lambda expression.

Between this and a helpful email from Eric Lippert, we can basically work out what’s going on. LINQ to Objects knows that it can combine two Where clauses by just constructing a compound filter. Just for kicks, let’s do the same thing – almost certainly more simplistically than LINQ to Objects – but in a way that will give the right idea:

// We could have an interface for this, of course.
public class FilteredEnumerable<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> source;
    private readonly Func<T, bool> predicate;
    
    internal FilteredEnumerable(IEnumerable<T> source,
                                Func<T, bool> predicate)
    {
        this.source = source;
        this.predicate = predicate;
    }
    
    public IEnumerator<T> GetEnumerator()
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
    
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
    
    public FilteredEnumerable<T> Where(Func<T, bool> extraPredicate)
    {
        return new FilteredEnumerable<T>(source,
                                         x => this.predicate(x) && extraPredicate(x));
    }
}

public static class Extensions
{
    public static IEnumerable<T> Where<T>(this IEnumerable<T> source,
                                          Func<T, bool> predicate)
    {
        var filtered = source as FilteredEnumerable<T>;
        return filtered == null ? new FilteredEnumerable<T>(source, predicate)
                                : filtered.Where(predicate);
    }
}

Let’s run the same tests as before, and see how we do…

Depth   Time for "always false"   Time for "always true"
1       240                       504
2       275                       605
3       332                       706
4       430                       797
5       525                       892
6       558                       980
7       638                       1085
8       715                       1208
9       802                       1343
10      1312                      2768

Oh look – it feels like the real LINQ to Objects implementation. A bit slower, certainly, but that’s okay. The way it trends is the same. In particular, it’s slower than the naive implementation for the "always false" filter, but faster for the "always true" filter.

Could things be improved?

The problem here is the creation of nested delegates. We end up with a large stack (which appears to be causing a bigger problem when it reaches a depth of 10) when really we want to just build another delegate.

The thought occurs that we could potentially use expression trees to do this. Not in a signature-compatible way with LINQ to Objects, but we should be able to combine (a => a == 3) and (b => b != 10) into (x => x == 3 && x != 10) effectively. Then when we’re asked to iterate, we just need to compile the expression tree to a delegate, and filter using that single, efficient delegate.

There are three problems with this:

  • It goes against the normal approach of LINQ to Objects, using delegates instead of expression trees. Heck, with expression trees throughout we could do all kinds of interesting optimizations.
  • Creating and compiling the expression tree may be more expensive than the iteration – we don’t really know. It depends on how large the collection is, etc.
  • It’s somewhat complicated to implement, because we need to rewrite the constituent expressions; note how "a" and "b" both become "x" in the example above. This is the main reason I haven’t actually bothered trying this in order to benchmark it :) (There’s a rough sketch of the rewriting just after this list.)
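That said, here’s a rough, untested sketch of the kind of rewriting involved, using ExpressionVisitor to replace each lambda’s parameter with a shared one (all the names here are mine, not LINQ to Objects’):

// Combines (a => a == 3) and (b => b != 10) into (x => x == 3 && x != 10).
static Expression<Func<T, bool>> CombinePredicates<T>(
    Expression<Func<T, bool>> first,
    Expression<Func<T, bool>> second)
{
    ParameterExpression x = Expression.Parameter(typeof(T), "x");
    Expression left = new ParameterReplacer(first.Parameters[0], x).Visit(first.Body);
    Expression right = new ParameterReplacer(second.Parameters[0], x).Visit(second.Body);
    return Expression.Lambda<Func<T, bool>>(Expression.AndAlso(left, right), x);
}

// Rewrites references to one parameter into references to another.
class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression from;
    private readonly ParameterExpression to;

    internal ParameterReplacer(ParameterExpression from, ParameterExpression to)
    {
        this.from = from;
        this.to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node == from ? to : base.VisitParameter(node);
    }
}

When asked to iterate, we’d compile the combined tree once – CombinePredicates(p1, p2).Compile() – and filter using that single delegate.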

There are various other options to do with building a more efficient way of evaluating the predicates. One pretty simple (but repetitive) solution is to have one class per number of predicates, with a field per predicate – FilteredEnumerable1, FilteredEnumerable2 etc. When you get bored (e.g. FilteredEnumerable9) you construct any further ones by combining predicates as per the LINQ to Objects approach. For example, here’s an implementation of FilteredEnumerable3:

public class FilteredEnumerable3<T> : IFilteredEnumerable<T>
{
    private readonly IEnumerable<T> source;
    private readonly Func<T, bool> predicate0;
    private readonly Func<T, bool> predicate1;
    private readonly Func<T, bool> predicate2;
    
    internal FilteredEnumerable3(IEnumerable<T> source,
                                 Func<T, bool> predicate0,
                                 Func<T, bool> predicate1,
                                 Func<T, bool> predicate2)
    {
        this.source = source;
        this.predicate0 = predicate0;
        this.predicate1 = predicate1;
        this.predicate2 = predicate2;
    }
    
    public IEnumerator<T> GetEnumerator()
    {
        foreach (T item in source)
        {
            if (predicate0(item) &&
                predicate1(item) &&
                predicate2(item))
            {
                yield return item;
            }
        }
    }
    
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
    
    public IFilteredEnumerable<T> Where(Func<T, bool> extraPredicate)
    {
        return new FilteredEnumerable4<T>(source,
            predicate0,
            predicate1,
            predicate2,
            extraPredicate);
    }
}

In this case, I’m rather pleased with the results:

Depth   Time for "always false"   Time for "always true"
1       237                       504
2       231                       551
3       231                       599
4       225                       625
5       228                       703
6       224                       787
7       222                       876
8       225                       966
9       232                       1069
10      219                       1191

Now that’s more like it. Other than a few cells, we’re actually outperforming LINQ to Objects – and still compatible with it. The only problem is that the cells where it’s slower are the common ones – the top few in the right hand column. And this was just going up to FilteredEnumerable4<T>.

I honestly don’t know why my FilteredEnumerable1<T> is slower than whatever’s in LINQ to Objects. But it’s nice to see the rest going quickly… I suspect there are some downsides as well, and basically I trust the team to make the right decisions for normal cases.

And there’s more…

You may be surprised to hear I’ve kept things simple here – as Eric mentioned to me, it’s not just Where that can be transformed in this way: Select works as well, as you can combine projections together pretty easily. So imagine that we’ve effectively got this transformation:

// Normal LINQ query
var query = list.Where(x => Condition1(x))
                .Where(x => Condition2(x))
                .Select(x => Projection1(x))
                .Select(y => Projection2(y));

// After optimization
var query = list.WhereSelect(x => Condition1(x) && Condition2(x),
                             x => Projection2(Projection1(x)));

Of course at that point, my FilteredEnumerableX<T> approach becomes even more unwieldy as you have two axes – the number of predicates and the number of projections.

Conclusion

As is so often the case with performance, there’s more to Where than there first appears. When I started benchmarking this I had no idea what I was getting myself into. The great thing is that very few people would ever need to know this. It’s all hidden by the abstraction.

Still, isn’t this sort of thing fun?

Eduasync part 10: CTP bug – don’t combine multiple awaits in one statement…

(This post covers project 12 in the source code.)

Last time, we saw what happens when we have multiple await expressions: we end up with multiple potential states in our state machine. That’s all fine, but there’s a known bug in the current CTP which affects the code generated when you have multiple awaits in the same statement. (Lucian Wischik calls these "nested awaits" although I personally don’t really think of one as being nested inside another.)

This post is mostly a cautionary tale for those using the CTP and deploying it to production – but it’s also interesting to see the kind of thing that can go wrong. I’m sure this will all be fixed well before release, and the team are already very aware of it. (I expect it’s been fixed internally for a while. It’s not the kind of bug you’d hold off fixing.)

Just as a reminder, here’s the code we used last time to demonstrate multiple await expressions:

// Code from part 9
private static async Task<int> Sum3ValuesAsyncWithAssistance()
{
    Task<int> task1 = Task.Factory.StartNew(() => 1);
    Task<int> task2 = Task.Factory.StartNew(() => 2);
    Task<int> task3 = Task.Factory.StartNew(() => 3);

    int value1 = await task1;
    int value2 = await task2;
    int value3 = await task3;

    return value1 + value2 + value3;
}

To simplify things a bit, let’s reduce it to two tasks. Then as a refactoring, I’m going to perform the awaiting within the summation expression:

private static async Task<int> Sum2ValuesAsyncWithAssistance()
{
    Task<int> task1 = Task.Factory.StartNew(() => 1);
    Task<int> task2 = Task.Factory.StartNew(() => 2); 

    return await task1 + await task2;
}

So, with the local variables value1 and value2 gone, we might expect that in the generated state machine, we’d have lost the corresponding fields. But should we, really?

Think what happens if task1 has completed (and we’ve fetched the results), but task2 hasn’t completed by the time we await it. So we call OnCompleted and exit as normal, then keep going when the continuation is invoked.

At that point we should be ready to return… but we’ve got to use the value returned from task1. We need the result to be stored somewhere, and fields are basically all we’ve got.

Unfortunately, in the current async CTP we don’t have any of these – we just have two local awaiter result variables. Here’s the decompiled code, stripped of the outer skeleton as normal:

  int awaitResult1 = 0;
  int awaitResult2 = 0;
  switch (state)
  {
      case 1:
          break;

      case 2:
          goto Label_Awaiter2Continuation;

      default:
          if (state != -1)
          {
              task1 = Task.Factory.StartNew(() => 1);
              task2 = Task.Factory.StartNew(() => 2);

              awaiter1 = task1.GetAwaiter();
              if (awaiter1.IsCompleted)
              {
                  goto Label_GetAwaiter1Result;
              }
              state = 1;
              doFinallyBodies = false;
              awaiter1.OnCompleted(moveNextDelegate);
          }
          return;
  }
  state = 0;
Label_GetAwaiter1Result:
  awaitResult1 = awaiter1.GetResult();
  awaiter1 = default(TaskAwaiter<int>);

  awaiter2 = task2.GetAwaiter();
  if (awaiter2.IsCompleted)
  {
      goto Label_GetAwaiter2Result;
  }
  state = 2;
  doFinallyBodies = false;
  awaiter2.OnCompleted(moveNextDelegate);
  return;
Label_Awaiter2Continuation:
  state = 0;
Label_GetAwaiter2Result:
  awaitResult2 = awaiter2.GetResult();
  awaiter2 = default(TaskAwaiter<int>);

  result = awaitResult1 + awaitResult2;

The first half of the code looks like last time (with the natural adjustments for only having two tasks) – but look at what happens when we’ve fetched the result of task1. We fetch the result into awaitResult1, which is a local variable. We then don’t touch awaitResult1 again unless we reach the end of the code – which will not happen if awaiter2.IsCompleted returns false. When the method returns, the local variable’s value is lost forever… although it will be reset to 0 next time we re-enter, and nothing in the generated code will detect a problem.

So depending on the timing of the two tasks, the final result can be 3 (if the second task has finished by the time it’s checked), or 2 (if we need to add a continuation to task2 instead). This is easy to verify by forcing the tasks to sleep before they return.
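For example, here’s a sketch of forcing the wrong answer: delay the second task so that awaiter2.IsCompleted is false when we reach it, meaning we have to add a continuation and lose awaitResult1:

Task<int> task1 = Task.Factory.StartNew(() => 1);
Task<int> task2 = Task.Factory.StartNew(() =>
{
    Thread.Sleep(500); // Ensure task2 isn't complete when it's awaited
    return 2;
});

return await task1 + await task2; // Evaluates to 2 under the buggy CTP codegen, not 3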

Conclusion

The moral of this post is threefold:

  • In general, treat CTP and beta-quality code appropriately: I happened to run into this bug, but I’m sure there are others. This is in no way a criticism of the team. It’s just a natural part of developing a complex feature.
  • To avoid this specific bug, all you have to do is make sure that all relevant state is safely stored in local variables before any logical occurrence of "await" – see the example just after this list.
  • If you’re ever writing a code generator which needs to store state, remember that that state can include expressions which have only been half evaluated so far.
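Here’s what the second bullet looks like when applied to our example – just reintroduce the local variables, which the compiler hoists into fields in the state machine:

private static async Task<int> Sum2ValuesSafely()
{
    Task<int> task1 = Task.Factory.StartNew(() => 1);
    Task<int> task2 = Task.Factory.StartNew(() => 2);

    // Each await result lands in a local variable (hoisted to a field),
    // so nothing is lost if we return and resume between the two awaits.
    int value1 = await task1;
    int value2 = await task2;
    return value1 + value2;
}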

For the moment, this is all I’ve got on the generated code. Next time, we’ll start looking at how exceptions are handled, and in particular how the CTP behaviour differs from the implementation we’ve seen so far. After a few posts on exceptions, I’ll start covering some daft and evil things I’ve done with async.