
Book Review: Async in C# 5.0


A while ago I was attending one of the Developer, Developer, Developer conferences in Reading, and I heard Alex Davies give a talk about actors and async. He mentioned that he was in the process of writing a short book for O’Reilly about async in C# 5, and I offered to review it for him. Many months later (sorry Alex!) I’m finally getting round to it.

Disclaimer: The review copy was given to me for free, and equally the book is arguably a competitor of the upcoming 3rd edition of C# in Depth from the view of readers who already own the 2nd edition… so you could say I’m biased in both directions. Hopefully they cancel out.

This is a book purely on async. It’s not a general C# book, and it doesn’t even cover the tiny non-async features in C# 5. It’s all about asynchrony. As you’d expect, it’s therefore pretty short (92 pages) and can comfortably be consumed in a single session. Alex’s writing style is informal and easy to read. Of course the topic of the book is anything but simple, so even though you may read the whole book in one go first time, that doesn’t mean you’re likely to fully internalize it straight away. The book is divided into 15 short chapters, so you can revisit specific areas as and when you need to.


I’ve been writing and speaking about async for about two and a half years now. I’ve tried various ways of explaining it, and I’m pretty sure it’s one of those awkward concepts which really just needs to click eventually. I’ve had some mails from people for whom my explanation was the one to do the trick… and other mails from folks who only "got it" after seeing another perspective. I’d encourage anyone learning about async to read a variety of books, articles, blog posts and so on. I don’t even think it’s a matter of finding the single "right" explanation for you – it’s a matter of letting them all percolate.

The book covers all the topics you’d expect it to:

  • Why asynchrony is important
  • Drawbacks of library-only approaches
  • How async/await behaves in general
  • Threading and synchronization contexts
  • Exceptions
  • Different code contexts (ASP.NET, WinRT, regular UI apps)
  • How async code is compiled

Additionally there are brief sections on unit testing, parallelism and actors. Personally I’d have preferred the actors part to be omitted, with more discussion on the testing side – particularly in terms of how to write deterministic asynchronous tests. However, I know that Alex is a big fan of actors, so I can forgive a little self-indulgence on that front.

There’s one area where I’m not sure I agree with the advice in the book: exceptions. Alex repeatedly gives the advice that you shouldn’t let exceptions go unobserved. I used to go along with that almost without thinking – but now I’m not so sure. There are definitely cases where that advice holds, but I’m not as comfortable with it as a global rule as I used to be. I’ll try to put my thoughts in order on this front and blog about this separately at a later date.

That aside, this is a good, pragmatic book. To be honest, I suspect no book on async is going to go into quite as many details as the PFX team blog, and that’s probably a good thing. But "Async in C# 5.0" is a very good starting point for anyone wanting to get to grips with async, and I in no way begrudge any potential C# in Depth 3rd edition sales I may lose by saying so ;)

The future of “C# in Depth”

I’m getting fairly frequent questions – mostly on Twitter – about whether there’s going to be a third edition of C# in Depth. I figure it’s worth answering it once in some detail rather than repeatedly in 140 characters ;)

I’m currently writing a couple of new chapters covering the new features in C# 5 – primarily async, of course. The current "plan" is that these will be added to the existing 2nd edition to create a 3rd edition. There will be minimal changes to the existing text of the 2nd edition – basically going over the errata and editing a few places which ought to mention C# 5 early. (In particular the changes to how foreach loop variables are captured.)

So there will definitely be new chapters. I’m hoping there’ll be a full new print (and ebook of course) edition, but no contracts have been signed yet. I’m hoping that the new chapters will be provided free electronically to anyone who’s already got the ebook of the 2nd edition – but we’ll see. Oh, and I don’t have any timelines at the moment. Work is more demanding than it was when I was writing the first and second editions, but obviously I’ll try to get the job done at a reasonable pace. (Writing about async in a way which is both accessible and accurate is really tricky, by the way.)

Of course when I’ve finished those, I’ve got two other C# books I want to be writing… when I’m not working on Noda Time, Tekpub screencasts etc…


I had a question on Twitter around the "two other C# books". I don’t want to go into too many details – partly because they’re very likely to change – but my intention is to write "C# from Scratch" and "C# in Style". The first would be for complete beginners; the second wouldn’t go into "how things work" so much as "how to use the language most effectively." (Yes, competition for Effective C#.) One possibility is that both would be donationware, at least in ebook form, ideally with community involvement in terms of public comments.

I’m hoping that both will use the same codebase as an extended example, where "From Scratch" would explain what the code does, and "In Style" would explain why I chose that approach. Oh, and "From Scratch" would use unit testing as a teaching tool wherever possible, attempting to convey the idea that it’s something every self-respecting dev does :)

Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

A while ago I blogged about what had changed under the hood of async between the CTP and the VS11 Preview. Well, now that the VS11 Beta is out, it’s time to do it all again…

Note that the code in this post is in the Eduasync codebase, under a different solution (Eduasync VS11.sln). Many of the old existing projects won’t compile with VS11 beta, but I’d rather leave them as they are for posterity, showing the evolution of the feature.

Stephen Toub has an excellent blog post covering some of this, so while I’ll mention things he’s covered, I won’t go into much detail about them. Let’s start off there though…

(EDIT: Stephen has also mailed me with some corrections, which I’ve edited in – mostly without indication, as the post has been up for less than seven hours, and it’ll make for a better reading experience.)

Awaiter pattern changes

The awaiter pattern is now not just a pattern. The IsCompleted property and GetResult method are still "loose" but OnCompleted is now part of an interface: INotifyCompletion. Awaiters have to implement INotifyCompletion, but may also implement ICriticalNotifyCompletion and its UnsafeOnCompleted method.

The OnCompleted method is just as it was before, and needs to flow the execution context; the UnsafeOnCompleted method is simpler, as it doesn’t need to flow the execution context. All of this only matters if you’re implementing your own awaiters, of course. (More details in Stephen’s blog post. I’ve found this area somewhat confusing, so please do read his post carefully!)
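As a rough illustration of the shape described above, here is a minimal custom awaiter. The names ImmediateAwaitable, ImmediateAwaiter and AwaiterDemo are hypothetical and not part of Eduasync; the awaiter always reports itself as complete, so neither completion method is actually invoked in this sketch.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical awaitable: it is always already complete and yields a fixed value.
public struct ImmediateAwaitable
{
    private readonly int value;
    public ImmediateAwaitable(int value) { this.value = value; }
    public ImmediateAwaiter GetAwaiter() { return new ImmediateAwaiter(value); }
}

// ICriticalNotifyCompletion extends INotifyCompletion, so both methods are required.
public struct ImmediateAwaiter : ICriticalNotifyCompletion
{
    private readonly int value;
    public ImmediateAwaiter(int value) { this.value = value; }

    // Still "loose" pattern members: the compiler finds these by name.
    public bool IsCompleted { get { return true; } }
    public int GetResult() { return value; }

    // INotifyCompletion member. A real awaiter would flow the execution context here.
    public void OnCompleted(Action continuation) { continuation(); }

    // ICriticalNotifyCompletion member. No execution context flow is required.
    public void UnsafeOnCompleted(Action continuation) { continuation(); }
}

public static class AwaiterDemo
{
    public static async Task<int> AddOneAsync(int input)
    {
        int value = await new ImmediateAwaitable(input);
        return value + 1;
    }

    public static void Main()
    {
        Console.WriteLine(AddOneAsync(41).Result); // 42
    }
}
```

Removing the interface from ImmediateAwaiter should make the await expression fail to compile, which is a quick way to see that OnCompleted is no longer matched purely by pattern.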

Skeleton method changes

Just as I have previously, I’m using the (entirely unofficial) term "skeleton method" to mean the very short method created by the compiler with the same signature as an async method: this is the entry point to the async method, effectively, which creates and starts the state machine containing all the real logic.

There are two changes I’ve noticed in the skeleton method. Firstly, for some reason the state machine state numbers have changed. Whereas previously a state number of 0 meant "initial or running", positive values meant "between calls, or navigating back to the await expression" and -1 meant "finished", now -1 means "initial or running", non-negative means "between calls, or navigating back to await expression" and -2 means "finished". It’s not clear why this change has been made, given that it requires an extra assignment at the start of every skeleton method (to set the state to -1).

More importantly, the skeleton method no longer calls MoveNext directly on the state machine that it’s built. Instead, it calls Start<TStateMachine> on the AsyncTaskMethodBuilder<T> (or whichever method builder it’s using). It passes the state machine by reference (presumably for efficiency), and TStateMachine is constrained to implement the now-public-and-in-mscorlib IAsyncStateMachine interface. I’ll come back to the relationship between the state machine and the builder later on.
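To make the skeleton's job concrete, here is a hand-written approximation of what a skeleton method plus state machine might look like for an async method with no awaits. It is a sketch, not decompiled output: DoubleStateMachine and DoubleAsync are made-up names, and the state numbers simply follow the beta convention described above.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class SkeletonDemo
{
    // Hand-written stand-in for a compiler-generated state machine with no awaits.
    private struct DoubleStateMachine : IAsyncStateMachine
    {
        public int state;
        public AsyncTaskMethodBuilder<int> builder;
        public int input;

        public void MoveNext()
        {
            int result;
            try
            {
                result = checked(input * 2);
            }
            catch (Exception e)
            {
                state = -2;              // finished (with an exception)
                builder.SetException(e);
                return;
            }
            state = -2;                  // finished
            builder.SetResult(result);
        }

        public void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            builder.SetStateMachine(stateMachine);
        }
    }

    // The "skeleton": same signature as the async method it stands in for.
    public static Task<int> DoubleAsync(int input)
    {
        DoubleStateMachine machine = new DoubleStateMachine();
        machine.input = input;
        machine.builder = AsyncTaskMethodBuilder<int>.Create();
        machine.state = -1;                   // initial or running
        machine.builder.Start(ref machine);   // the builder calls MoveNext, not the skeleton
        return machine.builder.Task;
    }

    public static void Main()
    {
        Console.WriteLine(DoubleAsync(21).Result); // 42
    }
}
```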

Task caching

(Code is in project 30: TaskCaching)

It’s possible for an async method to complete entirely synchronously. In this situation, the result is known before the method returns, and the task returned by the method is already in the RanToCompletion state. If two tasks have already run to completion with the same value, they can be (apparently) regarded as equivalent… so the beta now caches a task in this situation, for some types and values. (Apparently the preview cached too, but I hadn’t noticed, and the beta caches more.) According to my experiments and some comments:

  • For int, tasks with values -1 to 8 inclusive are cached
  • For bool, both values (true and false)
  • For char, byte, sbyte, short, ushort, uint, long, ulong, IntPtr and UIntPtr tasks with value 0 (or '\0') are cached
  • For reference types, null is cached
  • For other types, no tasks are cached

EDIT: After they’ve completed, tasks are normally immutable except for disposal – the cached tasks are tweaked slightly to make disposal a no-op.
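One way to probe the caching on whatever runtime you happen to have is to call a synchronously-completing async method twice and compare the returned references. This is only a sketch (TaskCachingDemo and its members are hypothetical names), and the set of cached values is an implementation detail, so the output may well differ from the list above.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskCachingDemo
{
#pragma warning disable 1998 // deliberately no await: the method completes synchronously
    public static async Task<int> ReturnAsync(int value)
    {
        return value;
    }
#pragma warning restore 1998

    // True if two synchronous completions with the same value share one Task object.
    public static bool IsCached(int value)
    {
        return ReferenceEquals(ReturnAsync(value), ReturnAsync(value));
    }

    public static void Main()
    {
        for (int value = -3; value <= 10; value++)
        {
            Console.WriteLine("{0}: cached = {1}", value, IsCached(value));
        }
    }
}
```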

State machine interface changes

In the VS11 preview release, each state machine implemented an interface, but that interface was internal to the generated assembly, and contained a single method (SetMoveNextDelegate). It’s now a public interface, IAsyncStateMachine, with two methods: MoveNext and SetStateMachine.
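Rather than restating the declaration from memory, a small reflection sketch can list the members of System.Runtime.CompilerServices.IAsyncStateMachine on your own framework. InterfaceDemo and MethodNames are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;
using System.Runtime.CompilerServices;

public static class InterfaceDemo
{
    // Names of the methods declared on IAsyncStateMachine, sorted for stable output.
    public static string[] MethodNames()
    {
        return typeof(IAsyncStateMachine)
            .GetMethods()
            .Select(method => method.Name)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", MethodNames())); // MoveNext, SetStateMachine
    }
}
```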

Personally I’m not keen on the naming of "MoveNext" – I can’t help but feel that if we didn’t have the "naming baggage" of IEnumerator and the fact that at least early on, the code generator was very similar to that used for iterator blocks, we’d have something different. (It is moving to the next state of the state machine, but it still doesn’t quite feel right.) I’d favour something like "ContinueExecution". However, it doesn’t matter – it obviously does what you’d expect, and you’re not going to be calling this yourself.

SetStateMachine is a stranger beast. The documentation states:

Configures the state machine with a heap-allocated replica.

… which says almost nothing, really. The implementation is always simple, just like SetMoveNextDelegate was, although this time it delegates to the builder for the real work (a common theme, as we’ll see):

void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine param0)
{
    this.builder.SetStateMachine(param0);
}

Now AsyncTaskMethodBuilder.SetStateMachine is also documented pretty sparsely:

Associates the builder with the specified state machine.

Again, no real help. However, we’ll see that it’s the builder which is responsible for calling OnCompleted now, and as it can call MoveNext on an IAsyncStateMachine, it makes sense to tell it which state machine it’s associated with… but can’t it do that directly?

Well, not quite. The problem (as I understand it) is around when boxing occurs. We initially create the state machine on the stack, and it contains the builder. (Both are structs.) That’s fine until we need a continuation, but then we’ve got to be able to get back to the current state later, after the current stack frame has been blown away. So we need to box the state machine. That will create a copy of the current builder (within the box). We need the builder within the boxed state machine to contain a reference to the same box. So the order has to be:

  • Box the state machine
  • Tell the state machine about the boxed reference
  • The state machine tells its nested builder about the boxed reference

Back when the state machine was in charge of the boxing, this went via the delegate: the act of creating the box was implicit when creating the delegate, and then casting the delegate target to the interface type allowed a reference to the newly-created delegate to be set within the copy. This is similar, but using the builder instead. It’s hard to follow, but of course it’s not going to matter.

State machine field changes

There are various kinds of fields in the state machine:

  • Those corresponding with local variables and parameters in the async method
  • The state
  • The field(s) associated with awaiters
  • (In the preview/beta) The field associated with the "current execution stack" at the point of an await expression
  • (In the CTP) An unused "disposing" field

Of these, I believe only the awaiters have actually changed, but before we talk about that, let’s revisit local variables.

Local variable hoisting

I’ve just noticed that local variables are only hoisted to fields when their scope contains an await expression, but in that case all local variables of that scope are hoisted, whether or not they’re used "across" awaits. It would be possible to hoist only those which need to be maintained between executions, but then you wouldn’t be able to see the others when debugging, which would be somewhat confusing. Likewise local variables of the same type which are never propagated across the same await could be aliased. For example, consider this async method:

static async Task<int> M(Random rng)
{
    int x = rng.Next(1000);
    int y = x + rng.Next(1000);
    await Task.Yield();
    int z = y + rng.Next(1000);
    await Task.Yield();
    return z;
}

If the compiler could be confident you didn’t need to debug through this code, it could make do with one field of type "Random" and one field of type "int" – x can be a completely local variable in MoveNext (it’s not used between two awaits) and y and z can be aliased (we never need the value of y after we’ve first written to z).

Local variable aliasing probably isn’t particularly useful for "normal" methods as the JIT may be able to do it (so long as you don’t have a debugger attached, potentially) but in this case we expect the state machine to be boxed at some point, so it potentially does make a difference (while the stack is typically reasonably small, you could have a lot of outstanding async methods in some scenarios). Maybe in a future release, the C# compiler could have an aggressive optimization mode around this, to be turned on explicitly. (I don’t think it should be a particularly high priority, mind you.)

Awaiter fields

(Code is in project 31, AwaiterFields.)

Awaiter fields have changed a bit over the course of async’s history.

In the CTPs (all of them, I believe) each await expression had its own awaiter field in the state machine, with a type corresponding to the declared awaiter type from the awaitable. (Remember that the awaitable is the thing you await, such as a task, and the awaiter is what you get back from calling GetAwaiter on the awaitable).

In the VS11 Preview, there was always a single awaiter field of type object. From what I saw, it was usually populated with a single-element array containing an awaiter. For value type awaiters (i.e. where the awaiter is a struct) this is somewhat similar to boxing, but maintaining strong typing, so calls to IsCompleted etc can still be made. It’s possible that reference type awaiters were stored without the array creation, as it would serve no purpose. (I don’t have any machines with just the preview installed to verify this.)

In the Beta, we have a mixture. If there are any reference type awaiters, they all end up being stored in a single field of type object, which is then cast back to the actual type when required. (Don’t forget that only one awaiter can be "active" at a time, which makes this possible.) This includes awaiters of an interface type – it’s only the compile-time type declared as the return type of the GetAwaiter method of the awaitable which is important.

If any of the awaiter types are value types, each of these types gets its own field. So there might be a TaskAwaiter<int> field and a TaskAwaiter<string> field, for example. However, there can still be "sharing" going on: if there are multiple await expressions all of the same value type awaiter, they will all share a single field. (This all feels a little like the JITting of generics, but it’s somewhat coincidental.)
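If you want to see which fields your own compiler generates, reflection over the state machine type is one option. The sketch below assumes the compiler emits AsyncStateMachineAttribute on the async method (current compilers do when the attribute type is available); AwaiterFieldDemo and SampleAsync are hypothetical names. Field names and counts are compiler-specific, so no particular output is claimed.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class AwaiterFieldDemo
{
    // Two kinds of value type awaiter: YieldAwaiter (twice) and TaskAwaiter<int> (twice).
    public static async Task<int> SampleAsync()
    {
        await Task.Yield();
        int a = await Task.FromResult(1);
        int b = await Task.FromResult(2);
        await Task.Yield();
        return a + b;
    }

    // Finds the compiler-generated state machine type via the attribute on the method.
    public static Type StateMachineType()
    {
        MethodInfo method = typeof(AwaiterFieldDemo).GetMethod("SampleAsync");
        AsyncStateMachineAttribute attribute =
            method.GetCustomAttribute<AsyncStateMachineAttribute>();
        return attribute.StateMachineType;
    }

    public static void Main()
    {
        BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo field in StateMachineType().GetFields(flags))
        {
            Console.WriteLine("{0} {1}", field.FieldType, field.Name);
        }
    }
}
```

Counting the awaiter-typed fields in the output shows whether awaits of the same awaiter type share a field on your compiler, and the hoisted locals show up in the same listing.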

MoveNext method changes

(Code is in project 32, BetaStateMachine)

As I’ve mentioned earlier, the builder is now responsible for a lot more of the work than it was in earlier versions. The majority of the code remains the same as far as I can tell, in terms of handling branching, evaluating expressions with multiple await expressions and so on.  The code in the source repository shows what the complete state machine looks like, but for the sake of clarity, I’ll just focus on a single snippet. If we have an await expression like this:

await x;

then the state machine code in the VS11 Preview would look something like this:

localTaskAwaiter = x.GetAwaiter();
if (localTaskAwaiter.IsCompleted)
{
    goto AwaitCompleted;
}
this.state = 1;
TaskAwaiter[] awaiterArray = { localTaskAwaiter };
this.awaiter = awaiterArray;
Action continuation = this.MoveNextDelegate;
if (continuation == null)
{
    Task<int> task = this.builder.Task;
    continuation = MoveNext;
    ((IStateMachine) continuation.Target).SetMoveNextDelegate(continuation);
}


(That’s just setting up the await, of course – there’s then the bit where the result is fetched, but that’s less interesting. There’s also the matter of doFinallyBodies.)

In the VS11 Beta, it’s something like this instead (for an awaiter type "AwaiterType" which implements ICriticalNotifyCompletion, in a state machine of type ThisStateMachine).

localAwaiter = x.GetAwaiter();
if (!localAwaiter.IsCompleted)
{
    state = 0;
    awaiterField = localAwaiter;
    builder.AwaitUnsafeOnCompleted<AwaiterType, ThisStateMachine>(ref localAwaiter, ref this);
    doFinallyBodies = false;
}

If the awaiter type only implements INotifyCompletion, it calls AwaitOnCompleted instead. Note how the calls are generic (but both type variables are constrained to implement appropriate interfaces) which avoids boxing.

The call to the builder will call back to the state machine’s SetStateMachine method if this is the first awaiter that hasn’t already completed within this execution of the async method. So that handles the section which checked for the continuation being null in the first block of code. Most of the rest of the change is explained by the difference in awaiter types, and obviously AwaitOnCompleted/AwaitUnsafeOnCompleted also ends up calling into OnCompleted on the awaiter itself.
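For a version you can actually run, here is a hand-written state machine with a single await that goes through AwaitUnsafeOnCompleted. It is a simplified sketch rather than real compiler output: AddOneStateMachine and AddOneAsync are invented names, and the doFinallyBodies handling is left out because this method has no try/finally blocks of its own.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Roughly what "return (await source) + 1;" could turn into.
    private struct AddOneStateMachine : IAsyncStateMachine
    {
        public int state;
        public AsyncTaskMethodBuilder<int> builder;
        public Task<int> source;
        private TaskAwaiter<int> awaiterField;

        public void MoveNext()
        {
            int result;
            try
            {
                TaskAwaiter<int> localAwaiter;
                if (state == 0)
                {
                    // Resuming after the await: restore the awaiter from its field.
                    localAwaiter = awaiterField;
                    awaiterField = default(TaskAwaiter<int>);
                    state = -1;
                }
                else
                {
                    localAwaiter = source.GetAwaiter();
                    if (!localAwaiter.IsCompleted)
                    {
                        state = 0;
                        awaiterField = localAwaiter;
                        // Generic call: neither the awaiter nor the state machine is boxed here.
                        builder.AwaitUnsafeOnCompleted(ref localAwaiter, ref this);
                        return;
                    }
                }
                result = localAwaiter.GetResult() + 1;
            }
            catch (Exception e)
            {
                state = -2;
                builder.SetException(e);
                return;
            }
            state = -2;
            builder.SetResult(result);
        }

        public void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            builder.SetStateMachine(stateMachine);
        }
    }

    public static Task<int> AddOneAsync(Task<int> source)
    {
        AddOneStateMachine machine = new AddOneStateMachine();
        machine.source = source;
        machine.builder = AsyncTaskMethodBuilder<int>.Create();
        machine.state = -1;
        machine.builder.Start(ref machine);
        return machine.builder.Task;
    }

    public static void Main()
    {
        TaskCompletionSource<int> tcs = new TaskCompletionSource<int>();
        Task<int> task = AddOneAsync(tcs.Task);
        Console.WriteLine(task.IsCompleted); // False: the source hasn't completed yet
        tcs.SetResult(41);
        Console.WriteLine(task.Result);      // 42
    }
}
```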

Mutable value type awaiters

(Code is in project 32, MutableAwaiters)

One subtle difference which really shouldn’t hurt people but is fun to explore is what happens if you have an awaiter which is a mutable value type. Due to the way awaiters were carefully handled pre-beta, mutations which were conducted as part of OnCompleted would still be visible in GetResult. That’s not the case in the beta (as Stephen mentions in his blog post). Mind you, it doesn’t mean that all mutations will be ignored… just ones in OnCompleted. A mutation from IsCompleted is still visible, as shown here:

public struct MutableAwaiter : INotifyCompletion
{
    private string message;

    public MutableAwaiter(string message)
    {
        this.message = message;
    }

    public bool IsCompleted
    {
        get
        {
            message = "Set in IsCompleted";
            return false;
        }
    }

    public void OnCompleted(Action action)
    {
        message = "Set in OnCompleted";
        // Ick! Completes inline. Never mind, it’s only a demo…
        action();
    }

    public string GetResult()
    {
        return message;
    }
}

What would you expect to be returned from this awaiter? You can verify that all three members are called… but "Set in IsCompleted" is returned. That’s because IsCompleted is called before the awaiter value is copied into a field within the state machine. Even though the state machine passes the awaiter by reference, it’s passing the local variable, which is of course a separate variable from the field.
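The underlying copy semantics can be shown without any async machinery at all. In this sketch (Note, Holder and CopyDemo are made-up names) a struct is copied into a field and then the local is mutated through a ref parameter, mirroring the "local passed by reference, field read later" sequence described above.

```csharp
using System;

public struct Note
{
    public string Message;

    public void Rewrite(string message)
    {
        Message = message;
    }
}

public sealed class Holder
{
    public Note Field;
}

public static class CopyDemo
{
    private static void RewriteByRef(ref Note note)
    {
        note.Rewrite("Set via ref to local");
    }

    // Returns the local's message followed by the field's message.
    public static string[] Run()
    {
        Note local = new Note();
        local.Rewrite("Set before copy");

        Holder holder = new Holder();
        holder.Field = local;       // copies the struct value into the field

        RewriteByRef(ref local);    // mutates the local variable only
        return new[] { local.Message, holder.Field.Message };
    }

    public static void Main()
    {
        string[] messages = Run();
        Console.WriteLine(messages[0]); // Set via ref to local
        Console.WriteLine(messages[1]); // Set before copy
    }
}
```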

I’m absolutely not suggesting that you should rely on any of this behaviour. If you really need to be able to mutate your awaiter, make it a reference type.


The main changes in the Beta are around the interactions between the AsyncTaskMethodBuilder (et al) and the state machine, including the new interfaces for awaiters. There’s been quite a bit of optimization, although I still see room for a bit more:

  • When there’s only a single kind of reference type awaiter, the field for storing it could be of that type rather than of type object, removing the need for an execution-time cast
  • The "stack" variable could be removed in some cases, and made into a specific type in many others
  • With appropriate optimization flags, local variables which aren’t used across await expressions could stay local to the state machine instead of being hoisted, and hoisted variables could be aliased in some cases.

One thing which concerns me slightly is how the C# language specification is going to change – the addition of the new interfaces is definitely going to mean more complexity from this previously "tidy" feature. I’m sure it’s worth it for the sake of efficiency and the like, but part of me sighs at every added tweak.

So, is this now close to the finished version of async? Only time will tell. I haven’t checked whether dynamic awaitables have finally been introduced… if they have, I’ll put that in the next post.

Eduasync part 19: ordering by completion, ahead of time…

Today’s post involves the MagicOrdering project in source control (project 28).

When I wrote part 16 of Eduasync, showing composition in the form of majority voting, one reader mailed me a really interesting suggestion. We don’t really need to wait for any of the tasks to complete on each iteration of the loop – we only need to wait for the next task to complete. Now that sounds impossible – sure, it’s great if we know the completion order of the tasks, but half the point of asynchrony is that many things can be happening at once, and we don’t know when they’ll complete. However, it’s not as silly as it sounds.

If you give me a collection of tasks, I’ll give you back another collection of tasks which will return the same results – but I’ll order them so that the first returned task will have the same result as whichever of your original tasks completes first, and the second returned task will have the same result as whichever of your original tasks completes second, and so on. They won’t be the same tasks as you gave me, reordered – but they’ll be tasks with the same results. I’ll propagate cancellation, exceptions and so on.

It still sounds impossible… until you realize that I don’t have to associate one of my returned tasks with one of your original tasks until it has completed. Before anything has completed, all the tasks look the same. The trick is that as soon as I see one of your tasks complete, I can fetch the result and propagate it to the first of the tasks I’ve returned to you, using TaskCompletionSource<T>. When the second of your tasks completes, I propagate the result to the second of the returned tasks, etc. This is all quite easy using Task<T>.ContinueWith – barring a few caveats I’ll mention later on.

Once we’ve built a method to do this, we can then really easily build a method which is the async equivalent of Parallel.ForEach (and indeed you could write multiple methods for the various overloads). This will execute a specific action on each task in turn, as it completes… it’s like repeatedly calling Task.WhenAny, but we only actually need to wait for one task at a time, because we know that the first task in our "completion ordered" collection will be the first one to complete (duh).

Show me the code!

Enough description – let’s look at how we’ll demonstrate both methods, and then how we implement them.

private static async Task PrintDelayedRandomTasksAsync()
{
    Random rng = new Random();
    var values = Enumerable.Range(0, 10).Select(_ => rng.Next(3000)).ToList();
    Console.WriteLine("Initial order: {0}", string.Join(" ", values));

    var tasks = values.Select(DelayAsync);

    var ordered = OrderByCompletion(tasks);

    Console.WriteLine("In order of completion:");
    await ForEach(ordered, Console.WriteLine);
}

/// <summary>
/// Returns a task which delays (asynchronously) by the given number of milliseconds,
/// then returns that same number back.
/// </summary>
private static async Task<int> DelayAsync(int delayMillis)
{
    await TaskEx.Delay(delayMillis);
    return delayMillis;
}

The idea is that we’re going to create 10 tasks which each just wait for some random period of time, and return the same time period back. We’ll create them in any old order – but obviously they should complete in (at least roughly) the same order as the returned numbers.

Once we’ve created the collection of tasks, we’ll call OrderByCompletion to create a second collection of tasks, returning the same results but this time in completion order – so ordered.ElementAt(0) will be the first task to complete, for example.

Finally, we call ForEach and pass in the ordered task collection, along with Console.WriteLine as the action to take with each value. We await the resulting Task to mimic blocking until the foreach loop has finished. Note that we could make this a non-async method and just return the task returned by ForEach, given that that’s our only await expression and it’s right at the end of the method. This would be marginally faster, too – there’s no need to build an extra state machine. See Stephen Toub’s article about async performance for more information.
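The "just return the task" alternative looks like this in isolation. The method names are hypothetical, and the sketch uses Task.Yield rather than the demo's delay so that it stands alone.

```csharp
using System;
using System.Threading.Tasks;

public static class ElisionDemo
{
    public static async Task<int> InnerAsync(int value)
    {
        await Task.Yield();
        return value;
    }

    // Async wrapper: the compiler builds a second state machine just to forward the result.
    public static async Task<int> WithAwait(int value)
    {
        return await InnerAsync(value);
    }

    // Non-async wrapper: hands back the inner task directly, with no extra state machine.
    public static Task<int> WithoutAwait(int value)
    {
        return InnerAsync(value);
    }

    public static void Main()
    {
        Console.WriteLine(WithAwait(1).Result);    // 1
        Console.WriteLine(WithoutAwait(2).Result); // 2
    }
}
```

One design point to keep in mind: in the non-async form, an exception thrown by the wrapper itself before it obtains the inner task propagates to the caller directly instead of being stored in the returned task.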


I’d like to get ForEach out of the way first, as it’s so simple: it’s literally just iterating over the tasks, awaiting them and propagating the result to the action. We get the "return a task which will wait until we’ve finished" for free by virtue of making it an async method.

/// <summary>
/// Executes the given action on each of the tasks in turn, in the order of
/// the sequence. The action is passed the result of each task.
/// </summary>
private static async Task ForEach<T>(IEnumerable<Task<T>> tasks, Action<T> action)
{
    foreach (var task in tasks)
    {
        T value = await task;
        action(value);
    }
}

Simple, right? Let’s get onto the meat…


OrderByCompletion is the tricky bit, and I’ve actually split it into two methods to make it slightly easier to comprehend. The PropagateResult method feels like it could be useful in other composition methods, too.

The basic plan is:

  • Copy the input tasks to a list: we need to work out how many there are and iterate over them, so let’s make sure we only iterate once
  • Create a collection of TaskCompletionSource<T> references, one for each input task. Note that we’re not associating any particular input task with any particular completion source – we just need the same number of them
  • Declare an integer to keep track of "the next available completion source"
  • Attach a continuation to each input task which will increment the counter we’ve just declared, and propagate the just-completed task’s status
  • Return a view onto the collection of TaskCompletionSource<T> values, projecting each one to its Task property

Once you’re happy with the idea, the implementation isn’t too surprising (although it is quite long):

/// <summary>
/// Returns a sequence of tasks which will be observed to complete with the same set
/// of results as the given input tasks, but in the order in which the original tasks complete.
/// </summary>
private static IEnumerable<Task<T>> OrderByCompletion<T>(IEnumerable<Task<T>> inputTasks)
{
    // Copy the input so we know it’ll be stable, and we don’t evaluate it twice
    var inputTaskList = inputTasks.ToList();

    // Could use Enumerable.Range here, if we wanted…
    var completionSourceList = new List<TaskCompletionSource<T>>(inputTaskList.Count);
    for (int i = 0; i < inputTaskList.Count; i++)
    {
        completionSourceList.Add(new TaskCompletionSource<T>());
    }

    // At any one time, this is "the index of the box we’ve just filled".
    // It would be nice to make it nextIndex and start with 0, but Interlocked.Increment
    // returns the incremented value…
    int prevIndex = -1;

    // We don’t have to create this outside the loop, but it makes it clearer
    // that the continuation is the same for all tasks.
    Action<Task<T>> continuation = completedTask =>
    {
        int index = Interlocked.Increment(ref prevIndex);
        var source = completionSourceList[index];
        PropagateResult(completedTask, source);
    };

    foreach (var inputTask in inputTaskList)
    {
        // TODO: Work out whether TaskScheduler.Default is really the right one to use.
        inputTask.ContinueWith(continuation,
                               CancellationToken.None,
                               TaskContinuationOptions.ExecuteSynchronously,
                               TaskScheduler.Default);
    }

    return completionSourceList.Select(source => source.Task);
}

/// <summary>
/// Propagates the status of the given task (which must be completed) to a task completion source
/// (which should not be).
/// </summary>
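private static void PropagateResult<T>(Task<T> completedTask,
    TaskCompletionSource<T> completionSource)
{
    switch (completedTask.Status)
    {
        case TaskStatus.Canceled:
            completionSource.TrySetCanceled();
            break;
        case TaskStatus.Faulted:
            completionSource.TrySetException(completedTask.Exception.InnerExceptions);
            break;
        case TaskStatus.RanToCompletion:
            completionSource.TrySetResult(completedTask.Result);
            break;
        default:
            // TODO: Work out whether this is really appropriate. Could set
            // an exception in the completion source, of course…
            throw new ArgumentException("Task was not completed");
    }
}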

You’ll notice there are a couple of TODO comments there. The exception in PropagateResult really shouldn’t happen – the continuation shouldn’t be called when the task hasn’t completed. I still need to think carefully about how tasks should propagate exceptions though.

The arguments to ContinueWith are more tricky: working through my TimeMachine class and some unit tests with Bill Wagner last week showed just how little I know about how SynchronizationContext, the task awaiters, task schedulers, and TaskContinuationOptions.ExecuteSynchronously all interact. I would definitely need to look into that more deeply before TimeMachine was really ready for heavy use… which means you should probably be looking at the TPL in more depth too.
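Because the demo uses random delays, its output changes from run to run. For a deterministic check, the input tasks can be driven by TaskCompletionSource instead. The sketch below is a condensed, self-contained variant of the approach (OrderingDemo and Run are hypothetical names, and the propagation logic is inlined), not the Eduasync code itself. It waits for each ordered task before completing the next input, so it doesn't rely on the continuation running inline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class OrderingDemo
{
    public static List<Task<T>> OrderByCompletion<T>(IEnumerable<Task<T>> inputTasks)
    {
        List<Task<T>> inputs = inputTasks.ToList();
        List<TaskCompletionSource<T>> sources =
            inputs.Select(_ => new TaskCompletionSource<T>()).ToList();
        int prevIndex = -1;

        foreach (Task<T> input in inputs)
        {
            input.ContinueWith(completed =>
            {
                // Claim the next free completion source, whichever input just finished.
                TaskCompletionSource<T> source =
                    sources[Interlocked.Increment(ref prevIndex)];
                if (completed.IsCanceled)
                {
                    source.TrySetCanceled();
                }
                else if (completed.IsFaulted)
                {
                    source.TrySetException(completed.Exception.InnerExceptions);
                }
                else
                {
                    source.TrySetResult(completed.Result);
                }
            },
            CancellationToken.None,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
        }
        return sources.Select(source => source.Task).ToList();
    }

    // Completes three inputs in the order c, a, b and reports the observed order.
    public static string Run()
    {
        var a = new TaskCompletionSource<string>();
        var b = new TaskCompletionSource<string>();
        var c = new TaskCompletionSource<string>();
        List<Task<string>> ordered = OrderByCompletion(new[] { a.Task, b.Task, c.Task });

        c.SetResult("c");
        ordered[0].Wait();
        a.SetResult("a");
        ordered[1].Wait();
        b.SetResult("b");
        ordered[2].Wait();

        return string.Join(" ", ordered.Select(task => task.Result));
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // c a b
    }
}
```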


Sure enough, when you run the code, the results appear in order, as the tasks complete. Here’s one sample of the output:

Initial order: 335 468 1842 1991 2512 2603 270 2854 1972 1327
In order of completion:

TODOs aside, the code in this post is remarkable (which I can say with modesty, as I’ve only refactored it from the code sent to me by another reader and Stephen Toub). It makes me smile every time I think about the seemingly-impossible job it accomplishes. I suspect this approach could be useful in any number of composition blocks – it’s definitely one to remember.

Eduasync part 18: Changes between the Async CTP and the Visual Studio 11 Preview

In preparation for CodeMash, I’ve been writing some more async code and decompiling it with Reflector. This time I’m using the Visual Studio 11 Developer Preview – the version which installs alongside Visual Studio 2010 under Windows 7. (Don’t ask me about any other features of Visual Studio 11 – I haven’t explored it thoroughly; I’ve really only used it for the C# 5 bits.)

There have been quite a few changes since the CTP – they’re not visible changes in terms of code that you’d normally write, but the state machine generated by the C# compiler is reasonably different. In this post I’ll describe the differences, as best I understand them. There are still a couple of things I don’t understand (which I’ll highlight within the post) but overall, I think I’ve got a pretty good handle on why the changes have been made.

I’m going to assume you already have a reasonable grasp of the basic idea of async and how it works – the way that the compiler generates a state machine to represent an async method or anonymous function, with originally-local variables being promoted to instance variables within the state machine, etc. If the last sentence was a complete mystery to you, see Eduasync part 7 for more information. I don’t expect you to remember the exact details of what was in the previous CTP though :)

Removal of iterator block leftovers

In the CTP, the code for async methods was based on the iterator block implementation. I suspect that’s still the case, but possibly sharing just a little less code. There used to be a few methods and fields which weren’t used in async methods, but now they’re gone:

  • There’s now no constructor, so no need for the "skeleton" method which replaces the real async method to pass in 0 as the initial state.
  • There’s no Dispose method.
  • There’s no disposing field.

It’s nice to see these gone, but it’s not terribly interesting. Now on to the bigger changes…

Large structural changes

There’s a set of related structural changes which don’t make sense individually. I’ll describe them first, then look at how it all hangs together, and my guess as to the reasoning behind them.

The state machine is now a struct

The declaration of the nested type for the state machine is now something like this:

private struct StateMachine : IStateMachine
{
    // Fields common to all async state machines
    // (with caveats)
    private int state;
    private object awaiter;
    public AsyncTaskMethodBuilder<int> builder;
    public Action moveNextDelegate;
    private object stack;

    // Hoisted local variables

    // Methods
    public void SetMoveNextDelegate(Action action) { … }
    public void MoveNext() { … }
}

The caveats around the common fields are in terms of the return type of the async method (which determines the type of builder used) and whether or not there are any awaits (if there are no awaits, the stack and awaiter fields aren’t generated).

Note that throughout this blog post I’ve changed the names of fields and types – in reality they’re all "unspeakable" names including angle-brackets, just like all compiler-generated names.

There’s a new assembly-wide interface

As you can see from the code above, the state machine implements an interface (actually called <>t__IStateMachine). One of these is created in the global namespace in each assembly that contains at least one async method or anonymous function, and it looks like this:

internal interface IStateMachine
{
    void SetMoveNextDelegate(Action action);
}

The implementation for this method is always the same, and it’s trivial:

public void SetMoveNextDelegate(Action action)
{
    this.moveNextDelegate = action;
}

Simplified skeleton method

The method which starts the state machine, which I’ve been calling the "skeleton" method everywhere, is now a bit simpler than it was. Something like this:

public static Task<int> FooAsync()
{
    StateMachine machine = new StateMachine();
    // The builder type matches the Task<int> return type
    machine.builder = AsyncTaskMethodBuilder<int>.Create();
    machine.MoveNext();
    return machine.builder.Task;
}

In fact if you decompile the IL, you’ll see that it doesn’t explicitly initialize the variable to start with – it just declares it, sets the builder field and then calls MoveNext(). That’s not valid C# (as all the struct’s fields aren’t initialized), but it is valid IL. It’s equivalent to the code above though. Note how there’s nothing to set the continuation – where previously the moveNextDelegate field would be populated within the skeleton method.

Just-in-time delegate creation

Now that the skeleton method doesn’t create the delegate representing the continuation, it can be done when it’s first required – which is when we first encounter an await expression for an awaitable which hasn’t already completed. (If the awaitable has completed before we await it, the generated code skips the continuation and just uses the results immediately and synchronously).

The code for that delegate creation is slightly trickier than you might expect, however. It looks something like this:

Action action = this.moveNextDelegate;
if (action == null)
{
    Task<int> task = this.builder.Task;
    action = new Action(this.MoveNext);
    ((IStateMachine) action.Target).SetMoveNextDelegate(action);
}

There are two oddities here, one of which I mostly understand and one of which I don’t understand at all.

I really don’t understand the "task" variable here. Why do we need to exercise the AsyncTaskMethodBuilder.Task property? We don’t use the result anywhere… does forcing this flush some memory buffer? I have no clue on this one. (See the update at the bottom of the post…)

The part about setting the delegate via the interface makes more sense, but it’s subtle. You might expect code like this:

// Looks sensible, but is actually slightly broken
Action action = this.moveNextDelegate;
if (action == null)
{
    action = new Action(this.MoveNext);
    this.moveNextDelegate = action;
}

That would sort of work – but we’d end up needing to recreate the delegate each time we encountered an appropriate await expression. Although the above code saves the value to the field, it saves it within the current value of the state machine… after we’ve boxed that value as the target of the delegate. The value we want to mutate is the one within the box – which is precisely why there’s an interface, and why the code casts to it.
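To see the boxing behaviour in isolation, here’s a minimal sketch. The names (Machine, IHasDelegate, BoxingDemo) are made up for illustration and aren’t what the compiler generates; the point is only that assigning to the field changes the local copy, whereas calling through the interface changes the boxed copy that the delegate targets.

```csharp
using System;

// Hypothetical stand-in for the generated interface
interface IHasDelegate
{
    void SetDelegate(Action action);
}

// Hypothetical stand-in for the generated state machine struct
struct Machine : IHasDelegate
{
    public Action Cached;

    public void MoveNext() { }

    public void SetDelegate(Action action)
    {
        Cached = action;
    }
}

static class BoxingDemo
{
    public static bool FieldAssignmentReachesBox()
    {
        Machine machine = new Machine();
        // Creating the delegate boxes a copy of the struct as its target
        Action action = new Action(machine.MoveNext);
        // This only changes the local value, not the boxed copy
        machine.Cached = action;
        return ((Machine) action.Target).Cached != null;
    }

    public static bool InterfaceCallReachesBox()
    {
        Machine machine = new Machine();
        Action action = new Action(machine.MoveNext);
        // Casting the box to the interface doesn't copy it,
        // so this call mutates the boxed value in place
        ((IHasDelegate) action.Target).SetDelegate(action);
        return ((Machine) action.Target).Cached != null;
    }

    static void Main()
    {
        Console.WriteLine(FieldAssignmentReachesBox()); // False
        Console.WriteLine(InterfaceCallReachesBox());   // True
    }
}
```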

We can’t even just unbox and then set the field afterwards – at least in C# – because the unbox operation is always followed by a copy operation in normal C#. I believe it would be possible for the C# compiler to generate IL which unboxed action.Target without the copy, and then set the field in that. It’s not clear to me why the team went with the interface approach instead… I would expect that to be slower (as it requires dynamic dispatch) but I could easily be wrong. Of course, it would also make it impossible to decompile the IL to C#, which would make my talks harder, but I don’t expect the C# team to bend the compiler implementation for my benefit ;)

(As an aside to all of this, I’ve gone back and forth on whether the "slightly broken" implementation would recreate the delegate on every appropriate await, or only twice. I think it would end up being on every occurrence, as even though on the second occurrence we’d be operating within the context of the first boxed instance, the new delegate would have a reference to a new boxed copy each time. It does my head in a little bit, trying to think about this… more evidence that mutable structs are evil and hard to reason about. It’s not the wrong decision in this case, hidden far from the gaze of normal developers, but it’s a pain to reason about.)

Single awaiter variable

In the CTP, each await expression generated a separate field within the state machine, and that field was always of the exact awaiter type. In the VS11 Developer Preview, there’s always exactly one awaiter field (assuming there’s at least one await expression) and it’s always of type object. It’s used like this:

  // Single local variable used by both continuation and first-time paths
  TaskAwaiter<int> localAwaiter;


  if (conditions-for-first-time-execution)
  {
      // Code before await

      localAwaiter = task.GetAwaiter();
      if (localAwaiter.IsCompleted)
      {
          goto Await1Completed;
      }
      this.state = 1;
      TaskAwaiter<int>[] awaiterArray = { localAwaiter };
      this.awaiter = awaiterArray;
      // Lazy delegate creation goes here, after which the continuation
      // is handed to the awaiter and the method returns
      return;
  }
  // Continuation would get into here
  localAwaiter = ((TaskAwaiter<int>[]) this.awaiter)[0];
  this.awaiter = null;
  this.state = 0;
Await1Completed:
  int result = localAwaiter.GetResult();
  localAwaiter = default(TaskAwaiter<int>);

I realize there’s a lot of code here, but it does make some sense:

  • The value of the awaiter field is always either null, or a reference to a single-element array of the awaiter type for one of the await expressions.
  • A single localAwaiter variable is shared between the two code paths, populated either from the awaitable (on the initial code path) or by copying the value from the array (in the second code path).
  • The field is always set to null and the local variable is set to its default value after use, presumably for the sake of garbage collection

It’s basically a nice way of using the fact that we’ll only ever need one awaiter at a time. It’s not clear to me why an array is used instead of either using a reference to the awaiter for class-based awaiters, or simply by boxing for struct-based awaiters. The latter would need the same "unbox without copy" approach discussed in the previous section – so if there’s some reason why that’s actually infeasible, it would explain the use of an array here. We can’t use the interface trick in this case, as the compiler isn’t in control of the awaiter type (so can’t make it implement an interface).

Expression stack preservation

This one is actually a fix to a bug in the async CTP, which I’ve written about before. We’re used to the stack containing our local variables (in the absence of iterator blocks, captured variables etc, and modulo the stack being an implementation detail) but it’s also used for intermediate results within a single statement. For example, consider this block of code:

int x = 10;
int y = 5;
int z = x + 50 * y;

That last line is effectively:

  • Load the value of x onto the stack
  • Load the value 50 onto the stack
  • Load the value of y onto the stack
  • Multiply the top two stack values (50 and y) leaving the result on the stack
  • Add the top two stack values (x and the previously-computed result) leaving the result on the stack
  • Store the top stack value into z
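Those steps can be acted out with an explicit stack. This is only a sketch of the idea using Stack<int> (the StackDemo and Evaluate names are mine); the real evaluation stack belongs to the CLR and isn’t an object you can get hold of.

```csharp
using System;
using System.Collections.Generic;

static class StackDemo
{
    // Evaluates x + 50 * y using the same sequence of stack operations
    public static int Evaluate(int x, int y)
    {
        var stack = new Stack<int>();
        stack.Push(x);   // Load x
        stack.Push(50);  // Load 50
        stack.Push(y);   // Load y

        // Multiply the top two values (50 and y)
        int right = stack.Pop();
        int left = stack.Pop();
        stack.Push(left * right);

        // Add the top two values (x and the product)
        right = stack.Pop();
        left = stack.Pop();
        stack.Push(left + right);

        // Store the top value into z
        int z = stack.Pop();
        return z;
    }

    static void Main()
    {
        Console.WriteLine(Evaluate(10, 5)); // 260
    }
}
```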

Now suppose we want to turn y into a Task<int>:

int x = 10;
Task<int> y = Task.FromResult(5);
int z = x + 50 * await y;

Our state machine needs to make sure that it will preserve the same behaviour as the synchronous version, so it needs the same sort of stack. In the new-style state machine, all of that stack is saved in the "stack" field. It’s only one field, but may need to represent multiple different types within the code at various different await expressions – in the code above, for example, it represents two int values. As far as I can discover, the C# compiler generates code which uses the actual type of the value it needs, if it only requires a single value. If it needs multiple values, it uses an appropriate Tuple type, nesting tuples if it goes beyond the number of type parameters supported by the Tuple<…> family of types. So in our case above, we end up with code a bit like this:

// Local variable used in both paths
Tuple<int, int> tuple;

// Code before the await
if (conditions-for-first-time-execution)
{
    tuple = new Tuple<int, int>(this.x, 50);
    this.stack = tuple;
}

// Continuation would get into here
tuple = (Tuple<int, int>) this.stack;
// IL copies the values from the tuple onto the stack at this point
this.stack = null;

// Both the fast and slow code paths get here eventually
this.z = stack0 + stack1 * awaiter.GetResult();

I say it’s a bit like this, because it’s hard to represent the IL exactly in C# in this case. The tuple is only created if it’s needed, i.e. not in the already-completed fast path. In that case, the values are loaded onto the stack but not then put into the tuple – execution skips straight to the code which uses the values already on the stack.

When the awaitable isn’t complete immediately, then a Tuple<int, int> is created, stored in the "stack" field, and the continuation is handed to the awaiter. On continuation, the tuple is loaded back from the "stack" field (and cast accordingly), the values are loaded onto the stack – and then we’re back into the common code path of fetching the value and performing the add and multiply operations.
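The visible effect of all this is simply that the expression gives the same answer whether or not the method had to suspend in the middle of it. Here’s a small sketch which exercises both paths; StackPreservationDemo and its methods are names I’ve invented, and it says nothing about how the values are stored, only that they survive.

```csharp
using System;
using System.Threading.Tasks;

static class StackPreservationDemo
{
    // Same shape as the expression above: x and 50 are evaluated
    // before the await, so they have to survive any suspension.
    public static async Task<int> ComputeAsync(Task<int> y)
    {
        int x = 10;
        return x + 50 * await y;
    }

    // Slow path: the task isn't complete when it's awaited
    public static int RunSlowPath()
    {
        var source = new TaskCompletionSource<int>();
        Task<int> result = ComputeAsync(source.Task);
        if (result.IsCompleted)
        {
            throw new InvalidOperationException("Expected the method to be suspended");
        }
        source.SetResult(5);
        return result.Result;
    }

    // Fast path: the task has already completed, so there's no suspension
    public static int RunFastPath()
    {
        return ComputeAsync(Task.FromResult(5)).Result;
    }

    static void Main()
    {
        Console.WriteLine(RunSlowPath()); // 260
        Console.WriteLine(RunFastPath()); // 260
    }
}
```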


As far as I’m aware, those are the most noticeable changes in the generated code. There may well still be a load more changes in Task<T> and the TPL in general – I wouldn’t be at all surprised – but that’s harder to investigate.

I’m sure all of this has been done in the name of performance (and correctness, in the case of stack preservation). The state machine is now much smaller in terms of the number of fields it requires, and objects are created locally as far as possible (including the state machine itself only requiring heap allocation if there’s ever a "slow" awaitable). I suspect there’s still some room for optimization, however:

  • Both the awaiter and the delegate use careful boxing and either arrays or a mutating interface to allow the boxed value to be changed. I suspect that using unbox with the concrete type, but without copying the value, would be more efficient. I may attempt to work this theory up into a test at some point.
  • If there’s only one awaiter type (usually TaskAwaiter<T> for some T), that type could be used instead of object, potentially reducing heap allocation
  • I’ve no idea why the builder.Task property is explicitly fetched and then the results discarded
  • If there’s only one await expression, the "stack" field can be strongly typed, which would also avoid boxing if only a single value needs to be within that stack
  • The stack field could be removed entirely when it’s not needed for intermediate stack value preservation. (I believe that would be the case reasonably often.)

The use of mutable value types is really fascinating (for me, at least) – I’m sure most people on the C# team would still say they’re evil, but when they’re used in a carefully controlled environment where real developers don’t have to reason about their behaviour, they can be useful.

Next time, I’ll hopefully get back to the idea I promised to write up before, about ordering a collection of tasks in completion order… before they’ve completed. (Well, sort of.) Should be fun…

Update (January 16th 2012)

Stephen Toub got in touch with me after I posted the original version of this blog entry, to explain the use of the Task property. Apparently the idea is that at this point, we know we’re going to return out of the state machine, so the skeleton method is going to access the Task property anyway. However, as we haven’t scheduled the continuation yet we also know that nothing will be accessing the Task property on a different thread. If we access it now for the first time, we can lazily allocate the task in the same thread that created the AsyncMethodBuilder, with no risk of contention. If we can force there to be a task ready and waiting for whatever accesses it later, we don’t need any synchronization in that area.

So why might we want to allocate the task lazily in the first place? Well, don’t forget that we might never have to wait for an await (as it were). We might just have an async method which takes the fast path everywhere. If that’s the case, then for certain cases (e.g. a non-generic, successfully completed task, or a Task<bool> which again has completed successfully) we can reuse the same instance repeatedly. Apparently this laziness isn’t yet part of the VS11 Developer Preview, but the reason for the property access is in preparation for this.

Another case of micro-optimization – which is fair enough when it’s at a system level :)

Eduasync part 17: unit testing

In the last post I showed a method to implement "majority voting" for tasks, allowing a result to become available as soon as possible. At the end, I mentioned that I was reasonably confident that it worked because of the unit tests… but I didn’t show the tests themselves. I felt they deserved their own post, as there’s a bigger point here: it’s possible to unit test async code. At least sometimes.

Testing code involving asynchrony is generally a pain. Introducing the exact order of events that you want is awkward, as is managing the threading within tests. We have a few benefits with async methods:

  • We know that the async method itself will only execute in a single thread at a time
  • We can control the thread in which the async method will execute, if it doesn’t configure its awaits explicitly
  • Assuming the async method returns Task or Task<T>, we can check whether or not it’s finished
  • Between Task<T> and TaskCompletionSource<T>, we have a way of injecting tasks that we understand

Now in our sample method we have the benefit of passing in the tasks that will be awaited – but assuming you’re using some reasonably testable API to fetch any awaitables within your async method, you should be okay. (Admittedly in the current .NET framework that excludes rather a lot of classes… but the synchronous versions of those calls are generally hard to test too.)
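As a rough illustration of the last two points, here’s a sketch which injects a task built from TaskCompletionSource<T> into an async method and checks the returned task before and after completing it. ShoutAsync is a made-up method under test, not part of the majority voting code.

```csharp
using System;
using System.Threading.Tasks;

static class InjectionDemo
{
    // Hypothetical async method under test
    public static async Task<string> ShoutAsync(Task<string> input)
    {
        string value = await input;
        return value.ToUpperInvariant();
    }

    public static string RunSuccess()
    {
        var source = new TaskCompletionSource<string>();
        Task<string> result = ShoutAsync(source.Task);
        if (result.IsCompleted)
        {
            throw new InvalidOperationException("Shouldn't finish before its input");
        }
        // We decide exactly when the injected task completes
        source.SetResult("x");
        return result.Result;
    }

    public static TaskStatus RunFailure()
    {
        var source = new TaskCompletionSource<string>();
        Task<string> result = ShoutAsync(source.Task);
        source.SetException(new InvalidOperationException("Bang!"));
        try
        {
            result.Wait();
        }
        catch (AggregateException)
        {
            // Expected: the failure propagates through the await
        }
        return result.Status;
    }

    static void Main()
    {
        Console.WriteLine(RunSuccess()); // X
        Console.WriteLine(RunFailure()); // Faulted
    }
}
```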

The plan

For our majority tests, we want to be able to see what happens in various scenarios, with tasks completing at different times and in different ways. Looking at the test cases I’ve implemented I have the following tests:

  • NullSequenceOfTasks
  • EmptySequenceOfTasks
  • NullReferencesWithinSequence
  • SimpleSuccess
  • InputOrderIsIrrelevant
  • MajorityWithSomeDisagreement
  • MajorityWithFailureTask
  • EarlyFailure
  • NoMajority

I’m not going to claim this is a comprehensive set of possible tests – it’s a proof of concept more than anything else. Let’s take one test as an example: MajorityWithFailureTask. The aim of this is to pass three tasks (of type Task<string>) into the method. One will give a result of "x", the second will fail with an exception, and the third will also give a result of "x". The events will occur in that order, and only when all three results are in should the returned task complete, at which point it will also have a success result of "x".

So, the tricky bit (compared with normal testing) is introducing the timing. We want to make it appear as if tasks are completing in a particular order, at predetermined times, so we can check the state of the result between events.

Introducing the TimeMachine class

Okay, so it’s a silly name. But the basic idea is to have something to control the logical flow of time through our test. We’re going to ask the TimeMachine to provide us with tasks which will act in a particular way at a given time, and then when we’ve started our async method we can then ask it to move time forward, letting the tasks complete as they go. It’s probably best to look at the code for MajorityWithFailureTask first, and then see what the implementation of TimeMachine looks like. Here’s the test:

[Test]
public void MajorityWithFailureTask()
{
    var timeMachine = new TimeMachine();
    // Second task fails with an exception
    var task1 = timeMachine.AddSuccessTask(1, "x");
    var task2 = timeMachine.AddFaultingTask<string>(2, new Exception("Bang!"));
    var task3 = timeMachine.AddSuccessTask(3, "x");

    var resultTask = MoreTaskEx.WhenMajority(task1, task2, task3);

    // Only one result so far – no consensus
    timeMachine.AdvanceTo(1);
    Assert.IsFalse(resultTask.IsCompleted);

    // Second result is a failure
    timeMachine.AdvanceTo(2);
    Assert.IsFalse(resultTask.IsCompleted);

    // Third result gives majority verdict
    timeMachine.AdvanceTo(3);
    Assert.AreEqual(TaskStatus.RanToCompletion, resultTask.Status);
    Assert.AreEqual("x", resultTask.Result);
}

As you can see, there are two types of method:

  • AddSuccessTask / AddFaultingTask / AddCancelTask (not used here) – these all take the time at which they’re going to complete as their first parameter, and the method name describes the state they’ll reach on completion. The methods return the task created by the time machine, ready to pass into the production code we’re testing.
  • AdvanceTo / AdvanceBy (not used here) – make the time machine "advance time", completing pre-programmed tasks as it goes. When those tasks complete, any continuations attached to them also execute, which is how the whole thing hangs together.

Now forcing tasks to complete is actually pretty simple, if you build them out of TaskCompletionSource<T> to start with. So all we need to do is keep our tasks in "time" order (which I achieve with SortedList), and then when we’re asked to advance time we move through the list and take the appropriate action for all the tasks which weren’t completed before, but are now. I represent the "appropriate action" as a simple Action, which is built with a lambda expression from each of the Add methods. It’s really simple:

public class TimeMachine
{
    private int currentTime = 0;
    private readonly SortedList<int, Action> actions = new SortedList<int, Action>();

    public int CurrentTime { get { return currentTime; } }

    public void AdvanceBy(int time)
    {
        AdvanceTo(currentTime + time);
    }

    public void AdvanceTo(int time)
    {
        // Okay, not terribly efficient, but it’s simple.
        foreach (var entry in actions)
        {
            if (entry.Key > currentTime && entry.Key <= time)
            {
                entry.Value();
            }
        }
        currentTime = time;
    }

    public Task<T> AddSuccessTask<T>(int time, T result)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetResult(result);
        return tcs.Task;
    }

    public Task<T> AddCancelTask<T>(int time)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetCanceled();
        return tcs.Task;
    }

    public Task<T> AddFaultingTask<T>(int time, Exception e)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetException(e);
        return tcs.Task;
    }
}

Okay, that’s a fair amount of code for a blog post (and yes, it could do with some doc comments etc!) but considering that it makes life testable, it’s pretty simple.

So, is that it?

It works on my machine… with my test runner… in simple cases…

When I first ran the tests using TimeMachine, they worked almost immediately. This didn’t surprise me nearly as much as it should have done. You see, when the tests execute, they use async/await in the normal way – which means the continuations are scheduled on "the current task scheduler". I have no idea what the current task scheduler is in unit tests. Or rather, it feels like something which is implementation specific. It could easily have worked when running the tests from ReSharper, but not from NCrunch, or not from the command line NUnit test runner.

As it happens, I believe all of these run tests on thread pool threads with no task scheduler allocated, which means that the continuation is attached to the task to complete "in-line" – so when the TimeMachine sets the result on a TaskCompletionSource, the continuations execute before that call returns. That means everything happens on one thread, with no ambiguity or flakiness – yay!
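Here’s a small sketch for checking that "in-line" behaviour directly. It explicitly clears the current SynchronizationContext and assumes the default task scheduler, so it only tells you about that configuration; InlineContinuationDemo and AwaitAndLog are illustrative names. Under those assumptions I’d expect the continuation’s log entry to land between the other two.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class InlineContinuationDemo
{
    private static async Task AwaitAndLog(Task<int> task, List<string> log)
    {
        int value = await task;
        log.Add("continuation saw " + value);
    }

    public static List<string> Run()
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        // Assumption: no synchronization context, default task scheduler
        SynchronizationContext.SetSynchronizationContext(null);
        try
        {
            var log = new List<string>();
            var source = new TaskCompletionSource<int>();
            Task pending = AwaitAndLog(source.Task, log);

            log.Add("before SetResult");
            // If the continuation runs in-line, it executes inside this call
            source.SetResult(42);
            log.Add("after SetResult");

            pending.Wait();
            return log;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    static void Main()
    {
        foreach (string entry in Run())
        {
            Console.WriteLine(entry);
        }
    }
}
```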

However, there are two problems:

  • The words "I believe" aren’t exactly confidence-inspiring when it comes to testing that your software works correctly.
  • Our majority voting code only ever sees one completed task at a time – we’re not testing the situation where several tasks complete so quickly together that the continuation doesn’t get a chance to run before they’ve all finished.

Both of these are solvable with a custom TaskScheduler or SynchronizationContext. Without diving into the docs, I’m not sure yet which I’ll need, but the aim will be:

  • Make TimeMachine implement IDisposable
  • In the constructor, set the current SynchronizationContext (or TaskScheduler) to a custom one having remembered what the previous one was
  • On disposal, reset the context
  • Make the custom scheduler keep a queue of jobs, such that when we’re asked to advance to time T, we complete all the appropriate tasks but don’t execute any continuations, then we execute all the pending continuations.

I don’t yet know how hard it will be, but hopefully the Parallel Extensions Samples will help me.


I’m not going to claim this is "the" way of unit testing asynchronous methods. It’s clearly a proof-of-concept implementation of what can only be called a "test framework" in the loosest possible sense. However, I hope it gives an example of a path we might take. I’m looking forward to seeing what others come up with, along with rather more polished implementations.

Next time, I’m going to shamelessly steal an idea that a reader mailed me (with permission, of course). It’s insanely cool, simple and yet slightly brain-bending, and I suspect will be handy in many situations. Love it.

Eduasync part 15: implementing COMEFROM with a horrible hack

Ages ago when I wrote my previous Eduasync post, I said we’d look at a pipeline model of coroutines. I’ve decided to skip that, as I do want to cover the topic of this post, and I’ve got some more "normal" async ideas to write about too. If you want to look at the pipeline coroutines code, it’s project 20 in the source repository. Have fun, and don’t blame me if you get confused reading it – so do I.

The code I am going to write about is horrible too. It’s almost as tricky to understand, and it does far nastier things. Things that the C# 5 specification explicitly says you shouldn’t do.

If it makes you feel any better when your head hurts reading this code, spare a thought for me – I haven’t looked at it in over six months, and I don’t have a blog post explaining how it’s meant to work. I just have almost entirely uncommented code which is designed to be hard to understand (in terms of the main program flow).

On no account should any code like this ever be used for anything remotely serious.

With that health warning out of the way, let’s have a look at it…

COMEFROM at the caller level

The idea is to implement the COMEFROM control structure, which is sort of the opposite of GOTO (or in my implementation, more of a GOSUB). There are two operations, effectively:

  • ComeFrom(label): Register interest in a particular label.
  • Label(label): If anyone has registered interest in the given label, keep going from their registration point (which will be within a method), then continue from where we left off afterwards.

In some senses it’s a little like the observer pattern, with labels taking the place of events. However, it looks entirely different and is much harder to get your head round, because instead of having a nicely-encapsulated action which is subscribed to an event, we just have a ComeFrom call which lets us jump back into a method somewhat arbitrarily.

I have two implementations, in project 22 and project 23 in source control. Project 22 is almost sane; a little funky, but not too bad. Project 23 is where the fun really happens. In addition to the operations listed above, there’s an Execute operation which is sort of an implementation detail – it allows an async method containing ComeFrom calls to be executed without returning earlier than we might want.

Let’s look at some code and the output, and try to work out what’s going on.

internal class Program
{
    private static void Main(string[] args)
    {
        Coordinator coordinator = new Coordinator(SimpleEntryPoint);
    }

    private static async void SimpleEntryPoint(Coordinator coordinator)
    {
        await coordinator.Execute(SimpleOtherMethod);

        Console.WriteLine("First call to Label(x)");
        await coordinator.Label("x");

        Console.WriteLine("Second call to Label(x)");
        await coordinator.Label("x");

        Console.WriteLine("Registering interesting in y");
        bool firstTime = true;
        await coordinator.ComeFrom("y");

        Console.WriteLine("After ComeFrom(y). FirstTime={0}", firstTime);

        if (firstTime)
        {
            firstTime = false;
            await coordinator.Label("y");
        }
        Console.WriteLine("Finished");
    }

    private static async void SimpleOtherMethod(Coordinator coordinator)
    {
        Console.WriteLine("Start of SimpleOtherMethod");

        int count = 0;
        await coordinator.ComeFrom("x");

        Console.WriteLine("After ComeFrom x in SimpleOtherMethod. count={0}. Returning.",
                          count);
        count++;
    }
}

The reason for the "Simple" prefix on the method names is that there’s another example in the same file, with a more complex control flow.

Here’s the output – then we can look at why we’re getting it…

Start of SimpleOtherMethod
After ComeFrom x in SimpleOtherMethod. count=0. Returning.
First call to Label(x)
After ComeFrom x in SimpleOtherMethod. count=1. Returning.
Second call to Label(x)
After ComeFrom x in SimpleOtherMethod. count=2. Returning.
Registering interesting in y
After ComeFrom(y). FirstTime=True
After ComeFrom(y). FirstTime=False
Finished

So, the control flow is a bit like this:

  • Start SimpleEntryPoint
    • Call into SimpleOtherMethod
      • Log "Start of SimpleOtherMethod"
      • Initialize the "count" variable with value 0
      • Register interest in x; ComeFrom remembers the continuation but keeps going.
      • Log "After ComeFrom x in SimpleOtherMethod. count=0. Returning."
      • Increment count to 1.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "First call to Label(x)"
  • Call Label("x")…
    • … which takes us back into SimpleOtherMethod (remember, the method we thought we’d finished executing?) just after ComeFrom
      • Log "After ComeFrom x in SimpleOtherMethod. count=1. Returning."
      • Increment count to 2.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "Second call to Label(x)"
  • Call Label("x")…
    • … which takes us back into SimpleOtherMethod again
      • Log "After ComeFrom x in SimpleOtherMethod. count=2. Returning."
      • Increment count to 3.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "Registering interest in y"
  • Initialize the "firstTime" variable with value true.
  • Register interest in y; ComeFrom remembers the continuation and keeps going
  • Log "After ComeFrom(y). FirstTime=True"
  • Check the value of firstTime… It’s true, so:
    • Set firstTime to false
    • Call Label("y")
  • … which takes us back to earlier in the method (just after ComeFrom), like a normal looping construct…
  • Log "After ComeFrom(y). FirstTime=False"
  • Check the value of firstTime… It’s false, so:
    • Log "Finished"
    • Exit!

Doing all of this has a few interesting challenges. Let’s look at them one at a time… and I would strongly advise you not to try to pay too much attention to the details.

Noting a continuation and continuing regardless…

Just as a quick reminder before we get cracking, it’s worth remembering that all of this is entirely synchronous, despite being implemented with async. There’s only a single user thread involved here. As with previous parts, we maintain a stack of actions to call, and basically keep calling from the top until we’re done – but the actions we call can create extra stack entries, of course.

ComeFrom has unusual semantics in terms of async. We want to remember the continuation and keep executing as if we didn’t need to wait. We can easily do one side or the other. If we wanted to just keep going without needing to know about the continuation, we could just return true from IsCompleted. If we just want to remember the continuation, we can make the awaiter’s IsCompleted property return false, and remember the continuation when it’s passed to OnCompleted. How do we do both?

Well, effectively we want to remember the continuation and then call it immediately. But we can’t just call it directly from OnCompleted, as otherwise each ComeFrom call would end up in a "real" execution stack frame, whereas our execution stack is stored as a Stack<Action>. So instead, we need to remember the continuation and immediately put it at the top of the stack.

However, that only works if as soon as the generated code returns from the async method containing the ComeFrom call, we go back into the state machine. If we’d just called SimpleOtherMethod directly in SimpleEntryPoint, we would have continued within SimpleEntryPoint with the new stack entry just waiting around. This is why we need the Execute method: that does exactly the same thing, effectively shuffling the stack around. When it’s given something to execute, it puts its own continuation on the action stack, then the action it’s been asked to execute, then returns. The top level code will then pick up the original action, and we’re away.
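The stack shuffling is easier to follow with the async machinery stripped away. This sketch uses plain delegates and a pump loop of my own invention (ActionStackDemo isn’t part of the Eduasync code) to show why pushing the caller’s continuation first and the action second means the action, and anything it pushes, runs before the caller resumes.

```csharp
using System;
using System.Collections.Generic;

static class ActionStackDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var stack = new Stack<Action>();

        Action callerContinuation = () => log.Add("caller continues");
        Action action = () =>
        {
            log.Add("action runs");
            // Stands in for a ComeFrom call putting its continuation on the stack
            stack.Push(() => log.Add("continuation pushed by action"));
        };

        // Same order as described: our own continuation, then the action
        stack.Push(callerContinuation);
        stack.Push(action);

        // Hypothetical top-level loop: keep calling from the top until done
        while (stack.Count > 0)
        {
            stack.Pop()();
        }
        return log;
    }

    static void Main()
    {
        // Prints: action runs, continuation pushed by action, caller continues
        foreach (string entry in Run())
        {
            Console.WriteLine(entry);
        }
    }
}
```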

So, here’s the code for Execute, which is the simplest part of the coordinator:

public ExecuteAwaiter Execute(Action<Coordinator> action)
{
    return new ExecuteAwaiter(() => action(this), this);
}

public class ExecuteAwaiter
{
    private readonly Action action;
    private readonly Coordinator coordinator;

    internal ExecuteAwaiter(Action action, Coordinator coordinator)
    {
        this.action = action;
        this.coordinator = coordinator;
    }

    public ExecuteAwaiter GetAwaiter()
    {
        return this;
    }

    // Always yield
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action callerContinuation)
    {
        // We want to execute the action continuation, then get back here,
        // allowing any extra continuations put on the stack *within* the action
        // to be executed.
        // (The field name "stack" is assumed: it's the coordinator's Stack<Action>.)
        coordinator.stack.Push(callerContinuation);
        coordinator.stack.Push(action);
    }

    public void GetResult()
    {
    }
}

All the awaitables in this project return themselves as the awaiter – when you don’t need any other state, it’s an easy step to take.

That’s all we need to say about Execute, but how exactly are we capturing the continuation in ComeFrom?

Capturing continuations

Once we’ve got the action stack shuffling under our belts, there are two more problems with ComeFrom:

  • What happens if we ComeFrom the same label twice?
  • How do we really capture a continuation?

The first point didn’t come up in the sample I’ve shown here, but it does come up in the more complex example – imagine if SimpleOtherMethod had two ComeFrom calls; when we jump back to the first one, we’ll execute the second one again. I made a simple policy decision to only allow a single "return point" for any label – if a ComeFrom call tries to register the continuation point that’s already recorded for a label, we ignore it; if it tries to register a different one, we throw an exception. So we only need to care about a single continuation for any label, which makes life easier.

The second point is trickier. If you remember back to earlier posts in this series, we saw that the state machine generated for async only really contains a single entry point (MoveNext) which is used for all continuations. A variable in the state machine is responsible for remembering where we were within it between calls. So in order to really make the continuation remember the point at which it needs to continue, we need to remember that state. We need to store an object for the continuation, which contains the delegate to invoke, and the state of the state machine when we were first passed the continuation. I’ve created a class for this, unimaginatively called Continuation, which looks like this:

/// <summary>
/// This hack allows a continuation to be executed more than once,
/// contrary to the C# spec. It does this using reflection to store the
/// value of the "state" field within the generated class. NEVER, EVER, EVER
/// try to use this in real code. It’s purely for fun.
/// </summary>
internal sealed class Continuation : IEquatable<Continuation>
{
    private readonly int savedState;
    private readonly object target;
    private readonly FieldInfo field;
    private readonly Action action;

    internal Continuation(Action action)
    {
        target = action.Target;
        field = target.GetType().GetField("<>1__state", BindingFlags.Instance | BindingFlags.NonPublic);
        savedState = (int) field.GetValue(target);
        this.action = action;
    }

    internal void Execute()
    {
        // Put the state machine back where it was when we captured it,
        // then invoke the continuation as normal.
        field.SetValue(target, savedState);
        action();
    }

    // Snip Equals/GetHashCode
}

Yes, we use reflection to fish out the <>1__state variable initially, and poke the state machine with the same value when we next want to execute the continuation. All highly implementation-specific, of course.
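The save-and-restore trick can be tried safely on an ordinary class. The sketch below is not the Continuation class: FakeStateMachine and its private "state" field are made up to stand in for the compiler-generated type, and only the reflection calls (GetField, GetValue, SetValue) are the same as above.

```csharp
using System;
using System.Reflection;

// Hypothetical class standing in for a compiler-generated state machine.
public sealed class FakeStateMachine
{
    private int state;

    public void Advance() { state++; }
    public int CurrentState { get { return state; } }
}

public static class StateSnapshotDemo
{
    public static int[] Run()
    {
        var machine = new FakeStateMachine();
        machine.Advance();                          // state is now 1

        FieldInfo field = typeof(FakeStateMachine).GetField(
            "state", BindingFlags.Instance | BindingFlags.NonPublic);
        int saved = (int) field.GetValue(machine);  // snapshot: 1

        machine.Advance();
        machine.Advance();                          // state is now 3
        int before = machine.CurrentState;

        field.SetValue(machine, saved);             // rewind to the snapshot
        return new[] { saved, before, machine.CurrentState };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "1, 3, 1"
    }
}
```

With the real state machine the field name is whatever the compiler happens to generate, which is exactly why this belongs in toy code only.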

Now the ComeFrom method is reasonably straightforward – all we need is a dictionary mapping labels to continuations. Oh, and the same action stack shuffling as for Execute:

// In the coordinator
private readonly Dictionary<string, Continuation> labels = new Dictionary<string, Continuation>();

public ComeFromAwaiter ComeFrom(string label)
{
    return new ComeFromAwaiter(label, this);
}

public struct ComeFromAwaiter
{
    private readonly string label;
    private readonly Coordinator coordinator;

    internal ComeFromAwaiter(string label, Coordinator coordinator)
    {
        this.label = label;
        this.coordinator = coordinator;
    }

    public ComeFromAwaiter GetAwaiter()
    {
        return this;
    }

    // We *always* want to be given the continuation
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action action)
    {
        Continuation newContinuation = new Continuation(action);
        Continuation oldContinuation;
        if (!coordinator.labels.TryGetValue(label, out oldContinuation))
        {
            // First time coming from this label. Always succeeds.
            coordinator.labels[label] = newContinuation;
        }
        else
        {
            // Current semantics are to prohibit two different ComeFrom calls for the same label.
            // An alternative would be to just replace the existing continuation with the new one,
            // in which case we wouldn’t need any of this – we could just use
            // coordinator.labels[label] = newContinuation;
            // unconditionally.
            if (!oldContinuation.Equals(newContinuation))
            {
                throw new InvalidOperationException("Additional continuation detected for label " + label);
            }
            // Okay, we’ve seen this one before. Nothing to see here, move on.
        }
        // We actually want to continue from where we were: we’re only really marking the
        // ComeFrom point.
        // (The field name "stack" is assumed: it's the coordinator's Stack<Action>.)
        coordinator.stack.Push(action);
    }

    public void GetResult()
    {
    }
}

There’s one interesting point here which is somewhat subtle, and screwed me up for a bit…

The default value of a struct is always valid…

You may have noticed that ComeFromAwaiter is a struct. That’s pretty unusual for me. However, it’s also absolutely critical. Without it, we’d get a NullReferenceException when we execute the continuation the second time.

Normally, the flow of async methods looks a bit like this, for an await expression taking the "long" route (i.e. IsCompleted is false):

  • Call GetAwaiter() and assign the result to an awaiter field
  • Call IsCompleted (which returns false in this scenario)
  • Set the state variable to remember where we’d got to
  • Call OnCompleted
  • Return
  • … When we continue…
  • Set state to 0 (running)
  • Call GetResult() on the awaiter
  • Set the awaiter field to default(TypeOfAwaiter)
  • Continue

Now that’s fine when we’re only continuing once – but if we need to jump into the middle of that sequence a second time, we’re going to call GetResult() on the awaiter field after it’s been set to the default value of the awaiter type. If the awaiter is a class, that default value is null and we’ll go bang. So we must use a struct.

Fortunately, our GetResult() call doesn’t need any of the state in the awaiter – it’s purely there to satisfy the normal flow of things. So we’re quite happy with a "default" ComeFrom awaiter.
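The difference is easy to see outside the state machine. This is a stripped-down sketch with two hypothetical awaiter types (StructAwaiter and ClassAwaiter are not from the project), each reduced to an empty GetResult():

```csharp
using System;

// Hypothetical awaiter types, reduced to the one member that matters here.
public struct StructAwaiter
{
    public void GetResult() { }
}

public class ClassAwaiter
{
    public void GetResult() { }
}

public static class DefaultAwaiterDemo
{
    public static bool StructSurvives()
    {
        StructAwaiter awaiter = default(StructAwaiter);
        awaiter.GetResult();   // fine: a default struct is still a usable value
        return true;
    }

    public static bool ClassThrows()
    {
        ClassAwaiter awaiter = default(ClassAwaiter);   // null
        try
        {
            awaiter.GetResult();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine("struct survives: " + StructSurvives());
        Console.WriteLine("class throws: " + ClassThrows());
    }
}
```

The struct version only works because GetResult() ignores the awaiter's fields. A default ComeFromAwaiter has a null label and a null coordinator, so anything that touched them would fail just as badly.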

Finally, labels…

We’ve now done all the hard work. The final piece of the puzzle is Label, which just needs to check whether there’s a continuation to jump to, and shuffle the action stack in the way we’re now painfully accustomed to:

public LabelAwaiter Label(string label)
{
    Continuation continuation;
    labels.TryGetValue(label, out continuation);
    return new LabelAwaiter(continuation, this);
}

public class LabelAwaiter
{
    private readonly Continuation continuation;
    private readonly Coordinator coordinator;

    internal LabelAwaiter(Continuation continuation, Coordinator coordinator)
    {
        this.continuation = continuation;
        this.coordinator = coordinator;
    }

    public LabelAwaiter GetAwaiter()
    {
        return this;
    }

    // If there’s no continuation to execute, just breeze through.
    public bool IsCompleted { get { return continuation == null; } }

    public void OnCompleted(Action action)
    {
        // We want to execute the ComeFrom continuation, then get back here.
        // (The field name "stack" is assumed: it's the coordinator's Stack<Action>.)
        coordinator.stack.Push(action);
        coordinator.stack.Push(continuation.Execute);
    }

    public void GetResult()
    {
    }
}

Almost painfully simple, really.

So that looks like all the code that’s used, right? Not quite.

Reusable builders?

As we saw in the sample code, we can end up finishing the same async method multiple times (SimpleOtherMethod completes three times). That’s going to call SetResult on the AsyncVoidMethodBuilder three times… which feels like it should go bang. Indeed, when I revisited my code earlier I wondered why it didn’t go bang – it’s the sort of illegal state transition the framework is usually pretty good at picking up on.

Then I remembered – this isn’t the framework’s AsyncVoidMethodBuilder – it’s mine. And my SetResult method in this project does absolutely nothing. How convenient!
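For a feel of the kind of check being dodged here, the sketch below uses TaskCompletionSource<T> rather than the framework's AsyncVoidMethodBuilder (whose behaviour isn't demonstrated here). TaskCompletionSource<T> is a related framework type that does reject a second completion, and the DoubleCompletionDemo class is purely illustrative.

```csharp
using System;
using System.Threading.Tasks;

// Illustrative only: shows a framework type refusing to complete twice.
public static class DoubleCompletionDemo
{
    public static bool SecondSetResultThrows()
    {
        var source = new TaskCompletionSource<int>();
        source.SetResult(1);
        try
        {
            source.SetResult(2);   // the task has already completed
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static bool SecondTrySetResultFails()
    {
        var source = new TaskCompletionSource<int>();
        source.SetResult(1);
        // The Try* variant reports failure instead of throwing.
        return !source.TrySetResult(2);
    }

    public static void Main()
    {
        Console.WriteLine("second SetResult throws: " + SecondSetResultThrows());
        Console.WriteLine("second TrySetResult fails: " + SecondTrySetResultFails());
    }
}
```

A builder whose SetResult is an empty method simply has no state to violate, which is why completing SimpleOtherMethod three times goes unnoticed.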

Make it stop, make it stop!

Okay, that was a pretty quick tour of some horrible code. You’ll never have to do anything like this with async in sane code, but it certainly made me painfully familiar with how it all worked. Just to recap on the oddities involved:

  • We needed to capture a continuation and then immediately keep going, almost as if the awaiter had said the awaitable had completed already. This involved shenanigans with the execution model and an extra method (Execute).
  • We needed to remember the state of a continuation, which we did with reflection.
  • We needed to make awaiter.GetResult() a valid call after awaiter had been reset to the default value for the type.
  • We needed to ensure that the builder created in the skeleton method could have SetResult called on it multiple times.

That’s all on continuations and co-routines, I promise.

Next time (hopefully soon) I’ll look at an example of how composition works so neatly in async, and then show how we can unit test async methods – at least sometimes.