Category Archives: C# 5

Book Review: Async in C# 5.0

Resources:

A while ago I was attending one of the Developer, Developer, Developer conference in Reading, and I heard Alex Davies give a talk about actors and async. He mentioned that he was in the process of writing a short book for O’Reilly about async in C# 5, and I offered to review it for him. Many months later (sorry Alex!) I’m finally getting round to it.

Disclaimer: The review copy was given to me for free, and equally the book is arguably a competitor of the upcoming 3rd edition of C# in Depth from the view of readers who already own the 2nd edition… so you could say I’m biased in both directions. Hopefully they cancel out.

This is a book purely on async. It’s not a general C# book, and it doesn’t even cover the tiny non-async features in C# 5. It’s all about asynchrony. As you’d expect, it’s therefore pretty short (92 pages) and can comfortably be consumed in a single session. Alex’s writing style is informal and easy to read. Of course the topic of the book is anything but simple, so even though you may read the whole book in one go first time, that doesn’t mean you’re likely to fully internalize it straight away. The book is divided into 15 short chapters, so you can revisit specific areas as and when you need to.

Aside

I’ve been writing and speaking about async for about two and a half years now. I’ve tried various ways of explaining it, and I’m pretty sure it’s one of those awkward concepts which really just needs to click eventually. I’ve had some mails from people for whom my explanation was the one to do the trick… and other mails from folks who only "got it" after seeing another perspective. I’d encourage anyone learning about async to read a variety of books, articles, blog posts and so on. I don’t even think it’s a matter of finding the single "right" explanation for you – it’s a matter of letting them all percolate.

The book covers all the topics you’d expect it to:

  • Why asynchrony is important
  • Drawbacks of library-only approaches
  • How async/await behaves in general
  • Threading and synchronization contexts
  • Exceptions
  • Different code contexts (ASP.NET, WinRT, regular UI apps)
  • How async code is compiled

Additionally there are brief sections on unit testing, parallelism and actors. Personally I’d have preferred the actors part to be omitted, with more discussion on the testing side – particularly in terms of how to write deterministic asynchronous tests. However, I know that Alex is a big fan of actors, so I can forgive a little self-indulgence on that front.

There’s one area where I’m not sure I agree with the advice in the book: exceptions. Alex repeatedly gives the advice that you shouldn’t let exceptions go unobserved. I used to go along with that almost without thinking – but now I’m not so sure. There are definitely cases where that definitely is the case, but I’m not as comfortable with the global advice as I used to be. I’ll try to put my thoughts in order on this front and blog about this separately at a later date.

That aside, this is a good, pragmatic book. To be honest, I suspect no book on async is going to go into quite as many details as the PFX team blog, and that’s probably a good thing. But "Async in C# 5.0" is a very good starting point for anyone wanting to get to grips with async, and I in no way begrudge any potential C# in Depth 3rd edition sales I may lose by saying so ;)

The future of “C# in Depth”

I’m getting fairly frequent questions – mostly on Twitter – about whether there’s going to be a third edition of C# in Depth. I figure it’s worth answering it once in some detail rather than repeatedly in 140 characters ;)

I’m currently writing a couple of new chapters covering the new features in C# 5 – primarily async, of course. The current "plan" is that these will be added to the existing 2nd edition to create a 3rd edition. There will be minimal changes to the existing text of the 2nd edition – basically going over the errata and editing a few places which ought to mention C# 5 early. (In particular the changes to how foreach loop variables are captured.)

So there will definitely be new chapters. I’m hoping there’ll be a full new print (and ebook of course) edition, but no contracts have been signed yet. I’m hoping that the new chapters will be provided free electronically to anyone who’s already got the ebook of the 2nd edition – but we’ll see. Oh, and I don’t have any timelines at the moment. Work is more demanding than it was when I was writing the first and second editions, but obviously I’ll try to get the job done at a reasonable pace. (Writing about async in a way which is both accessible and accurate is really tricky, by the way.)

Of course when I’ve finished those, I’ve got two other C# books I want to be writing… when I’m not working on Noda Time, Tekpub screencasts etc…

Update

I had a question on Twitter around the "two other C# books". I don’t want to go into too many details – partly because they’re very likely to change – but my intention is to write "C# from Scratch" and "C# in Style". The first would be for complete beginners; the second wouldn’t go into "how things work" so much as "how to use the language most effectively." (Yes, competition for Effective C#.) One possibility is that both would be donationware, at least in ebook form, ideally with community involvement in terms of public comments.

I’m hoping that both will use the same codebase as an extended example, where "From Scratch" would explain what the code does, and "In Style" would explain why I chose that approach. Oh, and "From Scratch" would use unit testing as a teaching tool wherever possible, attempting to convey the idea that it’s something every self-respecting dev does :)

Eduasync 20: Changes between the VS11 Preview and the Visual Studio 11 Beta

A while I ago I blogged about what had changed under the hood of async between the CTP and the VS11 Preview. Well, now that the VS11 Beta is out, it’s time to do it all again…

Note that the code in this post is in the Eduasync codebase, under a different solution (Eduasync VS11.sln). Many of the old existing projects won’t compile with VS11 beta, but I’d rather leave them as they are for posterity, showing the evolution of the feature.

Stephen Toub has an excellent blog post covering some of this, so while I’ll mention things he’s covered, I won’t go into much detail about them. Let’s start off there though…

(EDIT: Stephen has also mailed me with some corrections, which I’ve edited in – mostly without indication, as the post has been up for less than seven hours, and it’ll make for a better reading experience.)

Awaiter pattern changes

The awaiter pattern is now not just a pattern. The IsCompleted property and GetResult method are still "loose" but OnCompleted is now part of an interface: INotifyCompletion. Awaiters have to implement INotifyCompleted, but may also implement ICriticalNotifyCompletion and its UnsafeOnCompleted method.

The OnCompleted method is just as it was before, and needs to flow the execuction context; the UnsafeOnCompleted method is simpler, as it doesn’t need to flow the execution context. All of this only matters if you’re implementing your own awaiters, of course. (More details in Stephen’s blog post. I’ve found this area somewhat confusing, so please do read his post carefully!)

Skeleton method changes

Just as I have previously, I’m using the (entirely unofficial) term "skeleton method" to mean the very short method created by the compiler with the same signature as an async method: this is the entry point to the async method, effectively, which creates and starts the state machine containing all the real logic.

There are two changes I’ve noticed in the skeleton method. Firstly, for some reason the state machine state numbers have changed. Whereas previously a state number of 0 meant "initial or running", positive values meant "between calls, or navigating back to the await expression" and -1 meant "finished", now -1 means "initial or running", non-negative means "between calls, or navigating back to await expression" and -2 means "finished". It’s not clear why this change has been made, given that it requires an extra assignment at the start of every skeleton method (to set the state to -1).

More importantly, the skeleton method no longer calls MoveNext directly on the state machine that it’s built. Instead, it calls Start<TStateMachine> on the AsyncTaskMethodBuilder<T> (or whichever method builder it’s using). It passes the state machine by reference (presumably for efficiency), and TStateMachine is constrained to implement the now-public-and-in-mscorlib IAsyncStateMachine interface. I’ll come back to the relationship between the state machine and the builder later on.

Task caching

(Code is in project 30: TaskCaching)

It’s possible for an async method to complete entirely synchronously. In this situation, the result is known before the method returns, and the task returned by the method is already in the RanToCompletionState. If two tasks have already run to completion with the same value, they can be (apparently) regarded as equivalent… so the beta now caches a task in this situation, for some types and values. (Apparently the preview cached too, but I hadn’t noticed, and the beta caches more.) According to my experiments and some comments:

  • For int, tasks with values -1 to 8 inclusive are cached
  • For bool, both values (true and false)
  • For char, byte, sbyte, short, ushort, uint, long, ulong, IntPtr and UIntPtr tasks with value 0 (or ”) are cached
  • For reference types, null is cached
  • For other types, no tasks are cached

EDIT: After they’ve completed, tasks are normally immutable except for disposal – the cached tasks are tweaked slightly to make disposal a no-op.

State machine interface changes

In the VS11 preview release, each state machine implemented an interface, but that interface was internal to the generated assembly, and contained a single method (SetMoveNextDelegate). It’s now a public interface with two methods:

Personally I’m not keen on the naming of "MoveNext" – I can’t help but feel that if we didn’t have the "naming baggage" of IEnumerator and the fact that at least early on, the code generator was very similar to that used for iterator blocks, we’d have something different. (It is moving to the next state of the state machine, but it still doesn’t quite feel right.) I’d favour something like "ContinueExecution". However, it doesn’t matter – it obviously does what you’d expect, and you’re not going to be calling this yourself.

SetStateMachine is a stranger beast. The documentation states:

Configures the state machine with a heap-allocated replica.

… which says almost nothing, really. The implementation is always simple, just like SetMoveNextDelegate was, although this time it delegates to the builder for the real work (a common theme, as we’ll see):

void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine param0)
{
    this.<>t__builder.SetStateMachine(param0);
}

Now AsyncTaskMethodBuilder.SetStateMachine is also documented pretty sparsely:

Associates the builder with the specified state machine.

Again, no real help. However, we’ll see that it’s the builder which is responsible for calling OnContinue now, and as it can call MoveNext on an IStateMachine, it makes sense to tell it which state machine it’s associated with… but can’t it do that directly?

Well, not quite. The problem (as I understand it) is around when boxing occurs. We initially create the state machine on the stack, and it contains the builder. (Both are structs.) That’s fine until we need a continuation, but then we’ve got to be able to get back to the current state later, after the current stack frame has been blown away. So we need to box the state machine. That will create a copy of the current builder (within the box). We need the builder within the boxed state machine to contain a reference to the same box. So the order has to be:

  • Box the state machine
  • Tell the state machine about the boxed reference
  • The state machine tells its nested builder about the boxed reference

Back when the state machine was in charge of the boxing, this went via the delegate: the act of creating the box was implicit when creating the delegate, and then casting the delegate target to the interface type allowed a reference to the newly-created delegate to be set within the copy. This is similar, but using the builder instead. It’s hard to follow, but of course it’s not going to matter.

State machine field changes

There are various kinds of fields in the state machine:

  • Those corresponding with local variables and parameters in the async method
  • The state
  • The field(s) associated with awaiters
  • (In the preview/beta) The field associated with the "current execution stack" at the point of an await expression
  • (In the CTP) An unused "disposing" field

Of these, I believe only the awaiters have actually changed, but before we talk about that, let’s revisit local variables.

Local variable hoisting

I’ve just noticed that the local variables are only hoisted to fields when its scope contains an await expressions, but in that case all local variables of that scope are hoisted, whether or not they’re used "across" awaits. It would be possible to hoist only those which need to be maintained between executions, but then you wouldn’t be able to see the others when debugging, which would be somewhat confusing. Likewise local variables of the same type which are never propagated across the same await could be aliased. For example, consider this async method:

static async Task<int> M(Random rng)
{
    int x = rng.Next(1000);
    int y = x + rng.Next(1000);
        
    await Task.Yield();
        
    int z = y + rng.Next(1000);
        
    await Task.Yield();
    return z;
}

If the compiler could be confident you didn’t need to debug through this code, it could make do with one field of type "Random" and one field of type "int" – x can be a completely local variable in MoveNext (it’s not used between two awaits) and y and z can be aliased (we never need the value of y after we’ve first written to z).

Local variable aliasing probably isn’t particularly useful for "normal" methods as the JIT may be able to do it (so long as you don’t have a debugger attached, potentially) but in this case we expect the state machine to be boxed at some point, so potentially does make a difference (while the stack is typically reasonably small, you could have a lot of outstanding async methods in some scenarios). Maybe in a future release, the C# compiler could have an aggressive optimization mode around this, to be turned on explicitly. (I don’t think it should be  a particularly high priority, mind you.)

Awaiter fields

(Code is in project 31, AwaiterFields.)

Awaiter fields have changed a bit over the course of async’s history.

In the CTPs (all of them, I believe) each await expression had its own awaiter field in the state machine, with a type corresponding to the declared awaiter type from the awaitable. (Remember that the awaitable is the thing you await, such as a task, and the awaiter is what you get back from calling GetAwaiter on the awaitable).

In the VS11 Preview, there was always a single awaiter field of type object. From what I saw, it was usually populated with a single-element array containing an awaiter. For value type awaiters (i.e. where the awaiter is a struct) this is somewhat similar to boxing, but maintaining strong typing, so calls to IsCompleted etc can still be made. It’s possible that reference type awaiters were stored without the array creation, as it would serve no purpose. (I don’t have any machines with just the preview installed to verify this.)

In the Beta, we have a mixture. If there are any reference type awaiters, they all end up being stored in a single field of type object, which is then cast back to the actual type when required. (Don’t forget that only one awaiter can be "active" at a time, which makes this possible.) This includes awaiters of an interface type – it’s only the compile-time type declared as the return type of the GetAwaiter method of the awaitable which is important.

If any of the awaiter types are value types, each of these types gets its own field. So there might be a TaskAwaiter<int> field and a TaskAwaiter<string> field, for example. However, there can still be "sharing" going on: if there are multiple await expressions all of the same value type awaiter, they will all share a single field. (This all feels a little like the JITting of generics, but it’s somewhat coincidental.)

MoveNext method changes

(Code is in project 32, BetaStateMachine)

As I’ve mentioned earlier, the builder is now responsible for a lot more of the work than it was in earlier versions. The majority of the code remains the same as far as I can tell, in terms of handling branching, evaluating expressions with multiple await expressions and so on.  The code in the source repository shows what the complete state machine looks like, but for the sake of clarity, I’ll just focus on a single snippet. If we have an await expression like this:

await x;

then the state machine code in the VS11 Preview would look something like this:

localTaskAwaiter = x.GetAwaiter();
if (localTaskAwaiter.IsCompleted)
{
    goto AwaitCompleted;
}
this.state = 1;
TaskAwaiter[] awaiterArray = { localTaskAwaiter };
this.awaiter = awaiterArray;
Action continuation = this.MoveNextDelegate;
if (continuation == null)
{
    Task<int> task = this.builder.Task;
    continuation = MoveNext;
    ((IStateMachine) continuation.Target).SetMoveNextDelegate(continuation);
}
awaiterArray[0].OnCompleted(continuation);

return;

(That’s just setting up the await, of course – there’s then the bit where the result is fetched, but that’s less interesting. There’s also the matter of doFinallyBodies.)

In the VS11 Beta, it’s something this instead (for an awaiter type "AwaiterType" which implements ICriticalNotifyCompletion, in a state machine of type ThisStateMachine).

localAwaiter = x.GetAwaiter();
if (!localAwaiter.IsCompleted)
{
    state = 0;
    awaiterField = localAwaiter;
    builder.AwaitUnsafeOnCompleted<AwaiterType, ThisStateMachine>(ref localAwaiter, ref this);
    doFinallyBodies = false;
    return;
}

If the awaiter type only implements INotifyCompletion, it calls AwaitOnCompleted instead. Note how the calls are generic (but both type variables are constrained to implement appropriate interfaces) which avoids boxing.

The call to the builder will call back to the state machine’s SetStateMachine method if this is the first awaiter that hasn’t already completed within this execution of the async method. So that handles the section which checked for the continuation being null in the first block of code. Most of the rest of the change is explained by the difference in awaiter types, and obviously AwaitOnCompleted/AwaitUnsafeOnCompleted also ends up calling into OnCompleted on the awaiter itself.

Mutable value type awaiters

(Code is in project 32, MutableAwaiters)

One subtle difference which really shouldn’t hurt people but is fun to explore is what happens if you have an awaiter which is a mutable value type. Due to the way awaiters were carefully handled pre-beta, mutations which were conducted as part of OnCompleted would still be visible in GetResult. That’s not the case in the beta (as Stephen mentions in his blog post). Mind you, it doesn’t mean that all mutations will be ignored… just ones in OnCompleted. A mutation from IsCompleted is still visible, as shown here:

public struct MutableAwaiter : INotifyCompletion
{
     private string message;

     public MutableAwaiter(string message)
     {
         this.message = message;
     }

     public bool IsCompleted
     {
         get
         { 
             message = "Set in IsCompleted";
             return false;
         }
     }

     public void OnCompleted(Action action)
     {
         message = "Set in OnCompleted";
         // Ick! Completes inline. Never mind, it’s only a demo…
         action();
     }

     public string GetResult()
     {
         return message;
     }
}

What would you expect to be returned from this awaiter? You can verify that all three members are called… but "Set in IsCompleted" is returned. That’s because IsCompleted is called before the awaiter value is copied into a field within the state machine. Even though the state machine passes the awaiter by reference, it’s passing the local variable, which is of course a separate variable from the field.

I’m absolutely not suggesting that you should rely on any of this behaviour. If you really need to be able to mutate your awaiter, make it a reference type.

Conclusion

The main changes in the Beta are around the interactions between the AsyncTaskMethodBuilder (et al) and the state machine, including the new interfaces for awaiters. There’s been quite a bit of optimization, although I still see room for a bit more:

  • When there’s only a single kind of reference type awaiter, the field for storing it could be of that type rather than of type object, removing the need for an execution-time cast
  • The "stack" variable could be removed in some cases, and made into a specific type in many others
  • With appropriate optimization flags, local variables which aren’t used await expressions could stay local to the state machine instead of being hoisted, and hoisted variables could be aliased in some cases.

One thing which concerns me slightly is how the C# language specification is going to change – the addition of the new interfaces is definitely going to mean more complexity from this previously "tidy" feature. I’m sure it’s worth it for the sake of efficiency and the like, but part of me sighs at every added tweak.

So, is this now close to the finished version of async? Only time will tell. I haven’t checked whether dynamic awaitables have finally been introduced… if they have, I’ll put that in the next post.

Eduasync part 19: ordering by completion, ahead of time…

Today’s post involves the MagicOrdering project in source control (project 28).

When I wrote part 16 of Eduasync, showing composition in the form of majority voting, one reader mailed me a really interesting suggestion. We don’t really need to wait for any of the tasks to complete on each iteration of the loop – we only need to wait for the next task to complete. Now that sounds impossible – sure, it’s great if we know the completion order of the tasks, but half the point of asynchrony is that many things can be happening at once, and we don’t know when they’ll complete. However, it’s not as silly as it sounds.

If you give me a collection of tasks, I’ll give you back another collection of tasks which will return the same results – but I’ll order them so that the first returned task will have the same result as whichever of your original tasks completes first, and the second returned task will have the same result as whichever of your original tasks completes second, and so on. They won’t be the same tasks as you gave me, reordered – but they’ll be tasks with the same results. I’ll propagate cancellation, exceptions and so on.

It still sounds impossible… until you realize that I don’t have to associate one of my returned tasks with one of your original tasks until it has completed. Before anything has completed, all the tasks look the same. The trick is that as soon as I see one of your tasks complete, I can fetch the result and propagate it to the first of the tasks I’ve returned to you, using TaskCompletionSource<T>. When the second of your tasks completes, I propagate the result to the second of the returned tasks, etc. This is all quite easy using Task<T>.ContinueWith – barring a few caveats I’ll mention later on.

Once we’ve built a method to do this, we can then really easily build a method which is the async equivalent of Parallel.ForEach (and indeed you could write multiple methods for the various overloads). This will execute a specific action on each task in turn, as it completes… it’s like repeatedly calling Task.WhenAny, but we only actually need to wait for one task at a time, because we know that the first task in our "completion ordered" collection will be the first one to complete (duh).

Show me the code!

Enough description – let’s look at how we’ll demonstrate both methods, and then how we implement them.

private static async Task PrintDelayedRandomTasksAsync()
{
    Random rng = new Random();
    var values = Enumerable.Range(0, 10).Select(_ => rng.Next(3000)).ToList();
    Console.WriteLine("Initial order: {0}", string.Join(" ", values));

    var tasks = values.Select(DelayAsync);

    var ordered = OrderByCompletion(tasks);

    Console.WriteLine("In order of completion:");
    await ForEach(ordered, Console.WriteLine);
}

/// <summary>
/// Returns a task which delays (asynchronously) by the given number of milliseconds,
/// then return that same number back.
/// </summary>
private static async Task<int> DelayAsync(int delayMillis)
{
    await TaskEx.Delay(delayMillis);
    return delayMillis;
}

The idea is that we’re going to create 10 tasks which each just wait for some random period of time, and return the same time period back. We’ll create them in any old order – but obviously they should complete in (at least roughly) the same order as the returned numbers.

Once we’ve created the collection of tasks, we’ll call OrderByCompletion to create a second collection of tasks, returning the same results but this time in completion order – so ordered.ElementAt(0) will be the first task to complete, for example.

Finally, we call ForEach and pass in the ordered task collection, along with Console.WriteLine as the action to take with each value. We await the resulting Task to mimic blocking until the foreach loop has finished. Note that we could make this a non-async method and just return the task returned by ForEach, given that that’s our only await expression and it’s right at the end of the method. This would be marginally faster, too – there’s no need to build an extra state machine. See Stephen Toub’s article about async performance for more information.

ForEach

I’d like to get ForEach out of the way first, as it’s so simple: it’s literally just iterating over the tasks, awaiting them and propagating the result to the action. We get the "return a task which will wait until we’ve finished" for free by virtue of making it an async method.

/// <summary>
/// Executes the given action on each of the tasks in turn, in the order of
/// the sequence. The action is passed the result of each task.
/// </summary>
private static async Task ForEach<T>(IEnumerable<Task<T>> tasks, Action<T> action)
{
    foreach (var task in tasks)
    {
        T value = await task;
        action(value);
    }
}

Simple, right? Let’s get onto the meat…

OrderByCompletion

This is the tricky bit, and I’ve actually split it into two methods to make it slightly easier to comprehend. The PropagateResult method feels like it could be useful in other composition methods, too.

The basic plan is:

  • Copy the input tasks to a list: we need to work out how many there are and iterate over them, so let’s make sure we only iterate once
  • Create a collection of TaskCompletionSource<T> references, one for each input task. Note that we’re not associating any particular input task with any particular completion source – we just need the same number of them
  • Declare an integer to keep track of "the next available completion source"
  • Attach a continuation to each input task which will be increment the counter we’ve just declared, and propagate the just-completed task’s status
  • Return a view onto the collection of TaskCompletionSource<T> values, projecting each one to its Task property

Once you’re happy with the idea, the implementation isn’t too surprising (although it is quite long):

/// <summary>
/// Returns a sequence of tasks which will be observed to complete with the same set
/// of results as the given input tasks, but in the order in which the original tasks complete.
/// </summary>
private static IEnumerable<Task<T>> OrderByCompletion<T>(IEnumerable<Task<T>> inputTasks)
{
    // Copy the input so we know it’ll be stable, and we don’t evaluate it twice
    var inputTaskList = inputTasks.ToList();

    // Could use Enumerable.Range here, if we wanted…
    var completionSourceList = new List<TaskCompletionSource<T>>(inputTaskList.Count);
    for (int i = 0; i < inputTaskList.Count; i++)
    {
        completionSourceList.Add(new TaskCompletionSource<T>());
    }

    // At any one time, this is "the index of the box we’ve just filled".
    // It would be nice to make it nextIndex and start with 0, but Interlocked.Increment
    // returns the incremented value…
    int prevIndex = -1;

    // We don’t have to create this outside the loop, but it makes it clearer
    // that the continuation is the same for all tasks.
    Action<Task<T>> continuation = completedTask =>
    {
        int index = Interlocked.Increment(ref prevIndex);
        var source = completionSourceList[index];
        PropagateResult(completedTask, source);
    };

    foreach (var inputTask in inputTaskList)
    {
        // TODO: Work out whether TaskScheduler.Default is really the right one to use.
        inputTask.ContinueWith(continuation,
                               CancellationToken.None,
                               TaskContinuationOptions.ExecuteSynchronously,
                               TaskScheduler.Default);
    }

    return completionSourceList.Select(source => source.Task);
}

/// <summary>
/// Propagates the status of the given task (which must be completed) to a task completion source
/// (which should not be).
/// </summary>
private static void PropagateResult<T>(Task<T> completedTask,
    TaskCompletionSource<T> completionSource)
{
    switch (completedTask.Status)
    {
        case TaskStatus.Canceled:
            completionSource.TrySetCanceled();
            break;
        case TaskStatus.Faulted:
            completionSource.TrySetException(completedTask.Exception.InnerExceptions);
            break;
        case TaskStatus.RanToCompletion:
            completionSource.TrySetResult(completedTask.Result);
            break;
        default:
            // TODO: Work out whether this is really appropriate. Could set
            // an exception in the completion source, of course…
            throw new ArgumentException("Task was not completed");
    }
}

You’ll notice there are a couple of TODO comments there. The exception in PropagateResult really shouldn’t happen – the continuation shouldn’t be called when the task hasn’t completed. I still need to think carefully about how tasks should propagate exceptions though.

The arguments to ContinueWith are more tricky: working through my TimeMachine class and some unit tests with Bill Wagner last week showed just how little I know about how SynchronizationContext, the task awaiters, task schedulers, and TaskContinuationOptions.ExecuteSynchronously all interact. I would definitely need to look into that more deeply before TimeMachine was really ready for heavy use… which means you should probably be looking at the TPL in more depth too.

Conclusion

Sure enough, when you run the code, the results appear in order, as the tasks complete. Here’s one sample of the output:

Initial order: 335 468 1842 1991 2512 2603 270 2854 1972 1327
In order of completion:
270
335
468
1327
1842
1972
1991
2512
2603
2854

TODOs aside, the code in this post is remarkable (which I can say with modesty, as I’ve only refactored it from the code sent to me by another reader and Stephen Toub). It makes me smile every time I think about the seemingly-impossible job it accomplishes. I suspect this approach could be useful in any number of composition blocks – it’s definitely one to remember.

Eduasync part 18: Changes between the Async CTP and the Visual Studio 11 Preview

In preparation for CodeMash, I’ve been writing some more async code and decompiling it with Reflector. This time I’m using the Visual Studio 11 Developer Preview – the version which installs alongside Visual Studio 2010 under Windows 7. (Don’t ask me about any other features of Visual Studio 11 – I haven’t explored it thoroughly; I’ve really only used it for the C# 5 bits.)

There have been quite a few changes since the CTP – they’re not visible changes in terms of code that you’d normally write, but the state machine generated by the C# compiler is reasonably different. In this post I’ll describe the differences, as best I understand them. There are still a couple of things I don’t understand (which I’ll highlight within the post) but overall, I think I’ve got a pretty good handle on why the changes have been made.

I’m going to assume you already have a reasonable grasp of the basic idea of async and how it works – the way that the compiler generates a state machine to represent an async method or anonymous function, with originally-local variables being promoted to instance variables within the state machine, etc. If the last sentence was a complete mystery to you, see Eduasync part 7 for more information. I don’t expect you to remember the exact details of what was in the previous CTP though :)

Removal of iterator block leftovers

In the CTP, the code for async methods was based on the iterator block implementation. I suspect that’s still the case, but possibly sharing just a little less code. There used to be a few methods and fields which weren’t used in async methods, but now they’re gone:

  • There’s no now constructor, so no need for the "skeleton" method which replaces the real async method to pass in 0 as the initial state.
  • There’s no Dispose method.
  • There’s no disposing field.

It’s nice to see these gone, but it’s not terribly interesting. Now on to the bigger changes…

Large structural changes

There’s a set of related structural changes which don’t make sense individually. I’ll describe them first, then look at how it all hangs together, and my guess as to the reasoning behind.

The state machine is now a struct

The declaration of the nested type for the state machine is now something like this:

[CompilerGenerated]
private struct StateMachine : IStateMachine
{
    // Fields common to all async state machines
    // (with caveats)
    private int state;
    private object awaiter;
    public AsyncTaskMethodBuilder<int> builder;
    public Action moveNextDelegate;
    private object stack;

    // Hoisted local variables

    // Methods
    [DebuggerHidden]
    public void SetMoveNextDelegate(Action action) { … }
    public void MoveNext() { … }
}

The caveats around the common field are in terms of the return type of the async method (which determines the type of builder used) and whether or not there are any awaits (if there are no awaits, the stack and awaiter fields aren’t generated).

Note that throughout this blog post I’ve changed the names of fields and types – in reality they’re all "unspeakable" names including angle-brackets, just like all compiler-generated names.

There’s a new assembly-wide interface

As you can see from the code above, the state machine implements an interface (actually called <>t__IStateMachine). One of these is created in the global namespace in each assembly that contains at least one async method or anonymous function, and it looks like this:

[CompilerGenerated]
internal interface IStateMachine
{
    void SetMoveNextDelegate(Action action);
}

The implementation for this method is always the same, and it’s trivial:

[DebuggerHidden]
public void SetMoveNextDelegate(Action action)
{
    this.moveNextDelegate = action;
}

Simplified skeleton method

The method which starts the state machine, which I’ve been calling the "skeleton" method everywhere, is now a bit simpler than it was. Something like this:

public static Task<int> FooAsync()
{
    StateMachine machine = new StateMachine();
    machine.builder = AsyncVoidMethodBuilder.Create();
    machine.MoveNext();
    return machine.builder.Task;
}

In fact if you decompile the IL, you’ll see that it doesn’t explicitly initialize the variable to start with – it just declares it, sets the builder field and then calls MoveNext(). That’s not valid C# (as all the struct’s fields aren’t initialized), but it is valid IL. It’s equivalent to the code above though. Note how there’s nothing to set the continuation – where previously the moveNextDelegate field would be populated within the skeleton method.

Just-in-time delegate creation

Now that the skeleton method doesn’t create the delegate representing the continuation, it can be done when it’s first required – which is when we first encounter an await expression for an awaitable which hasn’t already completed. (If the awaitable has completed before we await it, the generated code skips the continuation and just uses the results immediately and synchronously).

The code for that delegate creation is slightly trickier than you might expect, however. It looks something like this:

Action action = this.moveNextDelegate;
if (action == null)
{
    Task<int> task = this.builder.Task;
    action = new Action(this.MoveNext);
    ((IStateMachine) action.Target).SetMoveNextDelegate(action);
}

There are two oddities here, one of which I mostly understand and one of which I don’t understand at all.

I really don’t understand the "task" variable here. Why do we need to exercise the AsyncTaskMethodBuilder.Task property? We don’t use the result anywhere… does forcing this flush some memory buffer? I have no clue on this one. (See the update at the bottom of the post…)

The part about setting the delegate via the interface makes more sense, but it’s subtle. You might expect code like this:

// Looks sensible, but is actually slightly broken
Action action = this.moveNextDelegate;
if (action == null)
{
    action = new Action(this.MoveNext);
    this.moveNextDelegate = action;
}

That would sort of work – but we’d end up needing to recreate the delegate each time we encountered an appropriate await expression. Although the above code saves the value to the field, it saves it within the current value of the state machine… after we’ve boxed that value as the target of the delegate. The value we want to mutate is the one within the box – which is precisely why there’s an interface, and why the code casts to it.

We can’t even just unbox and then set the field afterwards – at least in C# – because the unbox operation is always followed by a copy operation in normal C#. I believe it would be possible for the C# compiler to generate IL which unboxed action.Target without the copy, and then set the field in that. It’s not clear to me why the team went with the interface approach instead… I would expect that to be slower (as it requires dynamic dispatch) but I could easily be wrong. Of course, it would also make it impossible to decompile the IL to C#, which would make my talks harder, but don’t expect the C# team to bend the compiler implementation for my benefit ;)

(As an aside to all of this, I’ve gone back and forth on whether the "slightly broken" implementation would recreate the delegate on every appropriate await, or only two. I think it would end up being on every occurrence, as even though on the second occurrence we’d be operating within the context of the first boxed instance, the new delegate would have a reference to a new boxed copy each time. It does my head in a little bit, trying to think about this… more evidence that mutable structs are evil and hard to reason about. It’s not the wrong decision in this case, hidden far from the gaze of normal developers, but it’s a pain to reason about.)

Single awaiter variable

In the CTP, each await expression generated a separate field within the state machine, and that field was always of the exact awaiter type. In the VS11 Developer Preview, there’s always exactly one awaiter field (assuming there’s at least one await expression) and it’s always of type object. It’s used like this:

  // Single local variable used by both continuation and first-time paths
  TaskAwaiter<int> localAwaiter;

  …

  if (conditions-for-first-time-execution)
  {
      // Code before await

      localAwaiter = task.GetAwaiter();
      if (localAwaiter.IsCompleted)
      {
          goto Await1Completed;
      }
      this.state = 1;
      TaskAwaiter<int>[] awaiterArray = { localAwaiter };
      this.awaiter = awaiterArray;
      // Lazy delegate creation goes here
      awaiterArray[0].OnCompleted(action);
      return;
  }
  // Continuation would get into here
  localAwaiter = ((TaskAwaiter<int>[]) this.awaiter)[0];
  this.awaiter = null;
  this.state = 0;
Await1Completed:
  int result = localAwaiter.GetResult(); 
  localAwaiter = default(TaskAwaiter<int>);

I realize there’s a lot of code here, but it does make some sense:

  • The value of the awaiter field is always either null, or a reference to a single-element array of the awaiter type for one of the await expressions.
  • A single localAwaiter variable is shared between the two code paths, populated either from the awaitable (on the initial code path) or by copying the value from the array (in the second code path).
  • The field is always set to null and the local variable is set to its default value after use, presumably for the sake of garbage collection

It’s basically a nice way of using the fact that we’ll only ever need one awaiter at a time. It’s not clear to me why an array is used instead of either using a reference to the awaiter for class-based awaiters, or simply by boxing for struct-based awaiters. The latter would need the same "unbox without copy" approach discussed in the previous section – so if there’s some reason why that’s actually infeasible, it would explain the use of an array here. We can’t use the interface trick in this case, as the compiler isn’t in control of the awaiter type (so can’t make it implement an interface).

Expression stack preservation

This one is actually a fix to a bug in the async CTP, which I’ve written about before. We’re used to the stack containing our local variables (in the absence of iterator blocks, captured variables etc, and modulo the stack being an implementation detail) but it’s also used for intermediate results within a single statement. For example, consider this block of code:

int x = 10;
int y = 5;
int z = x + 50 * y;

That last line is effectively:

  • Load the value of x onto the stack
  • Load the value 50 onto the stack
  • Load the value of y onto the stack
  • Multiply the top two stack values (50 and y) leaving the result on the stack
  • Add the top two stack values (x and the previously-computed result) leaving the result on the stack
  • Store the top stack value into z

Now suppose we want to turn y into a Task<int>:

int x = 10;
Task<int> y = Task.FromResult(5);
int z = x + 50 * await y;

Our state machine needs to make sure that it will preserve the same behaviour as the synchronous version, so it needs the same sort of stack. In the new-style state machine, all of that stack is saved in the "stack" field. It’s only one field, but may need to represent multiple different types within the code at various different await expressions – in the code above, for example, it represents two int values. As far as I can discover, the C# compiler generates code which uses the actual type of the value it needs, if it only requires a single value. If it needs multiple values, it uses an appropriate Tuple type, nesting tuples if it goes beyond the number of type parameters supported by the Tuple<…> family of types. So in our case above, we end up with code a bit like this:

// Local variable used in both paths
Tuple<int, int> tuple;

// Code before the await
if (conditions-for-first-time-execution) 

    …
    tuple = new Tuple<int, int>(this.x, 50);
    this.stack = tuple;
    …
}

// Continuation would get into here
tuple = (Tuple<int, int>) this.stack;
// IL copies the values from the tuple onto the stack at this point
this.stack = null;

// Both the fast and slow code paths get here eventually
this.z = stack0 + stack1 * awaiter.GetResult()

I say it’s a bit like this, because it’s hard to represent the IL exactly in C# in this case. The tuple is only created if it’s needed, i.e. not in the already-completed fast path. In that case, the values are loaded onto the stack but not then put into the tuple – execution skips straight to the code which uses the values already on the stack.

When the awaitable isn’t complete immediately, then a Tuple<int, int> is created, stored in the "stack" field, and the continuation is handed to the awaiter. On continuation, the tuple is loaded back from the "stack" field (and cast accordingly), the values are loaded onto the stack – and then we’re back into the common code path of fetching the value and performing the add and multiply operations.

Conclusion

As far as I’m aware, those are the most noticeable changes in the generated code. There may well still be a load more changes in Task<T> and the TPL in general – I wouldn’t be at all surprised – but that’s harder to investigate.

I’m sure all of this has been done in the name of performance (and correctness, in the case of stack preservation). The state machine is now much smaller in terms of the number of fields it requires, and objects are created locally as far as possible (including the state machine itself only requiring heap allocation if there’s ever a "slow" awaitable). I suspect there’s still some room for optimization, however:

  • Both the awaiter and the delegate use careful boxing and either arrays or a mutating interface to allow the boxed value to be changed. I suspect that using unbox with the concrete type, but without copying the value, would be more efficient. I may attempt to work this theory up into a test at some point.
  • If there’s only one awaiter type (usually TaskAwaiter<T> for some T), that type could be used instead of object, potentially reducing heap optimization
  • I’ve no idea why the builder.Task property is explicitly fetched and then the results discarded
  • If there’s only one await expression, the "stack" field can be strongly typed, which would also avoid boxing if only a single value needs to be within that stack
  • The stack field could be removed entirely when it’s not needed for intermediate stack value preservation. (I believe that would be the case reasonably often.)

The use of mutable value types is really fascinating (for me, at least) – I’m sure most people on the C# team would still say they’re evil, but when they’re used in a carefully controlled environment where real developers don’t have to reason about their behaviour, they can be useful.

Next time, I’ll hopefully get back to the idea I promised to write up before, about ordering a collection of tasks in completion order… before they’ve completed. (Well, sort of.) Should be fun…

Update (January 16th 2012)

Stephen Toub got in touch with me after I posted the original version of this blog entry, to explain the use of the Task property. Apparently the idea is that at this point, we know we’re going to return out of the state machine, so the skeleton method is going to access the Task property anyway. However, as we haven’t scheduled the continuation yet we also know that nothing will be accessing the Task property on a different thread. If we access it now for the first time, we can lazily allocate the task in the same thread that created the AsyncMethodBuilder, with no risk of contention. If we can force there to be a task ready and waiting for whatever accesses it later, we don’t need any synchronization in that area.

So why might we want to allocate the task lazily in the first place? Well, don’t forget that we might never have to wait for an await (as it were). We might just have an async method which takes the fast path everywhere. If that’s the case, then for certain cases (e.g. a non-generic, successfully completed task, or a Task<bool> which again has completed successfully) we can reuse the same instance repeatedly. Apparently this laziness isn’t yet part of the VS11 Developer Preview, but the reason for the property access is in preparation for this.

Another case of micro-optimization – which is fair enough when it’s at a system level :)

Eduasync part 17: unit testing

In the last post I showed a method to implement "majority voting" for tasks, allowing a result to become available as soon as possible. At the end, I mentioned that I was reasonably confident that it worked because of the unit tests… but I didn’t show the tests themselves. I felt they deserved their own post, as there’s a bigger point here: it’s possible to unit test async code. At least sometimes.

Testing code involving asynchrony is generally a pain. Introducing the exact order of events that you want is awkward, as is managing the threading within tests. With a few benefits with async methods:

  • We know that the async method itself will only execute in a single thread at a time
  • We can control the thread in which the async method will execute, if it doesn’t configure its awaits explicitly
  • Assuming the async method returns Task or Task<T>, we can check whether or not it’s finished
  • Between Task<T> and TaskCompletionSource<T>, we have a way of injecting tasks that we understand

Now in our sample method we have the benefit of passing in the tasks that will be awaited – but assuming you’re using some reasonably testable API to fetch any awaitables within your async method, you should be okay. (Admittedly in the current .NET framework that excludes rather a lot of classes… but the synchronous versions of those calls are also generally hard to test too.)

The plan

For our majority tests, we want to be able to see what happens in various scenarios, with tasks completing at different times and in different ways. Looking at the test cases I’ve implemented I have the following tests:

  • NullSequenceOfTasks
  • EmptySequenceOfTasks
  • NullReferencesWithinSequence
  • SimpleSuccess
  • InputOrderIsIrrelevant
  • MajorityWithSomeDisagreement
  • MajorityWithFailureTask
  • EarlyFailure
  • NoMajority

I’m not going to claim this is a comprehensive set of possible tests – it’s a proof of concept more than anything else. Let’s take one test as an example: MajorityWithFailureTask. The aim of this is to pass three tasks (of type Task<string>) into the method. One will give a result of "x", the second will fail with an exception, and the third will also give a result of "x". The events will occur in that order, and only when all three results are in should the returned task complete, at which point it will also have a success result of "x".

So, the tricky bit (compared with normal testing) is introducing the timing. We want to make it appear as if tasks are completing in a particular order, at predetermined times, so we can check the state of the result between events.

Introducing the TimeMachine class

Okay, so it’s a silly name. But the basic idea is to have something to control the logical flow of time through our test. We’re going to ask the TimeMachine to provide us with tasks which will act in a particular way at a given time, and then when we’ve started our async method we can then ask it to move time forward, letting the tasks complete as they go. It’s probably best to look at the code for MajorityWithFailureTask first, and then see what the implementation of TimeMachine looks like. Here’s the test:

[Test]
public void MajorityWithFailureTask()
{
    var timeMachine = new TimeMachine();
    // Second task gives a different result
    var task1 = timeMachine.AddSuccessTask(1, "x");
    var task2 = timeMachine.AddFaultingTask<string>(2, new Exception("Bang!"));
    var task3 = timeMachine.AddSuccessTask(3, "x");

    var resultTask = MoreTaskEx.WhenMajority(task1, task2, task3);
    Assert.IsFalse(resultTask.IsCompleted);

    // Only one result so far – no consensus
    timeMachine.AdvanceTo(1);
    Assert.IsFalse(resultTask.IsCompleted);

    // Second result is a failure
    timeMachine.AdvanceTo(2);
    Assert.IsFalse(resultTask.IsCompleted);

    // Third result gives majority verdict
    timeMachine.AdvanceTo(3);
    Assert.AreEqual(TaskStatus.RanToCompletion, resultTask.Status);
    Assert.AreEqual("x", resultTask.Result);
}

As you can see, there are two types of method:

  • AddSuccessTask / AddFaultingTask / AddCancelTask (not used here) – these all take the time at which they’re going to complete as their first parameter, and the method name describes the state they’ll reach on completion. The methods return the task created by the time machine, ready to pass into the production code we’re testing.
  • AdvanceTo / AdvanceBy (not used here) – make the time machine "advance time", completing pre-programmed tasks as it goes. When those tasks complete, any continuations attached to them also execute, which is how the whole thing hangs together.

Now forcing tasks to complete is actually pretty simple, if you build them out of TaskCompletionSource<T> to start with. So all we need to do is keep our tasks in "time" order (which I achieve with SortedList), and then when we’re asked to advance time we move through the list and take the appropriate action for all the tasks which weren’t completed before, but are now. I represent the "appropriate action" as a simple Action, which is built with a lambda expression from each of the Add methods. It’s really simple:

public class TimeMachine
{
    private int currentTime = 0;
    private readonly SortedList<int, Action> actions = new SortedList<int, Action>();

    public int CurrentTime { get { return currentTime; } }

    public void AdvanceBy(int time)
    {
        AdvanceTo(currentTime + time);
    }

    public void AdvanceTo(int time)
    {
        // Okay, not terribly efficient, but it’s simple.
        foreach (var entry in actions)
        {
            if (entry.Key > currentTime && entry.Key <= time)
            {
                entry.Value();
            }
        }
        currentTime = time;
    }

    public Task<T> AddSuccessTask<T>(int time, T result)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetResult(result);
        return tcs.Task;
    }

    public Task<T> AddCancelTask<T>(int time)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetCanceled();
        return tcs.Task;
    }

    public Task<T> AddFaultingTask<T>(int time, Exception e)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetException(e);
        return tcs.Task;
    }
}

Okay, that’s a fair amount of code for a blog posts (and yes, it could do with some doc comments etc!) but considering that it makes life testable, it’s pretty simple.

So, is that it?

It works on my machine… with my test runner… in simple cases…

When I first ran the tests using TimeMachine, they worked almost immediately. This didn’t surprise me nearly as much as it should have done. You see, when the tests execute, they use async/await in the normal way – which means the continuations are scheduled on "the current task scheduler". I have no idea what the current task scheduler is in unit tests. Or rather, it feels like something which is implementation specific. It could easily have worked when running the tests from ReSharper, but not from NCrunch, or not from the command line NUnit test runner.

As it happens, I believe all of these run tests on thread pool threads with no task scheduler allocated, which means that the continuation is attached to the task to complete "in-line" – so when the TimeMachine sets the result on a TaskCompletionSource, the continuations execute before that call returns. That means everything happens on one thread, with no ambiguity or flakiness – yay!

However, there are two problems:

  • The words "I believe" aren’t exactly confidence-inspiring when it comes to testing that your software works correctly.
  • Our majority voting code only ever sees one completed task at a time – we’re not testing the situation where several tasks complete so quickly together that the continuation doesn’t get chance to run before they’ve all finished.

Both of these are solvable with a custom TaskScheduler or SynchronizationContext. Without diving into the docs, I’m not sure yet which I’ll need, but the aim will be:

  • Make TimeMachine implement IDisposable
  • In the constructor, set the current SynchronizationContext (or TaskScheduler) to a custom one having remembered what the previous one was
  • On disposal, reset the context
  • Make the custom scheduler keep a queue of jobs, such that when we’re asked to advance to time T, we complete all the appropriate tasks but don’t execute any continuations, then we execute all the pending continuations.

I don’t yet know how hard it will be, but hopefully the Parallel Extensions Samples will help me.

Conclusion

I’m not going to claim this is "the" way of unit testing asynchronous methods. It’s clearly a proof-of-concept implementation of what can only be called a "test framework" in the loosest possible sense. However, I hope it gives an example of a path we might take. I’m looking forward to seeing what others come up with, along with rather more polished implementations.

Next time, I’m going to shamelessly steal an idea that a reader mailed me (with permission, of course). It’s insanely cool, simple and yet slightly brain-bending, and I suspect will handy in many situations. Love it.

Eduasync part 15: implementing COMEFROM with a horrible hack

Ages ago when I wrote my previous Eduasync post, I said we’d look at a pipeline model of coroutines. I’ve decided to skip that, as I do want to cover the topic of this post, and I’ve got some more "normal" async ideas to write about too. If you want to look at the pipeline coroutines code, it’s project 20 in the source repository. Have fun, and don’t blame me if you get confused reading it – so do I.

The code I am going to write about is horrible too. It’s almost as tricky to understand, and it does far nastier things. Things that the C# 5 specification explicitly says you shouldn’t do.

If it makes you feel any better when your head hurts reading this code, spare a thought for me – I haven’t looked at it in over six months, and I don’t have a blog post explaining how it’s meant to work. I just have almost entirely uncommented code which is designed to be hard to understand (in terms of the main program flow).

On no account should any code like this ever be used for anything remotely serious.

With that health warning out of the way, let’s have a look at it…

COMEFROM at the caller level

The idea is to implement the COMEFROM control structure, which is sort of the opposite of GOTO (or in my implementation, more of a GOSUB). There are two operations, effectively:

  • ComeFrom(label): Register interest in a particular label.
  • Label(label): If anyone has registered interested in the given label, keep going from their registration point (which will be within a method), then continue from where we left off afterwards.

In some senses it’s a little like the observer pattern, with labels taking the place of events. However, it looks entirely different and is much harder to get your head round, because instead of having a nicely-encapsulated action which is subscribed to an event, we just have a ComeFrom call which lets us jump back into a method somewhat arbitrarily.

I have two implementations, in project 22 and project 23 in source control. Project 22 is almost sane; a little funky, but not too bad. Project 23 is where the fun really happens. In addition to the operations listed above, there’s an Execute operation which is sort of an implementation detail – it allows an async method containing ComeFrom calls to be executed without returning earlier than we might want.

Let’s look at some code and the output, and try to work out what’s going on.

internal class Program
{
    private static void Main(string[] args)
    {
        Coordinator coordinator = new Coordinator(SimpleEntryPoint);
        coordinator.Start();
    }

    private static async void SimpleEntryPoint(Coordinator coordinator)
    {
        await coordinator.Execute(SimpleOtherMethod);

        Console.WriteLine("First call to Label(x)");
        await coordinator.Label("x");

        Console.WriteLine("Second call to Label(x)");
        await coordinator.Label("x");

        Console.WriteLine("Registering interesting in y");
        bool firstTime = true;
        await coordinator.ComeFrom("y");

        Console.WriteLine("After ComeFrom(y). FirstTime={0}", firstTime);

        if (firstTime)
        {
            firstTime = false;
            await coordinator.Label("y");
        }
        Console.WriteLine("Finished");
    }

    private static async void SimpleOtherMethod(Coordinator coordinator)
    {
        Console.WriteLine("Start of SimpleOtherMethod");

        int count = 0;
        await coordinator.ComeFrom("x");

        Console.WriteLine("After ComeFrom x in SimpleOtherMethod. count={0}. Returning.",
                          count);
        count++;
    }
}

The reason for the "Simple" prefix on the method names is that there’s another example in the same file, with a more complex control flow.

Here’s the output – then we can look at why we’re getting it…

Start of SimpleOtherMethod
After ComeFrom x in SimpleOtherMethod. count=0. Returning.
First call to Label(x)
After ComeFrom x in SimpleOtherMethod. count=1. Returning.
Second call to Label(x)
After ComeFrom x in SimpleOtherMethod. count=2. Returning.
Registering interesting in y
After ComeFrom(y). FirstTime=True
After ComeFrom(y). FirstTime=False
Finished

So, the control flow is a bit like this:

  • Start SimpleEntryPoint
    • Call into SimpleOtherMethod
      • Log "Start of SimpleOtherMethod"
      • Initialize the "count" variable with value 0
      • Register interest in x; ComeFrom remembers the continuation but keeps going.
      • Log "After ComeFrom x in SimpleOtherMethod. count=0. Returning."
      • Increment count to 1.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "First call to Label(x)"
  • Call Label("x")…
    • … which takes us back into SimpleOtherMethod (remember, the method we thought we’d finished executing?) just after ComeFrom
      • Log AfterComeFrom x in SimpleOtherMethod. count=1. Returning.
      • Increment count to 2.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "Second call to Label(x)"
  • Call Label("x")…
    • … which takes us back into SimpleOtherMethod again
      • Log AfterComeFrom x in SimpleOtherMethod. count=2. Returning.
      • Increment count to 3.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "Registering interest in y"
  • Initialize the "firstTime" variable with value true.
  • Register interest in y; ComeFrom remembers the continuation and keeps going
  • Log "After ComeFrom(y). FirstTime=True"
  • Check the value of firstTime… It’s true, so:
    • Set firstTime to false
    • Call Label("y")
  • … which takes us back to earlier in the method (just after ComeFrom), like a normal looping construct…
  • Log "After ComeFrom(y). FirstTime=False"
  • Check the value of firstTime… It’s false, so:
    • Log "Finished"
    • Exit!

Doing all of this has a few interesting challenges. Let’s look at them one at a time… and I would strongly advise you not to try to pay too much attention to the details.

Noting a continuation and continuing regardless…

Just as a quick reminder before we get cracking, it’s worth remembering that all of this is entirely synchronous, despite being implemented with async. There’s only a single user thread involved here. As with previous parts, we maintain a stack of actions to call, and basically keep calling from the top until we’re done – but the actions we call can create extra stack entries, of course.

ComeFrom has unusual semantics in terms of async. We want to remember the continuation and keep executing as if we didn’t need to wait. We can easily do one side or the other. If we wanted to just keep going without needing to know about the continuation, we could just return true from IsCompleted. If we just want to remember the continuation, we can make the awaiter’s IsCompleted property return false, and remember the continuation when it’s passed to OnCompleted. How do we do both?

Well, effectively we want to remember the continuation and then call it immediately. But we can’t just call it directly from OnCompleted, as otherwise each ComeFrom call would end up in a "real" execution stack from, whereas our execution stack is stored as a Stack<Action>. So instead, we need to remember the continuation and immediately put it at the top of the stack.

However, that only works if as soon as the generated code returns from the async method containing the ComeFrom call, we go back into the state machine. If we’d just called SimpleOtherMethod directly in SimpleEntryPoint, we would have continued within SimpleEntryPoint with the new stack entry just waiting around. This is why we need the Executor method: that does exactly the same thing, effectively shuffling the stack around. When it’s given something to execute, it puts its own continuation on the action stack, then the action it’s been asked to execute, then returns. The top level code will then pick up the original action, and we’re away.

So, here’s the code for Execute, which is the simplest part of the coordinator:

public ExecuteAwaiter Execute(Action<Coordinator> action)
{
    return new ExecuteAwaiter(() => action(this), this);
}

public class ExecuteAwaiter
{
    private readonly Action action;
    private readonly Coordinator coordinator;

    internal ExecuteAwaiter(Action action, Coordinator coordinator)
    {
        this.action = action;
        this.coordinator = coordinator;
    }

    public ExecuteAwaiter GetAwaiter()
    {
        return this;
    }

    // Always yield
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action callerContinuation)
    {
        // We want to execute the action continuation, then get back here,
        // allowing any extra continuations put on the stack *within* the action
        // to be executed.
        coordinator.stack.Push(callerContinuation);
        coordinator.stack.Push(action);
    }

    public void GetResult()
    {
    }
}

All the awaitables in this project return themselves as the awaiter – when you don’t need any other state, it’s an easy step to take.

That’s all we need to say about Execute, but how exactly are we capturing the continuation in ComeFrom?

Capturing continuations

Once we’ve got the action stack shuffling under our belts, there are two more problems with ComeFrom:

  • What happens if we ComeFrom the same label twice?
  • How do we really capture a continuation?

The first point didn’t come up in the sample I’ve shown here, but it does come up in the more complex example – imagine if SimpleOtherMethod had two ComeFrom calls; when we jump back to the first one, we’ll execute the second one again. I made a simple policy decision to only allow a single "return point" for any label – if a ComeFrom call tries to register the existing continuation point for a label, we ignore it; otherwise we throw an exception. So we only need to care about a single continuation for any label, which makes life easier.

The second point is trickier. If you remember back to earlier posts in this series, we saw that the state machine generated for async only really contains a single entry point (MoveNext) which is used for all continuations. A variable in the state machine is responsible for remembering where we were within it between calls. So in order to really make the continuation remember the point at which it needs to continue, we need to remember that state. We need to store an object for the continuation, which contains the delegate to invoke, and the state of the state machine when we were first passed the continuation. I’ve created a class for this, unimaginatively called Continuation, which looks like this:

/// <summary>
/// This hack allows a continuation to be executed more than once,
/// contrary to the C# spec. It does this using reflection to store the
/// value of the "state" field within the generated class. NEVER, EVER, EVER
/// try to use this in real code. It’s purely for fun.
/// </summary>
internal sealed class Continuation : IEquatable<Continuation>
{
    private readonly int savedState;
    private readonly object target;
    private readonly FieldInfo field;
    private readonly Action action;

    internal Continuation(Action action)
    {
        target = action.Target;
        field = target.GetType().GetField("<>1__state", BindingFlags.Instance | BindingFlags.NonPublic);
        savedState = (int) field.GetValue(target);
        this.action = action;
    }

    internal void Execute()
    {
        field.SetValue(target, savedState);
        action();
    }

    // Snip Equals/GetHashCode
}

Yes, we use reflection to fish out the <>1__state variable initially, and poke the state machine with the same value when we next want to execute the continuation. All highly implementation-specific, of course.

Now the ComeFrom method is reasonably straightforward – all we need is a dictionary mapping labels to continuations. Oh, and the same action stack shuffling as for Execute:

// In the coordinator
private readonly Dictionary<string, Continuation> labels = new Dictionary<string, Continuation>();

public ComeFromAwaiter ComeFrom(string label)
{
    return new ComeFromAwaiter(label, this);
}

public struct ComeFromAwaiter
{
    private readonly string label;
    private readonly Coordinator coordinator;

    internal ComeFromAwaiter(string label, Coordinator coordinator)
    {
        this.label = label;
        this.coordinator = coordinator;
    }

    public ComeFromAwaiter GetAwaiter()
    {
        return this;
    }

    // We *always* want to be given the continuation
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action action)
    {
        Continuation newContinuation = new Continuation(action);
        Continuation oldContinuation;
        if (!coordinator.labels.TryGetValue(label, out oldContinuation))
        {
            // First time coming from this label. Always succeeds.
            coordinator.labels[label] = newContinuation;
        }
        else
        {
            // Current semantics are to prohibit two different ComeFrom calls for the same label.
            // An alternative would be to just replace the existing continuation with the new one,
            // in which case we wouldn’t need any of this – we could just use
            // coordinator.labels[label] = newContinuation;
            // unconditionally.
            if (!oldContinuation.Equals(newContinuation))
            {
                throw new InvalidOperationException("Additional continuation detected for label " + label);
            }
            // Okay, we’ve seen this one before. Nothing to see here, move on.
        }
        // We actually want to continue from where we were: we’re only really marking the
        // ComeFrom point.
        coordinator.stack.Push(action);
    }

    public void GetResult()
    {
    }
}

There’s one interesting point here which is somewhat subtle, and screwed me up for a bit…

The default value of a struct is always valid…

You may have noticed that ComeFromAwaiter is a struct. That’s pretty unusual for me. However, it’s also absolutely critical. Without it, we’d get a NullReferenceException when we execute the continuation the second time.

Normally, the flow of async methods looks a bit like this, for an await expression taking the "long" route (i.e. IsCompleted is false):

  • Call GetAwaiter() and assign the result to an awaiter field
  • Call IsCompleted (which returns false in this scenario)
  • Set the state variable to remember where we’d got to
  • Call OnCompleted
  • Return
  • … When we continue…
  • Set state to 0 (running)
  • Call GetResult() on the awaiter
  • Set the awaiter field to default(TypeOfAwaiter)
  • Continue

Now that’s fine when we’re only continuing once – but if we need to jump into the middle of that sequence a second time, we’re going to call GetAwaiter() on the awaiter field after it’s been set to the default value of the awaiter type. If the default value is null, we’ll go bang. So we must use a struct.

Fortunately, our GetResult() call doesn’t need any of the state in the awaiter – it’s purely there to satisfy the normal flow of things. So we’re quite happy with a "default" ComeFrom awaiter.

Finally, labels…

We’ve now done all the hard work. The final piece of the puzzle is Label, which just needs to check whether there’s a continuation to jump to, and shuffle the action stack in the way we’re now painfully accustomed to:

public LabelAwaiter Label(string label)
{
    Continuation continuation;
    labels.TryGetValue(label, out continuation);
    return new LabelAwaiter(continuation, this);
}

public class LabelAwaiter
{
    private readonly Continuation continuation;
    private readonly Coordinator coordinator;

    internal LabelAwaiter(Continuation continuation, Coordinator coordinator)
    {
        this.continuation = continuation;
        this.coordinator = coordinator;
    }

    public LabelAwaiter GetAwaiter()
    {
        return this;
    }

    // If there’s no continuation to execute, just breeze through.
    public bool IsCompleted { get { return continuation == null; } }

    public void OnCompleted(Action action)
    {
        // We want to execute the ComeFrom continuation, then get back here.
        coordinator.stack.Push(action);
        coordinator.stack.Push(continuation.Execute);
    }

    public void GetResult()
    {
    }
}

Almost painfully simple, really.

So that looks like all the code that’s used, right? Not quite

Reusable builders?

As we saw in the sample code, we can end up finishing the same async method multiple times (SimpleOtherMethod completes three times). That’s going to call SetResult on the AsyncVoidMethodBuilder three times… which feels like it should go bang. Indeed, when I revisited my code earlier I wondered why it didn’t go bang – it’s the sort of illegal state transition the framework is usually pretty good at picking up on.

Then I remembered – this isn’t the framework’s AsyncVoidMethodBuilder – it’s mine. And my SetResult method in this project does absolutely nothing. How convenient!

Make it stop, make it stop!

Okay thiat was a pretty quick tour of some horrible code. You’ll never have to do anything like this with async in sane code, but it certainly made me painfully familiar with how it all worked. Just to recap on the oddities involved:

  • We needed to capture a continuation and then immediately keep going, almost as if the awaiter had said the awaitable had completed already. This involved shenanigans with the execution model and an extra method (Execute)
  • We needed to remember the state of a continuation, which we did with reflection.
  • We needed to make awaiter.GetResult() a valid call after awaiter had been reset to the default value for the type
  • We needed to ensure that the builder created in the skeleton method could have SetResult called on it multiple times

That’s all on continuations and co-routines, I promise.

Next time (hopefully soon) I’ll look at an example of how composition works so neatly in async, and then show how we can unit test async methods – at least sometimes.

Speaking engagement: Progressive .NET, London, September 7th

Just a quick note to mention an event I’ll be speaking at in September. SkillsMatter will be hosting Progressive .NET, a 3-day event set of tutorials on September 5th-7th in London. I’ll be speaking about C# 5’s async feature on the last day (9.30am-1pm) but there’s a host of other speakers too. Should be good. For my own part, with four hours or so to cover async, I should be able to cover both the high level stuff and the implementation details, with plenty of time for the inevitable questions.

This one isn’t free though, I’m afraid – it’s normally £425. Hardly pocket money, but pretty good value for three full days of deep-dive sessions. However, there are two bits of good news:

  • Readers of this blog can get £50 off using the promo code "PROGNET50" at the checkout.
  • I have two free tickets to give away.

In an effort to make the ticket give-away fair, I’m thinking of a 32-bit number – mail me (skeet@pobox.com) an Int32, and the two readers with the closest value will get the tickets. Please include "Progressive .NET" in the subject line of the mail so I can filter them easily :)

Anyway, hope to see you there – please grab me to say hi.

Update (August 4th): and the winners are…

Congratulations to The Configurator and Haris Hasan who submitted the closest numbers to the one I was thinking of: -890978631.

In fact, The Configurator guessed the exact value – which is the result of calling "Progressive .NET".GetHashCode() on my 32-bit laptop running .NET 4. (I can’t remember which versions have different hash algorithms, but as it’s pretty arbitrary, it seemed good enough…) I’m impressed!

I’ll be emailing SkillsMatter to let them know about the winners – and thanks to everyone else who mailed me a guess. Hope I’ll see some of you there anyway!

Eduasync part 14: Data passing in coroutines

(This post covers project 19 in the source code.)

Last time we looked at independent coroutines running in a round-robin fashion. This time we’ll keep the round-robin scheduling, but add in the idea of passing data from one coroutine to another. Each coroutine will act on data of the same type, which is necessary for the scheme to work when one coroutine could "drop out" of the chain by returning.

Designing the data flow

It took me a while to get to the stage where I was happy with the design of how data flowed around these coroutines. I knew I wanted a coordinator as before, and that it should have a Yield method taking the value to pass to the next coroutine and returning an awaitable which would provide the next value when it completed. The tricky part was working out what to do at the start of each method and the end. If the method just took a Coordinator parameter, we wouldn’t have anything to do with the value yielded by the first coroutine, because the second coroutine wouldn’t be ready to accept it yet. Likewise when a coroutine completed, we wouldn’t have another value to pass to the next coroutine.

Writing these dilemmas out in this post, the solution seems blindingly obvious of course: each coroutine should accept a data value on entry, and return one at the end. At any point where we transfer control, we provide a value and have a value which is required by something. The final twist is to make the coordinator’s Start method take an initial value and return the value returned by the last coroutine to complete.

So, that’s the theory… let’s look at the implementation.

Initialization

I’ve changed the coordinator to take all the coroutines as a constructor parameter (of the somewhat fearsome declaration "params Func<Coordinator<T>, T, Task<T>>[] coroutines") which means we don’t need to implement IEnumerable pointlessly any more.

This leads to a code skeleton of this form:

private static void Main(string[] args)
{
    var coordinator = new Coordinator<string>(FirstCoroutine,
                                              SecondCoroutine,
                                              ThirdCoroutine);
    string finalResult = coordinator.Start("m1");
    Console.WriteLine("Final result: {0}", finalResult);
}

private static async Task<string> FirstCoroutine(
    Coordinator<string> coordinator,
    string initialValue)
{
    …
}

// Same signature for SecondCoroutine and ThirdCoroutine

Last time we simply had a Queue<Action> internally in the coordinator as the actions to invoke. You might be expecting a Queue<Func<T, T>> this time – after all, we’re passing in data and returning data at each point. However, the mechanism for that data transfer is "out of band" so to speak. The only time we really "return" an item is when we reach the end of a coroutine. Usually we’ll be providing data to the next step using a method. Likewise the only time a coroutine is given data directly is in the first call – after that, it will have to fetch the value by calling GetResult() on the awaiter which it uses to yield control.

All of this is leading to a requirement for our constructor to convert each coroutine delegate into a simple Action. The trick is working out how to deal with the data flow. I’m going to include SupplyValue() and ConsumeValue() methods within the coordinator for the awaiter to use, so it’s just a case of calling those appropriately from our action. In particular:

  • When the action is called, it should consume the current value.
  • It should then call the coroutine passing in the coordinator ("this") and the initial value.
  • When the task returned by the coroutine has completed, the result of that task should be used to supply a new value.

The only tricky part here is the last bullet – and it’s not that hard really, so long as we remember that we’re absolutely not trying to start any new threads. We just want to hook onto the end of the task, getting a chance to supply the value before the next coroutine tries to pick it up. We can do that using Task.ContinueWith, but passing in TaskContinuationOptions.ExecuteSynchronously so that we use the same thread that the task completes on to execute the continuation.

At this point we can implement the initialization part of the coordinator, assuming the presence of SupplyValue() and ConsumeValue():

public sealed class Coordinator<T>
{
    private readonly Queue<Action> actions;
    private readonly Awaitable awaitable;

    public Coordinator(params Func<Coordinator<T>, T, Task<T>>[] coroutines)
    {
        // We can’t refer to "this" in the variable initializer. We can use
        // the same awaitable for all yield calls.
        this.awaitable = new Awaitable(this);
        actions = new Queue<Action>(coroutines.Select(ConvertCoroutine));
    }

    // Converts a coroutine into an action which consumes the current value,
    // calls the coroutine, and attaches a continuation to it so that the return
    // value is used as the new value.
    private Action ConvertCoroutine(Func<Coordinator<T>, T, Task<T>> coroutine)
    {
        return () =>
        {
            Task<T> task = coroutine(this, ConsumeValue());
            task.ContinueWith(ignored => SupplyValue(task.Result),
                TaskContinuationOptions.ExecuteSynchronously);
        };
    }
}

I’ve broken ConvertCoroutine into a separate method so that we can use it as the projection for the Select call within the constructor. I did initially have it within a lambda expression within the constructor, but it was utterly hideous in terms of readabililty.

One suggestion I’ve received is that I could declare a new delegate type instead of using Func<Coordinator<T>, T, Task<T>> to represent a coroutine. This could either be a non-generic delegate nested in the generic coordinator class, or a generic stand-alone delegate:

public delegate T Coroutine<T>(Coordinator<T> coordinator, T initialValue);

// Or nested…
public sealed class Coordinator<T>
{
    public delegate T Coroutine(Coordinator<T> coordinator, T initialValue);
}

Both of these would work perfectly well. I haven’t made the change at the moment, but it’s certainly worth considering. The debate about whether to use custom delegate types or Func/Action is one for another blog post, I think :)

The one bit of the initialization I haven’t explained yet is the "awaitable" field and the Awaitable type. They’re to do with yielding – so let’s look at them now.

Yielding and transferring data

Next we need to work out how we’re going to transfer data and control between the coroutines. As I’ve mentioned, we’re going to use a method within the coordinator, called from the coroutines, to accomplish this. The coroutines have this sort of code:

private static async Task<string> FirstCoroutine(
    Coordinator<string> coordinator,
    string initialValue)
{
    Console.WriteLine("Starting FirstCoroutine with initial value {0}",
                      initialValue);            
    …
    string received = await coordinator.Yield("x1");
    Console.WriteLine("Returned to FirstCoroutine with value {0}", received);
    …
    return "x3";
}

The method name "Yield" here is a double-edged sword. The word has two meanings – yielding a value to be used elsewhere, and yielding control until we’re called back. Normally it’s not ideal to use a name that can mean subtly different things – but in this case we actually want both of these meanings.

So, what does Yield need to do? Well, the flow control should look something like this:

  • Coroutine calls Yield()
  • Yield() calls SupplyValue() internally to remember the new value to be consumed by the next coroutine
  • Yield() returns an awaitable to the coroutine
  • Due to the await expression, the coroutine calls GetAwaiter() on the awaitable to get an awaiter
  • The coroutine checks IsCompleted on the awaiter, which must return false (to prompt the remaining behaviour)
  • The coroutine calls OnCompleted() passing in the continuation for the rest of the method
  • The coroutine returns to its caller
  • The coordinator proceeds with the next coroutine
  • When we eventually get back to this coroutine, it will call GetResult() to get the "current value" to assign to the "received" variable.

Now you’ll see that Yield() needs to return some kind of awaitable type – in other words, one with a GetAwaiter() method. Previously we put this directly on the Coordinator type, and we could have done that here – but I don’t really want anyone to just "await coordinator" accidentally. You should really need to call Yield() in order to get an awaitable. So we have an Awaitable type, nested in Coordinator.

We then need to decide what the awaiter type is – the result of calling GetAwaiter() on the awaitable. This time I decided to use the Coordinator itself. That means people could accidentally call IsCompleted, OnCompleted() or GetResult(), but I figured that wasn’t too bad. If we were to go to the extreme, we’d create another type just for the Awaiter as well. It would need to have a reference to the coordinator of course, in order to actually do its job. As it is, we can make the Awaitable just return the Coordinator that created it. (Awaitable is nested within Coordinator<T>, which is how it can refer to T without being generic itself.)

public sealed class Awaitable
{
    private readonly Coordinator<T> coordinator;

    internal Awaitable(Coordinator<T> coordinator)
    {
        this.coordinator = coordinator;
    }

    public Coordinator<T> GetAwaiter()
    {
        return coordinator;
    }
}

The only state here is the coordinator, which is why we create an instance of Awaitable on the construction of the Coordinator, and keep it around.

Now Yield() is really simple:

public Awaitable Yield(T value)
{
    SupplyValue(value);
    return awaitable;
}

So to recap, we now just need the awaiter members, SupplyValue() and ConsumeValue(). Let’s look at the awaiter members (in Coordinator) to start with. We already know that IsCompleted will just return false. OnCompleted() just needs to stash the continuation in the queue, and GetResult() just needs to consume the "current" value and return it:

public bool IsCompleted { get { return false; } }

public void OnCompleted(Action continuation)
{
    actions.Enqueue(continuation);
}

public T GetResult()
{
    return ConsumeValue();
}

Simple, huh? Finally, consuming and supplying values:

private T currentValue;
private bool valuePresent;

private void SupplyValue(T value)
{
    if (valuePresent)
    {
        throw new InvalidOperationException
            ("Attempt to supply value when one is already present");
    }
    currentValue = value;
    valuePresent = true;
}

private T ConsumeValue()
{
    if (!valuePresent)
    {
        throw new InvalidOperationException
            ("Attempt to consume value when it isn’t present");
    }
    T oldValue = currentValue;
    valuePresent = false;
    currentValue = default(T);
    return oldValue;
}

These are relatively long methods (compared with the other ones I’ve shown) but pretty simple. Hopefully they don’t need explanation :)

The results

Now that everything’s in place, we can run it. I haven’t posted the full code of the coroutines, but you can see it on Google Code. Hopefully the results speak for themselves though – you can see the relevant values passing from one coroutine to another (and in and out of the Start method).

Starting FirstCoroutine with initial value m1
Yielding ‘x1’ from FirstCoroutine…
    Starting SecondCoroutine with initial value x1
    Starting SecondCoroutine
    Yielding ‘y1’ from SecondCoroutine…
        Starting ThirdCoroutine with initial value y1
        Yielding ‘z1’ from ThirdCoroutine…
Returned to FirstCoroutine with value z1
Yielding ‘x2’ from FirstCoroutine…
    Returned to SecondCoroutine with value x2
    Yielding ‘y2’ from SecondCoroutine…
        Returned to ThirdCoroutine with value y2
        Finished ThirdCoroutine…
Returned to FirstCoroutine with value z2
Finished FirstCoroutine
    Returned to SecondCoroutine with value x3
    Yielding ‘y3’ from SecondCoroutine…
    Returned to SecondCoroutine with value y3
    Finished SecondCoroutine
Final result: y4

Conclusion

I’m not going to claim this is the world’s most useful coroutine model – or indeed useful at all. As ever, I’m more interested in thinking about how data and control flow can be modelled than actual usefulness.

In this case, it was the realization that everything should accept and return a value of the same type which really made it all work. After that, the actual code is pretty straightforward. (At least, I think it is – please let me know if any bits are confusing, and I’ll try to elaborate on them.)

Next time we’ll look at something more like a pipeline model – something remarkably reminiscent of LINQ, but without taking up as much stack space (and with vastly worse readability, of course). Unfortunately the current code reaches the limits of my ability to understand why it works, which means it far exceeds my ability to explain why it works. Hopefully I can simplify it a bit over the next few days.

Eduasync part 13: first look at coroutines with async

(This part covers project 18 in the source code.)

As I mentioned in earlier parts, the "awaiting" part of async methods is in no way limited to tasks. So long as we have a suitable GetAwaiter() method which returns a value of a type which in turn has suitable methods on it, the compiler doesn’t really care what’s going on. It’s time to exploit that to implement some form of coroutines in C#.

Introduction to coroutines

The fundamental idea of coroutines is to have multiple methods executing cooperatively, each of them maintaining their position within the coroutine when they yield to another. You can almost think of them as executing in multiple threads, with only one thread actually running at a time, and signalling between the different threads to control flow. However, we don’t really need multiple threads once we’ve got continuations – we can have a single thread with a complex flow of continuations, and still only a very short "real" stack. (The control flow is stored in normal collections instead of being implicit on the thread’s stack.)

Coroutines were already feasible in C# through the use of iterator blocks, but the async feature of C# allows a slightly more natural way of expressing them, in my view. (The linked Wikipedia page gives a sketch of how coroutines can be built on top of generators, which in the general concept that iterator blocks implement in C#.)

I have implemented various flavours of coroutines in Eduasync. It’s possible that some (all?) of them shouldn’t strictly be called coroutines, but they’re close enough to the real thing in feeling. This is far from an exhaustive set of approaches. Once you’ve got the basic idea of what I’m doing, you may well want to experiment with your own implementations.

I’m not going to claim that the use of coroutines in any of my examples really makes any sense in terms of making real tasks easier. This is purely for the sake of interest and twisting the async feature for fun.

Round-robin independent coroutines

Our first implementation of coroutines is relatively simple. A coordinator effectively "schedules" the coroutines it’s set up with in a round-robin fashion: when one of the coroutines yields control to the coordinator, the coordinator remembers where the coroutine had got to, and then starts the next one. When each coroutine has executed its first piece of code and yielded control, the coordinator will go back to the first coroutine to continue execution, and so on until all coroutines have completed.

The coroutines don’t know about each other, and no data is being passed between them.

Hopefully it’s reasonably obvious that the coordinator contains all the smarts here – the coroutines themselves can be relatively dumb. Let’s look at what the client code looks like (along with the results) before we get to the coordinator code.

Client code

The sample code contains three coroutines, all of which take a Coordinator parameter and have a void return type. These are passed to a new coordinator using a collection initializer and method group conversions; the coordinator is then started. Here’s the entry point code for this:

private static void Main(string[] args)
{
    var coordinator = new Coordinator { 
        FirstCoroutine,
        SecondCoroutine,
        ThirdCoroutine
    };
    coordinator.Start();
}

When each coroutine is initially started, the coordinator passes a reference to itself as the argument to the coroutine. That’s how we solve the chicken-and-egg problem of the coroutine and coordinator having to know about each other. The way a coroutine yields control is simply by awaiting the coordinator. The result type of this await expression is void – it’s just a way of "pausing" the coroutine.

We’re not doing anything interesting in the actual coroutines – just tracing the execution flow. Of course we could do anything we wanted, within reason. We could even await a genuinely asynchronous task such as fetching a web page asynchronously. In that case the whole coroutine collection would be "paused" until the fetch returned.

Here’s the code for the first coroutine – the second and third ones are similar, but use different indentation for clarity. The third coroutine is also shorter, just for fun – it only awaits the coordinator once.

private static async void FirstCoroutine(Coordinator coordinator)
{
    Console.WriteLine("Starting FirstCoroutine");
    Console.WriteLine("Yielding from FirstCoroutine…");

    await coordinator;

    Console.WriteLine("Returned to FirstCoroutine");
    Console.WriteLine("Yielding from FirstCoroutine again…");

    await coordinator;

    Console.WriteLine("Returned to FirstCoroutine again");
    Console.WriteLine("Finished FirstCoroutine");
}

And here’s the output…

Starting FirstCoroutine
Yielding from FirstCoroutine…
    Starting SecondCoroutine
    Yielding from SecondCoroutine…
        Starting ThirdCoroutine
        Yielding from ThirdCoroutine…
Returned to FirstCoroutine
Yielding from FirstCoroutine again…
    Returned to SecondCoroutine
    Yielding from SecondCoroutine again…
        Returned to ThirdCoroutine
        Finished ThirdCoroutine…
Returned to FirstCoroutine again
Finished FirstCoroutine
    Returned to SecondCoroutine again
    Finished SecondCoroutine

Hopefully that’s the output you expected, given the earlier description. Again it may help if you think of the coroutines as running in separate pseudo-threads: the execution within each pseudo-thread is just linear, and the timing is controlled by our explicit "await" expressions. All of this would actually be pretty easy to implement using multiple threads which really did just block on each await expression – but the fun part is keeping it all in one real thread. Let’s have a look at the coordinator.

The Coordinator class

Some of the later coroutine examples end up being slightly brainbusting, at least for me. This one is relatively straightforward though, once you’ve got the basic idea. All we need is a queue of actions to execute. After initialization, we want our queue to contain the coroutine starting points.

When a coroutine yields control, we just need to add the remainder of it to the end of the queue, and move on to the next item. Obviously the async infrastructure will provide "the remainder of the coroutine" as a continuation via the OnContinue method.

When a coroutine just returns, we continue with the next item in the queue as before – it’s just that we won’t add a continuation to the end of the queue. Eventually (well, hopefully) we’ll end up with an empty queue, at which point we can stop.

Initialization and a choice of data structures

We’ll represent our queue using Queue<T> where the T is a delegate type. We have two choices here though, because we have two kinds of delegate – one which takes the Coordinator as a parameter (for the initial coroutine setup) and one which has no parameters (for the continuations). Fortunately we can convert between the two in either direction very simply, bearing in mind that all of this is within the context of a coordinator. For example:

// If we’re given a coroutine and want a plain Action
Action<Coordinator> coroutine = …; 
Action action = () => coroutine(this);

// If we’re given a plain Action and want an Action<Continuation>:
Action continuation = …; 
Action<Coordinator> coroutine = ignored => continuation();

I’ve arbitrarily chosen to use the first option, so there’s a Queue<Action> internally.

Now we need to get the collection initializer working. The C# compiler requires an appropriate Add method (which is easy) and also checks that the type implements IEnumerable. We don’t really need to be able to iterate over the queue of actions, so I’ve use explicit interface implementation to reduce the likelihood of GetEnumerator() being called inappropriately, and made the method throw an exception for good measure. That gives us the skeleton of the class required for setting up:

public sealed class Coordinator : IEnumerable
{
    private readonly Queue<Action> actions = new Queue<Action>();

    // Used by collection initializer to specify the coroutines to run
    public void Add(Action<Coordinator> coroutine)
    {
        actions.Enqueue(() => coroutine(this));
    }

    // Required for collection initializers, but we don’t really want
    // to expose anything.
    IEnumerator IEnumerable.GetEnumerator()
    {
        throw new NotSupportedException("IEnumerable only supported to enable collection initializers");
    }
}

(Note that I haven’t used XML documentation anywhere here – it’s great for real code, but adds clutter in blog posts.)

For production code I’d probably prevent Add from being called after the coordinator had been started, but there’s no need to do it in our well-behaved sample code. We’re only going to add extra actions to the queue via continuations, which will be added due to await expressions.

The main execution loop and async infrastructure

So far we’ve got code to register coroutines in the queue – so now we need to execute them. Bearing in mind that the actions themselves will be responsible for adding continuations, the main loop of the coordinator is embarrassingly simple:

// Execute actions in the queue until it’s empty. Actions add *more*
// actions (continuations) to the queue by awaiting this coordinator.
public void Start()
{
    while (actions.Count > 0)
    {
        actions.Dequeue().Invoke();
    }
}

Of course, the interesting bit is the code which supports the async methods and await expressions. We know we need to provide a GetAwaiter() method, but what should that return? Well, we’re just going to use the awaiter to add a continuation to the coordinator’s queue. It’s got no other state than that – so we might as well return the coordinator itself, and put the other infrastructure methods directly in the coordinator.

Again, this is slightly ugly, as the extra methods don’t really make sense on the coordinator – we wouldn’t want to call them directly from client code, for example. However, they’re fairly irrelevant – we could always create a nested type which just had a reference to its "parent" coordinator if we wanted to. For simplicity, I haven’t bothered with this – I’ve just implemented GetAwaiter() trivially:

// Used by await expressions to get an awaiter
public Coordinator GetAwaiter()
{
    return this;
}

So, that leaves just three members still to implement: IsCompleted, OnCompleted and GetResult. We always want the IsCompleted property to return false, as otherwise the coroutine will just continue executing immediately without returning to cede control; the await expression would be pointless. OnCompleted just needs to add the continuation to the end of the queue – we don’t need to attach it to a task, or anything like that. Finally, GetResult is a no-op – we have no results, no exceptions, and basically nothing to do. You might want to add a bit of logging here, if you were so inclined, but there’s no real need.

So, here are the final three members of Coordinator:

// Force await to yield control
public bool IsCompleted { get { return false; } }

public void OnCompleted(Action continuation)
{
    // Put the continuation at the end of the queue, ready to
    // execute when the other coroutines have had a go.
    actions.Enqueue(continuation);
}

public void GetResult()
{
    // Our await expressions are void, and we never need to throw
    // an exception, so this is a no-op.
}

And that’s it! Fewer than 50 lines of code required, and nothing complicated at all. The interesting behaviour is all due to the way the C# compiler uses the coordinator when awaiting it.

We need AsyncVoidMethodBuilder as before, as we have some async void methods – but that doesn’t need to do anything significant. That’s basically all the code required to implement these basic round-robin coroutines.

Conclusion

Our first foray into the weird and wonderful world of coroutines was relatively tame. The basic idea of a coordinator keeping track of the state of all the different coroutines in one sense or another will keep coming back to us, but with different ways of controlling the execution flow.

Next time we’ll see some coroutines which can pass data to each other.