All posts by jonskeet

Book Review: Fluent C# (Rebecca Riordan, Sams)

(As usual, I will be sending the publisher a copy of this review to give them and the author a chance to reply to it before I publish it to the blog. Other than including their comments and correcting any factual mistakes they may point out, I don’t intend to change the review itself.)

Introduction and disclaimers

In late October, Sams (the publisher) approached me to ask if I’d be interested in reviewing their newest introductory book on C#. Despite my burgeoning review stack, I said I was interested – I’m always on the lookout for good books to recommend. So, the first disclaimer is that this was a review copy – I didn’t have to pay for it. I don’t believe that has biased this review though.

Second disclaimer: obviously as C# in Depth is also "a book about C#" you might be wondering whether the two books are competitors. I don’t believe this is the case: Fluent C# explicitly talks about its target audience, which is primarily complete newcomers to programming. C# in Depth pretty much requires you to know at least C# 1, or perhaps be very comfortable with a similar language such as Java. I find it hard to imagine someone for whom both books would be suitable.

Obviously that puts me firmly out of the target audience. As I’ve written before, if you think the two most important questions to answer in a technical book review are "Is it accurate?" and "How good is it at teaching its topic?" then any one person will find it hard to answer both questions. Although I’m far from an expert in some of the areas of the book – notably WPF – I’m sure I don’t have the same approach as a true newcomer. In particular, I find myself asking the questions I’d need the answers to in order to develop software professionally: how do I test it? How does the deployment model work? How does the data flow? These aren’t the same concerns as someone who is coming to programming for the first time. This review should be read with that context in mind: that my approach to the subject matter won’t be the same as a regular reader’s.

Physical format and style

Fluent C# is very reminiscent of Head-First C# in its approach, even down to the introductory "why this book is great at teaching you" blurb. It’s all very informal, with lots of pictures, diagrams and reader exercises. It’s a chunky book, at nearly 900 pages including the index – which I’d expect to be pretty daunting to a newcomer. However, that isn’t the main impression you come away with. Instead…

It’s brown. Everywhere. The diagrams, the text, the pictures – they’re all printed in brown, on off-white paper.

Combined with using multiple fonts including cursive ones, this makes for a pointlessly irritating reading experience right from the outset, however good or bad the actual content is. Now it’s possible that this is actually deliberate: I was speaking to someone recently who mentioned some research that shows if you use a hard-to-read font in presentations, people tend to end up reading it several times, so you end up with better memories of the content than if it had been "clean". I don’t know if that’s what Sams intended with this book, but I frequently found myself longing for simple black ink on clean white paper.

Leaving that to one side, I’m not sure I’ll ever really be a fan of the general tone of books like this, but I can certainly see that it’s popular and therefore presumably helpful to many people. It’s not clear to me whether it’s possible to create a book which retains the valuable elements of this style while casting off the aspects which rub me up the wrong way. It’s something about the enforced jollity which just doesn’t quite sit right, but it wouldn’t surprise me if that were more a peculiarity of my personality than anything about the book. Again, I’ve tried to set this to one side when reviewing the book, but it may come through nonetheless.

Structure

The book is broken up into the following sections, with several chapters per section:

  • Getting started (122 pages – finding your way around Visual Studio, debugging, deployment)
  • The Language (100 pages – introduction to C#)
  • The .NET Framework Library (162 pages – text, date/time APIs, collections – and actually more about C# as a language)
  • Best practice (116 pages – inheritance, some principles, design patterns)
  • WPF (341 pages)

I’ve included the page count for each section to show just how much is devoted to WPF. The book goes into much more detail about WPF than it does about the C# language itself (for example, drop shadow effects are included, but the "using" statement and nullable value types aren’t). If you want to write any kind of application other than a WPF one, a large part of the book won’t be useful to you. That’s not to say it’s useless per se – and in fact from my point of view, the WPF section was the most useful. The section on brushes is probably the best written in the whole book, for example. At times it feels to me like the author really wanted to write a book about WPF, but was asked to make it one about C# instead. That may well not be the case at all – it was just an impression.

Even though the best practice section talks briefly about MVC, MVP and MVVM, it doesn’t really go into enough detail to build anything like a real application – and in fact there’s no coverage of persistence of any form. No files, no XML, no database – nothing below the presentation layer, really. As such, although the book claims it’s enough to get you started with application development, it actually only provides a veneer. Even though I didn’t like the first edition of Head-First C# back in 2008, it did at least take the reader end-to-end – the exercises led to complete applications. The best practice section isn’t entirely about architecture and design patterns, however – it’s at this point that inheritance is properly introduced. While I wouldn’t personally count that as a "best practice" as such, it does at least come at the start of the section, before the genuine patterns/architecture areas which would have been harder to understand without that background.

One aspect which concerned me was the emphasis on the debugger and interactive diagnostics. The author states that developers should expect to spend a large part of their time in the debugger, and she says how she prefers using MessageBox.Show for diagnostics over Console.WriteLine information appearing in the output window. While I’m all for something more sophisticated than Console.WriteLine, there are solutions which are a lot less invasive than popping up a dialog, and which can be left in the code (possibly under an execution-time configuration) to allow diagnostics to be produced for real systems.

The "testing and deployment" chapter says nothing about automated tests – it’s as if the author believes that "testing" only involves "running the app in the debugger and seeing if it breaks". I hope that’s not actually the case, and I can understand why newcomers ought to at least know about the debugger – but I’d have welcomed at least a few pages introducing unit testing as a way of recording and checking expectations of how your code behaves. My own preference is to spend as little time in the debugger as possible; I know that’s not always practical, particularly for UI work, but I think it’s a reasonable aim.

Accuracy

Anyone following me on Twitter or Google+ knows where I’m going with this section. After reading through the book, pen in hand (as I always do, even for the books I like), I decided that it was more important to get out some form of errata quickly than this review. As such, I started a Google document which is publicly available to read and add comments to. The result is over 60 pages of notes and errata, and that’s excluding the introduction and table of contents. To be fair to the book, some of those notes are matters of disagreement which are more personal opinion than incontrovertible fact – but there are plenty of simple matters of inaccuracy. Some of the worst are:

  • Claims that String is a value type. (It’s a reference type.)
  • Inconsistency between whether arrays are value types or reference types – but consistently claiming that arrays are immutable, with the exception of the size which can be changed (slowly) using Array.Resize. (Array types are always reference types, and they’re always mutable except the size, which is fixed after creation. Array.Resize creates a new array, it doesn’t change the size of the existing one.)
  • Incorrect syntax for chaining from one constructor to another.
  • The claim that all reference types are mutable. (Some aren’t, and indeed I often aim for immutability. The canonical example of an immutable reference type is String.)
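
The corrections above are easy to check for yourself. Here is a small sketch that does so; the `Point` class and the `TypeFacts` name are made up for illustration and have nothing to do with the book's own samples.

```csharp
using System;

// Hypothetical type, used only to show the constructor chaining syntax.
public sealed class Point
{
    private readonly int x;
    private readonly int y;

    // Chains to the two-parameter constructor using ": this(...)".
    public Point() : this(0, 0)
    {
    }

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    public int X { get { return x; } }
    public int Y { get { return y; } }
}

public static class TypeFacts
{
    public static void Main()
    {
        // String and array types are reference types.
        Console.WriteLine(typeof(string).IsValueType); // False
        Console.WriteLine(typeof(int[]).IsValueType);  // False

        // Arrays are mutable: a change made via one reference is visible via the other.
        int[] original = { 1, 2, 3 };
        int[] alias = original;
        alias[0] = 42;
        Console.WriteLine(original[0]); // 42

        // Array.Resize creates a new array and assigns it to the variable passed by ref.
        // The existing array keeps its size.
        Array.Resize(ref alias, 5);
        Console.WriteLine(original.Length);                  // 3
        Console.WriteLine(alias.Length);                     // 5
        Console.WriteLine(ReferenceEquals(original, alias)); // False

        // The parameterless constructor chained to Point(0, 0).
        Console.WriteLine(new Point().X); // 0
    }
}
```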

There are plenty more – including a huge number of samples which simply won’t compile. Whole double page spreads where every class declaration is missing the "class" keyword. Pieces of code using VB syntax… the list goes on. (The VB syntax errors are probably explained by the author’s other book published at the same time: "Fluent Visual Basic". I suspect there was a certain amount of copy/paste, and the editing process didn’t catch all the changes which were needed to reflect the differences between the languages.)

Beyond the factually incorrect statements, there’s the matter of terminology. Now I’m well aware that I care more about terminology than most people – but there’s simply no reason to start making up terminology or misusing the perfectly good terminology from the specification. The book has a whole section on "commands" in C#, including things like for statements, switch statements, try/catch/finally statements. Additionally, it mislabels class and namespace declarations as "statements", and even mislabels using directives as statements – although it later goes back on the latter point. The word "object" is used at various times to mean any of variable, type, class and object, with no sense of consistency that I could fathom. For example, at one point it’s used in two different senses within the same sentence: "As we’ll see, you can define several different kinds of objects (called TYPES) in C#, but the one you’ll probably work with most often is the OBJECT."

Both accuracy and staying consistent with accepted terminology (primarily the specification) are particularly important for newcomers. If there’s a typo in a relatively advanced book – or in one which is about a particular technology (e.g. MVC) rather than an introductory text on a language, the reader is fairly likely to be able to guess what should really be there based on their existing experience. If a beginner comes across the same problem, they’re likely to assume it’s their fault that the code won’t compile. Likewise if they learn the wrong terminology to start with, they’ll be severely hampered in communicating effectively with other developers – as well as when reading other books.

I don’t want to make it sound like I expect perfection in a book – just yesterday someone mailed me a correction to C# in Depth, and I’d be foolish to try to hold other authors to standards I couldn’t meet myself. Nor am I suggesting it’s easy to be both accessible and accurate – so often an author may have an accurate picture of a complex topic, but have to simplify it in their writing, particularly for an introductory book like Fluent C#. But there are limits – and in my view this book goes well past the level of error that I’m willing to put up with.

Conclusion

I really don’t like ranting. I don’t like sounding mean – and I wanted to like this book. While I like C# 4.0 in a Nutshell and Essential C# 4.0, I’m still looking for a book which I can recommend to readers who want a more "lively" kind of book. Unfortunately I really can’t recommend Fluent C# to anyone – it is simply too inaccurate, and I believe it will cause confusion and instil bad habits in its readers.

So, what next? I’m hoping that the publisher and author will take my errata on board for the next printing, and revise it thoroughly. At that point I still don’t think I’d actually like the book due to its structure and WPF focus (and the colour scheme, which I don’t expect to change), but it would at least be more a matter of taste then.

I have some reason to be hopeful – because my review of Head-First C# was somewhat like this one, and one of the authors of that book (Andrew Stellman) was incredibly good about the whole thing, and as a result the second edition of Head-First C# is a much better book than the first edition. Again, it’s not quite my preferred style, but for readers who like that sort of thing, it’s a much better option than Fluent C# at the moment, and one I’m happy to recommend (with the express caveat of getting the second edition).

At the same time, reading Fluent C# (and particularly thinking about its debugger-first approach) has set me something of a challenge. You see, I’ve mostly avoided writing for new programmers so far – but I feel it’s really important to get folks off on the right foot, and I’d like to have a stab at it. In particular, I would like to see if it’s possible to write an introductory text which teaches C# using unit tests wherever possible… but without being dry. Can we have a "fun" but accurate book, which tries to teach C# from scratch without giving the impression that user interfaces are the be-all and end-all of programming? Can I write in a way which is more personal but doesn’t feel artificial? I can’t see myself starting such a project any time in the next year, but maybe some time in 2013… Watch this space. In the meantime, I’ll keep an eye out for any more introductory books which might be more promising than Fluent C#.

Eduasync part 17: unit testing

In the last post I showed a method to implement "majority voting" for tasks, allowing a result to become available as soon as possible. At the end, I mentioned that I was reasonably confident that it worked because of the unit tests… but I didn’t show the tests themselves. I felt they deserved their own post, as there’s a bigger point here: it’s possible to unit test async code. At least sometimes.

Testing code involving asynchrony is generally a pain. Introducing the exact order of events that you want is awkward, as is managing the threading within tests. However, we do get a few benefits with async methods:

  • We know that the async method itself will only execute in a single thread at a time
  • We can control the thread in which the async method will execute, if it doesn’t configure its awaits explicitly
  • Assuming the async method returns Task or Task<T>, we can check whether or not it’s finished
  • Between Task<T> and TaskCompletionSource<T>, we have a way of injecting tasks that we understand
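
As a rough illustration of the last two points, here is a minimal sketch. It assumes a made-up `DoubleAsync` method (not part of Eduasync) and uses the `Task` types from the released framework rather than the CTP. It injects a task built from a TaskCompletionSource<T>, checks that the async method hasn't finished, and then completes the injected task by hand.

```csharp
using System;
using System.Threading.Tasks;

public static class InjectedTaskDemo
{
    // Hypothetical method under test: it simply awaits whatever task it is given.
    public static async Task<int> DoubleAsync(Task<int> input)
    {
        int value = await input;
        return value * 2;
    }

    public static void Main()
    {
        var source = new TaskCompletionSource<int>();
        Task<int> result = DoubleAsync(source.Task);

        // The injected task hasn't completed yet, so neither has the async method.
        Console.WriteLine(result.IsCompleted); // False

        // Complete the injected task at a moment of our choosing.
        source.SetResult(21);

        // Result blocks until the task has finished, so this holds whether the
        // continuation ran inline or on another thread.
        Console.WriteLine(result.Result); // 42
    }
}
```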

Now in our sample method we have the benefit of passing in the tasks that will be awaited – but assuming you’re using some reasonably testable API to fetch any awaitables within your async method, you should be okay. (Admittedly in the current .NET framework that excludes rather a lot of classes… but the synchronous versions of those calls are generally hard to test too.)

The plan

For our majority tests, we want to be able to see what happens in various scenarios, with tasks completing at different times and in different ways. Looking at the test cases I’ve implemented I have the following tests:

  • NullSequenceOfTasks
  • EmptySequenceOfTasks
  • NullReferencesWithinSequence
  • SimpleSuccess
  • InputOrderIsIrrelevant
  • MajorityWithSomeDisagreement
  • MajorityWithFailureTask
  • EarlyFailure
  • NoMajority

I’m not going to claim this is a comprehensive set of possible tests – it’s a proof of concept more than anything else. Let’s take one test as an example: MajorityWithFailureTask. The aim of this is to pass three tasks (of type Task<string>) into the method. One will give a result of "x", the second will fail with an exception, and the third will also give a result of "x". The events will occur in that order, and only when all three results are in should the returned task complete, at which point it will also have a success result of "x".

So, the tricky bit (compared with normal testing) is introducing the timing. We want to make it appear as if tasks are completing in a particular order, at predetermined times, so we can check the state of the result between events.

Introducing the TimeMachine class

Okay, so it’s a silly name. But the basic idea is to have something to control the logical flow of time through our test. We’re going to ask the TimeMachine to provide us with tasks which will act in a particular way at a given time, and then when we’ve started our async method we can then ask it to move time forward, letting the tasks complete as they go. It’s probably best to look at the code for MajorityWithFailureTask first, and then see what the implementation of TimeMachine looks like. Here’s the test:

[Test]
public void MajorityWithFailureTask()
{
    var timeMachine = new TimeMachine();
    // Second task fails with an exception
    var task1 = timeMachine.AddSuccessTask(1, "x");
    var task2 = timeMachine.AddFaultingTask<string>(2, new Exception("Bang!"));
    var task3 = timeMachine.AddSuccessTask(3, "x");

    var resultTask = MoreTaskEx.WhenMajority(task1, task2, task3);
    Assert.IsFalse(resultTask.IsCompleted);

    // Only one result so far – no consensus
    timeMachine.AdvanceTo(1);
    Assert.IsFalse(resultTask.IsCompleted);

    // Second result is a failure
    timeMachine.AdvanceTo(2);
    Assert.IsFalse(resultTask.IsCompleted);

    // Third result gives majority verdict
    timeMachine.AdvanceTo(3);
    Assert.AreEqual(TaskStatus.RanToCompletion, resultTask.Status);
    Assert.AreEqual("x", resultTask.Result);
}

As you can see, there are two types of method:

  • AddSuccessTask / AddFaultingTask / AddCancelTask (not used here) – these all take the time at which they’re going to complete as their first parameter, and the method name describes the state they’ll reach on completion. The methods return the task created by the time machine, ready to pass into the production code we’re testing.
  • AdvanceTo / AdvanceBy (not used here) – make the time machine "advance time", completing pre-programmed tasks as it goes. When those tasks complete, any continuations attached to them also execute, which is how the whole thing hangs together.

Now forcing tasks to complete is actually pretty simple, if you build them out of TaskCompletionSource<T> to start with. So all we need to do is keep our tasks in "time" order (which I achieve with SortedList), and then when we’re asked to advance time we move through the list and take the appropriate action for all the tasks which weren’t completed before, but are now. I represent the "appropriate action" as a simple Action, which is built with a lambda expression from each of the Add methods. It’s really simple:

public class TimeMachine
{
    private int currentTime = 0;
    private readonly SortedList<int, Action> actions = new SortedList<int, Action>();

    public int CurrentTime { get { return currentTime; } }

    public void AdvanceBy(int time)
    {
        AdvanceTo(currentTime + time);
    }

    public void AdvanceTo(int time)
    {
        // Okay, not terribly efficient, but it’s simple.
        foreach (var entry in actions)
        {
            if (entry.Key > currentTime && entry.Key <= time)
            {
                entry.Value();
            }
        }
        currentTime = time;
    }

    public Task<T> AddSuccessTask<T>(int time, T result)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetResult(result);
        return tcs.Task;
    }

    public Task<T> AddCancelTask<T>(int time)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetCanceled();
        return tcs.Task;
    }

    public Task<T> AddFaultingTask<T>(int time, Exception e)
    {
        TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
        actions[time] = () => tcs.SetException(e);
        return tcs.Task;
    }
}

Okay, that’s a fair amount of code for a blog post (and yes, it could do with some doc comments etc!) but considering that it makes life testable, it’s pretty simple.

So, is that it?

It works on my machine… with my test runner… in simple cases…

When I first ran the tests using TimeMachine, they worked almost immediately. This didn’t surprise me nearly as much as it should have done. You see, when the tests execute, they use async/await in the normal way – which means the continuations are scheduled on "the current task scheduler". I have no idea what the current task scheduler is in unit tests. Or rather, it feels like something which is implementation specific. It could easily have worked when running the tests from ReSharper, but not from NCrunch, or not from the command line NUnit test runner.

As it happens, I believe all of these run tests on thread pool threads with no task scheduler allocated, which means that the continuation is attached to the task to complete "in-line" – so when the TimeMachine sets the result on a TaskCompletionSource, the continuations execute before that call returns. That means everything happens on one thread, with no ambiguity or flakiness – yay!

However, there are two problems:

  • The words "I believe" aren’t exactly confidence-inspiring when it comes to testing that your software works correctly.
  • Our majority voting code only ever sees one completed task at a time – we’re not testing the situation where several tasks complete so quickly together that the continuation doesn’t get a chance to run before they’ve all finished.

Both of these are solvable with a custom TaskScheduler or SynchronizationContext. Without diving into the docs, I’m not sure yet which I’ll need, but the aim will be:

  • Make TimeMachine implement IDisposable
  • In the constructor, set the current SynchronizationContext (or TaskScheduler) to a custom one having remembered what the previous one was
  • On disposal, reset the context
  • Make the custom scheduler keep a queue of jobs, such that when we’re asked to advance to time T, we complete all the appropriate tasks but don’t execute any continuations, then we execute all the pending continuations.
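
The last bullet can be sketched with a SynchronizationContext that queues posted callbacks until it is told to run them. This is only a sketch under stated assumptions, not the eventual TimeMachine change: `QueueingSynchronizationContext`, `RunPending` and `SumAsync` are made-up names, and it uses the released framework's `Task` types rather than the CTP. It also assumes each TaskCompletionSource<T> is created with TaskCreationOptions.RunContinuationsAsynchronously. Without that option, the await continuation may run inline inside SetResult when the completing thread is already on the captured context, and then nothing gets queued.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context: holds posted callbacks until RunPending is called.
public sealed class QueueingSynchronizationContext : SynchronizationContext
{
    private readonly Queue<Action> pending = new Queue<Action>();

    public override void Post(SendOrPostCallback d, object state)
    {
        lock (pending)
        {
            pending.Enqueue(() => d(state));
        }
    }

    // Runs queued callbacks, including any queued while running.
    // Returns the number of callbacks executed.
    public int RunPending()
    {
        int executed = 0;
        while (true)
        {
            Action next;
            lock (pending)
            {
                if (pending.Count == 0)
                {
                    return executed;
                }
                next = pending.Dequeue();
            }
            next();
            executed++;
        }
    }
}

public static class QueueingDemo
{
    // Hypothetical async method standing in for the code under test.
    public static async Task<int> SumAsync(Task<int> first, Task<int> second)
    {
        return await first + await second;
    }

    // Returns { completed before pumping, completed after pumping }.
    public static bool[] Run()
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        var context = new QueueingSynchronizationContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            // Assumption: this option makes SetResult post continuations instead of inlining them.
            var options = TaskCreationOptions.RunContinuationsAsynchronously;
            var first = new TaskCompletionSource<int>(options);
            var second = new TaskCompletionSource<int>(options);
            Task<int> sum = SumAsync(first.Task, second.Task);

            // Both tasks complete "at the same time": no continuation has run yet.
            first.SetResult(1);
            second.SetResult(2);
            bool completedBeforePump = sum.IsCompleted;

            // Now execute the pending continuations.
            context.RunPending();
            return new[] { completedBeforePump, sum.IsCompleted };
        }
        finally
        {
            // Reset the context, as the disposal step in the plan would.
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    public static void Main()
    {
        bool[] states = Run();
        Console.WriteLine(states[0]); // False
        Console.WriteLine(states[1]); // True
    }
}
```

With something like this in place, the code under test sees both tasks as already completed when its continuation finally runs, which is exactly the situation the current tests never exercise.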

I don’t yet know how hard it will be, but hopefully the Parallel Extensions Samples will help me.

Conclusion

I’m not going to claim this is "the" way of unit testing asynchronous methods. It’s clearly a proof-of-concept implementation of what can only be called a "test framework" in the loosest possible sense. However, I hope it gives an example of a path we might take. I’m looking forward to seeing what others come up with, along with rather more polished implementations.

Next time, I’m going to shamelessly steal an idea that a reader mailed me (with permission, of course). It’s insanely cool, simple and yet slightly brain-bending, and I suspect will be handy in many situations. Love it.

Eduasync part 16: Example of composition: majority voting

Note: For the rest of this series, I’ll be veering away from the original purpose of the project (investigating what the compiler is up to) in favour of discussing the feature itself. As such, I’ve added a requirement for AsyncCtpLib.dll – but due to potential distribution restrictions, I’ve felt it safest not to include that in the source repository. If you’re running this code yourself, you’ll need to copy the DLL from your installation location into the Eduasynclib directory before it will build – or change each reference to it.

One of the things I love about async is the compositional aspect. This is partly due to the way that the Task Parallel Library encourages composition to start with, but async/await makes it even easier by building the tasks for you. In the next few posts I’ll talk about a few examples of interesting building blocks. I wouldn’t be surprised to see an open source library with a proper implementation of some of these ideas (Eduasync is not designed for production usage) whether from Microsoft or a third party.

In project 26 of Eduasync, I’ve implemented "majority voting" via composition. The basic idea is simple, and the motivation should be reasonably obvious in this day and age of redundant services. You have (say) five different tasks which are meant to be computing the same thing. As soon as you have a single answer which the majority of the tasks agree on, the code which needs the result can continue. If the tasks disagree, or fail (or a combination leading to no single successful majority result), the overall result is failure too.

My personal experience with services requiring a majority of operations to return is with Megastore, a storage system we use at Google. I’m not going to pretend to understand half of the details of how Megastore works, and I’m certainly not about to reveal any confidential information about its internals or indeed how we use it, but basically when discussing it with colleagues at around the time that async was announced, I contemplated what a handy feature async would be when implementing a Megastore client. It could also be used in systems where each calculation is performed in triplicate to guard against rogue errors – although I suspect the chances of those systems being implemented in C# are pretty small.

It’s worth mentioning that the implementation here wouldn’t be appropriate for something like a stock price service, where the result can change rapidly and you may be happy to tolerate a small discrepancy, within some bounds.

The API

Here are the signatures of the methods we’ll implement:

public static Task<T> WhenMajority<T>(params Task<T>[] tasks)

public static Task<T> WhenMajority<T>(IEnumerable<Task<T>> tasks)

Obviously the first just delegates to the second, but it’s helpful to have both forms, so that we can pass in a few tasks in an ad hoc manner with the first overload, or a LINQ-generated sequence of tasks with the second.

The name is a little odd – it’s meant to match WhenAll and WhenAny, but I’m sure there are better options. I’m not terribly interested in that at the moment.

It’s easy to use within an async method:

Task<int> firstTask = firstServer.ComputeSomethingAsync(input);
Task<int> secondTask = secondServer.ComputeSomethingAsync(input);
Task<int> thirdTask = thirdServer.ComputeSomethingAsync(input);

int result = await MoreTaskEx.WhenMajority(firstTask, secondTask, thirdTask);

Or using the LINQ-oriented overload:

var tasks = servers.Select(server => server.ComputeSomethingAsync(input));
int result = await MoreTaskEx.WhenMajority(tasks);

Of course we could add an extension method (dropping the When prefix as it doesn’t make as much sense there, IMO):

int result = await servers.Select(server => server.ComputeSomethingAsync(input))
                          .MajorityAsync();
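
The extension method itself isn't shown in this post, but a minimal sketch might look like the following. `MajorityExtensions` is a made-up class name. The `WhenMajority` here is a deliberately simplified stand-in, included only to make the example self-contained: it waits for every task via Task.WhenAll, unlike the real implementation later in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class MoreTaskEx
{
    // Simplified stand-in, NOT the real implementation: waits for all tasks,
    // then picks the result that more than half of them agree on.
    public static async Task<T> WhenMajority<T>(IEnumerable<Task<T>> tasks)
    {
        T[] results = await Task.WhenAll(tasks);
        var winner = results.GroupBy(r => r)
                            .OrderByDescending(g => g.Count())
                            .First();
        if (winner.Count() <= results.Length / 2)
        {
            throw new InvalidOperationException("No majority result");
        }
        return winner.Key;
    }
}

// Hypothetical home for the extension method.
public static class MajorityExtensions
{
    // Just delegates, so it can sit at the end of a LINQ query.
    public static Task<T> MajorityAsync<T>(this IEnumerable<Task<T>> tasks)
    {
        return MoreTaskEx.WhenMajority(tasks);
    }
}

public static class MajorityAsyncDemo
{
    public static void Main()
    {
        // Already-completed tasks standing in for calls to three servers.
        var tasks = new[] { Task.FromResult(42), Task.FromResult(42), Task.FromResult(7) };
        Console.WriteLine(tasks.MajorityAsync().Result); // 42
    }
}
```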

The fact that we’ve stayed within the Task<T> model is what makes it all work so smoothly. We couldn’t easily express the same API for other awaitable types in general although we could do it for any other specific awaitable type of course. It’s possible that it would work using dynamic, but I’d rather avoid that :) Let’s implement it now.

Implementation

There are two parts to the implementation, in the same way that we implemented LINQ operators in Edulinq – and for the same reason. We want to go bang immediately if there are any clear input violations – such as the sequence of tasks being null or empty. This is in line with the Task-based Asynchronous Pattern white paper:

An asynchronous method should only directly raise an exception to be thrown out of the MethodNameAsync call in response to a usage error*. For all other errors, exceptions occurring during the execution of an asynchronous method should be assigned to the returned Task.

Now it occurs to me that we don’t really need to do this in two separate methods (one for precondition checking, one for real work). We could create an async lambda expression of type Func<Task<T>>, and make the method just return the result of invoking it – but I don’t think that would be great in terms of readability.
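
For the curious, here is roughly what the single-method version would look like. This is a sketch using a made-up `ParseAsync` method rather than WhenMajority itself. The argument check runs eagerly and throws straight out of the call, while anything that goes wrong inside the async lambda ends up in the returned task.

```csharp
using System;
using System.Threading.Tasks;

public static class EagerValidationDemo
{
    // Hypothetical example method: validates eagerly, then does the real work
    // in an async lambda of type Func<Task<int>>.
    public static Task<int> ParseAsync(Task<string> textTask)
    {
        if (textTask == null)
        {
            // Usage error: thrown directly to the caller.
            throw new ArgumentNullException("textTask");
        }
        Func<Task<int>> impl = async () =>
        {
            string text = await textTask;
            // Any exception thrown here is stored in the returned task instead.
            return int.Parse(text);
        };
        return impl();
    }

    public static void Main()
    {
        try
        {
            ParseAsync(null);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("Usage error thrown directly");
        }

        Task<int> failed = ParseAsync(Task.FromResult("not a number"));
        Console.WriteLine(failed.IsFaulted);                                   // True
        Console.WriteLine(failed.Exception.InnerException is FormatException); // True

        Console.WriteLine(ParseAsync(Task.FromResult("42")).Result); // 42
    }
}
```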

So, the first part of the implementation performing validation is really simple:

public static Task<T> WhenMajority<T>(params Task<T>[] tasks)
{
    return WhenMajority((IEnumerable<Task<T>>) tasks);
}

public static Task<T> WhenMajority<T>(IEnumerable<Task<T>> tasks)
{
    if (tasks == null)
    {
        throw new ArgumentNullException("tasks");
    }
    List<Task<T>> taskList = new List<Task<T>>(tasks);
    if (taskList.Count == 0)
    {
        throw new ArgumentException("Empty sequence of tasks");
    }
    foreach (var task in taskList)
    {
        if (task == null)
        {
            throw new ArgumentException("Null task in sequence");
        }
    }
    return WhenMajorityImpl(taskList);
}

The interesting part is obviously in WhenMajorityImpl. It’s mildly interesting to note that I create a copy of the sequence passed in to start with – I know I’ll need it in a fairly concrete form, so it’s appropriate to remove any laziness at this point.

So, here’s WhenMajorityImpl, which I’ll then explain:

private static async Task<T> WhenMajorityImpl<T>(List<Task<T>> tasks)
{
    // Need a real majority – so for 4 or 5 tasks, must have 3 equal results.
    int majority = (tasks.Count / 2) + 1;
    int failures = 0;
    int bestCount = 0;
            
    Dictionary<T, int> results = new Dictionary<T, int>();
    List<Exception> exceptions = new List<Exception>();
    while (true)
    {
        await TaskEx.WhenAny(tasks);
        var newTasks = new List<Task<T>>();
        foreach (var task in tasks)
        {
            switch (task.Status)
            {
                case TaskStatus.Canceled:
                    failures++;
                    break;
                case TaskStatus.Faulted:
                    failures++;
                    exceptions.Add(task.Exception.Flatten());
                    break;
                case TaskStatus.RanToCompletion:
                    int count;
                    // Doesn’t matter whether it was there before or not – we want 0 if not anyway
                    results.TryGetValue(task.Result, out count);
                    count++;
                    if (count > bestCount)
                    {
                        bestCount = count;
                        if (count >= majority)
                        {
                            return task.Result;
                        }
                    }
                    results[task.Result] = count;
                    break;
                default:
                    // Keep going next time. may not be appropriate for Created
                    newTasks.Add(task);
                    break;
            }
        }
        // The new list of tasks to wait for
        tasks = newTasks;

        // If we can’t possibly work, bail out.
        if (tasks.Count + bestCount < majority)
        {
            throw new AggregateException("No majority result possible", exceptions);
        }
    }
}

I should warn you that this isn’t a particularly efficient implementation – it was just one I wrote until it worked. The basic steps are:

  • Work out how many results make a majority, so we know when to stop
  • Keep track of how many "votes" our most commonly-returned result has, along with the counts of all the votes
  • Repeatedly:
    • Wait (asynchronously) for at least one of the remaining tasks to finish (many may finish "at the same time")
    • Start a new list of "tasks we’re going to wait for next time"
    • Process each task in the current list, taking an action on each state:
      • If it’s been cancelled, we’ll treat that as a failure (we could potentially treat "the majority have been cancelled" as a cancellation, but for the moment a failure is good enough)
      • If it’s faulted, we’ll add the exception to the list of exceptions, so that if the overall result ends up as failure, we can throw an AggregateException with all of the individual exceptions
      • If it’s finished successfully, we’ll check the result:
        • Add 1 to the count for that result (the dictionary will use the default comparer for the result type, which we assume is good enough)
        • If this is greater than the previous "winner" (which could be for the same result), check for it being actually an overall majority, and return if so.
      • If it’s still running (or starting), add it to the new task list
    • Check whether enough tasks have failed – or given different results – to ensure that a majority is now impossible. If so, throw an AggregateException to say so. This may have some exceptions, but it may not (if there are three tasks which gave different results, none of them actually failed)

Each iteration of the "repeatedly" will have a smaller list to check than before, so we’ll definitely terminate at some point.

I mentioned that it’s inefficient. In particular, we’re ignoring the fact that WhenAny returns a Task<Task<T>>, so awaiting that will actually tell us a task which has finished. We don’t need to loop over the whole collection at that point – we could just remove that single task from the collection. We could do that efficiently if we kept a Dictionary<Task<T>, LinkedListNode<Task<T>>> and a LinkedList<Task<T>> – we’d just look up the task which had completed in the dictionary, remove its node from the list, and remove the entry from the dictionary. We wouldn’t need to create a new collection each time, or iterate through all of the old one. However, that’s a job for another day… as is allowing a cancellation token to be passed in, and a custom equality comparer.
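The "remove just the finished task" idea can be sketched as below. This is only an illustration under assumptions, not the implementation from this series: the name WhenMajorityByRemoval is hypothetical, the majority is assumed to be "more than half of the tasks", it uses the released Task.WhenAny rather than the CTP helpers, and it swaps the dictionary/linked-list pair for a plain List<Task<T>> whose Remove call is linear. It still avoids building a new list and checking every task's status on each pass.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class MajoritySketch
{
    // Hypothetical name; a simplified variant, not the method from the post.
    public static async Task<T> WhenMajorityByRemoval<T>(IEnumerable<Task<T>> source)
    {
        var pending = new List<Task<T>>(source);
        // Assumption: a majority means strictly more than half of the tasks.
        int majority = pending.Count / 2 + 1;
        var counts = new Dictionary<T, int>();
        var exceptions = new List<Exception>();
        int bestCount = 0;

        // Keep going only while a majority is still reachable.
        while (pending.Count + bestCount >= majority)
        {
            // WhenAny tells us which task finished, so only that one is processed.
            Task<T> finished = await Task.WhenAny(pending);
            pending.Remove(finished); // linear removal; the dictionary/linked-list idea avoids this

            if (finished.Status == TaskStatus.Faulted)
            {
                exceptions.Add(finished.Exception.Flatten());
            }
            else if (finished.Status == TaskStatus.RanToCompletion)
            {
                int count;
                counts.TryGetValue(finished.Result, out count); // leaves 0 if absent
                count++;
                counts[finished.Result] = count;
                if (count > bestCount)
                {
                    bestCount = count;
                }
                if (count >= majority)
                {
                    return finished.Result;
                }
            }
            // Cancelled tasks simply drop out of the pending list.
        }
        throw new AggregateException("No majority result possible", exceptions);
    }
}
```

As with the original, results are compared with the default equality comparer, and a null result would be rejected by the dictionary.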

Conclusion

So we can make this implementation smarter and more flexible, certainly – but it’s not insanely tricky to write. I’m reasonably confident that it works, too – as I have unit tests for it. They’ll come in the next part. The important point from this post is that by sticking within the Task<T> world, we can reasonably easily create building blocks to allow for composition of asynchronous operations. While it would be nice to have someone more competent than myself write a bullet-proof, efficient implementation of this operation, I wouldn’t feel too unhappy using a homegrown one in production. The same could not have been said pre-async/await. I just wouldn’t have had a chance of getting it right.

Next up – the unit tests for this code, in which I introduce the TimeMachine class.

Eduasync part 15: implementing COMEFROM with a horrible hack

Ages ago when I wrote my previous Eduasync post, I said we’d look at a pipeline model of coroutines. I’ve decided to skip that, as I do want to cover the topic of this post, and I’ve got some more "normal" async ideas to write about too. If you want to look at the pipeline coroutines code, it’s project 20 in the source repository. Have fun, and don’t blame me if you get confused reading it – so do I.

The code I am going to write about is horrible too. It’s almost as tricky to understand, and it does far nastier things. Things that the C# 5 specification explicitly says you shouldn’t do.

If it makes you feel any better when your head hurts reading this code, spare a thought for me – I haven’t looked at it in over six months, and I don’t have a blog post explaining how it’s meant to work. I just have almost entirely uncommented code which is designed to be hard to understand (in terms of the main program flow).

On no account should any code like this ever be used for anything remotely serious.

With that health warning out of the way, let’s have a look at it…

COMEFROM at the caller level

The idea is to implement the COMEFROM control structure, which is sort of the opposite of GOTO (or in my implementation, more of a GOSUB). There are two operations, effectively:

  • ComeFrom(label): Register interest in a particular label.
  • Label(label): If anyone has registered interest in the given label, keep going from their registration point (which will be within a method), then continue from where we left off afterwards.

In some senses it’s a little like the observer pattern, with labels taking the place of events. However, it looks entirely different and is much harder to get your head round, because instead of having a nicely-encapsulated action which is subscribed to an event, we just have a ComeFrom call which lets us jump back into a method somewhat arbitrarily.

I have two implementations, in project 22 and project 23 in source control. Project 22 is almost sane; a little funky, but not too bad. Project 23 is where the fun really happens. In addition to the operations listed above, there’s an Execute operation which is sort of an implementation detail – it allows an async method containing ComeFrom calls to be executed without returning earlier than we might want.

Let’s look at some code and the output, and try to work out what’s going on.

internal class Program
{
    private static void Main(string[] args)
    {
        Coordinator coordinator = new Coordinator(SimpleEntryPoint);
        coordinator.Start();
    }

    private static async void SimpleEntryPoint(Coordinator coordinator)
    {
        await coordinator.Execute(SimpleOtherMethod);

        Console.WriteLine("First call to Label(x)");
        await coordinator.Label("x");

        Console.WriteLine("Second call to Label(x)");
        await coordinator.Label("x");

        Console.WriteLine("Registering interesting in y");
        bool firstTime = true;
        await coordinator.ComeFrom("y");

        Console.WriteLine("After ComeFrom(y). FirstTime={0}", firstTime);

        if (firstTime)
        {
            firstTime = false;
            await coordinator.Label("y");
        }
        Console.WriteLine("Finished");
    }

    private static async void SimpleOtherMethod(Coordinator coordinator)
    {
        Console.WriteLine("Start of SimpleOtherMethod");

        int count = 0;
        await coordinator.ComeFrom("x");

        Console.WriteLine("After ComeFrom x in SimpleOtherMethod. count={0}. Returning.",
                          count);
        count++;
    }
}

The reason for the "Simple" prefix on the method names is that there’s another example in the same file, with a more complex control flow.

Here’s the output – then we can look at why we’re getting it…

Start of SimpleOtherMethod
After ComeFrom x in SimpleOtherMethod. count=0. Returning.
First call to Label(x)
After ComeFrom x in SimpleOtherMethod. count=1. Returning.
Second call to Label(x)
After ComeFrom x in SimpleOtherMethod. count=2. Returning.
Registering interesting in y
After ComeFrom(y). FirstTime=True
After ComeFrom(y). FirstTime=False
Finished

So, the control flow is a bit like this:

  • Start SimpleEntryPoint
    • Call into SimpleOtherMethod
      • Log "Start of SimpleOtherMethod"
      • Initialize the "count" variable with value 0
      • Register interest in x; ComeFrom remembers the continuation but keeps going.
      • Log "After ComeFrom x in SimpleOtherMethod. count=0. Returning."
      • Increment count to 1.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "First call to Label(x)"
  • Call Label("x")…
    • … which takes us back into SimpleOtherMethod (remember, the method we thought we’d finished executing?) just after ComeFrom
      • Log "After ComeFrom x in SimpleOtherMethod. count=1. Returning."
      • Increment count to 2.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "Second call to Label(x)"
  • Call Label("x")…
    • … which takes us back into SimpleOtherMethod again
      • Log "After ComeFrom x in SimpleOtherMethod. count=2. Returning."
      • Increment count to 3.
      • Return.
    • Return takes us back to SimpleEntryPoint…
  • Log "Registering interest in y"
  • Initialize the "firstTime" variable with value true.
  • Register interest in y; ComeFrom remembers the continuation and keeps going
  • Log "After ComeFrom(y). FirstTime=True"
  • Check the value of firstTime… It’s true, so:
    • Set firstTime to false
    • Call Label("y")
  • … which takes us back to earlier in the method (just after ComeFrom), like a normal looping construct…
  • Log "After ComeFrom(y). FirstTime=False"
  • Check the value of firstTime… It’s false, so:
    • Log "Finished"
    • Exit!

Doing all of this has a few interesting challenges. Let’s look at them one at a time… and I would strongly advise you not to try to pay too much attention to the details.

Noting a continuation and continuing regardless…

Just as a quick reminder before we get cracking, it’s worth remembering that all of this is entirely synchronous, despite being implemented with async. There’s only a single user thread involved here. As with previous parts, we maintain a stack of actions to call, and basically keep calling from the top until we’re done – but the actions we call can create extra stack entries, of course.
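To make the "stack of actions" idea concrete, here is a minimal sketch of such a loop. It is an assumption about the general shape rather than the coordinator from the source repository: MiniCoordinator, Push, Run and StackDemo are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, cut-down stand-in for the coordinator's main loop.
public sealed class MiniCoordinator
{
    private readonly Stack<Action> stack = new Stack<Action>();

    public void Push(Action action)
    {
        stack.Push(action);
    }

    // Keep calling whatever is on top of the stack until nothing is left.
    // Actions are free to push further actions while they run.
    public void Run()
    {
        while (stack.Count > 0)
        {
            Action next = stack.Pop();
            next();
        }
    }
}

public static class StackDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var coordinator = new MiniCoordinator();
        coordinator.Push(() =>
        {
            log.Add("entry");
            // Push our own continuation first, then the action to run next,
            // so that the action runs before we carry on.
            coordinator.Push(() => log.Add("entry continuation"));
            coordinator.Push(() => log.Add("other method"));
        });
        coordinator.Run();
        return log;
    }
}
```

Running StackDemo.Run() records "entry", then "other method", then "entry continuation": everything happens on one thread, and the ordering comes purely from the order in which actions are pushed.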

ComeFrom has unusual semantics in terms of async. We want to remember the continuation and keep executing as if we didn’t need to wait. We can easily do one side or the other. If we wanted to just keep going without needing to know about the continuation, we could just return true from IsCompleted. If we just want to remember the continuation, we can make the awaiter’s IsCompleted property return false, and remember the continuation when it’s passed to OnCompleted. How do we do both?

Well, effectively we want to remember the continuation and then call it immediately. But we can’t just call it directly from OnCompleted, as otherwise each ComeFrom call would end up in a "real" execution stack frame, whereas our execution stack is stored as a Stack<Action>. So instead, we need to remember the continuation and immediately put it at the top of the stack.

However, that only works if as soon as the generated code returns from the async method containing the ComeFrom call, we go back into the state machine. If we’d just called SimpleOtherMethod directly in SimpleEntryPoint, we would have continued within SimpleEntryPoint with the new stack entry just waiting around. This is why we need the Execute method: that does exactly the same thing, effectively shuffling the stack around. When it’s given something to execute, it puts its own continuation on the action stack, then the action it’s been asked to execute, then returns. The top level code will then pick up the original action, and we’re away.

So, here’s the code for Execute, which is the simplest part of the coordinator:

public ExecuteAwaiter Execute(Action<Coordinator> action)
{
    return new ExecuteAwaiter(() => action(this), this);
}

public class ExecuteAwaiter
{
    private readonly Action action;
    private readonly Coordinator coordinator;

    internal ExecuteAwaiter(Action action, Coordinator coordinator)
    {
        this.action = action;
        this.coordinator = coordinator;
    }

    public ExecuteAwaiter GetAwaiter()
    {
        return this;
    }

    // Always yield
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action callerContinuation)
    {
        // We want to execute the action continuation, then get back here,
        // allowing any extra continuations put on the stack *within* the action
        // to be executed.
        coordinator.stack.Push(callerContinuation);
        coordinator.stack.Push(action);
    }

    public void GetResult()
    {
    }
}

All the awaitables in this project return themselves as the awaiter – when you don’t need any other state, it’s an easy step to take.

That’s all we need to say about Execute, but how exactly are we capturing the continuation in ComeFrom?

Capturing continuations

Once we’ve got the action stack shuffling under our belts, there are two more problems with ComeFrom:

  • What happens if we ComeFrom the same label twice?
  • How do we really capture a continuation?

The first point didn’t come up in the sample I’ve shown here, but it does come up in the more complex example – imagine if SimpleOtherMethod had two ComeFrom calls; when we jump back to the first one, we’ll execute the second one again. I made a simple policy decision to only allow a single "return point" for any label – if a ComeFrom call tries to register the existing continuation point for a label, we ignore it; otherwise we throw an exception. So we only need to care about a single continuation for any label, which makes life easier.

The second point is trickier. If you remember back to earlier posts in this series, we saw that the state machine generated for async only really contains a single entry point (MoveNext) which is used for all continuations. A variable in the state machine is responsible for remembering where we were within it between calls. So in order to really make the continuation remember the point at which it needs to continue, we need to remember that state. We need to store an object for the continuation, which contains the delegate to invoke, and the state of the state machine when we were first passed the continuation. I’ve created a class for this, unimaginatively called Continuation, which looks like this:

/// <summary>
/// This hack allows a continuation to be executed more than once,
/// contrary to the C# spec. It does this using reflection to store the
/// value of the "state" field within the generated class. NEVER, EVER, EVER
/// try to use this in real code. It’s purely for fun.
/// </summary>
internal sealed class Continuation : IEquatable<Continuation>
{
    private readonly int savedState;
    private readonly object target;
    private readonly FieldInfo field;
    private readonly Action action;

    internal Continuation(Action action)
    {
        target = action.Target;
        field = target.GetType().GetField("<>1__state", BindingFlags.Instance | BindingFlags.NonPublic);
        savedState = (int) field.GetValue(target);
        this.action = action;
    }

    internal void Execute()
    {
        field.SetValue(target, savedState);
        action();
    }

    // Snip Equals/GetHashCode
}

Yes, we use reflection to fish out the <>1__state variable initially, and poke the state machine with the same value when we next want to execute the continuation. All highly implementation-specific, of course.
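The save-and-restore trick can be tried in isolation against a hand-written class. The sketch below is an illustration only: FakeStateMachine and ReplayableStep are hypothetical names, and the private field is simply called "state" instead of the compiler-generated name, so nothing here depends on how any particular compiler version lays out its state machines.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for a compiler-generated state machine.
public sealed class FakeStateMachine
{
    private int state; // plays the role of the generated state field
    public readonly List<string> Log = new List<string>();

    public void MoveNext()
    {
        Log.Add("resumed at state " + state);
        state++;
    }
}

// Same idea as the Continuation class: remember the state field's value when
// the delegate is captured, and poke it back in before each invocation.
public sealed class ReplayableStep
{
    private readonly object target;
    private readonly FieldInfo field;
    private readonly int savedState;
    private readonly Action action;

    public ReplayableStep(Action action, string fieldName)
    {
        this.action = action;
        target = action.Target;
        field = target.GetType().GetField(fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
        savedState = (int) field.GetValue(target);
    }

    public void Execute()
    {
        field.SetValue(target, savedState);
        action();
    }
}

public static class ReplayDemo
{
    public static List<string> Run()
    {
        var machine = new FakeStateMachine();
        machine.MoveNext();                                        // runs at state 0
        var replay = new ReplayableStep(machine.MoveNext, "state"); // captures state 1
        machine.MoveNext();                                        // runs at state 1
        replay.Execute();                                          // state reset, runs at state 1 again
        return machine.Log;
    }
}
```

ReplayDemo.Run() logs "resumed at state 0" once and "resumed at state 1" twice, which is the "execute the same continuation more than once" effect in miniature.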

Now the ComeFrom method is reasonably straightforward – all we need is a dictionary mapping labels to continuations. Oh, and the same action stack shuffling as for Execute:

// In the coordinator
private readonly Dictionary<string, Continuation> labels = new Dictionary<string, Continuation>();

public ComeFromAwaiter ComeFrom(string label)
{
    return new ComeFromAwaiter(label, this);
}

public struct ComeFromAwaiter
{
    private readonly string label;
    private readonly Coordinator coordinator;

    internal ComeFromAwaiter(string label, Coordinator coordinator)
    {
        this.label = label;
        this.coordinator = coordinator;
    }

    public ComeFromAwaiter GetAwaiter()
    {
        return this;
    }

    // We *always* want to be given the continuation
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action action)
    {
        Continuation newContinuation = new Continuation(action);
        Continuation oldContinuation;
        if (!coordinator.labels.TryGetValue(label, out oldContinuation))
        {
            // First time coming from this label. Always succeeds.
            coordinator.labels[label] = newContinuation;
        }
        else
        {
            // Current semantics are to prohibit two different ComeFrom calls for the same label.
            // An alternative would be to just replace the existing continuation with the new one,
            // in which case we wouldn’t need any of this – we could just use
            // coordinator.labels[label] = newContinuation;
            // unconditionally.
            if (!oldContinuation.Equals(newContinuation))
            {
                throw new InvalidOperationException("Additional continuation detected for label " + label);
            }
            // Okay, we’ve seen this one before. Nothing to see here, move on.
        }
        // We actually want to continue from where we were: we’re only really marking the
        // ComeFrom point.
        coordinator.stack.Push(action);
    }

    public void GetResult()
    {
    }
}

There’s one interesting point here which is somewhat subtle, and screwed me up for a bit…

The default value of a struct is always valid…

You may have noticed that ComeFromAwaiter is a struct. That’s pretty unusual for me. However, it’s also absolutely critical. Without it, we’d get a NullReferenceException when we execute the continuation the second time.

Normally, the flow of async methods looks a bit like this, for an await expression taking the "long" route (i.e. IsCompleted is false):

  • Call GetAwaiter() and assign the result to an awaiter field
  • Call IsCompleted (which returns false in this scenario)
  • Set the state variable to remember where we’d got to
  • Call OnCompleted
  • Return
  • … When we continue…
  • Set state to 0 (running)
  • Call GetResult() on the awaiter
  • Set the awaiter field to default(TypeOfAwaiter)
  • Continue

Now that’s fine when we’re only continuing once – but if we need to jump into the middle of that sequence a second time, we’re going to call GetResult() on the awaiter field after it’s been set to the default value of the awaiter type. If the default value is null, we’ll go bang. So we must use a struct.

Fortunately, our GetResult() call doesn’t need any of the state in the awaiter – it’s purely there to satisfy the normal flow of things. So we’re quite happy with a "default" ComeFrom awaiter.
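The difference between the two kinds of default value is easy to check separately. In this small sketch StructAwaiter, ClassAwaiter and DefaultAwaiterDemo are hypothetical types made up for the illustration; they are not awaiters from the project.

```csharp
using System;

// Hypothetical awaiter-like types; only GetResult matters here.
public struct StructAwaiter
{
    public void GetResult() { }
}

public sealed class ClassAwaiter
{
    public void GetResult() { }
}

public static class DefaultAwaiterDemo
{
    // default(StructAwaiter) is a real, usable value, so the call succeeds.
    public static bool StructDefaultIsCallable()
    {
        StructAwaiter awaiter = default(StructAwaiter);
        awaiter.GetResult();
        return true;
    }

    // default(ClassAwaiter) is null, so the same call throws.
    public static bool ClassDefaultIsCallable()
    {
        ClassAwaiter awaiter = default(ClassAwaiter);
        try
        {
            awaiter.GetResult();
            return true;
        }
        catch (NullReferenceException)
        {
            return false;
        }
    }
}
```

StructDefaultIsCallable() returns true, while ClassDefaultIsCallable() returns false because the call on the null reference throws.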

Finally, labels…

We’ve now done all the hard work. The final piece of the puzzle is Label, which just needs to check whether there’s a continuation to jump to, and shuffle the action stack in the way we’re now painfully accustomed to:

public LabelAwaiter Label(string label)
{
    Continuation continuation;
    labels.TryGetValue(label, out continuation);
    return new LabelAwaiter(continuation, this);
}

public class LabelAwaiter
{
    private readonly Continuation continuation;
    private readonly Coordinator coordinator;

    internal LabelAwaiter(Continuation continuation, Coordinator coordinator)
    {
        this.continuation = continuation;
        this.coordinator = coordinator;
    }

    public LabelAwaiter GetAwaiter()
    {
        return this;
    }

    // If there’s no continuation to execute, just breeze through.
    public bool IsCompleted { get { return continuation == null; } }

    public void OnCompleted(Action action)
    {
        // We want to execute the ComeFrom continuation, then get back here.
        coordinator.stack.Push(action);
        coordinator.stack.Push(continuation.Execute);
    }

    public void GetResult()
    {
    }
}

Almost painfully simple, really.

So that looks like all the code that’s used, right? Not quite…

Reusable builders?

As we saw in the sample code, we can end up finishing the same async method multiple times (SimpleOtherMethod completes three times). That’s going to call SetResult on the AsyncVoidMethodBuilder three times… which feels like it should go bang. Indeed, when I revisited my code earlier I wondered why it didn’t go bang – it’s the sort of illegal state transition the framework is usually pretty good at picking up on.

Then I remembered – this isn’t the framework’s AsyncVoidMethodBuilder – it’s mine. And my SetResult method in this project does absolutely nothing. How convenient!

Make it stop, make it stop!

Okay, that was a pretty quick tour of some horrible code. You’ll never have to do anything like this with async in sane code, but it certainly made me painfully familiar with how it all worked. Just to recap on the oddities involved:

  • We needed to capture a continuation and then immediately keep going, almost as if the awaiter had said the awaitable had completed already. This involved shenanigans with the execution model and an extra method (Execute)
  • We needed to remember the state of a continuation, which we did with reflection.
  • We needed to make awaiter.GetResult() a valid call after awaiter had been reset to the default value for the type
  • We needed to ensure that the builder created in the skeleton method could have SetResult called on it multiple times

That’s all on continuations and co-routines, I promise.

Next time (hopefully soon) I’ll look at an example of how composition works so neatly in async, and then show how we can unit test async methods – at least sometimes.

Laptop review: Kobalt G150

EDIT, 17th October 2011: Last week Kobalt closed down… so the choice about whether or not I’d buy from them again is now moot. However, PC Specialist sells a very similar spec, now including the matte screen…

As some of you will know, our house was burgled in April, and the thieves took three laptops (and very little else), including my main personal laptop. Obviously I ordered a replacement, partly covered by the insurance from my old laptop. However, I took the opportunity to spoil myself a little… I ordered a G150 from Kobalt Computers. Various people have taken an interest in the progress and results of this, so this post is a little review.

Specs

As I said, I spoiled myself a little… the specs are somewhat silly:

  • Overall, it’s a G150 which is based on the Clevo P150HM chassis
  • Screen: 1920×1080, matte, 95% gamut
  • CPU: Intel Sandybridge Core-i7 2720QM; quad core 2.2-3.33GHz
  • Graphics: Nvidia GeForce GTX 580M
  • RAM: 16GB Corsair DDR3
  • Disk: SSD – Intel 510, 250GB
  • Optical: Blu-ray ROM; DVD/RW

How does it run?

Well how do you think it runs? :) Obviously it’s very nippy indeed. I’m not sure I’ve ever used more than a couple of cores at a time yet, but it’s nice to know I can test out parallelization when I want to :) While Windows Experience numbers are obviously pretty crude, they’re pleasant enough that I might as well show them off:

Visual Studio 2010 still takes a while to come up (10-15 seconds) whereas Office components start pretty much instantly – so I don’t know where the bottleneck for VS is. You can do an awful lot in terms of both disk and CPU in 10 seconds on this laptop, so it’s a bit of a mystery. (Eclipse starts noticeably faster.) However, once running, Visual Studio is as fast as you’d expect on this machine – builds and tests are nippy, and the editor is noticeably more responsive than on the "make-do" laptop I’ve been using since April.

I can’t say I’m much of a gamer, but my experiences in Call Of Duty: Black Ops and Portal 2 have been wonderful so far; I can put everything on high settings until it looks absolutely beautiful, and still get a good frame rate. I’ve never had a laptop with a really good graphics card before, and the one in this beast is one of the most powerful out there, so I’ve really got no excuse for not getting into gaming more, other than my obvious lack of free time. I’m also looking forward to investigating the possibility of writing code in C# to be executed on the GPU – I believe there are a couple of projects around that, so it’ll be fun to look into.

With a fast SSD, the boot time is fabulously fast – although it does take a while to go to sleep, as I use hybrid sleep mode and it needs to dump memory to disk first. That’s one downside of having a large amount of memory, of course. There’s still a lot of debate around the longevity of solid state drives, but the improved performance is so noticeable that I’d definitely not go back to a "normal" hard drive now. I chose the Intel 510 over some slightly faster drives as the 510 is generally reckoned to be more reliable – so I’m hedging my bets somewhat. I suspect the difference in performance between "stupidly fast" and "ridiculously fast" is irrelevant to me.

The screen is beautiful – just "really nice" for normal desktop work, but amazing for games and video – that’s where the matte nature really wins out. The contrast is particularly nice, at least in the short tests I’ve performed so far. The hinge feels pleasantly firm but not stiff – it’s hard to judge this early, but hopefully it’ll prove robust over time. This screen is one of the reasons I chose Kobalt – I believe they’re the only UK company selling machines with this screen, and I’d certainly recommend that anyone who has the chance to go for the matte screen should do so.

Personally I use a separate keyboard most of the time (see below) but the keyboard on the G150 itself is a nice chiclet style one. I’m not terribly fond of the layout of the cursor keys or the fact that there are no separate home / end / page up / page down keys, but it’s not a big deal. The feeling of the keys themselves is good, and not too loud. The trackpad is fine – I’ve turned off "tap to click" as it always ends up activating when I don’t want it to, but that’s not specific to this particular laptop – I always find the same thing. Maybe I type with my palms particularly close to the trackpad, or something like that.

A few more random, esoteric points:

  • The power socket is "grippy" which makes it slightly harder to pull out the power, but does give a feeling of security. This is clearly a deliberate decision, and while it’s not one which would suit everyone, I’m pretty happy with it.
  • The fan comes on and goes off reasonably regularly, which can prove a little distracting sometimes, but is the natural result of having a fast/hot laptop, I guess. The fan itself is fairly quiet under normal load, so I’d probably be fine with it being constantly on – it’s the stop / start nature that jars a little. Not a big deal once you get used to it.
  • The webcam is really washed out – I don’t think I’d really want to use it, to be honest. I don’t know whether it’s my particular hardware, the general make/model, or the settings (which I’ve played with and improved somewhat, but not to really acceptable levels)
  • The built-in microphone is located on the keyboard, which is a little bizarre and obviously not helpful for any time you’d be typing as well as talking. There’s a lot of white noise with it compared with actual signal – I couldn’t get a Skype test call to be audible without it also having huge amounts of hiss. Fortunately, this isn’t a problem for me – I have a standalone microphone which I use for screencasts etc. That’s previously been problematically quiet with other laptops, possibly due to drawing a lot of power – but it works perfectly with this laptop. I’m also getting a Corsair headset, which should be handy both for gaming and podcast recordings.
  • The machine itself is fairly bulky, and the power brick is huge – but the chassis has a very attractive matte black finish. If you’re looking for a sleek laptop to carry around a lot, this probably wouldn’t be the best choice, but most of the time I keep my laptop on the same table at home. I went for a 15" rather than 17" as it makes a big difference when carrying it around to conferences, but the extra thickness required to house and cool the powerful components doesn’t really bother me.

Overall, I’m really pleased with the laptop. As far as I can tell so far, the build quality is excellent – no problems at all. The poor quality of the microphone and webcam are a niggly disappointment, but not one that bothers me enough to find out whether they’re "working as expected" or not.

The buying experience

This is where some of you may be relishing the prospect of reading a rant against Kobalt – but I’m not going to air lots of dirty laundry here. It did take a very long time for the laptop to arrive – over three months – but the causes of this were varied, and are worth mentioning at least in passing. I’m hoping that a summary of frustrations and the good parts will give the right impression without getting ugly.

My first cause of frustration occurred very early – before I’d even received a detailed order confirmation from Kobalt. They’d suffered from a high proportion of staff going down with norovirus around the time I ordered. This is the sort of thing which will obviously hit a small company like Kobalt rather more than bigger ones, but it wasn’t a great start; for a week all I had was an email saying that I’d paid Kobalt for something, but I couldn’t even check whether the details were right.

Some of the delays actually put Kobalt into a better light as far as I’m concerned. In particular, I’d originally ordered an AMD 6970 graphics card and a Vertex 3 SSD. Kobalt withdrew both of these components from sale: the 6970 was failing too often in testing, and the Vertex 3 was causing blue screens, sometimes in testing and occasionally on customer machines, due to an issue between the laptop chipset and the disk. Some other vendors would no doubt have kept selling these components and let the customer take the risk of losing the use of the laptop while it went back for repair, and I applaud Kobalt for not doing this.

Likewise my order was actually built twice: the first build failed testing because the motherboard failed, so the machine had to be rebuilt from scratch. Again, I’m very happy about this – I’d obviously rather have a delay but get a working laptop in the end than get a defective one sooner. This also worked to my advantage in terms of the graphics card; although I’d only ordered a 485M, the first one was used in another customer’s machine while waiting for a new motherboard for me… by which time the 485M was no longer readily available. Kobalt swapped in the 580M for no extra charge.

Other delays were only partially under Kobalt’s control – for example, their order management system blew up in August. Can you blame a company for their internal systems failing? It’s hard to say – and I’ll be the first to admit I don’t have the details. Was it due to cutting corners by getting a cheap ordering system which might be expected to be flaky? Was it due to something catastrophic which would only happen once in a million years? Should there have been better "emergency backup" procedures? I don’t know – but it feels like something that customers shouldn’t be exposed to.

One benefit of the delays was that I was able to change my order a couple of times – upping the CPU and memory, choosing a different graphics card and disk drive etc. Kobalt have been very flexible around this; it was considerably easier to change the configuration than I suspect it would have been with somewhere like Dell.

My main issue through all of this was communication – which wasn’t all bad, but was definitely flaky. I suspect Kobalt would argue that I had unreasonable expectations, whereas I’d say it’s just part of providing good customer service. The problems that Kobalt experienced obviously made all of this much worse than normal, but I still think there’s a mismatched set of expectations there. I clearly wasn’t the only one getting frustrated – the forums had a lot of posts complaining of a lack of updates – but equally, satisfied customers don’t tend to post much to provide balance. It’s very obvious on the forums that while people have been frustrated with the buying process, almost everyone’s really happy with the machine they get in the end, and Kobalt are actually really good at answering questions about drivers etc afterwards.

My own experience of communicating with Kobalt was negative until the reasonably late stages of the order, but I was then phoned to be informed of the laptop coming out of testing and again to check shipping dates, which was good. (In this case it was particularly important as without the check, the laptop would have been delivered to the office on Saturday, where there may well not have been anyone prepared to sign for it.)

For a week or so after I received my laptop I still saw quite a few frustrated posts from other customers, but now that seems to have gone down significantly. It’s possible that I’m just not seeing the other complaining posts (as such posts are often deleted – leading to the customer in question getting more frustrated, of course). I’m currently hopeful that I happen to have just ordered at a really bad time where Kobalt was suffering a series of unrelated problems. Whether they handled those as well as they could have done is up for debate, but hopefully new customers wouldn’t see the same problems.

It should be noted that Kobalt is definitely trying to get better, too. In August they introduced a new Customer Promise around price, upgrades, and delivery times. In particular, if your order takes more than 6 weeks, you can choose between various games and accessories as a gift. (I chose two games from Steam.) Also, they’re actively recruiting, so hopefully that will help on the communications front, too.

Accessories

As tends to happen while waiting for something, I got itchy and started buying accessories to go with the new laptop. I thought I might as well mention those at the same time…

External USB 3 drive: Western Digital MyPassport. I would probably have bought an eSATA drive if I’d found one with as neat a form factor as this, but there just don’t seem to be many eSATA drives around yet. Hopefully this will change, as I do notice my USB mouse/keyboard responding sluggishly while I’m putting a lot of traffic through the drive – but apart from that, it’s lovely. It’s a really nice form factor, and seems to take advantage of the USB 3 ports on my laptop to deliver pretty reasonable performance. I’m expecting this to be used primarily for VMs – I’ve heard mixed reports of using VMs with SSDs, including the possibility that the two really don’t play nicely, leading to early drive failure. I don’t know whether this is accurate or not, but I’m being cautious. Overall, I’m pretty happy with this, although it’s reasonably hard to get excited about a disk drive…

Keyboard: I do quite a bit of typing, and while I’m used to laptop keyboards, they’re obviously somewhat constrained. I had previously been using a Logitech K340 which is nice, but I treated myself to a K800 for the new machine. Both of these use Logitech’s "Unifying" receiver, which is wonderful – it’s a tiny little receiver which I just leave permanently in the USB port. It works with quite a few Logitech peripherals, so I share the same receiver for my keyboard and the Anywhere MX mouse I use. The K800 keyboard is really nice – a lovely action, a sensible layout of Ins/Del/Home/End/PgUp/PgDn (which is its main benefit over the K340, to be honest) and it’s rechargeable via USB. The backlighting is a nice extra, although it’s probably not going to actually be useful for me. It’s fun to just wave your hand over it and watch it light up though… I’m easily pleased :)

I also bought a Belkin Screencast WiDi 2.0, which turns out to have been a mistake. I had thought that because I was using a Sandy Bridge laptop with an appropriate Intel wifi card, I’d be able to use WiDi – a technology which allows you to transmit video and sound to a receiver which can then plug into the TV. Yay, I can display YouTube etc on the TV without leaving the comfort of the sofa, right? Not so much – it turns out that this only works if you’re also using the integrated graphics card; as I’m using a separate GPU, it’s a no-go. This wasn’t made as clear as it might be on Intel’s web site about WiDi – it’s there if you dig, but it’s not obvious. Just to be clear, this is in no way Kobalt’s fault – they never claimed the new laptop would be WiDi compatible. I’ve now sent the Screencast (which was no use to me for anything else) to Kobalt so they can try it out with other laptops. (I suspect the GS150 may work with it, for example.)

VM experiment

As I’ve tweeted before, I did have one hope for actually using most of the 16GB of memory. I don’t want to run VMs directly from the SSD, as I mentioned before – but I had a thought of having the virtual disk on the SSD, but then mounting it as a ram drive. That way I’d only need to write to the disk after shutting down the VM – one big write instead of frequent spraying access.

That would only work with a small drive, of course… but I hoped I’d just about be able to get Windows 7 Home Premium + Visual Studio into a small enough drive. With 1GB of memory for the VM and 2GB of memory for the host machine I can have a 13GB ram drive – and I can install Windows 7 on that using Virtual PC, but Virtual PC also uses disk space alongside the VHD for memory for the VM, which obviously takes another 1GB off the usable size. I nearly managed it, but not quite. I may give it another go with Virtual Box and the Express edition of VS11… I’ll blog again if I get it working, but I didn’t want to hold up this post forever…

In terms of getting anything working, it took a little while – DataRam’s RAMDisk product kept hanging while closing down; imdisk gave me access problems even after trying all the suggested tweaks, but VSuite Ramdisk (server edition) seems to do the job. It’s not hugely cheap (and I need the server edition to support a drive over 4GB) but if I can get everything working, I may go with it. Currently I’m using the trial edition.

Conclusion

I guess the obvious question to ask is "If I had my time again, would I take the same action?"

Well, it’s obviously been a frustrating experience, but the results should keep me happy for a long time. I think I would be cautious about buying from Kobalt again, but probably less so than you might expect. I’d probably hang out in the forums for a while to see whether folks were generally happy at the time. I’m hoping I was just unlucky, and hit a particularly nasty time in Kobalt’s history – I can’t imagine the staff there have enjoyed those three months any more than I did – and that normally everything runs smoothly. If I were in a real hurry I’d probably go for an off-the-shelf solution, but that’s a different matter – when you buy a custom machine you should expect it to take a bit longer. Just not three months, normally :)

I’d certainly be happy to buy from Kobalt again in terms of the quality of the product – it’s a lovely laptop, and I’m delighted with its performance, display and general handling. Obviously I regret buying the Screencast, but all my other decisions – memory, disk, external keyboard etc – have turned out well so far.

Upcoming speaking engagements

It’s just occurred to me that I’ve forgotten to mention a few of the things I’ll be up to in the near-ish future. (I’ve talked about next week’s Progressive .NET session before.) This is just a quick rundown – follow the links for more blurb and details.

.NET Developer Network – Bristol, September 21st (evening)

I’ll be talking about async in Bristol – possibly at a high level, possibly in detail, depending on the audience experience. This is my first time talking with this particular user group, although I’m sure there’ll be some familiar faces. Come along if you’re in the area.

Øredev 2011 – Malmö, November 9th

It’s a whistle-stop trip to Sweden as I’m running out of vacation days; I’m flying out on the Tuesday evening and back on the Wednesday evening, but while I’m there I’ll give two talks:

  • Async 101 (yes, more async; I wonder at what point I’ll have given as many talks about it as Mads)
  • Effective technical communication (not a particularly technical talk, but definitely specific to technical communication)

Last year I had an absolute blast – looking forward to this year, even though I won’t have as much time for socializing.

Stack Overflow Dev Days 2011 – London, November 14th – cancelled!

Update: Dev Days has been cancelled. I’m still hoping to do something around this topic, and there may be small-scale meet-ups in London anyway.

Two years ago I talked about how humanity had let the world of software engineering down. This was one of the best talks I’ve ever given, and introduced the world to Tony the Pony. Unfortunately that puts the bar relatively high for this year’s talk – at least, high by my own pretty low standards.

In a somewhat odd topic for a Christian and a happy employee of a company with a code of conduct which starts "Don’t be evil," this year’s talk is entitled "Thinking in evil." As regular readers are no doubt aware, I love torturing the C# language and forcing the compiler to work with code which would make any right-thinking software engineer cringe. I was particularly gratified recently when Eric Lippert commented on one of my Stack Overflow answers that this was "the best abuse of C# I’ve seen in a while." I’m looking forward to talking about why I think it’s genuinely a good idea to think about nasty code like this – not to use it, but to get to know your language of choice more intimately. Like last time, I have little idea of exactly what this talk will be like, but I’m really looking forward to it.

Optimization and generics, part 2: lambda expressions and reference types

As with almost any performance work, your mileage may vary (in particular the 64-bit JIT may work differently) and you almost certainly shouldn’t care. Relatively few people write production code which is worth micro-optimizing. Please don’t take this post as an invitation to make code more complicated for the sake of irrelevant and possibly mythical performance changes.

It took me a surprisingly long time to find the problem described in the previous blog post, and almost no time at all to fix it. I understood why it was happening. This next problem took a while to identify at all, but even when I’d found a workaround I had no idea why it worked. Furthermore, I couldn’t reproduce it in a test case… because I was looking for the wrong set of triggers. I’ve now found at least some of the problem though.

This time the situation in Noda Time is harder to describe, although it concerns the same area of code. In various places I need to create new delegates containing parsing steps and add them to the list of steps required for a full parse. I can always use lambda expressions, but in many cases I’ve got the same logic repeatedly… so I decided to pull it out into a method. Bang – suddenly the code runs far slower. (In reality, I’d performed this refactoring first, and "unrefactored" it to speed things up.)

I think the problem comes down to method group conversions with generic methods and a type argument which is a reference type. The CLR isn’t very good at them, and the C# compiler uses them more than it needs to.

Show me the benchmark!

The complete benchmark code is available of course, but fundamentally I’m doing the same thing in each test case: creating a delegate of type Action which does nothing, and then checking that the delegate reference is non-null (just to avoid the JIT optimizing it away). In each case this is done in a generic method with a single type parameter. I call each method in two ways: once with int as the type argument, and once with string as the type argument. Here are the different cases involved:

  • Use a lambda expression: Action foo = () => {};
  • Fake what I expected the compiler to do: keep a separate generic cache class with a static variable for the delegate; populate the cache once if necessary, and thereafter use the cache field
  • Fake what the compiler is actually doing with the lambda expression: write a separate generic method and perform a method group conversion to it
  • Do what the compiler could do: write a separate non-generic method and perform a method group conversion to it
  • Use a method group conversion to a static (non-generic) method on a generic type
  • Use a method group conversion to an instance (non-generic) method on a generic type, via a generic cache class with a single field referring to an instance of the generic class

(Yes, the last one is a bit convoluted – but the line in the method itself is simple: Action foo = ClassHolder<T>.SampleInstance.NoOpInstance;)

Remember, we’re doing each of these in a generic method, and calling that generic method using a type argument of either int or string. (I’ve run a few tests, and the exact type isn’t important – all that matters is that int is a value type, and string is a reference type.)

Importantly, we’re not capturing any variables, and the type parameter is not involved in either the delegate type or any part of the implementation body.
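To make the six cases concrete, here is a minimal sketch of what each one might look like. It is not the actual benchmark code: every name except ClassHolder<T>, SampleInstance and NoOpInstance is made up for illustration, and I've folded the "cache class with a single field" into ClassHolder<T> itself to keep it short. It only shows how each delegate is created; it doesn't time anything, and the figures you'd get from timing it may well differ on other runtimes.

```csharp
using System;

// Hypothetical cache: one static delegate field per constructed type.
public static class DelegateCache<T>
{
    public static Action Cached;
}

// Generic type with non-generic static and instance methods.
public sealed class ClassHolder<T>
{
    public static readonly ClassHolder<T> SampleInstance = new ClassHolder<T>();
    public static void NoOpStatic() { }
    public void NoOpInstance() { }
}

public static class DelegateCases
{
    private static void NoOp() { }
    private static void NoOpGeneric<T>() { }

    // Case 1: lambda expression inside a generic method.
    public static Action Lambda<T>() { Action foo = () => { }; return foo; }

    // Case 2: populate a generic cache once, then reuse the cached delegate.
    public static Action GenericCache<T>()
    {
        if (DelegateCache<T>.Cached == null)
        {
            DelegateCache<T>.Cached = NoOpGeneric<T>;
        }
        return DelegateCache<T>.Cached;
    }

    // Case 3: method group conversion to a generic method.
    public static Action GenericMethodGroup<T>() { return NoOpGeneric<T>; }

    // Case 4: method group conversion to a non-generic method.
    public static Action NonGenericMethodGroup<T>() { return NoOp; }

    // Case 5: static (non-generic) method on a generic type.
    public static Action StaticOnGenericType<T>() { return ClassHolder<T>.NoOpStatic; }

    // Case 6: instance (non-generic) method on a generic type.
    public static Action InstanceOnGenericType<T>()
    {
        return ClassHolder<T>.SampleInstance.NoOpInstance;
    }

    public static void Main()
    {
        // Each case is called with a value type and a reference type argument.
        Action[] created =
        {
            Lambda<int>(), Lambda<string>(),
            GenericCache<int>(), GenericCache<string>(),
            GenericMethodGroup<int>(), GenericMethodGroup<string>(),
            NonGenericMethodGroup<int>(), NonGenericMethodGroup<string>(),
            StaticOnGenericType<int>(), StaticOnGenericType<string>(),
            InstanceOnGenericType<int>(), InstanceOnGenericType<string>()
        };
        foreach (Action action in created)
        {
            action(); // every delegate is a no-op
        }
        Console.WriteLine("Created {0} delegates", created.Length);
    }
}
```

To turn this into a benchmark you would call each method in a loop under a Stopwatch, checking the result for null as described above.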

Benchmark results

Again, times are in milliseconds – but this time I didn’t want to run it for 100 million iterations, as the "slow" versions would have taken far too long. I’ve run this on the x64 JIT as well and seen the same effect, but I haven’t included the figures here.

Times in milliseconds for 10 million iterations

Test                                  TestCase<int>   TestCase<string>
Lambda expression                               180              29684
Generic cache class                              90                288
Generic method group conversion                 184              30017
Non-generic method group conversion             178                189
Static method on generic type                   180              29276
Instance method on generic type                 202                299

Yes, it’s about 150 times slower to create a delegate from a generic method with a reference type as the type argument than with a value type… and yet this is the first I’ve heard of this. (I wouldn’t be surprised if there were a post from the CLR team about it somewhere, but I don’t think it’s common knowledge by any means.)

Conclusion

One of the tricky things is that it’s hard to know exactly what the C# compiler is going to do with any given lambda expression. In fact, the method which was causing me grief earlier on isn’t generic, but it’s in a generic type and captures some variables which use the type parameters – so perhaps that’s causing a generic method group conversion somewhere along the way.

Noda Time is a relatively extreme case, but if you’re using delegates in any performance-critical spots, you should really be aware of this issue. I’m going to ping Microsoft (first informally, and then via a Connect report if that would be deemed useful) to see if there’s an awareness of this internally as a potential "gotcha", and whether there’s anything that can be done. Normal trade-offs of work required vs benefit apply, of course. It’s possible that this really is an edge case… but with lambdas flying everywhere these days, I’m not sure that it is.

Maybe tomorrow I’ll actually be able to finish getting Noda Time moved onto the new system… all of this performance work has been a fun if surprising distraction from the main job of shipping working code…

Optimization and generics, part 1: the new() constraint (updated: now with CLR v2 results)

As with almost any performance work, your mileage may vary (in particular the 64-bit JIT may work differently) and you almost certainly shouldn’t care. Relatively few people write production code which is worth micro-optimizing. Please don’t take this post as an invitation to make code more complicated for the sake of irrelevant and possibly mythical performance changes.

I’ve been doing quite a bit of work on Noda Time recently – and have started getting my head round all the work that James Keesey has put into the parsing/formatting. I’ve been reworking it so that we can do everything without throwing any exceptions, and also to work on the idea of parsing a pattern once and building a sequence of actions for both formatting and parsing from the pattern. To format or parse a value, we then just need to apply the actions in turn. Simples.

Given that this is all in the name of performance (and I consider Noda Time to be worth optimizing pretty hard) I was pretty cross when I ran a complete revamp through the little benchmarking tool we use, and found that my rework had made everything much slower. Even parsing a value after parsing the pattern was slower than parsing both the value and the pattern together. Something was clearly very wrong.

In fact, it turns out that at least two things were very wrong. The first (the subject of this post) was easy to fix and actually made the code a little more flexible. The second (the subject of the next post, which may be tomorrow) is going to be harder to work around.

The new() constraint

In my SteppedPattern type, I have a generic type parameter – TBucket. It’s already constrained in terms of another type parameter, but that’s irrelevant as far as I’m aware. (After today though, I’m taking very little for granted…) The important thing is that before I try to parse a value, I want to create a new bucket. The idea is that bits of information end up in the bucket as they’re being parsed, and at the very end we put everything together. So each parse operation requires a new bucket. How can we create one in a nice generic way?

Well, we can just call its public parameterless constructor. I don’t mind the types involved having such a constructor, so all we need to do is add the new() constraint, and then we can call new TBucket():

// Somewhat simplified…
internal sealed class SteppedPattern<TBucket> : IParsePattern<TBucket>
    where TBucket : new()
{
    public ParseResult Parse(string value)
    {
        TBucket bucket = new TBucket();

        // Rest of parsing goes here
    }
}

Great! Nice and simple. Unfortunately, it turned out that that one line of code was taking 75% of the time to parse a value. Just creating an empty bucket – pretty much the simplest bit of parsing. I was amazed when I discovered that.

Fixing it with a provider

The fix is reasonably easy. We just need to tell the type how to create an instance, and we can do that with a delegate:

// Somewhat simplified…
internal sealed class SteppedPattern<TBucket> : IParsePattern<TBucket>
{
    private readonly Func<TBucket> bucketProvider;

    internal SteppedPattern(Func<TBucket> bucketProvider)
    {
        this.bucketProvider = bucketProvider;
    }

    public ParseResult Parse(string value)
    {
        TBucket bucket = bucketProvider();

        // Rest of parsing goes here
    }
}

Now I can just call new SteppedPattern<OffsetBucket>(() => new OffsetBucket()) or whatever. This also means I can keep the constructor internal, not that I care much. I could even reuse old parse buckets if that wouldn’t be a semantic problem – in other cases it could be useful. Hooray for lambda expressions – until we get to the next post, anyway.

Show me the figures!

You don’t want to have to run Noda Time’s benchmarks to see the results for yourself, so I wrote a small benchmark to time just the construction of a generic type. As a measure of how insignificant this would be for most apps, these figures are in milliseconds, performing 100 million iterations of the action in question. Unless you’re going to do this in performance-critical code, you just shouldn’t care.

Anyway, the benchmark has four custom types: two classes, and two structs – a small and a large version of each. The small version has a single int field; the large version has eight long fields. For each type, I benchmarked both approaches to initialization.
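For a rough idea of the shape of such a benchmark, here is a cut-down sketch covering only the two small types. It isn't the downloadable benchmark program: the type and method names are invented for illustration, it uses fewer iterations, and it makes no attempt at warm-up or repeated runs. The null check is just there so the created value is used.

```csharp
using System;
using System.Diagnostics;

// Hypothetical test types: a small struct and a small class, one int field each.
public struct SmallStruct { public int Value; }
public sealed class SmallClass { public int Value; }

public static class ConstructionBenchmark
{
    // Times construction via the new() constraint.
    public static long TimeNewConstraint<T>(int iterations) where T : new()
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            T value = new T();
            if (value == null)
            {
                throw new InvalidOperationException("Unexpected null");
            }
        }
        return stopwatch.ElapsedMilliseconds;
    }

    // Times construction via a provider delegate supplied by the caller.
    public static long TimeProvider<T>(Func<T> provider, int iterations)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            T value = provider();
            if (value == null)
            {
                throw new InvalidOperationException("Unexpected null");
            }
        }
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int Iterations = 10000000;
        Console.WriteLine("Small struct: new() {0}ms, provider {1}ms",
            TimeNewConstraint<SmallStruct>(Iterations),
            TimeProvider(() => new SmallStruct(), Iterations));
        Console.WriteLine("Small class: new() {0}ms, provider {1}ms",
            TimeNewConstraint<SmallClass>(Iterations),
            TimeProvider(() => new SmallClass(), Iterations));
    }
}
```

The absolute numbers depend heavily on the CLR version and the JIT in use, so treat whatever this prints as specific to your own machine.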

The results on two machines (32-bit and 64-bit) are below, for both the v2 CLR and v4. (The 64-bit machine is much faster in general – you should only compare results within one machine, as it were.)

CLR v4: 32-bit results (ms per 100 million iterations)

Test type      new() constraint   Provider delegate
Small struct                689                1225
Large struct              11188                7273
Small class               16307                1690
Large class               17471                3017

CLR v4: 64-bit results (ms per 100 million iterations)

Test type      new() constraint   Provider delegate
Small struct                473                 868
Large struct               2670                2396
Small class                8366                1189
Large class                8805                1529

CLR v2: 32-bit results (ms per 100 million iterations)

Test type      new() constraint   Provider delegate
Small struct                703                1246
Large struct              11411                7392
Small class              143967                1791
Large class              143107                2581

CLR v2: 64-bit results (ms per 100 million iterations)

Test type      new() constraint   Provider delegate
Small struct                510                 686
Large struct               2334                1731
Small class               81801                1539
Large class               83293                1896

(An earlier version of this post had a mistake – my original tests used structs for everything, despite the names.)

Others have reported slightly different results, including the new() constraint being better for both large and small structs.

Just in case you hadn’t spotted them, look at the results for classes. Those are the real results – it took over 2 minutes to run the test using the new() constraint on my 32-bit laptop, compared with under two seconds for the provider. Yikes. This was actually the situation I was in for Noda Time, which is built on .NET 2.0 – it’s not surprising that so much of my benchmark’s time was spent constructing classes, given results like this.

Of course you can download the benchmark program for yourself and see how it performs on your machine. It’s a pretty cheap-and-cheerful benchmark, but when the differences are this big, minor sources of inaccuracy don’t bother me too much. The simplest way to run under CLR v2 is to compile with the .NET 3.5 C# compiler to start with.

What’s going on under the hood?

As far as I’m aware, there’s no IL to give support for the new() constraint, in terms of using the parameterless constructor. (The constraint itself can be expressed in IL though.) Instead, when we write new T() in C#, the compiler emits a call to Activator.CreateInstance. Apparently, that’s slower than calling a delegate – presumably due to trying to find an accessible constructor with reflection, and invoking it. I suspect it could be optimized relatively easily – e.g. by caching the results per type it’s called with, in terms of delegates. I’m slightly surprised this hasn’t (apparently) been optimized, given how easy it is to cache values by generic type. No doubt there’s a good reason lurking there somewhere, even if it’s only the memory taken up by the cache.
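As an illustration of the "cache a delegate per generic type" idea, here is one way it could be done by hand. This is only a sketch of the technique, not anything the framework does: CachedFactory<T> is a made-up name, and it builds its delegate with expression trees, which need .NET 3.5 or later (so it wouldn't help code targeting .NET 2.0).

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

// Hypothetical per-type cache: the static initializer runs once for each
// constructed type, compiling a delegate that calls the parameterless constructor.
public static class CachedFactory<T> where T : new()
{
    public static readonly Func<T> Create =
        Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();
}

public static class CachedFactoryDemo
{
    // Generic code can call the cached delegate instead of writing new T().
    public static T Build<T>() where T : new()
    {
        return CachedFactory<T>.Create();
    }

    public static void Main()
    {
        StringBuilder first = Build<StringBuilder>();
        StringBuilder second = Build<StringBuilder>();
        Console.WriteLine(ReferenceEquals(first, second)); // False: a new object each time
        Console.WriteLine(Build<int>());                   // 0: default value for a struct
    }
}
```

The cost is one compiled delegate per type used, which is the memory trade-off mentioned above.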

Either way, it’s easy to work around in general.

Conclusion

I wouldn’t have found this gotcha if I didn’t have before and after tests (or in this case, side-by-side tests of the old way and the new way of parsing). The real lesson of this post shouldn’t be about the new() constraint – it should be how important it is to test performance (assuming you care), and how easy it is to assume certain operations are cheap.

Next post: something much weirder.

Speaking engagement: Progressive .NET, London, September 7th

Just a quick note to mention an event I’ll be speaking at in September. SkillsMatter will be hosting Progressive .NET, a 3-day set of tutorials on September 5th-7th in London. I’ll be speaking about C# 5’s async feature on the last day (9.30am-1pm) but there’s a host of other speakers too. Should be good. For my own part, with four hours or so to cover async, I should be able to cover both the high level stuff and the implementation details, with plenty of time for the inevitable questions.

This one isn’t free though, I’m afraid – it’s normally £425. Hardly pocket money, but pretty good value for three full days of deep-dive sessions. However, there are two bits of good news:

  • Readers of this blog can get £50 off using the promo code "PROGNET50" at the checkout.
  • I have two free tickets to give away.

In an effort to make the ticket give-away fair, I’m thinking of a 32-bit number – mail me (skeet@pobox.com) an Int32, and the two readers with the closest value will get the tickets. Please include "Progressive .NET" in the subject line of the mail so I can filter them easily :)

Anyway, hope to see you there – please grab me to say hi.

Update (August 4th): and the winners are…

Congratulations to The Configurator and Haris Hasan who submitted the closest numbers to the one I was thinking of: -890978631.

In fact, The Configurator guessed the exact value – which is the result of calling "Progressive .NET".GetHashCode() on my 32-bit laptop running .NET 4. (I can’t remember which versions have different hash algorithms, but as it’s pretty arbitrary, it seemed good enough…) I’m impressed!

I’ll be emailing SkillsMatter to let them know about the winners – and thanks to everyone else who mailed me a guess. Hope I’ll see some of you there anyway!

Eduasync part 14: Data passing in coroutines

(This post covers project 19 in the source code.)

Last time we looked at independent coroutines running in a round-robin fashion. This time we’ll keep the round-robin scheduling, but add in the idea of passing data from one coroutine to another. Each coroutine will act on data of the same type, which is necessary for the scheme to work when one coroutine could "drop out" of the chain by returning.

Designing the data flow

It took me a while to get to the stage where I was happy with the design of how data flowed around these coroutines. I knew I wanted a coordinator as before, and that it should have a Yield method taking the value to pass to the next coroutine and returning an awaitable which would provide the next value when it completed. The tricky part was working out what to do at the start of each method and the end. If the method just took a Coordinator parameter, we wouldn’t have anything to do with the value yielded by the first coroutine, because the second coroutine wouldn’t be ready to accept it yet. Likewise when a coroutine completed, we wouldn’t have another value to pass to the next coroutine.

Writing these dilemmas out in this post, the solution seems blindingly obvious of course: each coroutine should accept a data value on entry, and return one at the end. At any point where we transfer control, we provide a value and have a value which is required by something. The final twist is to make the coordinator’s Start method take an initial value and return the value returned by the last coroutine to complete.

So, that’s the theory… let’s look at the implementation.

Initialization

I’ve changed the coordinator to take all the coroutines as a constructor parameter (of the somewhat fearsome declaration "params Func<Coordinator<T>, T, Task<T>>[] coroutines") which means we don’t need to implement IEnumerable pointlessly any more.

This leads to a code skeleton of this form:

private static void Main(string[] args)
{
    var coordinator = new Coordinator<string>(FirstCoroutine,
                                              SecondCoroutine,
                                              ThirdCoroutine);
    string finalResult = coordinator.Start("m1");
    Console.WriteLine("Final result: {0}", finalResult);
}

private static async Task<string> FirstCoroutine(
    Coordinator<string> coordinator,
    string initialValue)
{
    …
}

// Same signature for SecondCoroutine and ThirdCoroutine

Last time we simply had a Queue<Action> internally in the coordinator as the actions to invoke. You might be expecting a Queue<Func<T, T>> this time – after all, we’re passing in data and returning data at each point. However, the mechanism for that data transfer is "out of band" so to speak. The only time we really "return" an item is when we reach the end of a coroutine. Usually we’ll be providing data to the next step using a method. Likewise the only time a coroutine is given data directly is in the first call – after that, it will have to fetch the value by calling GetResult() on the awaiter which it uses to yield control.

All of this is leading to a requirement for our constructor to convert each coroutine delegate into a simple Action. The trick is working out how to deal with the data flow. I’m going to include SupplyValue() and ConsumeValue() methods within the coordinator for the awaiter to use, so it’s just a case of calling those appropriately from our action. In particular:

  • When the action is called, it should consume the current value.
  • It should then call the coroutine passing in the coordinator ("this") and the initial value.
  • When the task returned by the coroutine has completed, the result of that task should be used to supply a new value.

The only tricky part here is the last bullet – and it’s not that hard really, so long as we remember that we’re absolutely not trying to start any new threads. We just want to hook onto the end of the task, getting a chance to supply the value before the next coroutine tries to pick it up. We can do that using Task.ContinueWith, but passing in TaskContinuationOptions.ExecuteSynchronously so that we use the same thread that the task completes on to execute the continuation.
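To see the effect of ExecuteSynchronously in isolation, here is a small sketch which has nothing to do with the coordinator itself. It uses a TaskCompletionSource (my choice for the demo, standing in for a coroutine's task) so we control exactly when the task completes, and records the order in which things happen. The option is a request rather than a guarantee, but in a simple case like this the continuation runs inside the SetResult call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SynchronousContinuationDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        // Stand-in for the task returned by a coroutine.
        var source = new TaskCompletionSource<string>();
        Task<string> task = source.Task;

        // Same shape as the coordinator's hook: use the task's result when
        // it completes, on the thread which completes it.
        task.ContinueWith(ignored => log.Add("continuation saw " + task.Result),
                          TaskContinuationOptions.ExecuteSynchronously);

        log.Add("before SetResult");
        source.SetResult("x3"); // the continuation runs during this call
        log.Add("after SetResult");
        return log;
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

The log ends up with the continuation's entry between "before SetResult" and "after SetResult", which is the ordering we need: the value is supplied before anything else gets a chance to consume it.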

At this point we can implement the initialization part of the coordinator, assuming the presence of SupplyValue() and ConsumeValue():

public sealed class Coordinator<T>
{
    private readonly Queue<Action> actions;
    private readonly Awaitable awaitable;

    public Coordinator(params Func<Coordinator<T>, T, Task<T>>[] coroutines)
    {
        // We can’t refer to "this" in the variable initializer. We can use
        // the same awaitable for all yield calls.
        this.awaitable = new Awaitable(this);
        actions = new Queue<Action>(coroutines.Select(ConvertCoroutine));
    }

    // Converts a coroutine into an action which consumes the current value,
    // calls the coroutine, and attaches a continuation to it so that the return
    // value is used as the new value.
    private Action ConvertCoroutine(Func<Coordinator<T>, T, Task<T>> coroutine)
    {
        return () =>
        {
            Task<T> task = coroutine(this, ConsumeValue());
            task.ContinueWith(ignored => SupplyValue(task.Result),
                TaskContinuationOptions.ExecuteSynchronously);
        };
    }
}

I’ve broken ConvertCoroutine into a separate method so that we can use it as the projection for the Select call within the constructor. I did initially have it within a lambda expression within the constructor, but it was utterly hideous in terms of readability.

One suggestion I’ve received is that I could declare a new delegate type instead of using Func<Coordinator<T>, T, Task<T>> to represent a coroutine. This could either be a non-generic delegate nested in the generic coordinator class, or a generic stand-alone delegate:

public delegate Task<T> Coroutine<T>(Coordinator<T> coordinator, T initialValue);

// Or nested…
public sealed class Coordinator<T>
{
    public delegate Task<T> Coroutine(Coordinator<T> coordinator, T initialValue);
}

Both of these would work perfectly well. I haven’t made the change at the moment, but it’s certainly worth considering. The debate about whether to use custom delegate types or Func/Action is one for another blog post, I think :)

The one bit of the initialization I haven’t explained yet is the "awaitable" field and the Awaitable type. They’re to do with yielding – so let’s look at them now.

Yielding and transferring data

Next we need to work out how we’re going to transfer data and control between the coroutines. As I’ve mentioned, we’re going to use a method within the coordinator, called from the coroutines, to accomplish this. The coroutines have this sort of code:

private static async Task<string> FirstCoroutine(
    Coordinator<string> coordinator,
    string initialValue)
{
    Console.WriteLine("Starting FirstCoroutine with initial value {0}",
                      initialValue);            
    …
    string received = await coordinator.Yield("x1");
    Console.WriteLine("Returned to FirstCoroutine with value {0}", received);
    …
    return "x3";
}

The method name "Yield" here is a double-edged sword. The word has two meanings – yielding a value to be used elsewhere, and yielding control until we’re called back. Normally it’s not ideal to use a name that can mean subtly different things – but in this case we actually want both of these meanings.

So, what does Yield need to do? Well, the flow control should look something like this:

  • Coroutine calls Yield()
  • Yield() calls SupplyValue() internally to remember the new value to be consumed by the next coroutine
  • Yield() returns an awaitable to the coroutine
  • Due to the await expression, the coroutine calls GetAwaiter() on the awaitable to get an awaiter
  • The coroutine checks IsCompleted on the awaiter, which must return false (otherwise the coroutine would carry straight on instead of handing control back)
  • The coroutine calls OnCompleted() passing in the continuation for the rest of the method
  • The coroutine returns to its caller
  • The coordinator proceeds with the next coroutine
  • When the coordinator eventually invokes that continuation, the coroutine resumes and calls GetResult() to get the "current value" to assign to the "received" variable.
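The steps above can be sketched by writing out the calls by hand. This is only an approximation: the real compiler generates a state machine rather than a lambda, and DemoAwaiter and AwaitExpansionDemo are hypothetical names standing in for the coordinator and a coroutine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the coordinator in its role as awaiter.
public sealed class DemoAwaiter
{
    public readonly Queue<Action> Actions = new Queue<Action>();
    public string Current;

    // Always false, so the caller is forced to register a continuation.
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action continuation)
    {
        Actions.Enqueue(continuation);
    }

    public string GetResult()
    {
        return Current;
    }
}

public static class AwaitExpansionDemo
{
    public static string Run()
    {
        // In the post this would come from coordinator.Yield("x1").GetAwaiter().
        var awaiter = new DemoAwaiter();
        string received = null;

        // Roughly what "received = await ...;" turns into on the coroutine side.
        if (!awaiter.IsCompleted)
        {
            awaiter.OnCompleted(() => received = awaiter.GetResult());
            // A real coroutine would return to its caller at this point.
        }

        // Coordinator side: some other coroutine supplies a value, and later
        // the queued continuation is invoked.
        awaiter.Current = "z1";
        awaiter.Actions.Dequeue()();

        return received;
    }
}
```

Note that GetResult() is only called once the continuation runs, which is why the value it returns is whatever was supplied in the meantime rather than the value that was yielded.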

Now you’ll see that Yield() needs to return some kind of awaitable type – in other words, one with a GetAwaiter() method. Previously we put this directly on the Coordinator type, and we could have done that here – but I don’t really want anyone to just "await coordinator" accidentally. You should really need to call Yield() in order to get an awaitable. So we have an Awaitable type, nested in Coordinator.

We then need to decide what the awaiter type is – the result of calling GetAwaiter() on the awaitable. This time I decided to use the Coordinator itself. That means people could accidentally call IsCompleted, OnCompleted() or GetResult(), but I figured that wasn’t too bad. If we were to go to the extreme, we’d create another type just for the Awaiter as well. It would need to have a reference to the coordinator of course, in order to actually do its job. As it is, we can make the Awaitable just return the Coordinator that created it. (Awaitable is nested within Coordinator<T>, which is how it can refer to T without being generic itself.)

public sealed class Awaitable
{
    private readonly Coordinator<T> coordinator;

    internal Awaitable(Coordinator<T> coordinator)
    {
        this.coordinator = coordinator;
    }

    public Coordinator<T> GetAwaiter()
    {
        return coordinator;
    }
}

The only state here is the coordinator, which is why we can create a single instance of Awaitable when the Coordinator is constructed, and reuse it for every call to Yield().
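For comparison, here’s a sketch of the stricter alternative mentioned earlier, with a dedicated awaiter type so that IsCompleted, OnCompleted() and GetResult() don’t appear on the coordinator at all. Everything here is an assumption for illustration: StrictCoordinator<T> is a cut-down, hypothetical class (it skips the SupplyValue()/ConsumeValue() checks), and RunNext() is a demo-only stand-in for the coordinator’s main loop. Current C# compilers also require the awaiter to implement INotifyCompletion, so the sketch does that.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical cut-down coordinator: just enough to show a dedicated awaiter type.
public sealed class StrictCoordinator<T>
{
    private readonly Queue<Action> actions = new Queue<Action>();
    private T currentValue;

    public Awaitable Yield(T value)
    {
        currentValue = value;
        return new Awaitable(this);
    }

    // Demo-only hook standing in for the coordinator's main loop.
    public void RunNext()
    {
        actions.Dequeue()();
    }

    public struct Awaitable
    {
        private readonly StrictCoordinator<T> coordinator;

        internal Awaitable(StrictCoordinator<T> coordinator)
        {
            this.coordinator = coordinator;
        }

        public Awaiter GetAwaiter()
        {
            return new Awaiter(coordinator);
        }
    }

    public struct Awaiter : INotifyCompletion
    {
        private readonly StrictCoordinator<T> coordinator;

        internal Awaiter(StrictCoordinator<T> coordinator)
        {
            this.coordinator = coordinator;
        }

        public bool IsCompleted { get { return false; } }

        public void OnCompleted(Action continuation)
        {
            coordinator.actions.Enqueue(continuation);
        }

        public T GetResult()
        {
            return coordinator.currentValue;
        }
    }
}

public static class StrictAwaiterDemo
{
    private static async Task<string> Coroutine(StrictCoordinator<string> coordinator)
    {
        string received = await coordinator.Yield("x1");
        return "received " + received;
    }

    public static string Run()
    {
        var coordinator = new StrictCoordinator<string>();
        Task<string> task = Coroutine(coordinator); // runs up to the await, then returns
        bool pausedAtAwait = !task.IsCompleted;
        coordinator.RunNext();                      // resumes the coroutine synchronously
        return pausedAtAwait ? task.Result : "did not pause";
    }
}
```

With only one coroutine in play, the value it gets back is the one it yielded itself. The trade-off is two extra types (and an awaiter per await) in exchange for a tidier public surface on the coordinator.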

Now Yield() is really simple:

public Awaitable Yield(T value)
{
    SupplyValue(value);
    return awaitable;
}

So to recap, we now just need the awaiter members, SupplyValue() and ConsumeValue(). Let’s look at the awaiter members (in Coordinator) to start with. We already know that IsCompleted will just return false. OnCompleted() just needs to stash the continuation in the queue, and GetResult() just needs to consume the "current" value and return it:

public bool IsCompleted { get { return false; } }

public void OnCompleted(Action continuation)
{
    actions.Enqueue(continuation);
}

public T GetResult()
{
    return ConsumeValue();
}

Simple, huh? Finally, consuming and supplying values:

private T currentValue;
private bool valuePresent;

private void SupplyValue(T value)
{
    if (valuePresent)
    {
        throw new InvalidOperationException
            ("Attempt to supply value when one is already present");
    }
    currentValue = value;
    valuePresent = true;
}

private T ConsumeValue()
{
    if (!valuePresent)
    {
        throw new InvalidOperationException
            ("Attempt to consume value when it isn't present");
    }
    T oldValue = currentValue;
    valuePresent = false;
    currentValue = default(T);
    return oldValue;
}

These are relatively long methods (compared with the other ones I’ve shown) but pretty simple. Hopefully they don’t need explanation :)
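If it helps to see the intent in isolation, the pair behaves like a single-slot mailbox that insists on strict alternation: one supply, then one consume. Here’s a minimal sketch using a hypothetical stand-alone ValueSlot<T> class (not part of the coordinator code) with the same logic and shortened messages:

```csharp
using System;

// Hypothetical stand-alone version of the coordinator's value slot.
public sealed class ValueSlot<T>
{
    private T currentValue;
    private bool valuePresent;

    public void Supply(T value)
    {
        if (valuePresent)
        {
            throw new InvalidOperationException("Value already present");
        }
        currentValue = value;
        valuePresent = true;
    }

    public T Consume()
    {
        if (!valuePresent)
        {
            throw new InvalidOperationException("No value present");
        }
        T oldValue = currentValue;
        valuePresent = false;
        currentValue = default(T); // don't keep the old value alive
        return oldValue;
    }
}

public static class ValueSlotDemo
{
    public static string Run()
    {
        var slot = new ValueSlot<string>();
        slot.Supply("x1");
        string consumed = slot.Consume();   // fine: one supply, one consume
        try
        {
            slot.Consume();                 // nothing left to consume
            return consumed + ", second consume allowed";
        }
        catch (InvalidOperationException)
        {
            return consumed + ", second consume rejected";
        }
    }
}
```

The exceptions are there to catch misuse early, such as a coroutine yielding twice without the coordinator handing the first value on.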

The results

Now that everything’s in place, we can run it. I haven’t posted the full code of the coroutines, but you can see it on Google Code. Hopefully the results speak for themselves though – you can see the relevant values passing from one coroutine to another (and in and out of the Start method).

Starting FirstCoroutine with initial value m1
Yielding ‘x1’ from FirstCoroutine…
    Starting SecondCoroutine with initial value x1
    Starting SecondCoroutine
    Yielding ‘y1’ from SecondCoroutine…
        Starting ThirdCoroutine with initial value y1
        Yielding ‘z1’ from ThirdCoroutine…
Returned to FirstCoroutine with value z1
Yielding ‘x2’ from FirstCoroutine…
    Returned to SecondCoroutine with value x2
    Yielding ‘y2’ from SecondCoroutine…
        Returned to ThirdCoroutine with value y2
        Finished ThirdCoroutine…
Returned to FirstCoroutine with value z2
Finished FirstCoroutine
    Returned to SecondCoroutine with value x3
    Yielding ‘y3’ from SecondCoroutine…
    Returned to SecondCoroutine with value y3
    Finished SecondCoroutine
Final result: y4

Conclusion

I’m not going to claim this is the world’s most useful coroutine model – or indeed useful at all. As ever, I’m more interested in thinking about how data and control flow can be modelled than actual usefulness.

In this case, it was the realization that everything should accept and return a value of the same type which really made it all work. After that, the actual code is pretty straightforward. (At least, I think it is – please let me know if any bits are confusing, and I’ll try to elaborate on them.)

Next time we’ll look at something more like a pipeline model – something remarkably reminiscent of LINQ, but without taking up as much stack space (and with vastly worse readability, of course). Unfortunately the current code reaches the limits of my ability to understand why it works, which means it far exceeds my ability to explain why it works. Hopefully I can simplify it a bit over the next few days.