async | Jon Skeet's coding blog

Eric Lippert’s latest (at the time of this writing) blog post asks for suggestions for alternatives to the new "await" keyword, which has inappropriate connotations that confuse people. In particular, it sounds like the thread is going to sit there waiting for the result, when the whole point is that it doesn’t.

There have been many suggestions in the comments, and lots of them involve "yield". I was initially in favour of this too, but on further reflection I don’t think it’s appropriate, for the same reason: it has a connotation which may not be true. It sounds like it’s always going to yield control, when sometimes it doesn’t. To demonstrate this, I’ve come up with a tiny example. It’s a stock market class which allows you to compute the total value of your holdings asynchronously. The class would make web service calls to fetch real prices, but then cache values for some period. The caching bit is the important part here – and in fact it’s the only part I’ve actually implemented.

The point is that when the asynchronous "total value" computation can fetch a price from the cache, it doesn’t need to wait for anything, so it doesn’t need to yield control. This is the purpose of the return value of BeginAwait: if it returns false, the task has been completed and EndAwait can be called immediately. In this case the continuation is ignored – when BeginAwait returns, the ComputeTotalValue method keeps going rather than returning a Task to the caller.

Here’s the complete code:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Test
{
    static void Main()
    {
        StockMarket market = new StockMarket();
        market.AddCached("GOOG", 613.70m);
        market.AddCached("MSFT", 26.67m);
        market.AddCached("AAPL", 300.98m);

        Task<decimal> total = market.ComputeTotalValueAsync(
            Tuple.Create("AAPL", 10),
            Tuple.Create("GOOG", 20),
            Tuple.Create("MSFT", 25)
        );
        Console.WriteLine("Completed already? {0}", total.IsCompleted);
        Console.WriteLine("Total value: {0}", total.Result);
    }
}

class StockMarket
{
    private readonly Dictionary<string, decimal> cache =
        new Dictionary<string, decimal>();

    internal void AddCached(string ticker, decimal price)
    {
        cache[ticker] = price;
    }

    public async Task<decimal> ComputeTotalValueAsync(params Tuple<string, int>[] holdings)
    {
        // In real code we may well want to parallelize this, of course
        decimal total = 0m;
        foreach (var pair in holdings)
        {
            total += await new StockFetcher(this, pair.Item1) * pair.Item2;
        }
        Console.WriteLine("Diagnostics: completed ComputeTotalValue");
        return total;
    }

    private class StockFetcher
    {
        private readonly StockMarket market;
        private readonly string ticker;
        private decimal value;

        internal StockFetcher(StockMarket market, string ticker)
        {
            this.market = market;
            this.ticker = ticker;
        }

        internal StockFetcher GetAwaiter()
        {
            return this;
        }

        internal bool BeginAwait(Action continuation)
        {
            // If it’s in the cache, we can complete synchronously
            if (market.cache.TryGetValue(ticker, out value))
            {
                return false;
            }
            // Otherwise, we need to make an async web request and do
            // cunning things. Not implemented :)
            throw new NotImplementedException("Oops.");
        }

        internal decimal EndAwait()
        {
            // Usually more appropriate checking here, of course
            return value;
        }
    }
}

(Note that we’d probably have a public method to fetch a single stock value asynchronously too, and that would probably return a task – in this case I wanted to keep everything as simple as possible, not relying on any other implementation of asynchrony. This also shows how the compiler uses GetAwait/BeginAwait/EndAwait… and that they don’t even need to be public methods.)

The result shows that everything was actually computed synchronously – the returned task is complete by the time the method returns. You may be wondering why we’ve bothered using async at all here – and the key is the bit that throws the NotImplementedException. While everything returns synchronously in this case, we’ve allowed for the possibility of asynchronous fetching, and the only bit of code which would need to change is BeginAwait.

So what does this have to do with the choice of keywords? It shows that "yield" really isn’t appropriate here. When the action completes very quickly and synchronously, it isn’t yielding at all.

What’s the alternative?

There are two aspects of the behaviour of the current "await" contextual keyword:

We might yield control, returning a task to the caller.
We will continue processing at some point after the asynchronous subtask has completed – whether it’s completed immediately or whether our continuation is called back.

It’s hard to capture both of those aspects in one or two words, but I think it make sense to at least capture the aspect which is always valid. So I propose something like "continue after":

foreach (var pair in holdings)
{
total += continue after new StockFetcher(this, pair.Item1) * pair.Item2;
}

I’m not particularly wedded to the "after" bit – it could be "when" or "with" for example.

I don’t think this is perfect – I’m really just trying to move the debate forward a bit. I think the community doesn’t really have enough of a "feeling" for the new feature yet to come up with a definitive answer at the moment (and I include myself there). I think focusing on which aspects we want to emphasize – with a clear understanding of how the feature actually behaves – it a good start though.

As many of you will have seen yesterday, the major new feature of C# 5 was announced yesterday: support for asynchronous operations with new contextual keywords of "async" and "await". I haven’t had a lot of time to play with them yet, but I have a few initial thoughts. Note that these won’t make any sense to you unless you’ve also watched the talk Anders gave, or read Eric’s blog post, or something similar. Links are at the bottom of this post rather than peppered throughout.

Asynchronous != parallel

Do you remember when LINQ was first announced, and there was a huge fear that it was effectively putting SQL into C#, when in fact the truth was more general? Well, it’s the same here. The language feature allows a method to execute asynchronously, but it won’t create new threads and the like automatically.

I suspect there’s going to be a lot of confusion about which parts of the asynchronous model are provided by the language and which by the framework. Obviously I’ll hope to clarify things where I can, but it’s important to understand that the language isn’t going to automatically start extra threads or anything like that.

In the Channel9 PDC interview, Anders stressed this point pretty hard – which suggests he thinks it’s going to cause confusion too. I became a lot more comfortable with what’s going on after reading the draft spec – which is part of the CTP.

Language and framework interaction

It looks like the language changes have been designed to be pattern-based, a little like LINQ and foreach. The async modifier is present more for the sake of developers than the compiler, and the await contextual keyword simply requires a GetAwaiter() method to be available (which can be an extension method); that method has to return something for which BeginAwait(Action) and EndAwait() are valid – and again, these can be extension methods.

One of the tests I performed yesterday was to try to use this myself, with a custom "awaiter factory" – it worked fine, and it’s definitely an interesting way of exploring what’s going on. Expect a blog post along those lines some time next week.

So far, so framework-neutral… the only slight fly in the ointment is the reliance on Task and Task<T>. Just as an iterator block with yield return 10; would usually be declared to return IEnumerable<int>, so an async method ending in return 10; would have a return type of Task<int>. I don’t think that’s actually too bad – but I’ll have to dig into exactly what the language relies on to be completely comfortable with. Are there any aspects of tasks which are usually configured by their factories, for example? If so, what does the compiler do?

My gut feeling is that the language is still keeping the framework at an appropriate distance as far as it can, but that it’s got to have something like Task<T> as a way of representing an asynchronous operation.

But is it any use?

Again, I’m really going on gut feelings so far. I think it’s going to be very useful – and I wish I had something similar in Java when writing internal services at work. I like the fact that it’s relatively simple – it feels much simpler than dynamic typing, for example – so I think I’ll be able to get my head round it.

It’s important to note that it’s not a free lunch, and doesn’t try to be. It removes much of the error-prone mechanical drudgery of writing asynchronous code, but it doesn’t attempt to magically parallelize everything you do. You still need to think about what makes sense to do asynchronously, and how to introduce parallelism where appropriate. That’s a really good thing, in my view: it’s about managing complexity rather than hiding it.

Conclusion

I think C# 5 is going to be a lot of fun… but I also think you’re going to need to understand the Task Parallel Library to really make the most of it. Keep the two separate in your head, and understand how they complement each other. If ever there was a good time to learn TPL thoroughly, it’s now. Learning the C# 5 features themselves won’t take long. Mastering the combination of the two in a practical way could be a much bigger task.

Jon Skeet's coding blog

Category Archives: async

C# 5 async and choosing terminology: why I’m against “yield”