(Just to be clear: I hate the word "explicitness". It reminds me of Rowan Atkinson as Marcus Browning MP, saying we should be purposelessnessless. But I can’t think of anything better here.)
For the last few days, I’ve been thinking about context – in the context of C# 5’s proposed asynchronous functionality. Now many of the examples which have been presented have been around user interfaces, with the natural restriction of doing all UI operations on the UI thread. While that’s all very well, I’m more interested in server-side programming, which has a slightly different set of restrictions and emphases.
Back in my post about configuring waiting, I talked about the context of where execution would take place – both for tasks which require their own thread and for the continuations. However, thinking about it further, I suspect we could do with richer context.
What might be included in a context?
We’re already used to the idea of using context, but we’re not always aware of it. When trying to service a request on a server, some or any of the following may be part of our context:
- Authentication information: who are we acting as? (This may not be the end user, of course. It may be another service who we trust in some way.)
- Cultural information: how should text destined for an end user by rendered? What other regional information is relevant?
- Threading information: as mentioned before, what threads should be used both for "extra" tasks and continuations? Are we dealing with thread affinity?
- Deadlines and cancellation: the overall operation we’re trying to service may have a deadline, and operations we create may have their own deadlines too. Cancellation tokens in TPL can perform this role for us pretty easily.
- Logging information: if the logs need to tie everything together, there may be some ID generated which should be propagated.
- Other request information: very much dependent on what you’re doing, of course…
We’re used to some of this being available via properties such as CultureInfo.CurrentCulture and HttpContext.Current – but those are tied to a particular thread. Will they be propagated to threads used for new tasks or continuations? Historically I’ve found that documentation has been very poor around this area. It can be very difficult to work out what’s going to happen, even if you’re aware that there’s a potential problem in the first place.
Explicit or implicit?
It’s worth considering what the above items have in common. Why did I include those particular pieces of information but not others? How can we avoid treating them as ambient context in the first place?
Well, fairly obviously we can pass all the information we need along via method arguments. C# 5’s async feature actually makes this easier than it was before (and much easier that it would have been without anonymous functions) because the control flow is simpler. There should be fewer method calls, each of which would each require decoration with all the contextual information required.
However, in my experience that becomes quite problematic in terms of separation of concerns. If you imagine the request as a tree of asynchronous operations working down from a top node (whatever code initially handles the request), each node has to provide all the information required for all the nodes within its subtree. If some piece of information is only required 5 levels down, it still needs to be handed on at each level above that.
The alternative is to use an implicit context – typically via static methods or properties which have to do the right thing, typically based on something thread-local. The context code itself (in conjunction with whatever is distributing the work between threads) is responsible for keeping track of everything.
It’s easy to point out pros and cons to both approaches:
- Passing everything through methods makes the dependencies very obvious
- Changes to "lower" tasks (even for seemingly innocuous reasons such as logging) end up causing chains of changes higher up the task tree – possibly to developers working on completely different projects, depending on how your components work
- It feels like there’s a lot of work for very little benefit in passing everything explicitly through many layers of tasks
- Implicit context can be harder to unit test elegantly – as is true of so many things using static calls
- Implicit context requires everyone to use the same context. It’s no good high level code indicating which thread pool to use in one setting when some lower level code is going to use a different context
Ultimately it feels like a battle between purity and pragmatism: being explicit helps to keep your code purer, but it can mean a lot of fluff around your real logic, just to maintain the required information to pass onward. Different developers will have different approaches to this, but I suspect we want to at least keep the door open to both designs.
The place of Task/Task<T>
Even if Task/Task<T> can pass on the context for scheduling, what do we do about other information (authentication etc)? We have types like ThreadLocal<T> – in a world where threads are more likely to be reused, and aren’t really our unit of asynchrony, do we effectively need a TaskLocal<T>? Can context within a task be pushed and automatically popped, to allow one subtree to "override" the context for its nodes, while another subtree works with the original context?
I’ve been trying to think about whether this can be provided in "userland" code instead of in the TPL itself, but I’m not sure it can, easily… at least not without reinventing a lot of the existing code, which is never a good idea when it’s tricky parallelization code.
Should this be general support, or would it be okay to stick to just TaskScheduler.Current, leaving developers to pass other context explicitly?
Conclusion
These are thoughts which I’m hoping will be part of a bigger discussion. I think it’s something the community should think about and give feedback to Microsoft on well before C# 5 (and whatever framework it comes with) ships. I have lots of contradictory feelings about the right way to go, and I’m fully expecting comments to have mixed opinions too.
I’m sure I’ll be returning to this topic as time goes on.
Addendum (March 27th 2012)
Lucian Wischik recently mailed me about this post, to mention that F#’s support for async has had the ability to retain explicit context from the start. It’s also more flexible than the C# async support – effectively, it allows you to swap out AsyncTaskMethodBuilder etc for your own types, so you don’t always have to go via Task/Task<T>. I’ll take Lucian’s word for that, not knowing much about F# myself. One day…