Category Archives: Wacky Ideas

C# 4.0: dynamic ?

I’ve not played with the VS2010 CTP much yet, and I’ve only looked briefly at the documentation and blogs about the new C# 4.0 dynamic type, but a thought occurred to me: why not have the option of making it generic as a way of saying “I will dynamically support this set of operations”?

As an example of what I mean, suppose you have an interface IMessageRouter like this:

public interface IMessageRouter
{
    void Send(string message, string destination);
}

(This is an arbitrary example, by the way. The idea isn’t specifically more suitable for message routing than anything else.)

I may have various implementations, written in various languages (or COM) which support the Send method with those parameters. Some of those implementations actually implement IMessageRouter but some don’t. I’d like to be able to do the following:

dynamic<IMessageRouter> router = GetRouter();

// This is fine (but still invoked dynamically)
router.Send(“message”, “skeet@pobox.com”);
// Compilation error: no such overload
router.Send(“message”, “skeet@pobox.com”, 20);

Intellisense would work, and we’d still have some of the benefits of static typing but without the implementations having to know about your interface. Of course, it would be quite easy to create an implementation of the interface which did exactly this – but now imagine that instead of IMessageRouter we had MessageRouter – a concrete class. In this case the compiler would still restrict the caller to the public API of the class, but it wouldn’t have to be the real class. No checking would be performed by the compiler that your dynamic type actually supported the operations – given that we’re talking about dynamic invocation, that would be impossible to do. It would instead be an “opt-in” restriction the client places on themselves. It could also potentially help with performance – if the binding involved realised that the actual type of the dynamic object natively implemented the interface or was/derived from the class, then no real dynamic calls need be made; just route all directly.

This may all sound a bit fuzzy – I’m extremely sleepy, to be honest – but I think it’s a potentially interesting idea. Thoughts?

Update

Apparently this post wasn’t as clear as it might be. I’m quite happy to keep the currently proposed dynamic type idea as well – I’d like this as an additional way of using dynamic objects.

I’m Sorry, I Haven’t a Cluestick

Unless you’ve listened to I’m Sorry, I Haven’t a Clue, the BBC Radio 4 “antidote to panel games” which was chaired by the recently departed (and much missed) Humphrey Lyttelton, this post may not make a lot of sense to you. That’s not intended to guarantee that it’ll make any sense to you even if you have listened to it, mind you.

ISIHAC was full of very silly improvisation games. I was listening to an episode last night, and thought a developer version could be equally silly. Games played might include the following: (links show the Wikipedia description of the original)

  • One language to the API of another: Not so much a joke as a strategy now at both Microsoft and Sun, but I’m sure it could be taken to extremes. Ever fancied using the ZX Spectrum BASIC functions from C#? How about SmallTalk from Java?
  • Singleton Crescent: Players name design patterns, seemingly at random to the unenlightened observer, until one of them declares “Singleton”. Multiple variations exist, such as Gamma’s Folly: “After a diagonal move ending on a creational pattern, a behavioural pattern is not permitted.”
  • Fluent Gorge: Players start with a single method call and add additional expressions. The player who inadvertently completes the statement is the loser. This game is already played at lunchtime in many ISVs, some of whom accidentally check in the tortuous code afterwards.
  • Pick-up algorithm: A player begins copying code out of a book, which is then removed after a few lines. They then have to continue coding as accurately as possible until the book is brought back. Points are awarded if they’re within a gnat’s semi-colon of the original. Bonus points used to be awarded if the resulting code had fewer bugs in than the printed original, but this practice has recently been discontinued as being a near-trivial way of achieving a high score.
  • Call my bluff: One player is given a single-line project goal, and then several sets of requirements purporting to support that goal. The player has to guess which set of requirements is the real one for the project. The twist is that none of the sets of requirements is even slightly realistic. Again, this game is in widespread use, played by Business Analysts on unsuspecting developers.

Right, that’s more than enough silliness. Something more serious next time, I promise.

Mandelbrot revisited – benchmark edition

I’ve had fun with the Mandelbrot set in this blog before, using it as an example of an embarrassingly parallelisable problem and demonstrating Parallel LINQ with it.

This morning, over breakfast, I described the problem to Christian Brunschen, a colleague of mine who has some parallelisation experience through implementing OpenMP in a portable manner. He immediately suggested a few possible changes to the way that I’ve approached things. Given the number of different attempts I’ve now had, I thought it would make sense to write a litte benchmarking app which could easily be expanded as further implementations were considered. The benchmark is available for download so you can play around with it yourself. (As is often the case with this kind of thing, it’s not the nicest code in the world for various reasons – the duplication of the sequence generation method, for example… Please don’t use it as a tutorial on actual coding and organisation.)

Benchmark implementation details

The benchmark allows you to vary the size of the generated image and the maximum number of iterations per point. The images can be displayed after the test run, but only the time taken to populate a byte array is recorded. The byte arrays are all allocated before any tests are run, and the garbage collector is invoked (as far as it can be) between tests. The images themselves are only generated after the tests have all completed. Each implementation is given a tiny “dummy run” (creating an image 10 pixels across, with a maximum of 2 iterations per point) first to hopefully remove JITting from the benchmarking times. I won’t pretend that this puts everything on a completely level playing field (benchmarking is hard) but hopefully it’s a good start. We check the similarity between the results of each test and the first one – in some cases they could be “nearly all the same” without there being a bug, due to the subtleties of floating point operations and inlining.

The base class that all the generators use takes care of working out the height from the width, remembering the various configuration options, allocating the array, and also providing a simple imperative method to obtain the value of a single point, without using anything fancy.

I originally wanted to put the whole thing in a nice UI, but after wasting nearly an hour trying to get WPF to behave, I decided to go for a simple console app. One day I really must learn how to write GUIs quickly…

Okay, enough introduction – let’s look at the implementations we’re testing, and then the results. The idea is to get a look at how a number of areas influence the results, as well as seeing some nifty ways of expressing things functionally.

SingleThreadImperativeSimple

This is the most trivial code you could write, basically. As the base class provides the calculation method, we just go through each row and column, work out the value, and store it in the array. The only attempt at optimisation is keeping the “current index” into the array rather than calculating it for every point.

 

public override void Generate()
{
    int index = 0;
    for (int row = 0; row < Height; row++)
    {
        for (int col = 0; col < Width; col++)
        {
            Data[index++] = ComputeMandelbrotIndex(row, col);
        }
    }
}

 

where ComplexMandelbrotIndex is in the base class:

 

protected byte ComputeMandelbrotIndex(int row, int col)
{
    double x = (col * SampleWidth) / Width + OffsetX;
    double y = (row * SampleHeight) / Height + OffsetY;

    double y0 = y;
    double x0 = x;

    for (int i = 0; i < MaxIterations; i++)
    {
        if (x * x + y * y >= 4)
        {
            return (byte)((i % 255) + 1);
        }
        double xtemp = x * x – y * y + x0;
        y = 2 * x * y + y0;
        x = xtemp;
    }
    return 0;
}

 

SingleThreadImperativeInline

This is largely the same code as SingleThreadImperativeSimple, but with everything inlined. Within the main body, there’s no access to anything other than local variables, and no method calls. One further optimisation is available: points which will have a value of 0 aren’t stored (we assume we’ll always start with a cleared array).

 

public override void Generate()
{
    int width = Width;
    int height = Height;
    int maxIterations = MaxIterations;
    int index = 0;
    byte[] data = Data;

    for (int row = 0; row < height; row++)
    {
        for (int col = 0; col < width; col++)
        {
            double x = (col * SampleWidth) / width + OffsetX;
            double y = (row * SampleHeight) / height + OffsetY;

            double y0 = y;
            double x0 = x;

            for (int i = 0; i < maxIterations; i++)
            {
                if (x * x + y * y >= 4)
                {
                    data[index] = (byte)((i % 255) + 1);
                    break;
                }
                double xtemp = x * x – y * y + x0;
                y = 2 * x * y + y0;
                x = xtemp;
            }
            // Leave data[index] = 0 by default
            index++;
        }
    }
}

 

MultiThreadUpFrontSplitImperative

This should be pretty efficient – work out how many cores we’ve got, split the work equally between them (as chunks of rows, which helps in terms of locality and just incrementing an index in the byte array each time), and then run. Wait until all the threads have finished, and we’re done. Shouldn’t be a lot of context switching required normally, and no synchronization is required. However, if some cores are busy with another process, we’ll end up context switching for no gain.

 

public override void Generate()
{
    int cores = Environment.ProcessorCount;

    int rowsPerCore = Height / cores;

    List<Thread> threads = new List<Thread>();
    for (int i = 0; i < cores; i++)
    {
        int firstRow = rowsPerCore * i;
        int rowsToCompute = rowsPerCore;
        if (i == cores – 1)
        {
            rowsToCompute = Height-(rowsPerCore*(cores-1));
        }
        Thread thread = new Thread(() =>
            {
                int index = firstRow * Width;
                for (int row = firstRow; row < firstRow + rowsToCompute; row++)
                {
                    for (int col = 0; col < Width; col++)
                    {
                        Data[index++] = ComputeMandelbrotIndex(row, col);
                    }
                }
            });
        thread.Start();
        threads.Add(thread);
    }

    threads.ForEach(thread => thread.Join());
}

 

MultiThreadRowFetching

Again, we use a fixed number of threads – but this time we start off with a queue of work – one task per row. The queue has to be synchronized every time we use it, and we also have to check whether or not it’s empty. Plenty of scope for lock contention here, particularly as the number of cores increases.

 

public override void Generate()
{
    Queue<int> rowsLeft = new Queue<int>(Enumerable.Range(0, Height));

    List<Thread> threads = new List<Thread>();
    for (int i = 0; i < Environment.ProcessorCount; i++)
    {
        Thread thread = new Thread(() =>
            {
                while (true)
                {
                    int row;
                    lock (rowsLeft)
                    {
                        if (rowsLeft.Count == 0)
                        {
                            break;
                        }
                        row = rowsLeft.Dequeue();
                    }
                    int index = row * Width;
                    for (int col = 0; col < Width; col++)
                    {
                        Data[index++] = ComputeMandelbrotIndex(row, col);
                    }
                }
            });
        thread.Start();
        threads.Add(thread);
    }

    threads.ForEach(thread => thread.Join());
}

SingleThreadLinqSimple

This is the first version of Mandelbrot generation I wrote since thinking of using LINQ, tweaked a litte to dump the output into an existing array instead of creating a new one. It’s essentially the same as SingleThreadImperativeSimple but with the for loops replaced with a query expression.

 

public override void Generate()
{
    var query = from row in Enumerable.Range(0, Height)
                from col in Enumerable.Range(0, Width)
                select ComputeMandelbrotIndex(row, col);

    int index=0;
    foreach (byte b in query)
    {
        Data[index++] = b;
    }
}

 

ParallelLinqRowByRowWithCopy

My first attempt using Parallel LINQ was somewhat disastrous due to the nature of PLINQ’s ordering. To combat this effect, I initially computed a row at a time, then copying each row into place afterwards:

 

private byte[] ComputeMandelbrotRow(int row)
{
    byte[] ret = new byte[Width];
    for (int col = 0; col < Width; col++)
    {
        ret[col] = ComputeMandelbrotIndex(row, col);
    }
    return ret;
}

public override void Generate()
{
    var query = from row in Enumerable.Range(0, Height).AsParallel(ParallelQueryOptions.PreserveOrdering)
                select ComputeMandelbrotRow(row);

    int rowStart = 0;
    foreach (byte[] row in query)
    {
        Array.Copy(row, 0, Data, rowStart, Width);
        rowStart += Width;
    }
}

 

ParallelLinqRowByRowInPlace

An obvious potential improvement is to write the data to the eventual target array as we go, to avoid creating the extra array creation and copying. At this point we have less of a purely functional solution, but it’s interesting anyway. Note that to use this idea in a query expression we have to return something even though it’s not useful. Likewise we have to make the query execute fully – so I use Max() to ensure that all the results will be computed. (I originally tried Count() but that didn’t work – presumably because the count of the results is known before the actual values are.) As we’re writing the data to the right place on each iteration, we no longer need to preserve ordering.

 

private int ComputeMandelbrotRow(int row)
{
    int index = row * Width;
    for (int col = 0; col < Width; col++)
    {
        Data[index++] = ComputeMandelbrotIndex(row, col);
    }
    return 0;
}

public override void Generate()
{
    var query = from row in Enumerable.Range(0, Height).AsParallel()
                select ComputeMandelbrotRow(row);

    query.Max();
}

 

ParallelLinqWithSequenceOfPoints

After this initial foray into Parallel LINQ, Nicholas Palladinos suggested that I could adapt the original idea in a more elegant manner by treating the sequence of points as a single input sequence, and asking Parallel LINQ to preserve the order of that whole sequence.

 

public override void Generate()
{
    var points = from row in Enumerable.Range(0, Height)
                 from col in Enumerable.Range(0, Width)
                 select new { row, col };

    var query = from point in points.AsParallel(ParallelQueryOptions.PreserveOrdering)
                select ComputeMandelbrotIndex(point.row, point.col);

    int index=0;
    foreach (byte b in query)
    {
        Data[index++] = b;
    }
}

 

SingleThreadLinqWithGenerator

My next step was to try to put the whole guts of the algorithm into the query expression, using a Complex value type and a generator to create a sequence of complex numbers generated from the starting point. This is quite a natural step, as the value is computed by just considering this infinite sequence and seeing how quickly it escapes from the circle of radius 2 on the complex plane. Aside from anything else, I think it’s a nice example of deferred execution and streaming – the sequence really would go on forever if we didn’t limit it with the Take and TakeWhile calls, but the generator itself doesn’t know it’s being limited. This version uses a sequence of points as per ParallelLinqWithSequenceOfPoints to make it easier to parallelise later.

It’s not exactly an efficient way of doing things though – there’s a huge amount of calling going on from one iterator to another. I’m actually quite surprised that the final results aren’t worse than they are.

 

public override void Generate()
{
    var points = from row in Enumerable.Range(0, Height)
                 from col in Enumerable.Range(0, Width)
                 select new { row, col };

    var query = from point in points
                // Work out the initial complex value from the row and column
                let c = new Complex((point.col * SampleWidth) / Width + OffsetX,
                                    (point.row * SampleHeight) / Height + OffsetY)
                // Work out the number of iterations
                select GenerateSequence(c, x => x * x + c).TakeWhile(x => x.SquareLength < 4)
                                                          .Take(MaxIterations)
                                                          .Count() into count
                // Map that to an appropriate byte value
                select (byte)(count == MaxIterations ? 0 : (count % 255) + 1);

    int index = 0;
    foreach (byte b in query)
    {
        Data[index++] = b;
    }
}

public static IEnumerable<T> GenerateSequence<T>(T start, Func<T, T> step)
{
    T value = start;
    while (true)
    {
        yield return value;
        value = step(value);
    }
}

 

ParallelLinqWithGenerator

From the previous implementation, parallelisation is simple.

 

public override void Generate()
{
    var points = from row in Enumerable.Range(0, Height)
                 from col in Enumerable.Range(0, Width)
                 select new { row, col };

    var query = from point in points.AsParallel(ParallelQueryOptions.PreserveOrdering)
                // Work out the initial complex value from the row and column
                let c = new Complex((point.col * SampleWidth) / Width + OffsetX,
                                    (point.row * SampleHeight) / Height + OffsetY)
                // Work out the number of iterations
                select GenerateSequence(c, x => x * x + c).TakeWhile(x => x.SquareLength < 4)
                                                          .Take(MaxIterations)
                                                          .Count() into count
                // Map that to an appropriate byte value
                select (byte)(count == MaxIterations ? 0 : (count % 255) + 1);

    int index = 0;
    foreach (byte b in query)
    {
        Data[index++] = b;
    }
}

 

SingleThreadImperativeWithComplex

Having already seen how slow the generator version was in the past, I thought it would be worth checking whether or not this was due to the use of the Complex struct – so this is a version which uses that, but is otherwise just like the very first implementation (SingleThreadImperativeSimple).

 

public override void Generate()
{
    int index = 0;
    for (int row = 0; row < Height; row++)
    {
        for (int col = 0; col < Width; col++)
        {
            Data[index++] = ComputeMandelbrotIndexWithComplex(row, col);
        }
    }
}

byte ComputeMandelbrotIndexWithComplex(int row, int col)
{
    double x = (col * SampleWidth) / Width + OffsetX;
    double y = (row * SampleHeight) / Height + OffsetY;

    Complex start = new Complex(x, y);
    Complex current = start;

    for (int i = 0; i < MaxIterations; i++)
    {
        if (current.SquareLength >= 4)
        {
            return (byte)((i % 255) + 1);
        }
        current = current * current + start;
    }
    return 0;
}

 

UnorderedParallelLinqSimple

This is where my colleague, Christian, came in. He suggested that instead of ordering the results, we just return the point as well as its value. We can then populate the array directly, regardless of the order of the results. The first version just uses an anonymous type to represent the combination:

 

public override void Generate()
{
    var query = from row in Enumerable.Range(0, Height).AsParallel()
                from col in Enumerable.Range(0, Width)
                select new { row, col, value = ComputeMandelbrotIndex(row, col) };

    foreach (var x in query)
    {
        Data[x.row * Width + x.col] = x.value;
    }
}

 

UnorderedParallelLinqWithStruct

Creating a new garbage-collected object on the heap for each point isn’t the world’s greatest idea. A very simple transformation is to use a struct instead. Without knowing the PLINQ implementation, it seems likely that the values will still end up on the heap somehow (how else could they be communicated between threads?) but I’d still expect a certain saving from this simple step.

 

public override void Generate()
{
    var query = from row in Enumerable.Range(0, Height).AsParallel()
                from col in Enumerable.Range(0, Width)
                select new Result (row, col, ComputeMandelbrotIndex(row, col));

    foreach (var x in query)
    {
        Data[x.Row * Width + x.Column] = x.Value;
    }
}

struct Result
{
    internal int row, col;
    internal byte value;

    internal int Row { get { return row; } }
    internal int Column { get { return col; } }
    internal byte Value { get { return value; } }

    internal Result(int row, int col, byte value)
    {
        this.row = row;
        this.col = col;
        this.value = value;
    }
}

 

UnorderedParallelLinqInPlace

Why bother returning the data at all? We can just write the data in place, like we did earlier with ParallelLinqRowByRowInPlace but in a more aggressive fashion (and the disadvantage of computing the index for each point). Again, we have to return a dummy value and iterate through those dummy results to force all the computations to go through.

 

public override void Generate()
{

    var query = from row in Enumerable.Range(0, Height).AsParallel()
                from col in Enumerable.Range(0, Width)
                // Note side-effect!
                select Data[row*Width + col] = ComputeMandelbrotIndex(row, col);

    // Force iteration through all results
    query.Max();
}

 

UnorderedParallelLinqInPlaceWithDelegate

UnorderedParallelLinqInPlace feels slightly nasty – we’ve clearly got a side-effect within the loop, it’s just that we know the side-effects are all orthogonal. We can make ourselves feel slightly cleaner by separating the side-effect from the main algorithm, by way of a delegate. The loop can hold its hands up and say, “I’m side-effect free if you are.” It’s not hugely satisfactory, but it’ll be interesting to see the penalty we pay for this attempt to be purer.

 

IEnumerable<T> Generate<T>(Func<int, int, T> transformation)
{
    var query = from row in Enumerable.Range(0, Height).AsParallel()
                from col in Enumerable.Range(0, Width)
                // Side-effect only if transformation contains one…
                select transformation(row, col);

    return query;
}

public override void Generate()
{
    // Transformation with side-effect
    Func<int,int,byte> transformation = (row, col) => Data[row * Width + col] = ComputeMandelbrotIndex(row, col);

    Generate(transformation).Max();
}

 

UnorderedParallelLinqInPlaceWithGenerator

We can easily combine the earlier generator code with any of these new ways of processing the results. Here we use our “algorithm in the query” approach but process the results as we go to avoid ordering issues.

 

public override void Generate()
{
    var query = from row in Enumerable.Range(0, Height).AsParallel()
                from col in Enumerable.Range(0, Width)
                // Work out the initial complex value from the row and column
                let c = new Complex((col * SampleWidth) / Width + OffsetX,
                                    (row * SampleHeight) / Height + OffsetY)
                // Work out the number of iterations
                let count = GenerateSequence(c, x => x * x + c).TakeWhile(x => x.SquareLength < 4)
                                                               .Take(MaxIterations)
                                                               .Count()
                // Map that to an appropriate byte value – and write it in place
                select Data[row * Width + col]  = (byte)(count == MaxIterations ? 0 : (count % 255) + 1);

    // Force execution
    query.Max();
}

 

ParallelForLoop

The Parallel Extensions library doesn’t just contain Parallel LINQ. It also has various other building blocks for parallelism, including a parallel for loop. This allows very simple parallelism of our very first generator – we just turn the outside loop into a parallel for loop, turning the previous inner loop into a delegate. We need to move the index variable into the outer loop so there’ll be one per row (otherwise they’d trample on each other) but that’s about it:

 

public override void Generate()
{
    Parallel.For(0, Height, row =>
    {
        int index = row * Width;
        for (int col = 0; col < Width; col++)
        {
            Data[index++] = ComputeMandelbrotIndex(row, col);
        }
    });
}

 

Results

A benchmark is nothing without results, so here they are on my dual core laptop, from three test runs. The first is the “default” settings I used to develop the benchmark – nothing hugely strenuous, but enough to see differences. I then tried a larger image with the same maximum number of iterations, then the original size with a larger number of iterations. The results are in alphabetical order because that’s how the test prints them :) Times are in milliseconds.

Implementation Width=1200; MaxIterations=200 Width=3000; MaxIterations=200 Width=1200; MaxIterations=800
MultiThreadRowFetching 380 2479 1311
MultiThreadUpFrontSplitImperative 384 2545 2088
ParallelForLoop 376 2361 1292
ParallelLinqRowByRowInPlace 378 2347 1295
ParallelLinqRowByRowWithCopy 382 2376 1288
ParallelLinqWithGenerator 4782 29752 16626
ParallelLinqWithSequenceOfPoints 549 3413 1462
SingleThreadImperativeInline 684 4352 2455
SingleThreadImperativeSimple 704 4353 2372
SingleThreadImperativeWithComplex 2795 16720 9800
SingleThreadLinqSimple 726 4522 2438
SingleThreadLinqWithGenerator 8902 52586 30075
UnorderedParalleLinqInPlace 422 2586 1317
UnorderedParallelLinqInPlaceWithDelegate 509 3093 1392
UnorderedParallelLinqInPlaceWithGenerator 5046 31657 17026
UnorderedParallelLinqSimple 556 3449 1448
UnorderedParalelLinqWithStruct 511 3227 1427

Conclusions

So, what have we learned? Well… bearing in mind that benchmarks like this are often misleading compared with real applications, etc it’s still interesting that:

  • Parallel Extensions rocks. If I were trying to include a Mandelbrot generation implementation in a production setting, I’d definitely go for the parallel for loop – it’s simple, but works just as well as anything else.
  • The micro-optimisation of SingleThreadImperativeInline really doesn’t help much, but makes the code harder to understand – just like so many micro-optimisations.
  • The “generator” form of LINQ really doesn’t perform well at all. It does parallelise pretty well, as you’d expect, but it’s just plain slow.
  • Part of the slowness is almost certainly due to the use of the Complex struct, given the results of SingleThreadImperativeWithComplex. Not sure what’s going on there.
  • The extra abstraction step from UnorderedParallelLinqInPlace to UnorderedParallelLinqInPlaceWithDelegate does have a significant impact, at least when the maximum number of iterations per point isn’t the dominant force. That doesn’t mean it would be a bad idea in production, of course – just somewhere to consider when running against performance issues.

I suspect I’ve missed some notable points though – comments?

Visualising the Mandelbrot set with LINQ – yet again

I’ve been thinking about ranges again, particularly after catching a book error just in time, and looking at Marc’s generic complex type. It struck me that my previous attempts were all very well, and demonstrated parallelisation quite neatly, but weren’t very LINQy. In particular, they certainly didn’t use LINQ for the tricky part in the same way that Luke Hoban’s ray tracer does.

The thing is, working out the “value” of a particular point in the Mandelbrot set visualisation is actually quite well suited to LINQ:

  1. Start with a complex value
  2. Apply a transformation to it (square the current value, add the original value from step 1).
  3. Does the result have a modulus > 2? If so, stop.
  4. Have we performed as many iterations as we’re willing to? If so, stop.
  5. Take our new value, and go back to 2.

We only need two “extra” bits of code to implement this: a Complex type, and a way of applying the same transformation repeatedly.

Well, here’s the Complex type – it contains nothing beyond what we need for this demo, but it’ll do. Obviously Marc’s generic implementation is rather more complete.

public struct Complex
{
    readonly double real;
    readonly double imaginary;

    public Complex(double real, double imaginary)
    {
        this.real = real;
        this.imaginary = imaginary;
    }

    public static Complex operator +(Complex c1, Complex c2)
    {
        return new Complex(c1.real + c2.real, c1.imaginary + c2.imaginary);
    }

    public static Complex operator *(Complex c1, Complex c2)
    {
        return new Complex(c1.real*c2.real – c1.imaginary*c2.imaginary,
                           c1.real*c2.imaginary + c2.real*c1.imaginary);
    }

    public double SquareLength
    {
        get { return real * real + imaginary * imaginary; }
    }
}

Simple stuff, assuming you know anything about complex numbers.

The other new piece of code is even simpler. It’s just a generator. It’s a static method which takes an initial value, and a delegate to apply to one value to get the next. It then lazily returns the generated sequece – forever.

public static IEnumerable<T> Generate<T>(T start, Func<T, T> step)
{
    T value = start;
    while (true)
    {
        yield return value;
        value = step(value);
    }
}

Just as an example of use, remember Enumerable.Range which starts at a particular integer, then adds one repeatedly until it’s yielded as many results as you’ve asked for? Well, here’s a possible implementation, given the Generate method:

public static IEnumerable<int> Range(int start, int count)
{
    return Generate(start, x => x + 1).Take(count);
}

These are all the building blocks we require to build our Mandelbrot visualisation. We want a query which will return a sequence of bytes, one per pixel, where each byte represents the colour to use. Anything which goes beyond our maximum number of iterations ends up black (colour 0) and other values will cycle round the colours. I won’t show the drawing code, but the query is now more self-contained:

var query = from row in Enumerable.Range(0, ImageHeight)
            from col in Enumerable.Range(0, ImageWidth)
            // Work out the initial complex value from the row and column
            let c = new Complex((col * SampleWidth) / ImageWidth + OffsetX,
                                (row * SampleHeight) / ImageHeight + OffsetY)
            // Work out the number of iterations
            select Generate(c, x => x * x + c).TakeWhile(x => x.SquareLength < 4)
                                              .Take(MaxIterations)
                                              .Count() into count
            // Map that to an appropriate byte value
            select (byte)(count == MaxIterations ? 0 : (count % 255) + 1);

(The various constants used in the expression are defined elsewhere.) This works, and puts the Mandelbrot logic directly into the query. However, I have to admit that it’s much slower than my earlier versions. Heck, I’m still proud of it though.

As ever, full source code is available for download, should you so wish.

Boxing day

Here in the UK (and possibly elsewhere – I’m too lazy to check Wikipedia) December 26th is called Boxing Day. I want to know why only this one aspect of .NET is given its own public holiday. Expanding the opportunities to language features as well as those of the runtime, I’d like to see:

  • Null coalescing day (“Take 3 nulls into the shower? Not me. I just coalesce and go!”)
  • Type inference day (not popular with HR)
  • Garbage collection day (as a proper holiday, not just when the bins are collected)
  • Just-In-Time day (arguably every day for procrastinators such as myself)
  • Lambda expression day (get those lambdas out of the closet)

I could go on, but it’s a pretty thin joke and dinner is ready.

Refactris

This post is in lieu of writing a proper one, either on the generic maths operators which Marc Gravell has been hard at work on, or on C# 4 which I have a number of opinions about (no surprise there). I will write about both of those topics, but I really ought to do some more work on the manuscript for chapter 2 of the book before I go to bed. Posting a blog entry is a reward for finishing indexing chapters 2 and 13, but both of the serious posts will take longer than I really have time for right now.

So, Refactris. This is a silly idea born a couple of weeks ago, at work. You see, several months ago, I had run out of work on a Friday afternoon at 4pm. My then-team-leader, Rohan, foolishly challenged me to write console-mode Tetris in an hour. I had great fun, and had a demonstrable game of Tetris working precisely one hour later. Now combine that with refactoring, one of the ideas of which is that you can remove lines of code by finding code duplication etc. Put the two together, and you get Refactris.

Letters, digits and symbols would fall from the top, and whenever they landed (in a normal Tetris manner) the game would try to compile the code in the bucket, finding as much to compile as possible. It would try the top line, then the top two lines, then the top three lines, etc, until it reached the bottom – then try the second line, the second and third lines together, etc. Code which compiled would be removed.

Clearly it’s a stupid idea, and I haven’t actually tried to implement it or anything silly like that. Funny enough to share though, and if any of you wish to give it a go, I’d love to see the results.

Group pipelining returns: new and improved design

Last night’s blog post provoked a flurry of emails between myself and Marc Gravell. Looking back, trying to base the pipeline on a pseudo-asynchronous version of IEnumerable<T> was a mistake. We’ve now got a much more attractive interface to write extensions against:

 

public interface IDataProducer<T>
{
    event Action<T> DataProduced;
    event Action EndOfData;
}

Why is this so much better? A few reasons:

  1. Acting on all the data is much easier: just subscribe to the events. No need for a weird ForEach.
  2. It fits the normal processing model much better – we tend to want to process the data, not whether or not there are more elements on their way.
  3. It allows unsubscription when we’re not interested in any more data.
  4. It allows multiple subscribers.

The last point is very, very interesting. It means we can implement GroupWithPipeline (current new name – I suspect it won’t last forever though) to take multiple pipelines, producing a KeyValueTuple with, say, three values in, each of different types, still strongly typed. If we don’t care about the type safety, we can use “params…” to deal with as many pipelines as we want – the client then needs to cast, which isn’t ideal, but isn’t too bad.

As an idea of what I mean by this, consider this call to find the Max, Min and Average values, all without buffering:

 

var query = someSequenceOfOrders
                .GroupWithPipeline (entry => entry.Customer,
                                    seq => seq.Min(entry => entry.OrderSize),
                                    seq => seq.Average(entry => entry.OrderSize),
                                    seq => seq.Max(entry => entry.OrderSize));
                .Select(x => new { Customer = x.Key,
                                   Min = x.Value1,
                                   Avereage = x.Value2,
                                   Max = x.Value3 });
                                   
foreach (var result in query)
{
    Console.WriteLine (“Customer {0}: {1}/{2}/{3}”
                       result.Customer,
                       result.Min,
                       result.Average,
                       result.Max);
}

We specify three different pipelines, all of which will be applied to the same sequence of data. The fact that we’ve specified OrderSize three times is unfortunate – a new overload to transform the entries passed to the pipeline is probably in order – but it’s all doable.

This sort of “in pipeline” multiple aggregation is very, very cool IMO. It’s turned the whole idea from “interesting” to “useful enough to get into MiscUtil”.

I haven’t actually written Min, Max or Average yet – although Marc has, I believe. (We’re collaborating and sharing source, but not working off a common source control system yet. It’s all a bit ad hoc.) What I know he’s done which is possibly even more useful than all of this to start with is used expression trees to implement generic maths. This is only execution time checked, which is unfortunate, but I don’t believe that will be a problem in real life.

The upshot is that the above code will work with any type with appropriate operators defined. No need for loads of overloads for decimal, long, int, float, double etc – it will just work. If you’re worried about performance, you can relax – it performs very, very well, which was a bit of a surprise to both of us.

More of that in another post though… I wanted to share the new design because it’s so much nicer than the deeply complicated stuff I was working with yesterday.

Don’t call us, we’ll call you – push enumerators

Update: I’ve got a new and simpler design now. I’m leaving this in for historical interest, but please see the entry about the new design for more recent information.

This post is going to be hard to write, simply because I can’t remember ever writing quite such bizarre code before. When I find something difficult to keep in my own head, explaining it to others is somewhat daunting, especially when blogging is so much more restrictive than face-to-face discussion. Oh, and you’ll find it hard if you’re not familiar with lambda expression syntax (x => x.Foo etc). Just thought I’d warn you.

It’s possibly easiest to explain with an example. It’s one I’m hoping to use in a magazine article – but I certainly won’t be trying to explain this in the article. Imagine you’ve got a lot of log entries on disk – by which I mean hundreds of millions. You certainly don’t want all of that in memory. However, each of the log entries contains a customer ID, and you want to find out for each customer how many entries there are. Here’s a LINQ query which would work but be horribly inefficient, loading everything into memory:

var query = from entry in logEntryReader
            group entry by entry.Customer into entriesByCustomer
            let count = entriesByCustomer.Count()
            orderby count descending
            select new { Customer = entriesByCustomer.Key, Count = count };

Now, it’s easy to improve this somewhat just by changing the “group entry by” to “group 1 by” – that way the entries themselves are thrown away. However, you’ve still got some memory per entry – a huge great enumeration of 1s to count after grouping.

The problem is that you can’t tell “group … by” how to aggregate the sequence associated with each key. This isn’t just because there’s no syntax to express it – it’s to do with the nature of IEnumerable itself. You see, the “pull” nature of IEnumerable is a problem. While a thread is waiting for more data, it will just block. Normally, an aggregator (like Count) just picks data off a sequence until it reaches the end, then returns the result. How can that work when there are multiple sequences involved (one for each customer)?

There are three answers to this:

1) Write your own group-and-count method. This is pretty straightforward, and potentially useful in many situations. It’s also fairly easy to understand. You just iterate through a sequence, and keep a dictionary of key to int, increasing each key’s count as you see elements. This is the pragmatic solution when faced with a specific problem – but it feels like there should be something better, something that lets us group and then specify the processing in terms of standard query operators.

2) Create a new thread and producer/consumer style IEnumerable for each key. Clearly this doesn’t scale.

3) Invert control of enumerations: put the producer in the driving seat instead of the consumer. This is the approach we’re talking about for the rest of the post.

A word on the term “asynchronous”

I don’t know whether my approach could truly be called asynchronous. What I haven’t done is make any of the code thread-aware at all, or even thread-safe. All the processing of multiple sequences happens in a single thread. I also don’t have the full BeginXXX, EndXXX using IAsyncResult pattern. I started down that line, but it ended up being a lot more complicated than what I’ve got now.

I’m pretty sure that what I’ve been writing is along the lines of CSPs (Communicating Sequential Processes) but I wouldn’t in any way claim that it’s a CSP framework, either.

However, you may find that it helps to think about asynchronous APIs like Stream.BeginRead when looking at the rest of the code. In particular, reading a stream asynchronously has the same “say I’m interested in data, react to data, request some more” pattern.

Keeping the Count aggregator in mind, what we want to do is maintain a private count, and request some data. When we get called back to say there is more data, we increment our count (ignoring the data) and request some more. When we are told that there’s no more data, we can return the count.

With that said, here’s the interface for what I’ve called IPushEnumerator. The name is open for change – I’ve been through a few options, and I’m still not comfortable with it. Please feel free to suggest another one! Note that there isn’t an IPushEnumerable – again, I started off with one, but found it didn’t make sense. Maybe someone smarter than me will come up with a way of it fitting.

IPushEnumerator

/// <summary>
/// An enumerator which works in an inverse manner – the consumer requests
/// to be notified when a value is available, and then the enumerator
/// will call back to the consumer when the producer makes some data
/// available or indicates that all the data has been produced.
/// </summary>
public interface IPushEnumerator<T>
{
    /// <summary>
    /// Fetch the current value. Not valid until
    /// a callback with a True argument has occurred,
    /// or after a callback 
    /// </summary>
    T Current { get; }

    /// <summary>
    /// Requests notification (via the specified callback) when more
    /// data is available or when the end of the data has been reached.
    /// The argument passed to the callback indicates which of these
    /// conditions has been met – logically the result of MoveNext
    /// on a normal IEnumerator.
    /// </summary>
    void BeginMoveNext(Action<bool> callback);
}

That bit is relatively easy to understand. I can ask to be called back when there’s data, and typically I’ll fetch the data within the callback and ask for more.

So far, so good. But what’s going to create these in the first place? How do we interface with LINQ? Time for an extension method.

Enumerable.GroupWithPush

I wanted to create an extension to IEnumerable<T> which had a “LINQ feel” to it. It should be quite like GroupBy, but then allow the processing of the subsequences to be expressed in a LINQ-like way. (Actual C# query expressions aren’t terribly useful in practice because there isn’t specific syntax for the kind of operators which turn out to be useful with this approach.) We’ll want to have type parameters for the original sequence (TElement), the key used for grouping (TKey) and the results of whatever processing is performed on each sequence (TResult).

So, the first parameter of our extension method is going to be an IEnumerable<TElement>. We’ll use a Func<TElement,TKey> to map source elements to keys. We could optionally allow an IEqualityComparer<TKey> too – but I’m certainly not planning on supporting as many overloads as Enumerable.GroupBy does. The final parameter, however, needs to be something to process the subsequence. The first thought would be Func<IPushEnumerator<TElement>,TResult> – until you start trying to implement the extension method or indeed the delegate doing the processing.

You see, given an IPushEnumerator<TElement> you really don’t want to return a result. Not just yet. After all, you don’t have the data yet, just a way of being given the data. What you want to return is the means of the caller obtaining the result after all the data has been provided. This is where we need to introduce a Future<T>.

Future<T>

If you don’t know about the idea of a future, it’s basically an IOU for a result. In proper threading libraries, futures allow the user to find out whether a computation has completed or not, wait for the result etc. My implementation of Future<T> is not that smart. It’s not smart at all. Here it is:

/// <summary>
/// Poor-man’s version of a Future. This wraps a result which *will* be
/// available in the future. It’s up to the caller/provider to make sure
/// that the value has been specified by the time it’s requested.
/// </summary>
public class Future<T>
{
    T value;
    bool valueSet = false;

    public T Value 
    {
        get
        {
            if (!valueSet)
            {
                throw new InvalidOperationException(“No value has been set yet”);
            }
            return value;
        }
        set
        {
            valueSet = true;
            this.value = value;
        }
    }
}

With this in place, we can reveal the actual signature of GroupWithPush:

public static IEnumerable<KeyValuePair<TKey, TResult>> GroupWithPush<TElement, TKey, TResult>
    (this IEnumerable<TElement> source,
     Func<TElement, TKey> mapping,
     Func<IPushEnumerator<TElement>, Future<TResult>> pipeline)

I shall leave you to mull over that – I don’t know about you, but signatures of generic methods always take me a little while to decode.

The plan is to then implement extension methods on IPushEnumerator<T> so that we can write code like this:

var query = logEntryReader.GroupWithCount(entry => entry.Customer,
                                          sequence => sequence.Count());

foreach (var result in query)
{
    Console.WriteLine (“Customer {0}: {1} entries”,
                       result.Key,
                       result.Value);
}

Okay, so how do we implement these operators? Let’s give an example – Count being pretty a simple case.

Implementing Count()

Let’s start off by looking at a possible Count implementation for a normal sequence, to act as a sort of model for the implementation in the weird and wacky land of futures and push enumerators:

public static int Count<T>(IEnumerable<T> source)
{
    int count = 0;
    foreach (T item in source)
    {
        count++;
    }
    return count;
}

Now, we’ve got two problems. Firstly, we’re not going to return the count – we’re going to return a Future. Secondly, we certainly can’t use foreach on an IPushEnumerator<T> – the whole point is to avoid blocking while we wait for data. However, the concept of “for each element in a sequence” is useful – so let’s see whether we can do something similar with another extension method, then come back and use it in Count.

Implementing ForEach()

Warning: this code hurts my head, and I wrote it. Even the idea of it hurts my head a bit. The plan is to implement a ForEach method which takes two delegates – one which is called for each item in the enumerator, and one which is called after all the data has been processed. It will return without blocking, but it will call BeginMoveNext first, using a delegate of its own. That delegate will be called when data is provided, and it will in turn call the delegates passed in as parameters, before calling BeginMoveNext again, etc.

Ready?

public static void ForEach<T>(this IPushEnumerator<T> source, 
                              Action<T> iteration,
                              Action completion)
{
    Action<bool> moveNextCallback = null;
            
    moveNextCallback = dataAvailable =>
         {
             if (dataAvailable)
             {
                 iteration(source.Current);
                 source.BeginMoveNext(moveNextCallback);
             }
             else
             {
                 completion();
             }
         };

    source.BeginMoveNext(moveNextCallback);
}

What I find particularly disturbing is that moveNextCallback is self-referential – it calls BeginMoveNext passing itself a the parameter. (Interestingly, you still need to assign it to null first, otherwise the compiler complains that it might be used without being assigned. I seem to remember reading a blog post about this before now, and thinking that I’d never ever run into such a situation. Hmm.)

Nasty as ForEach is in terms of implementation, it’s not too bad to use.

Implementing Count() – the actual code

The translation of the original Count is now relatively straightforward. We prepare the Future wrapper for the result, and indicate that we want to iterate through all the entries, counting them and then setting the result value when we’ve finished (which will be long after the method first returns, don’t forget).

public static Future<int> Count<T>(this IPushEnumerator<T> source)
{
    Future<int> ret = new Future<int>();
    int count = 0;

    source.ForEach(t => count++, 
                   () => ret.Value = count);
    return ret;
}

We’re nearly there now. All we need to do is complete the original GroupWithPush method:

Implementing GroupWithPush

There are three phases to GroupWithPush, as mentioned before: pushing the data to the consumers (creating those consumers as required based on the keys we see); telling all the consumers that we’ve finished; retrieving the results. It’s probably easiest just to show the code – it’s actually not too hard to understand.

public static IEnumerable<KeyValuePair<TKey, TResult>> GroupWithPush<TElement, TKey, TResult>
    (this IEnumerable<TElement> source,
     Func<TElement, TKey> mapping,
     Func<IPushEnumerator<TElement>, Future<TResult>> pipeline)
{
    var enumerators = new Dictionary<TKey, SingleSlotPushEnumerator<TElement>>();
    var results = new Dictionary<TKey, Future<TResult>>();
    // Group the data, pushing it to the enumerators at the same time.
    foreach (TElement element in source)
    {
        TKey key = mapping(element);
        SingleSlotPushEnumerator<TElement> push;
        if (!enumerators.TryGetValue(key, out push))
        {
            push = new SingleSlotPushEnumerator<TElement>();
            results[key] = pipeline(push);
            enumerators[key] = push;
        }
        push.Push(element);
    }
    // Indicate to all the enumerators that we’ve finished
    foreach (SingleSlotPushEnumerator<TElement> push in enumerators.Values)
    {
        push.End();
    }
    // Collect the results, converting Future<T> into T for each one.
    foreach (var result in results)
    {
        yield return new KeyValuePair<TKey, TResult>(result.Key, result.Value.Value);
    }
}

I haven’t introduced SingleSlotPushEnumerator before, but as you can imagine, it’s an implementation of IPushEnumerator, with Push() and End() methods to provide data or indicate the end of the data stream. It’s not terribly interesting to see, in my view.

Conclusion

So, that’s what I’ve been looking at and thinking about for the last few evenings. I’ve implemented quite a few of the standard query operators, although not all of them are worth doing. I’m not currently viewing this as anything more than an interesting exercise, partly in terms of seeing how far I can push the language, but if anyone thinks it’s worth pursuing further (e.g. as a complete implementation as far as sensibly possible, either in MiscUtil or on SourceForge) I’d be very happy to hear your ideas. Frankly, I’d be glad and slightly surprised just to find out that anyone made it this far.

Oh, exercise for the reader – draw out a sequence diagram of how all this behaves :)

Wacky Ideas 3: Object life-cycle support

No, don’t leave yet! This isn’t another article about non-deterministic finalization, RAII etc. That’s what we almost always think of when someone mentions the object life-cycle, but I’m actually interested in the other end of the cycle – the “near birth” end.

We often take it as read that when an object’s constructor has completed successfully, the object should be ready to use. However, frameworks and technologies like Spring and XAML often make it easier to create an object and then populate it with dependencies, configuration etc. Yes, in some cases it’s more appropriate to have a separate configuration class which is used for nothing but a bunch of properties, and then the configuration can be passed into the “real” constructor in one go, with none of the readability problems of constructors taking loads of parameters. It’s all a bit unsatisfactory though.

What we most naturally want is to say, “Create me an empty X. Now configure it. Now use it.” (Okay, and as an obligatory mention, potentially “Now make it clean up after itself.”)

While configuring the object, we don’t want to call any of the “real” methods which are likely to want to do things. We may want to be able to fetch some of the configuration back again, e.g. so that some values can be relative to others easily, but we don’t want the main business to take place. Likewise, when we’ve finished configuring the object, we generally want to validate the configuration, and after that we don’t want anyone to be able to change the configuration. Sometimes there’s even a third phase, where we’ve cleaned up and want to still be able to get some calculated results (the byte array backing a MemoryStream, for instance) but not call any of the “main” methods any more.

I’d really like some platform support for this. None of it’s actually that hard to do – just a case of keeping track of which phase you’re in, and then adding a check to the start of each method. Wouldn’t it be nicer to have it available as attributes though? Specify the “default phase” for any undecorated members, and specify which phases are valid for other members – so configuration setters would only be valid in the configuration phase, for instance. Another attribute could dictate the phase transition – so the ValidateAndInitialize method (or whatever you’d call it) would have an attribute stating that on successful completion (no exceptions thrown) the phase would move from “configure” to “use”.

Here’s a short code sample. The names and uses of the attributes could no doubt be improved, and if there were only a few phases which were actually useful, they could be named in an enum instead, which would be neat.

[Phased(defaultRequirement=2, initial=1)]
class Sample
{
    IAuthenticator authenticator;
    
    public IAuthenticator Authenticator
    {
        [Phase(1)]
        [Phase(2)]
        get
        {
            return authenticator;
        }
        [Phase(1)]
        set
        {
            authenticator = value;
        }
    }
    
    [Phase(1)]
    [PhaseTransition(2)]
    public void ValidateAndInitialize()
    {
        if (authenticator==null)
        {
            throw new InvalidConfigurationException("I need an authenticator");
        }
    }
    
    public void DoSomething()
    {
        // Use authenticator, assuming it's valid
    }
    
    public void DoSomethingElse()
    {
        // Use authenticator, assuming it's valid
    }
}

Hopefully it’s obvious what you could and couldn’t do at what point.

This looks to me like a clear example of where AOP should get involved. I believe that Anders isn’t particularly keen on it, and when abused it’s clearly nightmarish – but for certain comment things, it just makes life easier. The declarative nature of the above is simpler to read (IMO – particularly if names were used instead of numbers) than manually checking the state at the start of each method. I don’t know if any AOP support is on the slate for Java 7 – I believe things have been made easier for AOP frameworks by Java 6, although I doubt that any target just Java 6 yet. We shall have to see.

One interesting question is whether you’d unit test that all the attributes were there appropriately. I guess it depends on the nature of the project, and just how thoroughly you want to unit test. It wouldn’t add any coverage, and would be hard to exhaustively test in real life, but the tests would be proving something…

Wacky Ideas 2: Class interfaces

(Disclaimer: I’m 99% sure I’ve heard someone smarter than me talking about this before, so it’s definitely not original. I thought it worth pursuing though.)

One of the things I love about Java and C# over C/C++ is the lack of .h files. Getting everything in the right place, only doing the right things in the right files, and coping with bits being included twice etc is a complete pain, particularly if you only do it every so often rather than it being part of your everyday life.

Unfortunately, as I’ve become more interface-based, I’ve often found myself doing effectively the same thing. Java and C# make life a lot easier than C in this respect, of course, but it still means duplicating the method signatures etc. Often there’s only one implementation of the interface – or at least one initial implementation – but separating it out as an interface gives a warm fuzzy feeling and makes stubbing/mocking easier for testing.

So, the basic idea here is to extract an interface from a class definition. In the most basic form:

class interface Sample
{
    public void ThisIsPartOfTheInterface()
    {
    }
    
    public void SoIsThis()
    {
    }
    
    protected void NotIncluded()
    {
    }
    
    private void AlsoNotIncluded()
    {
    }
}

So the interface Sample just has ThisIsPartOfTheInterface and SoIsThis even though the class Sample has the extra methods.

Now, I can see a lot of cases where you would only want part of the public API of the class to contribute to the interface – particularly if you’ve got properties etc which are meant to be used from an Inversion of Control framework. This could either be done with cunning keyword use, or (to make fewer syntax changes) a new attribute could be introduced which could decorate each member you wanted to exclude (or possibly include, if you could make the default “exclude” with a class-level attribute).

So far, so good – but now we’ve got two types with the same name. What happens when the compiler runs across one of the types? Well, here’s the list of uses I can think of, and what they should do:

  • Variable declaration: Use the interface
  • Construction: Sse the class
  • Array declaration/construction: Use the interface (I think)
  • typeof: Tricky. Not sure. (Note that in Java, we could use Sample.class and Sample.interface to differentiate.)
  • Type derivation: Not sure. Possibly make it explicit: “DerivedSample : class Sample” or “DerivedSample : interface Sample
  • Generics: I think this would depend on the earlier “not sure” answers, and would almost certainly be complicated

As an example, the line of code “Sample x = new Sample();” would declare a variable x of the interface type, but create an instance of the concrete class to be its initial value.

So, it’s not exactly straightforward. It would also violate .NET naming conventions. Would it be worth it, over just using an “Extract Interface” refactoring? My gut feeling is that there’s something genuinely useful in here, but the complications do seem to overwhelm the advantages.

Perhaps the answer is not to try to have two types with the same name (which is where the complications arise) but to be able to explicitly say “I’m declaring interface ISample and implementing it in Sample” both within the same file. At that point it may be unintuitive to get to the declaration of ISample, and seeing just the members of it isn’t straightforward either.

Is this a case where repeating yourself is fundamentally necessary, or is there yet another way of getting round things that I’m missing?