This will be a short post, and there’ll probably be some more short ones coming up too. I think it makes sense to only cover multiple operators in a single post where they’re really similar. (Count and LongCount spring to mind.) I’m in your hands though – if you would prefer "chunkier" posts, please say so in the comments.
This post will deal with the Range generation operator.
What is it?
Range only has a single signature:
Unlike most of LINQ, this isn’t an extension method – it’s a plain old static method. It returns an iterable object which will yield "count" integers, starting from "start" and incrementing each time – so a call to Enumerable.Range(6, 3) would yield 6, then 7, then 8.
As it doesn’t operate on an input sequence, there’s no sense in which it could stream or buffer its input, but:
- The arguments need to be validated eagerly; the count can’t be negative, and it can’t be such that any element of the range could overflow Int32.
- The values will be yielded lazily – Range should be cheap, rather than creating (say) an array of "count" elements and returning that.
How are we going to test it?
Testing a plain static method brings us a new challenge in terms of switching between the "normal" LINQ implementation and the Edulinq one. This is an artefact of the namespaces I’m using – the tests are in Edulinq.Tests, and the implementation is in Edulinq. "Parent" namespaces are always considered when the compiler tries to find a type, and they take priority over anything in using directives – even a using directive which tries to explicitly alias a type name.
The (slightly ugly) solution to this that I’ve chosen is to include a using directive to create an alias which couldn’t otherwise be resolved – in this case, RangeClass. The using directive will either alias RangeClass to System.Linq.Enumerable or Edulinq.Enumerable. The tests then all use RangeClass.Range. I’ve also changed how I’m switching between implementations – I now have two project configurations, one of which defines the NORMAL_LINQ preprocessor symbol, and the other of which doesn’t. The RangeTest class therefore contains:
using RangeClass = System.Linq.Enumerable;
using RangeClass = Edulinq.Enumerable;
There are alternatives to this approach, of course:
- I could move the tests to a different namespace
- I could make the project references depend on the configuration… so the "Normal LINQ" configuration wouldn’t reference the Edulinq implementation project, and the "Edulinq implementation" configuration wouldn’t reference System.Core. I could then just use Enumerable.Range with an appropriate using directive for System.Linq conditional on the NORMAL_LINQ preprocessor directive, as per the other tests.
I like the idea of the second approach, but it means manually tinkering with the test project file – Visual Studio doesn’t expose any way of conditionally including a reference. I may do this at a later date… thoughts welcome.
What are we going to test?
There isn’t much we can really test for ranges – I only have eight tests, none of which are particularly exciting:
- A simple valid range should look right when tested with AssertSequenceEqual
- The start value should be allowed to be negative
- Range(Int32.MinValue, 0) is an empty range
- Range(Int32.MaxValue, 1) yields just Int32.MaxValue
- The count can’t be negative
- The count can be zero
- start+count-1 can’t exceed Int32.MaxValue (so Range(Int32.MaxValue, 2) isn’t valid)
- start+count-1 can be Int32.MaxValue (so Range(Int32.MaxValue, 1) is valid)
The last two are tested with a few different examples each – a large start and a small count, a small start and a large count, and "fairly large" values for both start and count.
Note that I don’t have any tests for lazy evaluation – while I could test that the returned value doesn’t implement any of the other collection interfaces, it would be a little odd to do so. On the other hand, we do have tests which have an enormous count – such that anything which really tried to allocate a collection of that size would almost certainly fail…
Let’s implement it!
It will surely be no surprise by now that we’re going to use a split implementation, with a public method which performs argument validation eagerly and then uses a private method with an iterator block to perform the actual iteration.
Having validated the arguments, we know that we’ll never overflow the bounds of Int32, so we can be pretty casual in the main part of the implementation.
if (count < 0)
throw new ArgumentOutOfRangeException("count");
// Convert everything to long to avoid overflows. There are other ways of checking
// for overflow, but this way make the code correct in the most obvious way.
if ((long)start + (long)count – 1L > int.MaxValue)
throw new ArgumentOutOfRangeException("count");
return RangeImpl(start, count);
private static IEnumerable<int> RangeImpl(int start, int count)
for (int i = 0; i < count; i++)
yield return start + i;
Just a few points to note here:
- Arguably it’s the combination of "start" and "count" which is invalid in the second check, rather than just count. It would possibly be nice to allow ArgumentOutOfRangeException (or ArgumentException in general) to blame multiple arguments rather than just one. However, using "count" here matches the framework implementation.
- There are other ways of performing the second check, and I certainly didn’t have to make all the operands in the expression longs. However, I think this is the simplest code which is clearly correct based on the documentation. I don’t need to think about all kinds of different situations and check that they all work. The arithmetic will clearly be valid when using the Int64 range of values, so I don’t need to worry about overflow, and I don’t need to consider whether to use a checked or unchecked context.
- There are also other ways of looping in the private iterator block method, but I think this is the simplest. Another obvious and easy alternative is to keep two values, one for the count of yielded values and the other for the next value to yield, and increment them both on each iteration. A more complex approach would be to use just one loop variable – but you can’t use "value < start + count" in case the final value is exactly Int32.MaxValue, and you can’t use "value <= start + count – 1" in case the arguments are (int.MinValue, 0). Rather than consider all the border cases, I’ve gone for an obviously-correct solution. If you really, really cared about the performance of Range, you’d want to investigate various other options.
Prior to writing up this post, I didn’t have good tests for Range(Int32.MaxValue, 1) and Range(Int32.MinValue, 0)… but as they could easily go wrong as mentioned above, I’ve now included them. I find it interesting how considering alternative implementations suggests extra tests.
"Range" was a useful method to implement in order to test some other operators – "Count" in particular. Now that I’ve started on the non-extension methods though, I might as well do the other two (Empty and Repeat). I’ve already implemented "Empty", and will hopefully be able to write it up today. "Repeat" shouldn’t take much longer, and then we can move on to "Count" and "LongCount".
I think this code is a good example of situations where it’s worth writing "dumb" code which looks like the documentation, rather than trying to write possibly shorter, possibly slightly more efficient code which is harder to think about. No doubt there’ll be more of that in later posts…