Reimplementing LINQ to Objects: Part 10 – Any and All

Another day, another blog post. I should emphasize that this rate of posting is likely to be short-lived… although if I get into the habit of writing a post on the morning commute when I go back to work after the Christmas holidays, I could keep ploughing through until we’re done.

Anyway, today we have a pair of operators: Any and All.

What are they?

"Any" has two overloads; there’s only one for "All":

public static bool Any<TSource>(
    this IEnumerable<TSource> source)

public static bool Any<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)

public static bool All<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)

The names really are fairly self-explanatory:

"Any" without a predicate returns whether there are any elements in the input sequence
"Any" with a predicate returns whether any elements in the input sequence match the predicate
"All" returns whether all the elements in the input sequence match the given predicate

Both operators use immediate execution – they don’t return until they’ve got the answer, basically.
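As a quick illustration of the three behaviours just listed, here is a minimal sketch using the framework's own System.Linq operators. The class and method names (AnyAllBasics, Demo) are made up for this example and are not part of the post's code.

```csharp
using System;
using System.Linq;

public static class AnyAllBasics
{
    // Hypothetical helper: returns the results of a few Any/All calls
    // so the behaviour can be inspected in one place.
    public static bool[] Demo()
    {
        int[] empty = { };
        int[] numbers = { 1, 2, 3 };

        return new[]
        {
            empty.Any(),              // false: there are no elements
            numbers.Any(),            // true: there is at least one element
            numbers.Any(x => x > 2),  // true: 3 matches the predicate
            numbers.All(x => x > 2),  // false: 1 does not match
            empty.All(x => x > 2)     // true: no element fails the predicate
        };
    }
}
```

Each call returns a plain bool straight away, which is what immediate execution means here: there is no deferred query object to iterate later.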

Importantly, "All" has to read through the entire input sequence to return true, but can return as soon as it’s found a non-matching element; "Any" can return true as soon as it’s found a matching element, but has to iterate over the entire input sequence in order to return false. This gives rise to one very simple LINQ performance tip: it’s almost never a good idea to use a query like

// Don’t use this
if (query.Count() != 0)

That has to iterate over *all* the results in the query… when you really only care whether or not there are any results. So use "Any" instead:

// Use this instead
if (query.Any())

If this is part of a bigger LINQ to SQL query, it may not make a difference – but in LINQ to Objects it can certainly be a huge boon.
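To make the difference concrete, the sketch below counts how many elements are actually pulled from a lazily generated sequence by each approach. It uses the framework's Count and Any; CountVersusAny, CountingRange and the two ElementsReadBy methods are hypothetical names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountVersusAny
{
    // Hypothetical helper: yields 1..count and calls onYield
    // each time an element is about to be produced.
    public static IEnumerable<int> CountingRange(int count, Action onYield)
    {
        for (int i = 1; i <= count; i++)
        {
            onYield();
            yield return i;
        }
    }

    // Number of elements produced while evaluating Count() != 0.
    public static int ElementsReadByCount(int size)
    {
        int reads = 0;
        bool nonEmpty = CountingRange(size, () => reads++).Count() != 0;
        return reads;
    }

    // Number of elements produced while evaluating Any().
    public static int ElementsReadByAny(int size)
    {
        int reads = 0;
        bool nonEmpty = CountingRange(size, () => reads++).Any();
        return reads;
    }
}
```

With a sequence of 1000 elements, the Count version produces all 1000 before answering, whereas the Any version stops after the first.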

Anyway, let’s get on to testing the three methods…

What are we going to test?

Feeling virtuous tonight, I’ve even tested argument validation again… although it’s easy to get that right here, as we’re using immediate execution.

Beyond that, I’ve tested a few scenarios:

An empty sequence will return false with Any, but true with All. (Whatever the predicate is for All, there are no elements which fail it.)
A sequence with any elements at all will make the predicate-less Any return true.
If none of the elements match the predicate, both Any and All return false (for a non-empty sequence).
If some elements match the predicate, Any will return true but All will return false.
If all elements match the predicate, All will return true.

Those are all straightforward, so I won’t give the code. One final test is interesting though: we prove that Any returns as soon as it’s got the result by giving it a query which will throw an exception if it’s iterated over completely. The easiest way of doing this is to start out with a sequence of integers including 0, and then use Select with a projection which divides some constant value by each element. In this test case, I’ve given it a value which will match the predicate before the value which will cause the exception to be thrown:

[Test]
public void SequenceIsNotEvaluatedAfterFirstMatch()
{
    int[] src = { 10, 2, 0, 3 };
    var query = src.Select(x => 10 / x);
    // This will finish at the second element (x = 2, so 10/x = 5)
    // It won’t evaluate 10/0, which would throw an exception
    Assert.IsTrue(query.Any(y => y > 2));
}

There’s an equivalent test for All, where a non-matching element occurs before the exceptional one.
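A sketch of what that equivalent check might look like is below. It is written as a plain method rather than an NUnit test so that it stands alone, and it runs against the framework's All; the names and the input values are assumptions for illustration, not the post's actual test.

```csharp
using System;
using System.Linq;

public static class AllEarlyOutSketch
{
    // Hypothetical check: returns the result of All on a query that would
    // throw DivideByZeroException if it were iterated to the third element.
    public static bool AllStopsAtFirstNonMatch()
    {
        int[] src = { 2, 10, 0, 3 };
        var query = src.Select(x => 10 / x);
        // 10/2 = 5 matches y > 2; 10/10 = 1 does not, so All returns false there.
        // It never evaluates 10/0, which would throw an exception.
        return query.All(y => y > 2);
    }
}
```

In an NUnit test the body would end with Assert.IsFalse(query.All(y => y > 2)) instead of returning the value.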

So, with all the tests written, let’s get on with the interesting bit:

Let’s implement them!

The first thing to note is that all of these could be implemented in terms of either Any-with-a-predicate or All. For example, given All, we could implement Any as:

public static bool Any<TSource>(
    this IEnumerable<TSource> source)
{
    return source.Any(x => true);
}

public static bool Any<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    if (predicate == null)
    {
        throw new ArgumentNullException("predicate");
    }
    return !source.All(x => !predicate(x));
}

It’s simplest to implement the predicate-less Any in terms of the predicated one – using a predicate which returns true for any element means that Any will return true for any element at all, which is what we want.

The inversions in the call to All take a minute to get your head round, but it’s basically De Morgan’s law in LINQ form: we effectively invert the predicate to find out if all of the elements don’t match the original predicate… then return the inverse. Due to the inversion, this still returns early in all the appropriate situations, too.
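The same trick works in the other direction. The following is a minimal sketch of All expressed through Any-with-a-predicate; it delegates to the framework's Any, and the name AllViaAny is hypothetical, chosen only to avoid clashing with the real All.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllViaAnySketch
{
    // Hypothetical name: All implemented in terms of the predicated Any.
    public static bool AllViaAny<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, bool> predicate)
    {
        if (predicate == null)
        {
            throw new ArgumentNullException("predicate");
        }
        // "Every element matches" is the same as
        // "there isn't any element which fails to match".
        // A null source is rejected by the call to Any.
        return !source.Any(x => !predicate(x));
    }
}
```

Any returns true as soon as it finds an element failing the original predicate, so this version still stops at the first non-matching element, and an empty sequence still gives true.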

While we could do that, I’ve actually preferred a straightforward implementation of all of the separate methods:

public static bool Any<TSource>(
    this IEnumerable<TSource> source)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }

    using (IEnumerator<TSource> iterator = source.GetEnumerator())
    {
        return iterator.MoveNext();
    }
}

public static bool Any<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (predicate == null)
    {
        throw new ArgumentNullException("predicate");
    }

    foreach (TSource item in source)
    {
        if (predicate(item))
        {
            return true;
        }
    }
    return false;
}

public static bool All<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (predicate == null)
    {
        throw new ArgumentNullException("predicate");
    }

    foreach (TSource item in source)
    {
        if (!predicate(item))
        {
            return false;
        }
    }
    return true;
}

Aside from anything else, this makes it obvious where the "early out" comes in each case – and also means that any stack traces generated are rather easier to understand. It would be quite odd from a client developer’s point of view to call Any but see All in the stack trace, or vice versa.

One interesting point to note is that I don’t actually use a foreach loop in the predicate-less Any – although I could, of course. Instead, I just get the iterator and return whether the very first call to MoveNext indicates that there are any elements. I like the fact that, reading this method, it’s obvious (at least to me) that we really couldn’t care less what the value of the first element is – because we never ask for it.
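The difference is observable, at least in a contrived case. The sketch below uses a made-up sequence type whose enumerator throws whenever Current is read, and compares a MoveNext-only check with a foreach-based one. All the type and method names here are hypothetical, and HasAnyElement simply mirrors the shape of the predicate-less Any above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sequence: it has one element, but reading that element throws.
public sealed class UnreadableSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator() { return new Enumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    private sealed class Enumerator : IEnumerator<int>
    {
        private bool moved;

        public bool MoveNext()
        {
            if (moved) { return false; }
            moved = true;
            return true;
        }

        public int Current
        {
            get { throw new InvalidOperationException("Current was read"); }
        }

        object IEnumerator.Current { get { return Current; } }
        public void Reset() { throw new NotSupportedException(); }
        public void Dispose() { }
    }
}

public static class MoveNextOnlySketch
{
    // Same shape as the predicate-less Any above: only MoveNext is called.
    public static bool HasAnyElement<TSource>(IEnumerable<TSource> source)
    {
        if (source == null) { throw new ArgumentNullException("source"); }
        using (IEnumerator<TSource> iterator = source.GetEnumerator())
        {
            return iterator.MoveNext();
        }
    }

    // foreach assigns Current to the loop variable before the body runs.
    public static bool HasAnyElementViaForeach<TSource>(IEnumerable<TSource> source)
    {
        if (source == null) { throw new ArgumentNullException("source"); }
        foreach (TSource item in source)
        {
            return true;
        }
        return false;
    }
}
```

Against UnreadableSequence, the MoveNext-only version returns true, while the foreach version surfaces the exception from Current.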

Conclusion

Probably the most important lesson here is the advice to use Any (without a predicate) instead of Count when you can. The rest was pretty simple – although it’s always fun to see one operator implemented in terms of another.

So, what next? Possibly Single/SingleOrDefault/First/FirstOrDefault/Last/LastOrDefault. I might as well do them all together – partly as they’re so similar, and partly to emphasize the differences which do exist.

12 thoughts on “Reimplementing LINQ to Objects: Part 10 – Any and All”

Anders Borum says:

December 28, 2010 at 3:40 pm

Hi Jon,
another fun article – thanks for the series so far. You should take a look at the following code comment and correct the last part :)

// This will finish at the second element (x = 2, so 10/3 = 5)


Lasse Vågsæther Karlsen says:

December 28, 2010 at 3:47 pm

Note that 10/3 is never 5, except for very low values of 3.

Love the series, keep ’em coming :)


skeet says:

December 28, 2010 at 3:53 pm

@Anders: Thanks, fixed.


Jesper says:

December 28, 2010 at 6:50 pm

Good to know that you’re willing to implement Any and All LINQ operators.

(Boo, hiss.)


simon weijgers says:

December 29, 2010 at 1:41 am

In Any (without predicate) you only call MoveNext. This is fine, but I guess the Current property of the iterator could throw an exception if evaluated. I guess what I’m saying is that if the standard .NET version of Any does look at the Current property (perhaps implicitly by doing a foreach (item in source) {return true}) and yours doesn’t, then they behave differently in this admittedly rather contrived edge case ;)


skeet says:

December 29, 2010 at 2:37 am

@simon: Yes, I wondered about that myself. It’s not actually specified either way… but it’s an interesting edge case.


[ICR] says:

December 29, 2010 at 5:07 pm

I would prefer to write the test as follows:

var list = new List<Func<bool>>() {
    () => { return false; },
    () => { return true; },
    () => { throw new Exception(); }
};
Assert.IsTrue(list.Any(x => x()));

I feel it makes it a bit more explicit than the implicit DivideByZeroException and 10/10 > 2.


Jacob says:

December 30, 2010 at 11:56 am

One interesting test for All would be to pass an obviously false predicate on an empty enumeration:

Assert.IsTrue(list.All(p=>false));


hwiechers says:

January 20, 2011 at 1:30 am

“It’s simplest to implement the predicate-less Any in terms of the predicated one – using a predicate which returns true for any element means that Any will return true for any element at all, which is what we want.”

In LINQ to Objects, Any() and Any(x => true) do the same thing as long as the underlying Enumerator.Current doesn’t have side-effects or raise exceptions. This is because the predicate-less Any never actually calls Enumerator.Current. (Just like yours!)

You can’t actually implement one in terms of the other if you want to conform.


Rob Siklos says:

September 25, 2014 at 4:21 pm

For the predicate-less Any, do you think there’s value in first checking if source is an ICollection, and then just checking if Count > 0? Doing this avoids the need for instantiating the enumerator.


jonskeet says:

September 25, 2014 at 4:44 pm

Yes, that would make sense. I’m not sure why the .NET implementation doesn’t do this. (It appears not to, anyway – it would be possible to check, of course.)
  
  
rob263 says:

September 25, 2014 at 5:12 pm

Yes, I checked, and the .NET implementation doesn’t do it either. The only reason I could think of is that maybe some ICollection implementations have an expensive Count function, and they wanted to play it safe. However, I don’t think I’ve ever run across any real code which has an expensive ICollection.Count implementation.
    

Jon Skeet's coding blog
