A thread came up yesterday on the C# newsgroup about when to use the “normal” foreach
and when to use List<T>.ForEach
(assuming, of course, that one is dealing with a
List<T> in the first place). Obviously there are readability issues, but we ended up focusing
on performance. (Isn’t that telling in its own right? How often is the iteration part rather than
the body going to dominate and be a significant bottleneck? Anyway, I digress.)
So, I wrote a small benchmark, and Patrick asked me to blog about it. I’ve refactored the test I posted on the newsgroup and added a couple more tests as suggested by Willy Denoyette. The source code is a little bit unwieldy (and frankly tedious) to include in this blog post – download it if you’re interested.
The test basically creates a list of strings, each being “x”. Each test case iterates through the
list a fixed number of times, keeping a running total of the lengths of strings it sees. The result
is checked and the time taken is reported. This is what the individual tests do:
LanguageForEach
just usesforeach (string x in list)
in the obvious way.NewDelegateEachTime
uses an anonymous method as the parameter toList.ForEach<T>
, where that method captures a different variable each “outer” iteration. That means a new delegate has to be created each time.CachedDelegate
creates a single delegate and uses that for all calls toList<T>.ForEach
.LanguageForEachWithCopy1
copies the list to an array each “outer” iteration, and then usesforeach
over that array.LanguageForEachWithCopy2
copies the list to an array once at the start of the test, and then usesforeach
over that array.
Here are the results, with a few different test cases (all doing the same amount of work overall). I shall attempt to tabulate them a bit better when I get some time :)
Test parameters: Size=10000000; Iterations=100 Test 00:00:11.8251914: LanguageForEach Test 00:00:05.3463387: NewDelegateEachTime Test 00:00:05.3238162: CachedDelegate Test 00:00:22.1342570: LanguageForEachWithCopy1 Test 00:00:03.7493164: LanguageForEachWithCopy2 |
Test parameters: Size=1000000; Iterations=1000 Test 00:00:11.8163135: LanguageForEach Test 00:00:05.3392333: NewDelegateEachTime Test 00:00:05.3334596: CachedDelegate Test 00:00:26.9471681: LanguageForEachWithCopy1 Test 00:00:03.5251209: LanguageForEachWithCopy2 |
Test parameters: Size=100000; Iterations=10000 Test 00:00:11.6576344: LanguageForEach Test 00:00:05.2225531: NewDelegateEachTime Test 00:00:05.2066938: CachedDelegate Test 00:00:16.2563401: LanguageForEachWithCopy1 Test 00:00:03.0949064: LanguageForEachWithCopy2 |
Test parameters: Size=100; Iterations=10000000 Test 00:00:12.2547105: LanguageForEach Test 00:00:04.9791093: NewDelegateEachTime Test 00:00:04.6191521: CachedDelegate Test 00:00:06.0731525: LanguageForEachWithCopy1 Test 00:00:02.8182444: LanguageForEachWithCopy2 |
The LanguageForEachWithCopy1
results surprised me, as I’d really expected the
performance to go up as the number of iterations went up. It seems it’s cheaper to copy
a short list many times than a long list a few times…
Well, I am really suprised that there is such a huge difference between LanguageForEach and NewDelegateEachTime. What is more using delegate seems to be much faster. Do you see any explanation?
LikeLike
I believe that the explaination for the ForEach() being faster is that the former doesn’t do as much checking at each iteration and only results in one method call, whereas foreach checks for stuff like concurrent modifcation and out of bounds and has to call the Enumerator’s MoveNext() and get_Current() at each iteration.
LikeLike
Great post, John.
I got curious, and so I added a test using a traditional for() loop.
On my machine, built in Release configuration, and running outside of the debugger, using a straight for() loop was faster than every method except for LanguageForEachWithCopy2, which was eerily fast.
Here’s the code:
static int LanguageFor(List list)
{
int sum = 0;
for (int i = 0; i < Iterations; i++)
{
for (int index = 0; index < list.Count; index++)
{
sum += list[index].Length;
}
}
return sum;
}
LikeLike
Why using the same value for every string in list?
I think it leads to some hidden CPU optimization.
Also it’s far from a real-world scenario.
If every string in the list is different we have a fairly different result:
Random randomGenerator = new Random(DateTime.Now.Millisecond);
for (int i = 0; i < Size; i++)
{
list.Add(randomGenerator.NextDouble().ToString());
}
Most of the differences are more leveled.
Test parameters: Size=100000; Iterations=10000
Test 00:00:18.5486500: LanguageForEach
Test 00:00:15.4419635: NewDelegateEachTime
Test 00:00:15.1783838: CachedDelegate
Test 00:00:43.2335005: LanguageForEachWithCopy1
Test 00:00:13.9626502: LanguageForEachWithCopy2
Test 00:00:13.3392666: LanguageFor
And it's not a matter of string length, since
for (int i = 0; i < Size; i++)
{
list.Add("0,38160862465464");
}
(same string lenght of my previous code) leads to the same results as your string with 1 char.
What do you think?
(I have tried to post it via google groups but something didn't work).
LikeLike
I don’t believe it’s a hidden optimisation – I suspect it’s the natural result of cache sizing. Willy Denoyette certainly saw very different results depending on whether the CPU’s L2 cache was able to hold everything or not.
I’m just guessing here though…
Jon
LikeLike
Yeah, that could be it.
A real pity we can’t upgrade it ;-)
BTW thanks for all your great contributions!
LikeLike
I think these test are fair enough.
But can anybody say is there any faster searching than Binary search in Generic List. Or any method in Array(which does not even feature binary search). Can anybody say???
LikeLike
I’ve found another interesting thing.
for (int i = 0; i < list.Count; i++)
{
…
}
is about 30% slower (I've tested it on Celeron M 1.6GHz 1MB L2) than
int count = list.Count;
for (int i = 0; i < count; i++)
{
…
}
LikeLike
The performance difference in the for loop is a given since the evaluation is performed in each iteration.
Still, the differences for the foreach performance is a bit interesting, though not too alarming. The more readable code is it surely is going to sacrifice some performance. (Otherwise we’d still be happily writing in mythical man-years with Turbo Assembler.)
Good to know when I need to perform operations against very large collections.
LikeLike
xsan, are you sure you were running with a Release build andnot from the IDE?
Sokak, actually, the evaluation is *not* done for every iteration. The JIT compiler is smarter than you might think and makes optimizations that can remove bounds checking and any computation necessary for the evaluation of the ‘Count’ or ‘Length’ property. You can actually *slow down* your code by trying to outsmart the JIT compiler by doing your own loop-hoisting.
LikeLike
My opinion:
1- If there’s a lot of computation being done inside the block, the choice might not be that relevant.
2- List.ForEach is still quite readable.
3- List.ForEach allows you to skip to the next iteration of the loop (continue) by doing a return from the delegate. On the other hand, it doesn’t allow you to break out of your loop easily. Perhaps you can throw and catch an exception.
David
LikeLike
Updating with results from .netcore3 seems like differences are a lot less visible now days:
*(changed delegates to new expression syntax () => {})
Test parameters: Size=1000; Iterations=1000000
Test 00:00:10.2126953: LanguageForEach
Test 00:00:08.1602443: NewDelegateEachTime
Test 00:00:07.3079221: CachedDelegate
Test 00:00:06.7648464: LanguageForEachWithCopy1
Test 00:00:04.8331680: LanguageForEachWithCopy2
Test parameters: Size=1000000; Iterations=1000
Test 00:00:09.6648334: LanguageForEach
Test 00:00:07.5927968: NewDelegateEachTime
Test 00:00:07.6961606: CachedDelegate
Test 00:00:09.2931712: LanguageForEachWithCopy1
Test 00:00:05.2325878: LanguageForEachWithCopy2
LikeLike