I’m currently reading the (generally excellent) CLR via C#, and I’ve recently hit the section on boxing. Why is it that authors feel they have to scaremonger about the effects boxing can have on performance?
Here’s a piece of code from the book:

    using System;

    public sealed class Program {
        public static void Main() {
            Int32 v = 5;
    #if INEFFICIENT
            Console.WriteLine("{0}, {1}, {2}", v, v, v);
    #else
            Object o = v;
            Console.WriteLine("{0}, {1}, {2}", o, o, o);
    #endif
        }
    }

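The difference between the two branches is how many times the integer gets boxed: passing `v` three times to a method taking `object` parameters creates three separate boxes, while boxing once into `o` creates one box that is reused. Here is a minimal sketch of that behaviour; the `BoxingDemo` class and `BoxesAreDistinct` method are names made up for this illustration and are not from the book.

```csharp
using System;

public static class BoxingDemo
{
    // Hypothetical helper: boxes the same value twice and reports
    // whether the two conversions produced different heap objects.
    public static bool BoxesAreDistinct(int value)
    {
        object first = value;   // boxing conversion #1
        object second = value;  // boxing conversion #2
        return !ReferenceEquals(first, second);
    }

    public static void Main()
    {
        int v = 5;

        // Each conversion from int to object allocates its own box.
        Console.WriteLine(BoxesAreDistinct(v));     // True

        // Boxing once and reusing the reference involves a single object.
        object o = v;
        Console.WriteLine(ReferenceEquals(o, o));   // True
    }
}
```

This only shows that the extra allocations exist; it says nothing about how much they cost relative to the rest of the statement.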
In the text afterwards, he reiterates the point:
This second version executes much faster and allocates less memory from the heap.
This seemed like an overstatement to me, so I thought I’d try it out. Here’s my test application:

    using System;
    using System.Diagnostics;

    public class Test
    {
        const int Iterations = 10000000;

        public static void Main()
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < Iterations; i++)
            {
    #if CONSOLE_WITH_BOXING
                Console.WriteLine("{0} {1} {2}", i, i, i);
    #elif CONSOLE_NO_BOXING
                object o = i;
                Console.WriteLine("{0} {1} {2}", o, o, o);
    #elif CONSOLE_STRINGS
                string s = i.ToString();
                Console.WriteLine("{0} {1} {2}", s, s, s);
    #elif FORMAT_WITH_BOXING
                string.Format("{0} {1} {2}", i, i, i);
    #elif FORMAT_NO_BOXING
                object o = i;
                string.Format("{0} {1} {2}", o, o, o);
    #elif FORMAT_STRINGS
                string s = i.ToString();
                string.Format("{0} {1} {2}", s, s, s);
    #elif CONCAT_WITH_BOXING
                string.Concat(i, " ", i, " ", i);
    #elif CONCAT_NO_BOXING
                object o = i;
                string.Concat(o, " ", o, " ", o);
    #elif CONCAT_STRINGS
                string s = i.ToString();
                string.Concat(s, " ", s, " ", s);
    #endif
            }
            sw.Stop();
            Console.Error.WriteLine("{0}ms", sw.ElapsedMilliseconds);
        }
    }

I compiled the code with one symbol defined each time, with optimisations and without debug information, and ran it from a command line, writing to nul (i.e. no disk or actual console activity). Here are the results:

| Symbol | Results (ms) | Average (ms) |
|---|---|---|
| CONSOLE_WITH_BOXING | 33054, 33898, 33381 | 33444 |
| CONSOLE_NO_BOXING | 34638, 32423, 33294 | 33451 |
| CONSOLE_STRINGS | 29259, 29071, 26683 | 28337 |
| FORMAT_WITH_BOXING | 17143, 18100, 16389 | 17210 |
| FORMAT_NO_BOXING | 15814, 15936, 15222 | 15657 |
| FORMAT_STRINGS | 9178, 9077, 8742 | 8999 |
| CONCAT_WITH_BOXING | 12056, 14304, 11329 | 12563 |
| CONCAT_NO_BOXING | 11949, 13145, 11628 | 12240 |
| CONCAT_STRINGS | 5833, 6263, 5713 | 5936 |

So, what do we learn from this? Well, a number of things:
- As ever, microbenchmarks like this are pretty variable. I tried to do this on a “quiet” machine, but as you can see the results varied quite a lot. (Over two seconds between best and worst for a particular configuration at times!)
- The difference due to boxing with the original code in the book is basically inside the “noise”.
- The dominant factor of the statement is writing to the console, even when it’s not actually writing to anything real.
- The next most important factor is whether we convert to string once or three times.
- The next most important factor is whether we use String.Format or Concat.
- The least important factor is boxing.
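To make the "inside the noise" point concrete, here is a small sketch that compares the gap between the two averages with the run-to-run spread, using the CONSOLE_WITH_BOXING and CONSOLE_NO_BOXING timings from the results above. The `NoiseCheck` class and its `Mean` and `Spread` helpers are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;

public static class NoiseCheck
{
    // Arithmetic mean of the run times, in milliseconds.
    public static double Mean(int[] runs)
    {
        return runs.Average();
    }

    // Difference between the slowest and fastest run, in milliseconds.
    public static int Spread(int[] runs)
    {
        return runs.Max() - runs.Min();
    }

    public static void Main()
    {
        // Timings copied from the results for the two console variants.
        int[] withBoxing = { 33054, 33898, 33381 };
        int[] noBoxing = { 34638, 32423, 33294 };

        double gap = Math.Abs(Mean(withBoxing) - Mean(noBoxing));

        Console.WriteLine("Gap between averages: {0:F0}ms", gap);
        Console.WriteLine("Spread with boxing:   {0}ms", Spread(withBoxing));
        Console.WriteLine("Spread without:       {0}ms", Spread(noBoxing));
    }
}
```

For these numbers the averages differ by roughly 7ms, while the spread within a single configuration is 844ms with boxing and 2215ms without. Three runs per configuration is far too few for any serious statistics, but the gap is clearly dwarfed by the variation.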
Now I don’t want anyone to misunderstand me – I agree that boxing is less efficient than not boxing, where there’s a choice. Sometimes (as here, in my view) the “more efficient” code is slightly less readable – and the efficiency benefit is often negligible compared with other factors. Exactly the same thing happened in Accelerated C# 2008, where a call to Math.Pow(x, 2) was the dominant factor in a program again designed to show the efficiency of avoiding boxing.
The performance scare of boxing is akin to that of exceptions, although I suppose it’s more likely that boxing could cause a real performance concern in an otherwise-well-designed program. It used to be a much more common issue, of course, before generics gave us collections which don’t require boxing/unboxing to add/fetch data.
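The collections case is the kind of scenario where boxing is at least a plausible candidate for the dominant cost, because there is very little other work going on per element. The sketch below stores and sums integers with the non-generic `ArrayList` (which boxes on every `Add` and unboxes on every read) and with `List<int>` (which does neither). The class name, method names and element count are arbitrary choices for illustration, and no timings are claimed here: the results will depend on the machine and runtime.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

public static class CollectionBoxing
{
    // Non-generic collection: every Add boxes the int,
    // and every cast back to int unboxes it.
    public static long SumWithArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        long sum = 0;
        foreach (object item in list)
        {
            sum += (int) item;
        }
        return sum;
    }

    // Generic collection: the ints are stored directly, with no boxing.
    public static long SumWithGenericList(int count)
    {
        List<int> list = new List<int>();
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        long sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }

    public static void Main()
    {
        const int Count = 1000000; // arbitrary size for the sketch

        Stopwatch sw = Stopwatch.StartNew();
        long boxedSum = SumWithArrayList(Count);
        sw.Stop();
        Console.WriteLine("ArrayList: {0}ms (sum {1})", sw.ElapsedMilliseconds, boxedSum);

        sw = Stopwatch.StartNew();
        long genericSum = SumWithGenericList(Count);
        sw.Stop();
        Console.WriteLine("List<int>: {0}ms (sum {1})", sw.ElapsedMilliseconds, genericSum);
    }
}
```

As with the earlier test, a single run of a microbenchmark like this proves little on its own; it would need repeating several times, in a release build, before drawing any conclusion.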
In short: yes, boxing has a cost. But please look at it in context, and if you’re going to start making claims about how much faster code will run when it avoids boxing, at least provide an example where it actually contributes significantly to the overall execution cost.