Benchmarking, C#, Parallelization, Stack Overflow

Revisiting randomness

November 4, 2009 jonskeet 20 Comments

Almost every Stack Overflow question which includes the words "random" and "repeated" has the same basic answer. It’s one of the most common "gotchas" in .NET, Java, and no doubt other platforms: creating a new random number generator without specifying a seed will depend on the current instant of time. The current time as measured by the computer doesn’t change very often compared with how often you can create and use a random number generator – so code which repeatedly creates a new instance of Random and uses it once will end up showing a lot of repetition.

One common solution is to use a static field to store a single instance of Random and reuse it. That’s okay in Java (where Random is thread-safe) but it’s not so good in .NET – if you use the same instance repeatedly from .NET, you can corrupt the internal data structures.

A long time ago, I created a StaticRandom class in MiscUtil – essentially, it was just a bunch of static methods (to mirror the instance methods found in Random) wrapping a single instance of Random and locking appropriately. This allows you to just call StaticRandom.Next(1, 7) to roll a die, for example. However, it has a couple of problems:

It doesn’t scale well in a multi-threaded environment. When I originally wrote it, I benchmarked an alternative approach using [ThreadStatic] and at the time, locking won (at least on my computer, which may well have only had a single core).
It doesn’t provide any way of getting at an instance of Random, other than by using new Random(StaticRandom.Next()).

The latter point is mostly a problem because it encourages a style of coding where you just use StaticRandom.Next(…) any time you want a random number. This is undoubtedly convenient in some situations, but it goes against the idea of treating a source of randomness as a service or dependency. It makes it harder to get repeatability and to see what needs that dependency.

I could have just added a method generating a new instance into the existing class, but I decided to play with a different approach – going back to per-thread instances, but this time using the ThreadLocal<T> class introduced in .NET 4.0. Here’s the resulting code:

using System;
using System.Threading;

namespace RandomDemo
{
    /// <summary>
    /// Convenience class for dealing with randomness.
    /// </summary>
    public static class ThreadLocalRandom
    {
        /// <summary>
        /// Random number generator used to generate seeds,
        /// which are then used to create new random number
        /// generators on a per-thread basis.
        /// </summary>
        private static readonly Random globalRandom = new Random();
        private static readonly object globalLock = new object();

        /// <summary>
        /// Random number generator
        /// </summary>
        private static readonly ThreadLocal<Random> threadRandom = new ThreadLocal<Random>(NewRandom);

        /// <summary>
        /// Creates a new instance of Random. The seed is derived
        /// from a global (static) instance of Random, rather
        /// than time.
        /// </summary>
        public static Random NewRandom()
        {
            lock (globalLock)
            {
                return new Random(globalRandom.Next());
            }
        }

        /// <summary>
        /// Returns an instance of Random which can be used freely
        /// within the current thread.
        /// </summary>
        public static Random Instance { get { return threadRandom.Value; } }

        /// <summary>See <see cref="Random.Next()" /></summary>
        public static int Next()
        {
            return Instance.Next();
        }

        /// <summary>See <see cref="Random.Next(int)" /></summary>
        public static int Next(int maxValue)
        {
            return Instance.Next(maxValue);
        }

        /// <summary>See <see cref="Random.Next(int, int)" /></summary>
        public static int Next(int minValue, int maxValue)
        {
            return Instance.Next(minValue, maxValue);
        }

        /// <summary>See <see cref="Random.NextDouble()" /></summary>
        public static double NextDouble()
        {
            return Instance.NextDouble();
        }

        /// <summary>See <see cref="Random.NextBytes(byte[])" /></summary>
        public static void NextBytes(byte[] buffer)
        {
            Instance.NextBytes(buffer);
        }
    }
}

The user can still call the static Next(…) methods if they want, but they can also get at the thread-local instance of Random by calling ThreadLocalRandom.Instance – or easily create a new instance with ThreadLocalRandom.NewRandom(). (The fact that NewRandom uses the global instance rather than the thread-local one is an implementation detail really; it happens to be convenient from the point of view of the ThreadLocal<T> constructor. It wouldn’t be terribly hard to change this.)

Now it’s easy to write a method which needs randomness (e.g. to shuffle a deck of cards) and give it a Random parameter, then call it using the thread-local instance:

public void Shuffle(Random rng)
{
…
}
…
deck.Shuffle(ThreadLocalRandom.Instance);

The Shuffle method is then easier to test and debug, and expresses its dependency explicitly.

Performance

I tested the "old" and "new" implementations in a very simple way – for varying numbers of threads, I called Next() a fixed number of times (from each thread) and timed how long it took for all the threads to finish. I’ve also tried a .NET-3.5-compatible version using ThreadStatic instead of ThreadLocal<T>, and running the original code and the ThreadStatic version on .NET 3.5 as well.

Threads	StaticRandom (4.0b2)	ThreadLocalRandom (4.0b2)	ThreadStaticRandom (4.0b2)	StaticRandom(3.5)	ThreadStaticRandom (3.5)
1	11582	6016	10150	10373	16049
2	24667	7214	9043	25062	17257
3	38095	10295	14771	36827	25625
4	49402	13435	19116	47882	34415

A few things to take away from this:

The numbers were slightly erratic; somehow it was quicker to do twice the work with ThreadStaticRandom on .NET 4.0b2! This isn’t the perfect benchmarking machine; we’re interested in trends rather than absolute figures.
Locking hasn’t changed much in performance between framework versions
ThreadStatic has improved massively between .NET 3.5 and 4.0
Even on 3.5, ThreadStatic wins over a global lock as soon as there’s contention

The only downside to the ThreadLocal<T> version is that it requires .NET 4.0 – but the ThreadStatic version works pretty well too.

It’s worth bearing in mind that of course this is testing the highly unusual situation where there’s a lot of contention in the global lock version. The performance difference in the single-threaded version where the lock is always uncontended is still present, but very small.

Update

After reading the comments and thinking further, I would indeed get rid of the static methods elsewhere. Also, for the purposes of dependency injection, I agree that it’s a good idea to have a factory interface where that’s not overkill. The factory implementation could use either the ThreadLocal or ThreadStatic implementations, or effectively use the global lock version (by having its own instance of Random and a lock). In many cases I’d regard that as overkill, however.

One other interesting option would be to create a thread-safe instance of Random to start with, which delegated to thread-local "normal" implementations. That would be very useful from a DI standpoint.

20 thoughts on “Revisiting randomness”

Greg Beech says:

November 4, 2009 at 3:55 am

The PFX team have posted a version of this a while back that works with any .NET version if you’re interested. Exactly the same principle, just using the [ThreadStatic] attribute.

http://blogs.msdn.com/pfxteam/archive/2009/02/19/9434171.aspx

LikeLike

Reply
skeet says:

November 4, 2009 at 3:58 am

@Greg: Thanks for that. I may do another benchmark using ThreadStatic instead of ThreadLocal – I don’t know whether there’s anything “special” going on under the hood in ThreadLocal which would make it faster…

LikeLiked by 1 person

Reply
Martinho Fernandes says:

November 4, 2009 at 4:41 am

I had a similar class (with ThreadStatic), and ended up removing the never used Next() methods. They may be convenient at times, but eventually, you will want to remove all uses of them in your code to avoid the explicit dependency. So, you may as well go ahead and not include them from the beginning.

LikeLike

Reply
Eamon Nerbonne says:

November 4, 2009 at 7:48 am

I’ve got code just like this in most of my projects. I’d recommend removing the static versions of Next: this makes consumer code more easily refactorable. It’s relevant since commonly you may wish deterministically reproducible runs during debugging, and then it’s useful to easily be able to replace your “normal” source of Random objects with some other source.

Another advantage is that this encourages more performant code; your thread-local accessor is going to be a needless bottleneck, and encouraging people not to over-use it is a good thing.

Also, I’d do some renaming to reinforce the idea that this isn’t a _static_ random generator, it’s a thread-local random number generator.

It’s perhaps worth noting that there’s a http://code.msdn.microsoft.com/MersenneTwister implementation that’s compatible with System.Random and nearly as fast.

LikeLike

Reply
zvolkov says:

November 4, 2009 at 11:00 am

I’d rather define IRandomGenerator and inject that using DI/IoC. If dependency is properly managed and injected the implementation could then be simple local instance of Random. IMHO such design would be both Mockable and present no thread-safety concerns.

LikeLike

Reply
skeet says:

November 4, 2009 at 11:18 am

@zvolkov: Yes, if you want to go one level of abstraction further that would be okay. However, while I love dependency injection in general, there comes a time when you’ve had enough of factories :) It can be useful to be able to get a useful instance of Random *without* going through all the hoops. It does depend on the situation. A StaticRandomGenerator would be very easy to implement though, just returning StaticRandom.Instance.

LikeLike

Reply
Shay Erlichmen says:

November 4, 2009 at 4:04 pm

Jon I don’t think there should be a difference between ThreadStatic and ThreadLocal.

I wrote a simple test that shows (prove?) that they are both the same.
here is the paste http://utilitybase.com/paste/21497.

Works on my machine, waiting for the fail :)

LikeLike

Reply
skeet says:

November 4, 2009 at 4:10 pm

@Shay: Your code isn’t correct – you’re just creating a new instance of Random, rather than taking out a lock and using the single global instance. Otherwise several threads could all end up with the same seed, which is what we’re trying to avoid. Your code should look like this (apologies if it loses all formatting):

static Random Instance
{
get
{
if (random == null)
{
lock (globalLock)
{
random = new Random(globalRandom.Next());
}
}
return random;
}
}

… or just call NewRandom.

LikeLike

Reply
Shay Erlichmen says:

November 4, 2009 at 4:29 pm

Ok, [I’ve changed the code][1] to call NewRandom() instead, I still don’t see any major time difference.

[1] http://utilitybase.com/paste/21499

LikeLike

Reply
skeet says:

November 4, 2009 at 5:31 pm

Odd… I was definitely seeing it as less efficient, every single time. Maybe it depends on OS or processor. (I’m running on 32-bit Vista, Core 2 Duo.)

LikeLike

Reply
skeet says:

November 4, 2009 at 5:33 pm

Hang on, I’ve just checked your code again. You’re not doing the nullity check now. You’re creating a new instance of Random on every call, which I’d expect to be a *lot* slower.

LikeLiked by 1 person

Reply
Shay Erlichmen says:

November 5, 2009 at 1:57 am

I double and triple check the code (Incl. adding the null check, verifying that its not an optimizer issue), it seems that ThreadStatic and ThreadLocal take the same time.

But when I ran the code under 3.5 it took ~3 times longer to run the test (I can run only the ThreadStatic version).

My final conclusion: ThreadStatic==ThreadLocal and there are a lot of performance improvements in 4.0b2.

BTW, My setup is Win7x64 4 cores.

LikeLike

Reply
skeet says:

November 5, 2009 at 2:17 am

@Shay: It may well be a difference between the JIT on x86 and x64. I don’t think you can claim a blanket “ThreadStatic==ThreadLocal” until you can explain why I get consistently slower results for ThreadStatic than for ThreadLocal :) However, they’re clearly “quite close” which is a good start…

LikeLike

Reply
Shay Erlichmen says:

November 5, 2009 at 8:43 am

I couldn’t resist testing, so I test all 6 flavors

Here are the numbers http://bit.ly/4GeQy6 (Google Spreadsheet),

Executive summary:
I got the same result as you (LocalThread is indeed faster) but on x64 the ThreadStatic version is the fastest of them all.

LikeLike

Reply
Andrew Shapira says:

November 10, 2009 at 10:15 pm

Hi John.

Using a global random number generator as in NewRandom() is not necessary. In fact, NewRandom() as is has some problems.

NewRandom as written now has a birthday problem.

A simpler and lockless solution is to keep a counter, and increment the counter every time NewRandom wants a new seed. Use the counter for the seed to System.Random instance that’s about to be created. The counter can be maintained using Interlocked.Increment. Initializating the counter in a thread safe way might be trickier.

The counter avoids birthday problems.

The size of System.Random’s space is only around what can fit in a 32 bit word. Since NewRandom is public, it could be called 2^32 times, and duplicate random number streams would result. With the System.Random interface, it’s apparent that there are only 2^32 possible streams, but when using the method here, this fact is hidden behind the public interface. In other words, NewRandom’s contract is to produce independent streams, and if called sufficiently many times, NewRandom will fail to do its job. One possibility is for NewRandom to not be public. Dunno if that causes the class to lack a function that you consider essential.

Another thing to consider is that using this class will make the behavior of client application irreproducible, even when the client is running single-threaded. You might want to provide a way to reliably initialize to a reproducible state (possible based on a seed), so that for single-threaded operation the behavior is reproducible.

LikeLike

Reply
skeet says:

November 11, 2009 at 12:31 am

@Andrew: Thanks, good points. Would you initialize the counter randomly to start with (from the first date/time)?

LikeLike

Reply
Andrew Shapira says:

November 11, 2009 at 1:12 am

Yes. It seems to be a good idea to make that the default, just like the parameterless System.Random constructor does.

If I were using a class like this in an application, I’d also find it helpful if there were a way to specify a seed. This would do a couple of things: (a) give a way to reproduce the application’s behavior when it’s running single-threaded, and (b) give a way to run with different random number streams, in a reproducible way. Both of these are really useful for testing.

LikeLike

Reply
Frederic says:

September 28, 2015 at 7:14 pm

Hi John,

I wonder how well this approach works in an asp.net context, especially in case of thread switch due to asp.net thread agility (or usage async/await).

May we end up using a single Random instance on different threads?

(Getting the instance on begin request, thread switch occurring due to heavy load before end request, and using the instance on end request: it looks like we would use an instance that is belonging to another thread than the current one. Or have I missed something?)

LikeLike

Reply
Pingback: Local Functions in C# 7 | Jim's Rules Regarding Everything
hIpPy says:

August 5, 2023 at 5:58 pm

I ended up creating a factory class which can give out a LockRandom, ThreadStaticRandom, ThreadLocalRandom instance.
https://github.com/rmandvikar/random2/blob/dev/src/rm.Random2/RandomFactory.cs

I made a small tweak where I ended up dropping the ‘globalRandom’ instance, and instead using ‘RNGCryptoServiceProvider’ to create the instances.

I’ve also nugetized it here:
https://www.nuget.org/packages/rm.Random2

LikeLike

Reply

Jon Skeet's coding blog

Revisiting randomness

Performance

20 thoughts on “Revisiting randomness”

Leave a comment Cancel reply

Performance

Share this:

Related

20 thoughts on “Revisiting randomness”

Leave a comment Cancel reply