Threading Basics: Race Conditions, Part 1


Originally posted on: http://geekswithblogs.net/jolson/archive/2008/09/18/threading-basics-race-conditions-part-1.aspx

I've talked in the past about the importance of parallel computing for all of us developers. It's a trend in computer software and hardware architecture, not a fad. Currently, in the US, it is nearly impossible to buy a new computer that has only a single core. We're even starting to see some of the first quad-core laptops hit the market.

It's becoming very important for developers to start dealing with parallel code. There's one problem: multithreaded development is hard. And in the class of Threading 101, Race Conditions is one of the first chapters you have to deal with.

So what is a Race Condition? According to Wikipedia, the source for all information on the Internet (emphasis mine):

"A race condition, or race hazard, is a flaw in a system or process whereby the output and/or result of the process is unexpectedly and critically dependent on the sequence or timing of other events. The term originates with the idea of two signals racing each other to influence the output first."

That's nice and all, but what does this mean to us developers? Let's look at a simple example. I have a Logger where I want to keep track of the number of messages that are logged.

    class Logger
    {
        public static int TotalMessages = 0;

        public void Log(string message)
        {
            // Do something with message
            TotalMessages++;
        }
    }

Now let's call this method a bunch of times from four different threads and capture the total number of times the Log method was called:

    using System;
    using System.Threading.Tasks;

    int messagesPerTask = 200000;
    int unitsOfWork = 4;

    var logger = new Logger();

    // Run the inner loop once per unit of work, potentially on different threads
    Parallel.For(0, unitsOfWork, (taskNumber) =>
    {
        for (int i = 0; i < messagesPerTask; i++)
        {
            logger.Log("Message " + i.ToString());
        }
    });

    Console.WriteLine("{0} Expected", messagesPerTask * unitsOfWork);
    Console.WriteLine("{0} Actual", Logger.TotalMessages);

What do you expect the output to be? The obvious answer is "well, we are adding 200000 messages on four different threads, so it's obviously 800000". Unfortunately, the obvious answer also happens to be the wrong answer. What's the right answer? It depends, and it can change every single time you run the application. For example, these were the results from the first time I ran the code:

800000 Expected
680354 Actual

Wait... how is this possible? Did we lose updates? Did I not call the method enough times? This surely must be a horrendous bug in the .NET Framework! Damn Bill and his blasted company. Trust me, it's not a bug. There's a good explanation of this behavior.

If you haven't written a lot of multithreaded code yet, the problem with the above code may not be obvious. Truth be told, there really isn't a problem with the code... IF you are only running the code on a single thread ever. But the likelihood of running on a single thread is becoming smaller and smaller. And as you can see, when we are running this in a multithreaded environment, there is most definitely a problem with the code.
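To make the single-threaded case concrete, here is a minimal sketch that does the same work with ordinary nested loops instead of Parallel.For. The SingleThreadedDemo class and its Run method are names made up for this illustration, and Logger is repeated only so the snippet compiles on its own.

```csharp
using System;

class Logger
{
    public static int TotalMessages = 0;

    public void Log(string message)
    {
        // Do something with message
        TotalMessages++;
    }
}

// Hypothetical demo class, not part of the original example
class SingleThreadedDemo
{
    public static int Run(int unitsOfWork, int messagesPerTask)
    {
        Logger.TotalMessages = 0;
        var logger = new Logger();

        // Same number of Log calls as the parallel version,
        // but every call happens on the one calling thread
        for (int taskNumber = 0; taskNumber < unitsOfWork; taskNumber++)
        {
            for (int i = 0; i < messagesPerTask; i++)
            {
                logger.Log("Message " + i.ToString());
            }
        }

        return Logger.TotalMessages;
    }

    static void Main()
    {
        Console.WriteLine("{0} Expected", 4 * 200000);
        Console.WriteLine("{0} Actual", Run(4, 200000));
    }
}
```

With only one thread touching TotalMessages, no two increments can overlap, so both lines print 800000 on every run.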

A lot of .NET developers look at "TotalMessages++" and interpret it as one line of code, one execution, one instruction, etc. But there's the rub: it isn't. The code "TotalMessages++" is actually four instructions when compiled down to IL.

    L_0001: ldsfld int32 CSharpSandbox.Logger::TotalMessages
    L_0006: ldc.i4.1
    L_0007: add
    L_0008: stsfld int32 CSharpSandbox.Logger::TotalMessages

These four lines essentially say the following:

  1. Get the current value of TotalMessages
  2. Get the value 1
  3. Add the two numbers together
  4. Save the result back into TotalMessages
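The four steps above can also be written out in C#, one statement per step. This is only an illustration: IncrementStepByStep is a hypothetical helper that is not in the original Logger, and the compiler would not necessarily emit the same IL for it as for "TotalMessages++".

```csharp
using System;

class Logger
{
    public static int TotalMessages = 0;

    // Hypothetical helper: what TotalMessages++ does, spelled out one step at a time
    public static void IncrementStepByStep()
    {
        int current = TotalMessages; // 1. Get the current value of TotalMessages (ldsfld)
        int one = 1;                 // 2. Get the value 1 (ldc.i4.1)
        int sum = current + one;     // 3. Add the two numbers together (add)
        TotalMessages = sum;         // 4. Save the result back into TotalMessages (stsfld)
    }
}

class StepByStepDemo
{
    static void Main()
    {
        Logger.IncrementStepByStep();
        Logger.IncrementStepByStep();

        // Only one thread runs here, so the steps of the two calls never interleave
        Console.WriteLine(Logger.TotalMessages); // prints 2
    }
}
```

Written this way, it is easier to see that nothing stops another thread from running between any two of those statements.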

Now let's imagine two threads executing the four steps above at the same time. Since threads won't be in perfect lock-step, they may be off by two steps (assuming the value of TotalMessages is currently 2):

[Thread 1] 1. Get the current value of TotalMessages (gets 2)
[Thread 1] 2. Get the value 1

[Thread 2] 1. Get the current value of TotalMessages (gets 2)
[Thread 2] 2. Get the value 1

[Thread 1] 3. Add the two numbers together (2 + 1)
[Thread 1] 4. Save the result back into TotalMessages (saves 3)

[Thread 2] 3. Add the two numbers together (2 + 1)
[Thread 2] 4. Save the result back into TotalMessages (saves 3)
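Real thread timing can't be controlled, but the schedule above can be replayed deterministically on a single thread. In this sketch, local variables stand in for each thread's private working values; InterleavingDemo and ReplayLostUpdate are names made up for the illustration.

```csharp
using System;

// Hypothetical demo class: replays the interleaving above on one thread
class InterleavingDemo
{
    public static int ReplayLostUpdate(int start)
    {
        int totalMessages = start;               // shared value, 2 in the walkthrough

        int thread1Value = totalMessages;        // [Thread 1] 1. reads the shared value
        int thread1One = 1;                      // [Thread 1] 2. gets the value 1

        int thread2Value = totalMessages;        // [Thread 2] 1. reads the same, still unchanged, value
        int thread2One = 1;                      // [Thread 2] 2. gets the value 1

        int thread1Sum = thread1Value + thread1One; // [Thread 1] 3. adds
        totalMessages = thread1Sum;                 // [Thread 1] 4. saves start + 1

        int thread2Sum = thread2Value + thread2One; // [Thread 2] 3. adds, using its stale read
        totalMessages = thread2Sum;                 // [Thread 2] 4. saves start + 1 again

        return totalMessages;
    }

    static void Main()
    {
        Console.WriteLine(ReplayLostUpdate(2)); // prints 3, not 4
    }
}
```

Thread 2's write is computed from a value it read before Thread 1 saved its result, so it overwrites Thread 1's update instead of building on it.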

So we've executed that code twice, and at the end of the execution TotalMessages has only increased by one. This is exactly what a race condition is. It's that simple. It's not as advanced a concept as one might think.
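One way to check that the missing messages really come from the non-atomic read-modify-write is to make the increment atomic and run the same experiment again. The sketch below uses Interlocked.Increment from System.Threading, which is one common option and an assumption on my part rather than the only possible fix. AtomicLogger and AtomicDemo are hypothetical names used only here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical variant of Logger that increments the counter atomically
class AtomicLogger
{
    public static int TotalMessages = 0;

    public void Log(string message)
    {
        // Do something with message

        // The read, add, and write happen as one indivisible operation
        Interlocked.Increment(ref TotalMessages);
    }
}

class AtomicDemo
{
    public static int Run(int unitsOfWork, int messagesPerTask)
    {
        AtomicLogger.TotalMessages = 0;
        var logger = new AtomicLogger();

        Parallel.For(0, unitsOfWork, taskNumber =>
        {
            for (int i = 0; i < messagesPerTask; i++)
            {
                logger.Log("Message " + i.ToString());
            }
        });

        // Parallel.For returns only after every iteration has finished
        return AtomicLogger.TotalMessages;
    }

    static void Main()
    {
        Console.WriteLine("{0} Expected", 4 * 200000);
        Console.WriteLine("{0} Actual", Run(4, 200000));
    }
}
```

With the increment made atomic, no update can be lost, so the actual count matches the expected 800000.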

This is why developers need to understand what race conditions are. A snippet of code that looks completely innocuous can cause some big problems when run in a multithreaded environment. While future technologies will make writing parallel code easier, they won't prevent these types of problems from occurring. You still need to be aware of some of the common Threading 101 concerns/topics that exist today.
