
Spin lock
Using operating system-level synchronization primitives costs a noticeable amount of resources because of context switching and the corresponding overhead. Besides this, there is lock latency: the time it takes for a waiting thread to be notified that a lock has changed state. This means that when the current lock is released, it takes additional time before another waiting thread is signaled. That is why, when locks are held only for a very short time, it can be significantly faster to run the operations on a single thread without any locks than to parallelize them using OS-level locking mechanisms.
To avoid unnecessary context switches in such a situation, we can use a loop that checks the state of the lock on each iteration. Since the locks are held for a very short time, this does not burn too much CPU, and we get a significant performance boost by not using operating system resources and by reducing lock latency to a minimum.
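To make this concrete, the following is a minimal, hand-rolled sketch of such a loop built on Interlocked.CompareExchange. The NaiveSpinLock name is made up for illustration, and this is not how the .NET Framework implements its spinlock (described next); it only shows the underlying idea.
using System.Threading;

// A minimal sketch of the pattern: the "lock" is just an int flag that is
// changed atomically, and a waiting thread simply loops until it manages
// to set the flag. Illustration only.
public class NaiveSpinLock
{
    private int _state; // 0 = free, 1 = taken

    public void Enter()
    {
        // Keep trying to switch the flag from "free" to "taken".
        // The thread never leaves user mode, so there is no context switch.
        while (Interlocked.CompareExchange(ref _state, 1, 0) != 0)
        {
        }
    }

    public void Exit()
    {
        // Atomically publish the release so other threads can acquire the lock.
        Interlocked.Exchange(ref _state, 0);
    }
}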
This pattern is not easy to implement correctly, and, to be efficient, it requires specific CPU instructions. Fortunately, the .NET Framework provides a standard implementation of this pattern (the SpinWait and SpinLock structures appeared in version 4.0). It consists of the following methods and types:
Thread.SpinWait
Thread.SpinWait simply spins in a tight loop for the specified number of iterations. It is similar to Thread.Sleep in that the thread does no useful work, but there is no context switch and the thread keeps consuming CPU time. It is rarely needed in common scenarios, but can be useful in specific cases, such as simulating real CPU work.
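For example, a spin wait of this kind can stand in for real CPU work in a test. The following small sketch (the class name and the iteration count are arbitrary and chosen only for illustration) simply burns CPU cycles without sleeping or yielding:
using System;
using System.Diagnostics;
using System.Threading;

static class SpinWaitMethodSample
{
    static void Main()
    {
        var sw = Stopwatch.StartNew();
        // Busy-wait for the given number of iterations; the thread stays
        // on the CPU and no context switch occurs.
        Thread.SpinWait(50000000);
        sw.Stop();
        Console.WriteLine("Spun for {0}ms without a context switch",
            sw.ElapsedMilliseconds);
    }
}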
System.Threading.SpinWait
System.Threading.SpinWait is a structure that implements a waiting loop with a condition check: it spins at first and, if the wait takes too long, starts yielding the thread. It is used internally in the SpinLock implementation.
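As an illustration, here is a minimal sketch of using the SpinWait structure on its own to wait for a flag set by another thread; the _done field, the class name, and the 100-millisecond delay are made up for the example:
using System;
using System.Threading;

static class SpinWaitStructSample
{
    private static volatile bool _done;

    static void Main()
    {
        var worker = new Thread(() =>
        {
            Thread.Sleep(100); // simulate a short piece of work
            _done = true;
        });
        worker.Start();

        // SpinOnce spins at first and starts yielding the thread only if
        // the wait turns out to be longer than expected.
        var spinner = new SpinWait();
        while (!_done)
            spinner.SpinOnce();

        Console.WriteLine("Flag was set after {0} spin iterations", spinner.Count);
        worker.Join();
    }
}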
System.Threading.SpinLock
Here we will discuss the spinlock implementation itself.
Note that it is a structure, which saves a class instance allocation and reduces GC overhead.
When releasing the lock, the spinlock can optionally issue a memory barrier (a memory fence instruction) to immediately publish the release to other threads. The default behavior is to use a memory barrier, which prevents reordering of memory operations by the compiler or the hardware and improves the fairness of the lock at the expense of performance. Exiting without a memory barrier is faster, but can lead to incorrect behavior in some situations.
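The basic acquire/release pattern looks like the following sketch (the Counter class is made up for the example); the argument passed to Exit selects whether a memory barrier is issued:
using System.Threading;

class Counter
{
    private SpinLock _spinLock = new SpinLock();
    private int _value;

    public void Increment()
    {
        bool lockTaken = false;
        try
        {
            _spinLock.Enter(ref lockTaken);
            _value++;
        }
        finally
        {
            // Exit(true) issues a memory barrier; Exit(false) skips it.
            if (lockTaken)
                _spinLock.Exit(true);
        }
    }
}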
Usually, using a spinlock directly is discouraged unless you are absolutely sure about what you are doing. Make sure that you have confirmed the performance bottleneck with measurements and that you know your locks are really short.
The code inside a spin lock should not do the following:
- Use regular locks, or code that uses locks
- Acquire more than one spinlock at a time
- Perform dynamically dispatched calls (virtual methods, interface methods, or delegate calls)
- Call any third-party code that you do not control
- Perform memory allocation, including use of the new operator
The following is a sample test for a spinlock:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

static class Program
{
    private const int _count = 10000000;

    static void Main()
    {
        // Warm up
        var map = new Dictionary<double, double>();
        var r = Math.Sin(0.01);

        // lock
        map.Clear();
        var prm = 0d;
        var lockFlag = new object();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < _count; i++)
            lock (lockFlag)
            {
                map.Add(prm, Math.Sin(prm));
                prm += 0.01;
            }
        sw.Stop();
        Console.WriteLine("Lock: {0}ms", sw.ElapsedMilliseconds);

        // spinlock with memory barrier
        map.Clear();
        var spinLock = new SpinLock();
        prm = 0;
        sw = Stopwatch.StartNew();
        for (int i = 0; i < _count; i++)
        {
            var gotLock = false;
            try
            {
                spinLock.Enter(ref gotLock);
                map.Add(prm, Math.Sin(prm));
                prm += 0.01;
            }
            finally
            {
                if (gotLock) spinLock.Exit(true);
            }
        }
        sw.Stop();
        Console.WriteLine("Spinlock with memory barrier: {0}ms", sw.ElapsedMilliseconds);

        // spinlock without memory barrier
        map.Clear();
        prm = 0;
        sw = Stopwatch.StartNew();
        for (int i = 0; i < _count; i++)
        {
            var gotLock = false;
            try
            {
                spinLock.Enter(ref gotLock);
                map.Add(prm, Math.Sin(prm));
                prm += 0.01;
            }
            finally
            {
                if (gotLock) spinLock.Exit(false);
            }
        }
        sw.Stop();
        Console.WriteLine("Spinlock without memory barrier: {0}ms", sw.ElapsedMilliseconds);
    }
}
Executing this code on a Core i7-2600K CPU and an x64 OS in the Release configuration gives the following results:
Lock: 1906ms
Spinlock with memory barrier: 1761ms
Spinlock without memory barrier: 1731ms
Note that the performance boost is very small even with such short-duration locks. Also note that, starting with .NET Framework 3.5, the Monitor, ReaderWriterLock, and ReaderWriterLockSlim classes use spin waiting internally.
Note
The main disadvantage of spinlocks is intensive CPU usage. The spinning loop keeps consuming CPU cycles and energy, while a blocked thread does not. However, the standard Monitor class now spins for a short time first and only then falls back to a regular kernel-level lock, so in real-world scenarios the difference would be even less noticeable than in this test.