Using atomic operations to achieve thread safety
Atomic operations are ones which can be executed on otherwise-unprotected shared data as a single indivisible step, where no other thread will be able to compromise the consistency of either that shared data or the operation itself.
In other words, even for operation logic consisting of multiple, compound operations—steps—being atomic means that all those steps will run as an unbreakable whole on the shared data, and multiple threads will not be able to interfere with each other.
The most fundamental atomic operations are implemented at the hardware level and rely on specific CPU instructions to ensure atomicity, but others can also be implemented in higher-level code with the help of those atomic primitives.
Because hardware support is needed for the most fundamental primitives, atomic operations are limited to simple data types, commonly up to the size of a pointer. Support for larger data sizes—double pointers or even more—is hardware-dependent, and Delphi currently supports atomic operations only for 32-bit and 64-bit data types on all platforms.
Some atomic operations which are not directly supported by hardware can be implemented by using supported atomic primitives and a loop until the operation succeeds.
This size constraint makes it obvious that atomic operations have limited use, but they are still essential tools for achieving thread safety in multi-threaded code. On the one hand, they actually don't protect that much and might even look useless to the untrained eye; on the other hand, without them there would be no multi-threaded code as we know it.
The simplest atomic operations are atomic load and store (read and write). If the pointer-sized data is naturally aligned, then even a regular read or write of such a variable will be atomic. Because of this, you will rarely need to use dedicated atomic operations for reading and writing.
The next simple operation is an atomic exchange (swap), which unconditionally writes a new value and returns the old value.
And then there is another fundamental atomic operation: compare and exchange (compare and swap—CAS), which compares the current value with some comparand, and writes the new value only if the comparison succeeds, returning the original value of a variable.
Other Delphi-supported atomic operations are: addition and subtraction (including increment and decrement as special cases), and various bitwise operations.
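As a quick orientation, the standalone atomic intrinsics declared in the System unit can be called like this (a minimal sketch; the variable names are just for illustration):
var
  Value, Old: Integer;
begin
  Value := 0;
  Old := AtomicExchange(Value, 5);         // unconditionally writes 5 and returns the old value (0)
  Old := AtomicCmpExchange(Value, 10, 5);  // writes 10 only if Value equals the comparand 5,
                                           // and returns the original value in any case
  AtomicIncrement(Value);                  // atomically adds 1 and returns the new value
  AtomicDecrement(Value);                  // atomically subtracts 1 and returns the new value
end;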
Multiple steps
The purpose of atomic operations is to combine multiple steps, which could otherwise be interrupted when multiple threads operate on the same data, into a single, indivisible one.
Showing a few examples of how multiple threads can impact shared data, in situations which may look simple from the perspective of a higher-level language, will make it easier to understand the nature of concurrency bugs caused by similar non-atomic operations, and also how we can use atomic operations and what we can achieve by using them.
Probably the simplest example to understand is incrementing an integer variable declared as:
var Number: Integer := 0;
To increment a value (in this case 0) stored in such a variable we can write:
Inc(Number);
From a high-level code perspective, this may seem like a simple operation. But if we dig deeper into it and look at its logical equivalent, we can more easily see that it is not that simple at all:
Number := Number + 1;
To increment a number, we actually need to read the value from that variable (memory location), add one to that value, and then store that result back into the variable.
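Expressed as separate steps (a purely illustrative sketch, where Temp stands in for a CPU register or temporary location), the increment looks like this:
Temp := Number;     // 1. read the current value from memory
Temp := Temp + 1;   // 2. add one to that value
Number := Temp;     // 3. write the result back to memory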
It is easy to see how multiple threads can interfere with each other in such a situation. If we have two threads incrementing a value, we would logically expect that the end result will be incremented by two. If we have more than two threads doing that, we would expect that it would be incremented by the number of times we called the increment function.
Even if we assume that the variable is naturally aligned, and that the reads and writes alone will be atomic with no data tearing, incrementing a value from multiple threads is not thread-safe, and it may not give us proper results.
How can that happen?
In a multi-threaded scenario, two threads can easily access and read the value
from Number at the same time. That means both threads will read the same
value. But then they will both increment that same value and write the
incremented value back, instead of one thread incrementing and writing the value
first, and the other thread incrementing the already incremented value afterwards. As a
result, we will have our value incremented only once instead of twice.
Add more threads into the mix, and the results can be even more wrong.
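The following console application sketch demonstrates such lost updates (the thread count, the iteration count, and the Done counter used for waiting are illustrative assumptions):
uses
  System.SysUtils, System.Classes;

var
  Number: Integer = 0;
  Done: Integer = 0;
  i: Integer;

begin
  for i := 1 to 4 do
    TThread.CreateAnonymousThread(
      procedure
      var
        j: Integer;
      begin
        for j := 1 to 100000 do
          Inc(Number);         // non-atomic read-increment-write
        AtomicIncrement(Done); // signal that this thread has finished
      end).Start;

  // Wait until all four threads are done
  while AtomicCmpExchange(Done, 0, 0) < 4 do
    Sleep(10);

  // With 4 threads x 100000 increments we would expect 400000, but the
  // printed value will usually be lower because of lost updates
  Writeln(Number);
end.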
If we want to protect such shared data and have correct results in a multi-threaded scenario, we would have to create a lock, and lock all access—both reading and writing—to such a variable:
// TCriticalSection is declared in the System.SyncObjs unit
var Lock := TCriticalSection.Create;
var Number: Integer := 0;

Lock.Enter;
try
  Number := Number + 1;
finally
  Lock.Leave;
end;
Creating and maintaining a lock, as well as entering and exiting it to protect a variable, requires a fair amount of code and is also rather slow. This is where atomic operations come in handy. Atomically incrementing a variable does not require an additional lock, and will therefore run much faster.
To atomically increment an integer in Delphi, we can call:
AtomicIncrement(Number);
or its equivalent from the static TInterlocked class, belonging to the
System.SyncObjs unit:
TInterlocked.Increment(Number);
Atomic increment performs the whole read-increment-write sequence as one single, uninterrupted operation, during which no other thread will be able to access the value in that memory location (this is a simplified explanation of what actually happens behind the scenes).
No matter how many threads you have, or how many times you call
AtomicIncrement(Number), the value stored in Number will be incremented
exactly the number of times you called AtomicIncrement(Number), no more, no less.
Atomic increment, decrement, addition, and subtraction operations are actually functions which not only perform a particular operation on the target variable, but also return the result of that operation. That way you know the actual value that was written to the variable at that point in time.
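A typical use for that return value is generating unique, monotonically increasing IDs across threads (a minimal sketch; LastID and NextID are hypothetical names):
var
  LastID: Integer = 0;

function NextID: Integer;
begin
  // Every caller, on any thread, receives a distinct value, because the
  // increment and the retrieval of its result happen as one atomic step
  Result := AtomicIncrement(LastID);
end;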
Some use cases
Reference counting
Atomically incrementing or decrementing a number is one of the more commonly used atomic operations. Its primary use case is maintaining a correct strong reference count of an instance in an automatic reference counting mechanism.
While automatic reference counting is also used for strings and dynamic arrays,
reference counting for interfaces is delegated to the _AddRef and _Release
methods. These methods are implemented in different ways depending on
the actual implementing class and its purpose, but the most basic ones would be:
function TInterfacedObject._AddRef: Integer;
begin
  Result := AtomicIncrement(FRefCount);
end;

function TInterfacedObject._Release: Integer;
begin
  Result := AtomicDecrement(FRefCount);
  if Result = 0 then
    Destroy;
end;
The above implementation maintains a thread-safe reference count of an object instance.
In _Release, we can see how we can use the result of the atomic decrement.
If the returned value is zero, that means this particular reference was the last
strong reference to that object instance.
In other words, the object instance is no longer accessible through any strong reference at that point, and we can safely destroy such an object. Not only can we destroy the object, but we must destroy it, otherwise we would have a memory leak.
Maintaining a thread-safe reference count in ARC is fundamental for sharing any reference-counted instance across multiple threads. Without it, it would be impossible to maintain a correct count of strong references, and without the correct count, the instance could get prematurely destroyed while it is still in use, or remain in memory after it is no longer used.
So once you have a strong reference to some reference-counted instance, you can safely pass that shared reference to other threads, and they can use it for creating other strong references and passing those around. Once the last strong reference goes out of scope, regardless of which thread it was in or when that happened, such an instance will be safely destroyed.
However, that is all the thread safety you will get with reference counting. Reading the reference, copying it and making another strong reference is thread-safe, but reading and writing the same shared variable from multiple threads is not a thread-safe operation.
So even though automatic reference counting has some thread safety built in, which helps a lot when working with reference-counted instances, that thread safety has its limits, and if you need write access to a shared variable, you will have to protect it with an additional lock.
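For example, if multiple threads need to read and replace the same shared interface variable, access to that variable still has to be locked (a sketch using assumed names; the lock and the shared variable would be created and managed elsewhere):
var
  SharedLock: TCriticalSection;  // created and destroyed elsewhere
  SharedIntf: IInterface;        // shared reference written by multiple threads

procedure SetShared(const Value: IInterface);
begin
  SharedLock.Enter;
  try
    // The assignment adjusts reference counts in a thread-safe manner, but
    // the variable itself must not be read and written concurrently
    SharedIntf := Value;
  finally
    SharedLock.Leave;
  end;
end;

function GetShared: IInterface;
begin
  SharedLock.Enter;
  try
    // Take a local strong reference under the lock; the caller can then
    // safely use its own copy without holding the lock
    Result := SharedIntf;
  finally
    SharedLock.Leave;
  end;
end;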
Lazy initialization
Another place where atomic operations are commonly used is thread-safe lazy initialization. You can find plenty of such code within the built-in Delphi frameworks.
If you have a singleton class and you want to lazily initialize the instance, your code might look like the following:
class function TFoo.GetInstance: TFoo;
begin
  if FInstance = nil then
    FInstance := TFoo.Create;
  Result := FInstance;
end;
The above initialization is not thread-safe. In other words, you would have to make sure that the singleton instance is initialized before being accessed by multiple threads.
This code can be easily converted into a thread-safe form without losing any
runtime performance once the initialization is completed. Of course, this is
just one of the many possible variants using CompareExchange to safely
initialize a singleton instance:
class function TFoo.GetInstance: TFoo;
var
  LInstance: TFoo;
begin
  if FInstance = nil then
  begin
    LInstance := TFoo.Create;
    if TInterlocked.CompareExchange<TFoo>(FInstance, LInstance, nil) <> nil then
      LInstance.Free; // Free redundant instance
  end;
  Result := FInstance;
end;
Reading a naturally aligned pointer is thread-safe, so the initial nil check does not
require any kind of protection. The worst that can happen is that multiple
threads enter the if branch and create multiple TFoo instances. However, by
using CompareExchange, only one thread will be able to successfully exchange
and write the LInstance value into the FInstance variable.
If the CompareExchange fails because another thread has already successfully
initialized FInstance, we need to release the redundant instance to prevent
a memory leak.
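For completeness, the class declaration these methods rely on might look like the following sketch (the FInstance class variable and the class destructor that releases it are assumptions, not shown in the original example):
type
  TFoo = class
  private
    class var FInstance: TFoo;  // shared singleton reference
  public
    class function GetInstance: TFoo;
    class destructor Destroy;   // releases the singleton at shutdown
  end;

class destructor TFoo.Destroy;
begin
  FInstance.Free;
end;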
Spin locks and guards
Atomic operations can also be used to implement other synchronization objects. For instance, a very basic non-reentrant spin lock can be implemented with something like the following code:
type
  TSpinLock = class
  private
    [volatile] FLocked: Integer;
  public
    procedure Enter;
    procedure Leave;
    function TryEnter: Boolean;
  end;

procedure TSpinLock.Enter;
begin
  while TInterlocked.Exchange(FLocked, 1) = 1 do
    TThread.Yield;
end;

procedure TSpinLock.Leave;
begin
  TInterlocked.Exchange(FLocked, 0);
end;

function TSpinLock.TryEnter: Boolean;
begin
  Result := TInterlocked.Exchange(FLocked, 1) = 0;
end;
We use a simple integer flag to represent the lock state: the value 0 means the
lock is not held, and the value 1 means that some thread holds the lock.
When entering the lock, we can use a simple atomic Exchange. If the returned
(old) value is 0, that means we have successfully entered the lock. After that, and
until we call Leave, any other thread that tries to enter the lock will get
1 as the result of the atomic Exchange. If that happens, the thread will yield its
processor time and then try setting the flag again, until it succeeds.
This kind of spin lock is not reentrant. If a thread tries to enter a lock it
already holds, the Exchange will keep returning 1, and without any mechanism to
break out of the while loop, this will cause a deadlock.
If we remove the Enter implementation, the above spin lock turns into a
simple guard class. Such a guard can be used to protect parts of the code that
need to run only once (by using TryEnter alone), or to let a thread simply
skip the protected code when some other thread is already running it (by using
a TryEnter/Leave combination), as sketched below.
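A sketch of such guarded code could look like this (Guard stands for an assumed shared instance of that guard class):
if Guard.TryEnter then
try
  // Code that only one thread at a time should run;
  // other threads simply skip it instead of waiting
finally
  Guard.Leave;
end;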
If you need to maintain multiple guards for multiple states, then instead of using
multiple integer variables as guards for each state, you can also pack those
into a single integer, where each bit can be used as a guard for a particular
state with the help of the TInterlocked.BitTestAndSet and
TInterlocked.BitTestAndClear methods.
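A sketch of such a packed, bit-based guard might look like the following (FGuards and the bit index constants are assumed names):
const
  GUARD_CONFIG = 0;  // bit 0 guards configuration loading
  GUARD_CACHE  = 1;  // bit 1 guards cache rebuilding

var
  FGuards: Integer = 0;

// BitTestAndSet atomically sets the bit and returns its previous value;
// False means the bit was clear, so this thread has just acquired the guard
if not TInterlocked.BitTestAndSet(FGuards, GUARD_CONFIG) then
try
  // Only one thread at a time runs this block
finally
  TInterlocked.BitTestAndClear(FGuards, GUARD_CONFIG);
end;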
The above examples show just some of the possible use cases for atomic operations. Even from those it can be seen that, although the thread safety they provide is very narrow, there are still many situations where atomic operations are a better, simpler, and much faster choice than the more heavyweight and slower locking and synchronization mechanisms that would otherwise be required.