Excessive Collection Locking
As soon as you step into writing multithreaded code, you will need to work with collections in a thread-safe manner. While Delphi provides some basic thread-safe collections for its own use in the RTL and visual frameworks, those are usually not enough.
So you will either end up using some third-party thread-safe collection library, or you will be insipired to roll out your own. Making thread-safe collections is not a very complicated task, but if you are not careful, you can easily end up using locking patterns which will look fine at first glance, but can cause excessive locking when used.
You will get thread safety, but running at a snail's pace. Considering that modern computers are powerful enough, such code can still run at an acceptable speed, and you might not even notice that you are burning CPU cycles in vain.
Thread-safe Collection
To implement any thread-safe collection, you will need a lock. While there are many different lock types we can use based on particular need, the lock type we use to demonstrate excessive locking problems is irrelevant. In this particular scenario, excessive locking will always happen, regardless of the lock used; the only difference will be how fast that lock type can execute the entering/leaving sequence.
Let's implement the minimum code needed for a generic thread-safe list:
uses
System.SysUtils,
System.Classes,
System.Generics.Collections,
System.Threading;
type
TConcurrentList<T> = class
private
FList: TList<T>;
FLock: TObject;
public
constructor Create;
destructor Destroy; override;
procedure Add(const Item: T);
procedure Enumerate(const AProc: TProc<T>);
end;
{ TConcurrentList<T> }
constructor TConcurrentList<T>.Create;
begin
FLock := TObject.Create;
FList := TList<T>.Create;
end;
destructor TConcurrentList<T>.Destroy;
begin
FList.Free;
FLock.Free;
inherited;
end;
procedure TConcurrentList<T>.Add(const Item: T);
begin
TMonitor.Enter(FLock);
try
FList.Add(Item);
finally
TMonitor.Exit(FLock);
end;
end;
procedure TConcurrentList<T>.Enumerate(const AProc: TProc<T>);
var
Idx: Integer;
begin
TMonitor.Enter(FLock);
try
for Idx := 0 to FList.Count - 1 do
AProc(FList.List[Idx]);
finally
TMonitor.Exit(FLock);
end;
end;
With such a list, you can now esily write code where one thread adds items to the list, while another reads its contents:
procedure Test;
var
List: TConcurrentList<Integer>;
ReadingTask, WritingTask: ITask;
begin
List := TConcurrentList<Integer>.Create;
try
WritingTask := TTask.Create(
procedure
begin
for var i := 0 to 100 do
begin
// simulate some work here
Sleep(20);
List.Add(i);
end;
end);
ReadingTask := TTask.Create(
procedure
begin
for var i := 0 to 20 do
begin
List.Enumerate(
procedure(Value: Integer)
begin
// writing to console from the background thread is not thread-safe,
// but for simplicity, it is acceptable here
Writeln(Value);
end);
Sleep(50);
end;
end);
ReadingTask.Start;
WritingTask.Start;
TTask.WaitForAll([ReadingTask, WritingTask]);
finally
List.Free;
end;
end;
Now, if you take a close look at the above code, you will notice that even our
WritingTask method excessively locks the list.
Namely, each call to Add in that loop will lock and then unlock the list.
Whether that is excessive or not depends on the particular situation. If the
processing work you are doing in that loop takes some time, you might not want
to block other threads from accessing the list during that time. Having a
thread-safe Add method will conform to such requirement fine. Yes, there will
be some extra locking calls made while adding items in the list, but sometimes
that is the price we must pay for doing work in the background threads while
retaining thread safety.
On the other hand, if you need to populate the list completely before the other thread reads only a partially populated list (in the above example, the reading task might never even get to read the full contents of the list), you would want to lock that whole list while you are adding items to that list.
Our initial list implemenattion does not support the second scenario. It also doesn't support more complex code, such as only adding unique numbers to the list.
While we can use Enumerate to see whether the number we are adding is already in
the list, and then call Add only if the number is not found, there is a small
window of opportunity between the Enumerate and Add calls, where some other
thread could try to add the same number, leaving us with duplicates in
the list.
Now, we can easily expand our thread-safe list with an AddUnique method which
will lock the list during the whole process. But, not all real life scenarios
can be handled with lists that implement locking from within and don't provide
other mechnanisms for locking list from the outside.
A common pattern for thread-safe collections is to expose the inner collection
through the LockList/UnlockList pattern. If you are working with code and locks
which support separate read and write access, you can also use the same principle
to implement BeginRead/EndRead and BeginWrite/EndWrite methods:
function TConcurrentList<T>.LockList: TList<T>;
begin
TMonitor.Enter(FLock);
Result := FList;
end;
procedure TConcurrentList<T>.UnlockList;
begin
TMonitor.Exit(FLock);
end;
As you can see, those methods are also very simple. But using that pattern complicates the code a bit, as you will need to have an additional variable storing the reference to the unprotected list:
procedure AddUnique(AList: TConcurrentList<Integer>; Value: Integer);
var
LList: TList<Integer>;
begin
LList := AList.LockList;
try
if not LList.Contains(Value) then
LList.Add(Value);
finally
AList.UnlockList;
end;
end;
But the above coding pattern also opens the doors to excessive locking. Instead
of calling LList.Add(Value), you can easily make a mistake and call
AList.Add(Value). If you do that, the code will still be thread safe, but you
would unnecessarily call an additional lock/unlock sequence.
AddUnique, even if you make a mistake and call Add on the wrong list, will
not have any more locking calls than the thread-unsafe Enumerate/Add sequence,
so you might think that this is not so important after all.
But real life is usually not that simple, and more complex code might require processing huge collections within a tight loop, where every CPU cycle counts:
procedure ProcessList(AList: TConcurrentList<Integer>);
var
LList: TList<Integer>;
begin
LList := AList.LockList;
try
//some more complex processing with single or multiple loops
for i := 0 to LList.Count - 1 do
...
finally
AList.UnlockList;
end;
end;
If you make a similar, simple mistake in the above code, and call AList.Add or
some other method which also internally locks the list, at some point the
excessive locking can have a significant impact on performance.
Another problem with exposing an unprotected list is that you also need to pay
attention to how you lock/unlock the list, and how you use the unprotected list.
There is nothing protecting you from using LList after you called
AList.UnlockList. There will be no compiler warnings, no immediate and obvious
crashes, nothing... just random errors and misbehaviors which may appear only
once in a blue moon when you run that code, but according to Murphy, very
frequently for your customers.
To prevent junior developers shooting themselves in the foot, you might be
tempted to use different approach and don't expose the inner list. What if you
just add Lock/Unlock methods to the list which will call the underlying locking
mechanism?
procedure TConcurrentList<T>.Lock;
begin
TMonitor.Enter(FLock);
end;
procedure TConcurrentList<T>.Unlock;
begin
TMonitor.Exit(FLock);
end;
Then for code which requires combining multiple list operations and performing them within the same lock, you can write:
procedure ProcessList(AList: TConcurrentList<Integer>);
begin
AList.Lock;
try
//some more complex processing with single or multiple loops
for i := 0 to AList.Count - 1 do
...
finally
AList.Unlock;
end;
end;
If you decide to use such a pattern, you are doing yourself a great disservice.
You have barely made the code easier to use for junior developers (or yourself). You
can still make mistakes in logic where you will not call the Lock/Unlock sequence
in the proper places or where you will not call them at all, making some code
logic thread-unsafe and misbehaving.
And the variable you saved will cost you dearly in performance terms. All code within such larger locks accessing the list will still call the—now unnecessary—internal lock/unlock methods.
Unlike the LockList/UnlockList pattern where you can simply fix the
mistake where you called the method on the wrong list, changing code which uses
the Lock/Unlock pattern will not be possible without either implementing the required
functionality directly on the list class, or switching to the
LockList/UnlockList pattern, both of which will require larger refactoring
efforts depending on the amount of code you have using that bad pattern.
No matter which kind of thread-safe collections and patterns you use, you will always have to be careful while using them to make your code run at optimal speed. Thread safety always has some negative impact on performance, but there is no need to make your code run slower than it absolutely needs to.
Comments
Post a Comment