Optimizing ARC with unsafe references
ARC recognizes two types of references: strong and weak. Strong references are ones
that participate in the reference counting mechanism, and weak references are references that
do not participate in reference counting.
While there is more than one way of achieving weak references, the ones that conceptually fit into ARC memory management are zeroing weak references. Zeroing weak references are niled (set to zero) when the object instance they point to is destroyed. A combination of strong and zeroing weak references leaves no room for invalid pointers. At all times, you will be dealing with references that are either nil or point to a valid object instance.
Sounds perfect, doesn't it?
Well, at least in theory. In practice, just like reference counting itself adds some small amount of overhead, zeroing weak references add some more. In order to zero out weak references after the object instance they point to is destroyed, the application has to track them at runtime and that can be a costly operation, especially if you have many of them or if you are dealing with code that must run as fast as possible.
This is where the concept of unsafe weak references comes in.
What is an unsafe weak reference?
An unsafe weak reference is a weak reference that is not zeroed when the object it points to is destroyed. As their name suggests, they are unsafe because they open the door to those darn invalid references. If the object instance is destroyed, an unsafe reference will not be zeroed and will point to an invalid memory location. And if we try to access invalid memory, our application will either crash or behave unpredictably.
However, using an unsafe weak reference is perfectly safe if the lifetime of such a reference is shorter than or equal to the lifetime of the object instance it points to. In other words, if we can guarantee that we will not try to use the object instance after it is destroyed.
How to create an unsafe weak reference in Delphi?
There are two ways of achieving unsafe references: marking a reference with the [unsafe] attribute, or using a raw pointer. And that is exactly what unsafe references are behind the scenes - plain dumb pointers. The [unsafe] attribute is basically an instruction to the compiler to omit any reference counting code it would otherwise inject for a regular reference counted object or interface reference. Nothing less, nothing more.
Since it does not have any influence on the reference counting mechanism, [unsafe] can be used as a debugging tool - a window through which we can observe a reference counted object, without changing its behavior. Beyond debugging, it can be used for optimizing ARC code and removing ARC overhead in particular places and under specific circumstances.
Note: The [weak] and [unsafe] attributes for interface references in the classic compiler were introduced in Delphi 10.1 Berlin, while all ARC compilers supported them from the start.
Optimize!!!
There are two logically different places where unsafe references can be used as an optimization tool. The first is replacing a zeroing weak reference, in places where we know the weak reference will never outlive the object it points to. The second is replacing a temporary strong reference, in places where we know another strong reference will keep the object instance alive while we access it through the temporary reference.
Iteration loops are one of the places worth optimizing: we go through some collection of objects, doing some operation on them, and the whole time those objects are kept alive by the collection we iterate through. There is no need to grab an additional strong reference in such a case, but that is something only we know, while the compiler plays it safe and takes the strong reference anyway.
Let's take a look at the following situation: iterating through a collection of interfaces and calling a method on each of them. This example uses interface references because they are reference counted on the Windows compilers too, where we can more easily debug and observe particular behaviors. On the ARC compilers, this behavior will be exhibited not only by interfaces, but by object references, too.
The CPU view of our code will give us insight into ARC triggers. Notice the _IntfCopy calls - this is the place where ARC is triggered by increasing the object's reference count. Since this is transient reference counting, each call to _IntfCopy will be matched with a call to _IntfClear that decreases the reference count, so counting the _IntfCopy calls alone tells us how many times we triggered ARC.
Accessing list items through its default (or Items) array property triggers reference counting not only once, but twice:
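The original listing for this case is not reproduced here. A minimal sketch of the kind of loop being described could look like the following. The IFoo and TFoo declarations, the LoopItems name and the surrounding program are assumptions made for illustration; the article itself shows none of them.

```pascal
program LoopItemsDemo;

{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  // Hypothetical declarations, not taken from the article
  IFoo = interface
    procedure Foo;
  end;

  TFoo = class(TInterfacedObject, IFoo)
  public
    procedure Foo;
  end;

procedure TFoo.Foo;
begin
  Writeln('Foo called');
end;

procedure LoopItems(List: TList<IFoo>);
var
  i: Integer;
  Item: IFoo; // regular strong reference
begin
  for i := 0 to List.Count - 1 do
  begin
    // Per the text, this access is reference counted twice; presumably
    // once for the getter's result and once for the assignment to Item
    Item := List[i];
    Item.Foo;
  end;
end;

var
  List: TList<IFoo>;
begin
  List := TList<IFoo>.Create;
  try
    List.Add(TFoo.Create);
    List.Add(TFoo.Create);
    LoopItems(List);
  finally
    List.Free;
  end;
end.
```

Stepping through LoopItems in the debugger's CPU view is one way to check how many _IntfCopy calls a particular compiler version actually emits for the marked line.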
Marking the temporary Item variable as unsafe removes one ARC trigger, but one still remains:
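The listing for this intermediate step is also missing from the text. A hedged sketch, assuming a hypothetical IFoo interface with a single Foo method and a hypothetical TFoo implementation, might be:

```pascal
program LoopUnsafeItemDemo;

{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  // Hypothetical declarations, not taken from the article
  IFoo = interface
    procedure Foo;
  end;

  TFoo = class(TInterfacedObject, IFoo)
  public
    procedure Foo;
  end;

procedure TFoo.Foo;
begin
  Writeln('Foo called');
end;

procedure LoopUnsafeItem(List: TList<IFoo>);
var
  i: Integer;
  [unsafe] Item: IFoo; // assignments to Item are not reference counted
begin
  for i := 0 to List.Count - 1 do
  begin
    // The list still goes through its getter function here, so the
    // getter's result is the one trigger that remains
    Item := List[i];
    Item.Foo;
  end;
end;

var
  List: TList<IFoo>;
begin
  List := TList<IFoo>.Create;
  try
    List.Add(TFoo.Create);
    LoopUnsafeItem(List);
  finally
    List.Free; // the list held the only strong references
  end;
end.
```

The unsafe Item is valid here only because the list holds a strong reference to every element for the whole duration of the loop.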
The default list array property uses a getter function where assigning the function's result
triggers the reference counting mechanism. In order to avoid that trigger, we have to avoid
the getter function. Fortunately, the list exposes its inner implementation storage - a
dynamic array - and using that array directly in combination with an unsafe temporary variable
completely removes unnecessary reference counting.
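procedure LoopArray(List: TList<IFoo>);
var
  i: Integer;
  [unsafe] Item: IFoo; // no reference counting on assignment
begin
  for i := 0 to List.Count - 1 do
  begin
    // Read the backing dynamic array directly, bypassing the getter
    Item := List.List[i];
    Item.Foo;
  end;
end;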
The above examples show how [unsafe] can be used to optimize loops. But this principle applies not only to loops, but also to other similar patterns where we need to access an object instance through a temporary - transient - reference.
Use caution
Just like with any other optimization, before we start optimizing we have to make sure there is actually something to optimize. Also, optimizations can be rather delicate. Changes in our code or any referenced code may easily break them.
While you may be tempted to use unsafe references everywhere, they should be used only when there is a need for optimization. Don't optimize prematurely. Use them with caution and only if you fully understand the consequences. And remember, under ARC, at all times you must have at least one strong reference to your object instance, so you cannot use an unsafe reference as the only reference to an object instance.
You cannot use [unsafe] to create temporary object instances. This is not a proper use of unsafe references. It will create a memory leak. Depending on the circumstances, it can also prematurely release the object instance.
procedure Never;
var
  [unsafe] Foo: IFoo; // or TFoo under ARC compiler
begin
  // The Foo instance will leak
  Foo := TFoo.Create;
  ....
end;
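By contrast, a pattern consistent with the rule above keeps one strong reference as the owner and uses the unsafe reference only as an alias. The following is a minimal sketch under that assumption; the Safe procedure, the Owner and Alias names, and the IFoo and TFoo declarations are all hypothetical.

```pascal
program UnsafeAliasDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical declarations, not taken from the article
  IFoo = interface
    procedure Foo;
  end;

  TFoo = class(TInterfacedObject, IFoo)
  public
    procedure Foo;
  end;

procedure TFoo.Foo;
begin
  // RefCount is a public property of TInterfacedObject
  Writeln('RefCount inside Foo: ', RefCount);
end;

procedure Safe;
var
  Owner: IFoo;          // strong reference, keeps the instance alive
  [unsafe] Alias: IFoo; // plain pointer, not reference counted
begin
  Owner := TFoo.Create;
  Owner.Foo;            // reference count as seen through the owner
  Alias := Owner;       // assignment is not reference counted
  Alias.Foo;            // valid, because Owner still holds the instance
end;                    // Owner goes out of scope and releases the instance

begin
  Safe;
end.
```

Both calls should report the same reference count, which is one way to confirm that the alias did not take part in reference counting.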
Very good article!
Note that you can get the very same behavior on older Delphi versions by declaring Item as a pointer, then forcing a pointer() typecast to set Item, and calling IFoo(Item).Foo.
Of course, [unsafe] is better and less error-prone, but the pointer trick works on all Delphi versions and also on FPC.
To be fair, we should only care about these performance problems in core libraries like the RTL, DSpring or mORMot. I bet most user code won't suffer from the implicit reference counting.
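The pointer variant described in the comment above could be sketched roughly as follows. This is an illustration only: IFoo, TFoo and LoopPointer are hypothetical names, an open array parameter is used in place of a generic list so that the code does not depend on newer RTL features, and it has not been checked against every compiler version.

```pascal
program PointerLoopDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical declarations, not taken from the article
  IFoo = interface
    procedure Foo;
  end;

  TFoo = class(TInterfacedObject, IFoo)
  public
    procedure Foo;
  end;

procedure TFoo.Foo;
begin
  Writeln('Foo called');
end;

procedure LoopPointer(const Items: array of IFoo);
var
  i: Integer;
  Item: Pointer; // raw pointer, never reference counted
begin
  for i := Low(Items) to High(Items) do
  begin
    Item := Pointer(Items[i]); // hard cast, copies only the address
    IFoo(Item).Foo;            // cast back to call through the interface
  end;
end;

var
  A, B: IFoo;
begin
  // A and B are the strong references that keep both instances alive
  A := TFoo.Create;
  B := TFoo.Create;
  LoopPointer([A, B]);
end.
```

Because the compiler no longer knows that Item is an interface, nothing stops it from being assigned an unrelated pointer, which is the error-prone part mentioned in the comment.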