The purpose of weak references - Part II
In the first part of this post series, I covered owning and non-owning references under manual memory management, and the purpose of non-owning (weak) references in that memory model. Following the same use cases, we can now clearly show the purpose of their counterparts under the automatic reference counting memory model.
Automatic reference counting
Automatic reference counting is a memory model under which an object instance will be valid as long as there is at least one strong reference to that object instance. When the last strong reference goes out of scope or is nilled, the object's reference count will drop to zero, and the instance will be automatically destroyed.
One of the side-effects of that design is that two object instances that are no longer reachable through any outside references can hold strong references to each other, thus keeping themselves alive and creating memory leaks. This is called a reference cycle.
To prevent problems with reference cycles, ARC has the concept of weak references. Weak references, in the context of ARC, are references to an object instance that don't participate in reference counting. Such references don't increase the reference count of an object, and if there are no other strong references to an object, such a weak reference will allow the object to be destroyed, even though it is accessible through the weak reference.
In ARC, it is the developer's job to recognize which references should be weak and mark them as such. In Delphi, there are several ways to do so: using the [weak] attribute, the [unsafe] attribute, or a plain pointer type.
In the classic Delphi compiler that works under manual memory management, object references also act like weak references to reference-counted object instances. Only interface references represent strong references to such an object.
Because developers need to explicitly mark weak references to break reference cycles, it may seem that ARC as a memory management model has a vital flaw. However, we have already seen that even under manual memory management, the developer needs to know which references are owning ones and which are not. Non-owning references in the manual memory management model will be the ones that must be weak under ARC.
Keeping your references in order and knowing which one belongs to which category is a requirement for both manual memory management and ARC. Having to think about the code you write is not a flaw, but rather a feature, as not thinking will get you in trouble regardless of the memory management model in question.
There are some other slight differences, but only in the sense that ARC makes some code and features easier to write and achieve than under manual memory management. Namely, ARC allows an unlimited number of shared strong references to the same object instance without having to write elaborate notification code that will prevent releasing the same object multiple times or releasing it while it is still in use by some other code.
Another helpful feature of ARC is that it implements zeroing weak references. In other words, weak references that are automatically zeroed (set to nil) once the object they point to is released. That avoids the dangling reference problem we have under manual memory management, where a non-owning reference can point to an already destroyed instance unless we manually implement some tracking and notification mechanism. Instead of writing elaborate code, we only need to mark the desired reference with the [weak] attribute, and the compiler and RTL will automatically do the rest.
Before using such a zeroing weak reference, we always need to check whether it is still assigned or not. If its value is nil, that means the associated object is no longer available and we cannot use it.
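The check described above can be sketched as a small console program. This is only an illustration, not code from the original examples: the Hello method, the Run procedure and the program name are hypothetical, and it assumes a compiler version that supports the [weak] attribute on local variables.

```pascal
program WeakCheckDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical interface used only for this sketch
  IFoo = interface
    procedure Hello;
  end;

  TFoo = class(TInterfacedObject, IFoo)
  public
    procedure Hello;
  end;

procedure TFoo.Hello;
begin
  Writeln('Hello from TFoo');
end;

procedure Run;
var
  Intf: IFoo;            // strong reference
  [weak] IntfRef: IFoo;  // zeroing weak reference
begin
  Intf := TFoo.Create;
  IntfRef := Intf;

  // The object is alive here, so the weak reference is assigned
  if Assigned(IntfRef) then
    IntfRef.Hello;

  Intf := nil; // last strong reference is gone, the object is destroyed

  // The weak reference has been zeroed, so the check protects us
  if Assigned(IntfRef) then
    IntfRef.Hello
  else
    Writeln('The object is no longer available');
end;

begin
  Run;
end.
```

In multithreaded code, a plain Assigned check may not be enough, because the object could be released between the check and the call. Copying the weak reference into a local strong reference first and checking that one is a common way to handle this.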
All other references that represent weak references as a concept of non-owning references are non-zeroing weak references. In other words, if the object instance they point to is destroyed, such a non-zeroing weak reference will turn into a dangling pointer. Because of that, we call such references unsafe.
Unsafe references, as their name tells us, are inherently dangerous. If we have zeroing weak references where we can always check whether an object instance is valid or not, why would we ever use unsafe ones? The reason for that is the performance penalty that comes with zeroing weak references. Namely, in order to set such a reference back to nil, we need to keep track of the object such references point to, and clear them once their associated object is destroyed. That tracking introduces a heavy performance penalty, and in places where a weak reference will definitely go out of scope before the object it points to, we can use unsafe weak references as a form of optimization. That is the sole purpose of unsafe references.
If, in any code you write, you are not sure whether you can use an unsafe reference or not, you can always use a zeroing weak reference. However, you will pay a (hefty) price in performance, so eventually it is good to learn whether you really need a zeroing weak reference or not.
In places where you really need a zeroing weak reference, using them will generally not be more expensive than any other custom tracking code you would need to implement under manual memory management to avoid dangling references. And the resulting code under ARC will definitely be cleaner and simpler than the one under manual memory management.
Strong references under ARC:
- interface references
Zeroing weak references under ARC:
- interface references marked with the [weak] attribute
Non-zeroing weak references under ARC:
- interface references marked with the [unsafe] attribute
- pointers
- object references
Strong (owning) reference:
procedure Foo;
var
  Intf: IFoo; // owning reference
begin
  Intf := TFoo.Create;
  ...
end; // the interface reference will be cleared and the object released at this point
Strong (owning) and weak (non-owning) references:
procedure Foo;
var
  Intf: IFoo; // strong reference
  [weak] IntfRef: IFoo; // zeroing weak reference
begin
  Intf := TFoo.Create;
  IntfRef := Intf;
  ...
  Intf := nil; // explicitly release object (there are no other strong references)
  // IntfRef is now nil
  if Assigned(IntfRef) then
    ...
end;
procedure Foo;
var
  Intf: IFoo; // strong reference
  [unsafe] IntfRef: IFoo; // unsafe weak reference
begin
  Intf := TFoo.Create;
  IntfRef := Intf;
  ...
  Intf := nil; // explicitly release object (there are no other strong references)
  // IntfRef is now a dangling pointer and must not be used after this point
  ...
end;
procedure Foo;
var
  Intf: IFoo; // strong reference
  ObjRef: TFoo; // unsafe weak reference
begin
  Intf := TFoo.Create;
  ObjRef := TFoo(Intf);
  ...
  Intf := nil; // explicitly release object (there are no other strong references)
  // ObjRef is now a dangling pointer and must not be used after this point
  ...
end;
Another variant of the above example is also correct, but assigning to an interface reference needs to be the first thing you do after constructing the object instance.
procedure Foo;
var
  Intf: IFoo; // strong reference
  ObjRef: TFoo; // unsafe weak reference
begin
  ObjRef := TFoo.Create;
  Intf := ObjRef;
  ...
  Intf := nil; // explicitly release object (there are no other strong references)
  // ObjRef is now a dangling pointer and must not be used after this point
  ...
end;
Ownership transfer:
Technically, this is actually shared ownership, as the original reference can be freely used regardless of what happens to the reference stored in the list. Even if the reference in the list is cleared, the object instance will still be alive as long as it is referenced through the strong Intf reference:
procedure Foo(List: TInterfaceList);
var
  Intf: IFoo; // strong reference
begin
  Intf := TFoo.Create;
  List.Add(Intf); // ownership is transferred to List
end;
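The shared nature of this ownership can be observed with a sketch like the following. It is an illustration under assumptions rather than part of the original example: the Hello method, the destructor message and the Run procedure are hypothetical, and the list is held through an IInterfaceList reference so that the list itself is also managed by reference counting.

```pascal
program SharedOwnershipDemo;

{$APPTYPE CONSOLE}

uses
  System.Classes;

type
  // Hypothetical interface used only for this sketch
  IFoo = interface
    procedure Hello;
  end;

  TFoo = class(TInterfacedObject, IFoo)
  public
    destructor Destroy; override;
    procedure Hello;
  end;

destructor TFoo.Destroy;
begin
  Writeln('TFoo destroyed');
  inherited;
end;

procedure TFoo.Hello;
begin
  Writeln('Hello from TFoo');
end;

procedure Run;
var
  List: IInterfaceList; // the list is reference-counted, too
  Intf: IFoo;           // strong reference
begin
  List := TInterfaceList.Create;
  Intf := TFoo.Create;
  List.Add(Intf); // the list now holds a second strong reference
  List.Clear;     // the list drops its reference, but Intf still holds one
  Intf.Hello;     // the object is still alive and usable here
  Writeln('Leaving Run');
end; // Intf goes out of scope here and the object is destroyed

begin
  Run;
end.
```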
Shared ownership:
procedure TMainForm.OnButtonClick(Sender: TObject);
var
  Intf1, Intf2, Intf3: IFoo; // strong references
begin
  Intf1 := TFoo.Create;
  Intf2 := Intf1;
  Intf3 := Intf1;
  // as long as any of the above references is not nil, the object instance will be alive
  ...
end;
The purpose of weak references in ARC
Now that we know more about the behavior of weak references under ARC, we can look at the manual memory management examples and convert them to the ARC model. Because ARC automatically manages memory, we no longer need destructors:
type
  IParent = interface
    ...
  end;

  IChild = interface
    ...
  end;

  TChild = class(TInterfacedObject, IChild)
  protected
    Parent: IParent; // non-owning reference
  public
    constructor Create(const AParent: IParent);
  end;

  TParent = class(TInterfacedObject, IParent)
  public
    Child: IChild; // owning reference
    constructor Create;
  end;

constructor TChild.Create(const AParent: IParent);
begin
  Parent := AParent;
end;

constructor TParent.Create;
begin
  Child := TChild.Create(Self);
end;
The above example, as written, is not correct for ARC, and it will create a reference cycle. The parent holds a strong reference to a child, and the child holds a strong reference to the parent.
To avoid a memory leak, we need to break that reference cycle. To the untrained eye, this is the place where it may seem that ARC is imposing an unfair burden onto the developer. And indeed, it does require thinking about the references, just like manual memory management does, just in a slightly different way. Compared to the garbage collection memory model, which will break cycles on its own, those two models may seem inferior.
However, GC also requires thinking about your references, as you can still rather easily create a nice mess and memory leaks (objects that are no longer used from some point on, but are still reachable and will not be collected by the GC mechanism). If you are not convinced, just Google "out of memory" combined with Java or C#. I am sure that the number of hits will convince you that this is a real-life issue, not an imaginary one. Such leaks can be rather obscure, and can be extremely hard to figure out for inexperienced developers. GC is very forgiving for beginners, and it can take you a long way before you get into trouble, but once you get stuck, you will be on a whole new level of stuck.
This is a slight digression from the memory management models supported in Delphi, but it shows that no matter what language and tools developers use, there is no free ride, and they need to use their brains and think about the code they write.
Back to our parent-child example. We need to break the reference cycle. If we look at the manual memory management code, we have a clear picture of which reference is the owning one, and which is the non-owning one. In order to avoid a reference cycle, we just need to convert the non-owning reference in the child instance, pointing to its parent, to a weak reference:
TChild = class(TInterfacedObject, IChild)
protected
  [weak] Parent: IParent; // non-owning reference
public
  constructor Create(const AParent: IParent);
end;
We just need to mark that reference with the [weak] attribute. However, using zeroing weak references is costly from a performance point of view. If our child instance will never outlive the parent, then there is no danger of creating dangling references, and we can safely use unsafe weak references: mark the reference with the [unsafe] attribute, or use an object reference or pointer.
TChild = class(TInterfacedObject, IChild)
protected
  [unsafe] Parent: IParent; // non-owning reference
public
  constructor Create(const AParent: IParent);
end;
or
TChild = class(TInterfacedObject, IChild)
protected
  Parent: TParent; // non-owning reference
public
  constructor Create(const AParent: IParent);
end;
Both [unsafe] and object references are good choices for unsafe weak references. Using pointers is less appealing, as it requires typecasting when using the reference. However, if you need to store unsafe references in a collection or an array, you cannot use attributes without creating an additional wrapper around the reference. In such cases, using object references is a clear winner.
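To check that marking the child's back reference as weak really breaks the cycle, the parent-child declarations can be wrapped into a small console program. This is a sketch for illustration: the destructors, the Writeln calls and the Run procedure are additions that are not part of the original example, and the interfaces are left empty.

```pascal
program ParentChildDemo;

{$APPTYPE CONSOLE}

type
  IParent = interface
  end;

  IChild = interface
  end;

  TChild = class(TInterfacedObject, IChild)
  protected
    [weak] Parent: IParent; // non-owning reference
  public
    constructor Create(const AParent: IParent);
    destructor Destroy; override; // added only to make destruction visible
  end;

  TParent = class(TInterfacedObject, IParent)
  public
    Child: IChild; // owning reference
    constructor Create;
    destructor Destroy; override; // added only to make destruction visible
  end;

constructor TChild.Create(const AParent: IParent);
begin
  inherited Create;
  Parent := AParent;
end;

destructor TChild.Destroy;
begin
  Writeln('TChild destroyed');
  inherited;
end;

constructor TParent.Create;
begin
  inherited Create;
  Child := TChild.Create(Self);
end;

destructor TParent.Destroy;
begin
  Writeln('TParent destroyed');
  inherited;
end;

procedure Run;
var
  P: IParent; // the only outside strong reference
begin
  P := TParent.Create;
  Writeln('Parent and child created');
end; // P goes out of scope here

begin
  Run;
end.
```

With the [weak] attribute in place, both destructor messages should appear when Run exits. If the attribute is removed, the parent and child keep each other alive and neither destructor runs, which is the leak described earlier.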
This example shows us that the code intent and purpose of each reference does not change regardless of the memory model we are using. The requirement of explicitly marking references as weak in ARC goes beyond merely breaking reference cycles, and it clearly shows us the ownership relations between objects, which makes such code easier to understand and follow than the code written for manual memory management.
Besides a parent-child relationship between object instances, we have also seen an example of a functional relationship between two independent objects: an edit control and a popup menu. That kind of relationship also required implementing a notification mechanism to avoid accessing dangling references.
Under ARC, such relationships are represented with zeroing weak references. Because neither of the objects owns the other, the reference to the other object should be weak, and because we need to prevent accessing dangling pointers, zeroing weak references are exactly the safety mechanism we need.
If we imagine that our GUI controls are actually reference-counted objects, then the relationship between the edit and its popup menu would be represented with the following code:
type
  TEdit = class(TControl, IEdit)
  protected
    [weak] FMenu: IMenu;
    ...
  public
    property Menu: IMenu read FMenu write SetMenu;
  end;

  TMenu = class(TComponent, IMenu)
    ...
In the VCL example, the menu had to store a reference back to the edit control in order to have a properly working free notification mechanism. In ARC, the menu does not have to store that reference. If there is a need to have the edit reference stored in the menu object for some other purpose, then such a reference would also need to be a zeroing weak one.
Because ARC allows shared ownership out of the box, there may be code where you would want to have a strong reference to the object you are using, even though that object can also stand on its own. So it would be quite common to have a slightly different relationship between edit control and menu.
type
  TEdit = class(TControl, IEdit)
  protected
    FMenu: IMenu;
    ...
  public
    property Menu: IMenu read FMenu write SetMenu;
  end;
In this example, the edit holds a strong reference to a popup menu, and as long as the edit needs to use that menu, it will be available unless we explicitly assign nil to the menu property. This is a different kind of ownership relation than the one where the edit control would be the sole owner of the menu, and where that menu would be constructed and treated as a child of the edit control.
If we have an edit control with such a strong relationship to the menu component, and we need to store a reference to the edit in the menu object, that reference still needs to be weak, as otherwise we would have a strong reference cycle that would prevent releasing both objects.
While ARC gives us more flexibility and additional features, like shared ownership, compared to manual memory management, it still has some limitations on what we can do straight out of the box.
Weak references in ARC, just like non-owning references in manual memory management, have a purpose—they describe the intent of our code and relationships between objects. And just like we can make a mess under manual memory management if we forget about the relationships between our objects, forgetting about those relationships under ARC, and failing to define them, will make a mess, too.
When that happens, it will not be a flaw of ARC, nor manual memory management, but an error on the developer's side. It is our responsibility and our mess to clean up.
> Another variant of the above example is also correct, but assigning to an interface reference needs to be the first thing you do after constructing the object instance.
This is really important!
procedure DoSomething(Intf: IMyIntf);
begin
  // do something
end;

procedure Foo;
var
  Obj: TMyObject;
begin
  Obj := TMyObject.Create;
  try
    DoSomething(Obj);
  finally
    Obj.Free; // <-- this raises an exception
  end;
end;
The reason is the transfer to the interface in the procedure call. On the call, the ref count is incremented (to 1), and once the procedure ends, the interface variable is disposed of and the ref count drops to 0, freeing the object.
There are several issues with the posted code.
First, instances of reference-counted classes must never, ever be manually released by calling Free. They should be left to ARC to manage their memory.
In order to leave them to ARC, reference counting must be properly initialized. That can only be done by assigning such an instance to an interface reference.
Passing an object reference to a method that takes an interface reference is not a proper way to do it, except on rare occasions where you know that the passed reference will be assigned within that method to some more permanent interface reference, for instance when you call Add on an interface list or a similar collection that holds interfaces.
The problem with calling methods without having properly initialized reference counting is that in such cases you depend on whether reference counting will be triggered inside that method or not. For instance, if you have a const parameter, it might not get triggered, and this can cause memory leaks.
Also, in such scenarios you are not allowed to touch that object reference after the method is called, as you don't have a strong reference to that instance in scope. If the method triggers reference counting but doesn't store the reference in some longer-lived variable, the instance will be released after the method completes, and any code using the object reference to access it will be working with a dangling pointer.
That is why I strongly advise that you also keep an interface reference to such an object in the same scope if, for some reason, you really need to have an object reference, too.
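As a sketch of that advice, the posted code could be reshaped along the following lines. The declarations of IMyIntf and TMyObject and the Work method are assumptions made for illustration, since the original comment does not show them.

```pascal
program KeepInterfaceRefDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical declarations, not shown in the original comment
  IMyIntf = interface
    procedure Work;
  end;

  TMyObject = class(TInterfacedObject, IMyIntf)
  public
    procedure Work;
  end;

procedure TMyObject.Work;
begin
  Writeln('Working');
end;

procedure DoSomething(Intf: IMyIntf);
begin
  Intf.Work;
end;

procedure Foo;
var
  Obj: TMyObject; // non-owning object reference
  Intf: IMyIntf;  // strong reference that manages the lifetime
begin
  Obj := TMyObject.Create;
  Intf := Obj;       // initialize reference counting right after construction
  DoSomething(Intf); // the call can no longer release the instance
  Obj.Work;          // safe, because Intf is still in scope
end; // no Free: the instance is released when Intf goes out of scope

begin
  Foo;
end.
```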
This is all correct, I just wanted to point out the dangers ARC can bring you when you aren't careful with reference-counted classes.
If you don't remember that DoSomething() expects an interface, the above example might bite you, and it might not be obvious at first.
You're right that reference-counted classes have to be treated as such, but you may have interfaces just to implement contracts and not think of the reference counting while writing your code. That's the reason I wanted to point the above out.
If you have an interface just as a contract and you don't want reference counting on a class, you can use a class that has reference counting disabled. Only then can you use such a class through an object reference. This is how TComponent is implemented. Since Delphi 11, there is the TNoRefCountObject base class that can be used as an ancestor for such classes. However, with such classes you must make sure that you don't have any interface reference to the object at the point you call Free, because while reference counting will not manage the instance lifetime, it will still call the _AddRef and _Release methods on interface references, and you don't want them called on a destroyed object instance.
The only danger in ARC is if you use it incorrectly. If you have a class that implements a reference counting mechanism, you must always have an interface reference to manage its lifetime correctly.