The purpose of weak references - Part II
In the first part of this post series, I covered owning and non-owning references under manual memory management, and the purpose of non-owning (weak) references in that memory model. Following the same use cases, we can now clearly show the purpose of their counterparts under the automatic reference counting memory model.
Automatic reference counting
Automatic reference counting is a memory model under which an object instance will be valid as long as there is at least one strong reference to that object instance. When the last strong reference goes out of scope or is nilled, the object's reference count will drop to zero, and the instance will be automatically destroyed.
One of the side-effects of that design is that two object instances that are no longer reachable through any outside references can hold strong references to each other, thus keeping themselves alive and creating memory leaks. This is called a reference cycle.
To prevent problems with reference cycles, ARC has the concept of weak references. Weak references, in the context of ARC, are references to an object instance that don't participate in reference counting. Such references don't increase the reference count of an object, and if there are no other strong references to an object, such a weak reference will allow the object to be destroyed, even though it is accessible through the weak reference.
In ARC, it is the developer's job to recognize which references should be weak and mark them as such. In Delphi, there are several ways to do so: using the [weak] attribute, the [unsafe] attribute, or a plain pointer type.
In the classic Delphi compiler that works under manual memory management, object references also act like weak references to reference-counted object instances. Only interface references represent strong references to such an object.
Because developers need to explicitly mark weak references to break reference cycles, it may seem that ARC as a memory management model has a vital flaw. However, we have already seen that even under manual memory management, the developer needs to know which references are owning ones and which are not. Non-owning references in the manual memory management model will be the ones that must be weak under ARC.
Keeping your references in order and knowing which one belongs to which category is a requirement for both manual memory management and ARC. Having to think about the code you write is not a flaw, but rather a feature, as not thinking will get you in trouble regardless of the memory management model in question.
There are some other slight differences, but only in the sense that ARC makes some code and features easier to write and achieve than under manual memory management. Namely, ARC allows an unlimited number of shared strong references to the same object instance without having to write elaborate notification code that will prevent releasing the same object multiple times or releasing it while it is still in use by some other code.
Another helpful feature of ARC is that it implements zeroing weak references. In other words, such weak references are zeroed (set to nil) once the object they point to is released. That avoids the dangling reference problem we have under manual memory management, where a non-owning reference can point to an already destroyed instance unless we manually implement some tracking and notification mechanism. Instead of writing elaborate code, we only need to mark the desired reference with the [weak] attribute, and the compiler and RTL will automatically do the rest.
Before using such a zeroing weak reference, we always need to check whether it is still assigned or not. If its value is nil, that means the associated object is no longer available and we cannot use it.
All other references that represent weak references as a concept of non-owning references are non-zeroing weak references. In other words, if the object instance they point to is destroyed, such a non-zeroing weak reference will turn into a dangling pointer. Because of that, we call such references unsafe.
Unsafe references, as their name tells us, are inherently dangerous. If we have zeroing weak references where we can always check whether an object instance is valid or not, why would we ever use unsafe ones? The reason for that is the performance penalty that comes with zeroing weak references. Namely, in order to set such a reference back to nil, we need to keep track of the object such references point to, and clear them once their associated object is destroyed. That tracking introduces a heavy performance penalty, and in places where a weak reference will definitely go out of scope before the object it points to, we can use unsafe weak references as a form of optimization. That optimization is the sole purpose of unsafe references.
If, in any code you write, you are not sure whether you can use an unsafe reference or not, you can always use a zeroing weak reference. However, you will pay a (hefty) price in performance, so eventually it is good to learn whether you really need a zeroing weak reference or not.
In places where you really need a zeroing weak reference, using them will generally not be more expensive than any other custom tracking code you would need to implement under manual memory management to avoid dangling references. And the resulting code under ARC will definitely be cleaner and simpler than the one under manual memory management.
Strong references under ARC:
- interface references
Zeroing weak references under ARC:
- interface references marked with the [weak] attribute
Non-zeroing weak references under ARC:
- interface references marked with the [unsafe] attribute
- object references
Strong (owning) reference:
    procedure Foo;
    var
      Intf: IFoo; // owning reference
    begin
      Intf := TFoo.Create;
      ...
    end; // the interface reference will be cleared and the object released at this point
Strong (owning) and weak (non-owning) references:
    procedure Foo;
    var
      Intf: IFoo; // strong reference
      [weak] IntfRef: IFoo; // zeroing weak reference
    begin
      Intf := TFoo.Create;
      IntfRef := Intf;
      ...
      Intf := nil; // explicitly release object (there are no other strong references)
      // IntfRef is now nil
      if Assigned(IntfRef) then ...
    end;

    procedure Foo;
    var
      Intf: IFoo; // strong reference
      [unsafe] IntfRef: IFoo; // unsafe weak reference
    begin
      Intf := TFoo.Create;
      IntfRef := Intf;
      ...
      Intf := nil; // explicitly release object (there are no other strong references)
      // IntfRef is now a dangling pointer and must not be used after this point
      ...
    end;

    procedure Foo;
    var
      Intf: IFoo; // strong reference
      ObjRef: TFoo; // unsafe weak reference
    begin
      Intf := TFoo.Create;
      ObjRef := TFoo(Intf);
      ...
      Intf := nil; // explicitly release object (there are no other strong references)
      // ObjRef is now a dangling pointer and must not be used after this point
      ...
    end;
Another variant of the above example is also correct, but assigning to an interface reference needs to be the first thing you do after constructing the object instance.
    procedure Foo;
    var
      Intf: IFoo; // strong reference
      ObjRef: TFoo; // unsafe weak reference
    begin
      ObjRef := TFoo.Create;
      Intf := ObjRef; // must be the first thing done after construction
      ...
      Intf := nil; // explicitly release object (there are no other strong references)
      // ObjRef is now a dangling pointer and must not be used after this point
      ...
    end;
Technically, the ownership transfer in the following example is actually shared ownership, as the original reference can be freely used regardless of what happens to the reference stored in the list. Even if the reference in the list is cleared, the object instance will still be alive as long as it is referenced through the original strong reference.
    procedure Foo(List: TInterfaceList);
    var
      Intf: IFoo; // strong reference
    begin
      Intf := TFoo.Create;
      List.Add(Intf); // ownership is transferred to List
    end;
    procedure TMainForm.OnButtonClick(Sender: TObject);
    var
      Intf1, Intf2, Intf3: IFoo; // strong references
    begin
      Intf1 := TFoo.Create;
      Intf2 := Intf1;
      Intf3 := Intf1;
      // as long as any of the above references is not nil, the object instance will be alive
      ...
    end;
The purpose of weak references in ARC
Now that we know more about the behavior of weak references under ARC, we can look at the manual memory management examples and convert them to the ARC model. Because ARC automatically manages memory, we no longer need destructors:
    type
      IParent = interface
        ...
      end;

      IChild = interface
        ...
      end;

      TChild = class(TInterfacedObject, IChild)
      protected
        Parent: IParent; // non-owning reference
      public
        constructor Create(const AParent: IParent);
      end;

      TParent = class(TInterfacedObject, IParent)
      public
        Child: IChild; // owning reference
        constructor Create;
      end;

    constructor TChild.Create(const AParent: IParent);
    begin
      Parent := AParent;
    end;

    constructor TParent.Create;
    begin
      Child := TChild.Create(Self);
    end;
The above example, as written, is not correct for ARC, and it will create a reference cycle. The parent holds a strong reference to a child, and the child holds a strong reference to the parent.
To avoid a memory leak, we need to break that reference cycle. To the untrained eye, this is the place where it may seem that ARC is imposing an unfair burden onto the developer. And indeed, it does require thinking about the references, just like manual memory management does, just in a slightly different way. Compared to the garbage collection memory model, which will break cycles on its own, those two models may seem inferior.
However, GC also requires thinking about your references, as you can still rather easily create a nice mess and memory leaks (objects that are no longer used from some point on, but are still reachable and will not be collected by the GC mechanism). If you are not convinced, just Google "out of memory" combined with Java or C#. I am sure that the number of hits will convince you that this is a real-life issue, not an imaginary one. Such leaks can be rather obscure, and can be extremely hard to figure out for inexperienced developers. GC is very forgiving for beginners, and it can take you a long way before you get into trouble, but once you get stuck, you will be on a whole new level of stuck.
This is a slight digression from the memory management models supported in Delphi, but it shows that no matter what language and tools developers use, there is no free ride: they need to use their brains and think about the code they write.
Back to our parent-child example. We need to break the reference cycle. If we look at the manual memory management code, we have a clear picture of which reference is the owning one, and which is the non-owning one. In order to avoid a reference cycle, we just need to convert the non-owning reference in the child instance, pointing to its parent, to a weak reference:
    TChild = class(TInterfacedObject, IChild)
    protected
      [weak] Parent: IParent; // non-owning reference
    public
      constructor Create(const AParent: IParent);
    end;
We just need to mark that reference with the [weak] attribute. But using zeroing weak references is costly from a performance point of view. If our child instance will never outlive the parent, then there is no danger of creating dangling references, and we can safely use unsafe weak references: mark the reference with the [unsafe] attribute, or use an object reference or pointer.
    TChild = class(TInterfacedObject, IChild)
    protected
      [unsafe] Parent: IParent; // non-owning reference
    public
      constructor Create(const AParent: IParent);
    end;

or

    TChild = class(TInterfacedObject, IChild)
    protected
      Parent: TParent; // non-owning reference
    public
      constructor Create(const AParent: IParent);
    end;
The [unsafe] attribute and object references are good choices for unsafe weak references. Using pointers is less appealing, as it requires typecasting when using the reference. However, if you need to store unsafe references in a collection or an array, you cannot use attributes without creating an additional wrapper around the reference. In such cases, using object references is a clear winner.
This example shows us that the code intent and purpose of each reference does not change regardless of the memory model we are using. The requirement of explicitly marking references as weak in ARC goes beyond merely breaking reference cycles, and it clearly shows us the ownership relations between objects, which makes such code easier to understand and follow than the code written for manual memory management.
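To see the effect of the [weak] marker on the parent-child pair, the declarations can be wrapped into a small console program. This is only a sketch under a few assumptions: the destructors exist purely for tracing (they are not needed for memory management), the Run procedure is a hypothetical driver, and the compiler supports [weak] on interface references (as far as I know, Delphi 10.1 Berlin or later on the classic compilers).

```delphi
program ParentChildDemo;

{$APPTYPE CONSOLE}

type
  IParent = interface
  end;

  IChild = interface
  end;

  TChild = class(TInterfacedObject, IChild)
  protected
    [weak] Parent: IParent; // non-owning, zeroing weak reference
  public
    constructor Create(const AParent: IParent);
    destructor Destroy; override; // added for tracing only
  end;

  TParent = class(TInterfacedObject, IParent)
  public
    Child: IChild; // owning reference
    constructor Create;
    destructor Destroy; override; // added for tracing only
  end;

constructor TChild.Create(const AParent: IParent);
begin
  inherited Create;
  Parent := AParent; // does not increase the parent's reference count
end;

destructor TChild.Destroy;
begin
  Writeln('TChild destroyed');
  inherited;
end;

constructor TParent.Create;
begin
  inherited Create;
  Child := TChild.Create(Self);
end;

destructor TParent.Destroy;
begin
  Writeln('TParent destroyed');
  inherited;
end;

// Hypothetical driver, not part of the original example
procedure Run;
var
  P: IParent;
begin
  P := TParent.Create;
  P := nil; // the last strong reference to the parent is released here
  Writeln('Run finished');
end;

begin
  Run;
end.
```

With the weak back-reference in place, both trace lines should appear before the final message. If the [weak] attribute is removed, the parent and child keep each other alive and neither destructor should run, which is the reference cycle described earlier.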
Besides a parent-child relationship between object instances, we have also seen an example of a functional relationship between two independent objects: an edit control and a popup menu. That kind of relationship also required implementing a notification mechanism to avoid accessing dangling references.
Under ARC, such relationships are represented with zeroing weak references. Because neither of the objects owns the other, the reference to the other object should be weak, and because we need to prevent accessing dangling pointers, zeroing weak references are exactly the safety mechanism we need.
If we imagine that our GUI controls are actually reference-counted objects, then the relationship between the edit and its popup menu would be represented with the following code:
    type
      TEdit = class(TControl, IEdit)
      protected
        [weak] FMenu: IMenu;
        ...
      public
        property Menu: IMenu read FMenu write SetMenu;
      end;

      TMenu = class(TComponent, IMenu)
        ...
In the VCL example, the menu had to store a reference back to the edit control in order to have a properly working free notification mechanism. In ARC, the menu does not have to store that reference. If there is a need to have the edit reference stored in the menu object for some other purpose, then such a reference would also need to be a zeroing weak one.
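The zeroing behavior for two independent objects can be sketched as a small console program. It makes several assumptions that are not part of the original example: both classes descend from TInterfacedObject instead of TControl and TComponent so that they are really reference counted, and the Popup and ShowMenu methods and the Run procedure are hypothetical names used only for illustration. It also assumes a compiler that supports [weak] on interface references.

```delphi
program WeakMenuDemo;

{$APPTYPE CONSOLE}

type
  IMenu = interface
    procedure Popup;
  end;

  IEdit = interface
    procedure SetMenu(const AMenu: IMenu);
    procedure ShowMenu;
  end;

  // Simplified stand-ins for reference-counted GUI classes
  TMenu = class(TInterfacedObject, IMenu)
  public
    procedure Popup;
  end;

  TEdit = class(TInterfacedObject, IEdit)
  protected
    [weak] FMenu: IMenu; // zeroing weak reference, the edit does not own the menu
  public
    procedure SetMenu(const AMenu: IMenu);
    procedure ShowMenu;
  end;

procedure TMenu.Popup;
begin
  Writeln('Menu popup');
end;

procedure TEdit.SetMenu(const AMenu: IMenu);
begin
  FMenu := AMenu;
end;

procedure TEdit.ShowMenu;
begin
  // A zeroing weak reference must be checked before every use
  if Assigned(FMenu) then
    FMenu.Popup
  else
    Writeln('No menu available');
end;

// Hypothetical driver
procedure Run;
var
  Edit: IEdit;
  Menu: IMenu;
begin
  Edit := TEdit.Create;
  Menu := TMenu.Create;
  Edit.SetMenu(Menu);
  Edit.ShowMenu; // the menu is still alive at this point
  Menu := nil;   // the only strong reference is released, the menu is destroyed
  Edit.ShowMenu; // FMenu is expected to be nil now, so the else branch is taken
end;

begin
  Run;
end.
```

No notification code is needed on either side: the edit only has to check the reference before using it.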
Because ARC allows shared ownership out of the box, there may be code where you would want to have a strong reference to the object you are using, even though that object can also stand on its own. So it would be quite common to have a slightly different relationship between the edit control and the menu:
    type
      TEdit = class(TControl, IEdit)
      protected
        FMenu: IMenu; // strong reference
        ...
      public
        property Menu: IMenu read FMenu write SetMenu;
      end;
In this example, the edit holds a strong reference to a popup menu, and as long as the edit needs to use that menu, it will be available unless we explicitly assign nil to the menu property. This is a different kind of ownership relation than the one where the edit control would be the sole owner of the menu, and where that menu would be constructed and treated as a child of the edit control.
If we have an edit control with such a strong relationship to the menu component, and we need to store a reference to the edit in the menu object, that reference still needs to be weak, as otherwise we would have a strong reference cycle that would prevent releasing both objects.
While ARC gives us more flexibility and additional features, like shared ownership, compared to manual memory management, it still has some limitations on what we can do straight out of the box.
Weak references in ARC, just like non-owning references in manual memory management, have a purpose—they describe the intent of our code and relationships between objects. And just like we can make a mess under manual memory management if we forget about the relationships between our objects, forgetting about those relationships under ARC, and failing to define them, will make a mess, too.
When that happens, it will not be a flaw of ARC, nor manual memory management, but an error on the developer's side. It is our responsibility and our mess to clean up.