The purpose of weak references - Part II

- November 07, 2022

In the first part of this post series, I covered owning and non-owning references under manual memory management, and the purpose of non-owning (weak) references in that memory model. Following the same use cases, we can now clearly show the purpose of their counterparts under the automatic reference counting memory model.

Automatic reference counting

Automatic reference counting is a memory model under which an object instance will be valid as long as there is at least one strong reference to that object instance. When the last strong reference goes out of scope or is nilled, the object's reference count will drop to zero, and the instance will be automatically destroyed.

One of the side-effects of that design is that two object instances that are no longer reachable through any outside references can hold strong references to each other, thus keeping themselves alive and creating memory leaks. This is called a reference cycle.

To prevent problems with reference cycles, ARC has the concept of weak references. Weak references, in the context of ARC, are references to an object instance that don't participate in reference counting. Such references don't increase the reference count of an object, and if there are no other strong references to an object, such a weak reference will allow the object to be destroyed, even though it is accessible through the weak reference.

In ARC, it is the developer's job to recognize which references should be weak and mark them as such. In Delphi, there are several ways to do so: using the [weak] attribute, the [unsafe] attribute, or a plain pointer type.

In the classic Delphi compiler that works under manual memory management, object references also act like weak references to reference-counted object instances. Only interface references represent strong references to such an object.

Because developers need to explicitly mark weak references to break reference cycles, it may seem that ARC as a memory management model has a vital flaw. However, we have already seen that even under manual memory management, the developer needs to know which references are owning ones and which are not. Non-owning references in the manual memory management model will be the ones that must be weak under ARC.

Keeping your references in order and knowing which one belongs to which category is a requirement for both manual memory management and ARC. Having to think about the code you write is not a flaw, but rather a feature, as not thinking will get you in trouble regardless of the memory management model in question.

There are some other slight differences, but only in the sense that ARC makes some code and features easier to write and achieve than under manual memory management. Namely, ARC allows an unlimited number of shared strong references to the same object instance without having to write elaborate notification code that will prevent releasing the same object multiple times or releasing it while it is still in use by some other code.

Another helpful feature of ARC is that it implements zeroing weak references, that is, weak references that are zeroed (set to nil) once the object they point to is released. That avoids the dangling reference problem we have under manual memory management, where a non-owning reference can point to an already destroyed instance unless we manually implement some tracking and notification mechanism. Instead of writing elaborate code, we only need to mark the desired reference with the [weak] attribute, and the compiler and RTL will automatically do the rest.

Before using such a zeroing weak reference, we always need to check whether it is still assigned or not. If its value is nil, that means the associated object is no longer available and we cannot use it.

All other references that represent weak references as a concept of non-owning references are non-zeroing weak references. In other words, if the object instance they point to is destroyed, such a non-zeroing weak reference will turn into a dangling pointer. Because of that, we call such references unsafe.

Unsafe references, as their name tells us, are inherently dangerous. If we have zeroing weak references where we can always check whether an object instance is valid or not, why would we ever use unsafe ones? The reason for that is the performance penalty that comes with zeroing weak references. Namely, in order to set such a reference back to nil, we need to keep track of the object such references point to, and clear them once their associated object is destroyed. That tracking introduces a heavy performance penalty, and in places where a weak reference will definitely go out of scope before the object it points to, we can use unsafe weak references as a form of optimization. That is the sole purpose of unsafe references.

If, in any code you write, you are not sure whether you can use an unsafe reference or not, you can always use a zeroing weak reference. However, you will pay a (hefty) price in performance, so eventually it is good to learn whether you really need a zeroing weak reference or not.

In places where you really need a zeroing weak reference, using them will generally not be more expensive than any other custom tracking code you would need to implement under manual memory management to avoid dangling references. And the resulting code under ARC will definitely be cleaner and simpler than the one under manual memory management.

Strong references under ARC:

interface references

Zeroing weak references under ARC:

interface references marked with [weak] attribute

Non-zeroing weak references under ARC:

interface references marked with [unsafe] attribute
pointers
object references

Strong (owning) reference:

procedure Foo;
var
  Intf: IFoo; // owning reference
begin
  Intf := TFoo.Create;
  ...
end; // the interface reference will be cleared and the object released at this point

Strong (owning) and weak (non-owning) references:

procedure Foo;
var
  Intf: IFoo; // strong reference
  [weak] IntfRef: IFoo; // zeroing weak reference
begin
  Intf := TFoo.Create;
  IntfRef := Intf;
  ...

  Intf := nil; // explicitly release object (there are no other strong references)
  // IntfRef is now nil
  if Assigned(IntfRef) then
    ...
end;

procedure Foo;
var
  Intf: IFoo; // strong reference
  [unsafe] IntfRef: IFoo; // unsafe weak reference
begin
  Intf := TFoo.Create;
  IntfRef := Intf;
  ...

  Intf := nil; // explicitly release object (there are no other strong references)
  // IntfRef is now a dangling pointer and must not be used after this point
  ... 
end;

procedure Foo;
var
  Intf: IFoo; // strong reference
  ObjRef: TFoo; // unsafe weak reference
begin
  Intf := TFoo.Create;
  ObjRef := TFoo(Intf);
  ...

  Intf := nil; // explicitly release object (there are no other strong references)
  // ObjRef is now a dangling pointer and must not be used after this point
  ... 
end;

Another variant of the above example is also correct, but assigning to an interface reference needs to be the first thing you do after constructing the object instance.

procedure Foo;
var
  Intf: IFoo; // strong reference
  ObjRef: TFoo; // unsafe weak reference
begin
  ObjRef := TFoo.Create;
  Intf := ObjRef;
  ...

  Intf := nil; // explicitly release object (there are no other strong references)
  // ObjRef is now a dangling pointer and must not be used after this point
  ... 
end;

Ownership transfer:

Technically, this is actually shared ownership, as the original reference can be freely used regardless of what happens to the reference stored in the list. Even if the reference in the list is cleared, the object instance will still be alive as long as it is referenced through the strong Intf reference:

procedure Foo(List: TInterfaceList);
var
  Intf: IFoo; // strong reference
begin
  Intf := TFoo.Create;
  List.Add(Intf); // ownership is transferred to List 
end;

Shared ownership:

procedure TMainForm.OnButtonClick(Sender: TObject);
var
  Intf1, Intf2, Intf3: IFoo; // strong references
begin
  Intf1 := TFoo.Create;
  Intf2 := Intf1;
  Intf3 := Intf1;
  // as long as any of the above references are not nil, the object instance will be alive
  ...
end;

The purpose of weak references in ARC

Now that we know more about the behavior of weak references under ARC, we can look at the manual memory management examples and convert them to the ARC model. Because ARC automatically manages memory, we no longer need destructors:

type
  IParent = interface
  ...
  end;

  IChild = interface
  ...
  end;

  TChild = class(TInterfacedObject, IChild)
  protected
    Parent: IParent; // non-owning reference
  public
    constructor Create(const AParent: IParent);
  end;

  TParent = class(TInterfacedObject, IParent)
  public
    Child: IChild; // owning reference
    constructor Create;
  end;

constructor TChild.Create(const AParent: IParent);
begin
  Parent := AParent;
end;

constructor TParent.Create;
begin
  Child := TChild.Create(Self);
end;

The above example, as written, is not correct for ARC, and it will create a reference cycle. The parent holds a strong reference to a child, and the child holds a strong reference to the parent.
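One way to observe the cycle is to add destructors that report when they run. The following console program is a minimal sketch along those lines. The destructors, the Demo procedure, and the program name are additions for illustration and are not part of the original example. With the strong Parent field in place, neither destructor should be called after the outside reference is cleared.

```pascal
program CycleDemo;

{$APPTYPE CONSOLE}

type
  IParent = interface
  end;

  IChild = interface
  end;

  TChild = class(TInterfacedObject, IChild)
  protected
    Parent: IParent; // strong reference back to the parent: closes the cycle
  public
    constructor Create(const AParent: IParent);
    destructor Destroy; override; // added only to report destruction
  end;

  TParent = class(TInterfacedObject, IParent)
  public
    Child: IChild; // owning reference
    constructor Create;
    destructor Destroy; override; // added only to report destruction
  end;

constructor TChild.Create(const AParent: IParent);
begin
  inherited Create;
  Parent := AParent;
end;

destructor TChild.Destroy;
begin
  Writeln('TChild destroyed');
  inherited;
end;

constructor TParent.Create;
begin
  inherited Create;
  Child := TChild.Create(Self);
end;

destructor TParent.Destroy;
begin
  Writeln('TParent destroyed');
  inherited;
end;

procedure Demo;
var
  P: IParent; // the only outside reference
begin
  P := TParent.Create;
  P := nil; // parent and child still hold each other
  Writeln('Outside reference cleared');
end;

begin
  Demo;
end.
```

If the destructor messages never appear, both instances were kept alive by the references they hold to each other, which is exactly the leak described above.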

To avoid a memory leak, we need to break that reference cycle. To the untrained eye, this is the place where it may seem that ARC is imposing an unfair burden on the developer. And indeed, it does require thinking about the references, just like manual memory management does, just in a slightly different way. Compared to the garbage collection memory model, which will break cycles on its own, those two models may seem inferior.

However, GC also requires thinking about your references, as you can still rather easily create a nice mess and memory leaks (objects that are no longer used from some point on, but are still reachable and will not be collected by the GC mechanism). If you are not convinced, just Google "out of memory" combined with Java or C#. I am sure that the number of hits will convince you that this is a real-life issue, not an imaginary one. Such leaks can be rather obscure, and can be extremely hard to figure out for inexperienced developers. GC is very forgiving for beginners, and it can take you a long way before you get into trouble, but once you get stuck, you will be on a whole new level of stuck.

This is a slight digression from the memory management models supported in Delphi, but it shows that no matter what language and tools developers use, there is no free ride: they need to use their brains and think about the code they write.

Back to our parent-child example. We need to break the reference cycle. If we look at the manual memory management code, we have a clear picture of which reference is the owning one, and which is the non-owning one. In order to avoid a reference cycle, we just need to convert the non-owning reference in the child instance, pointing to its parent, to a weak reference:

  TChild = class(TInterfacedObject, IChild)
  protected
    [weak] Parent: IParent; // non-owning reference
  public
    constructor Create(const AParent: IParent);
  end;

We just need to mark that reference with the [weak] attribute. However, using zeroing weak references is costly from a performance point of view. If our child instance will never outlive the parent, then there is no danger of creating dangling references, and we can safely use unsafe weak references: mark the reference with the [unsafe] attribute, or use an object reference or pointer.

  TChild = class(TInterfacedObject, IChild)
  protected
    [unsafe] Parent: IParent; // non-owning reference
  public
    constructor Create(const AParent: IParent);
  end;

or

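  TChild = class(TInterfacedObject, IChild)
  protected
    Parent: TParent; // non-owning reference (TParent needs a forward declaration)
  public
    // with an object reference field, the constructor body must cast:
    // Parent := AParent as TParent;
    constructor Create(const AParent: IParent);
  end;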

Both [unsafe] and object references are good choices for unsafe weak references. Using pointers is less appealing, as it requires typecasting when using the reference. However, if you need to store unsafe references in a collection or an array, you cannot use attributes without creating an additional wrapper around the reference. In such cases, using object references is a clear winner.
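The wrapper mentioned above could be a small record that carries the attribute on its field. What follows is a sketch only, not a recommended design: TUnsafeFoo, IFoo, TFoo, and GetName are hypothetical names, and it assumes a Delphi version where the [unsafe] attribute is available for interface references on the compiler in use (10.1 Berlin or later).

```pascal
program UnsafeWrapperDemo;

{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  IFoo = interface
    function GetName: string;
  end;

  TFoo = class(TInterfacedObject, IFoo)
  public
    function GetName: string;
  end;

  // Hypothetical wrapper: the attribute is attached to the field, so every
  // list element stores a non-counted (unsafe) reference.
  TUnsafeFoo = record
    [unsafe] Ref: IFoo;
  end;

function TFoo.GetName: string;
begin
  Result := 'foo';
end;

procedure Demo;
var
  Strong: IFoo; // the owning reference; it must outlive the list entries
  Item: TUnsafeFoo;
  List: TList<TUnsafeFoo>;
begin
  List := TList<TUnsafeFoo>.Create;
  try
    Strong := TFoo.Create;
    Item.Ref := Strong; // does not increase the reference count
    List.Add(Item);

    // Safe only because Strong is still alive at this point
    Writeln(List[0].Ref.GetName);

    List.Clear;    // drop the unsafe references first
    Strong := nil; // then release the object
  finally
    List.Free;
  end;
end;

begin
  Demo;
end.
```

Compared with this, a TList<TFoo> of plain object references needs no wrapper type at all, which is why object references are the simpler choice here.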

This example shows us that the code intent and purpose of each reference does not change regardless of the memory model we are using. The requirement of explicitly marking references as weak in ARC goes beyond merely breaking reference cycles, and it clearly shows us the ownership relations between objects, which makes such code easier to understand and follow than the code written for manual memory management.

Besides a parent-child relationship between object instances, we have also seen an example of a functional relationship between two independent objects: an edit control and a popup menu. That kind of relationship also required implementing a notification mechanism to avoid accessing dangling references.

Under ARC, such relationships are represented with zeroing weak references. Because neither object owns the other, the reference to the other object should be weak, and because we need to prevent accessing dangling pointers, a zeroing weak reference is exactly the safety mechanism we need.

If we imagine that our GUI controls are actually reference-counted objects, then the relationship between the edit and its popup menu would be represented with the following code:

type
  TEdit = class(TControl, IEdit)
  protected
    [weak] FMenu: IMenu;
   ... 
  public
    property Menu: IMenu read FMenu write SetMenu;
  end;

  TMenu = class(TComponent, IMenu)
  ...

In the VCL example, the menu had to store a reference back to the edit control in order to have a properly working free notification mechanism. In ARC, the menu does not have to store that reference. If there is a need to have the edit reference stored in the menu object for some other purpose, then such a reference would also need to be a zeroing weak one.
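To make the zeroing behavior concrete, here is a minimal, self-contained sketch of the same relationship using plain reference-counted classes. TSimpleEdit, TSimpleMenu, Popup, and ShowMenu are hypothetical names chosen for this illustration (the real TControl and TComponent are not reference-counted), and the code assumes a compiler that supports [weak] on interface references (Delphi 10.1 Berlin or later).

```pascal
program WeakMenuDemo;

{$APPTYPE CONSOLE}

type
  IMenu = interface
    procedure Popup;
  end;

  IEdit = interface
    procedure SetMenu(const Value: IMenu);
    procedure ShowMenu;
  end;

  TSimpleMenu = class(TInterfacedObject, IMenu)
  public
    procedure Popup;
  end;

  TSimpleEdit = class(TInterfacedObject, IEdit)
  protected
    [weak] FMenu: IMenu; // the edit uses the menu, but does not own it
  public
    procedure SetMenu(const Value: IMenu);
    procedure ShowMenu;
  end;

procedure TSimpleMenu.Popup;
begin
  Writeln('Menu shown');
end;

procedure TSimpleEdit.SetMenu(const Value: IMenu);
begin
  FMenu := Value; // no notification bookkeeping is needed
end;

procedure TSimpleEdit.ShowMenu;
var
  Menu: IMenu;
begin
  // Take a temporary strong reference so the menu stays alive while in use
  Menu := FMenu;
  if Assigned(Menu) then
    Menu.Popup
  else
    Writeln('No menu available');
end;

procedure Demo;
var
  Edit: IEdit;
  Menu: IMenu;
begin
  Edit := TSimpleEdit.Create;
  Menu := TSimpleMenu.Create;
  Edit.SetMenu(Menu);
  Edit.ShowMenu; // the menu is alive here

  Menu := nil;   // last strong reference released; FMenu is zeroed
  Edit.ShowMenu; // the edit detects that the menu is gone
end;

begin
  Demo;
end.
```

The second ShowMenu call should take the "no menu" branch without any free notification code in either class. Copying the weak field into a local strong reference before use is a precaution added in this sketch so that the menu cannot be released in the middle of the call.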

Because ARC allows shared ownership out of the box, there may be code where you would want to have a strong reference to the object you are using, even though that object can also stand on its own. So it would be quite common to have a slightly different relationship between the edit control and the menu:

type
  TEdit = class(TControl, IEdit)
  protected
    FMenu: IMenu;
   ... 
  public
    property Menu: IMenu read FMenu write SetMenu;
  end;

In this example, the edit holds a strong reference to a popup menu, and as long as the edit needs to use that menu, it will be available unless we explicitly assign nil to the menu property. This is a different kind of ownership relation than the one where the edit control would be the sole owner of the menu, and where that menu would be constructed and treated as a child of the edit control.

If we have an edit control with such a strong relationship to the menu component, and we would need to store a reference to the edit in the menu object, that reference still needs to be weak, as otherwise we would have a strong reference cycle that would prevent releasing both objects.

While ARC gives us more flexibility and additional features, like shared ownership, compared to manual memory management, it still has some limitations on what we can do straight out of the box.

Weak references in ARC, just like non-owning references in manual memory management, have a purpose—they describe the intent of our code and relationships between objects. And just like we can make a mess under manual memory management if we forget about the relationships between our objects, forgetting about those relationships under ARC, and failing to define them, will make a mess, too.

When that happens, it will not be a flaw of ARC, nor manual memory management, but an error on the developer's side. It is our responsibility and our mess to clean up.

Comments

baka0815, November 25, 2022 at 2:38 AM
> Another variant of the above example is also correct, but assigning to an interface reference needs to be the first thing you do after constructing the object instance.

This is really important!

procedure DoSomething(Intf: IMyIntf);
begin
  // do something
end;

procedure Foo;
var
  Obj: TMyObject;
begin
  Obj := TMyObject.Create;
  try
    DoSomething(Obj);
  finally
    Obj.Free; // <-- this raises an exception
  end;
end;

The reason is the transfer to the interface in the procedure call. On the call the ref count is incremented (to 1) and once the procedure ends, the interface variable is disposed of and the ref count is set to 0, freeing the object.