Virtual methods in Delphi

Virtual methods enable subtype polymorphism - in other words, they allow implementing different behavior in descendant classes.

That means if you have a base TShape class with a virtual Paint method, and several descendant classes like TRectangle, TCircle and TTriangle, then each of those subclasses can implement a different Paint method to appropriately paint itself. You can call the Paint method on any shape instance without needing to know which kind it is, and it will be correctly painted.



Static vs dynamic dispatch

Virtual methods can achieve polymorphism via a mechanism called dynamic dispatching at runtime, while non-virtual methods will be statically dispatched (bound) during compilation. Polymorphism cannot be achieved by statically bound methods—in other words, when you call a statically bound method, you will call the same implementation for all descendant classes.

Static dispatching means that the compiler will resolve the method address at compile time, and will emit code that will directly call that address without any indirections.

Dynamic dispatching is a bit more complicated. Instead of pointing to the method address directly, the compiler will point to a particular slot (address) in the VMT (virtual method table) associated with every object instance. Depending on the exact class of the instance, that VMT will hold different addresses pointing to the last method override in the class hierarchy.

Method calls

Methods are similar to regular functions and procedures with one significant difference—they also pass an additional (hidden) parameter, identifying the object instance upon which they were called:

TShape = class
protected
  FX, FY: Integer;
public
  procedure Paint;
  procedure Move(AX, AY: Integer);
  property X: Integer read FX write FX;
  property Y: Integer read FY write FY;
end;

procedure TShape.Paint;
begin
  Writeln('Painting shape at');
  Writeln('X: ', X);
  Writeln('Y: ', Y);
end;

procedure TShape.Move(AX, AY: Integer);
begin
  FX := AX;
  FY := AY;
  Paint;
end;

var
  Shape: TShape;
begin
  Shape := TShape.Create;
  Shape.Move(10, 20);
  Shape.Paint;
  ...
end;

When we call Shape.Move(10, 20), the compiler roughly translates it to Move(Shape, 10, 20). When we call Shape.Paint, it is translated to Paint(Shape).

Calling another class method from within the method implementation will pass Self as a parameter. For instance:

procedure TShape.Move(AX, AY: Integer);
begin
  FX := AX;
  FY := AY;
  Paint; // this will translate to Paint(Self)
end;

This hidden object instance parameter is also crucial for understanding the differences between static and dynamic dispatching, as well as some special tricks you can use with statically bound methods.

Static dispatch

In Delphi, methods are statically bound by default. Only if they are marked with the virtual, dynamic or override directives will they be dynamically bound:

TShape = class
public
  procedure Paint;
end;

The Paint method in the above declaration is statically bound. When the compiler encounters a call to Shape.Paint, it will resolve it directly to the TShape.Paint address. If we translate further by adding an object instance parameter, our call will look like TShape.Paint(Shape).

This will be equivalent to having a standalone Paint procedure with one parameter of the TShape type:

TShape = class
protected
  FX, FY: Integer;
public
  property X: Integer read FX write FX;
  property Y: Integer read FY write FY;
end;

procedure ShapePaint(AShape: TShape);
begin
  Writeln('Painting shape at');
  Writeln('X: ', AShape.X);
  Writeln('Y: ', AShape.Y);
end;

procedure ShapeMove(AShape: TShape; AX, AY: Integer);
begin
  AShape.X := AX;
  AShape.Y := AY;
  ShapePaint(AShape);
end;

var
  Shape: TShape;
begin
  Shape := TShape.Create;
  // this is equivalent of calling TShape.Paint in previous declaration
  ShapePaint(Shape);
  ...

Since TShape.Paint is resolved at compile time, and the method resolution (address) itself does not involve a particular object instance (the instance is only passed as a parameter), we can call Paint on a nil object reference, just like we can pass nil to a standalone procedure. Of course, if code inside the method tries to access any data or other code inside that nil instance, we will get an access violation exception at that point, but the call to the method itself will not crash.

If we check whether the passed object instance is nil, and perform work only if it is not nil, then we will have perfectly working code and no crashes:

procedure TShape.Paint;
begin
  if Assigned(Self) then
  begin
    Writeln('Painting shape at');
    Writeln('X: ', X);
    Writeln('Y: ', Y);
  end;
end;

procedure ShapePaint(AShape: TShape);
begin
  if Assigned(AShape) then
  begin
    Writeln('Painting shape at');
    Writeln('X: ', AShape.X);
    Writeln('Y: ', AShape.Y);
  end;
end;

For most code, it makes no sense to check whether the instance inside a statically bound method is nil. However, the fact that we can safely call statically bound methods on nil references allows us to implement tricks in special methods like Free. In real-life code, such a method can legitimately be called on a nil reference, and checking for nil from the outside would result in excessive checks everywhere.

Note: Checking Self for nil in statically bound methods is appropriate only if nil is a valid argument. It should never be used as a crash prevention measure in cases where you never expect nil, but you want to be safe just in case you have some bug somewhere.
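As a minimal sketch of a method where nil is a valid argument, consider a Free-like helper. TResource and Release are hypothetical names used only for this illustration; the point is that the nil check lives in one place inside the statically bound method instead of at every call site.

```pascal
program NilSafeCall;
{$APPTYPE CONSOLE}

type
  // TResource and Release are hypothetical, introduced for illustration only.
  TResource = class
  public
    destructor Destroy; override;
    procedure Release; // statically bound, so it can be called on nil
  end;

destructor TResource.Destroy;
begin
  Writeln('Resource destroyed');
  inherited;
end;

procedure TResource.Release;
begin
  // Self is only a hidden parameter here, so nil is a legitimate value.
  if Assigned(Self) then
    Destroy;
end;

var
  Res: TResource;
begin
  Res := nil;
  Res.Release; // does nothing, Self is nil

  Res := TResource.Create;
  Res.Release; // destroys the instance
end.
```

The first call never touches instance data, so it is safe. The second call reaches Destroy, which is virtual and therefore needs a real instance.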

Dynamic dispatch

As previously mentioned, statically bound methods have fixed implementations. For our shape class, that means that in order to properly paint different shapes, we would need to check for the actual type inside the Paint method, and then draw the appropriate shape:

procedure TShape.Paint;
begin
  if Self is TRectangle then
    Writeln('Painting rectangle at')
  else if Self is TCircle then
    Writeln('Painting circle at')
  else
    Writeln('Painting unknown shape at');
  Writeln('X: ', X);
  Writeln('Y: ', Y);
end;

Such a paint method is not very extensible, and it is not following object-oriented programming principles—as the base class should never know or care about its descendant classes. If we add a new shape descendant, we would need to change the paint method to support painting such a shape. This is where virtual methods come in handy, as they allow us to customize painting for each shape without changing the code in the base class:

TShape = class
protected
  FX, FY: Integer;
public
  procedure Paint; virtual;
  procedure Move(AX, AY: Integer);
  property X: Integer read FX write FX;
  property Y: Integer read FY write FY;
end;

TRectangle = class(TShape)
public
  procedure Paint; override;
end;

TCircle = class(TShape)
public
  procedure Paint; override;
end;

procedure TShape.Paint;
begin
  Writeln('X: ', X);
  Writeln('Y: ', Y);
end;

procedure TRectangle.Paint;
begin
  Writeln('Painting rectangle at');
  inherited;
end;

procedure TCircle.Paint;
begin
  Writeln('Painting circle');
end;

Inside the overridden virtual methods, we can also call the inherited Paint method so we don't have to reimplement common code in each descendant method, but we don't have to call it, and we can completely change its behavior if we want to.

Besides virtual methods, Delphi also has dynamic methods. They are also virtual in the sense that they use dynamic dispatching and can be overridden, but instead of using a VMT for dispatching, they use a slower but more space-efficient DMT (dynamic method table). The main difference is that the VMT for each class contains a complete list of entries for all virtual methods in the class hierarchy, while a DMT contains entries only for the dynamic methods introduced or overridden in that particular class.
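For reference, here is a small sketch of how a dynamic method is declared next to a virtual one. The Describe method is a hypothetical addition to the shape classes; both kinds of methods are overridden with the same override directive, and both calls are resolved at runtime.

```pascal
program DynamicDemo;
{$APPTYPE CONSOLE}

type
  TShape = class
  public
    procedure Paint; virtual;     // dispatched through the VMT
    procedure Describe; dynamic;  // hypothetical method, dispatched through the DMT
  end;

  TCircle = class(TShape)
  public
    procedure Paint; override;
    procedure Describe; override; // dynamic methods also use override
  end;

procedure TShape.Paint;
begin
  Writeln('Painting shape');
end;

procedure TShape.Describe;
begin
  Writeln('Generic shape');
end;

procedure TCircle.Paint;
begin
  Writeln('Painting circle');
end;

procedure TCircle.Describe;
begin
  Writeln('Circle');
end;

var
  Shape: TShape;
begin
  Shape := TCircle.Create;
  try
    Shape.Paint;    // resolved at runtime to TCircle.Paint
    Shape.Describe; // resolved at runtime to TCircle.Describe
  finally
    Shape.Free;
  end;
end.
```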

Since the basic dispatching principle (behavior) is the same, the only difference is in the method lookup process, and virtual methods are more widely used, I will only cover dispatching through a VMT.

Dynamic dispatching goes through the class' virtual method table. Every object instance contains a pointer to its class' VMT, and the class' VMT holds the addresses of all virtual methods declared in that class and its ancestors. If the method is overridden in the class, then the address will point to that override. If not, it will point to the last override from its ancestors:

var
  Shape: TShape;
begin
  Shape := TRectangle.Create;
  Shape.Paint;
  ...

So when we call the virtual Shape.Paint method where Shape is a TRectangle, the compiler will determine the method's slot index in the TShape VMT at compile time, and emit code that reads the actual address at runtime from the VMT the Shape instance points to, which is the VMT of the instance's actual class. In this case, that will be the TRectangle VMT.

The output of the above code will be:

Painting rectangle at
X: 0
Y: 0

If we create a TCircle instead of a TRectangle, then the output will be just:

Painting circle

because we didn't call the inherited implementation in TCircle's Paint method override.

In the above example it may seem like the compiler could also directly resolve the Paint method because it knows that Shape is a TRectangle or TCircle, as it is constructed just one line before the call to Paint. However, if we have a collection of different shapes and want to paint all of them, the compiler certainly does not know which method it should call for Shapes[i].Paint at compile time.

If the method is virtual, the compiler will not bother to go back and see whether it actually knows the shape type or not. It will just always use dynamic dispatching and fully resolve such a method's address at runtime.
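The collection case can be sketched as follows. This is a self-contained variation of the shape classes from this post, trimmed to what the example needs; the Shapes array and the loop are illustrative additions.

```pascal
program PaintAll;
{$APPTYPE CONSOLE}

type
  TShape = class
  protected
    FX, FY: Integer;
  public
    procedure Paint; virtual;
  end;

  TRectangle = class(TShape)
  public
    procedure Paint; override;
  end;

  TCircle = class(TShape)
  public
    procedure Paint; override;
  end;

procedure TShape.Paint;
begin
  Writeln('X: ', FX);
  Writeln('Y: ', FY);
end;

procedure TRectangle.Paint;
begin
  Writeln('Painting rectangle at');
  inherited;
end;

procedure TCircle.Paint;
begin
  Writeln('Painting circle');
end;

var
  Shapes: array[0..1] of TShape;
  I: Integer;
begin
  Shapes[0] := TRectangle.Create;
  Shapes[1] := TCircle.Create;

  // The static type of Shapes[I] is TShape; the override that runs
  // is picked at runtime from the VMT of each instance's actual class.
  for I := Low(Shapes) to High(Shapes) do
    Shapes[I].Paint;

  for I := Low(Shapes) to High(Shapes) do
    Shapes[I].Free;
end.
```

The loop body is compiled once, yet it paints a rectangle on the first pass and a circle on the second.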

Because the VMT pointer is associated (stored) with the object instance, and populated during instance construction at runtime, calling a virtual method on a nil object reference will crash, as there is no VMT and we cannot figure out the actual method address.
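A small sketch of that failure, for contrast with the nil-safe static call shown earlier. It assumes a desktop compiler where System.SysUtils converts the hardware fault into an exception; the exact exception class and message are platform-dependent, so none is shown here.

```pascal
program NilVirtualCall;
{$APPTYPE CONSOLE}

uses
  System.SysUtils; // needed so runtime faults are raised as exceptions

type
  TShape = class
  public
    procedure Paint; virtual;
  end;

procedure TShape.Paint;
begin
  Writeln('Painting shape');
end;

var
  Shape: TShape;
begin
  Shape := nil;
  try
    // The call has to read the VMT pointer from the instance,
    // and there is no instance to read it from.
    Shape.Paint;
  except
    on E: Exception do
      Writeln('Call failed: ', E.ClassName);
  end;
end.
```

Unlike the statically bound case, the body of Paint is never reached, so no nil check inside the method can help.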

Virtual method overriding vs method hiding

Virtual methods can be overridden in descendant classes by adding the override directive to the method declaration. If you don't add the override directive, and a virtual method of the same name is declared in one of the ancestor classes, the compiler will issue a warning:

W1010 Method '%s' hides virtual method of base type '%s'

What happens here is that the new method declared without override is no longer virtual, and instead of being dynamically dispatched, it will be statically resolved at compile time. While the new method has the same name as the virtual ancestor method, we now have two separate method chains in the class: one statically bound and one dynamically bound. Depending on the visibility (the context where it is called), the compiler will use the former where it is accessible rather than the latter.

This is what "hiding" means in the warning—it literally tells the compiler: In contexts where you see (can access) this new method, you will not use dynamic dispatching and the VMT, but you will just directly call this method.

Because it is statically bound, that new method will not be added to the class' virtual method table. But we have to keep in mind that the virtual ancestor method is still part of the class' VMT, and in code where the compiler does not have direct access to (does not see) this new non-virtual method, it will use dynamic dispatching, and the actual method called will be the last override in the class hierarchy:

TTriangle = class(TShape)
public
  procedure Paint;
end;

procedure TTriangle.Paint;
begin
  Writeln('Painting triangle at ', ' X: ', X, ' Y: ', Y);
end;

Now we have a statically bound Paint method that has broken our virtual Paint method chain. If we did not intend to do so, this will lead to all kinds of weird, unintended behavior.

For instance, calling Paint on a TShape variable even if it contains a TTriangle instance will call the virtual Paint method, but if we call Paint on a TTriangle variable it will call the statically bound Paint method declared in TTriangle:

var
  Shape: TShape;
  Triangle: TTriangle;
begin
  Shape := TTriangle.Create;
  Shape.Paint;
  Triangle := TTriangle.Create;
  Triangle.Paint;
  ...

The output will be:

X: 0
Y: 0
Painting triangle at  X: 0 Y: 0

If we call the Move method (declared in TShape) on the Triangle variable, the Paint called inside Move will be the virtual Paint method, because when compiling TShape, the compiler does not have access to the statically bound method in TTriangle. It does not know or care about descendant classes. However, if we declare a new method, MoveTriangle, in the TTriangle class and call the Paint method there, then the new statically bound method will be called, because inside TTriangle's method implementations, the compiler has access to the statically bound Paint and will prefer that method over the one declared in the ancestor class.
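The difference between Move and MoveTriangle can be sketched like this. The body of MoveTriangle is an assumption (it simply mirrors Move), and the program compiles with the W1010 warning unless that warning has been promoted to an error.

```pascal
program HidingDemo;
{$APPTYPE CONSOLE}

type
  TShape = class
  protected
    FX, FY: Integer;
  public
    procedure Paint; virtual;
    procedure Move(AX, AY: Integer);
    property X: Integer read FX write FX;
    property Y: Integer read FY write FY;
  end;

  TTriangle = class(TShape)
  public
    procedure Paint; // no override: hides TShape.Paint (W1010)
    procedure MoveTriangle(AX, AY: Integer); // assumed body, mirrors Move
  end;

procedure TShape.Paint;
begin
  Writeln('X: ', X);
  Writeln('Y: ', Y);
end;

procedure TShape.Move(AX, AY: Integer);
begin
  FX := AX;
  FY := AY;
  Paint; // virtual call; TTriangle has no override, so TShape.Paint runs
end;

procedure TTriangle.Paint;
begin
  Writeln('Painting triangle at ', ' X: ', X, ' Y: ', Y);
end;

procedure TTriangle.MoveTriangle(AX, AY: Integer);
begin
  FX := AX;
  FY := AY;
  Paint; // static call; the compiler sees TTriangle.Paint here
end;

var
  Triangle: TTriangle;
begin
  Triangle := TTriangle.Create;
  try
    Triangle.Move(10, 20);         // paints through TShape.Paint
    Triangle.MoveTriangle(30, 40); // paints through TTriangle.Paint
  finally
    Triangle.Free;
  end;
end.
```

Both calls are made on the same variable and the same instance; only the place where the inner Paint call was compiled differs.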

Usually, not adding the override directive is an error in your code, as you don't want to break the virtual method chain. What the compiler is trying to tell you with the W1010 warning—and you should never ignore it—is that you probably forgot to add the override directive to the method.

However, there are some (rare) situations where you really want to break the virtual method chain and redeclare a method as statically bound. If you really want to do that, and you are aware of the consequences, you can add the reintroduce directive to the method declaration. That will tell the compiler to suppress the warning, but its behavior will remain the same—such a method will be statically bound, not dynamically bound.
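A minimal sketch of the reintroduce directive, using a trimmed-down version of the shape classes. The directive changes nothing about dispatching; it only documents the intent and silences W1010.

```pascal
program ReintroduceDemo;
{$APPTYPE CONSOLE}

type
  TShape = class
  public
    procedure Paint; virtual;
  end;

  TTriangle = class(TShape)
  public
    // Deliberately hides TShape.Paint; still statically bound.
    procedure Paint; reintroduce;
  end;

procedure TShape.Paint;
begin
  Writeln('Painting shape');
end;

procedure TTriangle.Paint;
begin
  Writeln('Painting triangle');
end;

var
  Shape: TShape;
  Triangle: TTriangle;
begin
  Triangle := TTriangle.Create;
  try
    Shape := Triangle; // same instance, seen through the base type
    Shape.Paint;       // goes through the VMT and reaches TShape.Paint
    Triangle.Paint;    // bound at compile time to TTriangle.Paint
  finally
    Triangle.Free;
  end;
end.
```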

Since you will either want to override a virtual method or reintroduce it, W1010 should be treated as an error and not just a warning.

You can do that by using the compiler directive

{$WARN HIDDEN_VIRTUAL ERROR}

or setting that particular warning to Error in

Project Options -> Hints and warnings -> Method hides virtual method of base type

Comments

  1. Very nice explanation but I think you should highlight two special types of methods - constructors and destructors:
    - Destructors are always virtual and making them non-virtual will most likely cause a memory or resource leak
    - Constructors are unfortunately not always virtual (which I believe is one of few big mistakes in Delphi) and that can cause access violations if we treat them as virtual

    A lot of developers from a C++ or Java background will make their object constructors non-virtual, which makes those objects difficult to use via class types, and when these developers work on the RTL this causes a lot of pain. An example of this is the TJSONAncestor class and its descendants, which initially had static constructors, so instead of creating different types of JSON values using TJSONAncestorClass.Create you had to resort to a bunch of if-then-else-if constructs, which would have to be kept updated if any new types were added.

    Replies
    1. I think that "some (rare) situations where you really want to break the virtual method" covers the fact that overriding should be used by default :)

      When it comes to destructors, obviously they need to be overridden because they are almost never called directly from a context where the compiler would have access to the new statically bound destructor. For instance, Free is declared in TObject, so Destroy called from within it will always be dynamically dispatched, meaning a non-overridden destructor will never be called.

      But when we are talking about destructors, they are not "so special". There are other methods that are part of the object lifecycle where overriding is an absolute must, most notably AfterConstruction and BeforeDestruction.

      Constructors are a different ball game (a more complex one), as they will usually be called directly on a class type, so the proper one will be used. However, for any code that uses metaclasses and virtual constructors, overriding is also a must (TComponent.Create(AOwner: TComponent)).

      Covering constructor behavior is complex enough to be covered in separate post - possibly along with other lifecycle methods.
