[20] Inheritance virtual functions, C++ FAQ Lite

[20.1] What is a "virtual member function"?

From an OO perspective, it is the single most important feature of C++: [6.8], [6.9].

A virtual function allows derived classes to replace the implementation provided by the base class. The compiler makes sure the replacement is always called whenever the object in question is actually of the derived class, even if the object is accessed by a base pointer rather than a derived pointer. This allows algorithms in the base class to be replaced in the derived class, even if users don't know about the derived class.

The derived class can either fully replace ("override") the base class member function, or the derived class can partially replace ("augment") the base class member function. The latter is accomplished by having the derived class member function call the base class member function, if desired.
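The two styles can be sketched as follows. This is a minimal illustration, not code from the FAQ: the class names Base, Replacer, and Augmenter and the describe() function are hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() { }
    virtual std::string describe() const { return "base"; }
};

// Fully replaces ("overrides") the base class implementation.
class Replacer : public Base {
public:
    virtual std::string describe() const { return "replacer"; }
};

// Partially replaces ("augments") it: calls the base class version,
// then adds its own behavior.
class Augmenter : public Base {
public:
    virtual std::string describe() const {
        return Base::describe() + "+augmenter";
    }
};

// This caller knows only about Base, yet the derived code runs.
std::string describeViaBase(const Base& b) { return b.describe(); }

// Usage: every call goes through a Base reference.
void overrideVsAugmentExample() {
    Replacer r;
    Augmenter a;
    assert(describeViaBase(r) == "replacer");
    assert(describeViaBase(a) == "base+augmenter");
}
```

Note that describeViaBase() never mentions either derived class, which is the point made above about users not needing to know the derived class exists.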

[20.2] How can C++ achieve dynamic binding yet also static typing?

When you have a pointer to an object, the object may actually be of a class that is derived from the class of the pointer (e.g., a Vehicle* that is actually pointing to a Car object; this is called "polymorphism"). Thus there are two types: the (static) type of the pointer (Vehicle, in this case), and the (dynamic) type of the pointed-to object (Car, in this case).

Static typing means that the legality of a member function invocation is checked at the earliest possible moment: by the compiler at compile time. The compiler uses the static type of the pointer to determine whether the member function invocation is legal. If the type of the pointer can handle the member function, certainly the pointed-to object can handle it as well. E.g., if Vehicle has a certain member function, certainly Car also has that member function since Car is a kind-of Vehicle.

Dynamic binding means that the address of the code in a member function invocation is determined at the last possible moment: based on the dynamic type of the object at run time. It is called "dynamic binding" because the binding to the code that actually gets called is accomplished dynamically (at run time). Dynamic binding is a result of virtual functions.
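A small sketch may make the two ideas concrete. It reuses the Vehicle and Car names from the text; the member functions name() and openTrunk() are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

class Vehicle {
public:
    virtual ~Vehicle() { }
    virtual std::string name() const { return "Vehicle"; }
};

class Car : public Vehicle {
public:
    virtual std::string name() const { return "Car"; }
    void openTrunk() const { }  // hypothetical: exists only in Car
};

std::string nameOf(const Vehicle* p) {
    // p->openTrunk();  // rejected at compile time: Vehicle has no openTrunk()

    // Static typing: legal because Vehicle (the static type) declares name().
    // Dynamic binding: which name() runs depends on the pointed-to object.
    return p->name();
}

// Usage: static type Vehicle*, dynamic type Car.
void staticTypingExample() {
    Car car;
    Vehicle* p = &car;
    assert(nameOf(p) == "Car");
}
```

The commented-out call is the static-typing half of the story: the compiler rejects it using only the type of the pointer, even if the object happens to be a Car at run time.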

[20.3] What's the difference between how virtual and non-virtual member functions are called?

Non-virtual member functions are resolved statically. That is, the member function is selected statically (at compile-time) based on the type of the pointer (or reference) to the object.
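The sketch below shows static resolution side by side with the virtual case that the next paragraph describes. The class names and the two member functions are hypothetical, picked only to make the contrast visible.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() { }
    std::string nonVirtualName() const { return "Base"; }
    virtual std::string virtualName() const { return "Base"; }
};

class Derived : public Base {
public:
    // Hides Base::nonVirtualName(); it does not override it.
    std::string nonVirtualName() const { return "Derived"; }
    virtual std::string virtualName() const { return "Derived"; }
};

// Usage: the same object, reached through a Base pointer.
void resolutionExample() {
    Derived d;
    Base* p = &d;
    assert(p->nonVirtualName() == "Base");   // chosen from the pointer type
    assert(p->virtualName() == "Derived");   // chosen from the object type
}
```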

In contrast, virtual member functions are resolved dynamically (at run-time). That is, the member function is selected dynamically (at run-time) based on the type of the object, not the type of the pointer/reference to that object. This is called "dynamic binding."

Most compilers use some variant of the following technique: if the object has one or more virtual functions, the compiler puts a hidden pointer in the object called a "virtual-pointer" or "v-pointer." This v-pointer points to a global table called the "virtual-table" or "v-table."

The compiler creates a v-table for each class that has at least one virtual function. For example, if class Circle has virtual functions for draw() and move() and resize(), there would be exactly one v-table associated with class Circle, even if there were a gazillion Circle objects, and the v-pointer of each of those Circle objects would point to the Circle v-table. The v-table itself has pointers to each of the virtual functions in the class. For example, the Circle v-table would have three pointers: a pointer to Circle::draw(), a pointer to Circle::move(), and a pointer to Circle::resize().

During a dispatch of a virtual function, the run-time system follows the object's v-pointer to the class's v-table, then follows the appropriate slot in the v-table to the method code.
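To make the mechanism tangible, here is a hand-written imitation of it using plain structs and function pointers. This is only a sketch of the general idea under simplifying assumptions; it is not what any particular compiler generates, and every name in it (VTable, vptr, makeCircle, and so on) is hypothetical.

```cpp
#include <cassert>
#include <string>

struct Shape;  // forward declaration so the table can mention it

// One table per "class": a slot for each virtual function.
struct VTable {
    const char* (*name)(const Shape*);
    int (*sides)(const Shape*);
};

struct Shape {
    const VTable* vptr;  // the normally hidden v-pointer, made explicit
};

const char* circleName(const Shape*) { return "Circle"; }
int circleSides(const Shape*) { return 0; }
const char* squareName(const Shape*) { return "Square"; }
int squareSides(const Shape*) { return 4; }

// Exactly one table per class, shared by all objects of that class.
const VTable circleVTable = { circleName, circleSides };
const VTable squareVTable = { squareName, squareSides };

Shape makeCircle() { Shape s = { &circleVTable }; return s; }
Shape makeSquare() { Shape s = { &squareVTable }; return s; }

// Dispatch: fetch the v-pointer, fetch the slot, call through it.
int sidesOf(const Shape* s) { return s->vptr->sides(s); }
std::string nameOf(const Shape* s) { return s->vptr->name(s); }

// Usage: two objects of the same "class" share one table.
void vtableExample() {
    Shape a = makeCircle();
    Shape b = makeCircle();
    assert(a.vptr == b.vptr);
    assert(nameOf(&a) == "Circle");
}
```

The two pointer dereferences in sidesOf() correspond to the two steps of the dispatch described above.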

The space-cost overhead of the above technique is nominal: an extra pointer per object (but only for objects that will need to do dynamic binding), plus an extra pointer per method (but only for virtual methods). The time-cost overhead is also fairly nominal: compared to a normal function call, a virtual function call requires two extra fetches (one to get the value of the v-pointer, a second to get the address of the method). None of this runtime activity happens with non-virtual functions, since the compiler resolves non-virtual functions exclusively at compile-time based on the type of the pointer.
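The per-object space cost can be observed with sizeof. The sketch below assumes the common v-pointer implementation described above; the C++ standard does not mandate it, so the exact sizes, and even the relationships checked here, are implementation-specific. The struct names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Plain       { int x; };
struct OneVirtual  { int x; virtual ~OneVirtual() { } };
struct TwoVirtuals { int x; virtual ~TwoVirtuals() { } virtual void f() { } };

// Extra bytes per object once the class has a virtual function
// (the v-pointer plus any padding the compiler adds).
std::size_t perObjectOverhead() {
    return sizeof(OneVirtual) - sizeof(Plain);
}

// A second virtual function adds a v-table slot, not per-object space.
bool secondVirtualCostsNothingPerObject() {
    return sizeof(TwoVirtuals) == sizeof(OneVirtual);
}

// Usage: holds on typical v-pointer implementations.
void spaceCostExample() {
    assert(perObjectOverhead() > 0);
    assert(secondVirtualCostsNothingPerObject());
}
```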

Note: the above discussion is simplified considerably, since it doesn't account for extra structural things like multiple inheritance, virtual inheritance, RTTI, etc., nor does it account for space/speed issues such as page faults, calling a function via a pointer-to-function, etc. If you want to know about those other things, please ask comp.lang.c++; PLEASE DO NOT SEND E-MAIL TO ME!

[20.4] I have a heterogenous list of objects, and my code needs to do class-specific things to the objects. Seems like this ought to use dynamic binding but can't figure it out. What should I do?

Suppose there is a base class Vehicle with derived classes Car and Truck. The code traverses a list of Vehicle objects and does different things depending on the type of Vehicle. For example, it might weigh the Truck objects (to make sure they're not carrying too heavy a load) but do something different with a Car object, such as checking the registration.

The initial solution for this, at least with most people, is to use an if statement. E.g., "if the object is a Truck, do this, else if it is a Car, do that, else do a third thing":

    typedef std::vector<Vehicle*> VehicleList;

    void myCode(VehicleList& v)
    {
      for (VehicleList::iterator p = v.begin(); p != v.end(); ++p) {
        Vehicle& v = **p;  // just for shorthand

        // generic code that works for any vehicle...
        ...

        // perform the "foo-bar" operation.
        // note: the details of the "foo-bar" operation depend
        // on whether we're working with a car or a truck.
        if (v is a Car) {
          // car-specific code that does "foo-bar" on car v
          ...
        } else if (v is a Truck) {
          // truck-specific code that does "foo-bar" on truck v
          ...
        } else {
          // semi-generic code that does "foo-bar" on something else
          ...
        }

        // generic code that works for any vehicle...
        ...
      }
    }

The problem with this is what I call "else-if-heimer's disease": eventually you'll forget to add an else if when you add a new derived class, and you'll probably have a bug that won't be detected until run-time, or worse, when the product is in the field.

The solution is to use dynamic binding rather than dynamic typing. Instead of having (what I call) the live-code dead-data metaphor (where the code is alive and the car/truck objects are relatively dead), we move the code into the data. This is a slight variation of Bertrand Meyer's Inversion Principle.

It's surprisingly easy. You just give a name to the code within the {...} blocks of each if (in this case it's the "foo-bar" operation), and you add that name as a virtual member function in the base class, Vehicle.

    class Vehicle {
    public:
      // performs the "foo-bar" operation
      virtual void fooBar() = 0;
    };

Then you remove the whole if...else if... block, and replace it with a simple call to this virtual function:

    typedef std::vector<Vehicle*> VehicleList;

    void myCode(VehicleList& v)
    {
      for (VehicleList::iterator p = v.begin(); p != v.end(); ++p) {
        Vehicle& v = **p;  // just for shorthand

        // generic code that works for any vehicle...
        ...

        // perform the "foo-bar" operation.
        v.fooBar();

        // generic code that works for any vehicle...
        ...
      }
    }

Finally you simply move the code that used to be in the {...} block of each if into the fooBar() member function of the appropriate derived class:

    class Car : public Vehicle {
    public:
      virtual void fooBar();
    };

    void Car::fooBar()
    {
      // car-specific code that does "foo-bar" on 'this'
      ...  // this code was in {...} of if (v is a Car)
    }

    class Truck : public Vehicle {
    public:
      virtual void fooBar();
    };

    void Truck::fooBar()
    {
      // truck-specific code that does "foo-bar" on 'this'
      ...  // this code was in {...} of if (v is a Truck)
    }

If you actually have an else block in the original myCode() function (see above for the "semi-generic code that does the 'foo-bar' operation on something other than a Car or Truck"), change Vehicle's fooBar() from pure virtual to plain virtual and move the code into that member function:

    class Vehicle {
    public:
      // performs the "foo-bar" operation
      virtual void fooBar();
    };

    void Vehicle::fooBar()
    {
      // semi-generic code that does "foo-bar" on something else
      ...  // this code was in {...} of the else case
    }

In any case, the point is that we try to avoid decision logic with decisions based on the kind-of derived class you're dealing with. I.e., you're trying to avoid if the object is a car do xyz, else if it's a truck do pqr, etc.
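Putting the pieces together, a compilable version of the refactored design might look like the sketch below. It makes a few assumptions that are not in the FAQ: fooBar() returns a string so the result can be inspected, the returned strings are placeholders, the list parameter is renamed to list, and Bicycle is a hypothetical class added to exercise the former else case.

```cpp
#include <cassert>
#include <string>
#include <vector>

class Vehicle {
public:
    virtual ~Vehicle() { }
    // Plain virtual: this body plays the role of the old "else" block.
    virtual std::string fooBar() { return "semi-generic"; }
};

class Car : public Vehicle {
public:
    virtual std::string fooBar() { return "check registration"; }
};

class Truck : public Vehicle {
public:
    virtual std::string fooBar() { return "weigh load"; }
};

// Hypothetical class with no override: it falls back to Vehicle::fooBar().
class Bicycle : public Vehicle { };

typedef std::vector<Vehicle*> VehicleList;

// No if/else-if chain: each object selects its own code.
std::vector<std::string> myCode(VehicleList& list) {
    std::vector<std::string> results;
    for (VehicleList::iterator p = list.begin(); p != list.end(); ++p) {
        Vehicle& v = **p;  // just for shorthand
        results.push_back(v.fooBar());
    }
    return results;
}

// Usage: a one-element list is enough to see the dispatch.
void fooBarExample() {
    Truck truck;
    VehicleList list;
    list.push_back(&truck);
    assert(myCode(list)[0] == "weigh load");
}
```

Adding another derived class requires no change to myCode(), which is how this design avoids the forgotten else if.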

[20.5] When should my destructor be virtual?

virtual functions bind to the code associated with the class of the object, rather than with the class of the pointer/reference. When you say delete basePtr, and the base class has a virtual destructor, the destructor that gets invoked is the one associated with the type of the object *basePtr, rather than the one associated with the type of the pointer. This is generally A Good Thing.

TECHNO-GEEK WARNING; PUT YOUR PROPELLER HAT ON.
Technically speaking, you need a base class's destructor to be virtual if and only if you intend to allow someone to invoke an object's destructor via a base class pointer (this is normally done implicitly via delete), and the object being destructed is of a derived class that has a non-trivial destructor. A class has a non-trivial destructor if it either has an explicitly defined destructor, or if it has a member object or a base class that has a non-trivial destructor (note that this is a recursive definition (e.g., a class has a non-trivial destructor if it has a member object (which has a base class (which has a member object (which has a base class (which has an explicitly defined destructor)))))).
END TECHNO-GEEK WARNING; REMOVE YOUR PROPELLER HAT

If you had a hard time grokking the previous rule, try this (over)simplified one on for size: A class should have a virtual destructor unless that class has no virtual functions. Rationale: if you have any virtual functions at all, you're probably going to be doing "stuff" to derived objects via a base pointer, and some of the "stuff" you may do may include invoking a destructor (normally done implicitly via delete). Plus once you've put the first virtual function into a class, you've already paid all the per-object space cost that you'll ever pay (one pointer per object; note that this is theoretically compiler-specific; in practice everyone does it pretty much the same way), so making the destructor virtual won't generally cost you anything extra.
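The effect of a virtual destructor can be checked with a short sketch. The classes and the global counter below are hypothetical and exist only to make the destructor call observable.

```cpp
#include <cassert>

int derivedDestructorCalls = 0;  // hypothetical counter for this demo

class Base {
public:
    virtual ~Base() { }  // virtual, so delete via Base* reaches ~Derived()
};

class Derived : public Base {
public:
    virtual ~Derived() { ++derivedDestructorCalls; }
};

// Knows only the static type Base*.
void destroy(Base* p) { delete p; }

// Usage: the Derived destructor runs even though destroy() sees a Base*.
void virtualDestructorExample() {
    int before = derivedDestructorCalls;
    destroy(new Derived);
    assert(derivedDestructorCalls == before + 1);
}
```

If ~Base() were not virtual, deleting a Derived object through a Base* would be undefined behavior, so the non-virtual variant is deliberately not shown as runnable code.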

[20.6] What is a "virtual constructor"?

You can get the effect of a virtual constructor by a virtual clone() member function (for copy constructing), or a virtual create() member function (for the default constructor).

    class Shape {
    public:
      virtual ~Shape() { }                 // A virtual destructor
      virtual void draw() = 0;             // A pure virtual function
      virtual void move() = 0;
      // ...
      virtual Shape* clone()  const = 0;   // Uses the copy constructor
      virtual Shape* create() const = 0;   // Uses the default constructor
    };

    class Circle : public Shape {
    public:
      Circle* clone()  const;   // Covariant Return Types; see below
      Circle* create() const;   // Covariant Return Types; see below
      // ...
    };

    Circle* Circle::clone()  const { return new Circle(*this); }
    Circle* Circle::create() const { return new Circle();      }

In the clone() member function, the new Circle(*this) code calls Circle's copy constructor to copy the state of this into the newly created Circle object. In the create() member function, the new Circle() code calls Circle's default constructor.

    void userCode(Shape& s)
    {
      Shape* s2 = s.clone();
      Shape* s3 = s.create();
      // ...
      delete s2;    // You probably need a virtual destructor here
      delete s3;
    }

This function will work correctly regardless of whether the Shape is a Circle, Square, or some other kind-of Shape that doesn't even exist yet.
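A self-contained variant of the idiom is sketched below so that the copied state can be inspected. It departs from the FAQ's Shape in ways that are assumptions of this example: draw() and move() are omitted, and name(), radius(), and the default radius of 1 are hypothetical additions.

```cpp
#include <cassert>
#include <string>

class Shape {
public:
    virtual ~Shape() { }
    virtual Shape* clone() const = 0;   // uses the copy constructor
    virtual Shape* create() const = 0;  // uses the default constructor
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    explicit Circle(int radius = 1) : radius_(radius) { }
    Circle* clone() const { return new Circle(*this); }
    Circle* create() const { return new Circle(); }
    std::string name() const { return "Circle"; }
    int radius() const { return radius_; }
private:
    int radius_;
};

// clone() copies the state; the Circle* return type needs no cast.
int radiusOfClone(const Circle& c) {
    Circle* copy = c.clone();
    int r = copy->radius();
    delete copy;
    return r;
}

// create() works through a Shape reference and keeps the dynamic type.
std::string nameOfFresh(const Shape& s) {
    Shape* fresh = s.create();
    std::string n = fresh->name();
    delete fresh;  // relies on Shape's virtual destructor
    return n;
}

// Usage: the clone carries the original's radius.
void virtualConstructorExample() {
    Circle c(5);
    assert(radiusOfClone(c) == 5);
    assert(nameOfFresh(c) == "Circle");
}
```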

Note: The return type of Circle's clone() member function is intentionally different from the return type of Shape's clone() member function. This is called Covariant Return Types, a feature that was not originally part of the language. If your compiler complains at the declaration of Circle* clone() const within class Circle (e.g., saying "The return type is different" or "The member function's type differs from the base class virtual function by return type alone"), you have an old compiler and you'll have to change the return type to Shape*.

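Amazingly, Microsoft Visual C++ is one of those compilers that does not, as of version 6.0, handle Covariant Return Types. This means that with that compiler the overrides of clone() and create() in Circle must be declared to return Shape*, as described above.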

[20] Inheritance — `virtual` functions
(Part of C++ FAQ Lite, Copyright © 1991-2001, Marshall Cline, cline@parashift.com)

FAQs in section [20]:

[20.1] What is a "`virtual` member function"?

[20.2] How can C++ achieve dynamic binding yet also static typing?

[20.3] What's the difference between how `virtual` and non-`virtual` member functions are called?

[20.4] I have a heterogenous list of objects, and my code needs to do class-specific things to the objects. Seems like this ought to use dynamic binding but can't figure it out. What should I do?

[20.5] When should my destructor be `virtual`?

[20.6] What is a "`virtual` constructor"?
