The virtual table

Virtual table

To implement virtual functions, C++ compilers typically use a late-binding mechanism known as the virtual table.

The virtual table is a lookup table of functions used to resolve function calls in a dynamic/late binding manner.

The virtual table sometimes goes by other names, such as "vtable", "virtual function table", "virtual method table" or "dispatch table".

First

Every class that uses virtual functions (or is derived from a class that uses virtual functions) is given its own virtual table.

This table is simply a static array that the compiler sets up at compile time.

A virtual table contains one entry for each virtual function that can be called by objects of the class.

Each entry in this table is simply a function pointer that points to the most-derived function accessible by that class.

Second

The compiler also adds a hidden pointer to the base class, which we will call *__vptr.

*__vptr is set (automatically) when a class instance is created so that it points to the virtual table for that class. Unlike the *this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer stored in the object.

Consequently, it makes each allocated class object bigger by the size of one pointer. It also means that *__vptr is inherited by derived classes, which is important.
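
The size cost can be observed with sizeof. The sketch below is only an illustration: Plain, WithVirtual, and DerivedFromVirtual are hypothetical types that are not part of the example in these slides, and the exact sizes (including padding) depend on the compiler and platform.

```cpp
#include <cstddef>

// Hypothetical types, used only to compare object sizes.
struct Plain {
    int value = 0;
    void f() {}             // non-virtual: no hidden pointer is needed
};

struct WithVirtual {
    int value = 0;
    virtual void f() {}     // virtual: objects typically carry a hidden vptr
};

struct DerivedFromVirtual : WithVirtual {
    void f() override {}    // overrides f() but adds no data members
};

// Extra bytes per object caused by making f() virtual.
// Implementation-defined; commonly sizeof(void*) plus any padding.
constexpr std::size_t vptr_overhead() {
    return sizeof(WithVirtual) - sizeof(Plain);
}

// True when the derived class is no larger than its base, which is what
// we expect if it reuses the hidden pointer inherited from the base.
constexpr bool derived_reuses_vptr() {
    return sizeof(DerivedFromVirtual) == sizeof(WithVirtual);
}
```

On common implementations, vptr_overhead() is at least the size of one pointer and derived_reuses_vptr() is true, which matches the claim that the hidden pointer is added once and then inherited.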

By now, you're probably confused as to how these things all fit together, so let's take a look at a simple example:

class Base{
public:
	virtual void f1(){}
	virtual void f2(){}
};
class D1 : public Base{
public:
	virtual void f1(){}	
};
class D2 : public Base{
public:
	virtual void f2(){}	
};
			
Because there are 3 classes here, the compiler will set up 3 virtual tables: one for Base, one for D1, and one for D2.

The compiler also adds a hidden pointer to the most base class that uses virtual functions. Although the compiler does this automatically, we'll put it in the next example just to show where it's added:

class Base{
	FunctionPointer* __vptr; // illustration only: the compiler adds this, you never declare it yourself
public:
	virtual void f1(){}
	virtual void f2(){}
};
class D1 : public Base{
public:
	virtual void f1(){}	
};
class D2 : public Base{
public:
	virtual void f2(){}	
};
			
When a class object is created, *__vptr is set to point to the virtual table for that class.

For example, when an object of type Base is created, *__vptr is set to point to the virtual table for Base.

When objects of type D1 or D2 are constructed, *__vptr is set to point to the virtual table for D1 or D2 respectively.

Now, let's talk about how these virtual tables are filled out. Because there are only two virtual functions here, each virtual table will have two entries (one for f1(), and one for f2()). Remember that when these virtual tables are filled out, each entry is filled out with the most-derived function an object of that class type can call.

The virtual table for Base objects is simple. An object of type Base can only access members of Base. Base has no access to D1 or D2 functions. Consequently, the entry for f1() points to Base::f1(), and the entry for f2() points to Base::f2().

The virtual table for D1 is slightly more complex. An object of type D1 can access members of both D1 and Base. However, D1 has overridden f1(), making D1::f1() more derived than Base::f1(). Consequently, the entry for f1() points to D1::f1(). D1 hasn't overridden f2(), so the entry for f2() will point to Base::f2().

The virtual table for D2 is similar to D1, except the entry for f1() points to Base::f1(), and the entry for f2() points to D2::f2().
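
The three tables described above can be written out by hand as static arrays of function pointers. This is only a sketch of the idea, not what a compiler actually emits: the free functions Base_f1, Base_f2, D1_f1, and D2_f2, the FunctionPointer alias, and the array names are all hypothetical stand-ins, and each function returns its own name so the table contents can be inspected.

```cpp
#include <string>

// Stand-ins for the member functions. Each returns its own name so that
// we can see which function a table entry refers to.
std::string Base_f1() { return "Base::f1"; }
std::string Base_f2() { return "Base::f2"; }
std::string D1_f1()   { return "D1::f1"; }
std::string D2_f2()   { return "D2::f2"; }

using FunctionPointer = std::string (*)();

// One static table per class. Slot 0 is f1(), slot 1 is f2().
const FunctionPointer base_vtable[2] = { Base_f1, Base_f2 };
const FunctionPointer d1_vtable[2]   = { D1_f1,   Base_f2 };  // D1 overrides f1() only
const FunctionPointer d2_vtable[2]   = { Base_f1, D2_f2   };  // D2 overrides f2() only
```

Calling d1_vtable[0]() yields "D1::f1" while d1_vtable[1]() yields "Base::f2": every table has the same two slots, and only the overridden slot differs from the Base table.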

Although this diagram is kind of crazy looking, it's really simple: the *__vptr in each class points to the virtual table for that class. The entries in the virtual table point to the most-derived version of each function that objects of that class are allowed to call.

So consider what happens when we create an object of type D1:

int main(){
	D1 d1;
}
			

Because d1 is a D1 object, d1 has its *__vptr set to the D1 virtual table.

Now, let's set a Base pointer to d1:

int main(){
	D1 d1;
	Base *p = &d1;
}
			
Note that because p is a Base pointer, it only points to the Base portion of d1. However, also note that *__vptr is in the Base portion of the class, so p has access to this pointer. Finally, note that p->__vptr points to the D1 virtual table! Consequently, even though p is of type Base*, it still has access to D1's virtual table (through __vptr).

So what happens when we try to call p->f1()?

int main(){
	D1 d1;
	Base *p = &d1;
	p->f1();
}
			
First

The program recognizes that f1() is a virtual function.

Second

The program uses p->__vptr to get to D1's virtual table, then looks up the entry for f1() in that table, which points to D1::f1(). Therefore, p->f1() resolves to D1::f1()!

Now, you might be saying, "But what if p really pointed to a Base object instead of a D1 object? Would it still call D1::f1()?" The answer is no.

int main(){
	Base b;
	Base* p = &b;
	p->f1();
}
			

In this case, when b is created, *__vptr points to Base's virtual table, not D1's virtual table. Consequently, p->__vptr will also be pointing to Base's virtual table. Base's virtual table entry for f1() points to Base::f1(). Thus, p->f1() resolves to Base::f1(), which is the most-derived version of f1() that a Base object should be able to call.
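
Both cases can be checked with ordinary virtual functions. The sketch below reuses the Base, D1, and D2 classes from the slides with two assumptions made purely for illustration: the functions return their own names instead of having empty bodies, and a hypothetical helper call_both() is added to make the calls through a Base pointer. A virtual destructor is also added, as is usual for polymorphic base classes.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string f1() const { return "Base::f1"; }
    virtual std::string f2() const { return "Base::f2"; }
};

class D1 : public Base {
public:
    std::string f1() const override { return "D1::f1"; }  // overrides f1() only
};

class D2 : public Base {
public:
    std::string f2() const override { return "D2::f2"; }  // overrides f2() only
};

// Calls both virtual functions through a Base pointer and joins the results.
// Which functions run depends on the object's dynamic type, not on the
// static type of p.
std::string call_both(const Base* p) {
    return p->f1() + " " + p->f2();
}
```

Passing the address of a D1 object gives "D1::f1 Base::f2", a D2 object gives "Base::f1 D2::f2", and a plain Base object gives "Base::f1 Base::f2", even though call_both() only ever sees a Base pointer.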

By using these tables, the compiler and program are able to ensure function calls resolve to the appropriate virtual function, even if you're only using a pointer or reference to a base class!

Calling a virtual function is slower than calling a non-virtual function for a couple of reasons:

First, we have to use the *__vptr to get to the appropriate virtual table.

Second, we have to index the virtual table to find the correct function to call. Only then can we call the function. As a result, we have to do 3 operations to find the function to call, as opposed to 2 operations for a normal indirect function call, or one operation for a direct function call. However, with modern computers, this added time is usually fairly insignificant.

Also, as a reminder, any class that uses virtual functions has a *__vptr, and thus each object of that class will be bigger by one pointer. Virtual functions are powerful, but they do have a performance cost.
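
The three kinds of call can be spelled out by hand to make the operation counts concrete. This is a conceptual sketch only: answer(), Object, table, and the three calling functions are hypothetical names, the vptr is an ordinary visible member rather than a compiler-generated one, and an optimizing compiler may inline or devirtualize any of these calls.

```cpp
// Hypothetical function used as the call target in all three cases.
int answer() { return 42; }

using FunctionPointer = int (*)();

// A hand-written stand-in for an object with a hidden vptr.
struct Object {
    const FunctionPointer* vptr;   // points at the class's table
};

const FunctionPointer table[1] = { answer };  // slot 0 holds answer()

// 1 operation: the target is known at compile time.
int direct_call() { return answer(); }

// 2 operations: read the function pointer, then call through it.
int indirect_call(FunctionPointer fp) { return fp(); }

// 3 operations: follow the vptr, index the table, then call.
int virtual_style_call(const Object& obj) {
    const FunctionPointer* vt = obj.vptr;  // step 1: follow the hidden pointer
    FunctionPointer fp = vt[0];            // step 2: index the table
    return fp();                           // step 3: call the function
}
```

All three return the same value. The only difference is how many lookups happen before the call, which is the extra cost the text refers to.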