As Meyers noted in Item 24 of Effective C++, the inability to inline a virtual function is its biggest performance penalty.
Virtual functions seem to inflict a performance cost in several ways:
[1] The vptr must be initialized in the constructor.
[2] A virtual function is invoked via pointer indirection. We must fetch the pointer to the function table and then access the correct function offset.
[3] Inlining is a compile-time decision. The compiler cannot inline virtual functions whose resolution takes place at run-time.
The true cost of virtual functions then boils down to the third item only.
-------------------------------------------------------------------------------------------
Virtual function calls that can be resolved only at run-time will inhibit inlining. At times, that may pose a performance problem that we must solve. Dynamic binding of a function call is a consequence of inheritance. One way to eliminate dynamic binding is to replace inheritance with a template-based design. Templates are more performance-friendly in the sense that they push the resolution step from run-time to compile-time. Compile-time, as far as we are concerned, is free.
The design space for inheritance and templates has some overlap. We will discuss one such example.
Suppose you wanted to develop a thread-safe string class that may be manipulated safely by concurrent threads in a Win32 environment. In that environment you have a choice of multiple synchronization schemes such as critical sections, mutexes, and semaphores, just to name a few. You would like your thread-safe string to offer the flexibility to use any of those schemes, and at different times you may have a reason to prefer one scheme over another. Inheritance would be a reasonable choice to capture the commonality among synchronization mechanisms.
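To make the contrast concrete, here is a minimal sketch of the same loop written twice: once against an abstract base class and once as a template. The names Shape, Triangle, Square, totalSidesDynamic, and totalSidesStatic are made up for illustration and do not come from the text above.

```cpp
// Run-time binding: the callee is reached through the object's vptr.
class Shape {
public:
    virtual ~Shape() {}
    virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
    int sides() const override { return 3; }
};

// The compiler sees only a Shape&, so s.sides() is generally an indirect
// call that cannot be inlined (unless it can prove the dynamic type).
int totalSidesDynamic(const Shape& s, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += s.sides();
    return total;
}

// Compile-time binding: no base class, no virtual functions.
class Square {
public:
    int sides() const { return 4; }
};

// SHAPE is fixed when the template is instantiated, so s.sides() is an
// ordinary member call that the compiler is free to inline.
template <class SHAPE>
int totalSidesStatic(const SHAPE& s, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += s.sides();
    return total;
}
```

Both functions compute the same kind of result. The difference is only in when the call target is chosen, which is exactly what decides whether inlining is possible.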
The Locker abstract base class will declare the common interface:
class Locker
{
public:
    Locker() { }
    virtual ~Locker() { }
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

class CriticalSectionLock : public Locker
{

};
class MutexLock : public Locker
{

};
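The bodies of the derived lock classes are left out above. As a rough idea of what one could look like, the sketch below fills in a derived class on top of std::mutex from the C++11 standard library instead of the Win32 primitives, so that it compiles on any platform. StdMutexLock and guardedIncrement are hypothetical names that are not part of the original design.

```cpp
#include <mutex>

// Same interface as the Locker class shown above.
class Locker {
public:
    Locker() {}
    virtual ~Locker() {}
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

// Assumption: std::mutex stands in for a Win32 critical section or mutex.
class StdMutexLock : public Locker {
public:
    void lock() override { m_.lock(); }
    void unlock() override { m_.unlock(); }
private:
    std::mutex m_;
};

// Client code sees only the abstract interface, so both calls below
// are dispatched through the virtual table.
int guardedIncrement(Locker& locker, int& counter) {
    locker.lock();
    int result = ++counter;
    locker.unlock();
    return result;
}
```

A Win32 version would presumably have the same shape, with lock() and unlock() forwarding to the corresponding operating system calls.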
Because you prefer not to re-invent the wheel, you made the choice to derive the thread-safe string from the existing standard string. The remaining design choices are:
[1] Hard coding. You could derive three distinct classes from string: CriticalSectionString, MutexString, and SemaphoreString, each class implementing its implied synchronization mechanism.
[2] Inheritance. You could derive a single ThreadSafeString class that contains a pointer to a Locker object. Use polymorphism to select the particular synchronization mechanism at run-time.
[3] Templates. Create a template-based string class parameterized by the Locker type.
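For comparison, option [2] might look roughly like the sketch below. It is only an illustration under a few assumptions: the constructor signature and the pLock member are invented here, and CountingLock is a hypothetical lock that merely counts calls so the example runs without any operating system support.

```cpp
#include <string>

class Locker {
public:
    Locker() {}
    virtual ~Locker() {}
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

// Hypothetical lock used for illustration only: it records calls
// instead of synchronizing anything.
class CountingLock : public Locker {
public:
    CountingLock() : locks(0), unlocks(0) {}
    void lock() override { ++locks; }
    void unlock() override { ++unlocks; }
    int locks;
    int unlocks;
};

class ThreadSafeString : public std::string {
public:
    // Assumption: the caller supplies the lock and keeps it alive.
    ThreadSafeString(const char* s, Locker* lockPtr)
        : std::string(s), pLock(lockPtr) {}
    int length();
private:
    Locker* pLock;  // not owned; concrete type known only at run-time
};

inline int ThreadSafeString::length() {
    pLock->lock();    // virtual call
    int len = static_cast<int>(std::string::length());
    pLock->unlock();  // virtual call
    return len;
}
```

The flexibility is real, since the lock can be chosen while the program runs, but every length() call pays for two virtual calls that the compiler generally cannot inline.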
////////////////////////////////////////////////////////////////////////////////////////////
Only the template implementation is discussed here.
The template-based design combines the best of both worlds: reuse and efficiency. The ThreadSafeString is implemented as a template parameterized by the Locker template argument:
template <class LOCKER>
class ThreadSafeString : public string
{
public:
    ThreadSafeString(const char* s)
    : string(s) { }

    int length();
private:
    LOCKER lock;
};
The length method is implemented much as it would be in the other two designs:
template <class LOCKER>
inline
int ThreadSafeString<LOCKER>::length()
{
    lock.lock();
    int len = string::length();
    lock.unlock();

    return len;
}
If you want critical section protection, you will instantiate the template with a CriticalSectionLock:
{
    ThreadSafeString<CriticalSectionLock> csString = "Hello";
    ...
}
or you may go with a mutex:
{
    ThreadSafeString<MutexLock> mtxString = "Hello";
    ...
}
This design also provides relief from the virtual function calls to lock() and unlock(). The declaration of a ThreadSafeString selects a particular type of synchronization at template instantiation time. Because the lock member's exact type is then known, the compiler can, just as with hard coding, resolve the virtual calls statically and inline them.
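A portable version of the template design can be sketched as follows. The lock types here are assumptions made for the sake of a runnable example: StdMutexLock wraps std::mutex rather than a Win32 primitive, and NullLock is a hypothetical do-nothing lock for single-threaded use. Neither derives from Locker, which is a simplification; a lock class derived from Locker would also work as the template argument.

```cpp
#include <mutex>
#include <string>

// Assumption: std::mutex stands in for the Win32 synchronization objects.
// Nothing here is virtual.
class StdMutexLock {
public:
    void lock() { m_.lock(); }
    void unlock() { m_.unlock(); }
private:
    std::mutex m_;
};

// Hypothetical no-op lock: once inlined, its calls compile away entirely.
class NullLock {
public:
    void lock() {}
    void unlock() {}
};

template <class LOCKER>
class ThreadSafeString : public std::string {
public:
    ThreadSafeString(const char* s) : std::string(s) {}
    int length();
private:
    LOCKER lock;  // exact type fixed at instantiation time
};

template <class LOCKER>
inline int ThreadSafeString<LOCKER>::length() {
    lock.lock();    // statically bound, candidate for inlining
    int len = static_cast<int>(std::string::length());
    lock.unlock();  // statically bound, candidate for inlining
    return len;
}
```

One detail to watch: a class holding a std::mutex is neither copyable nor movable, so before C++17 an object of this sketch's type has to be direct-initialized, as in ThreadSafeString<StdMutexLock> s("Hello");.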
As you can see, templates can make a positive performance contribution by pushing computations out of execution-time and into compile-time, enabling inlining in the process.