In school, we learned about virtual functions in C++ and how they are resolved (or dispatched, or looked up; I'm not sure of the exact terminology, since we don't study in English) at run time rather than at compile time. The teacher also told us that compile-time resolution is much faster than run-time resolution (and it makes sense that it would be). However, a quick experiment suggests otherwise. I wrote this small program:

#include <iostream>
#include <limits.h>

using namespace std;

class A {
    public:
    void f() {
        // do nothing
    }
};

class B: public A {
    public:
    void f() {
        // do nothing
    }
};

int main() {
    unsigned int i;
    A *a = new B;
    for (i=0; i < UINT_MAX; i++) a->f();
    return 0;
}

I compiled the program above and called it normal. Then I modified A to look like this:

class A {
    public:
    virtual void f() {
        // do nothing
    }
};

I compiled that and called it virtual. Here are my results:

[felix@the-machine C]$ time ./normal 

real    0m25.834s
user    0m25.742s
sys 0m0.000s
[felix@the-machine C]$ time ./virtual 

real    0m24.630s
user    0m24.472s
sys 0m0.003s
[felix@the-machine C]$ time ./normal 

real    0m25.860s
user    0m25.735s
sys 0m0.007s
[felix@the-machine C]$ time ./virtual 

real    0m24.514s
user    0m24.475s
sys 0m0.000s
[felix@the-machine C]$ time ./normal 

real    0m26.022s
user    0m25.795s
sys 0m0.013s
[felix@the-machine C]$ time ./virtual 

real    0m24.503s
user    0m24.468s
sys 0m0.000s

There appears to be a consistent ~1 second difference in favor of the virtual version. Why?


Relevant or not: dual-core Pentium @ 2.80 GHz, no other programs running between the tests. Arch Linux with GCC 4.5. Compiling normally, like:

$ g++ test.cpp -o normal

Also, -Wall doesn't produce any warnings, either.


Edit: I've split my program into A.cpp, B.cpp and main.cpp, and I made f() (both A::f() and B::f()) actually do something (x = 0 - x, where x is a public int member of A, initialized to 1 in A::A()).
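
The layout is roughly this (a sketch; the shared header name A.h is just my choice here, and the virtual versions differ only in the virtual keyword on A::f):

// A.h
class A {
    public:
    int x;
    A();
    void f();   // declared 'virtual' in the virtual versions
};

class B: public A {
    public:
    void f();
};

// A.cpp
#include "A.h"

A::A() { x = 1; }
void A::f() { x = 0 - x; }

// B.cpp
#include "A.h"

void B::f() { x = 0 - x; }

// main.cpp
#include <limits.h>
#include "A.h"

int main() {
    unsigned int i;
    A *a = new B;
    for (i = 0; i < UINT_MAX; i++) a->f();
    return 0;
}

I compiled this into six versions; here are my results: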

[felix@the-machine poo]$ time ./normal-unoptimized 

real    0m31.172s
user    0m30.621s
sys 0m0.033s
[felix@the-machine poo]$ time ./normal-O2

real    0m2.417s
user    0m2.363s
sys 0m0.007s
[felix@the-machine poo]$ time ./normal-O3

real    0m2.495s
user    0m2.447s
sys 0m0.000s
[felix@the-machine poo]$ time ./virtual-unoptimized 

real    0m32.386s
user    0m32.111s
sys 0m0.010s
[felix@the-machine poo]$ time ./virtual-O2

real    0m26.875s
user    0m26.668s
sys 0m0.003s
[felix@the-machine poo]$ time ./virtual-O3

real    0m26.905s
user    0m26.645s
sys 0m0.017s

Unoptimized is still ~1 second faster with virtual, which I find a little peculiar. But this was a nice experiment, and I want to thank everyone for the answers!

When the vtable is in the cache, the performance difference between virtual and non-virtual functions that actually do something is very small. It's definitely not something you should normally worry about when developing software in C++. And as others have said, benchmarking unoptimised code in C++ is pointless.

Profiling unoptimised code is virtually meaningless. Use -O2 to get a meaningful result. Using -O3 may produce even faster code, but it may not produce a realistic result unless you compile A::f and B::f separately from main (i.e., in separate compilation units).
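
For example, compiling the single-file version from the question with optimisation enabled:

$ g++ -O2 test.cpp -o normal-O2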

Based on the feedback, maybe even -O2 is too aggressive: the ~2 second result is because the compiler optimized the loop away entirely. Direct calls aren't actually that fast, and it should be hard to observe any significant difference. Move the implementations of f into a separate compilation unit to get real numbers: define the classes in a .h, but define A::f and B::f in their own .cc file.
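
For example, with the A.cpp/B.cpp/main.cpp split from the edit above, something like:

$ g++ -O2 -c A.cpp B.cpp main.cpp
$ g++ A.o B.o main.o -o virtual-O2

With A::f and B::f compiled separately, the compiler can't see their bodies while compiling main.cpp, so it can't inline the calls or optimize the loop away (at least not without link-time optimization).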