[pooma-dev] KCC versus icc (and gcc)
Tarjei Knapstad
tarjeik at chemcon.no
Tue Mar 4 10:30:55 UTC 2003
On Wed, 2003-02-26 at 20:27, Richard Guenther wrote:
> Hi!
>
> I remember problems with the inliner, i.e. it refused to inline
> some of the expression template machinery. You might want to search
> for an option letting you tune the inlining behavior or try profile
> directed optimizations. With standard -O3 icc is not always faster
> than gcc3.2.2 with -O3.
>
Just thought I'd add a bit to that. A while back, some others and I
wrote a small benchmark to measure the performance hit of dynamic_cast
relative to static_cast (it needs the Boost libraries):
========== BEGIN CODE ================
#include <iostream>
#include <boost/timer.hpp>

using namespace std;

const int num = 10000000;

class TestBase
{
public:
    virtual ~TestBase() {}
    virtual void f() {}
    void f2() {}
};

class Test1 : public TestBase
{
public:
    virtual ~Test1() {}
    virtual void f() {}
    void f2() {}
};

// Ensure that it doesn't optimise away the reading of it in the loops
volatile TestBase* testBasePtr = new Test1();

int main()
{
    // Loop 1: checked downcast with dynamic_cast
    boost::timer t1;
    for (unsigned int i = 0; i != num; ++i)
    {
        Test1* test1 = const_cast<Test1*>(dynamic_cast<volatile Test1*>(testBasePtr));
        if (test1)
        {
            test1->f2();
        }
    }
    cout << "Elapsed t1 " << t1.elapsed() << " " << endl;

    // Loop 2: unchecked downcast with static_cast
    boost::timer t2;
    for (unsigned int i = 0; i != num; ++i)
    {
        Test1* test1 = const_cast<Test1*>(static_cast<volatile Test1*>(testBasePtr));
        if (test1)
        {
            test1->f2();
        }
    }
    cout << "Elapsed t2 " << t2.elapsed() << " " << endl;

    return 0;
}
=========== END CODE ==========
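We ran this with optimizations turned on under both gcc:
Elapsed t1 0.52
Elapsed t2 0.01
and Intel 7:
Elapsed t1 4.03
Elapsed t2 0.12
(times in seconds; t1 is the dynamic_cast loop, t2 the static_cast loop).
The results are quite staggering: the icc build is roughly eight times
slower than the gcc build on the dynamic_cast loop. Similar differences
show up when testing boost::lexical_cast on the two compilers.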
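For anyone without Boost installed, the same dynamic_cast versus static_cast comparison can probably be timed with std::chrono instead. The following is only a sketch under my own assumptions, not the code that produced the numbers above: it needs a C++11 compiler, and the names Base, Derived, CastTimings and time_casts are made up for illustration. It also makes the pointer itself volatile, which is a change from the original benchmark, and it counts successful casts rather than calling an empty member function.

```cpp
#include <chrono>

// Hypothetical classes, loosely mirroring TestBase/Test1 from the benchmark.
struct Base {
    virtual ~Base() {}
    virtual void f() {}
};

struct Derived : Base {
    void f() override {}
};

// The pointer itself is volatile here, so it is re-read on every iteration.
Base* volatile g_ptr = new Derived();

// Hypothetical result type: elapsed seconds and successful casts per loop.
struct CastTimings {
    double dynamic_seconds;
    double static_seconds;
    long dynamic_hits;
    long static_hits;
};

CastTimings time_casts(int iterations) {
    typedef std::chrono::steady_clock Clock;
    CastTimings r = CastTimings();

    // Loop 1: checked downcast.
    Clock::time_point start = Clock::now();
    for (int i = 0; i != iterations; ++i) {
        Base* p = g_ptr;  // volatile read
        if (dynamic_cast<Derived*>(p)) {
            ++r.dynamic_hits;
        }
    }
    r.dynamic_seconds =
        std::chrono::duration<double>(Clock::now() - start).count();

    // Loop 2: unchecked downcast.
    start = Clock::now();
    for (int i = 0; i != iterations; ++i) {
        Base* p = g_ptr;  // volatile read
        if (static_cast<Derived*>(p)) {
            ++r.static_hits;
        }
    }
    r.static_seconds =
        std::chrono::duration<double>(Clock::now() - start).count();

    return r;
}
```

Calling time_casts(10000000) from main and printing dynamic_seconds and static_seconds should give output comparable in spirit to the t1 and t2 lines above. I am not quoting any timings for it, since they depend entirely on compiler, flags and machine.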
The Intel compiler does offer quite a lot of advanced optimization
options, which I haven't had much time to experiment with (I normally
use gcc and have only played around with icc).
I would also be interested in your findings if you can make icc generate
code that is closer to KCC's performance.
Regards,
--
Tarjei