[pooma-dev] Good News. Intel's ICC 8.0 Beta looks promising, now.
Richard Guenther
rguenth at tat.physik.uni-tuebingen.de
Tue Jun 3 19:36:33 UTC 2003
On Tue, 3 Jun 2003, Paul A. Renard wrote:
> Back in February, 2003, I reported that Intel's icc 7.0 compiler was producing code
> using Pooma constructs that was 2X-4X slower than KCC. Since then, the folks at
> Intel have worked hard, and for my little test (reproduced at the end of this
> message), the icc 8.0 Beta compiler (l_cc_b_8.0.023) is now producing code slightly
> faster (maybe 5-10%) than KCC, and certainly comparable to hand-written loops.
>
> The only optimization items for compiling were:
> -O3 -DNOPAssert -DNOCTAssert -tpp7 -xW
> but the last two are particular to Pentium 4 vectorization, which plays a very small
> part in the tests I did, and which probably caused the "slightly faster", rather
> than "just about the same speed".
>
> So, icc 8.0 seems to be a useful choice in compilers (for Linux and Windows).
Unfortunately, my tests show it's better, but still worse than gcc.
Your test is 1D; try 3D and it starts to suck. Inlining is still the
culprit, as is CSE (common subexpression elimination) with, for
instance, Loc<n> objects (where n>1).
With the following gcc3.3 patch applied
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/leafify-3.3-2
or leafify-3.4-2 for mainline, I get very good results with gcc3.3.
The only parts that need to change inside POOMA are the expression
kernels in src/Evaluator, where you put __attribute__((leafify)) on the
kernel functions (I can extract a patch if you like).
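
To illustrate the idea, here is a minimal sketch of an annotated 3D
kernel. It is not the actual POOMA code: Loc3, Field3, assign_kernel and
the KERNEL_LEAFIFY macro are hypothetical names made up for this
example. The macro expands to nothing by default, so the sketch compiles
with any compiler; with the patched gcc it could presumably be defined
as __attribute__((leafify)).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macro, not part of POOMA. Empty by default; with the
// leafify patch applied it could be defined as __attribute__((leafify)).
#ifndef KERNEL_LEAFIFY
#define KERNEL_LEAFIFY
#endif

// Hypothetical stand-in for a multi-dimensional index such as Loc<3>.
struct Loc3 {
    int i, j, k;
};

// Hypothetical stand-in for a 3D array indexed by a Loc3.
struct Field3 {
    int nx, ny, nz;
    std::vector<double> data;

    Field3(int x, int y, int z, double v = 0.0)
        : nx(x), ny(y), nz(z),
          data(static_cast<std::size_t>(x) * y * z, v) {}

    // Row-major offset. Each call repeats the same index arithmetic, which
    // the compiler can only share between calls (CSE) after inlining them.
    std::size_t offset(const Loc3& l) const {
        return (static_cast<std::size_t>(l.i) * ny + l.j) * nz + l.k;
    }

    double& operator()(const Loc3& l) { return data[offset(l)]; }
    double operator()(const Loc3& l) const { return data[offset(l)]; }
};

// The kernel computes a = b + 2*c elementwise. The attribute (if enabled)
// goes on this function, so the calls made inside the loop nest get inlined.
KERNEL_LEAFIFY
void assign_kernel(Field3& a, const Field3& b, const Field3& c) {
    assert(a.data.size() == b.data.size());
    assert(a.data.size() == c.data.size());
    for (int i = 0; i < a.nx; ++i)
        for (int j = 0; j < a.ny; ++j)
            for (int k = 0; k < a.nz; ++k) {
                Loc3 l = {i, j, k};
                a(l) = b(l) + 2.0 * c(l);
            }
}
```

The point of annotating the kernel rather than the accessors is that
only the outermost loop-carrying function needs to be touched.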
Richard.