[pooma-dev] Temporary copies do appear...??

Richard Guenther rguenth at tat.physik.uni-tuebingen.de
Fri May 21 08:31:40 UTC 2004


Radek Pecher wrote:

> Basically, simple algebraic expressions based on the tiny Vector class 
> do create temporary Full-engine copies of individual subexpressions, 
> as opposed to what POOMA claims to prevent. The following short main 
> code:
> 
> 
> 
> #include "Pooma/Arrays.h"
> 
> int main(int argc, char* argv[])
> {
>   Pooma::initialize(argc, argv);
> 
>   Vector<2> v1(1, 2), v2;
>   v2 = v1*v1 + v1*v1;
> 
>   Pooma::finalize(); 
>   return 0;
> }

You are right that gcc 3.3 does not optimize away the copy calls.  But 
compiling the above with g++-3.4 -O2 -fpeel-loops results in straight-line 
code.  With the Intel 8.0 compiler the asm code is a bit obfuscated
and there are calls to destructors left (not inlining these seems to be 
a common problem of the Intel compiler).

I don't know whether one can structurally avoid the extra constructor 
calls inside the Vector code, but maybe you can have a look at it?  This 
is certainly a point where optimization will be useful (if not for 
compilation speed).


> g++ -ftemplate-depth-60 -Drestrict=__restrict__ -fno-exceptions 
> -DNOPAssert -DNOCTAssert -O2 -fno-default-inline -funroll-loops 
> -fstrict-aliasing -o Main Main.cpp -I$HOME/lib/Optim/POOMA/linux/lib/
> PoomaConfiguration-gcc -I$HOME/lib/Optim/POOMA/linux/src -I$HOME/lib/
> Optim/POOMA/linux/lib -fno-exceptions -L$HOME/lib/Optim/POOMA/linux/
> lib -lpooma-gcc -lm

Also, if you are using gcc, you may consider applying the leafify patch,
available at 
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/
to your gcc distribution and making the POOMA evaluators use it (I can
provide a patch to you).  That is worth about a 50% performance increase.

Hope that helps,
Richard.



