[pooma-dev] compilers and speed
George Talbot
gtalbot at locuspharma.com
Mon Apr 28 13:00:04 UTC 2003
Wow! Now that's really good information! Thank you very much.
May I ask two additional questions? Do you have a particular snapshot
of GCC 3.3 that works well? Did you get it in binary form or did you
build it yourself?
Thanks.
--George
On Sat, 2003-04-26 at 05:31, Richard Guenther wrote:
> Hi!
>
> I decided to just sum up what I do to get the most performance out of
> POOMA. First, I use POOMA from CVS (which has bugfixes and support for
> ISO-conforming compilers) and Cheetah with SCore MPI. To get anywhere
> near hand-coded performance for simple Brick arrays, I need to use a
> recent gcc-3.3 with at least -O2 -funroll-loops --param
> min-inline-insns=250. Selecting the right -march of course helps, and
> -ffast-math perhaps helps in some cases. Intel icc is disappointing in
> performance, but I didn't try using profile-directed optimization with it.
>
> Performance compared to hand-coded loops is on par as soon as you go
> out of L2 cache; within cache, don't expect anything good from POOMA.
>
> The real advantage of POOMA for single Brick arrays is the possibility to
> adjust loop processing for cache optimality (i.e. do handcrafted
> "multipatching" inside the evaluators) - still on my todo-list.
>
> I never had KAI CC available to compare its performance, but I cannot
> confirm that IRIX CC does a good job at optimizing POOMA. I hope Intel
> icc will solve its problems, as then _very_ simple OpenMPization (I've
> done it) can be applied to POOMA as well.
>
> Hope this answers most of the questions,
>
> Richard.
>
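
For readers unfamiliar with the "multipatching" idea mentioned above, here is a minimal sketch of cache blocking written as plain hand-coded loops. It is not POOMA code and does not use POOMA's evaluators; the function names (stencil_plain, stencil_blocked), the 5-point stencil kernel, and the tile parameter are all assumptions chosen for illustration. It only shows the loop structure: the same interior points are visited, but in tile-by-tile patches, so each patch can stay in cache while it is processed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical example kernel: 5-point average over the interior of an
// n x n grid stored row-major in a flat vector. Plain row-by-row traversal.
void stencil_plain(const std::vector<double>& a, std::vector<double>& b,
                   std::size_t n) {
    for (std::size_t i = 1; i + 1 < n; ++i)
        for (std::size_t j = 1; j + 1 < n; ++j)
            b[i * n + j] = 0.25 * (a[(i - 1) * n + j] + a[(i + 1) * n + j] +
                                   a[i * n + j - 1] + a[i * n + j + 1]);
}

// Same computation, but the interior is walked in tile x tile patches.
// Precondition (assumed, not checked): tile > 0.
void stencil_blocked(const std::vector<double>& a, std::vector<double>& b,
                     std::size_t n, std::size_t tile) {
    for (std::size_t ii = 1; ii + 1 < n; ii += tile) {
        const std::size_t iend = std::min(ii + tile, n - 1);  // clip last patch
        for (std::size_t jj = 1; jj + 1 < n; jj += tile) {
            const std::size_t jend = std::min(jj + tile, n - 1);
            for (std::size_t i = ii; i < iend; ++i)
                for (std::size_t j = jj; j < jend; ++j)
                    b[i * n + j] =
                        0.25 * (a[(i - 1) * n + j] + a[(i + 1) * n + j] +
                                a[i * n + j - 1] + a[i * n + j + 1]);
        }
    }
}
```

Both functions write identical values to b, since each element is computed with the same expression. Whether the blocked version is actually faster depends on the kernel, the tile size, and the cache sizes of the machine, so the tile value would have to be tuned by measurement, for example with the gcc flags quoted above.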