[pooma-dev] Re: InlineEvaluator implementation question
Jeffrey Oldham
oldham at codesourcery.com
Mon Dec 16 19:42:27 UTC 2002
Richard Guenther wrote:
> On Mon, 16 Dec 2002, Mark Mitchell wrote:
>
>
>>--On Monday, December 16, 2002 04:55:52 PM +0100 Richard Guenther
>><rguenth at tat.physik.uni-tuebingen.de> wrote:
>>
>>
>>>Hi!
>>>
>>>Does anyone remember why we create copies of the LHS and RHS inside
>>>the KernelEvaluator<InlineKernelTag>::evaluate() methods (within
>>>ReductionEvaluator<InlineKernelTag>::evaluate() is similar code)? I.e.
>>>there is code like
>>>
>>>  template<class LHS, class Op, class RHS, class Domain>
>>>  inline static void evaluate(const LHS& lhs, const Op& op, const RHS& rhs,
>>>                              const Domain& domain, WrappedInt<1>)
>>>  {
>>>    CTAssert(Domain::unitStride);
>>>    PAssert(domain[0].first() == 0);
>>>    LHS localLHS(lhs);
>>>    RHS localRHS(rhs);
>>>    int e0 = domain[0].length();
>>>    for (int i0 = 0; i0 < e0; ++i0)
>>>      op(localLHS(i0), localRHS.read(i0));
>>>  }
>>
>>I'm pretty sure that this copy allowed some C++ compilers (KCC) to see
>>that some parts of lhs/rhs were loop-invariant, and then hoist references
>>to those fields out of the loop. (The compiler can see that nothing can
>>modify localLHS; it's less obvious to it that nothing can modify rhs
>>since it doesn't know what else might point to that location.)
>
>
> Hmm - as both lhs and rhs are declared const, isn't this enough to tell
> the compiler? Or does the compiler have to assume that every function call
> can have a side effect on any (non-local) variable?
>
> Well, at least gcc creates worse (larger) code with copying than without.
Using the copies measurably reduced execution time in previous experiments.
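
To illustrate the aliasing argument quoted above, here is a minimal sketch,
not the POOMA code. ScaledView, Assign, evaluateByReference,
evaluateWithCopies and bothVariantsAgree are hypothetical names made up for
this example. A const reference only stops the function from writing through
that reference; it does not promise that the object is unchanged by other
stores. In the by-reference version the compiler may therefore have to
assume that a store through lhs(i) can overwrite rhs.scale, whereas the
local copy's address never escapes. Whether a particular compiler actually
hoists the load is something this sketch does not show.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an engine object (not a real POOMA class).
struct ScaledView {
  double* data;   // element storage
  double scale;   // loop-invariant field that read() loads on every access
  double& operator()(int i) const { return data[i]; }
  double read(int i) const { return scale * data[i]; }
};

// Hypothetical assignment functor playing the role of Op.
struct Assign {
  void operator()(double& l, double r) const { l = r; }
};

// Works through the references. A store via lhs(i) writes a double, and
// rhs.scale is also a double somewhere in memory, so the compiler may have
// to reload rhs.scale on each iteration unless it can prove no overlap.
template <class LHS, class Op, class RHS>
inline void evaluateByReference(const LHS& lhs, const Op& op,
                                const RHS& rhs, int n) {
  for (int i = 0; i < n; ++i)
    op(lhs(i), rhs.read(i));
}

// Same loop with local copies, as in the quoted evaluate(). The copies are
// automatic objects whose addresses are not taken by anything outside this
// function, so their fields are easier to treat as loop-invariant.
template <class LHS, class Op, class RHS>
inline void evaluateWithCopies(const LHS& lhs, const Op& op,
                               const RHS& rhs, int n) {
  LHS localLHS(lhs);
  RHS localRHS(rhs);
  for (int i = 0; i < n; ++i)
    op(localLHS(i), localRHS.read(i));
}

// Runs both variants on non-overlapping storage and compares the results.
inline bool bothVariantsAgree() {
  std::vector<double> src{1.0, 2.0, 3.0, 4.0};
  std::vector<double> a(4, 0.0), b(4, 0.0);
  ScaledView rhs{src.data(), 2.0};
  ScaledView lhsA{a.data(), 1.0};
  ScaledView lhsB{b.data(), 1.0};
  evaluateByReference(lhsA, Assign(), rhs, 4);
  evaluateWithCopies(lhsB, Assign(), rhs, 4);
  assert(a[3] == 8.0);  // 2.0 * 4.0
  return a == b;
}
```

Both variants compute the same values when the storage does not overlap; the
only intended difference is how much the optimizer can prove about the loop.
Comparing the generated assembly for the two functions is the way to check
what a given compiler does with them.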
Thanks,
Jeffrey D. Oldham
oldham at codesourcery.com