InlineEvaluator implementation question

Mon Dec 16 15:55:52 UTC 2002

Hi!

Does anyone remember why we create copies of the LHS and RHS inside
the KernelEvaluator<InlineKernelTag>::evaluate() methods? (There is
similar code in ReductionEvaluator<InlineKernelTag>::evaluate().) That
is, there is code like

  template<class LHS,class Op,class RHS,class Domain>
  inline static void evaluate(const LHS& lhs,const Op& op,const RHS& rhs,
                              const Domain& domain,WrappedInt<1>)
  {
    CTAssert(Domain::unitStride);
    PAssert(domain[0].first() == 0);
    LHS localLHS(lhs);
    RHS localRHS(rhs);
    int e0 = domain[0].length();
    for (int i0=0; i0<e0; ++i0)
      op(localLHS(i0),localRHS.read(i0));
  }

instead of

  template<class LHS,class Op,class RHS,class Domain>
  inline static void evaluate(const LHS& lhs,const Op& op,const RHS& rhs,
                              const Domain& domain,WrappedInt<1>)
  {
    CTAssert(Domain::unitStride);
    PAssert(domain[0].first() == 0);
    int e0 = domain[0].length();
    for (int i0=0; i0<e0; ++i0)
      op(lhs(i0),rhs.read(i0));
  }

Changing the evaluate methods to not copy saves some code size but doesn't
seem to affect performance (checked with gcc 3.0 only).
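
For anyone who wants to try the two variants outside the library, here is
a minimal standalone sketch. It is not the library code: View, Assign,
evaluateWithCopies, evaluateDirect and variantsAgree are hypothetical names
made up for illustration, and the sketch assumes that the LHS/RHS types
have shallow-copy (handle) semantics, so that a local copy still refers to
the same underlying data. It also uses shared_ptr from a newer C++ standard
than the compiler mentioned above supports.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an array type with shallow-copy semantics:
// copying a View copies only the handle, not the elements.
class View {
public:
  explicit View(int n)
    : data_(std::make_shared<std::vector<double> >(
        static_cast<std::size_t>(n), 0.0)) {}

  // Element access is const but returns a mutable reference, because the
  // kernels call lhs(i0) through a const reference.
  double& operator()(int i) const {
    assert(i >= 0 && i < length());
    return (*data_)[static_cast<std::size_t>(i)];
  }

  double read(int i) const {
    assert(i >= 0 && i < length());
    return (*data_)[static_cast<std::size_t>(i)];
  }

  int length() const { return static_cast<int>(data_->size()); }

private:
  std::shared_ptr<std::vector<double> > data_;
};

// Hypothetical assignment functor standing in for Op.
struct Assign {
  void operator()(double& l, double r) const { l = r; }
};

// Variant 1: make local copies of both operands before the loop.
template<class LHS, class Op, class RHS>
inline void evaluateWithCopies(const LHS& lhs, const Op& op,
                               const RHS& rhs, int e0)
{
  LHS localLHS(lhs);
  RHS localRHS(rhs);
  for (int i0 = 0; i0 < e0; ++i0)
    op(localLHS(i0), localRHS.read(i0));
}

// Variant 2: use the reference arguments directly.
template<class LHS, class Op, class RHS>
inline void evaluateDirect(const LHS& lhs, const Op& op,
                           const RHS& rhs, int e0)
{
  for (int i0 = 0; i0 < e0; ++i0)
    op(lhs(i0), rhs.read(i0));
}

// Runs both variants on the same input and compares the results.
inline bool variantsAgree(int n)
{
  View src(n), a(n), b(n);
  for (int i = 0; i < n; ++i)
    src(i) = 1.5 * i;

  evaluateWithCopies(a, Assign(), src, n);
  evaluateDirect(b, Assign(), src, n);

  for (int i = 0; i < n; ++i)
    if (a.read(i) != b.read(i) || a.read(i) != src.read(i))
      return false;
  return true;
}
```

Under the shallow-copy assumption both variants write to the same
underlying storage, so they should give identical results. The sketch
checks only that equivalence; it says nothing about code size or speed,
which would have to be measured on the real evaluators.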

Richard.

--
Richard Guenther <richard.guenther at uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/