A way to handle CSE in Expression Templates
Richard Guenther
rguenth at tat.physik.uni-tuebingen.de
Fri Jun 18 11:42:56 UTC 2004
Hi!
I just got bored writing subexpressions multiple times in expressions and
then to watch the compiler not being able to find out it can do CSE.
Here's a quick-and-dirty solution which has the only disadvantage to
require a tag class per CSE and produces a global storage per CSE (which
actually gets written/read to at least once, dependent on the capabilities
of your optimizing compiler).
With the code snippets below you can now write stuff like:
struct DivV {};
...
ScalarCode<GradV<Dim> >()(gradv, v);
a_pg(I) = where(save<DivV>(dot(gradv, Vector<Dim>(1))) < 0.0,
c * rh * pow2(spacings(rh).comp(0)) * ref<DivV, double>()),
0.0);
to compute f.i. an artificial pressure like above. See that we create a
CSE that contains dot(gradv, Vector<Dim>(1)) - i.e. actually the
divergence of v, and re-use the computed value in the first arm of the
where expression using ref<DivV, double>(). The requirement of specifying
the type of the CSE expression result looks annoying, but I didn't find a
way to avoid this.
Basically the above is equal to the following 1d C code:
double divv;
for (int i=0; i<n; ++i) {
divv = dot(gradv(i), Vector<Dim>(1));
if (divv < 0.0)
a_pg(i) = c * rh(i) * pow2(spacings(rh).comp(0)(i)) * divv;
else
a_pg(i) = 0.0;
}
Not only is the compiler much more happy with the above, compile-time also
benefits. Caveats are that ref<> objects may not occour with scalars -
i.e. just writing 2*ref<>() does not work. You can work around this by
creating IndexFunction Arrays for ref<>(), but that seemed ugly, too,
as you need to specify domains et al.
Hope someone may find this useful, like I did.
Richard.
template <class Tag, class T>
struct FnSave
{
inline T
operator()(const T &a) const
{
val = a;
return a;
}
static T val;
};
template <class Tag, class T>
T FnSave<Tag, T>::val;
template<class Tag, int D1,class T1,class E1>
inline typename MakeReturn<UnaryNode<FnSave<Tag, T1>,
typename CreateLeaf<Array<D1,T1,E1> >::Leaf_t> >::Expression_t
save(const Array<D1,T1,E1> & l)
{
typedef UnaryNode<FnSave<Tag, T1>,
typename CreateLeaf<Array<D1,T1,E1> >::Leaf_t> Tree_t;
return MakeReturn<Tree_t>::make(Tree_t(
CreateLeaf<Array<D1,T1,E1> >::make(l)));
}
template<class Tag, class M1,class T1,class E1>
inline typename MakeReturn<UnaryNode<FnSave<Tag, T1>,
typename CreateLeaf<Field<M1,T1,E1> >::Leaf_t> >::Expression_t
save(const Field<M1,T1,E1> & l)
{
typedef UnaryNode<FnSave<Tag, T1>,
typename CreateLeaf<Field<M1,T1,E1> >::Leaf_t> Tree_t;
return MakeReturn<Tree_t>::make(Tree_t(
CreateLeaf<Field<M1,T1,E1> >::make(l)));
}
template <class Tag, class T>
struct SaveRef
{
SaveRef() {}
inline
operator T() const
{
return FnSave<Tag, T>::val;
}
};
template<class Tag, class T>
inline Expression<Scalar<SaveRef<Tag, T> > >
ref()
{
return Expression<Scalar<SaveRef<Tag, T> > >(SaveRef<Tag, T>());
}
More information about the pooma-dev
mailing list