[vsipl++] [patch] Serial Expression Profiling
Jules Bergmann
jules at codesourcery.com
Mon Aug 14 18:14:30 UTC 2006
Don,
This patch looks good. There are 2 things I would like to change:
- First, I would like to move the profiling code from the evaluator
class specializations into the Dispatch_assign class. This requires
some changes to the Evaluator framework, and some help from the
evaluators themselves, but not much.
Doing this will reduce the amount of duplication, making it easier
to add a new evaluator. It will also give us visibility into
distributed expressions before they are reduced.
- Second, the expression name generator is pretty cool, but the
psuedo-postfix notation seems unintuitive. Since the framework you
have is fairly general, it should not be too hard to generate a
name with proper prefix or infix notation.
This email only discusses the second bullet. I need to take a look at the
evaluator framework before discussing the first in more detail.
-- Jules
> All expressions are tagged in the profiler output with "Expr[/type/]",
> where type is LF, Dense or Trans. Following that is the dimensionality
> (1D, 2D or 3D), a compact representation of the expression and finally
> the size(s). For example, the following expression (where all are the
> same size and of type Vector<T>):
>
> r = v1 * v2;
>
> Gets logged as:
>
> Expr[LF] 1D *SS 262144 : 66929535 : 1 : 262144 : 14.0664
>
> The expression is represented as "*SS", meaning "the binary multiply
> operator applied to two single-precision real values" (again using the
> BLAS/LAPACK convention of S/D/C/Z for operand types).
> In general, operators are designated with a 'u', 'b' or 't' for unary,
> binary and ternary operators respectively, with the exception of the
> common binary operators, shown in their more familiar +-*/ form.
> Multiple operators are evaluated in order, therefore
>
> v1 * T(4) + v2 / v3
>
> is tagged as:
>
> Expr[LF] 1D *SS/SS+SS 2048 : 1527534 : 1 : 6144 : 14.4451
I think it would be easier to read the expression name if
- it used prefix or infix notation
- treated sub-expressions differently from leaves
I.e. the above expression could be:
prefix: +(*(S,S), /(S,S))
infix: (S*S)+(S/S)
I would suggest doing infix first, even though it is harder to read,
and then adding support for infix, since we'll have to support
operators (such as 'hypot') that don't have infix equivalents.
>
> Changing it to
>
> (v1 * T(4) + v2) / v3
>
> yields:
>
> Expr[LF] 1D *SS+SS/SS 2048 : 1536309 : 1 : 6144 : 14.3626
As an example of how this notation breaks down, (v1 * T(4)) / (v2 + v3) also
has the same name: '*SS+SS/SS'.
An alternative to generating the name in this way is to use the standard
C++ typeinfo (i.e. 'typeid(ExprBlockType).name()'). This is *much* more
verbose and difficult to read than the above, but it would be possible to
clean up in a post-processing step.
> Index: src/vsip/impl/expr_op_names.hpp
> ===================================================================
> +template <template <typename, typename> class BinaryOp,
> + typename T1,
> + typename T2>
> +struct Binary_op_tag
> +{
> + static std::string tag()
> + {
> + std::ostringstream st;
> + st << Binary_op_name<BinaryOp>::value
> + << Type_name<T1>::value
> + << Type_name<T2>::value;
You can determine the parameter value types from the block type
and have transform() roll them up, getting rid of the T1 and T2
parameters (see below)
> +
> + return st.str();
> + }
> +};
> +/// Reduction to generate a tag for the entire expression tree
> +
> +struct Reduce_expr_op_name
> +{
> +public:
> +
> + template <typename BlockT>
> + struct transform
> + {
> + // Leaf nodes get empty tags
> + static std::string tag() { return std::string(); }
Leaf nodes can figure out what they are: 'S', 'C', etc.
return Type_name<typename BlockT::value_type>::value;
Also, we should specialize Scalar_block so that scalars can be
distinguished from vectors (perhaps 's' for scalar float, 'S' for vector
float?).
> + };
> +
> + template <dimension_type Dim0,
> + template <typename> class Op,
> + typename Block,
> + typename Type>
> + struct transform<Unary_expr_block<Dim0, Op,
> + Block, Type> const>
> + {
> + static std::string tag()
> + {
> + return transform<Block>::tag() + Unary_op_tag<Op, Type>::tag();
> + }
> + };
> +
> + template <dimension_type Dim0,
> + template <typename, typename> class Op,
> + typename LBlock,
> + typename LType,
> + typename RBlock,
> + typename RType>
> + struct transform<Binary_expr_block<Dim0, Op,
> + LBlock, LType,
> + RBlock, RType> const>
> + {
> + static std::string tag()
> + {
> + return transform<LBlock>::tag() + transform<RBlock>::tag() +
> + Binary_op_tag<Op, LType, RType>::tag();
To do prefix notation:
return Binary_op_tag<Op>::tag() + std::string("(")
+ transform(RBlock>::tag() + std::string(",")
+ transform<LBlock>::tag() + std::string(")");
To handle a mix of prefix and infix
if (Binary_op_tag<Op>::is_infix)
return string("(")
+ transform(RBlock>::tag()
+ Binary_op_tag<Op>::tag()
+ transform<LBlock>::tag()
+ string(")");
else
return Binary_op_tag<Op>::tag() + std::string("(")
+ transform(RBlock>::tag() + std::string(",")
+ transform<LBlock>::tag() + std::string(")");
(Bonus points for figuring out how to avoid unnecessary parenthesis
for infix! :)
> + }
> + };
> Index: src/vsip/impl/expr_ops_per_point.hpp
> ===================================================================
> +//UNARY_OPS_FUNCTOR(bnot)
-> 1 op
> +//UNARY_OPS_FUNCTOR(ceil)
-> 1 op
> +//UNARY_OPS_FUNCTOR(conj)
-> 1 op
> +UNARY_OPS_FUNCTOR(cos, T, 1);
I think that sin, cos, and tan of a float are more expnsive than 1
floating-point op
> +UNARY_OPS_FUNCTOR(cos, complex<T>, 12);
> +//UNARY_OPS_FUNCTOR(floor)
-> 1 op
> +//UNARY_OPS_FUNCTOR(imag)
-> 0 op
> +//UNARY_OPS_FUNCTOR(lnot)
-> 1 op
> +//UNARY_OPS_FUNCTOR(mag)
mag -> sqrt(R^2 + I^2) -> 3 + sqrt? ops
> +//UNARY_OPS_FUNCTOR(magsq)
magsq -> R^2 + I^2 -> 3 ops
> +//UNARY_OPS_FUNCTOR(neg)
neg -> 1 op
> +//UNARY_OPS_FUNCTOR(real)
real -> 0 op
> +UNARY_OPS_FUNCTOR(sqrt, T, 1);
-> sqrt for T is probably more like 8 flops
> +//BINARY_OPS_FUNCTOR(band)
-> 1 op
> +//BINARY_OPS_FUNCTOR(bor)
-> 1 op
> +//BINARY_OPS_FUNCTOR(bxor)
-> 1 op
> +//BINARY_OPS_FUNCTOR(eq)
-> 1 op
> +//BINARY_OPS_FUNCTOR(ge)
-> 1 op
> +//BINARY_OPS_FUNCTOR(gt)
-> 1 op
> +//BINARY_OPS_FUNCTOR(land)
-> 1 op
> +//BINARY_OPS_FUNCTOR(le)
-> 1 op
> +//BINARY_OPS_FUNCTOR(lt)
-> 1 op
> +//BINARY_OPS_FUNCTOR(lor)
-> 1 op
> +//BINARY_OPS_FUNCTOR(lxor)
-> 1 op
> +//BINARY_OPS_FUNCTOR(max)
-> 1 op
> +//BINARY_OPS_FUNCTOR(maxmgsq)
-> 3+1 ops
> +//BINARY_OPS_FUNCTOR(min)
-> 1 op
> +//BINARY_OPS_FUNCTOR(minmgsq)
-> 3+1 ops
> +//BINARY_OPS_FUNCTOR(ne)
-> 1 op
> +// The cost is computed by adding the costs for pure real, mixed
real-complex and
> +// pure complex adds and multiples for the given equation:
> +
> +// (t1 + t2) * t3
> +// < adds > < muls >
> +// R M C R M C
> +TERNARY_OPS_FUNCTOR(am, T1, T2, T3, 1 + 0 + 0*2 + 1 + 0*2 + 0*6)
> +TERNARY_OPS_FUNCTOR(am, T1, T2, C3, 1 + 0 + 0*2 + 0 + 0*2 + 0*6)
> +TERNARY_OPS_FUNCTOR(am, T1, C2, T3, 0 + 1 + 0*2 + 0 + 1*2 + 0*6)
> +TERNARY_OPS_FUNCTOR(am, T1, C2, C3, 0 + 1 + 0*2 + 0 + 0*2 + 1*6)
> +TERNARY_OPS_FUNCTOR(am, C1, T2, T3, 0 + 1 + 0*2 + 0 + 1*2 + 0*6)
> +TERNARY_OPS_FUNCTOR(am, C1, T2, C3, 0 + 1 + 0*2 + 0 + 0*2 + 1*6)
> +TERNARY_OPS_FUNCTOR(am, C1, C2, T3, 0 + 0 + 1*2 + 0 + 1*2 + 0*6)
> +TERNARY_OPS_FUNCTOR(am, C1, C2, C3, 0 + 0 + 1*2 + 0 + 0*2 + 1*6)
good stuff!
--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
More information about the vsipl++
mailing list