[patch] Serial Expression Profiling

Don McCoy don at codesourcery.com
Wed Aug 9 20:26:16 UTC 2006


The attached patch extends the profiling further by handling some of the 
dispatched expression evaluations.  The three specific cases covered are:

    * Loop fusion - collapsing multiple loops into one when doing
      element-wise operations on views.
    * Dense expressions - converting tightly-packed 2-D and 3-D views
      into 1-D views that are then evaluated normally.
    * Matrix transpose - transposing matrices with possibly different
      storage formats (row-major or column-major).

This can conceivably be extended to cover cases where we are dispatching 
to IPP and SAL as well.

All expressions are tagged in the profiler output with "Expr[/type/]", 
where type is LF, Dense or Trans.  Following that is the dimensionality 
(1D, 2D or 3D), a compact representation of the expression and finally 
the size(s).  For example, the following expression (where all operands 
are the same size and of type Vector<T>):

    r = v1 * v2;

is logged as:

    Expr[LF] 1D *SS 262144 : 66929535 : 1 : 262144 : 14.0664

The expression is represented as "*SS", meaning "the binary multiply 
operator applied to two single-precision real values" (again using the 
BLAS/LAPACK convention of S/D/C/Z for operand types). 
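
As an illustration of the S/D/C/Z convention, here is a minimal sketch 
of how an element type could be mapped to its tag letter.  The name 
type_letter is hypothetical and is not taken from the patch; only the 
letter assignments follow the BLAS/LAPACK convention.

```cpp
#include <complex>

// Hypothetical helper (not from the patch): maps an element type to the
// BLAS/LAPACK-style letter that appears in the expression tag.
// The primary template is left undefined, so unsupported types fail to
// compile instead of silently producing a wrong letter.
template <typename T> struct type_letter;

template <> struct type_letter<float> {
  static constexpr char value() { return 'S'; }  // single-precision real
};
template <> struct type_letter<double> {
  static constexpr char value() { return 'D'; }  // double-precision real
};
template <> struct type_letter<std::complex<float>> {
  static constexpr char value() { return 'C'; }  // single-precision complex
};
template <> struct type_letter<std::complex<double>> {
  static constexpr char value() { return 'Z'; }  // double-precision complex
};
```

With T = float, both operands of "r = v1 * v2" map to 'S', which gives 
the "SS" part of the "*SS" tag above.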

In general, operators are designated with a 'u', 'b' or 't' for unary, 
binary and ternary operators respectively.  The exception is the common 
binary operators, which are shown in their more familiar +, -, * and / 
form. 

Multiple operators are listed in the order in which they are evaluated. 
Therefore,

    v1 * T(4) + v2 / v3

is tagged as:

    Expr[LF] 1D *SS/SS+SS 2048 : 1527534 : 1 : 6144 : 14.4451

Changing it to

    (v1 * T(4) + v2) / v3

yields:

    Expr[LF] 1D *SS+SS/SS 2048 : 1536309 : 1 : 6144 : 14.3626
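
The ordering in the two tags above is what a post-order walk of the 
expression tree produces: both operands are visited first, then the 
operator is emitted together with the type letters of its operands.  The 
following is a minimal run-time sketch of that idea only.  Node, leaf, 
binary and expr_tag are hypothetical names, the patch itself may build 
the tag quite differently, and the result type of an operator is 
simplified here to the type of its left operand.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression-tree node, for illustration only.
struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Node {
  char op;    // operator symbol such as '*', '+', '/'; '\0' marks a leaf
  char type;  // type letter of the value this node yields, e.g. 'S'
  NodePtr lhs, rhs;
};

// A leaf stands for a view or a scalar of the given type letter.
NodePtr leaf(char type) {
  return std::make_shared<Node>(Node{'\0', type, nullptr, nullptr});
}

// Assumption: the result takes the type of the left operand.
NodePtr binary(char op, NodePtr l, NodePtr r) {
  const char result_type = l->type;
  return std::make_shared<Node>(
      Node{op, result_type, std::move(l), std::move(r)});
}

// Post-order traversal: sub-expressions first, then this operator
// followed by the type letters of its two operands.
std::string expr_tag(const NodePtr& n) {
  if (n->op == '\0') return std::string();  // leaves add nothing themselves
  std::string tag = expr_tag(n->lhs) + expr_tag(n->rhs);
  tag += n->op;
  tag += n->lhs->type;
  tag += n->rhs->type;
  return tag;
}
```

Building v1 * T(4) + v2 / v3 as binary('+', binary('*', ...), 
binary('/', ...)) with all leaves of type 'S' reproduces the 
"*SS/SS+SS" string, and regrouping it as (v1 * T(4) + v2) / v3 
reproduces "*SS+SS/SS".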


Each dense expression appears twice in the profiler output: once when 
it is converted from a 2-D or 3-D view, and once when it is evaluated as 
a 1-D expression.  Both entries refer to the same expression.  For 
example:

    Expr[Dense] 3D *SS 64x64x64 : 67455693 : 1 : 262144 : 13.9567
    Expr[LF] 1D *SS 262144 : 66991743 : 1 : 262144 : 14.0533

Note that the dense evaluation includes the time it takes to perform the 
loop fusion evaluation, hence the slightly longer time reported 
there.  However, the time difference is probably dominated by the 
time it takes to generate the tag itself.  Note also that the sizes 
are reported differently but are equivalent, since 64x64x64 = 262144.
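
When comparing a Dense entry with its matching LF entry, the two size 
fields can be reduced to a common element count.  The helper below is a 
small sketch for that purpose.  element_count is a hypothetical name and 
is not part of the patch or the profiler.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Illustrative helper: multiplies out a size field such as "64x64x64".
// A plain 1-D size such as "262144" is returned unchanged.
std::size_t element_count(const std::string& size_field) {
  std::istringstream in(size_field);
  std::size_t total = 1;
  std::size_t extent = 0;
  char separator = '\0';
  while (in >> extent) {
    total *= extent;
    in >> separator;  // consume the 'x' between extents, if present
  }
  return total;
}
```

Here element_count("64x64x64") and element_count("262144") both give 
262144, matching the two entries shown above.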

Finally, please note that operation counts are not yet implemented for 
every operator.  Missing ones should probably be counted in some 
fashion.  Currently, an operator that is not handled adds zero ops to 
the total count.
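
One way to picture the default-to-zero behaviour is a traits template 
whose primary definition reports zero and which is specialized for the 
operators that are handled.  This is only a sketch under assumptions: 
the operator tag types and the names ops_per_element and op_count are 
hypothetical, and the cost of one op per element is inferred from the 
single-precision examples above (three operators on 2048 elements 
reported as 6144 ops).

```cpp
#include <cstddef>

// Hypothetical operator tag types, for illustration only.
struct mul_op {};
struct div_op {};
struct add_op {};
struct unhandled_op {};  // stands for any operator without a count yet

// Primary template: an operator that is not handled contributes zero ops.
template <typename Op> struct ops_per_element {
  static constexpr std::size_t value() { return 0; }
};

// Assumed cost of one op per element for real-valued operands.
template <> struct ops_per_element<mul_op> {
  static constexpr std::size_t value() { return 1; }
};
template <> struct ops_per_element<div_op> {
  static constexpr std::size_t value() { return 1; }
};
template <> struct ops_per_element<add_op> {
  static constexpr std::size_t value() { return 1; }
};

// Ops contributed by one operator applied across `elements` elements.
template <typename Op>
constexpr std::size_t op_count(std::size_t elements) {
  return ops_per_element<Op>::value() * elements;
}
```

Summing op_count for mul_op, div_op and add_op over 2048 elements gives 
6144, while op_count<unhandled_op>(2048) adds nothing, so a missing 
count lowers the reported total without raising an error.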

Regards,

-- 
Don McCoy
don (at) CodeSourcery 
(888) 776-0262 / (650) 331-3385, x712



-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: se1.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060809/207e9608/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: se1.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060809/207e9608/attachment-0001.ksh>

