[patch] vmmul, IPP scalar-view, dense<2> opt.

Wed Dec 21 21:17:37 UTC 2005

This patch contains several performance enhancements:

  - Optimization of dense 2-D and 3-D get and put.  Previously, the
    implementation tried to be clever and abstract the 2-D and 3-D
    varieties into a get(Point) and put(Point), which were later
    "de-abstracted" by Layout, which converted the point into a
    location in memory to access.  Our compilers have trouble
    collapsing this.  Changing the implementation to pass the indices
    directly to Layout improves performance.

  - Dispatch matrix expressions through Serial_dispatch_evaluator.

    This, along with an additional transpose tag in LibraryTagList,
    lets us plug fast transpose algorithms into the dispatch
    framework.

  - Add evaluator to decompose vector-matrix multiply into individual
    vector-vector or scalar-vector element-wise operations (depending on
    whether the multiply is by-row or by-column and what the output
    dimension-ordering is), which are themselves dispatched.  For IPP,
    this results in better performance than plain loop fusion.  For
    non-IPP, this results in better performance than loop fusion if the
    decomposition is to scalar-vector (because this reduces the memory
    bandwidth of the operation), or the same performance if the
    decomposition is to vector-vector.

  - Add IPP dispatch for scalar-vector add, subtract, multiply, and
    divide operations.  New scalar-view test for additional coverage.
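
To illustrate the dense get/put item above, here is a minimal sketch
of the two access paths for the 2-D case.  Point2, RowMajorLayout2,
Dense2 and get_via_point are hypothetical stand-ins invented for this
example, not the classes touched by the patch; the row-major layout
is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the library's classes.
struct Point2 { std::size_t idx[2]; };

struct RowMajorLayout2 {
  std::size_t rows, cols;

  // Old path: the indices arrive packed in a Point and are unpacked here.
  std::size_t index(Point2 const& p) const {
    return index(p.idx[0], p.idx[1]);
  }

  // New path: the indices are passed straight through.
  std::size_t index(std::size_t r, std::size_t c) const {
    assert(r < rows && c < cols);
    return r * cols + c;
  }
};

class Dense2 {
public:
  Dense2(std::size_t rows, std::size_t cols)
    : layout_{rows, cols}, data_(rows * cols, 0.0f) {}

  // Old path: build a temporary Point that Layout must "de-abstract".
  float get_via_point(std::size_t r, std::size_t c) const {
    Point2 p = {{r, c}};
    return data_[layout_.index(p)];
  }

  // New path: no intermediate object for the compiler to collapse.
  float get(std::size_t r, std::size_t c) const {
    return data_[layout_.index(r, c)];
  }

  void put(std::size_t r, std::size_t c, float v) {
    data_[layout_.index(r, c)] = v;
  }

private:
  RowMajorLayout2 layout_;
  std::vector<float> data_;
};
```

Both paths return the same element; the difference is only in how
much the optimizer has to see through.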

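The vector-matrix multiply decomposition described above can be
sketched roughly as follows, for a row-major output.  All names here
(vv_mul, sv_mul, Along, vmmul_decomposed) are hypothetical, and the
sketch assumes that "by-row" means r(i,j) = v(j) * m(i,j) and
"by-column" means r(i,j) = v(i) * m(i,j).  In the patch the per-row
operations would themselves be dispatched (to IPP, for example); here
they are plain loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element-wise kernels standing in for dispatched operations.
void vv_mul(const float* a, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] * b[i];
}

void sv_mul(float s, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = s * b[i];
}

enum class Along { Row, Col };  // hypothetical direction tag

// m and r are row-major rows x cols matrices stored contiguously.
void vmmul_decomposed(Along dir,
                      const std::vector<float>& v,
                      const std::vector<float>& m,
                      std::vector<float>& r,
                      std::size_t rows, std::size_t cols) {
  assert(m.size() == rows * cols && r.size() == rows * cols);
  assert(v.size() == (dir == Along::Row ? cols : rows));

  for (std::size_t i = 0; i < rows; ++i) {
    const float* in = m.data() + i * cols;
    float* out = r.data() + i * cols;
    if (dir == Along::Row) {
      // Each output row is a vector-vector multiply: v * m(i, :).
      vv_mul(v.data(), in, out, cols);
    } else {
      // Each output row is a scalar-vector multiply: v(i) * m(i, :).
      // Only one input stream is read, hence less memory traffic.
      sv_mul(v[i], in, out, cols);
    }
  }
}
```

With a column-major output the roles would swap: by-column would
decompose to vector-vector and by-row to scalar-vector.
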
Ok to commit?

				-- Jules

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: misc.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20051221/8abd50a1/attachment.ksh>