[vsipl++] [patch] SSAR Interpolation
Jules Bergmann
jules at codesourcery.com
Wed Nov 22 13:35:37 UTC 2006
Don McCoy wrote:
> This patch changes the processing order of the interpolation loop to
> work on columns first and rows second. This entailed switching all
> views in that loop to use column-major storage and adding an explicit
> transpose to get it in the right format for processing. Just after this
> loop, the order of the FFTs is reversed to take advantage of the new
> ordering -- keeping the net processing time for them the same.
>
> The change results in a 2x speedup for the interpolation loop! That
> translates to a 25% increase overall, at the cost of an additional
> transpose.
Sweet! 2x is good :)
The white-paper fodder here is that it is easy to experiment with
different dimension-orderings primarily by changing the matrix decls.
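To illustrate the point about dimension-orderings, here is a minimal sketch in plain C++. It is not the VSIPL++ API: SimpleMatrix, RowMajor, ColMajor and column_sums are hypothetical names made up for this example. It assumes the storage order is a template parameter of the matrix type, so switching layouts means editing only the declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout policies (illustration only, not VSIPL++ types).
// Each maps a (row, col) pair to an offset in a flat buffer.
struct RowMajor {
  static std::size_t index(std::size_t r, std::size_t c,
                           std::size_t /*rows*/, std::size_t cols) {
    return r * cols + c;  // elements of one row are contiguous
  }
};

struct ColMajor {
  static std::size_t index(std::size_t r, std::size_t c,
                           std::size_t rows, std::size_t /*cols*/) {
    return c * rows + r;  // elements of one column are contiguous
  }
};

// Hypothetical matrix whose storage order is chosen in its declaration.
template <typename T, typename Order>
class SimpleMatrix {
public:
  SimpleMatrix(std::size_t rows, std::size_t cols)
    : rows_(rows), cols_(cols), data_(rows * cols) {}

  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }

  // Flat offset of element (r, c) under the chosen storage order.
  std::size_t offset(std::size_t r, std::size_t c) const {
    assert(r < rows_ && c < cols_);
    return Order::index(r, c, rows_, cols_);
  }

  T& operator()(std::size_t r, std::size_t c) { return data_[offset(r, c)]; }
  T const& operator()(std::size_t r, std::size_t c) const {
    return data_[offset(r, c)];
  }

private:
  std::size_t rows_;
  std::size_t cols_;
  std::vector<T> data_;
};

// Column-first loop. The code is identical for both layouts; only the
// memory access pattern (strided vs. contiguous) differs.
template <typename T, typename Order>
std::vector<T> column_sums(SimpleMatrix<T, Order> const& m) {
  std::vector<T> sums(m.cols(), T());
  for (std::size_t c = 0; c < m.cols(); ++c)
    for (std::size_t r = 0; r < m.rows(); ++r)
      sums[c] += m(r, c);
  return sums;
}
```

With this arrangement, changing SimpleMatrix<float, RowMajor> to SimpleMatrix<float, ColMajor> at the declaration leaves column_sums untouched and makes its inner loop walk contiguous memory. Whether that is faster in practice still has to be measured.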
Also, thinking out loud, in parallel the transposes will be more
expensive, which might alter this tradeoff of an extra transpose.
We'll cross that bridge when we get there.
One minor comment below; please check this in.
thanks,
-- Jules
> + Tensor<T, Dense<3, T, col2_type> > SINC_HAM_;
[x] col2_type happens to work, but this is undefined behavior.
I.e. 'col2_type = tuple<1, 0, undefined>', where 'undefined' happens
to be 2.
Please use an explicit 'tuple<1, 0, 2>' instead.
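A rough sketch of why spelling out all three entries matters, again in plain C++ rather than VSIPL++ itself. DimOrder and strides are hypothetical names, and the sketch assumes the entries are listed from slowest-varying to fastest-varying dimension. The static_asserts reject any ordering that is not a full permutation of 0, 1, 2, so nothing is left to chance.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 3-D dimension-order tuple (not the
// VSIPL++ tuple). Assumption: D0 is the slowest-varying dimension and
// D2 the fastest-varying (contiguous) one.
template <std::size_t D0, std::size_t D1, std::size_t D2>
struct DimOrder {
  static_assert(D0 < 3 && D1 < 3 && D2 < 3,
                "each entry must name dimension 0, 1 or 2");
  static_assert(D0 != D1 && D0 != D2 && D1 != D2,
                "entries must form a permutation of 0, 1, 2");

  static std::array<std::size_t, 3> dims() {
    return std::array<std::size_t, 3>{{D0, D1, D2}};
  }
};

// Element strides of a dense 3-D block with the given extents,
// indexed by dimension number.
template <typename Order>
std::array<std::size_t, 3> strides(std::array<std::size_t, 3> const& extent) {
  assert(extent[0] > 0 && extent[1] > 0 && extent[2] > 0);
  std::array<std::size_t, 3> const order = Order::dims();
  std::array<std::size_t, 3> s{};
  s[order[2]] = 1;                                     // fastest-varying
  s[order[1]] = extent[order[2]];                      // middle
  s[order[0]] = extent[order[1]] * extent[order[2]];   // slowest-varying
  return s;
}
```

For extents 2 x 3 x 4, DimOrder<1, 0, 2> gives strides of 4, 8 and 1 for dimensions 0, 1 and 2, while DimOrder<0, 1, 2> gives 12, 4 and 1. An ordering whose third entry is left unspecified has no well-defined strides for a 3-D block.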
--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705