[vsipl++] More SSAR optimizations
Jules Bergmann
jules at codesourcery.com
Mon Nov 6 13:19:06 UTC 2006
Don,

This looks good, please check it in.

thanks,
-- Jules

Don McCoy wrote:
> The attached patch splits the Kernel 1 processing class into two parts.
> The new base class is responsible for most of the setup that is applicable
> to images with the same geometry. Its constructor also computes the
> dimensions of the final output image. The benefit to the derived class
> is that it can now pre-allocate the remaining memory needed for
> processing, including the creation of the Fftm objects, which includes a
> potentially lengthy planning process.
>
> Also of note, this "pre-processing" phase allows two equations to be
> reduced (at run-time that is) to simple multiplications, which can then
> be vectorized by the SIMD unit. See equations 62 and 68. The setup for
> these equations is expensive in part because they involve two
> vector-matrix multiplies (one along the rows and one along the columns)
> which results in a hard-to-optimize memory access pattern. As this
> portion is now done outside the computational loop, the cost is less of
> an issue. It should be possible to use the resulting matrices (that I'm
> currently calling 'filters') on any incoming radar data.
>
> An explicit loop at eq. 65 was also removed.
>
> The good news: These simple changes realized a 1.5x performance
> improvement over the current (SVN head) version!
>
Woo-hoo!
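
For readers without the patch at hand, the idea described above can be sketched roughly as follows. This is not the code from the patch and does not use the VSIPL++ API: the class names (GeometrySetup, ImageProcessor), the flat std::vector storage, and the phase-ramp "filter" are all hypothetical placeholders, and the real filter terms come from equations 62 and 68. The sketch only shows the shape of the split: geometry-dependent work is done once in a base-class constructor, the derived class pre-allocates its workspace, and the per-image step reduces to an element-wise multiply.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<float>;

// Hypothetical base class: everything that depends only on image geometry.
class GeometrySetup {
public:
  GeometrySetup(std::size_t rows, std::size_t cols)
    : rows_(rows), cols_(cols), filter_(rows * cols)
  {
    // Expensive setup, done once per geometry. The phase ramp below is a
    // stand-in (an assumption) for the real filter computation.
    for (std::size_t r = 0; r < rows_; ++r)
      for (std::size_t c = 0; c < cols_; ++c)
        filter_[r * cols_ + c] = std::polar(1.0f, 0.1f * r + 0.2f * c);
  }

  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }

protected:
  std::size_t rows_;
  std::size_t cols_;
  std::vector<cplx> filter_;  // precomputed once, reused for every image
};

// Hypothetical derived class: per-image processing with pre-allocated storage.
class ImageProcessor : public GeometrySetup {
public:
  ImageProcessor(std::size_t rows, std::size_t cols)
    : GeometrySetup(rows, cols), work_(rows * cols)  // allocate up front
  {}

  // Run-time step: a plain element-wise multiply over contiguous memory,
  // which is the kind of loop a compiler or SIMD library can vectorize.
  const std::vector<cplx>& process(const std::vector<cplx>& in)
  {
    assert(in.size() == work_.size());
    for (std::size_t i = 0; i < work_.size(); ++i)
      work_[i] = in[i] * filter_[i];
    return work_;
  }

private:
  std::vector<cplx> work_;
};
```

In use, one ImageProcessor would be constructed per geometry and process() called once per incoming image, so the setup cost is paid outside the computational loop.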
--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705