More SSAR optimizations
Don McCoy
don at codesourcery.com
Sun Nov 5 22:32:23 UTC 2006
The attached patch splits the Kernel 1 processing class into two parts.
The new base class is responsible for most of the setup that is common
to images with the same geometry. Its constructor also computes the
dimensions of the final output image. The benefit to the derived class
is that it can now pre-allocate the remaining memory needed for
processing and create the Fftm objects up front, which involves a
potentially lengthy planning process.
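
For readers without the diff at hand, the split described above might be
sketched roughly as follows. This is only an illustration in plain C++:
Geometry, KernelBase, Kernel, col_pad and the output-size rule are all
hypothetical names and assumptions, not the classes or formulas from the
patch, and a std::vector stands in for the VSIPL++ views and Fftm objects.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical description of the input geometry (not from the patch).
struct Geometry {
    std::size_t rows_in;
    std::size_t cols_in;
    std::size_t col_pad;  // assumed: extra columns in the output image
};

// Base class: geometry-dependent setup, including the output dimensions.
class KernelBase {
public:
    explicit KernelBase(const Geometry& g)
        : geom_(g),
          rows_out_(g.rows_in),
          cols_out_(g.cols_in + g.col_pad) {}  // placeholder size rule

    std::size_t rows_out() const { return rows_out_; }
    std::size_t cols_out() const { return cols_out_; }

protected:
    Geometry geom_;
    std::size_t rows_out_;
    std::size_t cols_out_;
};

// Derived class: the base is fully constructed before these members are
// initialized, so the output size is known and buffers can be allocated
// once, ahead of the processing loop.
class Kernel : public KernelBase {
public:
    explicit Kernel(const Geometry& g)
        : KernelBase(g), workspace_(rows_out() * cols_out()) {}

    std::size_t workspace_size() const { return workspace_.size(); }

private:
    std::vector<std::complex<float>> workspace_;  // stand-in for real buffers
};
```

The point of the ordering is that anything expensive to create (buffers,
FFT plans) is built once per geometry rather than once per image.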
Also of note, this "pre-processing" phase allows two equations to be
reduced (at run-time that is) to simple multiplications, which can then
be vectorized by the SIMD unit. See equations 62 and 68. The setup for
these equations is expensive in part because it involves two
vector-matrix multiplies (one along the rows and one along the columns),
which result in a hard-to-optimize memory access pattern. As this
portion is now done outside the computational loop, the cost is less of
an issue. It should be possible to use the resulting matrices (that I'm
correctly calling 'filters') on any incoming radar data.
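
The idea of moving the expensive setup out of the loop can be sketched
as below. This is a minimal illustration under stated assumptions, not
the code in the patch: make_filter and apply_filter are hypothetical
names, and the phase-sum formula is a placeholder rather than equation
62 or 68.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Setup phase, run once per geometry: build a row-major rows x cols
// filter. The formula here is a placeholder, not eq. 62 or 68.
std::vector<cfloat> make_filter(const std::vector<float>& row_phase,
                                const std::vector<float>& col_phase)
{
    const std::size_t rows = row_phase.size();
    const std::size_t cols = col_phase.size();
    std::vector<cfloat> filter(rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            filter[r * cols + c] =
                std::polar(1.0f, row_phase[r] + col_phase[c]);
    return filter;
}

// Per-image phase: one unit-stride, in-place elementwise multiply, which
// is the kind of loop a compiler or SIMD library can vectorize.
// Assumes image and filter hold the same number of elements.
void apply_filter(std::vector<cfloat>& image,
                  const std::vector<cfloat>& filter)
{
    for (std::size_t i = 0; i < image.size(); ++i)
        image[i] *= filter[i];
}
```

In this arrangement make_filter would be called once when the kernel
object is built, and apply_filter once per incoming image.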
An explicit loop at eq. 65 was also removed.
The good news: these simple changes yielded a 1.5x performance
improvement over the current (SVN head) version!
Regards,
--
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: k1_base.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20061105/29e08669/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: k1_base.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20061105/29e08669/attachment-0001.ksh>