[patch] Fast convolution for Cell BE

Don McCoy don at codesourcery.com
Sat Feb 24 21:40:34 UTC 2007


The attached patch adds a fast convolution object to the 'impl' 
namespace.  It operates only on vectors of 32 to 2048 points, or on 
matrices whose rows have lengths in that same range.  In either case, 
the views must also be dense.

The application 'examples/fconv.cpp' demonstrates how to use it and 
validates the results by computing the reference result using the three 
component operations: forward FFT, element-wise vector multiply and 
inverse FFT (using the existing SPE kernels for these tasks).
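For readers without the patch at hand, the three-step reference 
computation described above can be sketched in plain C++.  This is only 
an illustration under stated assumptions: it does not use the VSIPL++ 
API or the SPE kernels, the names fft, fast_convolve and 
direct_circular_convolve are hypothetical, the length is assumed to be 
a power of two, and the result is a circular convolution.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cvec = std::vector<std::complex<double>>;

// Recursive radix-2 FFT, in place.  Assumes a.size() is a power of two.
// sign = -1 gives the forward transform, sign = +1 the unscaled inverse.
void fft(cvec& a, int sign)
{
  const std::size_t n = a.size();
  if (n <= 1) return;

  // Split into even- and odd-indexed halves and transform each.
  cvec even(n / 2), odd(n / 2);
  for (std::size_t i = 0; i < n / 2; ++i)
  {
    even[i] = a[2 * i];
    odd[i]  = a[2 * i + 1];
  }
  fft(even, sign);
  fft(odd, sign);

  // Combine the halves with the twiddle factors.
  const double pi = std::acos(-1.0);
  for (std::size_t k = 0; k < n / 2; ++k)
  {
    const std::complex<double> w =
      std::polar(1.0, sign * 2.0 * pi * static_cast<double>(k) /
                        static_cast<double>(n));
    a[k]         = even[k] + w * odd[k];
    a[k + n / 2] = even[k] - w * odd[k];
  }
}

// Fast (circular) convolution: forward FFT, element-wise multiply,
// inverse FFT.  Assumes x and h have the same power-of-two length.
cvec fast_convolve(cvec x, cvec h)
{
  const std::size_t n = x.size();
  fft(x, -1);                                    // step 1: forward FFT
  fft(h, -1);
  for (std::size_t i = 0; i < n; ++i)            // step 2: multiply
    x[i] *= h[i];
  fft(x, +1);                                    // step 3: inverse FFT
  for (auto& v : x)                              // scale the inverse by 1/n
    v /= static_cast<double>(n);
  return x;
}

// Direct O(n^2) circular convolution, used only to check the result.
cvec direct_circular_convolve(cvec const& x, cvec const& h)
{
  const std::size_t n = x.size();
  cvec y(n);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      y[i] += x[j] * h[(i + n - j) % n];
  return y;
}
```

Calling fast_convolve(x, h) on two 32-point vectors and comparing each 
element against direct_circular_convolve(x, h) is one way to validate 
the result.  It mirrors the idea of the validation in 
'examples/fconv.cpp', but not its code.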

This is an example of a "fused" kernel -- i.e. one that avoids 
unnecessary I/O overhead by having the SPEs perform all three 
operations at once, rather than one at a time, thereby gaining a 
performance advantage.

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fconv2.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070224/661c7bee/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fconv2.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070224/661c7bee/attachment-0001.ksh>

