[patch] FIR Filter bank benchmark

Fri Mar 31 02:28:30 UTC 2006

The attached patch adds one of the MIT Lincoln Labs' PCA Kernel-Level 
Benchmarks to VSIPL++ -- the FIR Filter Bank.  It also has a minor 
re-organization of some support functions, moving them from the tests/ 
directory to the src/vsip_csl/ directory.  Actually, copies have been 
made as I didn't think it would be good to delete the ones in tests/ 
until all other references to them have been cleaned up.

This benchmark defines two sets of parameters for performing a series of 
convolutions on the input data.  In each case, M input vectors of length 
N are convolved with filters of length K.  The two sets of parameters 
are given as follows:

	Set	1	2
	M	64	20
	N	4096	1024
	K	128	12

The benchmark framework defined for VSIPL++ sweeps N over a range of 
values, so the point of interest for each set may be extracted according 
to the table above.

Refer to the end of benchmarks/firbank.cpp to see the options used to 
select various tests.  Note: the last digit of the option value is 
always 1 or 2, corresponding to the data set chosen.

In order to use external data files with the benchmark, they must be 
located in benchmarks/data/set1 and benchmarks/data/set2.  The filenames 
must be as follows: inputs_X.matrix, filter.matrix and outputs_X.matrix, 
where X denotes the size as a power of two [log2(N)].  The default 
starting and ending values for N are 7 and 16, so files corresponding to 
those vector sizes must be provided.

Validation is performed with external data.  For full convolution, all 
values are checked.  The FFT-based algorithm is circular rather than 
linear though, so values near the beginning and end are not checked. 
The number of values that are checked is N - 2 * (K - 2).

Lastly, I had some difficulty getting the right answers to come out due 
to the fact that the convolutions are done repeatedly on the same vector 
in order to take a more accurate measurement.  With the Fir class, the 
state_save/state_no_save template parameter *must* be set to 'no_save', 
or the results are retained between successive convolutions, thereby 
corrupting the results.  Not what is desired in this case!

Similarly with fast convolution, a temporary is used.  I.e.:

     for (index_type l=0; l<loop; ++l)
     {
       // Perform FIR convolutions
       for ( length_type i = 0; i < local_M; ++i )
       {
         Vector<T> tmp(N, T());
         fwd_fft(l_inputs.row(i), tmp);
         tmp *= response.row(0);    // assume fft already done on response
         inv_fft(tmp, test.row(i));
       }
     }

Moving the declaration and initialization of 'tmp' outside the loop has 
the same effect as with 'state_save' because the contents of tmp are not 
zeroed between rows.  With it inside the loop (as it should be), 
performance does not appear to be affected noticeably, though it should 
have a slight impact.

Comments and feedback appreciated.

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060330/3d621009/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060330/3d621009/attachment-0001.ksh>