[vsipl++] [patch] FIR Filter bank benchmark

Fri Mar 31 19:59:03 UTC 2006

Don McCoy wrote:
> The attached patch adds one of the MIT Lincoln Labs' PCA Kernel-Level 
> Benchmarks to VSIPL++ -- the FIR Filter Bank.  It also has a minor 
> re-organization of some support functions, moving them from the tests/ 
> directory to the src/vsip_csl/ directory.  Actually, copies have been 
> made as I didn't think it would be good to delete the ones in tests/ 
> until all other references to them have been cleaned up.
> 
> This benchmark defines two sets of parameters for performing a series of 
> convolutions on the input data.  In each case, M input vectors of length 
> N are convolved with filters of length K.  The two sets of parameters 
> are given as follows:
> 
>     Set      1      2
>     M       64     20
>     N     4096   1024
>     K      128     12
> 
> The benchmark framework defined for VSIPL++ sweeps N over a range of 
> values, so the point of interest for each set may be extracted according 
> to the table above.
> 
> Refer to the end of benchmarks/firbank.cpp to see the options used to 
> select various tests.  Note: the last digit of the option value is 
> always 1 or 2, corresponding to the data set chosen.
> 
> In order to use external data files with the benchmark, they must be 
> located in benchmarks/data/set1 and benchmarks/data/set2.  The filenames 
> must be as follows: inputs_X.matrix, filter.matrix and outputs_X.matrix, 
> where X denotes the size as a power of two [log2(N)].  The default 
> starting and ending values for N are 7 and 16, so files corresponding to 
> those vector sizes must be provided.

> 
> Validation is performed with external data.  For full convolution, all 
> values are checked.  The FFT-based algorithm is circular rather than 
> linear though, so values near the beginning and end are not checked. The 
> number of values that are checked is N - 2 * (K - 2).
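
To make the circular-vs-linear point concrete, here is a minimal sketch in plain C++ (not VSIPL++; linear_conv, circular_conv and first_agreeing_index are names made up for the illustration).  It assumes K <= N and a causal alignment, in which case the wrapped samples land in the first K-1 outputs.  The benchmark's reference data may be aligned differently, which would be consistent with skipping values at both ends.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear convolution, first N outputs only.  Samples before x[0] are
// treated as zero (causal alignment; an assumption of this sketch).
std::vector<long> linear_conv(std::vector<long> const& x,
                              std::vector<long> const& h)
{
  std::vector<long> y(x.size(), 0);
  for (std::size_t n = 0; n < x.size(); ++n)
    for (std::size_t k = 0; k < h.size() && k <= n; ++k)
      y[n] += h[k] * x[n - k];
  return y;
}

// Circular convolution: input indices wrap modulo N.  This is what a
// pointwise multiply between N-point FFTs computes.  Assumes K <= N.
std::vector<long> circular_conv(std::vector<long> const& x,
                                std::vector<long> const& h)
{
  std::size_t const N = x.size();
  std::vector<long> y(N, 0);
  for (std::size_t n = 0; n < N; ++n)
    for (std::size_t k = 0; k < h.size(); ++k)
      y[n] += h[k] * x[(n + N - k % N) % N];
  return y;
}

// Smallest index i such that a[j] == b[j] for every j >= i.
std::size_t first_agreeing_index(std::vector<long> const& a,
                                 std::vector<long> const& b)
{
  std::size_t i = a.size();
  while (i > 0 && a[i - 1] == b[i - 1])
    --i;
  return i;
}
```

With an all-positive input and kernel, the two results differ in exactly the first K-1 outputs and agree from index K-1 onward, so only that tail can be compared against a linear reference.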
> 
> 
> Lastly, I had some difficulty getting the right answers to come out due 
> to the fact that the convolutions are done repeatedly on the same vector 
> in order to take a more accurate measurement.  With the Fir class, the 
> state_save/state_no_save template parameter *must* be set to 'no_save', 
> or the results are retained between successive convolutions, thereby 
> corrupting the results.  Not what is desired in this case!

Actually, using state_no_save isn't all that bad.  In radar systems in 
particular, data is usually not collected continuously.  Pulses are 
transmitted at regular intervals, and the received signal is collected 
between pulses.  The received data is not continuous for two reasons: 
most systems cannot transmit and receive simultaneously (radar returns 
fall off with the 4th power of distance, so picking up the transmitted 
signal directly would blow out the receive amplifiers), and each new 
pulse "resets" the distance corresponding to the received data.

A system might look something like:

transmit:   *          *           *
receive:        ......    .......      .......

                      ^    ^
                      |    +- the beginning of this pulse is near
                      |
                      +- the end of this pulse is far

In a cheapo system, each pulse might have the same waveform (which would 
reduce the FIR bank to a single set of coefficients).  However, systems 
often use "waveform diversity", where each pulse is slightly different.  
This makes the system harder to jam and may increase its sensitivity.  
This diversity would require multiple sets of filter kernels.
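
To illustrate why independent pulses want no saved state, here is a minimal sketch in plain C++.  SimpleFir is a hypothetical stand-in written for this mail, not the VSIPL++ Fir class, and its save_state flag only mimics the idea behind state_save/state_no_save.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a FIR filter (NOT the VSIPL++ Fir class).
// save_state == true : the last K-1 inputs carry over into the next call,
//                      as for one continuous stream split into blocks.
// save_state == false: every call starts from a zeroed delay line,
//                      as for independent pulses.
// The kernel must be non-empty.
class SimpleFir
{
public:
  SimpleFir(std::vector<long> const& kernel, bool save_state)
    : kernel_(kernel),
      history_(kernel.size() - 1, 0),
      save_state_(save_state)
  {}

  std::vector<long> operator()(std::vector<long> const& in)
  {
    std::size_t const K = kernel_.size();
    std::size_t const H = K - 1;

    // Prepend the delay line so that x[n-k] is always in range.
    std::vector<long> ext(history_);
    ext.insert(ext.end(), in.begin(), in.end());

    std::vector<long> out(in.size(), 0);
    for (std::size_t n = 0; n < in.size(); ++n)
      for (std::size_t k = 0; k < K; ++k)
        out[n] += kernel_[k] * ext[n + H - k];

    // Keep the trailing K-1 samples only when state is being saved.
    if (save_state_)
      history_.assign(ext.end() - H, ext.end());
    return out;
  }

private:
  std::vector<long> kernel_;
  std::vector<long> history_;
  bool save_state_;
};
```

Filtering the same pulse twice gives identical output without saved state.  With saved state, the tail of the first call leaks into the start of the second, which is the kind of corruption described for a benchmark loop that reuses one input.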

> 
> Similarly with fast convolution, a temporary is used.  I.e.:
> 
>     for (index_type l=0; l<loop; ++l)
>     {
>       // Perform FIR convolutions
>       for ( length_type i = 0; i < local_M; ++i )
>       {
>         Vector<T> tmp(N, T());
>         fwd_fft(l_inputs.row(i), tmp);
>         tmp *= response.row(0);    // assume fft already done on response
>         inv_fft(tmp, test.row(i));
>       }
>     }

It should be OK to move the declaration of tmp entirely outside the 
loop.  If fwd_fft's size is N, it will completely overwrite the values 
in 'tmp'.
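
The reasoning can be sketched in plain C++ with a naive DFT standing in for fwd_fft.  The names naive_dft and transform_rows are invented for this example, and it rests on the assumption that the transform assigns every output element rather than accumulating into it.  It says nothing about what the real Fft object does internally.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// Naive O(N^2) out-of-place DFT, a stand-in for an N-point forward FFT.
// Every out[k] is assigned, never accumulated into, so whatever 'out'
// held before the call cannot affect the result.
// 'out' must already have in.size() elements.
void naive_dft(std::vector<cplx> const& in, std::vector<cplx>& out)
{
  std::size_t const N = in.size();
  double const pi = std::acos(-1.0);
  for (std::size_t k = 0; k < N; ++k)
  {
    cplx sum(0.0, 0.0);
    for (std::size_t n = 0; n < N; ++n)
      sum += in[n] * std::polar(1.0, -2.0 * pi * double(k * n) / double(N));
    out[k] = sum;
  }
}

// Transform each row, either through one scratch buffer hoisted out of
// the loop or through a fresh zero-initialized buffer per row.
// All rows are assumed to have the same length.
std::vector<std::vector<cplx> >
transform_rows(std::vector<std::vector<cplx> > const& rows, bool hoist_tmp)
{
  std::vector<std::vector<cplx> > results;
  // Deliberately filled with junk to show it gets overwritten.
  std::vector<cplx> shared(rows.empty() ? 0 : rows[0].size(),
                           cplx(99.0, -99.0));
  for (std::size_t i = 0; i < rows.size(); ++i)
  {
    if (hoist_tmp)
    {
      naive_dft(rows[i], shared);
      results.push_back(shared);
    }
    else
    {
      std::vector<cplx> tmp(rows[i].size(), cplx());
      naive_dft(rows[i], tmp);
      results.push_back(tmp);
    }
  }
  return results;
}
```

Under that assumption both variants produce identical rows, so if hoisting tmp changes the benchmark's answers, the cause is probably something other than stale values in the buffer.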

> 
> Moving the declaration and initialization of 'tmp' outside the loop has 
> the same effect as with 'state_save' because the contents of tmp are not 
> zeroed between rows.  With it inside the loop (as it should be), 
> performance does not appear to be affected noticeably, though it should 
> have a slight impact.
> 
> Comments and feedback appreciated.
> 

Reviewing the patch now ...

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705