[vsipl++] [patch] FIR Filter bank benchmark

Sun Apr 2 21:01:29 UTC 2006

Jules Bergmann wrote:
> Don McCoy wrote:
> 
> This patch looks good.  The only real change I have is you should put 
> the output into a global matrix (see below).  Let me know if that makes 
> sense.  Once that is changed, please check it in.
> 
...
> 
> Instead of declaring 'test' to be a local matrix, can you instead 
> declare a global results matrix, use the local view of that matrix here, 
> and then check the local portion below?
> 

It seemed easiest to pass around yet another matrix and keep all the 
declarations in one place.  That satisfies the above, but I'm not sure 
that is what you meant.

The code makes a bit more sense now I think.  The calculation now uses 
'outputs' and they are compared against 'expected'.  It may be a bit 
confusing where it reads data in from file in that it refers to those 
files as outputs, but then puts the data into 'expected'.  It's not too 
bad as is I hope.

> Nice diffs (for the following files)!  You did this manually right?  I 
> didn't think CVS handled renaming of files.  Thanks!
> 
Yes, it made it easier to compare them.

 >> Similarly with fast convolution, a temporary is used.  I.e.:
 >>
 >>     for (index_type l=0; l<loop; ++l)
 >>     {
 >>       // Perform FIR convolutions
 >>       for ( length_type i = 0; i < local_M; ++i )
 >>       {
 >>         Vector<T> tmp(N, T());
 >>         fwd_fft(l_inputs.row(i), tmp);
 >>         tmp *= response.row(0);    // assume fft already done on 
response
 >>         inv_fft(tmp, test.row(i));
 >>       }
 >>     }
 >
 >
 > It should be OK to move the declaration of tmp entirely outside the
 > loop.  If fwd_fft's size is N, it will completely overwrite the values
 > in 'tmp'
 >

I agree.  Done.  Thanks.

 >>
 >> Moving the declaration and initialization of 'tmp' outside the loop
 >> has the same effect as with 'state_save' because the contents of tmp
 >> are not zeroed between rows.  With it inside the loop (as it should
 >> be), performance does not appear to be affected noticeably, though it
 >> should have a slight impact.

I believe I had myself confused over what turned out to be problems with 
comparing near-zero floating-point numbers.  Disregard the above please.

On that note, however, I found that I can construct data sets that will 
  fail the data comparison.  I.e. when the output data set contains 
zeroes and the fast convolution algorithm is used, the 'view_equal' 
check will fail.  The reason for this is the method we use for comparing 
values looks at the relative error of say a and b, as (a-b)/a.  This 
works well for most small values, but evaluates to 1 when b == 0 (the 
relative error check returns false for anything over 1e-4).

The example set that cause this error results from leaving the imaginary 
portion of the inputs set to zero and setting the filter coefficients to 
all ones.  Doing this results in outputs where the imaginary portion is 
exactly zero, but the fast conv algorithm will produce numbers on the 
order of 1e-6, which are only a few bits off of zero (for floats) but 
will fail the current comparison check.

In any case, this is a separate topic and I don't think it affects 
whether or not we check this in at this time.  Sound ok?

One other item is worth bringing up though.  There was an error in the 
previous patch for the not-from-file case where the output (now 
'expected') vectors were not seeded with the correct answers.  I believe 
I let this one in by failing to compile and test in debug mode.  Given 
that assert() does nothing in optimized builds, the data checks will not 
be run the way the benchmark will normally be configured.

Perhaps it would be nice to insert something that printed a nice warning 
if the call to view_equal() fails, then passes the result to assert() 
which will halt execution in the debug case (handy) but allow it to 
continue in the optimized case (ok since warning printed?).  Better 
warnings could be printed from within the view_equal() function, e.g. 
ones that show the location of the error and the values that failed 
comparison.

In addition, perhaps we always want to check the data.  It is done 
outside the timing loop, so doesn't affect results, but it does slow the 
overall execution time.  Suggestions?

Ok to check in with these changes?

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb2.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060402/37c3690c/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb2.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060402/37c3690c/attachment-0001.ksh>