[vsipl++] [patch] FIR Filter bank benchmark
Don McCoy
don at codesourcery.com
Sun Apr 2 21:01:29 UTC 2006
Jules Bergmann wrote:
> Don McCoy wrote:
>
> This patch looks good. The only real change I have is you should put
> the output into a global matrix (see below). Let me know if that makes
> sense. Once that is changed, please check it in.
>
...
>
> Instead of declaring 'test' to be a local matrix, can you instead
> declare a global results matrix, use the local view of that matrix here,
> and then check the local portion below?
>
It seemed easiest to pass around yet another matrix and keep all the
declarations in one place. That satisfies the above, but I'm not sure
that is what you meant.
The code makes a bit more sense now I think. The calculation now uses
'outputs' and they are compared against 'expected'. It may be a bit
confusing where it reads data in from file in that it refers to those
files as outputs, but then puts the data into 'expected'. It's not too
bad as is I hope.
> Nice diffs (for the following files)! You did this manually right? I
> didn't think CVS handled renaming of files. Thanks!
>
Yes, it made it easier to compare them.
>> Similarly with fast convolution, a temporary is used. I.e.:
>>
>> for (index_type l=0; l<loop; ++l)
>> {
>> // Perform FIR convolutions
>> for ( length_type i = 0; i < local_M; ++i )
>> {
>> Vector<T> tmp(N, T());
>> fwd_fft(l_inputs.row(i), tmp);
>> tmp *= response.row(0); // assume fft already done on
response
>> inv_fft(tmp, test.row(i));
>> }
>> }
>
>
> It should be OK to move the declaration of tmp entirely outside the
> loop. If fwd_fft's size is N, it will completely overwrite the values
> in 'tmp'
>
I agree. Done. Thanks.
>>
>> Moving the declaration and initialization of 'tmp' outside the loop
>> has the same effect as with 'state_save' because the contents of tmp
>> are not zeroed between rows. With it inside the loop (as it should
>> be), performance does not appear to be affected noticeably, though it
>> should have a slight impact.
I believe I had myself confused over what turned out to be problems with
comparing near-zero floating-point numbers. Disregard the above please.
On that note, however, I found that I can construct data sets that will
fail the data comparison. I.e. when the output data set contains
zeroes and the fast convolution algorithm is used, the 'view_equal'
check will fail. The reason for this is the method we use for comparing
values looks at the relative error of say a and b, as (a-b)/a. This
works well for most small values, but evaluates to 1 when b == 0 (the
relative error check returns false for anything over 1e-4).
The example set that cause this error results from leaving the imaginary
portion of the inputs set to zero and setting the filter coefficients to
all ones. Doing this results in outputs where the imaginary portion is
exactly zero, but the fast conv algorithm will produce numbers on the
order of 1e-6, which are only a few bits off of zero (for floats) but
will fail the current comparison check.
In any case, this is a separate topic and I don't think it affects
whether or not we check this in at this time. Sound ok?
One other item is worth bringing up though. There was an error in the
previous patch for the not-from-file case where the output (now
'expected') vectors were not seeded with the correct answers. I believe
I let this one in by failing to compile and test in debug mode. Given
that assert() does nothing in optimized builds, the data checks will not
be run the way the benchmark will normally be configured.
Perhaps it would be nice to insert something that printed a nice warning
if the call to view_equal() fails, then passes the result to assert()
which will halt execution in the debug case (handy) but allow it to
continue in the optimized case (ok since warning printed?). Better
warnings could be printed from within the view_equal() function, e.g.
ones that show the location of the error and the values that failed
comparison.
In addition, perhaps we always want to check the data. It is done
outside the timing loop, so doesn't affect results, but it does slow the
overall execution time. Suggestions?
Ok to check in with these changes?
--
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb2.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060402/37c3690c/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb2.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20060402/37c3690c/attachment-0001.ksh>
More information about the vsipl++
mailing list