[patch] Misc fixes
Jules Bergmann
jules at codesourcery.com
Sun Mar 18 03:51:48 UTC 2007
This patch:
- Fixes the DFT FFT backend to force the input and output layouts to
have the same complex format. Previously attempting to use the
backend when the input and output had different formats resulted in
an assertion failure in the workspace. This was causing the
regressions/fft_expr_arg test to fail. A new test
regressions/fft_split_inter was added for more direct coverage.
- Adds a 'name()' member to the FFT backends. This is useful for
debugging (determining which backend is being used). It may also
be useful for diagnostics and profiling in the future.
- Changes the DFT backend to use double-precision internally for
accumulation. This fixes precision differences that were arising
between the DFT backend and the ref::dft routine. This was causing
parallel/fftm to fail. IIRC it was also causing fft_be to fail when
testing the DFT backend.
- Checks DMA address alignment. Addresses must have 16-byte alignment
on the Cell. Previously the vmmul test failed because vmmul redispatch
generated vector multiplies that were unaligned (e.g. the second row
of a 5 x 7 matrix of floats).
- Updates SIMD traits for AltiVec (also tested on PPC 970FX with GCC
4.1 and PowerPC 7447A with GreenHills), and adds a unit-test for
SIMD traits that I've been meaning to check in for some time.
- Fixes the builtin SIMD vmul routine for split-complex to work
correctly when the output aliases one of the inputs. This was
causing coverage_binary to fail.
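
For illustration only (this is a sketch, not the code in fixes.diff): one
common way an aliasing bug of this kind arises in a split-complex multiply
is storing the real part of the result before the imaginary part has been
computed from the original inputs. The scalar helper below, split_vmul, is
a hypothetical name. It loads every operand before storing anything, so it
stays correct when the output arrays are the same as one of the inputs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not the VSIPL++ builtin): element-wise complex
// multiply Z = A * B on split-complex data, where real and imaginary
// parts live in separate arrays. Safe when z_re/z_im are the same
// arrays as a_re/a_im or b_re/b_im.
void split_vmul(const float* a_re, const float* a_im,
                const float* b_re, const float* b_im,
                float* z_re, float* z_im, std::size_t n)
{
  assert(n == 0 || (a_re && a_im && b_re && b_im && z_re && z_im));
  for (std::size_t i = 0; i < n; ++i)
  {
    // Load all four operands first. If z_re aliased a_re and we stored
    // z_re[i] before computing z_im[i], the imaginary part would be
    // computed from the already-overwritten real part.
    const float ar = a_re[i], ai = a_im[i];
    const float br = b_re[i], bi = b_im[i];
    z_re[i] = ar * br - ai * bi;
    z_im[i] = ar * bi + ai * br;
  }
}
```

Calling split_vmul(ar, ai, br, bi, ar, ai, n) multiplies in place. A
vectorized version would need the same load-before-store ordering within
each SIMD register's worth of elements.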
Curiously, ppu-g++ -m32 does not define __VEC__, while ppu32-g++
does.
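
To make the DMA alignment item in the list above concrete, here is a
minimal sketch of a 16-byte alignment test. The names is_aligned and
row_offset_bytes are hypothetical, not the ones used in the patch, and the
arithmetic assumes a 4-byte float. In a dense row-major 5 x 7 matrix of
floats, row 1 starts 7 * 4 = 28 bytes past the base, and 28 is not a
multiple of 16, so that row is unaligned even when the matrix itself is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if ptr is a multiple of `alignment` bytes.
// `alignment` must be a nonzero power of two.
inline bool is_aligned(const void* ptr, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Byte offset of the start of row `r` in a dense row-major matrix of
// floats with `cols` columns.
inline std::size_t row_offset_bytes(std::size_t r, std::size_t cols)
{
  return r * cols * sizeof(float);
}
```

With a 16-byte-aligned base, only rows whose byte offset is a multiple of
16 (rows 0 and 4 in the 5 x 7 case) pass the check. The others would need
a fallback path rather than a direct DMA.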
With this patch, all tests should pass on the Cell, with the following
exceptions:
- convolution fails with OpenMPI because "MPI_BOR reduction not define
for non-intrinsic type". Passes in serial build.
- parallel/fftm likewise.
=> These two appear to be an OpenMPI problem, not a Cell problem.
We can debug them later.
- Some of the fft_ext test cases fail, in particular real->complex.
=> I have not debugged this.
- correlation fails because of a precision error (error_db threshold).
- ref-impl/fft-coverage fails because of a precision error (test does
not use error_db, but if it did, it would fail for our usual
threshold)
=> It looks like the libfft FFT is noisy. This isn't worth diagnosing
too much since we'll eventually replace it with a faster FFT.
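
For readers unfamiliar with a dB error threshold, the following is a rough
sketch of one way to express the worst-case difference between two result
vectors in decibels relative to the largest magnitude present. It is an
assumption for illustration: error_db_sketch is a hypothetical name, the
-201 dB floor is arbitrary, and the formula is not claimed to match the
test suite's error_db exactly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical error metric: the largest per-element squared difference,
// normalized by twice the largest squared magnitude in either input,
// expressed in dB. Returns a floor of -201 dB when the inputs agree.
double error_db_sketch(const std::vector<std::complex<float> >& a,
                       const std::vector<std::complex<float> >& b)
{
  assert(a.size() == b.size());

  // Reference level: largest squared magnitude seen in either vector.
  double refmax = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    refmax = std::max(refmax, std::norm(std::complex<double>(a[i])));
    refmax = std::max(refmax, std::norm(std::complex<double>(b[i])));
  }

  double worst = -201.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    const std::complex<double> d =
      std::complex<double>(a[i]) - std::complex<double>(b[i]);
    const double err = std::norm(d);  // squared magnitude of the difference
    if (err < 1e-20) continue;        // treat as exact agreement
    worst = std::max(worst, 10.0 * std::log10(err / (2.0 * refmax)));
  }
  return worst;
}
```

A test would then compare the returned value against a chosen threshold.
A noisier FFT produces a value closer to 0 dB.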
Also, the regressions/transpose_assign test takes a long time to run.
Granted, it is doing a lot of transposes and I had optimization turned
off, but it runs much faster on EM64T.
Patch applied.
-- Jules
--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fixes.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070317/598379ae/attachment.ksh>