[patch] Misc fixes

Sun Mar 18 03:51:48 UTC 2007

This patch:

  - Fixes the DFT FFT backend to force the input and output layouts to
    have the same complex format.  Previously attempting to use the
    backend when the input and output had different formats resulted in
    an assertion failure in the workspace.  This was causing the
    regressions/fft_expr_arg test to fail.  A new test
    regressions/fft_split_inter was added for more direct coverage.

  - Adds a 'name()' member to the FFT backends.  This is useful for
    debugging (determining which backend is being used).  It may also
    be useful for diagnostics and profiling in the future.

  - Changes the DFT backend to use double-precision internally for
    accumulation.  This fixes precision differences that were arising
    between the DFT backend and the ref::dft routine.  This was causing
    parallel/fftm to fail.  IIRC it was also causing fft_be to fail when
    testing the DFT backend.

  - Checks DMA address alignment.  Addresses must have 16-byte alignment
    on the Cell.  This was causing the vmmul test to fail because vmmul
    redispatch generated vector multiplies that were unaligned (e.g. the
    second row of a 5 x 7 matrix of floats).

  - Updates SIMD traits for AltiVec (also tested on PPC 970FX with GCC
    4.1 and PowerPC 7447A with GreenHills), and adds a unit-test for
    SIMD traits that I've been meaning to check in for some time.

  - Fixes the builtin SIMD vmul routine for split-complex to work
    correctly when the output aliases one of the inputs.  This was
    causing coverage_binary to fail.

    Curiously, ppu-g++ -m32 does not define __VEC__, while ppu32-g++
    does.
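
To illustrate the DMA alignment item above, here is a minimal sketch of
a 16-byte alignment test.  It is not the check added by the patch:
is_dma_aligned and check_5x7_rows are hypothetical names, and the
sketch assumes a 4-byte float and a matrix base that is itself 16-byte
aligned.  It shows why the second row of a 5 x 7 matrix of floats is
unaligned: each row is 7 * 4 = 28 bytes, which is not a multiple of 16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the patch's code): true if p is a multiple
// of 'alignment' bytes.  The message gives 16 bytes for DMA on the Cell.
inline bool is_dma_aligned(void const* p, std::size_t alignment = 16)
{
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Worked example from the message: rows of a dense, row-major 5 x 7
// matrix of floats whose base address is 16-byte aligned.
inline void check_5x7_rows()
{
  static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");
  alignas(16) float m[5][7] = {};

  assert(is_dma_aligned(&m[0][0]));   // row 0: byte offset 0
  assert(!is_dma_aligned(&m[1][0]));  // row 1: byte offset 28
  assert(is_dma_aligned(&m[4][0]));   // row 4: byte offset 112 = 7 * 16
}
```

Only every fourth row of such a matrix starts on a 16-byte boundary, so
a row-by-row dispatch needs either a check like this or a fallback path
for the unaligned rows.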

With this patch, all tests should pass on the Cell, with the following 
exceptions:

  - convolution fails with OpenMPI because "MPI_BOR reduction not define
    for non-intrinsic type".  Passes in serial build.
  - parallel/fftm likewise.

=> These two appear to be an OpenMPI problem, not a Cell problem.
    We can debug them later.

  - Some of the fft_ext test cases fail, in particular real->complex.

=> I have not debugged this.

  - correlation fails because of a precision error (error_db threshold).

  - ref-impl/fft-coverage fails because of a precision error (test does
    not use error_db, but if it did, it would fail for our usual
    threshold)

=> It looks like the libfft FFT is noisy.  This isn't worth diagnosing
    too much since we'll eventually replace it with a faster FFT.

Also, the regressions/transpose_assign test takes a long time to run. 
Granted, it is doing a lot of transposes and I had optimization turned 
off, but it runs much faster on EM64T.

Patch applied.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fixes.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070317/598379ae/attachment.ksh>