[vsipl++] [patch] CBE Fftm support

Thu Feb 15 13:10:03 UTC 2007

Don McCoy wrote:

> Index: src/vsip/core/fft.hpp
> ===================================================================
> --- src/vsip/core/fft.hpp	(revision 163256)
> +++ src/vsip/core/fft.hpp	(working copy)
> @@ -449,10 +449,11 @@
>  class Fftm : public impl::fftm_facade<I, O, impl::fft::LibraryTagList,
>  				      1 - A, D, R, N, H> 
>  {
> -  // The S template parameter in 2D Fft is '0' for column-first
> -  // and '1' for row-first transformation. As Fftm's Axis parameter
> -  // does the inverse, we use '1 - A' here to be able to share the same
> -  // logic underneath.
> +  // Fftm and 2D Fft share some of the same underlying logic.  
> +  // Unfortunately, the latter uses S where '0' stands for column-first 
> +  // and '1' for row-first transformations.  Fftm uses A where '0' means 
> +  // by-row and '1' means by-column.  As a result, here we use '1 - A'
> +  // in order to be consistent in the base class.

What about:

Fftm and 2D Fft share some underlying logic.
The 'Special dimension' (S) template parameter in 2D Fft uses '0' to
represent a column-first and '1' a row-first transformation, while
the Fftm 'Axis' (A) parameter uses '0' to represent a row-wise and
'1' a column-wise transformation.
Thus, by using '1 - A' here we can share the implementation, too.
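
To make the mapping concrete, here is a minimal sketch of the idea. The names
'facade_sketch' and 'Fftm_sketch' are hypothetical stand-ins, not the real
fftm_facade / Fftm declarations, and all the other template parameters are
left out:

```cpp
// Hypothetical stand-ins; not the actual VSIPL++ classes.

// Shared implementation, parameterized on the 2D Fft 'Special dimension' (S):
// 0 = column-first, 1 = row-first.
template <int S>
struct facade_sketch
{
  static constexpr int special_dim = S;
};

// Fftm front end, parameterized on 'Axis' (A):
// 0 = row-wise, 1 = column-wise.
// Forwarding '1 - A' lets it reuse the S-based implementation unchanged.
template <int A>
struct Fftm_sketch : facade_sketch<1 - A>
{
  static constexpr int axis = A;
};
```

With this, a row-wise Fftm (A == 0) ends up on the row-first code path
(S == 1), and a column-wise one (A == 1) on the column-first path (S == 0).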

> Index: src/vsip/opt/cbe/ppu/fft.cpp
> ===================================================================
> --- src/vsip/opt/cbe/ppu/fft.cpp	(revision 163256)
> +++ src/vsip/opt/cbe/ppu/fft.cpp	(working copy)

>  // 1D complex -> complex FFT
>  
>  template <typename T, int A, int E>
>  class Fft_impl<1, std::complex<T>, std::complex<T>, A, E>
> -  : public fft::backend<1, std::complex<T>, std::complex<T>, A, E>
> -
> +    : public fft::backend<1, std::complex<T>, std::complex<T>, A, E>,
> +      private Fft_base<T>
>  {
>    typedef T rtype;
>    typedef std::complex<rtype> ctype;
>    typedef std::pair<rtype*, rtype*> ztype;
>  
>  public:
> -  Fft_impl(Domain<1> const &dom, rtype scale) VSIP_THROW((std::bad_alloc))
> +  Fft_impl(Domain<1> const &dom, rtype scale)
>        : scale_(scale),
>          W_(alloc_align<ctype>(VSIP_IMPL_ALLOC_ALIGNMENT, dom.size()/4))
>    {
> -    compute_twiddle_factors(W_, dom.size());
> +    this->compute_twiddle_factors(W_, dom.size());

Since you have now moved the definition of 'compute_twiddle_factors' into a
base class, why not store W_ there, too, and call the function from the
base class constructor?

Thus...

>    }
>    virtual ~Fft_impl()
>    {
> @@ -106,7 +164,7 @@
>    virtual void in_place(ctype *inout, stride_type stride, length_type length)
>    {
>      assert(stride == 1);
> -    fft_8K<T>(inout, inout, W_, length, this->scale_, E);
> +    this->fft_8K(inout, inout, W_, length, this->scale_, E);

...this would become:

this->fft_8K(inout, inout, this->scale_, E);
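
Roughly what I have in mind for the base class, as a sketch only. The names
'Fft_base_sketch' and 'Fft_impl_sketch' are hypothetical, std::vector stands
in for alloc_align (so it does not give the alignment the real code needs),
and the twiddle layout W_[k] = exp(-2*pi*i*k/size) is an assumption, not
necessarily what compute_twiddle_factors does:

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical sketch: the base class owns the twiddle-factor table
// and fills it in its own constructor.
template <typename T>
class Fft_base_sketch
{
protected:
  typedef std::complex<T> ctype;

  explicit Fft_base_sketch(std::size_t size)
    : W_(size / 4)   // same table length as in the patch: dom.size()/4
  {
    assert(size >= 4);
    compute_twiddle_factors(size);
  }

  std::vector<ctype> W_;   // stand-in for the aligned allocation

private:
  // Assumed layout: W_[k] = exp(-2*pi*i*k/size).
  void compute_twiddle_factors(std::size_t size)
  {
    T const two_pi = T(2) * std::acos(T(-1));
    for (std::size_t k = 0; k < W_.size(); ++k)
      W_[k] = std::polar(T(1), -two_pi * T(k) / T(size));
  }
};

// The derived class no longer allocates or initializes W_ itself.
template <typename T>
class Fft_impl_sketch : private Fft_base_sketch<T>
{
public:
  Fft_impl_sketch(std::size_t size, T scale)
    : Fft_base_sketch<T>(size), scale_(scale) {}

  std::size_t twiddle_count() const { return this->W_.size(); }
  std::complex<T> twiddle(std::size_t k) const { return this->W_[k]; }
  T scale() const { return scale_; }

private:
  T scale_;
};
```

The derived constructor then only forwards the size, and any member that
needs the table can reach it through the base without passing W_ around.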

>    }
>    virtual void in_place(ztype, stride_type, length_type)
>    {
> @@ -117,7 +175,7 @@
>    {
>      assert(in_stride == 1);
>      assert(out_stride == 1);
> -    fft_8K<T>(out, in, W_, length, this->scale_, E);
> +    this->fft_8K(out, in, W_, length, this->scale_, E);

Could you swap 'in' and 'out' here for consistency? I think we pass the
input first everywhere else.

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718