[vsipl++] patch: support sal-fft

Fri Feb 17 17:26:36 UTC 2006

Stefan Seefeld wrote:
> The attached patch adds support for SAL as a new backend for our FFT 
> engine.
> This new backend does not support 3D ffts, and block sizes that are not 
> powers of two.
> I therefor (conditionally) masked those tests in tests/fft.cpp that SAL 
> does not
> support. This can be reverted as soon as the fft_dispatch framework is 
> in place,
> i.e. multiple fft backends are supported in parallel.

That sounds fine.

> 
> I believe the code in src/vsip/impl/sal/fft.h can be tidied up quite a bit,
> though I'd prefer to do that only once the code is refactored, i.e. for now
> focus on feature completeness.

Stefan,

This looks good.

We need to avoid copying data, unless it is necessary to pack/unpack it, 
align it, or transpose it.  In particular,for complex to complex FFTs we 
should use fft_coptx() instead of fft_copx().

				-- Jules

> 
> Regards,
>         Stefan
> 
> 
> ------------------------------------------------------------------------

> +#elif defined(VSIP_IMPL_SAL_FFT)
> +    // In some contexts, SAL destroys the input data itself, and sometimes
> +    // we have to modify it to 'pack' data into the format SAL expects
> +    // (see SAL Tutorial for details).
> +    // Therefor, we always copy the input.
> +    static const bool  must_copy = true;

For complex-to-complex FFTs, there are non-clobbering variants of some 
of the functions (fft_coptx instead of fft_copx) that we should use 
instead of copying the data.  For real-to-complex and complex-to-real it 
still might be necessary to copy.

> +    // SAL cannot handle non-unit strides properly as 'complex' isn't
> +    // a real (packed) datatype, so the stride would be applied to the real/imag
> +    // offset, too.
> +    static const vsip::dimension_type  transpose_target = 1;
>  #else
>        static const bool  must_copy = false;
>        static const vsip::dimension_type  transpose_target = axis;
>  #endif
> -
>        typename impl::Maybe_force_copy<
>  	  must_copy,typename local_type::block_type,
>            typename impl::Maybe_transpose<2,axis,transpose_target>::type>::type
> @@ -833,7 +890,7 @@
>  	  impl::Ext_data<in_block_type>(this->in_temp_).data());
>  
>        const bool native_order = (axis == transpose_target);
> -      
> +
>        this->core_->scale_ = this->scale_;
>        this->core_->runs_ = local_inout.size(1-axis);
>        this->core_->stride_ = 1;
> Index: src/vsip/impl/sal/fft.hpp

> +
> +template <vsip::dimension_type D, typename inT, typename outT, bool doFftm>
> +inline void
> +from_to(Fft_core<D, inT, outT, doFftm>& self, inT const* in, outT* out)
> +  VSIP_NOTHROW
> +{
> +  assert(0 && "Sorry, operation not yet supported for this type !");
> +  // TBD

This shouldn't make it out into a release, but to be on the safe size, 
you should really say:

	VSIP_IMPL_THROW(vsip::impl::unimplemented("..."));

> +}
> +
> +// 1D real -> complex forward fft
> +
> +inline void
> +from_to(Fft_core<1, float, std::complex<float>, false>& self,
> +	float const* in, std::complex<float>* out)
> +  VSIP_NOTHROW
> +{
> +  FFT_setup setup = reinterpret_cast<FFT_setup>(self.plan_);
> +  float *out_ = reinterpret_cast<float*>(out);

We reserve the "_" suffix for member variables.  Perhaps you could call 
the parameter "arg_out" and then use "out" for the cast?

> +  fft_ropx(&setup, const_cast<float*>(in), 1, out_, 1,
> +	   self.size_[0], FFT_FORWARD, sal::ESAL);
> +  // unpack the data (see SAL reference for details).
> +  int const N = (1 << self.size_[0]) + 2;
> +  out_[N - 2] = out_[1];
> +  out_[1] = 0.f;
> +  out_[N - 1] = 0.f;
> +  // forward fft_ropx is scaled up by 2.
> +  float scale = 0.5f;
> +  vsmulx(out_, 1, &scale, out_, 1, N, sal::ESAL);
> +}

> +inline void
> +from_to(Fft_core<1, std::complex<float>, std::complex<float>, false>& self,
> +	std::complex<float> const *in, std::complex<float> *out) VSIP_NOTHROW
> +{
> +  FFT_setup setup = reinterpret_cast<FFT_setup>(self.plan_);
> +  COMPLEX *in_ = reinterpret_cast<COMPLEX *>(const_cast<std::complex<float>*>(in));
> +  COMPLEX *out_ = reinterpret_cast<COMPLEX *>(out);
> +  long stride = 2;
> +  long direction = self.is_forward_ ? FFT_FORWARD : FFT_INVERSE;
> +  fft_copx(&setup, in_, stride, out_, stride, self.size_[0],
> +	   direction, sal::ESAL);

We should use fft_coptx() instead to avoid clobbering the input. 
Likewise for fft_copdtx.