[vsipl++] [patch] Fast convolution expression templates
Jules Bergmann
jules at codesourcery.com
Fri Apr 13 15:47:39 UTC 2007
Don McCoy wrote:
> The attached patch adds expression templates to support these
> single-line, multiple-row fast convolutions using unique weights for
> each row ('weights_' is a matrix as well).
Don,
This looks good. I have a copule of comments below, mostly adding
additional checks to the evaluators. Otherwise, please check it in.
thanks,
-- Jules
> +template <typename DstBlock,
> + typename T,
> + typename CoeffsMatBlockT,
> + typename MatBlockT,
> + typename Backend1T,
> + typename Workspace1T,
> + typename Backend2T,
> + typename Workspace2T>
> +struct Serial_expr_evaluator<2, DstBlock,
> + const Return_expr_block<2, T,
> + fft::Fft_return_functor<2, T,
> + const Binary_expr_block<2, op::Mult,
> + CoeffsMatBlockT, T,
> + const Return_expr_block<2, T,
> + fft::Fft_return_functor<2, T,
> + MatBlockT,
> + Backend2T, Workspace2T>
> + >, T
> + >,
> + Backend1T, Workspace1T>
> + >,
> + Cbe_sdk_tag
> + >
> +{
> + static char const* name() { return "Cbe_sdk_tag"; }
> +
> + typedef
> + Return_expr_block<2, T,
> + fft::Fft_return_functor<2, T,
> + const Binary_expr_block<2, op::Mult,
> + CoeffsMatBlockT, T,
> + const Return_expr_block<2, T,
> + fft::Fft_return_functor<2, T,
> + MatBlockT,
> + Backend2T, Workspace2T>
> + >, T
> + >,
> + Backend1T, Workspace1T>
> + >
> + SrcBlock;
> +
> + typedef typename DstBlock::value_type dst_type;
> + typedef typename SrcBlock::value_type src_type;
> + typedef typename Block_layout<DstBlock>::complex_type complex_type;
> + typedef impl::cbe::Fastconv_base<2, T, complex_type> fconv_type;
> +
> + static bool const ct_valid = Type_equal<T, std::complex<float>
>::value;
[1] We should enforce that MatBlockT::value_type == complex<float>
> +
> + static bool rt_valid(DstBlock& dst, SrcBlock const& /*src*/)
> + {
> + return fconv_type::rt_valid_size(dst.size(2, 1));
[2] Do we need to enforce any other run-time constaints? Ext_data
access OK?
Unit-stride? etc.
Or are those handled by Fastconv_base?
We should definitely check FFT scaling (see ifdef'd out check in
opt/expr/eval_fastconv). IIRC that check was expensive for some
reason, although I believe it shouldn't be. If it proves to be
expensive here, we can leave it out for the time being.
> Index: benchmarks/fastconv.cpp
> ===================================================================
> - double error = error_db(data, chk);
> + double error = error_db(LOCAL(data), LOCAL(chk));
[3] The global version failed to compile right? I think I've run
across this too. There is a bug in error_db and/or the reductions
that I need to track down. Your work around is better than mine, I
just commented out the test altogether! :)
--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
More information about the vsipl++
mailing list