[vsipl++] SIMD all unaligned dispatch
Jules Bergmann
jules at codesourcery.com
Tue Jun 19 12:34:52 UTC 2007
Assem Salama wrote:
> Everyone,
> This patch includes some missing pieces not included in previous patch.
> This should make a fresh checkout compile ok :) I apologize for last
> patch's incompleteness.
Assem,
What is the reason for extending the length of type_list? Is that
needed for this patch?
Rather than add a new evaluator ("all unaligned"), I would like to
have a single evaluator handle the cases where views have the same
alignment (whether it is 0 or N). The only difference between the two
is cleanup code before SIMD processing. Can you make that change and
repost a patch?
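
A minimal sketch of what a single same-alignment evaluator could look like. It is standalone C++, not VSIPL++ code: kAlign, alignment_of, same_alignment and add_same_alignment are hypothetical names, and the vector body is written as a plain loop where a real evaluator would issue SIMD loads and stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical SIMD width in bytes; stands in for Simd_traits<T>::alignment.
constexpr std::size_t kAlign = 16;

// Byte offset of p past the previous kAlign boundary (0 means aligned).
inline std::size_t alignment_of(void const* p)
{
  return reinterpret_cast<std::uintptr_t>(p) & (kAlign - 1);
}

// True when all three operands share one offset, whether it is 0 or N.
inline bool same_alignment(float const* dst, float const* a, float const* b)
{
  std::size_t const off = alignment_of(dst);
  return alignment_of(a) == off && alignment_of(b) == off;
}

// Computes dst[i] = a[i] + b[i] for i in [0, n).
// Returns the number of elements handled by the scalar cleanup loop.
inline std::size_t add_same_alignment(float* dst, float const* a,
                                      float const* b, std::size_t n)
{
  assert(same_alignment(dst, a, b));

  std::size_t i = 0;
  // Cleanup before SIMD processing: scalar steps until dst is aligned.
  // Because the offsets match, a and b become aligned at the same index.
  for (; i < n && alignment_of(dst + i) != 0; ++i)
    dst[i] = a[i] + b[i];
  std::size_t const cleanup = i;

  // Main body: whole vectors only (SIMD loads/stores in a real evaluator).
  std::size_t const vec = kAlign / sizeof(float);
  for (; i + vec <= n; i += vec)
    for (std::size_t j = 0; j < vec; ++j)
      dst[i + j] = a[i + j] + b[i + j];

  // Tail: elements left over after the last whole vector.
  for (; i < n; ++i)
    dst[i] = a[i] + b[i];

  return cleanup;
}
```

When the shared offset is 0 the cleanup loop runs zero times, so the aligned case and the "all unaligned" case go through the same code path.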
Also, did you have a chance to benchmark the iterator change (#4)
below?
-- Jules
>
> Thanks,
> Assem
>
>
> ------------------------------------------------------------------------
>
> Index: src/vsip/opt/simd/expr_evaluator.hpp
> ===================================================================
> + static bool rt_valid(LB& lhs, RB const& rhs)
> + {
> + Ext_data<LB, layout_type> dda(lhs, SYNC_OUT);
> + int lhs_a = simd::Proxy_factory<LB, true>::alignment(lhs);
[1] Instead of calling Proxy_factory::alignment (which internally
creates another Ext_data object -- which is both extra overhead and
potentially undefined behavior), use Simd_traits::alignment_of directly.
> + return (dda.stride(0) == 1 &&
> + simd::Proxy_factory<RB, true>::rt_valid(rhs, lhs_a));
> +
> +
> + }
> +
> + // First, deal with unaligned pointers
> + typename Ext_data<LB, layout_type>::raw_ptr_type raw_ptr = dda.data();
> + while(simd::Simd_traits<typename LB::value_type>::alignment_of(raw_ptr) &&
> + n > 0)
> + {
> + lhs.put(size-n, rhs.get(size-n));
> + n--;
> + raw_ptr++;
> + }
[2] What updates the pointers held by lp and rp? They are still
unaligned, right?
Ah, I see. You've changed Proxy::Proxy to force alignment below.
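
For reference, the forced alignment amounts to rounding the pointer down to the previous boundary. A minimal standalone sketch, assuming a power-of-two alignment; align_down is a hypothetical helper, not part of the patch or of VSIPL++. Widening the mask to uintptr_t before inverting it avoids clearing the upper pointer bits if the alignment constant happens to be a narrower unsigned type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round ptr down to the previous multiple of
// `alignment` bytes. `alignment` must be a nonzero power of two.
template <typename T>
T* align_down(T* ptr, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

  std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(ptr);
  // Build the mask at full pointer width, then clear the low bits.
  bits &= ~(static_cast<std::uintptr_t>(alignment) - 1);
  return reinterpret_cast<T*>(bits);
}
```

The rounded pointer only points at valid data if the elements before the original pointer were already handled, which is what the scalar loop in the evaluator does.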
> Index: src/vsip/opt/simd/eval_generic.hpp
> ===================================================================
> --- src/vsip/opt/simd/eval_generic.hpp (revision 174261)
> +++ src/vsip/opt/simd/eval_generic.hpp (working copy)
> @@ -664,6 +664,8 @@
>
> static bool rt_valid(DstBlock& dst, SrcBlock const& src)
> {
> + typedef simd::Simd_traits<typename SrcBlock::value_type> simd;
> +
> // check if all data is unit stride
> Ext_data<DstBlock, dst_lp> ext_dst(dst, SYNC_OUT);
> Ext_data<Block1, a_lp> ext_a(src.first().left(), SYNC_IN);
> @@ -672,7 +674,11 @@
> ext_a.stride(0) == 1 &&
> ext_b.stride(0) == 1 &&
> // make sure (A op B, A, k)
> - (&(src.first().left()) == &(src.second())));
> + (&(src.first().left()) == &(src.second())) &&
> + // make sure everything is aligned!
> + !simd::alignment_of(ext_dst.data()) &&
> + !simd::alignment_of(ext_a.data()) &&
> + !simd::alignment_of(ext_b.data()));
[3] Doesn't threshold handle initial unaligned values? If so, it is
sufficient to check that dst, a, and b all have the same alignment.
> static void exec(DstBlock& dst, SrcBlock const& src)
> Index: src/vsip/opt/simd/expr_iterator.hpp
> ===================================================================
> --- src/vsip/opt/simd/expr_iterator.hpp (revision 174261)
> +++ src/vsip/opt/simd/expr_iterator.hpp (working copy)
> @@ -268,13 +268,14 @@
> simd_type load() const
> { return simd::perm(x0_, x1_, sh_); }
>
> - void increment(length_type n = 1)
> + //void increment(length_type n = 1)
> + void increment()
> {
> - ptr_unaligned_ += n * Simd_traits<value_type>::vec_size;
> - ptr_aligned_ += n;
> + ptr_unaligned_ += Simd_traits<value_type>::vec_size;
> + ptr_aligned_++;
>
> // update x0
> - x0_ = (n == 1)? x1_:simd::load((value_type*)ptr_aligned_);
> + x0_ = x1_;
[4] Did you ever benchmark the difference between these two?
>
> - Proxy(value_type *ptr) : ptr_(ptr) {}
> + Proxy(value_type *ptr) : ptr_(ptr)
> + {
> + // Force alignment of pointer.
> + intptr_t int_ptr = (intptr_t)ptr_;
> + int_ptr &= ~(Simd_traits<value_type>::alignment-1);
> + ptr_ = (value_type*) int_ptr;
> + }
> +
[5] For LValue_access_traits, this ignores the IsAligned template
parameter. Since we appear to only handle the case where the LHS
is aligned, we should specialize this for IsAligned = true.
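
A rough sketch of the kind of specialization meant here, as standalone C++. Lvalue_proxy is a hypothetical stand-in and does not reproduce the real Proxy or LValue_access_traits interface. The primary template is declared but never defined, so asking for IsAligned = false fails at compile time instead of silently getting aligned-only behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the LHS proxy. The primary template is left
// undefined on purpose: an unaligned LHS is not supported.
template <typename T, bool IsAligned>
class Lvalue_proxy;

// Only the aligned case is implemented.
template <typename T>
class Lvalue_proxy<T, true>
{
public:
  explicit Lvalue_proxy(T* ptr) : ptr_(ptr) { assert(ptr != nullptr); }

  // Write one element at the current position.
  void store(T value) const { *ptr_ = value; }

  // Advance by n elements.
  void increment(std::size_t n = 1) { ptr_ += n; }

private:
  T* ptr_;
};
```

With this shape, Lvalue_proxy<float, true> compiles and Lvalue_proxy<float, false> is an incomplete type, so a caller cannot instantiate the unaligned variant by accident.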
--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705