[vsipl++] SIMD threshold with loop fusion
Jules Bergmann
jules at codesourcery.com
Tue May 15 20:28:56 UTC 2007
Assem Salama wrote:
> Everyone,
> I forgot to attach patch.
Assem,
This looks good. Pushing the Binary_operator_map::apply down into
simd_thresh really increases the coverage of the routine.
Going forward, I would like to be able to support the following.
In some contexts, A > B should evaluate to a bool SIMD value. I.e.
Vector<bool> Z;
Vector<float> A, B, C, D;
Z = A > B;
Z = A > B && C > D;
Right now A > B evaluates to an int SIMD value that holds a bitmask.
This makes sense when used as the predicate for an ite() operator.
Any ideas on how to support both?
- We could force all A > B exprs to be treated as bool, and then have
ite expand the bool back out to a bitmask, but this would be
inefficient. That seems like a bad idea.
- We could have the return type of A > B be determined by how it is
used. I.e. for 'X = ite(A>B, ...)' the return type of 'A > B' would
be int bitmask, but for 'Z = A > B' it would be bool.
- We could do all logic as int bitmasks, then force the 'Z =' to
convert an int mask into a bool at assignment. That might sacrifice a
bit of efficiency in some cases (like Z = A > B && C > D), but might
be a decent solution.
However, I don't think the current work prevents that, so let's check
it in once you've addressed the feedback below.
Also, check with Stefan for feedback too.
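To make the third option above concrete, here is a minimal scalar sketch of
"do all logic as int bitmasks, convert to bool only at assignment". The
helpers mask_gt, mask_and and to_bool are hypothetical names used for
illustration only; they are not part of VSIPL++, and std::vector stands in
for Vector<>. All inputs are assumed to have the same length.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: element-wise A > B producing an all-ones or
// all-zeros 32-bit mask per element, as a real SIMD compare would.
inline std::vector<std::uint32_t>
mask_gt(std::vector<float> const& a, std::vector<float> const& b)
{
  std::vector<std::uint32_t> m(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    m[i] = (a[i] > b[i]) ? 0xFFFFFFFFu : 0x00000000u;
  return m;
}

// Logical AND of two predicates stays in mask form (plain bitwise AND).
inline std::vector<std::uint32_t>
mask_and(std::vector<std::uint32_t> const& x,
         std::vector<std::uint32_t> const& y)
{
  std::vector<std::uint32_t> m(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    m[i] = x[i] & y[i];
  return m;
}

// The only mask-to-bool conversion happens here, at the 'Z =' step.
inline std::vector<bool> to_bool(std::vector<std::uint32_t> const& m)
{
  std::vector<bool> z(m.size());
  for (std::size_t i = 0; i < m.size(); ++i)
    z[i] = (m[i] != 0);
  return z;
}
```

With these, 'Z = A > B && C > D' would read
to_bool(mask_and(mask_gt(A, B), mask_gt(C, D))): one conversion at the end,
and the same mask_gt result could feed ite() directly without conversion.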
-- Jules
> ------------------------------------------------------------------------
>
> Index: src/vsip/opt/simd/simd.hpp
> ===================================================================
> --- src/vsip/opt/simd/simd.hpp (revision 165174)
> +++ src/vsip/opt/simd/simd.hpp (working copy)
> @@ -167,6 +167,9 @@
> static simd_type gt(simd_type const& v1, simd_type const& v2)
> { return (v1 > v2) ? simd_type(1) : simd_type(0); }
>
> + static simd_type lt(simd_type const& v1, simd_type const& v2)
> + { return (v1 < v2) ? simd_type(1) : simd_type(0); }
> +
[1] This looks good. However, do you think faux-SIMD should have the
same "API" as the real SIMD functions below?
For example, AltiVec vgt returns 0xFFFFFFFF or 0x00000000 for each
position. That can be used as a mask. (What does SSE do?)
Since faux-SIMD returns 1 or 0, it can't be used as a mask. A generic
routine that uses vgt may not work with faux-SIMD if it expects
vgt/vlt to return a value valid for a mask.
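For reference, SSE compares such as _mm_cmpgt_ps also produce all-ones or
all-zeros per lane, like AltiVec vec_cmpgt. The scalar sketch below shows
why the 1/0 convention breaks a generic mask-based select. The names
mask_gt, flag_gt and bit_select are hypothetical stand-ins, not the
Simd_traits API, and the code assumes a 32-bit IEEE float.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes 32-bit float");

// Mask convention: all bits set when true, as real SIMD compares return.
inline std::uint32_t mask_gt(float a, float b)
{
  return (a > b) ? 0xFFFFFFFFu : 0x00000000u;
}

// Flag convention: 1 when true, as the faux-SIMD gt()/lt() in the patch.
inline std::uint32_t flag_gt(float a, float b)
{
  return (a > b) ? 1u : 0u;
}

// Generic bitwise select on the float bit patterns:
//   result = (mask & t) | (~mask & f)
inline float bit_select(std::uint32_t mask, float t, float f)
{
  std::uint32_t tb, fb;
  std::memcpy(&tb, &t, sizeof tb);
  std::memcpy(&fb, &f, sizeof fb);
  std::uint32_t rb = (mask & tb) | (~mask & fb);
  float r;
  std::memcpy(&r, &rb, sizeof r);
  return r;
}
```

bit_select(mask_gt(2, 1), t, f) picks t as intended, while
bit_select(flag_gt(2, 1), t, f) keeps only the lowest bit of t and so does
not return t. Having faux-SIMD return the full mask would let the same
generic routine work on both paths.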
> static simd_type pack(simd_type const&, simd_type const&)
> { assert(0); }
>
> @@ -998,6 +1019,7 @@
> struct Alg_vbor;
> struct Alg_vbxor;
> struct Alg_vbnot;
> +struct Alg_threshold;
[2] Isn't 'Alg_threshold' already checked in? I'm confused.
>
> template <typename T,
> bool IsSplit,
> Index: src/vsip/opt/simd/threshold.hpp
> ===================================================================
> --- src/vsip/opt/simd/threshold.hpp (revision 171195)
> +++ src/vsip/opt/simd/threshold.hpp (working copy)
> @@ -15,6 +15,7 @@
> #define VSIP_OPT_SIMD_THRESHOLD_HPP
>
> #include <vsip/opt/simd/simd.hpp>
> +#include <vsip/opt/simd/expr_iterator.hpp>
[3] I'm a little wary about including expr_iterator since it might
pull in a lot of unnecessary dependencies.
However, we can fix that later by pushing Binary_operator_map into
a separate header file.
> #include <vsip/core/metaprogramming.hpp>
>
> /***********************************************************************
> @@ -47,19 +48,22 @@
> // Class for threshold
>
> template <typename T,
> + template <typename,typename> class O,
[4] Please use a slightly more descriptive parameter name, such as
"Op", or document.
> bool Is_vectorized>
> struct Simd_threshold;
>
>
> Index: src/vsip/opt/simd/expr_iterator.hpp
> ===================================================================
> +// Proxy for ternary access traits for ite functor
> +template <typename A, typename B, typename C>
> +class Proxy<Ternary_access_traits<A,B,C,ite_functor> >
[5] This is OK. However, since the behavior is governed by
Ternary_operator_map<..., ite_functor>, this could be generalized to
take the functor (currently ite_functor) as an arbitrary template parameter.
That way, in the future, when you add other Ternary_operator_map
specializations, this Proxy specialization could apply to them too.
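A rough sketch of the generalization suggested in [5], using simplified
scalar stand-ins rather than the real expr_iterator.hpp types. It assumes
the functor is a three-parameter class template; Value_proxy and ma_functor
are hypothetical and exist only to exercise the example.

```cpp
// Simplified, hypothetical stand-ins for the types named in the patch.
template <typename T1, typename T2, typename T3> struct ite_functor {};
template <typename T1, typename T2, typename T3> struct ma_functor {};

template <typename A, typename B, typename C,
          template <typename, typename, typename> class Op>
struct Ternary_access_traits {};

// Behavior is selected per functor here, not in Proxy.
template <typename T, template <typename, typename, typename> class Op>
struct Ternary_operator_map;

template <typename T>
struct Ternary_operator_map<T, ite_functor>
{
  // Scalar stand-in for a mask select: nonzero 'a' picks b, else c.
  static T apply(T const& a, T const& b, T const& c)
  { return (a != T(0)) ? b : c; }
};

template <typename T>
struct Ternary_operator_map<T, ma_functor>
{
  // Multiply-add: a * b + c.
  static T apply(T const& a, T const& b, T const& c)
  { return a * b + c; }
};

// Leaf proxy holding a single value (hypothetical).
template <typename T>
struct Value_proxy
{
  typedef T value_type;
  explicit Value_proxy(T v) : v_(v) {}
  T load() const { return v_; }
  T v_;
};

template <typename Traits> class Proxy;

// One specialization for any Op; Ternary_operator_map does the dispatch.
template <typename A, typename B, typename C,
          template <typename, typename, typename> class Op>
class Proxy<Ternary_access_traits<A, B, C, Op> >
{
  typedef typename A::value_type value_type;

public:
  Proxy(A const& a, B const& b, C const& c) : a_(a), b_(b), c_(c) {}

  value_type load() const
  {
    return Ternary_operator_map<value_type, Op>::apply(
      a_.load(), b_.load(), c_.load());
  }

private:
  A a_;
  B b_;
  C c_;
};
```

With this shape, adding a new ternary operation would only need a new
Ternary_operator_map specialization; the Proxy stays untouched.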
> +{
> + typedef typename A::access_traits access_traits;
> + typedef typename access_traits::value_type value_type;
> + typedef typename Simd_traits<value_type>::simd_type simd_type;
> +
> +public:
> + Proxy(A const &a, B const &b, C const &c)
> + : a_(a), b_(b), c_(c) {}
> +
> + simd_type load() const
> + {
> + typedef typename A::access_traits::return_type return_type;
> + typedef typename A::access_traits::value_type value_type;
> + typedef typename Simd_traits<return_type>::simd_type simd_ret_type;
> + typedef typename Simd_traits<value_type>::simd_type simd_val_type;
> +
> + simd_ret_type a_ret = a_.load(); // this is the mask
> + simd_val_type b = b_.load(); // if true
> + simd_val_type c = c_.load(); // if false
> + // apply the mask
> + return Ternary_operator_map<value_type,ite_functor>::apply(a_ret,b,c);
> + }
> +
> + void increment(length_type n = 1)
> + {
> + a_.increment(n);
> + b_.increment(n);
> + c_.increment(n);
> + }
> +
> +private:
> + A a_;
> + B b_;
> + C c_;
> +};
> +
> template <typename T>
> struct Iterator
> {
--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705