[vsipl++] [patch] matvec: outer, gem, cumsum
Jules Bergmann
jules at codesourcery.com
Sat Oct 1 14:01:55 UTC 2005
Don, Looks good. Please check it in, modulo the two comments below.
thanks, -- Jules
Don McCoy wrote:
> Suggested changes applied. Using a modified approach that applies the
> 'mat_op_type' makes the code more readable and it was easier to extend
> to include op types mat_herm and mat_conj. Also includes
> specializations that allow herm and conj to be performed on real types
> (by doing transpose and nothing respectively).
> Tested under GCC 3.4 successfully. ICPC 8.0 and 9.0 caused failures
> related to handling of complex types.
>
>
>
> + template <typename T0, typename T1, typename T2, typename T3, typename T4,
> + typename Block1, typename Block2, typename Block4>
> + void
> + gemp( T0 alpha, const_Matrix<T1, Block1> A,
> + const_Matrix<T2, Block2> B, T3 beta, Matrix<T4, Block4> C)
> + {
> + assert( A.size(0) == C.size(0) );
> + assert( B.size(1) == C.size(1) );
Also assert that A.size(1) == B.size(0)
(calling dot() does this implicity, but catching errors earlier in the
call chain makes it easier for users to understand the assertion failure).
> +
> + for ( index_type i = A.size(0); i-- > 0; )
> + for ( index_type j = B.size(1); j-- > 0; )
> + C.put(i, j, alpha * dot( A.row(i), B.col(j) ) + beta * C.get(i, j));
> + }
> +
> +
> + /// outer product of two complex vectors
> + template <typename T0,
> + typename T1,
> + typename T2,
> + typename Block1,
> + typename Block2>
> + const_Matrix<typename Promotion<T0, typename Promotion<T1, T2>::type>::type>
> + outer( T0 alpha, const_Vector<std::complex<T1>, Block1> v,
> + const_Vector<std::complex<T2>, Block2> w )
> + VSIP_NOTHROW
> + {
> + typedef Matrix<typename Promotion<T0,
> + typename Promotion<T1, T2>::type>::type> return_type;
I think this should be:
typedef Matrix<typename Promotion<T0,
typename Promotion<std::complex<T1>, std::complex<T2> >::type
>::type> return type;
i.e. promote std::complex<T1> instead of T1, same for T2.
Also, the function return type should be updated too.
> + return_type r( v.size(), w.size(), alpha );
> +
> + for ( index_type i = v.size(); i-- > 0; )
> + for ( index_type j = w.size(); j-- > 0; )
> + r.put( i, j, alpha * v.get(i) * conj(w.get(j)) );
> +
> + return r;
> + }
> +
> +
More information about the vsipl++
mailing list