[vsipl++] [patch] matvec: outer, gem, cumsum
Jules Bergmann
jules at codesourcery.com
Tue Sep 27 22:56:48 UTC 2005
Don McCoy wrote:
> The attached patch rounds out the functionality of [math.matvec] with
> the exception of a few of the matrix-vector product functions. Since
> those are implemented in a separate file, this patch stands by itself
> pretty well.
Don,
gemp and gems need to support the mat_conj and mat_herm mat_op_types as
well. (The spec is a bit confusing. [math.matvec.gem]/3 defines the 4
mat_op_types: mat_ntrans, mat_trans, mat_herm, and mat_conj. gemp's
requirements then say that OpA and OpB must be mat_ntrans or mat_trans
unless T is complex. The implication is that if T is complex, OpA and
OpB can be mat_herm and mat_conj as well.)
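As an aside, you could make that constraint explicit at compile time with
a small guard. The names below (Is_complex, Op_valid, Ct_assert) are just
a sketch of mine, not anything in the patch or in the library:

// Reject mat_herm/mat_conj at compile time unless T is complex,
// mirroring the [math.matvec.gem]/3 requirements.
template <typename T> struct Is_complex              { static bool const value = false; };
template <typename T> struct Is_complex<complex<T> > { static bool const value = true;  };

template <mat_op_type Op, typename T>
struct Op_valid
{
  static bool const value =
    Op == mat_ntrans || Op == mat_trans || Is_complex<T>::value;
};

// Instantiating Ct_assert<false> is an error (incomplete type).
template <bool> struct Ct_assert;
template <>     struct Ct_assert<true> {};

gemp and gems could then declare a local Ct_assert<Op_valid<OpA, T1>::value>
(and likewise for OpB) to diagnose invalid combinations at compile time.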
The approach you've taken for gemp is fine; it is definitely possible to
plug those additional cases in. However, since the number of cases is
multiplicative (size(OpA) x size(OpB)), you might want to separate the
handling of OpA and OpB to simplify things.
One way to do this is to define a class that applies a mat_op to a
single matrix:
template <mat_op_type OpT,
          typename    T,
          typename    Block>
struct Apply_mat_op;

template <typename T,
          typename Block>
struct Apply_mat_op<mat_ntrans, T, Block>
{
  typedef const_Matrix<T, Block> result_type;

  static result_type
  exec(const_Matrix<T, Block> m) VSIP_NOTHROW
  {
    return m;
  }
};

template <typename T,
          typename Block>
struct Apply_mat_op<mat_trans, T, Block>
{
  typedef typename const_Matrix<T, Block>::transpose_type result_type;

  static result_type
  exec(const_Matrix<T, Block> m) VSIP_NOTHROW
  {
    return m.transpose();
  }
};

// This specialization makes mat_herm valid only for complex<T>.
template <typename T,
          typename Block>
struct Apply_mat_op<mat_herm, complex<T>, Block>
{
  ...
};
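If it helps, here is one possible shape for that mat_herm body. This is
only a sketch that copies the conjugate transpose into a new Dense-backed
matrix; the real implementation would more likely return some lightweight
conjugate-transpose view, and result_type would change accordingly:

template <typename T,
          typename Block>
struct Apply_mat_op<mat_herm, complex<T>, Block>
{
  // Simplification: materialize the result rather than wrapping m in a view.
  typedef Matrix<complex<T> > result_type;

  static result_type
  exec(const_Matrix<complex<T>, Block> m) VSIP_NOTHROW
  {
    result_type result(m.size(1), m.size(0));
    for (index_type i = 0; i < result.size(0); ++i)
      for (index_type j = 0; j < result.size(1); ++j)
        result.put(i, j, conj(m.get(j, i)));
    return result;
  }
};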
You could optionally provide a convenience function to use Apply_mat_op:
template <mat_op_type OpT,
          typename    T,
          typename    Block>
typename Apply_mat_op<OpT, T, Block>::result_type
apply_mat_op(const_Matrix<T, Block> m)
{
  return Apply_mat_op<OpT, T, Block>::exec(m);
}
Now, you could implement the top-level gemp as:
template <mat_op_type OpA,
          mat_op_type OpB,
          typename    T0,
          typename    T1, typename Block1,
          typename    T2, typename Block2,
          typename    T3,
          typename    T4, typename Block4>
void
gemp(
  T0                       alpha,
  const_Matrix<T1, Block1> A,
  const_Matrix<T2, Block2> B,
  T3                       beta,
  Matrix<T4, Block4>       C)
VSIP_NOTHROW
{
  // equivalent to C = alpha * OpA(A) * OpB(B) + beta * C
  impl::gemp(alpha, apply_mat_op<OpA>(A), apply_mat_op<OpB>(B),
             beta, C);
}
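gems could follow the same pattern. A rough sketch, assuming gems computes
C = alpha * OpA(A) + beta * C and that an impl::gems helper exists alongside
impl::gemp (both assumptions on my part, so adjust to the actual names and
signature):

template <mat_op_type OpA,
          typename    T0,
          typename    T1, typename Block1,
          typename    T2,
          typename    T3, typename Block3>
void
gems(
  T0                       alpha,
  const_Matrix<T1, Block1> A,
  T2                       beta,
  Matrix<T3, Block3>       C)
VSIP_NOTHROW
{
  // equivalent to C = alpha * OpA(A) + beta * C,
  // with the OpA handling factored out exactly as in gemp above.
  impl::gems(alpha, apply_mat_op<OpA>(A), beta, C);
}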
>
>
> ------------------------------------------------------------------------
> +
> +
> +   template <dimension_type d,
> +             typename T0,
> +             typename T1,
> +             typename Block0,
> +             typename Block1>
> +   void
> +   cumsum(
> +     const_Vector<T0, Block0> v,
> +     Vector<T1, Block1>       w)
> +   VSIP_NOTHROW
> +   {
> +     // Effects: w has values equaling the cumulative sum of values in v.
> +     //
> +     // If View is Vector, d is ignored and, for
> +     //   0 <= i < v.size(),
> +     //   w.get(i) equals the sum over 0 <= j <= i of v.get(j)
> +     assert( v.size() == w.size() );
> +
> +     for ( index_type i = 0; i < v.size(); ++i )
> +     {
> +       T1 sum = T0();
> +       for ( index_type j = 0; j <= i; ++j )
> +         sum += v.get(j);
> +       w.put(i, sum);
> +     }
You could avoid recomputing the sum each time by keeping a running total:
T1 sum = T0();
for (index_type i = 0; i < v.size(); ++i)
{
  sum += v.get(i);
  w.put(i, sum);
}
You should be able to do something similar for matrix cumsum.
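For instance, something like the following for the Matrix overload. I am
writing the d == 0 case from memory as accumulating along each row, so
please double-check that direction against the spec; the other value of d
would simply swap the roles of the two loops:

template <dimension_type d,
          typename T0,
          typename T1,
          typename Block0,
          typename Block1>
void
cumsum(
  const_Matrix<T0, Block0> v,
  Matrix<T1, Block1>       w)
VSIP_NOTHROW
{
  assert(v.size(0) == w.size(0) && v.size(1) == w.size(1));

  // Assumed reading of the spec: d == 0 accumulates along each row.
  // (This sketch does not branch on d; the real version would.)
  for (index_type i = 0; i < v.size(0); ++i)
  {
    T1 sum = T1();
    for (index_type j = 0; j < v.size(1); ++j)
    {
      sum += v.get(i, j);
      w.put(i, j, sum);
    }
  }
}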