[vsipl++] [patch] matvec: outer, gem, cumsum
Jules Bergmann
jules at codesourcery.com
Tue Sep 27 22:56:48 UTC 2005
Don McCoy wrote:
> The attached patch rounds out the functionality of [math.matvec] with
> the exception of a few of the matrix-vector product functions. Since
> those are implemented in a separate file, this patch stands by itself
> pretty well.
Don,
gemp and gems need to support the mat_conj and mat_herm mat_op_types as
well. (The spec is a bit confusing. [math.matvec.gem]/3 defines the 4
mat_op_types: mat_ntrans, mat_trans, mat_herm, and mat_conj. gemp's
requirements then say that OpA and OpB must be mat_ntrans or mat_trans
unless T is complex. The implication is that if T is complex, OpA and
OpB can be mat_herm and mat_conj as well.)
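As an aside, you could make that constraint explicit at compile time with
a small guard. The names below (Is_complex, Op_valid, Ct_assert) are just
a sketch of mine, not anything in the patch or in the library:

// Reject mat_herm/mat_conj at compile time unless T is complex,
// mirroring the [math.matvec.gem]/3 requirements.
template <typename T> struct Is_complex              { static bool const value = false; };
template <typename T> struct Is_complex<complex<T> > { static bool const value = true;  };

template <mat_op_type Op, typename T>
struct Op_valid
{
  static bool const value =
    Op == mat_ntrans || Op == mat_trans || Is_complex<T>::value;
};

// Instantiating Ct_assert<false> is an error (incomplete type).
template <bool> struct Ct_assert;
template <>     struct Ct_assert<true> {};

gemp and gems could then declare a local Ct_assert<Op_valid<OpA, T1>::value>
(and likewise for OpB) to diagnose invalid combinations at compile time.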
The approach you've taken for gemp is fine; it is definitely possible to
plug those additional cases in. However, since the number of cases is
multiplicative (size(OpA) x size(OpB)), you might want to separate the
handling of OpA and OpB to simplify things.
One way to do this is to define a class that applies a mat_op to a
single matrix:
template <mat_op_type OpT,
          typename    T,
          typename    Block>
struct Apply_mat_op;

template <typename T,
          typename Block>
struct Apply_mat_op<mat_ntrans, T, Block>
{
  typedef const_Matrix<T, Block> result_type;

  static result_type
  exec(const_Matrix<T, Block> m) VSIP_NOTHROW
  {
    return m;
  }
};

template <typename T,
          typename Block>
struct Apply_mat_op<mat_trans, T, Block>
{
  typedef typename const_Matrix<T, Block>::transpose_type result_type;

  static result_type
  exec(const_Matrix<T, Block> m) VSIP_NOTHROW
  {
    return m.transpose();
  }
};

// This specialization makes mat_herm valid only for complex<T>.
template <typename T,
          typename Block>
struct Apply_mat_op<mat_herm, complex<T>, Block>
{
  ...
};
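If it helps, here is one possible shape for that mat_herm body. This is
only a sketch that copies the conjugate transpose into a new Dense-backed
matrix; the real implementation would more likely return some lightweight
conjugate-transpose view, and result_type would change accordingly:

template <typename T,
          typename Block>
struct Apply_mat_op<mat_herm, complex<T>, Block>
{
  // Simplification: materialize the result rather than wrapping m in a view.
  typedef Matrix<complex<T> > result_type;

  static result_type
  exec(const_Matrix<complex<T>, Block> m) VSIP_NOTHROW
  {
    result_type result(m.size(1), m.size(0));
    for (index_type i = 0; i < result.size(0); ++i)
      for (index_type j = 0; j < result.size(1); ++j)
        result.put(i, j, conj(m.get(j, i)));
    return result;
  }
};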
You could optionally provide a convenience function to use Apply_mat_op:
template <mat_op_type OpT,
          typename    T,
          typename    Block>
typename Apply_mat_op<OpT, T, Block>::result_type
apply_mat_op(const_Matrix<T, Block> m)
{
  return Apply_mat_op<OpT, T, Block>::exec(m);
}
Now, you could implement the top-level gemp as:
template <mat_op_type OpA,
          mat_op_type OpB,
          typename    T0,
          typename    T1, typename Block1,
          typename    T2, typename Block2,
          typename    T3,
          typename    T4, typename Block4>
void
gemp(
  T0                       alpha,
  const_Matrix<T1, Block1> A,
  const_Matrix<T2, Block2> B,
  T3                       beta,
  Matrix<T4, Block4>       C)
VSIP_NOTHROW
{
  // equivalent to C = alpha * OpA(A) * OpB(B) + beta * C
  impl::gemp(alpha, apply_mat_op<OpA>(A), apply_mat_op<OpB>(B),
             beta, C);
}
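gems could follow the same pattern. A rough sketch, assuming gems computes
C = alpha * OpA(A) + beta * C and that an impl::gems helper exists alongside
impl::gemp (both assumptions on my part, so adjust to the actual names and
signature):

template <mat_op_type OpA,
          typename    T0,
          typename    T1, typename Block1,
          typename    T2,
          typename    T3, typename Block3>
void
gems(
  T0                       alpha,
  const_Matrix<T1, Block1> A,
  T2                       beta,
  Matrix<T3, Block3>       C)
VSIP_NOTHROW
{
  // equivalent to C = alpha * OpA(A) + beta * C,
  // with the OpA handling factored out exactly as in gemp above.
  impl::gems(alpha, apply_mat_op<OpA>(A), beta, C);
}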
>
>
> ------------------------------------------------------------------------
> +
> +
> +   template <dimension_type d,
> +             typename T0,
> +             typename T1,
> +             typename Block0,
> +             typename Block1>
> +   void
> +   cumsum(
> +     const_Vector<T0, Block0> v,
> +     Vector<T1, Block1>       w)
> +   VSIP_NOTHROW
> +   {
> +     // Effects: w has values equaling the cumulative sum of values in v.
> +     //
> +     // If View is Vector, d is ignored and, for
> +     //   0 <= i < v.size(),
> +     //   w.get(i) equals the sum over 0 <= j <= i of v.get(j)
> +     assert( v.size() == w.size() );
> +
> +     for ( index_type i = 0; i < v.size(); ++i )
> +     {
> +       T1 sum = T0();
> +       for ( index_type j = 0; j <= i; ++j )
> +         sum += v.get(j);
> +       w.put(i, sum);
> +     }
You could avoid recomputing the sum each time by keeping a running total:
T1 sum = T0();
for (index_type i = 0; i < v.size(); ++i)
{
  sum += v.get(i);
  w.put(i, sum);
}
You should be able to do something similar for matrix cumsum.
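For instance, something like the following for the Matrix overload. I am
writing the d == 0 case from memory as accumulating along each row, so
please double-check that direction against the spec; the other value of d
would simply swap the roles of the two loops:

template <dimension_type d,
          typename T0,
          typename T1,
          typename Block0,
          typename Block1>
void
cumsum(
  const_Matrix<T0, Block0> v,
  Matrix<T1, Block1>       w)
VSIP_NOTHROW
{
  assert(v.size(0) == w.size(0) && v.size(1) == w.size(1));

  // Assumed reading of the spec: d == 0 accumulates along each row.
  // (This sketch does not branch on d; the real version would.)
  for (index_type i = 0; i < v.size(0); ++i)
  {
    T1 sum = T1();
    for (index_type j = 0; j < v.size(1); ++j)
    {
      sum += v.get(i, j);
      w.put(i, j, sum);
    }
  }
}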