From Wayne.Haney at gd-ais.com  Tue Mar  6 14:53:00 2007
From: Wayne.Haney at gd-ais.com (Haney, Wayne W.)
Date: Tue, 6 Mar 2007 09:53:00 -0500
Subject: Needing help with VSIPL++ nomenclature...
Message-ID: <3D54FD86EBFE0540BDA2048CE1A4F258AA06DA@vaff06-mail01.ad.gd-ais.com>

The last few days I've been researching VSIPL & VSIPL++ in order to
integrate into our Submarine SONAR code. The VSIPL++ specification is
confusing with the definition of what a Domain<> is. Could you please
elucidate a bit more on what Domain<>s are? Any information you can give
is greatly appreciated.

 
Thank you!

Wayne Haney


From jules at codesourcery.com  Tue Mar  6 16:03:10 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 06 Mar 2007 11:03:10 -0500
Subject: [patch] Test updates
Message-ID: <45ED90BE.7030104@codesourcery.com>

This patch:
  - Splits scalar-view.cpp into 4 separate tests.  Compiling scalar-view
    was taking 2 GB of core!

  - Adds TEST_LEVEL=0 cases for some long running tests.

  - Disables some double-precision tests when TEST_DOUBLE is not defined.

Patch applied.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070306/2ae68431/attachment.ksh>

From don at codesourcery.com  Tue Mar  6 16:50:54 2007
From: don at codesourcery.com (Don McCoy)
Date: Tue, 06 Mar 2007 09:50:54 -0700
Subject: [vsipl++] Needing help with VSIPL++ nomenclature...
In-Reply-To: <3D54FD86EBFE0540BDA2048CE1A4F258AA06DA@vaff06-mail01.ad.gd-ais.com>
References: <3D54FD86EBFE0540BDA2048CE1A4F258AA06DA@vaff06-mail01.ad.gd-ais.com>
Message-ID: <45ED9BEE.3050908@codesourcery.com>

Haney, Wayne W. wrote:
>
> The last few days I've been researching VSIPL & VSIPL++ in order to 
> integrate into our Submarine SONAR code. The VSIPL++ specification is 
> confusing with the definition of what a Domain<> is. Could you please 
> elucidate a bit more on what Domain<>s are? Any information you can 
> give is greatly appreciated.
>
Domains provide an efficient way to specify a subset of the elements in 
a particular view, i.e. a matrix or a vector. In specifying a domain, 
one gives the starting index, an element-to-element stride and a length. 
Using domains is one way to extract a "sub-view" from a set of data, for 
cases where built-in methods are inadequate.
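
As a rough illustration of the three numbers involved, here is a minimal
sketch that does not use the library at all. Domain1, indices() and
gather() are hypothetical stand-ins invented for this example; they only
mimic the (first index, stride, length) idea described above, and a real
subview would refer to the original data rather than copy it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a one-dimensional domain. This is NOT the
// VSIPL++ class, just the same three numbers.
struct Domain1
{
  std::ptrdiff_t first;   // starting index
  std::ptrdiff_t stride;  // element-to-element stride
  std::size_t    length;  // number of elements selected
};

// List the indices a domain selects: first, first + stride, ...
std::vector<std::ptrdiff_t> indices(Domain1 const &d)
{
  std::vector<std::ptrdiff_t> out;
  for (std::size_t i = 0; i != d.length; ++i)
    out.push_back(d.first + static_cast<std::ptrdiff_t>(i) * d.stride);
  return out;
}

// Copy the selected elements out of a vector (a copy, for simplicity).
std::vector<float> gather(std::vector<float> const &v, Domain1 const &d)
{
  std::vector<float> out;
  for (std::ptrdiff_t i : indices(d))
  {
    assert(i >= 0 && static_cast<std::size_t>(i) < v.size());
    out.push_back(v[static_cast<std::size_t>(i)]);
  }
  return out;
}
```

For instance, a domain with first 0, stride 2 and length 8 selects every
other element of a 16-element vector: indices 0, 2, 4, ..., 14.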

Normally, subviews are expressed through various, more convenient, view 
member functions which depend on the type of view being considered. For 
example, complex views provide the real() and imag() subviews that one 
would expect. These could also be expressed with domains, but one must 
then be concerned with whether the data is held in split or interleaved 
forms.

Other examples of subviews are the row() and col() operators for 
matrices, which again may be expressed using domains, but only given 
foreknowledge of whether the data is stored in row-major or col-major 
form, etc...
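
To make the storage-order dependence concrete, the sketch below computes
where the elements of one matrix column live in a flat, densely packed
buffer. column_offsets() is a hypothetical helper written for this
example, not a library function; it assumes dense storage with no
padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offsets into a flat buffer for column c of a rows x cols matrix.
// The starting offset and the stride both depend on the storage order,
// which is the foreknowledge a hand-written domain would need.
std::vector<std::size_t>
column_offsets(std::size_t rows, std::size_t cols, std::size_t c,
               bool row_major)
{
  assert(c < cols);
  std::size_t const first  = row_major ? c    : c * rows;
  std::size_t const stride = row_major ? cols : 1;
  std::vector<std::size_t> out;
  for (std::size_t r = 0; r != rows; ++r)
    out.push_back(first + r * stride);
  return out;
}
```

For a 3 x 4 matrix, column 1 sits at offsets 1, 5, 9 when stored
row-major and at offsets 3, 4, 5 when stored column-major. A col()
subview hides that difference from the caller.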

In short, domains are an important construct but are not often needed 
(at least in application programs) due to the high-level syntax provided 
by the VSIPL++ standard.

I'd encourage you to take a look at the tutorial if you haven't already. 
It contains an in-depth example (fast convolution), which is relevant to 
a number of signal processing algorithms. The reference section included 
in part two discusses views, subviews and domains among other things.

http://www.codesourcery.com/vsiplplusplus/1.3/tutorial.pdf


I personally find it easier to work from things like the tutorial or 
from functional examples. We do not at this time have as many examples 
as I would like, but we are adding more all the time. If you have any 
specific requests, please don't hesitate to contact us again. We'll do 
everything we can to assist you in your evaluation! Feedback is 
appreciated as well.

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712


From jules at codesourcery.com  Wed Mar  7 01:00:09 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 06 Mar 2007 20:00:09 -0500
Subject: [patch] Split fastconv benchmark
Message-ID: <45EE0E99.1070006@codesourcery.com>

This patch splits the Cbe benchmark cases into 
benchmarks/cell/fastconv.cpp, and puts common bits into 
benchmarks/fastconv.hpp.

Patch applied.

				-- Jules
-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fc.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070306/192a0481/attachment.ksh>

From jules at codesourcery.com  Wed Mar  7 02:01:29 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 06 Mar 2007 21:01:29 -0500
Subject: [patch] Missing file
Message-ID: <45EE1CF9.5090208@codesourcery.com>

This file is used by the cell/fastconv benchmark for block creation.  It 
can create a block using either user-provided storage or 
library-provided storage.

Patch applied.
-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ab.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070306/3d286cc9/attachment.ksh>

From jules at codesourcery.com  Wed Mar  7 02:07:52 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 06 Mar 2007 21:07:52 -0500
Subject: [vsipl++] [patch] Missing file
In-Reply-To: <45EE1CF9.5090208@codesourcery.com>
References: <45EE1CF9.5090208@codesourcery.com>
Message-ID: <45EE1E78.5050103@codesourcery.com>

This patch removes a debug assert that used a private member function, 
causing a compile error.

Patch applied.

				-- Jules


-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ab2.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070306/b31d78f0/attachment.ksh>

From jules at codesourcery.com  Wed Mar  7 16:24:21 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 07 Mar 2007 11:24:21 -0500
Subject: [patch] RBO preview
Message-ID: <45EEE735.1000302@codesourcery.com>

This patch
  - adds RBO support,
  - applies it to by-value FFT and FFTM,
  - adds a simple RBO evaluator for expressions like 'A = fft(B)'
    which avoids the temporary and copy,
  - adds fastconv RBO evaluators for the general case using Fftm
    underneath, and the special case when using Cbe Fastconv underneath.
  - adds single-line fastconv case to the fastconv benchmark

This patch is fairly close to ready.  However there are a few bits missing:
  - add distributed support for Return_expr_blocks
  - validation (this does work for the fastconv benchmark)

Comments?

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rbo.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070307/f6773262/attachment.ksh>

From don at codesourcery.com  Wed Mar  7 17:38:17 2007
From: don at codesourcery.com (Don McCoy)
Date: Wed, 07 Mar 2007 10:38:17 -0700
Subject: [patch] support for non-contiguous rows or columns with Cell FFTM
Message-ID: <45EEF889.8090303@codesourcery.com>

Ok to commit?


-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fnc.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070307/f2da808d/attachment.ksh>

From stefan at codesourcery.com  Wed Mar  7 18:32:25 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Wed, 07 Mar 2007 13:32:25 -0500
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45EEF889.8090303@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com>
Message-ID: <45EF0539.7030706@codesourcery.com>

Don McCoy wrote:

> Index: src/vsip/opt/cbe/ppu/fft.cpp
> ===================================================================
> --- src/vsip/opt/cbe/ppu/fft.cpp	(revision 165069)
> +++ src/vsip/opt/cbe/ppu/fft.cpp	(working copy)
> @@ -306,7 +306,20 @@
>  			    length_type, length_type)
>    {
>    }
> -
> +  virtual void query_layout(Rt_layout<2> &rtl_inout)
> +  {
> +    // must have unit stride, but does not have to be dense
> +    rtl_inout.pack = stride_unit;
> +    rtl_inout.order = tuple<0, 1, 2>();

Since we want unit-stride in the direction in which the FFT is taken,
we need to take the axis parameter 'A' into account.
So, for example:

    if (A == 0) rtl_inout.order = tuple<0, 1, 2>();
    else rtl_inout.order = tuple<1, 0, 2>();

> +    rtl_inout.complex = cmplx_inter_fmt;
> +  }
> +  virtual void query_layout(Rt_layout<2> &rtl_in, Rt_layout<2> &rtl_out)
> +  {
> +    // must have unit stride, but does not have to be dense
> +    rtl_in.pack = rtl_out.pack = stride_unit;
> +    rtl_in.order = rtl_out.order = tuple<0, 1, 2>();

Same here.


> +    rtl_in.complex = rtl_out.complex = cmplx_inter_fmt;
> +  }
>  private:
>    rtype scale_;
>    length_type fft_length_;


Regards,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718


From stefan at codesourcery.com  Thu Mar  8 21:57:27 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Thu, 08 Mar 2007 16:57:27 -0500
Subject: patch: conditionalize support for bool and int C-VSIPL views.
Message-ID: <45F086C7.10304@codesourcery.com>

The attached patch adds checks for bool and int view creation
in the used C-VSIPL library, and conditionalizes appropriate
View_traits<>.

OK to check in ?

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cvsip.patch
Type: text/x-patch
Size: 16015 bytes
Desc: not available
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070308/c8da768d/attachment.bin>

From jules at codesourcery.com  Fri Mar  9 16:05:14 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 09 Mar 2007 11:05:14 -0500
Subject: [vsipl++] patch: conditionalize support for bool and int C-VSIPL
 views.
In-Reply-To: <45F086C7.10304@codesourcery.com>
References: <45F086C7.10304@codesourcery.com>
Message-ID: <45F185BA.4070304@codesourcery.com>

Stefan Seefeld wrote:
> The attached patch adds checks for bool and int view creation
> in the used C-VSIPL library, and conditionalizes appropriate
> View_traits<>.
> 
> OK to check in ?

Stefan, Yes, please check it in. thanks, -- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From don at codesourcery.com  Fri Mar  9 17:43:27 2007
From: don at codesourcery.com (Don McCoy)
Date: Fri, 09 Mar 2007 10:43:27 -0700
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45EF0539.7030706@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com>
Message-ID: <45F19CBF.9030203@codesourcery.com>

Stefan Seefeld wrote:
> Since we want unit-stride in the direction in which the FFT is taken,
> we need to take the axis parameter 'A' into account.
>   
Thanks for catching that Stefan.  I've fixed the attached patch and 
tested it locally.  I'm presently adding test cases to the FFT "backend" 
test (tests/fft_be.cpp), but will include those changes in a separate patch.

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fnc2.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070309/3de9763f/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fnc2.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070309/3de9763f/attachment-0001.ksh>

From don at codesourcery.com  Mon Mar 12 19:47:39 2007
From: don at codesourcery.com (Don McCoy)
Date: Mon, 12 Mar 2007 13:47:39 -0600
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45F19CBF.9030203@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com> <45F19CBF.9030203@codesourcery.com>
Message-ID: <45F5AE5B.1050604@codesourcery.com>

Don McCoy wrote: 
> Thanks for catching that Stefan.  I've fixed the attached patch and 
> tested it locally.  I'm presently adding test cases to the FFT 
> "backend" test (tests/fft_be.cpp), but will include those changes in a 
> separate patch.

And here is that patch.  In putting this together I found and fixed two 
defects in the Cell FFT code.  Yay!  The tests for FFTM also cover using 
column-major data when doing column-wise FFT's, including the case where 
the columns are not dense (tightly packed), as is the case when a 
subview of a matrix is taken (e.g. every other row).

Note that this patch also includes the changes from the previous patch.

The changes in FFT require that FFT_BE_TESTS be defined in order to run 
the backend-specific tests.  Without it, the test presently fails to 
compile because the CBE backend does not support 2-D and 3-D FFT's as of 
yet.

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cbe_tests.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070312/f6241dc0/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cbe_tests.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070312/f6241dc0/attachment-0001.ksh>

From stefan at codesourcery.com  Mon Mar 12 20:10:29 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Mon, 12 Mar 2007 16:10:29 -0400
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45F5AE5B.1050604@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com> <45F19CBF.9030203@codesourcery.com> <45F5AE5B.1050604@codesourcery.com>
Message-ID: <45F5B3B5.6090403@codesourcery.com>

Don McCoy wrote:

> The changes in FFT require that FFT_BE_TESTS be defined in order to run
> the backend-specific tests.  Without it, the test presently fails to
> compile because the CBE backend does not support 2-D and 3-D FFT's as of
> yet.

Is that because the cbe evaluator doesn't invalidate 2D and 3D operations,
or because you didn't configure with --enable-fft=no_fft (or any real fft,
for that matter) ?

I'm asking because fft_be is designed to generate only runtime errors,
so all those 'little' subtests can be driven by a single executable.

Thanks,
		Stefan


-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718


From stefan at codesourcery.com  Mon Mar 12 22:37:26 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Mon, 12 Mar 2007 18:37:26 -0400
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45F5AE5B.1050604@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com> <45F19CBF.9030203@codesourcery.com> <45F5AE5B.1050604@codesourcery.com>
Message-ID: <45F5D626.4080609@codesourcery.com>

Don McCoy wrote:

> Index: src/vsip/opt/cbe/ppu/fft.cpp
> ===================================================================
> --- src/vsip/opt/cbe/ppu/fft.cpp	(revision 165340)
> +++ src/vsip/opt/cbe/ppu/fft.cpp	(working copy)
> @@ -53,18 +53,16 @@
>    fft(std::complex<T> const* in, std::complex<T>* out, 
>      length_type length, T scale, int exponent)
>    {
> -    // Note: the twiddle factors require only 1/4 the memory of the input and 
> -    // output arrays.
>      Fft_params fftp;
>      fftp.direction = (exponent == -1 ? fwd_fft : inv_fft);
>      fftp.elements = length;
>      fftp.scale = scale;
>      fftp.ea_twiddle_factors = 
>        reinterpret_cast<unsigned long long>(twiddle_factors_.get());
> -    fftp.ea_input_buffer    = 0;
> -    fftp.ea_output_buffer   = 0;
> -    fftp.in_blk_stride      = 0;
> -    fftp.out_blk_stride     = 0;
> +    fftp.ea_input_buffer    = reinterpret_cast<unsigned long long>(in);
> +    fftp.ea_output_buffer   = reinterpret_cast<unsigned long long>(out);
> +    fftp.in_blk_stride      = 1;  // not applicable in the single FFT case
> +    fftp.out_blk_stride     = 1;
>  
>      Task_manager *mgr = Task_manager::instance();
>      // The stack size is determined by accounting for the *worst case*
> @@ -76,11 +74,9 @@
>         sizeof(Fft_params),
>         sizeof(complex<T>)*length*2, 

Could you please add a comment explaining this factor '2' ? It isn't obvious...

>         sizeof(complex<T>)*length,
> -       false);
> -    Workblock block = task.create_block();
> +       true);
> +    Workblock block = task.create_multi_block(1);
>      block.set_parameters(fftp);
> -    block.add_input(in, length);
> -    block.add_output(out, length);
>      task.enqueue(block);
>      task.sync();
>    }

[...]

> Index: tests/fft_be.cpp
> ===================================================================
> --- tests/fft_be.cpp	(revision 165340)
> +++ tests/fft_be.cpp	(working copy)

[...]

> @@ -152,24 +166,33 @@
>    static Domain<D> out_dom(Domain<D> const &dom) { return dom;}
>  };
>  
> -template <typename T>
> +template <typename T,
> +          typename OrderT>
>  const_Vector<T, impl::Generator_expr_block<1, impl::Ramp_generator<T> > const>
>  ramp(Domain<1> const &dom) 
>  { return vsip::ramp(T(0.), T(1.), dom.length() * dom.stride());}
>  
> -template <typename T>
> -Matrix<T>
> +template <typename T,
> +          typename OrderT>
> +Matrix<T, Dense<2, T, OrderT> >
>  ramp(Domain<2> const &dom) 
>  {
> +  typedef OrderT order_type;
> +  typedef Dense<2, T, order_type> block_type;
>    length_type rows = dom[0].length() * dom[0].stride();
>    length_type cols = dom[1].length() * dom[1].stride();
> -  Matrix<T> m(rows, cols);
> -  for (size_t r = 0; r != rows; ++r)
> -    m.row(r) = ramp(T(r), T(1.), m.size(1));
> +  Matrix<T, block_type> m(rows, cols);
> +  if (impl::Type_equal<row2_type, order_type>::value)
> +    for (size_t r = 0; r != rows; ++r)
> +      m.row(r) = ramp(T(r), T(1.), m.size(1));
> +  else
> +    for (size_t c = 0; c != cols; ++c)
> +      m.col(c) = ramp(T(c), T(1.), m.size(0));
>    return m;
>  }

While I like the addition of the dimension-ordering parameter, I think
the conditional initialization here is a bit misleading: The value of
matrix(x, y) should be the same, no matter its dimension-ordering.
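
One way to express that is to fill the matrix through its logical
(row, column) indices only, so the storage order never appears in the
initialization. The sketch below is a self-contained illustration under
that assumption; Mat, Order and ramp2() are hypothetical names and not
part of the test or the library. It uses m(r, c) = r + c, which is what
the row-wise loop in the quoted code (row r is a ramp starting at r with
step 1) appears to produce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Order { row_major, col_major };

// Hypothetical minimal dense matrix. Only the mapping from (r, c) to a
// storage offset differs between the two orders.
template <typename T>
class Mat
{
public:
  Mat(std::size_t rows, std::size_t cols, Order order)
    : rows_(rows), cols_(cols), order_(order), data_(rows * cols) {}

  T &operator()(std::size_t r, std::size_t c)
  {
    assert(r < rows_ && c < cols_);
    return data_[order_ == Order::row_major ? r * cols_ + c
                                            : c * rows_ + r];
  }

  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }

private:
  std::size_t    rows_;
  std::size_t    cols_;
  Order          order_;
  std::vector<T> data_;
};

// Fill by logical index, so m(r, c) == r + c for either storage order.
template <typename T>
Mat<T> ramp2(std::size_t rows, std::size_t cols, Order order)
{
  Mat<T> m(rows, cols, order);
  for (std::size_t r = 0; r != rows; ++r)
    for (std::size_t c = 0; c != cols; ++c)
      m(r, c) = static_cast<T>(r + c);
  return m;
}
```

With this shape, ramp2<float>(3, 4, Order::row_major) and
ramp2<float>(3, 4, Order::col_major) compare equal element by element,
and there is no order-dependent branch left to read.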

> -template <typename T>
> +template <typename T,
> +          typename OrderT>
>  Tensor<T>
>  ramp(Domain<3> const &dom) 
>  {


[...]

> @@ -222,7 +246,7 @@
>    typedef typename rfft_type<T, F, 1, A>::I I;
>    static typename impl::View_of_dim<D, I, Dense<D, I> >::type
>    create(Domain<D> const &dom) 
> -  { return ramp<I>(rfft_type<T, F, 1, A>::in_dom(dom));}
> +    { return ramp<I, row1_type>(rfft_type<T, F, 1, A>::in_dom(dom));}
>  };

I think with the above in place we should go all the way and push the order parameter
up to the highest level, so all tests get run twice, once for row-major and once for
col-major. That gives maximum coverage.

>  // Real inverse 2D FFT.
> @@ -238,7 +262,7 @@
>      length_type rows2 = rows/2+1;
>      length_type cols2 = cols/2+1;
>  
> -    Matrix<I> input = ramp<I>(rfft_type<T, F, 1, A>::in_dom(dom));
> +    Matrix<I> input = ramp<I, row1_type>(rfft_type<T, F, 1, A>::in_dom(dom));
>      if (rfft_type<T, F, 1, A>::axis == 0)
>      {
>        // Necessary symmetry:
> @@ -330,8 +354,8 @@
>    typedef impl::Fast_block<D, CT, layout_type> block_type;
>    typedef typename impl::View_of_dim<D, CT, block_type>::type View;
>  
> -  View data = ramp<T>(dom);
> -  View ref = ramp<T>(dom);
> +  View data = ramp<T, row1_type>(dom);
> +  View ref = ramp<T, row1_type>(dom);
>  
>    typename View::subview_type sub_data = data(dom);
>  
> @@ -357,9 +381,10 @@
>  {
>    typedef typename T::I I;
>    typedef typename T::O O;
> -  typedef typename impl::Layout<2, row1_type,
> +  typedef typename T::order_type order_type;
> +  typedef typename impl::Layout<2, order_type,
>      impl::Stride_unit_dense, typename T::i_format> i_layout_type;
> -  typedef typename impl::Layout<2, row1_type,
> +  typedef typename impl::Layout<2, order_type,
>      impl::Stride_unit_dense, typename T::o_format> o_layout_type;
>    return_mechanism_type const r = by_reference;
>  
> @@ -371,7 +396,7 @@
>    Domain<2> in_dom = T::in_dom(dom);
>    Domain<2> out_dom = T::out_dom(dom);
>  
> -  Iview input = input_creator<T, 2>::create(dom);
> +  Iview input = input_creator<T, 2, order_type>::create(dom);
>    typename Iview::subview_type sub_input = input(in_dom);
>  
>    Oview output = empty<O>(out_dom);
> @@ -408,8 +433,8 @@
>    typedef impl::Fast_block<2, CT, layout_type> block_type;
>    typedef Matrix<CT, block_type> View;
>  
> -  View data = ramp<T>(dom);
> -  View ref = ramp<T>(dom);
> +  View data = ramp<T, row1_type>(dom);
> +  View ref = ramp<T, row1_type>(dom);
>  
>    typename View::subview_type sub_data = data(dom);
>  
> @@ -498,6 +523,13 @@
>    fft_in_place<T, F, 1, cvsip>(Domain<1>(0, 2, 8));
>  #endif
>  
> +#if VSIP_IMPL_CBE_SDK
> +  std::cout << "testing fwd in_place cbe...";
> +  fft_in_place<T, F, -1, cbe>(Domain<1>(32));
> +  std::cout << "testing inv in_place cbe...";
> +  fft_in_place<T, F, 1, cbe>(Domain<1>(32));
> +#endif
> +
>  #if VSIP_IMPL_FFTW3
>    std::cout << "testing c->c fwd by_ref fftw...";
>    fft_by_ref<cfft_type<T, F, -1>, fftw>(Domain<1>(16));
> @@ -558,7 +590,14 @@
>    fft_by_ref<rfft_type<T, F, 1, 0>, cvsip>(Domain<1>(0, 2, 8));
>  #endif
>  
> +#if VSIP_IMPL_CBE_SDK
> +  std::cout << "testing c->c fwd by_ref cbe...";
> +  fft_by_ref<cfft_type<T, F, -1>, cbe>(Domain<1>(32));
> +  std::cout << "testing c->c inv by_ref cbe...";
> +  fft_by_ref<cfft_type<T, F, 1>, cbe>(Domain<1>(32));
>  #endif
> +
> +#endif
>  }
>  
>  template <typename T, typename F>
> @@ -902,6 +941,23 @@
>    fftm_in_place<T, F, 1, 1, cvsip>(Domain<2>(8, 16));
>  #endif
>  
> +#if VSIP_IMPL_CBE_SDK
> +// Note: column-wise FFTs need to be performed on
> +// col-major data in this case.  These are commented
> +// out until fftm_in_place is changed to be like
> +// fftm_by_ref, where the cfft_type<> template allows
> +// the dimension order to be specified.

That's OK, though I believe we should fix that as soon as possible,
so that fft_be.cpp remains as backend-agnostic as possible,
i.e. no backend-specific tests creep in.

(I can complete that if you are busy finishing other bits.)

> +
> +//  std::cout << "testing fwd on cols in_place cbe...";
> +//  fftm_in_place<T, F, -1, 0, cbe>(Domain<2>(64, 32));
> +  std::cout << "testing fwd on rows in_place cbe...";
> +  fftm_in_place<T, F, -1, 1, cbe>(Domain<2>(32, 64));
> +//  std::cout << "testing inv on cols in_place cbe...";
> +//  fftm_in_place<T, F, 1, 0, cbe>(Domain<2>(64, 32));
> +  std::cout << "testing inv on rows in_place cbe...";
> +  fftm_in_place<T, F, 1, 1, cbe>(Domain<2>(32, 64));
> +#endif
> +
>  #if VSIP_IMPL_FFTW3
>    std::cout << "testing c->c fwd 0 by_ref fftw...";
>    fftm_by_ref<cfft_type<T, F, -1, 0>, fftw>(Domain<2>(8, 16));
> @@ -978,7 +1034,24 @@
>    fftm_by_ref<rfft_type<T, F, 1, 1>, cvsip> (Domain<2>(4, 16));
>  #endif
>  
> +#if VSIP_IMPL_CBE_SDK
> +  std::cout << "testing c->c fwd on cols by_ref cbe...";
> +  fftm_by_ref<cfft_type<T, F, -1, 0, col2_type>, cbe>(Domain<2>(32, 64));
> +  fftm_by_ref<cfft_type<T, F, -1, 0, col2_type>, cbe>(Domain<2>(Domain<1>(32), Domain<1>(0, 2, 32)));
> +  std::cout << "testing c->c fwd on rows by_ref cbe...";
> +  fftm_by_ref<cfft_type<T, F, -1, 1, row2_type>, cbe>(Domain<2>(32, 64));
> +  fftm_by_ref<cfft_type<T, F, -1, 1, row2_type>, cbe>(Domain<2>(Domain<1>(0, 2, 32), Domain<1>(64)));
> +  std::cout << "testing c->c inv 0 by_ref cbe...";
> +  fftm_by_ref<cfft_type<T, F, 1, 0, col2_type>, cbe>(Domain<2>(32, 64));
> +  fftm_by_ref<cfft_type<T, F, 1, 0, col2_type>, cbe>(Domain<2>(Domain<1>(32), Domain<1>(0, 2, 32)));
> +  std::cout << "testing c->c inv 1 by_ref cbe...";
> +  fftm_by_ref<cfft_type<T, F, 1, 1, row2_type>, cbe>(Domain<2>(32, 64));
> +  fftm_by_ref<cfft_type<T, F, 1, 1, row2_type>, cbe>(Domain<2>(Domain<1>(0, 2, 32), Domain<1>(64)));
>  #endif
> +
> +
> +
> +#endif
>  }
>  
>  int main(int argc, char **argv)

Same here.

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718


From jules at codesourcery.com  Tue Mar 13 16:22:40 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 13 Mar 2007 12:22:40 -0400
Subject: [patch] RBO - Re: [vsipl++] [patch] RBO preview
In-Reply-To: <45EEE735.1000302@codesourcery.com>
References: <45EEE735.1000302@codesourcery.com>
Message-ID: <45F6CFD0.3090807@codesourcery.com>

This is an updated RBO patch that should be ready to check in.  It adds 
support for distributed expressions, and has been validated (all tests 
pass using the IPP/MKL backends on cugel, and all tests either pass or 
use too much VM using the FFTW backend on belgarath).

It also has the following:
  - Fft_return_functor is now templatized by block type, rather than
    view type (it continues to store the operand by block).  This makes
    the conversion from distributed block to local block easier.
  - Return_block and Fft_return_functor properly hide their member
    data, and provide accessor functions.  This necessitates using
    const references/pointers to FFT Workspaces and FFT backends.
    I made the workspace member functions const correct, but did not
    attempt this for the backends.
  - Diagnostics for ext_data.
  - Moves files around: RBO is part of the optimized implementation,
    not the ref-impl.
  - Adds error checking to fastconv benchmark.

It includes some unrelated benchmark updates for characterizing 
performance on the PowerStream.

Ok to commit?

				-- Jules

Jules Bergmann wrote:
> This patch
>  - adds RBO support,
>  - applies it to by-value FFT and FFTM,
>  - adds a simple RBO evaluator for expressions like 'A = fft(B)'
>    which avoids the temporary and copy,
>  - adds fastconv RBO evaluators for the general case using Fftm
>    underneath, and the special case when using Cbe Fastconv underneath.
>  - adds single-line fastconv case to the fastconv benchmark
> 
> This patch is fairly close to ready.  However there are a few bits missing:
>  - add distributed support for Return_expr_blocks
>  - validation (this does work for the fastconv benchmark)
> 
> Comments?

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rbo.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070313/aa32fbd9/attachment.ksh>

From jules at codesourcery.com  Tue Mar 13 18:18:47 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 13 Mar 2007 14:18:47 -0400
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45F5AE5B.1050604@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com> <45F19CBF.9030203@codesourcery.com> <45F5AE5B.1050604@codesourcery.com>
Message-ID: <45F6EB07.2060207@codesourcery.com>


Don,

These changes to ppu/fft.cpp look good.  I have a minor suggestion
below, but otherwise please check it in.

I'll defer to Stefan on the fft_be.cpp changes.  Once he is happy,
please check them in too.

				-- Jules

 > Index: src/vsip/opt/cbe/ppu/fft.cpp
 > ===================================================================

 > -    fftp.ea_input_buffer    = 0;
 > -    fftp.ea_output_buffer   = 0;
 > -    fftp.in_blk_stride      = 0;
 > -    fftp.out_blk_stride     = 0;
 > +    fftp.ea_input_buffer    = reinterpret_cast<unsigned long long>(in);
 > +    fftp.ea_output_buffer   = reinterpret_cast<unsigned long long>(out);
 > +    fftp.in_blk_stride      = 1;  // not applicable in the single FFT case
 > +    fftp.out_blk_stride     = 1;

I would keep the strides set to 0 if they aren't applicable.  That
way, if the SPE kernel needs to check them for some reason, it can
assume non-zero strides are valid.
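
A minimal sketch of the kind of check this convention allows is below.
Fft_params_sketch, strides_apply() and input_offset() are hypothetical
and written only for illustration; the field names follow the quoted
patch, but this is not the real Fft_params struct or the SPE kernel
code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical parameter block. 0 is reserved to mean "not applicable".
struct Fft_params_sketch
{
  std::size_t in_blk_stride;
  std::size_t out_blk_stride;
  std::size_t elements;
};

// With 0 reserved as the sentinel, any non-zero stride can be trusted.
inline bool strides_apply(Fft_params_sketch const &p)
{
  return p.in_blk_stride != 0 && p.out_blk_stride != 0;
}

// Element offset of the start of input block `block`.
inline std::size_t
input_offset(Fft_params_sketch const &p, std::size_t block)
{
  if (!strides_apply(p))
  {
    assert(block == 0);  // single-FFT case: only one block exists
    return 0;
  }
  return block * p.in_blk_stride;
}
```

If the single-FFT case stored 1 instead, the consumer could not tell a
placeholder apart from a genuine stride of one element.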

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From don at codesourcery.com  Tue Mar 13 19:18:10 2007
From: don at codesourcery.com (Don McCoy)
Date: Tue, 13 Mar 2007 13:18:10 -0600
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45F6EB07.2060207@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com> <45F19CBF.9030203@codesourcery.com> <45F5AE5B.1050604@codesourcery.com> <45F6EB07.2060207@codesourcery.com>
Message-ID: <45F6F8F2.5040305@codesourcery.com>

Jules Bergmann wrote:
> These changes to ppu/fft.cpp look good.  I have a minor suggestion
> below, but otherwise please check it in.
>
> I'll defer to Stefan on the fft_be.cpp changes.  Once he is happy,
> please check them in too.
>
Committed as attached.  I believe all comments have been addressed.

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fnc3.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070313/a5deacef/attachment.ksh>

From don at codesourcery.com  Tue Mar 13 19:27:45 2007
From: don at codesourcery.com (Don McCoy)
Date: Tue, 13 Mar 2007 13:27:45 -0600
Subject: [patch] SPU timer
In-Reply-To: <45F6D712.8000408@codesourcery.com>
References: <45F6204B.7060107@codesourcery.com> <45F69117.90300@codesourcery.com> <45F6D712.8000408@codesourcery.com>
Message-ID: <45F6FB31.1040500@codesourcery.com>

This file is not included (presently) from any other file, but it is 
useful for debugging and testing.

Committed.

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: timer.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070313/8e56fdaa/attachment.ksh>

From stefan at codesourcery.com  Wed Mar 14 03:02:56 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Tue, 13 Mar 2007 23:02:56 -0400
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45F5AE5B.1050604@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com> <45F19CBF.9030203@codesourcery.com> <45F5AE5B.1050604@codesourcery.com>
Message-ID: <45F765E0.7000506@codesourcery.com>

Don McCoy wrote:

> The changes in FFT require that FFT_BE_TESTS be defined in order to run
> the backend-specific tests.  Without it, the test presently fails to
> compile because the CBE backend does not support 2-D and 3-D FFT's as of
> yet.

The attached patch fixes the CBE FFT Evaluator to only enable 1D FFTs. With
that the above workaround isn't needed. Checked in.

Regards,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft.hpp.diff
Type: text/x-patch
Size: 1300 bytes
Desc: not available
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070313/bb998643/attachment.bin>

From stefan at codesourcery.com  Wed Mar 14 03:58:40 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Tue, 13 Mar 2007 23:58:40 -0400
Subject: [vsipl++] [patch] support for non-contiguous rows or columns
 with Cell FFTM
In-Reply-To: <45F5D626.4080609@codesourcery.com>
References: <45EEF889.8090303@codesourcery.com> <45EF0539.7030706@codesourcery.com> <45F19CBF.9030203@codesourcery.com> <45F5AE5B.1050604@codesourcery.com> <45F5D626.4080609@codesourcery.com>
Message-ID: <45F772F0.2050806@codesourcery.com>

Please find attached a cleanup patch. (Checked in.)
Comments below...

Stefan Seefeld wrote:
> Don McCoy wrote:

> 
>> Index: tests/fft_be.cpp
>> ===================================================================
>> --- tests/fft_be.cpp	(revision 165340)
>> +++ tests/fft_be.cpp	(working copy)
> 
> [...]
> 
>> @@ -152,24 +166,33 @@
>>    static Domain<D> out_dom(Domain<D> const &dom) { return dom;}
>>  };
>>  
>> -template <typename T>
>> +template <typename T,
>> +          typename OrderT>
>>  const_Vector<T, impl::Generator_expr_block<1, impl::Ramp_generator<T> > const>
>>  ramp(Domain<1> const &dom) 
>>  { return vsip::ramp(T(0.), T(1.), dom.length() * dom.stride());}
>>  
>> -template <typename T>
>> -Matrix<T>
>> +template <typename T,
>> +          typename OrderT>
>> +Matrix<T, Dense<2, T, OrderT> >
>>  ramp(Domain<2> const &dom) 
>>  {
>> +  typedef OrderT order_type;
>> +  typedef Dense<2, T, order_type> block_type;
>>    length_type rows = dom[0].length() * dom[0].stride();
>>    length_type cols = dom[1].length() * dom[1].stride();
>> -  Matrix<T> m(rows, cols);
>> -  for (size_t r = 0; r != rows; ++r)
>> -    m.row(r) = ramp(T(r), T(1.), m.size(1));
>> +  Matrix<T, block_type> m(rows, cols);
>> +  if (impl::Type_equal<row2_type, order_type>::value)
>> +    for (size_t r = 0; r != rows; ++r)
>> +      m.row(r) = ramp(T(r), T(1.), m.size(1));
>> +  else
>> +    for (size_t c = 0; c != cols; ++c)
>> +      m.col(c) = ramp(T(c), T(1.), m.size(0));
>>    return m;
>>  }
> 
> While I like the addition of the dimension-ordering parameter, I think
> the conditional initialization here is a bit misleading: The value of
> matrix(x, y) should be the same, no matter its dimension-ordering.

Having another look at that code I realized that the layout of the
views created by ramp() (and input_creator::create(), for that matter),
doesn't play any role in the actual tests, as they are assigned to other
views only. Thus, I removed the dimension-ordering parameter from the above,
only adding it to the harness in fft_by_ref and fftm_by_ref.
I still need to change the way fft_in_place as well as fftm_in_place handle
their template parameters, so I can easily add the dimension-ordering there,
too, but I'll defer that to some later point.
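
The earlier point, that matrix(x, y) should not depend on dimension ordering, can be illustrated with a small stand-alone sketch. It does not use the VSIPL++ API: RowMajor, ColMajor, SimpleMatrix and fill_ramp are hypothetical names made up for this illustration, and the fill value 10*r + c is arbitrary.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical storage-order tags (stand-ins, not the VSIPL++ types).
struct RowMajor {};
struct ColMajor {};

// Minimal matrix whose memory layout depends on Order,
// but whose logical indexing (r, c) does not.
template <typename T, typename Order>
class SimpleMatrix
{
public:
  SimpleMatrix(std::size_t rows, std::size_t cols)
    : rows_(rows), cols_(cols), data_(rows * cols) {}

  T& operator()(std::size_t r, std::size_t c)
  { return data_[offset(r, c, Order())]; }

  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }
  T const* data() const { return data_.data(); }

private:
  // Row-major: elements of one row are adjacent in memory.
  std::size_t offset(std::size_t r, std::size_t c, RowMajor) const
  { return r * cols_ + c; }
  // Column-major: elements of one column are adjacent in memory.
  std::size_t offset(std::size_t r, std::size_t c, ColMajor) const
  { return c * rows_ + r; }

  std::size_t rows_, cols_;
  std::vector<T> data_;
};

// Fill by logical index, so m(r, c) is the same for every layout.
template <typename T, typename Order>
void fill_ramp(SimpleMatrix<T, Order>& m)
{
  for (std::size_t r = 0; r != m.rows(); ++r)
    for (std::size_t c = 0; c != m.cols(); ++c)
      m(r, c) = T(10 * r + c);
}
```

Only the raw memory order differs between the two instantiations; a test that compares views element by element sees identical values.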

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft_be.cpp.diff
Type: text/x-patch
Size: 4065 bytes
Desc: not available
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070313/c10d496a/attachment.bin>

From jules at codesourcery.com  Fri Mar 16 15:22:38 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 16 Mar 2007 11:22:38 -0400
Subject: [patch] Cell fixes
Message-ID: <45FAB63E.7000407@codesourcery.com>

This patch
  - Adds loop fusion init/fini calls before block copy uses get to
    access values.  This is necessary if copying from an expression
    block with Return_blocks.

    This is a temporary fix.  The real fix is to make get/put do the
    right thing for Return_blocks, that is, check whether the result has
    been computed.  In cases where get is called so many times that this
    overhead would have an impact (such as loop fusion evaluators), an
    expression tree transformation would be done to replace the return
    blocks with regular blocks.  However, this is a day or two of work,
    so I've created an issue (#132).

  - Pulls additional command line arguments from the SVPP_OPT environment
    variable.

	export SVPP_OPTS="--svpp-num-spes 1"
	./run-program

    is equivalent to

	./run-program --svpp-num-spes 1

    I added this to run 'make check' without using all the 8 SPEs, but it
    should be useful for other things as well.

  - Fixes Cbe dispatch for vmul to check if type is supported (some tests
    were attempting to perform double and int vector-multiplies).

Patch applied, but suggestions for simplifying how arguments are pulled 
from the environment are welcome!
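
One possible simplification, sketched without reference to the attached patch: split the variable's value on whitespace and append the tokens to a copy of argv. The helper names split_options and merge_args are hypothetical, and quoting inside the value is not handled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a whitespace-separated option string into tokens.
std::vector<std::string> split_options(std::string const& value)
{
  std::vector<std::string> tokens;
  std::istringstream in(value);
  std::string tok;
  while (in >> tok)
    tokens.push_back(tok);
  return tokens;
}

// Return argv[0..argc) followed by the options found in env_value.
// env_value may be a null pointer (variable not set).
std::vector<std::string> merge_args(int argc, char const* const* argv,
                                    char const* env_value)
{
  std::vector<std::string> args(argv, argv + argc);
  if (env_value)
  {
    std::vector<std::string> extra = split_options(env_value);
    args.insert(args.end(), extra.begin(), extra.end());
  }
  return args;
}
```

A caller would pass the result of std::getenv() for the option variable as env_value and then parse the merged vector in place of the original argv.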

				thanks
				-- Jules


-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: misc.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070316/ed4adecb/attachment.ksh>

From stefan at codesourcery.com  Fri Mar 16 15:32:54 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Fri, 16 Mar 2007 11:32:54 -0400
Subject: [vsipl++] [patch] Cell fixes
In-Reply-To: <45FAB63E.7000407@codesourcery.com>
References: <45FAB63E.7000407@codesourcery.com>
Message-ID: <45FAB8A6.9010703@codesourcery.com>

Jules Bergmann wrote:

>  - Pulls additional command line arguments from the SVPP_OPT environment
>    variable.
> 
>     export SVPP_OPTS="--svpp-num-spes 1"
>     ./run-program
> 
>    Is equiv to
> 
>     ./run-program --svpp-num-spes 1
> 
>    I added this to run 'make check' without using all the 8 SPEs, but it
>    should be useful for other things as well.

I'm not sure the VSIPL++ library is the best place to do this. If users
want to pass extra arguments by default, why can't they just write a little
wrapper script themselves? (We may even provide a generic wrapper to do
this, if there is a common use case for it.)
My point is that (shell) scripts are much better suited to this kind of
command-line argument / environment meddling than C++ code. We could even
make it work more easily for Windows that way. :-)


Regards,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718


From jules at codesourcery.com  Fri Mar 16 18:02:07 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 16 Mar 2007 14:02:07 -0400
Subject: [patch] Avoid invalid DMA sizes for vmul
Message-ID: <45FADB9F.4020605@codesourcery.com>

This patch fixes the cleanup code to avoid DMA sizes that aren't a 
multiple of 16.  This fixes test failures for coverage_binary.  It also 
adds a new regression test that sweeps through vmul sizes from 1 to 128.

Don, is this ok to commit?  Is there a better place than bindings.hpp to 
put the GRANULARITY macro and is_dma_size_ok() function?
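
A minimal sketch of what such a check could look like, based only on the multiple-of-16 rule stated above. The names GRANULARITY and is_dma_size_ok mirror the ones mentioned in this message, but the bodies are assumptions, not the contents of bindings.hpp; dma_bulk_length is a hypothetical helper added for illustration.

```cpp
#include <cstddef>

// Assumed DMA granularity in bytes (the multiple-of-16 rule above).
#define GRANULARITY 16

// True if a transfer of `bytes` bytes is a whole number of granules.
inline bool is_dma_size_ok(std::size_t bytes)
{
  return bytes % GRANULARITY == 0;
}

// Hypothetical helper: number of leading elements of type T that can be
// moved by DMA.  The remaining (length - result) elements need separate
// cleanup handling.
template <typename T>
std::size_t dma_bulk_length(std::size_t length)
{
  static_assert(sizeof(T) <= GRANULARITY && GRANULARITY % sizeof(T) == 0,
                "element size must divide the DMA granularity");
  std::size_t const per_granule = GRANULARITY / sizeof(T);
  return (length / per_granule) * per_granule;
}
```

For 4-byte floats, a vector of length 7 would have 4 elements moved in bulk and 3 left over for the cleanup path.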

I'm running a regression now to see how this works out.

				-- Jules


-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dmasize.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070316/4db08253/attachment.ksh>

From don at codesourcery.com  Fri Mar 16 20:32:51 2007
From: don at codesourcery.com (Don McCoy)
Date: Fri, 16 Mar 2007 14:32:51 -0600
Subject: [vsipl++] [patch] Avoid invalid DMA sizes for vmul
In-Reply-To: <45FADB9F.4020605@codesourcery.com>
References: <45FADB9F.4020605@codesourcery.com>
Message-ID: <45FAFEF3.5040201@codesourcery.com>

Jules Bergmann wrote:
> This patch fixes the cleanup code to avoid DMA sizes that aren't a 
> multiple of 16.  This fixes test failures for coverage_binary.  It 
> also adds a new regression test that sweeps through vmul sizes from 1 
> to 128.
>
> Don, is this ok to commit?  Is there a better place than bindings.hpp 
> to put the GRANULARITY macro and is_dma_size_ok() function?
>
I think this is the right place.  But there is also a bit of code in the 
vmul kernels that deals with any leftover values in cases where the 
length is not a multiple of four floats (16 bytes).  We could probably 
get rid of that now.

-- 
Don McCoy
don (at) CodeSourcery  
(888) 776-0262 / (650) 331-3385, x712


From jules at codesourcery.com  Sat Mar 17 02:46:10 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 16 Mar 2007 22:46:10 -0400
Subject: [vsipl++] [patch] Avoid invalid DMA sizes for vmul
In-Reply-To: <45FAFEF3.5040201@codesourcery.com>
References: <45FADB9F.4020605@codesourcery.com> <45FAFEF3.5040201@codesourcery.com>
Message-ID: <45FB5672.7020507@codesourcery.com>

Don McCoy wrote:
> Jules Bergmann wrote:
>> This patch fixes the cleanup code to avoid DMA sizes that aren't a 
>> multiple of 16.  This fixes test failures for coverage_binary.  It 
>> also adds a new regression test that sweeps through vmul sizes from 1 
>> to 128.
>>
>> Don, is this ok to commit?  Is there a better place than bindings.hpp 
>> to put the GRANULARITY macro and is_dma_size_ok() function?
>>
> I think this is the right place.  But there is also a bit of code in the 
> vmul kernels that deals with any leftover values in cases where the 
> length is not a multiple of four floats (16 bytes).  We could probably 
> get rid of that now.
> 

Patch applied.  As before, but with removal of cleanup code for float 
vmul kernel.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dmasize.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070316/d244c720/attachment.ksh>

From jules at codesourcery.com  Sun Mar 18 03:51:48 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Sat, 17 Mar 2007 23:51:48 -0400
Subject: [patch] Misc fixes
Message-ID: <45FCB754.7070200@codesourcery.com>

This patch:

  - Fixes the DFT FFT backend to force the input and output layouts to
    have the same complex format.  Previously attempting to use the
    backend when the input and output had different formats resulted in
    an assertion failure in the workspace.  This was causing the
    regressions/fft_expr_arg test to fail.  A new test
    regressions/fft_split_inter was added for more direct coverage.

  - Adds a 'name()' member to the FFT backends.  This is useful for
    debugging (determining which backend is being used).  It may also
    be useful for diagnostics and profiling in the future.

  - Changes the DFT backend to use double-precision internally for
    accumulation.  This fixes precision differences that were arising
    between the DFT backend and the ref::dft routine.  This was causing
    parallel/fftm to fail.  IIRC it was also causing fft_be to fail when
    testing the DFT backend.

  - Checks DMA address alignment.  Address must have 16-byte alignment
    on the Cell.  This caused vmmul test to fail because vmmul redispatch
    generated vector multiplies that were unaligned (i.e. the second row
    of a 5 x 7 matrix of floats).

  - Updates SIMD traits for AltiVec (also tested on PPC 970FX with GCC
    4.1 and PowerPC 7447A with GreenHills), and adds a unit-test for
    SIMD traits that I've been meaning to check in for some time.

  - Fixes the builtin SIMD vmul routine for split-complex to work
    correctly when the output aliases one of the inputs.  This was
    causing coverage_binary to fail.

    Curiously, ppu-g++ -m32 does not define __VEC__, while ppu32-g++
    does.
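
The aliasing hazard behind the split-complex fix can be shown with a scalar sketch. This is not the library's SIMD routine; split_vmul is a hypothetical name. Both parts of each product are computed into temporaries before either output is written, so the result stays correct when the output arrays are the same as one input's arrays.

```cpp
#include <cstddef>

// Element-wise complex multiply on split-complex data:
//   (ar + i*ai) * (br + i*bi) -> (zr + i*zi)
// Safe when zr/zi alias ar/ai or br/bi, because both results are
// computed before either one is stored.
void split_vmul(float const* ar, float const* ai,
                float const* br, float const* bi,
                float* zr, float* zi, std::size_t n)
{
  for (std::size_t k = 0; k != n; ++k)
  {
    float const re = ar[k] * br[k] - ai[k] * bi[k];
    float const im = ar[k] * bi[k] + ai[k] * br[k];
    zr[k] = re;
    zi[k] = im;
  }
}
```

Writing zr[k] first and then reading ar[k] again for the imaginary part would use the overwritten value in the aliased case.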

With this patch, all tests should pass on the Cell, with the following 
exceptions:

  - convolution fails with OpenMPI because "MPI_BOR reduction not define
    for non-intrinsic type".  Passes in serial build.
  - parallel/fftm likewise.

=> These two appear to be an OpenMPI problem, not a Cell problem.
    We can debug them later.

  - Some of the fft_ext test cases fail, in particular real->complex

=> I have not debugged this.

  - correlation fails because of a precision error (error_db threshold).

  - ref-impl/fft-coverage fails because of a precision error (test does
    not use error_db, but if it did, it would fail for our usual
    threshold)

=> It looks like the libfft FFT is noisy.  This isn't worth diagnosing
    too much since we'll eventually replace it with a faster FFT.

Also, the regressions/transpose_assign test takes a long time to run. 
Granted, it is doing a lot of transposes and I had optimization turned 
off, but it runs much faster on EM64t.

Patch applied.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fixes.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070317/598379ae/attachment.ksh>

From jules at codesourcery.com  Sun Mar 18 21:17:57 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Sun, 18 Mar 2007 17:17:57 -0400
Subject: [patch] Handle unaligned Fft, Fftm; Split view_functions test.
Message-ID: <45FDAC85.9030203@codesourcery.com>

This patch fixes the Cbe Fft and Fftm backends to request 16-byte 
alignment, which is necessary for DMA.  It also adds regression tests 
for unaligned Fft and Fftm.
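
For reference, an address alignment check of this kind can be sketched as follows. This is a generic illustration, not the code from the patch; is_aligned is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// True if ptr is aligned to `alignment` bytes.
// `alignment` must be a power of two.
inline bool is_aligned(void const* ptr, std::size_t alignment = 16)
{
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

With 4-byte floats and rows of 7 elements, a row that starts 28 bytes into a 16-byte-aligned buffer fails this check, while one that starts 16 bytes in passes.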

It splits the view_functions test into three smaller tests.  Compiling 
view_functions with optimization turned on was being killed by the 
process killer on snipes.

Patch applied.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: align.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070318/c3759c46/attachment.ksh>

From jules at codesourcery.com  Mon Mar 19 17:15:45 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 19 Mar 2007 13:15:45 -0400
Subject: [patch] Fix bug for aligned rt_layouts
Message-ID: <45FEC541.3020206@codesourcery.com>

This fixes a bug where the layout was larger than the memory allocated 
by an Fftm workspace, leading to memory corruption.

Hurrah for valgrind!

Patch applied.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: layout-bug.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070319/f473178d/attachment.ksh>

From jules at codesourcery.com  Mon Mar 19 21:01:13 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 19 Mar 2007 17:01:13 -0400
Subject: [patch] Quickstart changes
Message-ID: <45FEFA19.1000808@codesourcery.com>

This patch documents the new configure options for Cell.  Does it look 
ok to apply?

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: qs.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070319/9de89f5b/attachment.ksh>

From jules at codesourcery.com  Tue Mar 20 00:00:55 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 19 Mar 2007 20:00:55 -0400
Subject: [patch] Quickstart, + configure/build bits
Message-ID: <45FF2437.6060008@codesourcery.com>

Patch applied.
-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: qs.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070319/afc1af13/attachment.ksh>

From jules at codesourcery.com  Tue Mar 20 00:13:28 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 19 Mar 2007 20:13:28 -0400
Subject: [patch] Bump FFTW to 3.1.2
Message-ID: <45FF2728.2070507@codesourcery.com>

This bumps vendor/fftw to 3.1.2.

3.0.1 + ppu-gcc + altivec did not get along together :(

Bumping to 3.1.2 fixes this, plus should give better performance on 
em64t and altivec.

Patch applied.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ext.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070319/d245de1a/attachment.ksh>

From don at codesourcery.com  Fri Mar 23 17:46:59 2007
From: don at codesourcery.com (Don McCoy)
Date: Fri, 23 Mar 2007 11:46:59 -0600
Subject: [patch] Increase max size of split-complex fast convolution
Message-ID: <46041293.2000609@codesourcery.com>

This patch allows the split-complex version of the fast convolution for 
Cell BE to run at twice the former size limit.  It now works for up to 4K 
points (2K for interleaved complex).

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fc4k.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070323/58460a24/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fc4k.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070323/58460a24/attachment-0001.ksh>

From jules at codesourcery.com  Fri Mar 23 18:21:04 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 23 Mar 2007 14:21:04 -0400
Subject: [vsipl++] [patch] Increase max size of split-complex fast convolution
In-Reply-To: <46041293.2000609@codesourcery.com>
References: <46041293.2000609@codesourcery.com>
Message-ID: <46041A90.1080700@codesourcery.com>

Don McCoy wrote:
> This patch allows the split-complex version of the fast convolution for 
> Cell BE to run at twice the former size limit.  It now works for up to 4K 
> points (2K for interleaved complex).

Don,

This looks good, please check it in.

Did we resolve the use of asserts on the SPE?  Do they abort the 
program, or still cause it to deadlock?

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From don at codesourcery.com  Fri Mar 23 19:32:14 2007
From: don at codesourcery.com (Don McCoy)
Date: Fri, 23 Mar 2007 13:32:14 -0600
Subject: [vsipl++] [patch] Increase max size of split-complex fast convolution
In-Reply-To: <46041A90.1080700@codesourcery.com>
References: <46041293.2000609@codesourcery.com> <46041A90.1080700@codesourcery.com>
Message-ID: <46042B3E.5000803@codesourcery.com>

Jules Bergmann wrote:
> This looks good, please check it in.
>
> Did we resolve the use of asserts on the SPE?  Do they abort the 
> program, or still cause it to deadlock?
>
>
They still cause a deadlock.  This patch adds an alternative, called 
spe_assert(), though I note that it still hangs in abort(), but at least 
the messages get out to the console.

Note: we may want to consider using -DNDEBUG for release builds of SPE 
code, as this new version pulls in the stdio header otherwise.  Is the 
same true with the system assert?

Still ok to check in?

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fc4k2.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070323/34de02e8/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fc4k2.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070323/34de02e8/attachment-0001.ksh>

From jules at codesourcery.com  Fri Mar 23 19:41:49 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 23 Mar 2007 15:41:49 -0400
Subject: [vsipl++] [patch] Increase max size of split-complex fast convolution
In-Reply-To: <46042B3E.5000803@codesourcery.com>
References: <46041293.2000609@codesourcery.com> <46041A90.1080700@codesourcery.com> <46042B3E.5000803@codesourcery.com>
Message-ID: <46042D7D.90108@codesourcery.com>


> They still cause a deadlock.  This patch add an alternative, called 
> spe_assert(), though I note that it still hangs in abort(), but at least 
> the messages get out to the console.

That is an improvement.

> 
> Note: we may want to consider using -DNDEBUG for release builds of SPE 
> code, as this new version pulls in the stdio header otherwise.  Is the 
> same true with the system assert?

Using -DNDEBUG for the SPE cflags when building a release is a good 
idea.  That shouldn't require any changes to the library itself, but 
should be a matter of configuration options.  I'll handle that in 
scripts/config.

Just to make sure I understand the impact of including stdio: it is going 
to increase the SPE text size, which may result in our larger problem 
sizes being single buffered instead of double buffered.  However, it 
should not create a correctness issue, i.e. it shouldn't affect stack 
usage if an assertion is not thrown.

I don't know if the SPE system assert pulls in stdio, but I hope not 
since it doesn't actually print anything!

> 
> Still ok to check in?

Yes, please do.

				thanks,
				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From don at codesourcery.com  Fri Mar 23 19:47:26 2007
From: don at codesourcery.com (Don McCoy)
Date: Fri, 23 Mar 2007 13:47:26 -0600
Subject: [vsipl++] [patch] Increase max size of split-complex fast convolution
In-Reply-To: <46042D7D.90108@codesourcery.com>
References: <46041293.2000609@codesourcery.com> <46041A90.1080700@codesourcery.com> <46042B3E.5000803@codesourcery.com> <46042D7D.90108@codesourcery.com>
Message-ID: <46042ECE.4060606@codesourcery.com>

Jules Bergmann wrote:
> Just to make sure I understand the impact of including stdio: it is 
> going to increase the SPE text size, which may result in our larger 
> problem sizes being single buffered instead of double buffered.  
> However, it should not create a correctness issue, i.e. it shouldn't 
> affect stack usage if an assertion is not thrown.

That matches my assumptions.

Patch committed.

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712


From assem at codesourcery.com  Sat Mar 24 10:25:24 2007
From: assem at codesourcery.com (Assem Salama)
Date: Sat, 24 Mar 2007 06:25:24 -0400
Subject: par evaluators
Message-ID: <4604FC94.4010204@codesourcery.com>

Everyone,
  This patch addresses Jules's comments.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.03242007.1.log
Type: text/x-log
Size: 15152 bytes
Desc: not available
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070324/80735341/attachment.bin>

From assem at codesourcery.com  Sat Mar 24 10:26:46 2007
From: assem at codesourcery.com (Assem Salama)
Date: Sat, 24 Mar 2007 06:26:46 -0400
Subject: maxval_test
Message-ID: <4604FCE6.9050301@codesourcery.com>

Everyone,
  This test tests the maxval operator. It creates a vector on a subset 
of the processors to test the processor mapping.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maxval_test.cpp
Type: text/x-c++src
Size: 1680 bytes
Desc: not available
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070324/faf0c17e/attachment.cpp>

From joseph.j.cook at lmco.com  Sun Mar 25 20:53:46 2007
From: joseph.j.cook at lmco.com (Cook, Joseph J)
Date: Sun, 25 Mar 2007 16:53:46 -0400
Subject: VSIPL++ Compile Problem
Message-ID: <2ACA4DC2D980EC43B5771343F896B6C6012560E0@EMSS04M21.us.lmco.com>

Good Afternoon,

   I'm trying to compile the following simple program using Parallel
VSIPL++ for Mercury mcoe, but I am getting a compile error:

 
#include "vsip/vector.hpp"

int main()
{
  vsip::Vector<float> foo(10,2.f);
  vsip::Vector<float> bar(10,2.f);

  bar *= foo;
}

 
The error I get is on a computer unfortunately inaccessible to e-mail so
I can't cut and paste it.  A portion of it is:

 
"incomplete type is not allowed:

map_type  map_;

...many more lines...complaining about BinaryOperator *=  in the last
line" 

 
If you cannot replicate this problem, let me know.

 
My impression was that even though I am using Parallel VSIPL++,
declaring a Map for my Vectors was an optional parameter, is this true?

 
Thanks!

Joe Cook

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070325/c51705c1/attachment.html>

From jules at codesourcery.com  Mon Mar 26 13:17:32 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 09:17:32 -0400
Subject: [vsipl++] VSIPL++ Compile Problem
In-Reply-To: <2ACA4DC2D980EC43B5771343F896B6C6012560E0@EMSS04M21.us.lmco.com>
References: <2ACA4DC2D980EC43B5771343F896B6C6012560E0@EMSS04M21.us.lmco.com>
Message-ID: <4607C7EC.4000600@codesourcery.com>

Cook, Joseph J wrote:
> Good Afternoon,
> 
>    I'm trying to compile the following simple program using Parallel 
> VSIPL++ for Mercury mcoe, but I am getting a compile error:
> 
> 
> #include "vsip/vector.hpp"
> int main()
> {
> vsip::Vector<float> foo(10,2.f);
> vsip::Vector<float> bar(10,2.f);
> 
> bar *= foo;
> }
> 
> The error I get is on a computer unfortunately inaccessible to e-mail so 
> I can't cut and paste it.  A portion of it is:
> 
> "incomplete type is not allowed:
> map_type  map_;
> ...many more lines...complaining about BinaryOperator *=  in the last line"
> 
> If you cannot replicate this problem, let me know.

Joe,

Thanks for the problem report.  I will try to reproduce the error 
locally.  If there are any file names and line numbers associated with 
the message above, that might be helpful.

This is using the 1.3 release?

> My impression was that even though I am using Parallel VSIPL++, 
> declaring a Map for my Vectors was an optional parameter, is this true?

Yes, that is correct.  The program above should compile without error. 
The only thing that stands out is the library is not being initialized 
(there is no "vsip::vsipl" object).

Not initializing the library will not cause a compile-time error (but it 
may lead to run-time errors, esp when using the library in parallel).

However, the error "incomplete type" suggests that a definition is not 
being included, which may be a result of not including a particular 
header file.  You might try including <vsip/initfin.hpp> (necessary to 
initialize the library), and potentially <vsip/parallel.hpp>.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From jules at codesourcery.com  Mon Mar 26 13:42:33 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 09:42:33 -0400
Subject: [vsipl++] VSIPL++ Compile Problem
In-Reply-To: <4607C7EC.4000600@codesourcery.com>
References: <2ACA4DC2D980EC43B5771343F896B6C6012560E0@EMSS04M21.us.lmco.com> <4607C7EC.4000600@codesourcery.com>
Message-ID: <4607CDC9.6010804@codesourcery.com>


> 
> However, the error "incomplete type" suggests that a definition is not 
> being included, which may be a result of not including a particular 
> header file.  You might try including <vsip/initfin.hpp> (necessary to 
> initialize the library), and potentially <vsip/parallel.hpp>

Joe,

To confirm, the problem is because of a missing header file.  If you 
include <vsip/map.hpp> the program should compile and execute correctly.

This is a bug in the library.  It should not be necessary to include 
map.hpp in this program.  We'll definitely fix it.  However, I assume 
that the workaround should be sufficient and that an updated snapshot 
is not necessary.  Is that OK?

				thanks,
				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From jules at codesourcery.com  Mon Mar 26 14:08:21 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 10:08:21 -0400
Subject: [patch] Regression test for vector headers
Message-ID: <4607D3D5.9050808@codesourcery.com>

This regression test captures the bug reported by Joe Cook.

Patch applied.

				-- Jules
-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: vh.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070326/1085c8ff/attachment.ksh>

From joseph.j.cook at lmco.com  Mon Mar 26 14:17:32 2007
From: joseph.j.cook at lmco.com (Cook, Joseph J)
Date: Mon, 26 Mar 2007 10:17:32 -0400
Subject: [vsipl++] VSIPL++ Compile Problem
In-Reply-To: <4607CDC9.6010804@codesourcery.com>
References: <2ACA4DC2D980EC43B5771343F896B6C6012560E0@EMSS04M21.us.lmco.com> <4607C7EC.4000600@codesourcery.com>
 <4607CDC9.6010804@codesourcery.com>
Message-ID: <2ACA4DC2D980EC43B5771343F896B6C6012560E4@EMSS04M21.us.lmco.com>

Thanks for your very quick response.  Adding map.hpp allowed the program
to compile.  

No, we won't need a new drop just for this fix since the workaround is
simple enough.

Thanks,
Joe Cook

-----Original Message-----
From: Jules Bergmann [mailto:jules at codesourcery.com] 
Sent: Monday, March 26, 2007 9:43 AM
To: Jules Bergmann
Cc: Cook, Joseph J; vsipl++ at codesourcery.com; Steck, Thomas F; McClean,
Tom
Subject: Re: [vsipl++] VSIPL++ Compile Problem


> 
> However, the error "incomplete type" suggests that a definition is not

> being included, which may be a result of not including a particular 
> header file.  You might try including <vsip/initfin.hpp> (necessary to 
> initialize the library), and potentially <vsip/parallel.hpp>

Joe,

To confirm, the problem is because of a missing header file.  If you 
include <vsip/map.hpp> the program should compile and execute correctly.

This is a bug in the library.  It should not be necessary to include 
map.hpp in this program.  We'll definitely fix it.  However, I assume 
that the workaround should be sufficient and that an updated snapshot 
is not necessary.  Is that OK?

				thanks,
				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From jules at codesourcery.com  Mon Mar 26 16:58:32 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 12:58:32 -0400
Subject: [vsipl++] par evaluators
In-Reply-To: <4604FC94.4010204@codesourcery.com>
References: <4604FC94.4010204@codesourcery.com>
Message-ID: <4607FBB8.3020906@codesourcery.com>

Assem Salama wrote:
 > Everyone,
 >  This patch addresses Jules's comments.

Assem,

There are several items from my last feedback that you did not
address:

---

[1] When you are adding to existing code, try to follow the lead set by it
if possible (and if it doesn't violate our coding standards :) .  Here
the '#includes' are indented to improve readability of the #ifdef
logic.  Your new include should be indented too.

---

(for global_from_local_index_blk:)

[2] What namespace is this in?  vsip or vsip::impl ?

If it is in vsip, please move it to vsip::impl.

(Can you answer the question?  Is it in vsip or vsip::impl?)

---

[3]                  ^^^^ opt

---

Also, I have some additional feedback below.

				-- Jules


 > ------------------------------------------------------------------------
 >
 > Index: benchmarks/maxval.cpp
 > ===================================================================

 > @@ -96,6 +173,12 @@
 >    case  1: loop(t_maxval1<float>(0)); break;
 >    case  2: loop(t_maxval1<float>(1)); break;
 >    case  3: loop(t_maxval1<float>(2)); break;
 > +  case  4: loop(t_maxval2<float,Map<>,impl::Parallel_tag>(0)); break;
 > +  case  5: loop(t_maxval2<float,Map<>,impl::Parallel_tag>(1)); break;
 > +  case  6: loop(t_maxval2<float,Map<>,impl::Parallel_tag>(2)); break;

 > +  case  7: loop(t_maxval2<float,Map<>,impl::Cvsip_tag>(0)); break;
 > +  case  8: loop(t_maxval2<float,Map<>,impl::Cvsip_tag>(1)); break;
 > +  case  9: loop(t_maxval2<float,Map<>,impl::Cvsip_tag>(2)); break;

[1] Will this benchmark build if the C-VSIP backend is not configured
in?  If not, please guard these cases with an ifdef.
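
A minimal sketch of such a guard.  The macro name HAVE_CVSIP_BACKEND
and the run_*_case functions are hypothetical stand-ins (the actual
configuration macro is not named in this thread); only the shape of
the #ifdef around the backend-specific cases is the point:

```cpp
// Hypothetical configuration macro; the real name set by configure is
// not stated in this thread.  Leave it undefined to model a build
// without the C-VSIP backend.
// #define HAVE_CVSIP_BACKEND 1

// Stand-ins for the benchmark cases; not the real t_maxval2 classes.
inline int run_parallel_case(int variant) { return 100 + variant; }

#ifdef HAVE_CVSIP_BACKEND
inline int run_cvsip_case(int variant) { return 200 + variant; }
#endif

// Returns a case-specific value, or 0 if the case is not available
// in this build (mirroring the 'default: return 0;' in the benchmark).
inline int dispatch(int what)
{
  switch (what)
  {
  case 4: return run_parallel_case(0);
#ifdef HAVE_CVSIP_BACKEND
  case 7: return run_cvsip_case(0);
#endif
  default: return 0;
  }
}
```

With the macro undefined, case 7 falls through to the default, so the
file still builds and the unavailable case is simply reported as not
present.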


 >    default: return 0;
 >    }
 >    return 1;
 > Index: benchmarks/maxval.hpp
 > ===================================================================
 > --- benchmarks/maxval.hpp	(revision 0)
 > +++ benchmarks/maxval.hpp	(revision 0)
 > @@ -0,0 +1,101 @@
 > +/* Copyright (c) 2006 by CodeSourcery.  All rights reserved.

[2] Please fix the date.

 > +
 > +   This file is available for license from CodeSourcery, Inc. under the terms
 > +   of a commercial license and under the GPL.  It is not part of the VSIPL++
 > +   reference implementation and is not available under the BSD license.
 > +*/
 > +/** @file    benchmarks/maxval.hpp
 > +    @author  Assem Salama
 > +    @date    2006-07-22

[3] Please fix the date.

 > +    @brief   VSIPL++ Library: Helper file for maxval benchmark
 > +
 > +*/
 > +#ifndef BENCHMARKS_MAXVAL_HPP
 > +#define BENCHMARKS_MAXVAL_HPP
 > +
 > +using namespace vsip::impl;
 > +using namespace vsip;

[4] Header files shouldn't have global 'using namespace' statements
that import names into the global namespace.  They can introduce
order-dependent behavior (names introduced before this header will be
included, names introduced after will not).

Instead, names should either have an explicit namespace
(i.e. vsip::dimension_type), or 'using' statements should have limited
scope (such as within a function).
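
A short sketch of both options.  The namespace 'mylib' and its
contents are hypothetical stand-ins for the library namespace, used
only to keep the example self-contained:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical library namespace standing in for vsip; not the real API.
namespace mylib
{
typedef std::size_t length_type;

inline length_type total(std::vector<length_type> const& v)
{
  length_type sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    sum += v[i];
  return sum;
}
} // namespace mylib

// Option 1: qualify each name explicitly.
inline mylib::length_type
twice_total(std::vector<mylib::length_type> const& v)
{
  return 2 * mylib::total(v);
}

// Option 2: limit the using-directive to function scope, so nothing
// leaks into the namespace of files that include this header.
inline std::size_t
total_plus_one(std::vector<std::size_t> const& v)
{
  using namespace mylib;   // visible only inside this function
  return total(v) + 1;
}
```

Either way, a file that includes the header sees the same set of
global names regardless of include order.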


 > Index: src/vsip/opt/reductions/par_reductions.hpp
 > ===================================================================


 > +/***********************************************************************
 > +  Parallel evaluators.

[5] Please update or remove this comment.

 > +***********************************************************************/


 > +/**********************************************************************
 > +* Parallel evaluators for index returning reductions

[6] Please change this to the singular "Parallel evaluator".  There is
only one evaluator.

 > +**********************************************************************/
 > +
 > +template <template <typename> class ReduceT,
 > +          typename                  T,
 > +	  typename                  Block,
 > +	  typename                  OrderT,
 > +	  dimension_type            Dim >
 > +struct Evaluator<Op_reduce_idx<ReduceT>, T,
 > +		 Op_list_3<Block const&, Index<Dim>&, OrderT>,
 > +		 Parallel_tag>


-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From jules at codesourcery.com  Mon Mar 26 17:09:59 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 13:09:59 -0400
Subject: [vsipl++] maxval_test
In-Reply-To: <4604FCE6.9050301@codesourcery.com>
References: <4604FCE6.9050301@codesourcery.com>
Message-ID: <4607FE67.6000905@codesourcery.com>

Assem Salama wrote:
> Everyone,
>  This test tests the maxval operator. It creates a vector on a subset of 
> the processors to test the processor mapping.

Assem,


This looks good.  Please address the feedback below and then check it in.

				thanks,
				-- Jules

> ------------------------------------------------------------------------

Assem,

[1] This file should be called tests/maxval.cpp or tests/par_maxval.cpp. 
  The _test suffix isn't necessary for files in the tests directory.

[2] Please add a header.  All source files in the library need a header. 
  In the future, please do this before posting a patch.


> 
> #include <vsip/initfin.hpp>
> #include <vsip/support.hpp>
> #include <vsip/map.hpp>
> #include <vsip/vector.hpp>
> #include <vsip/selgen.hpp>
> #include <vsip_csl/output.hpp>

> #include <vsip/opt/general_dispatch.hpp>
> #include <vsip/core/reductions/reductions_idx.hpp>

[3] Why is it necessary to include these directly?


> 
>   if(max == float(size-1) && max_idx[0] == size-1)
>   {
> #if DEBUG == 1
>     std::cout << "Test passed\n";
> #endif
>     return 0;
>   } else
>   {
> #if DEBUG == 1
>     std::cout << "Test failed\n";
> #endif
>     return -1;
>   }

[4] To make this test easier to extend and easier to debug (if it 
fails), it would be better to use test_assert rather than returning the 
error code from main, i.e. something like

	#if DEBUG == 1
	  if(max == float(size-1) && max_idx[0] == size-1)
	    std::cout << "Test passed\n";
	  else
	    std::cout << "Test failed\n";
	#endif
	test_assert(max == float(size-1) && max_idx[0] == size-1);
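
To illustrate why an assertion is easier to debug than a bare return
code, here is a rough sketch of an assertion helper.  The macro
my_test_assert and the function check_maxval are hypothetical; the
definition of the real test_assert is not shown in this thread and
may differ:

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the test suite's test_assert.  On failure
// it reports the file, line, and failed expression, then aborts.
#define my_test_assert(expr)                                      \
  do {                                                            \
    if (!(expr)) {                                                \
      std::fprintf(stderr, "%s:%d: assertion failed: %s\n",       \
                   __FILE__, __LINE__, #expr);                    \
      std::abort();                                               \
    }                                                             \
  } while (0)

// Checks the same condition as the test above for given results.
inline void check_maxval(float max, unsigned idx, unsigned size)
{
  my_test_assert(max == float(size - 1) && idx == size - 1);
}
```

A failure then points directly at the failing check, and further
checks can be added to the same test without restructuring main.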

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From stefan at codesourcery.com  Mon Mar 26 17:16:58 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Mon, 26 Mar 2007 13:16:58 -0400
Subject: [vsipl++] maxval_test
In-Reply-To: <4607FE67.6000905@codesourcery.com>
References: <4604FCE6.9050301@codesourcery.com> <4607FE67.6000905@codesourcery.com>
Message-ID: <4608000A.2090801@codesourcery.com>

Jules Bergmann wrote:
> Assem Salama wrote:
>> Everyone,
>>  This test tests the maxval operator. It creates a vector on a subset
>> of the processors to test the processor mapping.
> 
> Assem,
> 
> 
> This looks good.  Please address the feedback below and then check it in.
> 
>                 thanks,
>                 -- Jules
> 
>> ------------------------------------------------------------------------
> 
> Assem,
> 
> [1] This file should be called tests/maxval.cpp or tests/par_maxval.cpp.
>  The _test suffix isn't necessary for files in the tests directory.

Since this test appears to explicitly test parallel functionality, may I
suggest putting it in tests/parallel/maxval.cpp instead?

This will be useful when / if we decide to extend our parallel testing harness
(for example by running the tests multiple times, with differing numbers of
processes).

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718


From jules at codesourcery.com  Mon Mar 26 18:53:12 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 14:53:12 -0400
Subject: [vsipl++] maxval_test
In-Reply-To: <4608000A.2090801@codesourcery.com>
References: <4604FCE6.9050301@codesourcery.com> <4607FE67.6000905@codesourcery.com> <4608000A.2090801@codesourcery.com>
Message-ID: <46081698.6000700@codesourcery.com>


>> [1] This file should be called tests/maxval.cpp or tests/par_maxval.cpp.
>>  The _test suffix isn't necessary for files in the tests directory.
> 
> Since this test appears to explicitly test parallel functionality, may I
> suggest putting it in tests/parallel/maxval.cpp instead?
> 
> This will be useful when / if we decide to extend our parallel testing harness
> (for example by running the tests multiple times, with differing numbers of
> processes).

Yes, that's a good idea.  Let's put this in tests/parallel.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From jules at codesourcery.com  Mon Mar 26 19:31:34 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 15:31:34 -0400
Subject: [patch] Misc fixes
Message-ID: <46081F96.8040502@codesourcery.com>

This patch includes fixes for
  - IPP 5.1 on ia32
  - IPP and MKL with ia64
  - ATLAS on ubuntu
  - C-VSIP Fftm backend for distributed data

It also adds variations of the vector_headers.cpp test for matrices and 
tensors (matrix_headers.cpp and tensor_headers.cpp respectively). 
Curiously, matrix_headers fails but tensor_headers passes :)

Patch applied.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: misc
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070326/107cdf5c/attachment.ksh>

From jules at codesourcery.com  Mon Mar 26 19:55:20 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 26 Mar 2007 15:55:20 -0400
Subject: [vsipl++] [patch] Misc fixes
In-Reply-To: <46081F96.8040502@codesourcery.com>
References: <46081F96.8040502@codesourcery.com>
Message-ID: <46082528.2050501@codesourcery.com>

Jules Bergmann wrote:
> This patch includes fixes for
>  - IPP 5.1 on ia32
>  - IPP and MKL with ia64
>  - ATLAS on ubuntu
>  - C-VSIP Fftm backend for distributed data
> 
> It also adds variations of the vector_headers.cpp test for matrices and 
> tensors (matrix_headers.cpp and tensor_headers.cpp respectively). 
> Curiously, matrix_headers fails but tensor_headers passes :)
> 
> Patch applied.
> 
>                 -- Jules
> 

Sorry, I sent an empty patch file with the previous email.  Patch attached.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: misc.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070326/14ee68a1/attachment.ksh>

From jules at codesourcery.com  Tue Mar 27 21:06:15 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 27 Mar 2007 17:06:15 -0400
Subject: [vsipl++] I'm having trouble using Sourcery VSIPL++
In-Reply-To: <4594485D.9070105@codesourcery.com>
References: <122820062109.23725.4594328100098A8900005CAD2207000953CB010802019C@comcast.net> <4594485D.9070105@codesourcery.com>
Message-ID: <46098747.2010605@codesourcery.com>

Son,

I just wanted to check in on how VSIPL++ is working for you.  Was our 
feedback helpful in solving your problem?  Definitely let me know if you 
are having any issues using VSIPL++.

Also, we're considering how to improve VSIPL++ support for Windows and 
wanted to get feedback from Windows users.  If you don't mind answering 
a few questions, that would be great!

* What types of applications are you developing?  Are they intended to 
run on Windows in production, or do you use Windows to develop 
applications for embedded hardware?

* What signal and image processing libraries do you currently use?

* What Windows development tools do you use?  Compilers, IDEs, 
debuggers, etc.

* Is the use of Intel C++ on Windows acceptable for your development, 
assuming that it is integrated into the Visual IDE?

* Would you consider using GCC based tools if they were integrated into 
the Eclipse IDE?

Please feel free to give me a call at 650-704-4014.

				thanks,
				-- Jules

Stefan Seefeld wrote:
> sonho4 at comcast.net wrote:
>>  
>> 1) I downloaded the VSIPL++ Binary package (IA32 Microsoft Windows XP),
>> sourceryvsipl++-1.2-win-x86 (WinZip file), from
>> http://www.codesourcery.com/vsiplplusplus/1.2/download.html to my WinXP
>> laptop computer.
>>  
>> 2) Also, I installed the Intel(R) C++ Compiler for 32-bit applications,
>> Version 9.1 (evaluation copy) on my laptop computer.
>>  
>> 3) I unzipped the VSIPL++ binary package.
>>  
>> 4) I tried to compile and run /share/sourceryvsipl++/example1.cpp
>> *without success.*
> 
> The Makefile in that directory assumes a UNIX-like environment with
> tools such as pkg-config to extract build parameters from pre-built
> and packaged configuration files. Unfortunately, that strategy isn't
> supported on Windows, where these tools don't exist.
> 
> Instead, it is assumed that you use some platform-specific build
> environment (such as the MSVC IDE), where you manually add build
> parameters, such as installation paths for third-party libraries
> that are used in conjunction with Sourcery VSIPL++.
> 
> We are working on ways to enhance this for future releases.
> 
>> Please look at the following compilation output and tell me what's wrong.
>>  
>> The last error showed the header file, mkl_cblas.h, was not found.
>> The *VSIPL++ binary package does not contain mkl_cblas.h*. Should I go
>> somewhere else to get this missing header file?
> 
> The Windows version of Sourcery VSIPL++ is built using Intel's
> IPP and MKL libraries (see
> http://www.codesourcery.com/public/vsiplplusplus/sourceryvsipl++-1.2/quickstart/chap-installation.html#id287819
> for more information on these).
> We are looking into how to remove that restriction, i.e. to make IPP and MKL
> optional, like they are on other platforms.
> 
> Regards,
> 		Stefan
> 


-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


From don at codesourcery.com  Wed Mar 28 07:32:23 2007
From: don at codesourcery.com (Don McCoy)
Date: Wed, 28 Mar 2007 01:32:23 -0600
Subject: [patch] Fast convolution cleanup.
Message-ID: <460A1A07.80005@codesourcery.com>

This patch does some minor cleanup on the fast convolution kernels for 
Cell BE.

A more portable method for passing PPU<-->SPU addresses is now used.  
The header 'common.h', formerly containing FFT and fast convolution 
parameters, has been removed -- these definitions now reside in their 
own headers.  It also cleans up the naming conventions somewhat, with 
the aim of being more consistent with other, similar code.  In general, 
buffer addresses are 'ea_foo' and the strides between vectors (be it by 
row or by column) are 'foo_stride'.

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fcc.changes
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070328/07e56a54/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fcc.diff
URL: <http://sourcerytools.com/pipermail/vsipl++/attachments/20070328/07e56a54/attachment-0001.ksh>

From jules at codesourcery.com  Wed Mar 28 19:01:41 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 28 Mar 2007 15:01:41 -0400
Subject: [vsipl++] [patch] Fast convolution cleanup.
In-Reply-To: <460A1A07.80005@codesourcery.com>
References: <460A1A07.80005@codesourcery.com>
Message-ID: <460ABB95.3090209@codesourcery.com>

Don McCoy wrote:
 > This patch does some minor cleanup on the fast convolution kernels for
 > Cell BE.
 >
 > A more portable method for passing PPU<-->SPU addresses is now used.
 > The header 'common.h', formerly containing FFT and fast convolution
 > parameters, has been removed -- these definitions now reside in their
 > own headers.  It also cleans up the naming conventions somewhat, with
 > the aim of being more consistent with other, similar code.  In general,
 > buffer addresses are 'ea_foo' and the strides between vectors (be it by
 > row or by column) are 'foo_stride'.

Don,

This looks good.  Please check it in.

			thanks,
			-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705