From assem at codesourcery.com Fri Jun 1 14:46:03 2007
From: assem at codesourcery.com (Assem Salama)
Date: Fri, 01 Jun 2007 10:46:03 -0400
Subject: SIMD loop fusion support for unaligned
Message-ID: <4660312B.5030009@codesourcery.com>

Everyone,

This patch adds a new unary operator, unaligned. This operator hints to the compiler that this array may be unaligned. This allows the user to mix unaligned and aligned vectors.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.05302007.1.log
Type: text/x-log
Size: 27901 bytes
Desc: not available
URL:

From assem at codesourcery.com Fri Jun 1 15:07:16 2007
From: assem at codesourcery.com (Assem Salama)
Date: Fri, 01 Jun 2007 11:07:16 -0400
Subject: Support for parallel generator blocks
Message-ID: <46603624.5040002@codesourcery.com>

Everyone,

This patch was submitted a while ago but didn't receive any feedback. This patch has a Choose_local_block addition that switches between Map_subset_block and Subset_block.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.06012007.1.log
Type: text/x-log
Size: 2382 bytes
Desc: not available
URL:

From assem at codesourcery.com Fri Jun 1 15:08:38 2007
From: assem at codesourcery.com (Assem Salama)
Date: Fri, 01 Jun 2007 11:08:38 -0400
Subject: fftw3 split support
Message-ID: <46603676.3040504@codesourcery.com>

Everyone,

This patch supports split ffts using fftw3 backend.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.06012007.2.log
Type: text/x-log
Size: 27041 bytes
Desc: not available
URL:

From assem at codesourcery.com Fri Jun 1 15:13:24 2007
From: assem at codesourcery.com (Assem Salama)
Date: Fri, 01 Jun 2007 11:13:24 -0400
Subject: benchmarks
Message-ID: <46603794.4050608@codesourcery.com>

Everyone,

This patch contains two benchmarks, one for expression template stuff and the other for vramp.
Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.06012007.3.log
Type: text/x-log
Size: 9754 bytes
Desc: not available
URL:

From jules at codesourcery.com Mon Jun 4 14:41:53 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 04 Jun 2007 10:41:53 -0400
Subject: [vsipl++] SIMD loop fusion support for unaligned
In-Reply-To: <4660312B.5030009@codesourcery.com>
References: <4660312B.5030009@codesourcery.com>
Message-ID: <466424B1.6000802@codesourcery.com>

Assem Salama wrote:
> Everyone,
> This patch adds a new unary operator, unaligned. This operator hints to
> the compiler that this array may be unaligned. This allows the user to
> mix unaligned and aligned vectors.

Assem,

This looks good. Can you add 'has_perm' to the general simd traits class (faux-SIMD), then check in?

thanks,
-- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Mon Jun 4 14:49:44 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 04 Jun 2007 10:49:44 -0400
Subject: [vsipl++] Support for parallel generator blocks
In-Reply-To: <46603624.5040002@codesourcery.com>
References: <46603624.5040002@codesourcery.com>
Message-ID: <46642688.8000008@codesourcery.com>

Assem Salama wrote:
> Everyone,
> This patch was submitted a while ago but didn't receive any feedback.
> This patch has a Choose_local_block addition that switches between
> Map_subset_block and Subset_block.

Assem,

Thanks for resending this. I did have some feedback from the first time around, I apologize if you did not see it:

This looks good, however, can you extend Choose_subblock to handle Global_map and Replicated_map? Both maps should be able to use a Subset_block.

Also, you might consider specializing Create_subblock based on the RetBlock type rather than Map type, since the RetBlock type is what governs the arguments to the constructor.
As currently written, if you add a new case to Choose_subblock (say for Global_map), but forget to add it to Create_subblock, you'll get an error.

-- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Mon Jun 4 14:57:03 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 04 Jun 2007 10:57:03 -0400
Subject: [vsipl++] fftw3 split support
In-Reply-To: <46603676.3040504@codesourcery.com>
References: <46603676.3040504@codesourcery.com>
Message-ID: <4664283F.2080708@codesourcery.com>

Assem Salama wrote:
> Everyone,
> This patch supports split ffts using fftw3 backend.

Assem,

Is this the same patch as http://www.codesourcery.com/archives/vsipl%2B%2B/msg01024.html ?

If so, it looks good, modulo one comment. See: http://www.codesourcery.com/archives/vsipl%2B%2B/msg01033.html

thanks,
-- Jules

>
> Thanks,
> Assem
>
>
> ------------------------------------------------------------------------
>
> Index: src/vsip/opt/fftw3/fft.cpp
> ===================================================================
> --- src/vsip/opt/fftw3/fft.cpp (revision 165174)
> +++ src/vsip/opt/fftw3/fft.cpp (working copy)
> @@ -19,6 +19,11 @@
> #include
> #include
>
> +// We need to include this create_plan.hpp header file because fft_impl.cpp
> +// uses this file. We cannot include this file in fft_impl.cpp because
> +// fft_impl.cpp gets included multiple times here.
> +#include > + > /*********************************************************************** > Declarations > ***********************************************************************/ > Index: src/vsip/opt/fftw3/fft_impl.cpp > =================================================================== > --- src/vsip/opt/fftw3/fft_impl.cpp (revision 168725) > +++ src/vsip/opt/fftw3/fft_impl.cpp (working copy) > @@ -21,8 +21,8 @@ > #include > #include > #include > +#include > #include > -#include > > /*********************************************************************** > Declarations > @@ -40,25 +40,25 @@ > { > Fft_base(Domain const& dom, int exp, int flags) > VSIP_THROW((std::bad_alloc)) > - : in_buffer_(32, dom.size()), > - out_buffer_(32, dom.size()) > + : in_buffer_(dom.size()), > + out_buffer_(dom.size()) > { > // For multi-dimensional transforms, these plans assume both > // input and output data is dense, row-major, interleave-complex > // format. > - > - for (vsip::dimension_type i = 0; i < D; ++i) size_[i] = dom[i].size(); > - plan_in_place_ = FFTW(plan_dft)(D, size_, > - reinterpret_cast(in_buffer_.get()), > - reinterpret_cast(in_buffer_.get()), > - exp, flags); > > + for(index_type i=0;i + plan_in_place_ = > + Create_plan > + ::create > + (in_buffer_.ptr(), in_buffer_.ptr(), exp, flags, dom); > + > if (!plan_in_place_) VSIP_IMPL_THROW(std::bad_alloc()); > > - plan_by_reference_ = FFTW(plan_dft)(D, size_, > - reinterpret_cast(in_buffer_.get()), > - reinterpret_cast(out_buffer_.get()), > - exp, FFTW_PRESERVE_INPUT | flags); > + plan_by_reference_ = Create_plan > + ::create > + (in_buffer_.ptr(), out_buffer_.ptr(), exp, flags, dom); > + > if (!plan_by_reference_) > { > FFTW(destroy_plan)(plan_in_place_); > @@ -71,8 +71,8 @@ > if (plan_by_reference_) FFTW(destroy_plan)(plan_by_reference_); > } > > - aligned_array > in_buffer_; > - aligned_array > out_buffer_; > + Cmplx_buffer in_buffer_; > + Cmplx_buffer out_buffer_; > FFTW(plan) plan_in_place_; > FFTW(plan) 
plan_by_reference_; > int size_[D]; > @@ -84,17 +84,15 @@ > Fft_base(Domain const& dom, int A, int flags) > VSIP_THROW((std::bad_alloc)) > : in_buffer_(32, dom.size()), > - out_buffer_(32, dom.size()) > + out_buffer_(dom.size()) > { > for (vsip::dimension_type i = 0; i < D; ++i) size_[i] = dom[i].size(); > // FFTW3 assumes A == D - 1. > // See also query_layout(). > if (A != D - 1) std::swap(size_[A], size_[D - 1]); > - plan_by_reference_ = FFTW(plan_dft_r2c)( > - D, size_, > - in_buffer_.get(), reinterpret_cast(out_buffer_.get()), > - FFTW_PRESERVE_INPUT | flags); > - > + plan_by_reference_ = Create_plan:: > + create > + (in_buffer_.get(), out_buffer_.ptr(), A, flags, dom); > if (!plan_by_reference_) VSIP_IMPL_THROW(std::bad_alloc()); > } > ~Fft_base() VSIP_NOTHROW > @@ -103,7 +101,7 @@ > } > > aligned_array in_buffer_; > - aligned_array > out_buffer_; > + Cmplx_buffer out_buffer_; > FFTW(plan) plan_by_reference_; > int size_[D]; > }; > @@ -113,17 +111,16 @@ > { > Fft_base(Domain const& dom, int A, int flags) > VSIP_THROW((std::bad_alloc)) > - : in_buffer_(32, dom.size()), > + : in_buffer_(dom.size()), > out_buffer_(32, dom.size()) > { > for (vsip::dimension_type i = 0; i < D; ++i) size_[i] = dom[i].size(); > // FFTW3 assumes A == D - 1. > // See also query_layout(). 
> if (A != D - 1) std::swap(size_[A], size_[D - 1]); > - plan_by_reference_ = FFTW(plan_dft_c2r)( > - D, size_, > - reinterpret_cast(in_buffer_.get()), out_buffer_.get(), > - flags); > + plan_by_reference_ = Create_plan:: > + create > + (in_buffer_.ptr(), out_buffer_.get(), A, flags, dom); > > if (!plan_by_reference_) VSIP_IMPL_THROW(std::bad_alloc()); > } > @@ -132,8 +129,8 @@ > if (plan_by_reference_) FFTW(destroy_plan)(plan_by_reference_); > } > > - aligned_array > in_buffer_; > - aligned_array out_buffer_; > + Cmplx_buffer in_buffer_; > + aligned_array out_buffer_; > FFTW(plan) plan_by_reference_; > int size_[D]; > }; > @@ -156,6 +153,23 @@ > : Fft_base<1, ctype, ctype>(dom, E, convert_NoT(number)) > {} > virtual char const* name() { return "fft-fftw3-1D-complex"; } > + virtual void query_layout(Rt_layout<1> &rtl_inout) > + { > + // By default use unit_stride, tuple<0, 1, 2> > + rtl_inout.pack = stride_unit_dense; > + rtl_inout.order = tuple<0, 1, 2>(); > + // make default based on library > + rtl_inout.complex = Create_plan::format; > + } > + virtual void query_layout(Rt_layout<1> &rtl_in, Rt_layout<1> &rtl_out) > + { > + // By default use unit_stride, tuple<0, 1, 2> > + rtl_in.pack = rtl_out.pack = stride_unit_dense; > + rtl_in.order = rtl_out.order = tuple<0, 1, 2>(); > + // make default based on library > + rtl_in.complex = rtl_out.complex = Create_plan::format; > + } > + > virtual void in_place(ctype *inout, stride_type s, length_type l) > { > assert(s == 1 && static_cast(l) == this->size_[0]); > @@ -163,8 +177,12 @@ > reinterpret_cast(inout), > reinterpret_cast(inout)); > } > - virtual void in_place(ztype, stride_type, length_type) > + virtual void in_place(ztype inout, stride_type s, length_type l) > { > + assert(s == 1 && static_cast(l) == this->size_[0]); > + FFTW(execute_split_dft)(plan_in_place_, > + inout.first, inout.second, > + inout.first, inout.second); > } > virtual void by_reference(ctype *in, stride_type in_stride, > ctype *out, stride_type 
out_stride, > @@ -173,13 +191,18 @@ > assert(in_stride == 1 && out_stride == 1 && > static_cast(length) == this->size_[0]); > FFTW(execute_dft)(plan_by_reference_, > - reinterpret_cast(in), > + reinterpret_cast(in), > reinterpret_cast(out)); > } > - virtual void by_reference(ztype, stride_type, > - ztype, stride_type, > - length_type) > + virtual void by_reference(ztype in, stride_type in_stride, > + ztype out, stride_type out_stride, > + length_type length) > { > + assert(in_stride == 1 && out_stride == 1 && > + static_cast(length) == this->size_[0]); > + FFTW(execute_split_dft)(plan_by_reference_, > + in.first, in.second, > + out.first, out.second); > } > }; > > @@ -206,11 +229,29 @@ > FFTW(execute_dft_r2c)(plan_by_reference_, > in, reinterpret_cast(out)); > } > - virtual void by_reference(rtype *, stride_type, > - ztype, stride_type, > - length_type) > + virtual void by_reference(rtype *in, stride_type is, > + ztype out, stride_type os, > + length_type length) > { > + FFTW(execute_split_dft_r2c)(plan_by_reference_, > + in, out.first, out.second); > } > + virtual void query_layout(Rt_layout<1> &rtl_inout) > + { > + // By default use unit_stride, tuple<0, 1, 2> > + rtl_inout.pack = stride_unit_dense; > + rtl_inout.order = tuple<0, 1, 2>(); > + // make default based on library > + rtl_inout.complex = Create_plan::format; > + } > + virtual void query_layout(Rt_layout<1> &rtl_in, Rt_layout<1> &rtl_out) > + { > + // By default use unit_stride, tuple<0, 1, 2> > + rtl_in.pack = rtl_out.pack = stride_unit_dense; > + rtl_in.order = rtl_out.order = tuple<0, 1, 2>(); > + // make default based on library > + rtl_in.complex = rtl_out.complex = Create_plan::format; > + } > > }; > > @@ -241,11 +282,29 @@ > FFTW(execute_dft_c2r)(plan_by_reference_, > reinterpret_cast(in), out); > } > - virtual void by_reference(ztype, stride_type, > - rtype *, stride_type, > - length_type) > + virtual void by_reference(ztype in, stride_type is, > + rtype *out, stride_type os, > + length_type 
length) > { > + FFTW(execute_split_dft_c2r)(plan_by_reference_, > + in.first, in.second, out); > } > + virtual void query_layout(Rt_layout<1> &rtl_inout) > + { > + // By default use unit_stride, tuple<0, 1, 2> > + rtl_inout.pack = stride_unit_dense; > + rtl_inout.order = tuple<0, 1, 2>(); > + // make default based on library > + rtl_inout.complex = Create_plan::format; > + } > + virtual void query_layout(Rt_layout<1> &rtl_in, Rt_layout<1> &rtl_out) > + { > + // By default use unit_stride, tuple<0, 1, 2>, cmplx_inter_fmt > + rtl_in.pack = rtl_out.pack = stride_unit_dense; > + rtl_in.order = rtl_out.order = tuple<0, 1, 2>(); > + // make default based on library > + rtl_in.complex = rtl_out.complex = Create_plan::format; > + } > > }; > > @@ -270,8 +329,8 @@ > virtual void query_layout(Rt_layout<2> &rtl_in, Rt_layout<2> &rtl_out) > { > rtl_in.pack = stride_unit_dense; > - rtl_in.complex = cmplx_inter_fmt; > rtl_in.order = row2_type(); > + rtl_in.complex = Create_plan::format; > rtl_out = rtl_in; > } > virtual void in_place(ctype *inout, > @@ -288,10 +347,13 @@ > reinterpret_cast(inout)); > } > /// complex (split) in-place > - virtual void in_place(ztype, > + virtual void in_place(ztype inout, > stride_type, stride_type, > length_type, length_type) > { > + FFTW(execute_split_dft)(plan_in_place_, > + inout.first, inout.second, > + inout.first, inout.second); > } > virtual void by_reference(ctype *in, > stride_type in_r_stride, > @@ -311,12 +373,21 @@ > reinterpret_cast(in), > reinterpret_cast(out)); > } > - virtual void by_reference(ztype, > - stride_type, stride_type, > - ztype, > - stride_type, stride_type, > - length_type, length_type) > + virtual void by_reference(ztype in, > + stride_type in_r_stride, stride_type in_c_stride, > + ztype out, > + stride_type out_r_stride, stride_type out_c_stride, > + length_type, length_type cols) > { > + // Check that data is dense row-major. 
> + assert(in_r_stride == static_cast(cols)); > + assert(in_c_stride == 1); > + assert(out_r_stride == static_cast(cols)); > + assert(out_c_stride == 1); > + > + FFTW(execute_split_dft)(plan_by_reference_, > + in.first, in.second, > + out.first, out.second); > } > }; > > @@ -344,7 +415,7 @@ > // FFTW3 assumes A is the last dimension. > if (A == 0) rtl_in.order = tuple<1, 0, 2>(); > else rtl_in.order = tuple<0, 1, 2>(); > - rtl_in.complex = cmplx_inter_fmt; > + rtl_in.complex = Create_plan::format; > rtl_out = rtl_in; > } > virtual bool requires_copy(Rt_layout<2> &) { return true;} > @@ -358,12 +429,14 @@ > FFTW(execute_dft_r2c)(plan_by_reference_, > in, reinterpret_cast(out)); > } > - virtual void by_reference(rtype *, > + virtual void by_reference(rtype *in, > stride_type, stride_type, > - ztype, > + ztype out, > stride_type, stride_type, > length_type, length_type) > { > + FFTW(execute_split_dft_r2c)(plan_by_reference_, > + in, out.first, out.second); > } > > }; > @@ -392,7 +465,7 @@ > // FFTW3 assumes A is the last dimension. 
> if (A == 0) rtl_in.order = tuple<1, 0, 2>(); > else rtl_in.order = tuple<0, 1, 2>(); > - rtl_in.complex = cmplx_inter_fmt; > + rtl_in.complex = Create_plan::format; > rtl_out = rtl_in; > } > virtual bool requires_copy(Rt_layout<2> &) { return true;} > @@ -406,12 +479,14 @@ > FFTW(execute_dft_c2r)(plan_by_reference_, > reinterpret_cast(in), out); > } > - virtual void by_reference(ztype, > + virtual void by_reference(ztype in, > stride_type, stride_type, > - rtype *, > + rtype *out, > stride_type, stride_type, > length_type, length_type) > { > + FFTW(execute_split_dft_c2r)(plan_by_reference_, > + in.first, in.second, out); > } > > }; > @@ -437,8 +512,8 @@ > virtual void query_layout(Rt_layout<3> &rtl_in, Rt_layout<3> &rtl_out) > { > rtl_in.pack = stride_unit_dense; > - rtl_in.complex = cmplx_inter_fmt; > rtl_in.order = row3_type(); > + rtl_in.complex = Create_plan::format; > rtl_out = rtl_in; > } > virtual void in_place(ctype *inout, > @@ -462,14 +537,26 @@ > reinterpret_cast(inout), > reinterpret_cast(inout)); > } > - virtual void in_place(ztype, > - stride_type, > - stride_type, > - stride_type, > - length_type, > - length_type, > - length_type) > + virtual void in_place(ztype inout, > + stride_type x_stride, > + stride_type y_stride, > + stride_type z_stride, > + length_type x_length, > + length_type y_length, > + length_type z_length) > { > + assert(static_cast(x_length) == this->size_[0]); > + assert(static_cast(y_length) == this->size_[1]); > + assert(static_cast(z_length) == this->size_[2]); > + > + // Check that data is dense row-major. 
> + assert(x_stride == static_cast(y_length*z_length)); > + assert(y_stride == static_cast(z_length)); > + assert(z_stride == 1); > + > + FFTW(execute_split_dft)(plan_in_place_, > + inout.first, inout.second, > + inout.first, inout.second); > } > virtual void by_reference(ctype *in, > stride_type in_x_stride, > @@ -499,18 +586,33 @@ > reinterpret_cast(in), > reinterpret_cast(out)); > } > - virtual void by_reference(ztype, > - stride_type, > - stride_type, > - stride_type, > - ztype, > - stride_type, > - stride_type, > - stride_type, > - length_type, > - length_type, > - length_type) > + virtual void by_reference(ztype in, > + stride_type in_x_stride, > + stride_type in_y_stride, > + stride_type in_z_stride, > + ztype out, > + stride_type out_x_stride, > + stride_type out_y_stride, > + stride_type out_z_stride, > + length_type x_length, > + length_type y_length, > + length_type z_length) > { > + assert(static_cast(x_length) == this->size_[0]); > + assert(static_cast(y_length) == this->size_[1]); > + assert(static_cast(z_length) == this->size_[2]); > + > + // Check that data is dense row-major. 
> + assert(in_x_stride == static_cast(y_length*z_length)); > + assert(in_y_stride == static_cast(z_length)); > + assert(in_z_stride == 1); > + assert(out_x_stride == static_cast(y_length*z_length)); > + assert(out_y_stride == static_cast(z_length)); > + assert(out_z_stride == 1); > + > + FFTW(execute_split_dft)(plan_by_reference_, > + in.first, in.second, > + out.first, out.second); > } > }; > > @@ -542,7 +644,7 @@ > case 1: rtl_in.order = tuple<0, 2, 1>(); break; > default: rtl_in.order = tuple<0, 1, 2>(); break; > } > - rtl_in.complex = cmplx_inter_fmt; > + rtl_in.complex = Create_plan::format; > rtl_out = rtl_in; > } > virtual bool requires_copy(Rt_layout<3> &) { return true;} > @@ -562,11 +664,11 @@ > FFTW(execute_dft_r2c)(plan_by_reference_, > in, reinterpret_cast(out)); > } > - virtual void by_reference(rtype *, > + virtual void by_reference(rtype *in, > stride_type, > stride_type, > stride_type, > - ztype, > + ztype out, > stride_type, > stride_type, > stride_type, > @@ -574,6 +676,8 @@ > length_type, > length_type) > { > + FFTW(execute_split_dft_r2c)(plan_by_reference_, > + in, out.first, out.second); > } > > }; > @@ -606,7 +710,7 @@ > case 1: rtl_in.order = tuple<0, 2, 1>(); break; > default: rtl_in.order = tuple<0, 1, 2>(); break; > } > - rtl_in.complex = cmplx_inter_fmt; > + rtl_in.complex = Create_plan::format; > rtl_out = rtl_in; > } > virtual bool requires_copy(Rt_layout<3> &) { return true;} > @@ -626,11 +730,11 @@ > FFTW(execute_dft_c2r)(plan_by_reference_, > reinterpret_cast(in), out); > } > - virtual void by_reference(ztype, > + virtual void by_reference(ztype in, > stride_type, > stride_type, > stride_type, > - rtype *, > + rtype *out, > stride_type, > stride_type, > stride_type, > @@ -638,6 +742,8 @@ > length_type, > length_type) > { > + FFTW(execute_split_dft_c2r)(plan_by_reference_, > + in.first, in.second, out); > } > > }; > Index: src/vsip/opt/fftw3/create_plan.hpp > =================================================================== > --- 
src/vsip/opt/fftw3/create_plan.hpp (revision 0) > +++ src/vsip/opt/fftw3/create_plan.hpp (revision 0) > @@ -0,0 +1,224 @@ > +/* Copyright (c) 2007 by CodeSourcery. All rights reserved. > + > + This file is available for license from CodeSourcery, Inc. under the terms > + of a commercial license and under the GPL. It is not part of the VSIPL++ > + reference implementation and is not available under the BSD license. > +*/ > +/** @file vsip/opt/fftw3/create_plan.hpp > + @author Assem Salama > + @date 2007-04-13 > + @brief VSIPL++ Library: File that has create_plan struct > +*/ > +#ifndef VSIP_OPT_FFTW3_CREATE_PLAN_HPP > +#define VSIP_OPT_FFTW3_CREATE_PLAN_HPP > + > +#include > + > +#include > + > +namespace vsip > +{ > +namespace impl > +{ > +namespace fftw3 > +{ > + > +// This is a helper struct to create temporary buffers used durring plan > +// creation. > +template > +struct Cmplx_buffer; > + > +// intereaved complex > +template > +struct Cmplx_buffer > +{ > + std::complex *ptr() { return buffer_.get(); } > + > + Cmplx_buffer(length_type size) : buffer_(32, size) > + {} > + aligned_array > buffer_; > +}; > + > +// split complex > +template > +struct Cmplx_buffer > +{ > + Cmplx_buffer(length_type size) : > + buffer_r_(32, size), > + buffer_i_(32, size) > + {} > + > + std::pair ptr() > + { return std::pair(buffer_r_.get(), buffer_i_.get()); } > + > + aligned_array buffer_r_; > + aligned_array buffer_i_; > +}; > + > +// Convert form axis to tuple > +template > +Rt_tuple tuple_from_axis(int A); > + > +template <> > +Rt_tuple tuple_from_axis<1>(int A) { return Rt_tuple(0,1,2); } > +template <> > +Rt_tuple tuple_from_axis<2>(int A) > +{ > + switch (A) { > + case 0: return Rt_tuple(1,0,2); > + default: return Rt_tuple(0,1,2); > + }; > +} > + > +template <> > +Rt_tuple tuple_from_axis<3>(int A) > +{ > + switch (A) { > + case 0: return Rt_tuple(2,1,0); > + case 1: return Rt_tuple(0,2,1); > + default: return Rt_tuple(0,1,2); > + }; > +} > + > +// This is a helper strcut to 
create plans > +template > +struct Create_plan; > + > +// interleaved > +template<> > +struct Create_plan > +{ > + > + // create function for complex -> complex > + template + typename T, dimension_type Dim> > + static PlanT > + create(std::complex* ptr1, std::complex* ptr2, > + int exp, int flags, Domain const& size) > + { > + int sz[Dim],i; > + for(i=0;i + return create_fftw_plan(Dim, sz, ptr1,ptr2,exp,flags); > + } > + > + // create function for real -> complex > + template + typename T, dimension_type Dim> > + static PlanT > + create(T* ptr1, std::complex* ptr2, > + int A, int flags, Domain const& size) > + { > + int sz[Dim],i; > + for(i=0;i + if(A != Dim-1) std::swap(sz[A], sz[Dim-1]); > + return create_fftw_plan(Dim,sz,ptr1,ptr2,flags); > + } > + > + // create function for complex -> real > + template + typename T, dimension_type Dim> > + static PlanT > + create(std::complex* ptr1, T* ptr2, > + int A, int flags, Domain const& size) > + { > + int sz[Dim],i; > + for(i=0;i + if(A != Dim-1) std::swap(sz[A], sz[Dim-1]); > + return create_fftw_plan(Dim,sz,ptr1,ptr2,flags); > + } > + > + static rt_complex_type const format = cmplx_inter_fmt; > + > +}; > + > +// split > +template<> > +struct Create_plan > +{ > + > + // create for complex -> complex > + template + typename T, dimension_type Dim> > + static PlanT > + create(std::pair ptr1, std::pair ptr2, > + int exp, int flags, Domain const& size) > + { > + IodimT iodims[Dim]; > + int i; > + Applied_layout::type, > + Stride_unit_dense, Cmplx_split_fmt> > > + app_layout(size); > + > + for(i=0;i + { > + iodims[i].n = app_layout.size(i); > + iodims[i].is = iodims[i].os = app_layout.stride(i); > + } > + > + return create_fftw_plan(Dim, iodims, ptr1,ptr2, flags); > + > + } > + > + // create for real -> complex > + template + typename T, dimension_type Dim> > + static PlanT > + create(T *ptr1, std::pair ptr2, > + int A, int flags, Domain const& size) > + { > + IodimT iodims[Dim]; > + int i; > + Applied_layout > > + 
app_layout(Rt_layout(stride_unit_align, > + tuple_from_axis(A), > + cmplx_split_fmt, > + 0), > + size, sizeof(T)); > + > + > + for(i=0;i + { > + iodims[i].n = app_layout.size(i); > + iodims[i].is = iodims[i].os = app_layout.stride(i); > + } > + > + return create_fftw_plan(Dim, iodims, ptr1,ptr2, flags); > + } > + > + // create for complex -> real > + template + typename T, dimension_type Dim> > + static PlanT > + create(std::pair ptr1, T* ptr2, > + int A, int flags, Domain const& size) > + { > + IodimT iodims[Dim]; > + int i; > + Applied_layout > > + app_layout(Rt_layout(stride_unit_align, > + tuple_from_axis(A), > + cmplx_split_fmt, > + 0), > + size, sizeof(T)); > + > + > + > + > + for(i=0;i + { > + iodims[i].n = app_layout.size(i); > + iodims[i].is = iodims[i].os = app_layout.stride(i); > + } > + > + return create_fftw_plan(Dim, iodims, ptr1,ptr2, flags); > + } > + > + static rt_complex_type const format = cmplx_split_fmt; > +}; > + > + > +} // namespace vsip::impl::fftw3 > +} // namespace vsip::impl > +} // namespace vsip > + > +#endif // VSIP_OPT_FFTW3_CREATE_PLAN_HPP > Index: src/vsip/opt/fftw3/fftw_support.hpp > =================================================================== > --- src/vsip/opt/fftw3/fftw_support.hpp (revision 0) > +++ src/vsip/opt/fftw3/fftw_support.hpp (revision 0) > @@ -0,0 +1,92 @@ > +/* Copyright (c) 2007 by CodeSourcery. All rights reserved. > + > + This file is available for license from CodeSourcery, Inc. under the terms > + of a commercial license and under the GPL. It is not part of the VSIPL++ > + reference implementation and is not available under the BSD license. 
> +*/ > +/** @file vsip/opt/fftw3/fftw_support.hpp > + @author Assem Salama > + @date 2007-04-25 > + @brief VSIPL++ Library: File that has overloaded create functions for > + fftw > + > +*/ > +#ifndef VSIP_OPT_FFTW3_FFTW_SUPPORT_HPP > +#define VSIP_OPT_FFTW3_FFTW_SUPPORT_HPP > + > +namespace vsip > +{ > +namespace impl > +{ > +namespace fftw3 > +{ > + > +#define DCL_FFTW_PLAN_FUNC_C2C(T, fT) \ > +fT##_plan create_fftw_plan(int dim, int *sz, \ > + std::complex* ptr1, std::complex* ptr2,\ > + int exp, int flags) \ > +{ return fT##_plan_dft(dim,sz,reinterpret_cast(ptr1), \ > + reinterpret_cast(ptr2), exp, flags); \ > +} \ > +\ > +fT##_plan create_fftw_plan(int dim, fT##_iodim *iodim, \ > + std::pair ptr1, std::pair ptr2,\ > + int flags) \ > +{ return fT##_plan_guru_split_dft(dim,iodim,0,NULL, \ > + ptr1.first,ptr1.second,ptr2.first,ptr2.second, \ > + flags); \ > +} > + > +#define DCL_FFTW_PLAN_FUNC_R2C(T, fT) \ > +fT##_plan create_fftw_plan(int dim, int *sz, \ > + T* ptr1, std::complex* ptr2,\ > + int flags) \ > +{ return fT##_plan_dft_r2c(dim,sz,ptr1, \ > + reinterpret_cast(ptr2), flags); \ > +} \ > +\ > +fT##_plan create_fftw_plan(int dim, fT##_iodim *iodim, \ > + T* ptr1, std::pair ptr2,\ > + int flags) \ > +{ return fT##_plan_guru_split_dft_r2c(dim,iodim,0,NULL, \ > + ptr1,ptr2.first,ptr2.second, \ > + flags); \ > +} > + > +#define DCL_FFTW_PLAN_FUNC_C2R(T, fT) \ > +fT##_plan create_fftw_plan(int dim, int *sz, \ > + std::complex* ptr1, T* ptr2,\ > + int flags) \ > +{ return fT##_plan_dft_c2r(dim,sz,reinterpret_cast(ptr1), \ > + ptr2, flags); \ > +} \ > +\ > +fT##_plan create_fftw_plan(int dim, fT##_iodim *iodim, \ > + std::pair ptr1, T* ptr2,\ > + int flags) \ > +{ return fT##_plan_guru_split_dft_c2r(dim,iodim,0,NULL, \ > + ptr1.first,ptr1.second,ptr2, \ > + flags); \ > +} > + > +#define DCL_FFTW_PLANS(T, fT) \ > + DCL_FFTW_PLAN_FUNC_C2C(T, fT) \ > + DCL_FFTW_PLAN_FUNC_R2C(T, fT) \ > + DCL_FFTW_PLAN_FUNC_C2R(T, fT) > + > + > +#if VSIP_IMPL_PROVIDE_FFT_FLOAT > + 
DCL_FFTW_PLANS(float, fftwf)
> +#endif
> +#if VSIP_IMPL_PROVIDE_FFT_DOUBLE
> + DCL_FFTW_PLANS(double, fftw)
> +#endif
> +#if VSIP_IMPL_PROVIDE_FFT_LONG_DOUBLE
> + DCL_FFTW_PLANS(long double, fftwl)
> +#endif
> +
> +} // namespace vsip::impl::fftw3
> +} // namespace vsip::impl
> +} // namespace vsip
> +
> +#endif

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Mon Jun 4 15:03:10 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 04 Jun 2007 11:03:10 -0400
Subject: [vsipl++] Support for parallel generator blocks
In-Reply-To: <46642880.1070209@codesourcery.com>
References: <46603624.5040002@codesourcery.com> <46642688.8000008@codesourcery.com> <46642880.1070209@codesourcery.com>
Message-ID: <466429AE.9080609@codesourcery.com>

Assem Salama wrote:
> Jules Bergmann wrote:
>> Assem Salama wrote:
>>> Everyone,
>>> This patch was submitted a while ago but didn't receive any
>>> feedback. This patch has a Choose_local_block addition that switches
>>> between Map_subset_block and Subset_block.
>>
>> Assem,
>>
>> Thanks for resending this. I did have some feedback from the first
>> time around, I apologize if you did not see it:
>>
>>
>> This looks good, however, can you extend Choose_subblock to handle
>> Global_map and Replicated_map? Both maps should be able to use a
>> Subset_block.
>>
>> Also, you might consider specializing Create_subblock based on the
>> RetBlock type rather than Map type, since the RetBlock type is what
>> governs the arguments to the constructor. As currently written, if
>> you add a new case to Choose_subblock (say for Global_map), but
>> forget to add it to Create_subblock, you'll get an error.
>>
>> -- Jules
>>
> Jules,
> I am confused. This patch does support Global_map and Replicated map...

Assem,

No, I am confused :). Sorry! I was looking at feedback from a previous version of the patch.

The patch looks good, please check it in.
-- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Mon Jun 4 15:11:49 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 04 Jun 2007 11:11:49 -0400
Subject: [vsipl++] benchmarks
In-Reply-To: <46603794.4050608@codesourcery.com>
References: <46603794.4050608@codesourcery.com>
Message-ID: <46642BB5.7050102@codesourcery.com>

Assem Salama wrote:
> Everyone,
> This patch contains two benchmarks, one for expression template stuff
> and the other for vramp.

Assem,

These look good. Please check expr.cpp in.

For vramp.cpp, can you

 - use Create_map from benchmarks/create_map.hpp,
 - merge do_test into vramp.cpp, so that vramp.hpp goes away,

and then check in?

thanks,
-- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From don at codesourcery.com Tue Jun 5 21:09:16 2007
From: don at codesourcery.com (Don McCoy)
Date: Tue, 05 Jun 2007 15:09:16 -0600
Subject: [vsipl++] [patch] more cleanup with benchmarks
In-Reply-To: <465D836C.8050000@codesourcery.com>
References: <4654C4FC.40300@codesourcery.com> <4655F411.60307@codesourcery.com> <465D836C.8050000@codesourcery.com>
Message-ID: <4665D0FC.1070600@codesourcery.com>

Jules Bergmann wrote:
> Don, this looks good, please check it in. -- Jules
>

Just FYI, this is checked in now.

--
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

From assem at codesourcery.com Wed Jun 6 05:35:29 2007
From: assem at codesourcery.com (Assem Salama)
Date: Wed, 06 Jun 2007 01:35:29 -0400
Subject: Simd unaligned vectors
Message-ID: <466647A1.5060603@codesourcery.com>

Everyone,

This patch adds support for operations where all vectors are unaligned.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.06062007.1.log
Type: text/x-log
Size: 9984 bytes
Desc: not available
URL:

From jules at codesourcery.com Wed Jun 6 17:16:34 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 06 Jun 2007 13:16:34 -0400
Subject: [patch] simd.hpp: fix typos, work around ppu-g++
Message-ID: <4666EBF2.9040407@codesourcery.com>

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: simd.diff
URL:

From jules at codesourcery.com Wed Jun 6 17:35:14 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 06 Jun 2007 13:35:14 -0400
Subject: [vsipl++] [patch] simd.hpp: fix typos, work around ppu-g++
In-Reply-To: <4666EBF2.9040407@codesourcery.com>
References: <4666EBF2.9040407@codesourcery.com>
Message-ID: <4666F052.30204@codesourcery.com>

Oops, I attached the wrong patch.

Patch applied.

-- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: simd.diff
URL:

From jules at codesourcery.com Wed Jun 6 17:41:41 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 06 Jun 2007 13:41:41 -0400
Subject: [patch] Split large tests
Message-ID: <4666F1D5.9060809@codesourcery.com>

This splits several large tests into separate, smaller tests. The smaller tests are easier to compile on machines with limited physical memory.

Patch applied.

-- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.diff
URL:

From don at codesourcery.com Wed Jun 6 18:33:46 2007
From: don at codesourcery.com (Don McCoy)
Date: Wed, 06 Jun 2007 12:33:46 -0600
Subject: [patch] Fix dot product benchmark for split complex case
Message-ID: <4666FE0A.50504@codesourcery.com>

Ok to commit?
-- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: bdot.diff URL: From jules at codesourcery.com Wed Jun 6 18:46:12 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 06 Jun 2007 14:46:12 -0400 Subject: [vsipl++] [patch] Fix dot product benchmark for split complex case In-Reply-To: <4666FE0A.50504@codesourcery.com> References: <4666FE0A.50504@codesourcery.com> Message-ID: <466700F4.5080606@codesourcery.com> Don McCoy wrote: > Ok to commit? Don, Instead of ifdef'ing the case out, can you make the test class t_dot2 check the evaluator's ct_valid flag (the check would be through an implicit template parameter / class specialization). That would be a little more robust, in case this code ever gets copied-and-pasted. thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From hq at export2u.ro Thu Jun 7 23:14:52 2007 From: hq at export2u.ro (hq at export2u.ro) Date: Fri, 8 Jun 2007 02:14:52 +0300 Subject: Romanian PHP, Java, ASP, & .NET Software Outsourcing Message-ID: A non-text attachment was scrubbed... Name: not available Type: text/plain charset=us-ascii Size: 4753 bytes Desc: not available URL: From jules at codesourcery.com Mon Jun 11 17:31:56 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Mon, 11 Jun 2007 13:31:56 -0400 Subject: [patch] Expression performance optimization Message-ID: <466D870C.9050108@codesourcery.com> This patch has several optimizations for expression performance. For Scalar_blocks, it uses a new shared map for all blocks, instead of each block having a local Local_or_global map. It also removes the storage of the Scalar_block's size. Before these changes, the compiler believed that Scalar_blocks had to be stored on the stack. This added significant overhead to expressions using Scalar_blocks. 
For unary, binary, and ternary functions defined in fns_elementwise, it passes views by const reference, instead of by value. This avoids the need to increment/decrement reference counts, which add significant overhead for small vector sizes. For the mul binary function, it uses the op::Mult functor instead of creating a redundant mul_functor. mul_functor was functionally equivalent, but math library evaluators (such as SAL, IPP, and builtin SIMD) did not recognize it. Similar changes need to be made for other functions that correspond to an operator. Currently testing. Will post some examples of improved performance in a bit. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: opt.diff URL: From don at codesourcery.com Mon Jun 11 22:37:35 2007 From: don at codesourcery.com (Don McCoy) Date: Mon, 11 Jun 2007 16:37:35 -0600 Subject: [vsipl++] [patch] Fix dot product benchmark for split complex case In-Reply-To: <466700F4.5080606@codesourcery.com> References: <4666FE0A.50504@codesourcery.com> <466700F4.5080606@codesourcery.com> Message-ID: <466DCEAF.4040108@codesourcery.com> Jules Bergmann wrote: > Instead of ifdef'ing the case out, can you make the test class t_dot2 > check the evaluator's ct_valid flag (the check would be through an > implicit template parameter / class specialization). That would be a > little more robust, in case this code ever gets copied-and-pasted. Here is a revised patch. -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: bdot2.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: bdot2.diff URL: From jules at codesourcery.com Tue Jun 12 14:23:48 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 12 Jun 2007 10:23:48 -0400 Subject: [vsipl++] [patch] Fix dot product benchmark for split complex case In-Reply-To: <466DCEAF.4040108@codesourcery.com> References: <4666FE0A.50504@codesourcery.com> <466700F4.5080606@codesourcery.com> <466DCEAF.4040108@codesourcery.com> Message-ID: <466EAC74.1030706@codesourcery.com> Don McCoy wrote: > Jules Bergmann wrote: >> Instead of ifdef'ing the case out, can you make the test class t_dot2 >> check the evaluator's ct_valid flag (the check would be through an >> implicit template parameter / class specialization). That would be a >> little more robust, in case this code ever gets copied-and-pasted. > Here is a revised patch. This looks good, please check it in. thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From stefan at codesourcery.com Tue Jun 12 14:55:16 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Tue, 12 Jun 2007 10:55:16 -0400 Subject: patch: fix merge conflicts Message-ID: <466EB3D4.6080505@codesourcery.com> The attached patch fixes some conflicts seemingly introduced by two overlapping patches / commits last week. (Assem: please be careful when applying 'svn resolved'. There were some artifacts (such as "<<<< .mine") as well as conflicting code checked in with the last commit.) The new Create_plan harness uses Stride_unit_align everywhere (there was one place with Stride_unit_dense, that looked like a typo). I'm not sure this is required, as we only stipulate aligned input for 1D FFTs. Thus, the current code may require the data to be copied without need. Should I instead add an overload for Create_plan::create() for 1D FFTs and relax the alignment for non-1D cases to Stride_unit_dense ? 
Thanks, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 -------------- next part -------------- A non-text attachment was scrubbed... Name: fftw3.patch Type: text/x-patch Size: 4729 bytes Desc: not available URL: From assem at codesourcery.com Tue Jun 12 16:23:01 2007 From: assem at codesourcery.com (Assem Salama) Date: Tue, 12 Jun 2007 12:23:01 -0400 Subject: [vsipl++] patch: fix merge conflicts In-Reply-To: <466EB3D4.6080505@codesourcery.com> References: <466EB3D4.6080505@codesourcery.com> Message-ID: <466EC865.6060600@codesourcery.com> Stefan Seefeld wrote: >The attached patch fixes some conflicts seemingly introduced by two overlapping >patches / commits last week. > >(Assem: please be careful when applying 'svn resolved'. There were some artifacts >(such as "<<<< .mine") as well as conflicting code checked in with the last commit.) > > Sorry about that. I actually fixed this same file yesterday but didn't get around to generating a patch. Thanks, Assem From jules at codesourcery.com Tue Jun 12 16:56:59 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 12 Jun 2007 12:56:59 -0400 Subject: [vsipl++] patch: fix merge conflicts In-Reply-To: <466EB3D4.6080505@codesourcery.com> References: <466EB3D4.6080505@codesourcery.com> Message-ID: <466ED05B.4040405@codesourcery.com> Stefan Seefeld wrote: > The attached patch fixes some conflicts seemingly introduced by two overlapping > patches / commits last week. > > (Assem: please be careful when applying 'svn resolved'. There were some artifacts > (such as "<<<< .mine") as well as conflicting code checked in with the last commit.) > > The new Create_plan harness uses Stride_unit_align everywhere (there was one place > with Stride_unit_dense, that looked like a typo). I'm not sure this is required, > as we only stipulate aligned input for 1D FFTs. Thus, the current code may require > the data to be copied without need.
> Should I instead add an overload for Create_plan::create() for 1D FFTs and relax > the alignment for non-1D cases to Stride_unit_dense ? Stefan, Thanks for fixing this. Do you consider FFTM to be 1D or non-1D? The reason I ask is ... We implement FFTM by planning for a single 1D FFT (we do this, rather than planning for multiple 1D FFTs, because distributed data may cause the local multiple count to be different from the global multiple count). Ideally this single 1D FFT should be planned for aligned data. We can relax that when the FFT size is not a multiple of the alignment. I.e. if we're doing multiple 257-point FFTs, we can plan for unaligned data. Does that sound reasonable? -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From stefan at codesourcery.com Tue Jun 12 17:09:23 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Tue, 12 Jun 2007 13:09:23 -0400 Subject: [vsipl++] patch: fix merge conflicts In-Reply-To: <466ED05B.4040405@codesourcery.com> References: <466EB3D4.6080505@codesourcery.com> <466ED05B.4040405@codesourcery.com> Message-ID: <466ED343.8000002@codesourcery.com> Jules Bergmann wrote: > Do you consider FFTM to be 1D or non-1D? Well, the distinction is only required if we use different alignment constraints. For planning we use FFTW_UNALIGNED for all but 1D FFT (i.e. 2D, 3D, as well as M). With assem's patch (and my little fix) we use Stride_unit_align throughout, which may be overly restrictive, given that Stride_unit_dense would be perfectly valid for non-1D cases, so we may end up doing a redundant copy (well, two, actually). > The reason I ask is ... > > We implement FFTM by planning for a single 1D FFT (we do this, rather > than planning for multiple 1D FFTs, because distributed data may cause > the local multiple count to be different from the global multiple count). > > Ideally this single 1D FFT should be planned for aligned data. Right, understood. 
We don't do that yet, though this only seems to require a change to the Fftm_impl constructor's call to Fft_base<>(), where we would no longer pass FFTW_UNALIGNED. > We can relax that when the FFT size is not a multiple of the alignment. > I.e. if we're doing multiple 257-point FFTs, we can plan for unaligned > data. > > Does that sound reasonable? Indeed. Should I add my suggested change above to the patch before checking it in ? Thanks, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 From jules at codesourcery.com Tue Jun 12 17:17:25 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 12 Jun 2007 13:17:25 -0400 Subject: [vsipl++] patch: fix merge conflicts In-Reply-To: <466ED343.8000002@codesourcery.com> References: <466EB3D4.6080505@codesourcery.com> <466ED05B.4040405@codesourcery.com> <466ED343.8000002@codesourcery.com> Message-ID: <466ED525.8060807@codesourcery.com> > > Indeed. Should I add my suggested change above to the patch before checking > it in ? Yes, that sounds good. I suspect we'll have to do something different if people ever start using multi-dim FFTs, but for now let's avoid the copy. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From stefan at codesourcery.com Tue Jun 12 21:20:48 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Tue, 12 Jun 2007 17:20:48 -0400 Subject: [vsipl++] patch: fix merge conflicts In-Reply-To: <466ED525.8060807@codesourcery.com> References: <466EB3D4.6080505@codesourcery.com> <466ED05B.4040405@codesourcery.com> <466ED343.8000002@codesourcery.com> <466ED525.8060807@codesourcery.com> Message-ID: <466F0E30.9090004@codesourcery.com> Jules Bergmann wrote: > >> >> Indeed. Should I add my suggested change above to the patch before >> checking >> it in ? > > Yes, that sounds good. I suspect we'll have to do something different > if people ever start using multi-dim FFTs, but for now let's avoid the > copy. 
-- Jules Here is a new patch, incorporating the changes we discussed. 1D FFT as well as FFTM now use / require aligned blocks if the block size is a multiple of the alignment size (and thus individual rows operations can be vectorized). (Since the patch is slightly more involved than I originally assumed, I'd prefer another round of review.) Thanks, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 -------------- next part -------------- A non-text attachment was scrubbed... Name: fftw.patch Type: text/x-patch Size: 10484 bytes Desc: not available URL: From jules at codesourcery.com Wed Jun 13 02:52:07 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 12 Jun 2007 22:52:07 -0400 Subject: [vsipl++] patch: fix merge conflicts In-Reply-To: <466F0E30.9090004@codesourcery.com> References: <466EB3D4.6080505@codesourcery.com> <466ED05B.4040405@codesourcery.com> <466ED343.8000002@codesourcery.com> <466ED525.8060807@codesourcery.com> <466F0E30.9090004@codesourcery.com> Message-ID: <466F5BD7.7070606@codesourcery.com> Stefan Seefeld wrote: > Jules Bergmann wrote: >>> Indeed. Should I add my suggested change above to the patch before >>> checking >>> it in ? >> Yes, that sounds good. I suspect we'll have to do something different >> if people ever start using multi-dim FFTs, but for now let's avoid the >> copy. -- Jules > > Here is a new patch, incorporating the changes we discussed. 1D FFT as > well as FFTM now use / require aligned blocks if the block size is a multiple > of the alignment size (and thus individual rows operations can be vectorized). > > (Since the patch is slightly more involved than I originally assumed, I'd > prefer another round of review.) > > Thanks, > Stefan Stefan, this looks good. I like the way you have made aligned/unaligned orthogonal in Fft_base. Please check it in. 
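The alignment rule discussed in this thread (plan for aligned data only when the FFT size is a multiple of the alignment, so that every row of a dense matrix starts on an aligned boundary) can be sketched as a simple size check. This is only an illustration, not code from the patch: the helper name `rows_stay_aligned` and the 16-byte alignment used in the usage note are assumptions.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical helper, not part of Sourcery VSIPL++.
// Given a dense, row-major matrix whose first row starts on an
// `alignment`-byte boundary, every following row also starts on such a
// boundary exactly when the row size in bytes is a multiple of `alignment`.
template <typename T>
bool rows_stay_aligned(std::size_t row_length, std::size_t alignment)
{
  assert(alignment != 0);
  return (row_length * sizeof(T)) % alignment == 0;
}
```

Under these assumptions, rows of 256 complex<float> values keep a 16-byte alignment, so an aligned plan could be used for each row, whereas rows of 257 values (the 257-point case mentioned earlier in the thread) do not, and an unaligned plan would be the safe choice.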
thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From assem at codesourcery.com Wed Jun 13 16:12:39 2007 From: assem at codesourcery.com (Assem Salama) Date: Wed, 13 Jun 2007 12:12:39 -0400 Subject: SIMD unaligned loop fusion Message-ID: <46701777.5010809@codesourcery.com> Everyone, This patch makes a new dispatcher that is valid when all operands are unaligned. I could make the normal Simd_loop_fusion dispatcher for this and if the alignment is 0, don't do the initial cleanup. What does everyone think about that? Thanks, Assem -------------- next part -------------- A non-text attachment was scrubbed... Name: svn.diff.06132007.1.log Type: text/x-log Size: 11469 bytes Desc: not available URL: From jules at codesourcery.com Fri Jun 15 11:39:44 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Fri, 15 Jun 2007 07:39:44 -0400 Subject: [patch] Fix missing tags and traits Message-ID: <46727A80.5080907@codesourcery.com> This should fix the non-FFT test failures. I'm looking into the FFT failures now. From the location of the assert failures, it looks like complex->real FFT is broken for FFTW3. Does that case ring any bells? Patch applied. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: misc-fix.diff URL: From jules at codesourcery.com Sat Jun 16 05:32:55 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Sat, 16 Jun 2007 01:32:55 -0400 Subject: [patch] Fix Rt_layout/Rt_extdata/Fftw3 BE Message-ID: <46737607.5090607@codesourcery.com> This patch fixes a couple of bugs - First, it fixes Applied_layout > (used by Rt_extdata) to only pay attention to alignment when the pack type is stride_unit_aligned. Previously it adjusted alignment when it was non-zero. (This is probably the only fix strictly necessary to fix the test failures). 
- Second, it robustifies the FFT workspace frontend and the FFTW3 BE to deal with alignment requirements. For workspace, before constructing temporary input/output buffers, the backend is now queried to determine the acceptable layout. This will handle cases where padding is needed to fix alignment. For the FFTW3 BE, the 2D FFTM implementation class now uses the stride arguments to determine the FFT to FFT stride (instead of the size). This lets it deal with non-dense but unit-stride matrices. (Right now query_layout guarantees that input/output matrices will be unit-stride, but in the future we can relax this). I'm going to start a snapshot build with this. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fftw3.diff URL: From jules at codesourcery.com Mon Jun 18 02:56:50 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Sun, 17 Jun 2007 22:56:50 -0400 Subject: [patch] Fix FFTW3 BE alignment for R-to-C and C-to-R plan creation Message-ID: <4675F472.8010408@codesourcery.com> Patch applied. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: misc-fft.diff URL: From assem at codesourcery.com Mon Jun 18 16:03:46 2007 From: assem at codesourcery.com (Assem Salama) Date: Mon, 18 Jun 2007 12:03:46 -0400 Subject: SIMD all unaligned dispatch Message-ID: <4676ACE2.50104@codesourcery.com> Everyone, This patch includes some missing pieces not included in previous patch. This should make a fresh checkout compile ok :) I apologize for last patch's incompleteness. Thanks, Assem -------------- next part -------------- A non-text attachment was scrubbed...
Name: svn.diff.06172007.1.log Type: text/x-log Size: 13313 bytes Desc: not available URL: From jules at codesourcery.com Mon Jun 18 20:30:56 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Mon, 18 Jun 2007 16:30:56 -0400 Subject: [patch] Fix scalar_blocks to work with GCC 3.4 Message-ID: <4676EB80.9090807@codesourcery.com> This patch scales back the scalar-block optimization a bit when using GCC 3.x (anything pre-4.0). GCC 3.4.4 was having trouble compiling expressions like this one from threshold.cpp: Vector A, C; float b = 0.5; C = ite(A >= b, A, 0.0) The scalar value for b (0.5) was being replaced by the scalar value (0.0). Any of the following changes made the error go away: - compiling with lower optimization levels - compiling with less aggressive inlining options - using printfs to examine the values stored in the expression blocks - using later versions of GCC Similar errors occurred in the coverage_ternary_*.cpp tests, but were even more difficult to debug because most attempts to simplify the test case or print debugging information caused the error to go away. The fix adds a copy constructor equivalent to the default copy constructor, which IIUC forces GCC to store scalar_blocks (and any expression containing scalar_blocks) on the stack. This patch also moves benchmark installation into a separate rule. The rationale for this is that building the benchmarks takes much longer than building the core library and installing the benchmarks isn't necessary to use the library. Patch applied, snapshot started! -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: fix-sb.diff URL: From Aramini at LL.MIT.edu Tue Jun 19 00:59:39 2007 From: Aramini at LL.MIT.edu (Michael Aramini) Date: Mon, 18 Jun 2007 20:59:39 -0400 Subject: configure fails on Solaris on Intel/AMD architecture Message-ID: <46772A7B.9090202@LL.MIT.edu> When attempting to build Sourcery VSIPL++ 1.3 on Solaris running on a system with 64-bit AMD processors, configure fails as follows: > ATLAS: CC gcc > ATLAS: F77 > ATLAS: CFLAGS -g -O2 > checking build system type... i386-pc-solaris2.10 > checking host system type... i386-pc-solaris2.10 > checking for i386-pc-solaris2.10-gcc... gcc > checking for C compiler default output file name... a.out > checking whether the C compiler works... yes > checking whether we are cross compiling... no > checking for suffix of executables... > checking for suffix of object files... o > checking whether we are using the GNU C compiler... yes > checking whether gcc accepts -g... yes > checking for gcc option to accept ANSI C... none needed > checking for machine type... probe > checking for asm style... configure: error: cannot determine asm type. > =============================================================== > configure: error: built-in ATLAS configure FAILED. -Michael Aramini From jules at codesourcery.com Tue Jun 19 12:34:52 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 19 Jun 2007 08:34:52 -0400 Subject: [vsipl++] SIMD all unaligned dispatch In-Reply-To: <4676ACE2.50104@codesourcery.com> References: <4676ACE2.50104@codesourcery.com> Message-ID: <4677CD6C.3060207@codesourcery.com> Assem Salama wrote: > Everyone, > This patch includes some missing pieces not included in previous patch. > This should make a fresh checkout compile ok :) I apologize for last > patch's incompleteness. Assem, What is the reason for extending the length of type_list? Is that needed for this patch? 
Rather than add a new evaluator ("all unaligned"), I would like to have a single evaluator handle the cases where views have the same alignment (whether it is 0 or N). The only difference between the two is cleanup code before SIMD processing. Can you make that change and repost a patch? Also, did you have a chance to benchmark the iterator change (#4) below? -- Jules > > Thanks, > Assem > > > ------------------------------------------------------------------------ > > Index: src/vsip/opt/simd/expr_evaluator.hpp > =================================================================== > + static bool rt_valid(LB& lhs, RB const& rhs) > + { > + Ext_data dda(lhs, SYNC_OUT); > + int lhs_a = simd::Proxy_factory::alignment(lhs); [1] Instead of calling Proxy_factory::alignment (which internally creates another Ext_data object -- which is both extra overhead and potentially undefined), use Simd_traits::alignment_of directly. > + return (dda.stride(0) == 1 && > + simd::Proxy_factory::rt_valid(rhs, lhs_a)); > + > + > + } > + > + // First, deal with unaligned pointers > + typename Ext_data::raw_ptr_type raw_ptr = dda.data(); > + while(simd::Simd_traits::alignment_of(raw_ptr) && > + n > 0) > + { > + lhs.put(size-n, rhs.get(size-n)); > + n--; > + raw_ptr++; > + } [2] What updates the pointers held by lp and rp? They are still unaligned, right? Ah, I see. You've changed Proxy::Proxy to force alignment below. 
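The pattern under review here (a scalar loop that runs until the destination pointer reaches an aligned address, followed by whole-vector processing) can be sketched as below for the case where all operands share the same misalignment. This is a hedged illustration, not the code from the patch: `vec_size`, `alignment`, `alignment_of` and `add_same_alignment` are stand-in names invented for the sketch, and the "vector" step is written as a plain inner loop where a real evaluator would use SIMD loads and stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the SIMD traits: 4 floats per 16-byte vector.
std::size_t const vec_size  = 4;
std::size_t const alignment = vec_size * sizeof(float);

// Byte offset of p from the previous alignment boundary (0 means aligned).
inline std::size_t alignment_of(float const* p)
{
  return static_cast<std::size_t>(
    reinterpret_cast<std::uintptr_t>(p) % alignment);
}

// Computes dst[i] = a[i] + b[i] for i in [0, n).
// Precondition: dst, a and b all have the same misalignment, so aligning
// one of them aligns all three.
// Returns the number of leading elements handled by the scalar loop.
std::size_t add_same_alignment(float* dst, float const* a, float const* b,
                               std::size_t n)
{
  assert(alignment_of(dst) == alignment_of(a));
  assert(alignment_of(dst) == alignment_of(b));

  std::size_t i = 0;

  // 1. Scalar cleanup until the pointers reach an aligned address.
  //    If the operands are already aligned, this loop does nothing.
  for (; i < n && alignment_of(dst + i) != 0; ++i)
    dst[i] = a[i] + b[i];
  std::size_t const peeled = i;

  // 2. Main loop over whole vectors (stand-in for aligned SIMD operations).
  for (; i + vec_size <= n; i += vec_size)
    for (std::size_t j = 0; j < vec_size; ++j)
      dst[i + j] = a[i + j] + b[i + j];

  // 3. Scalar loop for the remaining tail elements.
  for (; i < n; ++i)
    dst[i] = a[i] + b[i];

  return peeled;
}
```

In this sketch the aligned case and the "same nonzero alignment" case go through one routine, and the only difference is whether step 1 executes any iterations, which is the single-evaluator arrangement suggested above.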
> Index: src/vsip/opt/simd/eval_generic.hpp > =================================================================== > --- src/vsip/opt/simd/eval_generic.hpp (revision 174261) > +++ src/vsip/opt/simd/eval_generic.hpp (working copy) > @@ -664,6 +664,8 @@ > > static bool rt_valid(DstBlock& dst, SrcBlock const& src) > { > + typedef simd::Simd_traits simd; > + > // check if all data is unit stride > Ext_data ext_dst(dst, SYNC_OUT); > Ext_data ext_a(src.first().left(), SYNC_IN); > @@ -672,7 +674,11 @@ > ext_a.stride(0) == 1 && > ext_b.stride(0) == 1 && > // make sure (A op B, A, k) > - (&(src.first().left()) == &(src.second()))); > + (&(src.first().left()) == &(src.second())) && > + // make sure everyting is aligned! > + !simd::alignment_of(ext_dst.data()) && > + !simd::alignment_of(ext_a.data()) && > + !simd::alignment_of(ext_b.data())); [3] Doesn't threshold handle initial unaligned values? If so, it is sufficient to check that dst, a, and b all have the same alignment. > static void exec(DstBlock& dst, SrcBlock const& src) > Index: src/vsip/opt/simd/expr_iterator.hpp > =================================================================== > --- src/vsip/opt/simd/expr_iterator.hpp (revision 174261) > +++ src/vsip/opt/simd/expr_iterator.hpp (working copy) > @@ -268,13 +268,14 @@ > simd_type load() const > { return simd::perm(x0_, x1_, sh_); } > > - void increment(length_type n = 1) > + //void increment(length_type n = 1) > + void increment() > { > - ptr_unaligned_ += n * Simd_traits::vec_size; > - ptr_aligned_ += n; > + ptr_unaligned_ += Simd_traits::vec_size; > + ptr_aligned_++; > > // update x0 > - x0_ = (n == 1)? x1_:simd::load((value_type*)ptr_aligned_); > + x0_ = x1_; [4] Did you ever benchmark the difference between these two? > > - Proxy(value_type *ptr) : ptr_(ptr) {} > + Proxy(value_type *ptr) : ptr_(ptr) > + { > + // Force alignment of pointer. 
> + intptr_t int_ptr = (intptr_t)ptr_; > + int_ptr &= ~(Simd_traits::alignment-1); > + ptr_ = (value_type*) int_ptr; > + } > + [5] For LValue_access_traits, this ignores the IsAligned template parameter. since we appear to only handle the case where the LHS is aligned, we should specialize this for IsAligned = true. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Tue Jun 19 12:45:34 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 19 Jun 2007 08:45:34 -0400 Subject: [vsipl++] SIMD all unaligned dispatch In-Reply-To: <4677CD6C.3060207@codesourcery.com> References: <4676ACE2.50104@codesourcery.com> <4677CD6C.3060207@codesourcery.com> Message-ID: <4677CFEE.7010309@codesourcery.com> Assem, Also, can you include a unit test for this in your next patch? thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Wed Jun 20 19:18:04 2007 From: don at codesourcery.com (Don McCoy) Date: Wed, 20 Jun 2007 13:18:04 -0600 Subject: [patch] fix for MPI type define Message-ID: <46797D6C.7050504@codesourcery.com> This patch corrects a minor typo related to the location of the mpi.h header file. Ok to commit? -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mh.diff URL: From jules at codesourcery.com Wed Jun 20 22:44:05 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 20 Jun 2007 18:44:05 -0400 Subject: [vsipl++] [patch] fix for MPI type define In-Reply-To: <46797D6C.7050504@codesourcery.com> References: <46797D6C.7050504@codesourcery.com> Message-ID: <4679ADB5.30008@codesourcery.com> Don McCoy wrote: > This patch corrects a minor typo related to the location of the mpi.h > header file. > > Ok to commit? Don, Looks good, please check it in. Can you mention the macro name in the ChangeLog? 
Thanks for catching this. The obvious response is to wonder how this ever worked. IIRC we used to create a pound-define with the MPI header name, i.e. #define VSIP_IMPL_MPI_H and would then include it #include VSIP_IMPL_MPI_H However, this did not work with Intel C++. The fix was to use VSIP_IMPL_MPI_H_TYPE, but apparently we did not test the mpi/mpi.h case after making the change! -- Jules > > > ------------------------------------------------------------------------ > > Index: ChangeLog > =================================================================== > --- ChangeLog (revision 174589) > +++ ChangeLog (working copy) > @@ -1,3 +1,8 @@ > +2007-06-20 Don McCoy > + > + * src/vsip/core/mpi/services.hpp: Fix typo for systems having > + their MPI header files in the mpi/ subdirectory. Can you mention the macro name: * src/vsip/core/mpi/services.hpp (VSIP_IMPL_MPI_H_TYPE): ... -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Mon Jun 25 22:31:42 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Mon, 25 Jun 2007 18:31:42 -0400 Subject: Using FFTW3 with the VSIPL++ Reference Implementation Message-ID: <4680424E.1050309@codesourcery.com> Here is a description of how to use FFTW3 with the VSIPL++ reference implementation. These instructions are only for the reference implementation. FFTW3 already works with the optimized implementation (just configure with --enable-fft=fftw3 or --enable-fft=builtin). Please let me know if you have any questions regarding these instructions. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- A non-text attachment was scrubbed... Name: FFTW3_RefImpl.pdf Type: application/pdf Size: 86944 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: 1.3-ri-fftw3.diff URL: From assem at codesourcery.com Tue Jun 26 07:16:24 2007 From: assem at codesourcery.com (Assem Salama) Date: Tue, 26 Jun 2007 03:16:24 -0400 Subject: reductions-idx Message-ID: <4680BD48.6000004@codesourcery.com> Everyone, This patch fixes a failure in the reduction-idx test. Thanks, Assem -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: svn.diff.06262007.1.log URL: From assem at codesourcery.com Tue Jun 26 13:27:52 2007 From: assem at codesourcery.com (Assem Salama) Date: Tue, 26 Jun 2007 09:27:52 -0400 Subject: Vacation Message-ID: <46811458.8040601@codesourcery.com> Everyone, I will be taking this week off and working half time for the next two weeks after that. I will have access to the internet. Thanks, Assem From John.Day at EssexCorp.com Tue Jun 26 23:18:09 2007 From: John.Day at EssexCorp.com (Day, John) Date: Tue, 26 Jun 2007 19:18:09 -0400 Subject: fftm compile problem Message-ID: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> Hello, I am trying out the vsipl++ 1.3 IA32 binary for Windows/XP and am having a problem instantiating the fftm templates in your fft.cpp code example (from the source distribution). I am using MinGW with g++ 3.4.5 and I am getting "error: no type named `first' in `struct vsip::impl::fft::LibraryTagList'" (see below for all messages). I suspect this has something to do with vsip/opt/dispatch.hpp, but I don't know enough about the architecture to figure out how to fix it. I have also tried to compile the Judd and Cottel vsipl++ beamformer and got the same error for the fftm template instantiation: http://hpec-si.com/MinimumVarianceBeamformerExample.pdf Everything else compiled without error. I know that you recommend the Intel compiler for the Windows binary, but I'm hoping that g++ 3.4.5 might work with MinGW as I assume it does under Linux. Has anyone else tried this? Also, has anyone built the vsipl++ source using MinGW alone (i.e. without Cygwin)?
[MinGW provides a convenient environment for linking gcc/g++/g77 to MSVC runtime, Cygwin dll's are not required] Thanks, John Day Staff Scientist Essex Corp. Melbourne, Fl > g++ -c -I/usr/local/include fft.cpp /usr/local/include/vsip/core/fft.hpp: In instantiation of `vsip::impl::fft_facade<1u, vsip::cscalar_f, vsip::cscalar_f, vsip::impl::fft::LibraryTagList, -0x000000002, by_value, 0u, alg_time>': /usr/local/include/vsip/core/fft.hpp:432: instantiated from `vsip::Fft' fft.cpp:45: instantiated from here /usr/local/include/vsip/core/fft.hpp:187: error: no type named `first' in `struct vsip::impl::fft::LibraryTagList' /usr/local/include/vsip/core/fft.hpp: In instantiation of `vsip::impl::fft_facade<1u, vsip::cscalar_f, vsip::cscalar_f, vsip::impl::fft::LibraryTagList, -0x000000001, by_value, 0u, alg_time>': /usr/local/include/vsip/core/fft.hpp:432: instantiated from `vsip::Fft' fft.cpp:46: instantiated from here /usr/local/include/vsip/core/fft.hpp:187: error: no type named `first' in `struct vsip::impl::fft::LibraryTagList' /usr/local/include/vsip/core/fft.hpp: In constructor `vsip::impl::fft_facade::fft_facade(const vsip::Domain&, typename vsip::impl::fft::base_interface::axis, vsip::impl::fft_facade::exponent>::scalar_type) [with unsigned int D = 1u, I = vsip::cscalar_f, O = vsip::cscalar_f, L = vsip::impl::fft::LibraryTagList, int S = -0x000000002, unsigned int N = 0u, vsip::alg_hint_type H = alg_time]': /usr/local/include/vsip/core/fft.hpp:439: instantiated from `vsip::Fft::Fft(const vsip::Domain::dim>&, typename vsip::impl::fft_facade::dim, I, O, vsip::impl::fft::LibraryTagList, S, R, N, H>::scalar_type) [with V = vsip::const_Vector, I = vsip::cscalar_f, O = vsip::cscalar_f, int S = -0x000000002, vsip::return_mechanism_type R = by_value, unsigned int N = 0u, vsip::alg_hint_type H = alg_time]' fft.cpp:45: instantiated from here /usr/local/include/vsip/core/fft.hpp:199: error: no type named `first' in `struct vsip::impl::fft::LibraryTagList' 
/usr/local/include/vsip/core/fft.hpp: In constructor `vsip::impl::fft_facade::fft_facade(const vsip::Domain&, typename vsip::impl::fft::base_interface::axis, vsip::impl::fft_facade::exponent>::scalar_type) [with unsigned int D = 1u, I = vsip::cscalar_f, O = vsip::cscalar_f, L = vsip::impl::fft::LibraryTagList, int S = -0x000000001, unsigned int N = 0u, vsip::alg_hint_type H = alg_time]': /usr/local/include/vsip/core/fft.hpp:439: instantiated from `vsip::Fft::Fft(const vsip::Domain::dim>&, typename vsip::impl::fft_facade::dim, I, O, vsip::impl::fft::LibraryTagList, S, R, N, H>::scalar_type) [with V = vsip::const_Vector, I = vsip::cscalar_f, O = vsip::cscalar_f, int S = -0x000000001, vsip::return_mechanism_type R = by_value, unsigned int N = 0u, vsip::alg_hint_type H = alg_time]' fft.cpp:46: instantiated from here /usr/local/include/vsip/core/fft.hpp:199: error: no type named `first' in `struct vsip::impl::fft::LibraryTagList' This electronic message and any files transmitted with it contain information which may be privileged and/or proprietary. The information is intended for use solely by the intended recipient(s). If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of this information is prohibited. If you have received this electronic message in error, please advise the sender by reply email or by telephone (301-939-7000) and delete the message. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan at codesourcery.com Tue Jun 26 23:27:33 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Tue, 26 Jun 2007 19:27:33 -0400 Subject: [vsipl++] fftm compile problem In-Reply-To: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> Message-ID: <4681A0E5.6010707@codesourcery.com> Hello John, Day, John wrote: > Hello, > > I am trying out vsipl++ 1.3 IA32 binary for Windows/XP and am having a > problem instantiating the fftm templates in your fft.cpp code example > (from the source distribution). > > I am using MinGW with g++ 3.4.5 and I am getting "error: no type named > `first' in `struct vsip::impl::fft::LibraryTagList'" (see below for all > messages). I suspect this has something to do with > vsip/opt/dispatch.hpp, but I don't know enough about the architecture to > figure out how to fix it. You are right, the problem is related to the dispatch mechanism we use to delegate calls to backends. In this case, it sounds as if you haven't configured any fft backends. You mention the windows binary release, which is configured / compiled for use with Intel's IPP and MKL libraries. But then you are talking about the source distribution, and mingw. To help you a little further it is important to know what Sourcery VSIPL++ package you use, and how exactly the command looks that raises the error. Are you configuring / building Sourcery VSIPL++ yourself? What commands are you using?
Thanks, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 From John.Day at EssexCorp.com Wed Jun 27 01:07:00 2007 From: John.Day at EssexCorp.com (Day, John) Date: Tue, 26 Jun 2007 21:07:00 -0400 Subject: [vsipl++] fftm compile problem References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> <4681A0E5.6010707@codesourcery.com> Message-ID: <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> Stefan wrote: >> You mention the windows binary release, which is configured / compiled >> for use with Intel's IPP and MKL libraries. But then you are talking about >> the source distribution, and mingw. To help you a little further it >> is important to know what Sourcery VSIPL++ package you use ..... At first I tried to build the source distribution using MinGW and g++ 3.4.5, but the build failed trying to configure ATLAS and I was not able to produce the config files. So then I tried using the IA32 binary, just to see if I could compile the example fft.cpp (and the BeamformEx files) from the MS-DOS command line: > g++ -c -I/usr/local/include fft.cpp That's when the error occurred. I did not configure the fft backends or anything else. Nor did I expect the link step to work because there are no .a libraries in the Windows binary. I suppose I will have to set up a Cygwin environment, but I was hoping that MinGW alone would work. Tnx, John Day
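For readers following the thread, here is a minimal sketch of how a backend list that is filled in by preprocessor macros can produce an error of the form "no type named `first'". It is an illustration under assumptions, not the Sourcery VSIPL++ sources: every name in it (DEMO_HAVE_BACKEND_A, DEMO_HAVE_BACKEND_B, None_type, Cons, Dispatcher, selected_backend) is hypothetical.

```cpp
// Hypothetical sketch of macro-gated backend dispatch; NOT the library's code.
#include <cassert>
#include <cstring>

// Pretend the build passed -DDEMO_HAVE_BACKEND_A=1 (hypothetical macro).
// Removing this line leaves the list empty and the dispatcher fails to compile.
#define DEMO_HAVE_BACKEND_A 1

struct None_type {};  // list terminator; deliberately has no member 'first'

template <typename First, typename Rest>
struct Cons
{
  typedef First first;  // head of the list
  typedef Rest rest;    // remainder of the list
};

struct Backend_a_tag { static const char* name() { return "backend_a"; } };
struct Backend_b_tag { static const char* name() { return "backend_b"; } };

// Each backend joins the list only when its macro is defined and non-zero.
#if DEMO_HAVE_BACKEND_B
typedef Cons<Backend_b_tag, None_type> Tail_b;
#else
typedef None_type Tail_b;
#endif

#if DEMO_HAVE_BACKEND_A
typedef Cons<Backend_a_tag, Tail_b> Library_tag_list;
#else
typedef Tail_b Library_tag_list;
#endif

// The dispatcher asks for List::first unconditionally. If no backend macro
// was defined, List is None_type and the compiler reports that 'first' is
// not a member of it.
template <typename List>
struct Dispatcher
{
  typedef typename List::first backend;
  static const char* selected() { return backend::name(); }
};

// Name of the backend chosen at compile time.
inline const char* selected_backend()
{
  return Dispatcher<Library_tag_list>::selected();
}
```

With the macro defined, selected_backend() names the first backend in the list. Without any backend macro the program does not compile at all, which is the same class of failure as the one reported when the -D flags are left off the command line.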
From stefan at codesourcery.com Wed Jun 27 02:28:39 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Tue, 26 Jun 2007 22:28:39 -0400 Subject: [vsipl++] fftm compile problem In-Reply-To: <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> <4681A0E5.6010707@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> Message-ID: <4681CB57.50905@codesourcery.com> Day, John wrote: > Stefan wrote: >>> You mention the windows binary release, which is configured / compiled >>> for use with Intel's IPP and MKL libraries. But then you are talking about >>> the source distribution, and mingw. To help you a little further it >>> is important to know what Sourcery VSIPL++ package you use ..... > > At first I tried to build the source distribution using MinGW and g++ 3.4.5, but the build failed trying to configure ATLAS and I was not able to produce the config files. Right, configuring ATLAS is not easy. We have never attempted to support ATLAS on Windows. Note, however, that there are a number of configure options to work around those problems by using alternate lapack implementations, or none at all (thus disabling parts of the functionality provided by the VSIPL++ spec). You can find out more about these in the quickstart (http://www.codesourcery.com/public/vsiplplusplus/sourceryvsipl++-1.3/quickstart/ch02s03.html) > So then I tried using the IA32 binary, just to see if I could compile the example fft.cpp (and the BeamformEx files) from the MS-DOS command line: > >> g++ -c -I/usr/local/include fft.cpp > > That's when the error occurred. I did not configure the fft backends or anything else. Nor did I expect the link step to work because there are no .a libraries in the Windows binary. That is strange, as the Windows binary package is configured / built for use with Intel's IPP and MKL. I'm thus not sure what causes the error message you are reporting. 
Please note that the suggested way to build applications with Sourcery VSIPL++ is to query compiler options from the vsipl++.pc files that are part of binary releases. It is possible, or even likely, that you are missing some important macro definition that causes the built-in FFT backends to be masked. > I suppose I will have to set up a Cygwin environment, but I was hoping that MinGW alone would work. The only supported compiler on Windows is Intel's ICC. We haven't attempted to build using GCC on Windows, though we are now considering it. Regards, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 From jules at codesourcery.com Wed Jun 27 11:33:40 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 27 Jun 2007 07:33:40 -0400 Subject: [vsipl++] fftm compile problem In-Reply-To: <4681CB57.50905@codesourcery.com> References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> <4681A0E5.6010707@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> <4681CB57.50905@codesourcery.com> Message-ID: <46824B14.4090009@codesourcery.com> John, A couple of bits: - It should be possible to build Sourcery VSIPL++ with MinGW on windows. Unfortunately, you won't be able to use MinGW with the windows binary package from our website, because that has been built with Intel C++, which IIUC has a different C++ ABI than GCC on windows. To use MinGW, you will need to build the library from the source package. This requires you to run configure, so you will need either MSys or cygwin (something to provide the equivalent of /bin/sh). - MinGW GCC 3.4.5 will work fine (we use 3.4.4 to build our Linux binary packages). GCC 4.1/4.2 will give better performance, but that is another matter ... - The compile error you're seeing is a result of the library not being able to find an FFT backend. This happens because you're missing some macro definitions that need to be on the command line.
If you look in the file 'lib/pkgconfig/vsipl++.pc' of the binary package, you will see a line: cppflags=-I${includedir} -DVSIP_IMPL_PAR_SERVICE=0 -DVSIP_IMPL_IPP_FFT=1 -DVSIP_IMPL_FFT_USE_FLOAT=1 -DVSIP_IMPL_FFT_USE_DOUBLE=1 -DVSIP_IMPL_FFT_USE_LONG_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_FLOAT=1 -DVSIP_IMPL_PROVIDE_FFT_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_LONG_DOUBLE=0 -DVSIP_IMPL_USE_CBLAS=2 These macros tell the library which FFT backends to use (in this case, we're using the IPP FFT, which happens to be how the windows binary package was configured). Those definitions need to be on the command line when you compile. You might retry compiling fft.cpp as g++ -c -I/usr/local/include -DVSIP_IMPL_PAR_SERVICE=0 -DVSIP_IMPL_IPP_FFT=1 -DVSIP_IMPL_FFT_USE_FLOAT=1 -DVSIP_IMPL_FFT_USE_DOUBLE=1 -DVSIP_IMPL_FFT_USE_LONG_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_FLOAT=1 -DVSIP_IMPL_PROVIDE_FFT_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_LONG_DOUBLE=0 -DVSIP_IMPL_USE_CBLAS=2 fft.cpp That should fix the compilation errors. However, the above mentioned problem of ICC and MinGW C++ ABI's being incompatible still remains of course! - Sourcery VSIPL++ can be built with Cygwin too. Do you have MSYS installed along with MinGW? If so, you should configure the library from the source package. The following configure command would be a good starting point: configure \ --with-lapack=simple-builtin \ --enable-fft=builtin Let us know how that works! -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From John.Day at EssexCorp.com Thu Jun 28 11:31:50 2007 From: John.Day at EssexCorp.com (Day, John) Date: Thu, 28 Jun 2007 07:31:50 -0400 Subject: [vsipl++] fftm compile problem References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> <4681A0E5.6010707@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> <4681CB57.50905@codesourcery.com> <46824B14.4090009@codesourcery.com> Message-ID: <5CD1C9B961A59D4592F02D12CD93238B066A2A05@STITCH.essexcorp.com> Jules, We tried building vsipl++1.3 on Windows using the Cygwin environment, but had many problems. However (surprisingly) we were successful in building using standalone MinGW with Msys and gcc/g++/g77 3.4.5, with only two minor glitches: 1. MinGW didn't have sys/times.h, so we created one with just a tms structure which satisfied the make. 2. Modified vendor\fftw\kernel\alloc.c to allow compilation of our_alloc16() The two build examples, fft.exe and example1.exe were linked and ran OK, which suggests that our compiler switches and linkage issues were resolved OK. But we are still having a problem compiling the Judd/Cottel BeamformEx code (http://hpec-si.com/MinimumVarianceBeamformerExample.pdf) in files BeamformEx.cpp and beam_steer_coeff.cpp [See listing compile/link commands and errors at end of this message] BeamformEx.cpp: pg 7 // Create a cholesky object vsip::chold chold_object(vsip::chold::lower,nh); I was able to get this to compile by changing the first parameter of the constructor to (vsip::mat_uplo)0, since it seems to be looking for an enumeration of zero. beam_steer_coeff.cpp: pg37 k *= (2.0 * M_PI/sv); This statement causes the error, possibly due to incorrect overloading of *= operator. I can get all of the beamformer files to compile and link if I comment out this last statement. Is this still a config problem, or is this code possibly out of date?
A comment on page 43 suggests that this is using a very early implementation of VSIPL++. We are trying to get this beamformer working to do signal processing on some towed-array sonar data. Are there any other adaptive beamformers available similar to this in the VSIPL++ community, either commercially or as free software? Thanks for your help and suggestions, John Day >set files=BeamformEx\BeamformEx.cpp BeamformEx\array.cpp BeamformEx\beam_steer_coef.cpp BeamformEx\data_input.cpp BeamformEx\param_mvdr.cpp BeamformEx\phat.cpp >g++ -c -I src -I ./src -I./vendor/clapack/SRC -I/usr/local/include/fftw3 -DVSIP_IMPL_FFTW3=1 -DVSIP_IMPL_PAR_SERVICE=0 -DVSIP_IMPL_FFT_USE_FLOAT=1 -DVSIP_IMPL_FFT_USE_DOUBLE=1 -DVSIP_IMPL_FFT_USE_LONG_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_FLOAT=1 -DVSIP_IMPL_PROVIDE_FFT_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_LONG_DOUBLE=1 -I/lapack -DVSIP_IMPL_USE_CBLAS=0 -g -O2 -I./src BeamformEx\BeamformEx.cpp BeamformEx\array.cpp BeamformEx\beam_steer_coef.cpp BeamformEx\data_input.cpp BeamformEx\param_mvdr.cpp BeamformEx\phat.cpp BeamformEx\BeamformEx.cpp: In function `int main(int, char**)': BeamformEx\BeamformEx.cpp:220: error: `lower' is not a member of `vsip::chold' src/vsip/core/expr/scalar_block.hpp: In instantiation of `vsip::impl::Scalar_block_base<1u, double>': src/vsip/core/expr/scalar_block.hpp:69: instantiated from `vsip::impl::Scalar_block<1u, double>' src/vsip/core/expr/binary_block.hpp:76: instantiated from `vsip::impl::Binary_expr_block<1u, vsip::impl::op::Mult, vsip::Dense<1u, vsip::scalar_f, vsip::tuple<0u, 1u, 2u>, vsip::Local_map>, vsip::scalar_f, vsip::impl::Scalar_block<1u, double>, double>' src/vsip/vector.hpp:45: instantiated from `vsip::const_Vector, vsip::Local_map>, vsip::scalar_f, vsip::impl::Scalar_block<1u, double>, double> >' src/vsip/vector.hpp:270: instantiated from `vsip::Vector& vsip::Vector::operator*=(const T0&) [with T0 = double, T = vsip::scalar_f, Block = vsip::Dense<1u, vsip::scalar_f, vsip::tuple<0u, 1u, 2u>, 
vsip::Local_map>]' BeamformEx\beam_steer_coef.cpp:71: instantiated from here src/vsip/core/expr/scalar_block.hpp:60: error: `vsip::impl::Scalar_block_base::map_' has incomplete type src/vsip/core/parallel/local_map.hpp:32: error: declaration of `struct vsip::Local_or_global_map<1u>'
From jules at codesourcery.com Thu Jun 28 13:03:05 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 28 Jun 2007 09:03:05 -0400 Subject: [vsipl++] fftm compile problem In-Reply-To: <5CD1C9B961A59D4592F02D12CD93238B066A2A05@STITCH.essexcorp.com> References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> <4681A0E5.6010707@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> <4681CB57.50905@codesourcery.com> <46824B14.4090009@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A05@STITCH.essexcorp.com> Message-ID: <4683B189.7040407@codesourcery.com> Day, John wrote: > Jules, > We tried building vsipl++1.3 on Windows using the Cygwin enviroment, > but had many problems. If you don't mind, can you describe the problems? We've had some success with cygwin, however we would like to make things more robust. > However (surprisingly) we were successful in > building using standalone MinGW with Msys and gcc/g++/g77 3.4.5, Great! > with only two minor glitches: > 1. MinGW didn't have sys/times.h, so we created one with just a tms > structure which satisfied the make. OK. Do you know where this was being included from? We try to pull in , but only if you've enabled one of the posix timers (--enable-timer=posix or --enable-timer=realtime). > 2. Modified vendor\fftw\kernel\alloc.c to allow compilation of > our_alloc16() Was this to fix a compilation error in that routine, or to force the #ifdef to true? > > The two build examples, fft.exe and example1.exe were linked and ran OK, which suggests that our compiler switches and linkage issues were resolved OK. 
> > But we are still having a problem compiling the Judd/Cottel BeamformEx code (http://hpec-si.com/MinimumVarianceBeamformerExample.pdf) in files BeamformEx.cpp and beam_steer_coeff.cpp > [See listing compile/link commands and errors at end of this message] > > BeamformEx.cpp: pg 7 > // Create a cholesky object > vsip::chold > chold_object(vsip::chold::lower,nh); > I was able to get this to compile by changing the first parameter of the constructor to (vsip::mat_uplo)0, since it seems to be looking for an enumeration of zero. 'lower' is no longer part of the chold object, rather it is in the vsip namespace. You might try changing parameter to vsip::lower. > > beam_steer_coeff.cpp: pg37 > k *= (2.0 * M_PI/sv); > This statement causes the error, possibly due to incorrect > overloading of *= operator. > > I can get all of the beamformer files to compile and link if I > comment out this last statement. Is this still a config problem, or > is this code possibly out of date? A comment on page 43 suggests > that this is using a very early implementation of VSIPL++. That statement should work. From the error message below, the library may be failing to include a header file. Can you try adding the following include #include and recompiling? > We are trying to get this beamformer working to do signal > processing on some towed-array sonar data. Are there any other > adaptive beamformers available similar to this in the VSIPL++ > community, either commercially or as free software? There is a K-Omega beamformer (also originating from Randy Judd) that was included with the old VSIPL++ reference implementation. However, I am not sure if it is adaptive. > > Thanks for your help and suggestions, > John Day No problem! Thanks for your feedback on VSIPL++. 
-- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From John.Day at EssexCorp.com Thu Jun 28 15:05:50 2007 From: John.Day at EssexCorp.com (Day, John) Date: Thu, 28 Jun 2007 11:05:50 -0400 Subject: [vsipl++] fftm compile problem References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> <4681A0E5.6010707@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> <4681CB57.50905@codesourcery.com> <46824B14.4090009@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A05@STITCH.essexcorp.com> <4683B189.7040407@codesourcery.com> Message-ID: <5CD1C9B961A59D4592F02D12CD93238B07CEBA55@STITCH.essexcorp.com> Jules wrote: >> We tried building vsipl++1.3 on Windows using the Cygwin environment, >> but had many problems. > If you don't mind, can you describe the problems? We've had some > success with cygwin, however we would like to make things more robust. Turns out that there was only a single problem: the configure failed on fftw3l (using the "builtin" parameters that you suggested). Configure reported an error on the console, but the logs did not contain any specific error that we could identify or troubleshoot. That's when we decided to give MinGW a try. I should also mention that our development platform is a Dell running an x64 Dual Core Xeon processor. But we are mostly running in 32-bit emulation (using the WOW64 emulation) which seems to be slightly unstable; for example, we cannot get gdb (MinGW or Cygwin versions) to run reliably. So there might be other "x64" side-effects at play here. >> However (surprisingly) we were successful in >> building using standalone MinGW with Msys and gcc/g++/g77 3.4.5, >Great! >> with only two minor glitches: >> 1. MinGW didn't have sys/times.h, so we created one with just a tms >> structure which satisfied the make. >OK. Do you know where this was being included from?
We try to pull >in , but only if you've enabled one of the posix timers >(--enable-timer=posix or --enable-timer=realtime). These CLAPACK files included sys/times.h: vendor/clapack/SRC/dsecnd.c vendor/clapack/SRC/second.c >> 2. Modified vendor\fftw\kernel\alloc.c to allow compilation of >> our_alloc16() >Was this to fix a compilation error in that routine, or to force the >#ifdef to true? We forced with these #defines #define WITH_OUR_MALLOC16 #define MIN_ALIGNMENT 16 #if defined(WITH_OUR_MALLOC16) && (MIN_ALIGNMENT == 16) > >> The two build examples, fft.exe and example1.exe were linked and ran >>OK, which suggests that our compiler switches and linkage issues were >>resolved OK. >> >> But we are still having a problem compiling the Judd/Cottel >>BeamformEx code >>(http://hpec-si.com/MinimumVarianceBeamformerExample.pdf) in files >>BeamformEx.cpp and beam_steer_coeff.cpp >> [See listing compile/link commands and errors at end of this message] >> >> BeamformEx.cpp: pg 7 >> // Create a cholesky object >> vsip::chold >> chold_object(vsip::chold>vsip::by_reference>::lower,nh); >> I was able to get this to compile by changing the first parameter of >>the constructor to (vsip::mat_uplo)0, since it seems to be looking for >>an enumeration of zero. >'lower' is no longer part of the chold object, rather it is in the >vsip namespace. You might try changing parameter to vsip::lower. That worked. >> >> beam_steer_coeff.cpp: pg37 >> k *= (2.0 * M_PI/sv); >> This statement causes the error, possibly due to incorrect >> overloading of *= operator. >> >> I can get all of the beamformer files to compile and link if I >> comment out this last statement. Is this still a config problem, or >> is this code possibly out of date? A comment on page 43 suggests >> that this is using a very early implementation of VSIPL++. >That statement should work. From the error message below, the library >may be failing to include a header file.
>Can you try adding the following include > #include >and recompiling? That worked. Also tried replacing both includes with a single #include and that worked too. >> We are trying to get this beamformer working to do signal >> processing on some towed-array sonar data. Are there any other >> adaptive beamformers available similar to this in the VSIPL++ >> community, either commercially or as free software? >There is a K-Omega beamformer (also originating from Randy Judd) that >was included with the old VSIPL++ reference implementation. However, >I am not sure if it is adaptive. We found this presentation with code snippets, http://hpec-si.com/S14-HPEC-SI-VSIPL++.ppt#298,12,VSIPL++ Version ...but can't find the entire source code. How might we obtain this code or similar VSIPL++ implementations? (We are under Navy contract, so might reuse some old government code, if any exists). Thanks, John Day
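As an illustration of the sys/times.h workaround mentioned in this message, here is a minimal sketch of what a stand-in header might contain. The four field names follow the POSIX definition of struct tms. Everything else is an assumption: the times_stub namespace only keeps the sketch from clashing with a real <sys/times.h> (an actual stub would declare the struct at global scope), and times_fallback is a hypothetical helper that the thread does not mention.

```cpp
// Sketch of a stand-in for <sys/times.h> on a toolchain that lacks it.
// Assumption: only the struct layout is needed to satisfy compilation.
#include <cassert>
#include <ctime>  // std::clock_t, std::clock

namespace times_stub  // hypothetical namespace, used here to avoid clashes
{

// Field names as specified by POSIX for struct tms.
struct tms
{
  std::clock_t tms_utime;   // user CPU time
  std::clock_t tms_stime;   // system CPU time
  std::clock_t tms_cutime;  // user CPU time of terminated children
  std::clock_t tms_cstime;  // system CPU time of terminated children
};

// Hypothetical fallback, not the POSIX times() function: it reports all
// processor time as user time via std::clock(). Note that std::clock()
// counts in CLOCKS_PER_SEC units, while POSIX times() counts clock ticks,
// so callers that convert to seconds would need to account for that.
inline std::clock_t times_fallback(tms* buf)
{
  std::clock_t now = std::clock();
  if (buf)
  {
    buf->tms_utime = now;
    buf->tms_stime = 0;
    buf->tms_cutime = 0;
    buf->tms_cstime = 0;
  }
  return now;
}

}  // namespace times_stub
```

A struct-only stub is enough when the timer routines are merely compiled and never called. Code that actually calls times() would also need a definition of that function at link time.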
From jules at codesourcery.com Thu Jun 28 16:08:39 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 28 Jun 2007 12:08:39 -0400 Subject: [vsipl++] fftm compile problem In-Reply-To: <5CD1C9B961A59D4592F02D12CD93238B07CEBA55@STITCH.essexcorp.com> References: <5CD1C9B961A59D4592F02D12CD93238B07C9E806@STITCH.essexcorp.com> <4681A0E5.6010707@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A03@STITCH.essexcorp.com> <4681CB57.50905@codesourcery.com> <46824B14.4090009@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B066A2A05@STITCH.essexcorp.com> <4683B189.7040407@codesourcery.com> <5CD1C9B961A59D4592F02D12CD93238B07CEBA55@STITCH.essexcorp.com> Message-ID: <4683DD07.8020904@codesourcery.com> Day, John wrote: > Jules wrote: >>> We tried building vsipl++1.3 on Windows using the Cygwin enviroment, >>> but had many problems. > >> If you don't mind, can you describe the problems? We've had some >> success with cygwin, however we would like to make things more robust. > > Turns out that there was only a single problem: the configure failed on > fftw3l (using the "builtin" parameters that you suggested). Configure > reported an error on the console, but the logs did not contain any > specific error that we could identify or troubleshoot. That's when we > decided to give MingGW a try. Ok, you can work around that by configuring with --disable-fft-long-double. > > I should also mention that our development platform is a Dell running an > x64 Dual Core Xeon processor. But we are mostly running in 32-bit > emulation (using the WOW64 emulation) which seems to be slightly > unstable, for example we cannot get gdb (MinGW or Cygwin versions) to > run reliably. So there might be other "x64" side-effects at play here. Interesting. As you might know, our company also produces Sourcery G++ a productized version of the GNU toolchain. I'm checking with our G++ team to see if we have any solutions for 64-bit windows. 
> > These CLAPACK files included sys/times.h > vendor/clapack/SRC/dsecnd.c > vendor/clapack/SRC/second.c Thanks! Unfortunately, we pull in all of lapack, even though we don't use all of it, including the timer routines. I've captured this issue internally; we'll correct that in our next release. > > >> 2. Modified vendor\fftw\kernel\alloc.c to allow compilation of > >> our_alloc16() > >> Was this to fix a compilation error in that routine, or to force the >> #ifdef to true? > > We forced with these #defines > #define WITH_OUR_MALLOC16 > #define MIN_ALIGNMENT 16 > #if defined(WITH_OUR_MALLOC16) && (MIN_ALIGNMENT == 16) Thanks, we need to look into why FFTW's configure did not detect WITH_OUR_MALLOC16. > >> 'lower' is no longer part of the chold object, rather it is in the >> vsip namespace. You might try changing parameter to vsip::lower. > > That worked. Great! >> Can you try adding the following include > >> #include > >> and recompiling? > > That worked. > Also tried replacing both includes with a single #include > and that worked too. Great, thanks for trying that out. That is an issue in our library that we need to fix. Including map should not be required if maps are not being explicitly used. > >> There is a K-Omega beamformer (also originating from Randy Judd) that >> was included with the old VSIPL++ reference implementation. However, >> I am not sure if it is adaptive. > > We found this presentation with code snippets, > http://hpec-si.com/S14-HPEC-SI-VSIPL++.ppt#298,12,VSIPL++ Version > > ...but can't find the entire source code. How might we obtain this code > or similar VSIPL++ implementations? (We are under Navy contract, so > might reuse some old government code, if any exists). Ok, I'll look into where this code might be. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705
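The WITH_OUR_MALLOC16 / MIN_ALIGNMENT == 16 guard discussed above selects a 16-byte-aligned allocation path. As a rough sketch of the general over-allocate-and-align technique such a routine typically relies on (this is not FFTW's code, and the names aligned16_malloc and aligned16_free are hypothetical):

```cpp
// Sketch of a 16-byte aligned allocator built on malloc; NOT FFTW's code.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns a pointer aligned to 16 bytes, or a null pointer on failure.
inline void* aligned16_malloc(std::size_t n)
{
  const std::size_t align = 16;
  // Reserve room for the payload, alignment slack, and one saved pointer.
  void* raw = std::malloc(n + align + sizeof(void*));
  if (!raw) return 0;
  // Skip the slot for the saved pointer, then round up to a multiple of 16.
  std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
  std::uintptr_t aligned = (start + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
  void** user = reinterpret_cast<void**>(aligned);
  user[-1] = raw;  // remember the original block so it can be freed later
  return user;
}

// Releases a block obtained from aligned16_malloc.
inline void aligned16_free(void* p)
{
  if (p) std::free(static_cast<void**>(p)[-1]);
}
```

Storing the original pointer just below the aligned address lets the free routine recover the block returned by malloc without any extra bookkeeping.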