From jules at codesourcery.com Tue May 1 21:36:40 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 01 May 2007 17:36:40 -0400 Subject: [vsipl++] [patch] HPEC Challenge Benchmark, Firbank enhancement In-Reply-To: <463661B6.5020607@codesourcery.com> References: <46365F2D.7000104@codesourcery.com> <463661B6.5020607@codesourcery.com> Message-ID: <4637B2E8.5050906@codesourcery.com> Stefan Seefeld wrote: > Don McCoy wrote: > >> Some minor cleanup of the other benchmarks is included. > >> -struct t_firbank_base : public t_local_view >> +struct t_firbank_base : public t_local_view, Benchmark_base > > Could you please consistently either put the access specifier ('public') everywhere > or nowhere ? (I'd prefer nowhere, as for structs it is implied.) Sounds good to me. Otherwise this looks good. Please check it in. thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Thu May 3 15:15:40 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 03 May 2007 11:15:40 -0400 Subject: [patch] Fix Cbe FFTM BE to handle empty subblocks Message-ID: <4639FC9C.6010102@codesourcery.com> Patch applied. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fftm.diff URL: From don at codesourcery.com Sat May 5 22:35:15 2007 From: don at codesourcery.com (Don McCoy) Date: Sat, 05 May 2007 16:35:15 -0600 Subject: [patch] Benchmarking documentation Message-ID: <463D06A3.6090403@codesourcery.com> Attached is a patch to add a new chapter to the Reference section of the tutorial explaining how to run the benchmarks and interpret their output. It should serve as a good starting point, though many details still remain to be added. Also, I'd appreciate suggestions as to how to better organize the table in the first section. 
Thanks, -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchdoc.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchdoc.diff URL: From don at codesourcery.com Sun May 6 22:46:21 2007 From: don at codesourcery.com (Don McCoy) Date: Sun, 06 May 2007 16:46:21 -0600 Subject: [patch] install benchmark sources Message-ID: <463E5ABD.50403@codesourcery.com> As attached. Subsequent patches will be needed to 1) install the HPEC Benchmark sources 2) install benchmark executables 3) clean up the standalone makefile 4) update tutorial and quickstart documents Plus other revisions for the install directory layout as discussed. I'll try to make all these changes in small sensible pieces to make it easier to review. Regards, -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchmake.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchmake.diff URL: From jules at codesourcery.com Tue May 8 12:30:49 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 08 May 2007 08:30:49 -0400 Subject: [patch] Fix Fft_return_functor local subblock size Message-ID: <46406D79.6000408@codesourcery.com> This patch fixes Fft_return_functor to compute the correct local subblock size for Fftm, and includes a regression test. Patch applied. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: fftm_lsb.diff URL: From jules at codesourcery.com Tue May 8 14:04:40 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 08 May 2007 10:04:40 -0400 Subject: [vsipl++] Generator expr blocks In-Reply-To: <46339D31.8000008@codesourcery.com> References: <46339D31.8000008@codesourcery.com> Message-ID: <46408378.8040600@codesourcery.com> Assem Salama wrote: > Everyone, > This patch makes the local_block_type a Subset_block if a normal > Block_dist map is used. Assem, This looks good, however, can you extend Choose_subblock to handle Global_map and Replicated_map? Both maps should be able to use a Subset_block. Also, you might consider specializing Create_subblock based on the RetBlock type rather than Map type, since the RetBlock type is what governs the arguments to the constructor. As currently written, if you add a new case to Choose_subblock (say for Global_map), but forget to add it to Create_subblock, you'll get an error. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Tue May 8 14:53:33 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 08 May 2007 10:53:33 -0400 Subject: [vsipl++] fftw3 In-Reply-To: <46310732.9060908@codesourcery.com> References: <46310732.9060908@codesourcery.com> Message-ID: <46408EED.3020604@codesourcery.com> Assem Salama wrote: > Everyone, > This patch address Jule's comments. Took out create_plan_defs.hpp and > added a new file, fftw_support.hpp that contains overloaded functions > for creating plans. Also, using Rt_tuple and Applied_layout in the > create functions. Assem, This looks good. However, it looks like there are a few loose ends that need fixing before checking in: - you still include create_plan_defs (perhaps you have an old copy in your SVN directory?) - Although the split C->C Create_plan uses Applied_layout, some of the other split Create_plan::create functions refer to construct_dense_domain, which no longer exists. 
Can you check that those are being exercised with the tests? - A few naming nits. Can you address these and send out another patch? thanks, -- Jules > ------------------------------------------------------------------------ > > Index: src/vsip/opt/fftw3/fft.cpp > =================================================================== > --- src/vsip/opt/fftw3/fft.cpp (revision 165174) > +++ src/vsip/opt/fftw3/fft.cpp (working copy) > @@ -18,6 +18,7 @@ > #include > #include > #include > +#include [1] Why does fft.cpp need to include create_plan.hpp? Is it because it includes other header files that use create_plan.hpp? In that case, it would be better for those header files to directly include create_plan.hpp. Or is it because of some complexity in create_plan / create_plan_defs / fftw_support? If so, can you explain that again? :) > Index: src/vsip/opt/fftw3/create_plan.hpp > =================================================================== > --- src/vsip/opt/fftw3/create_plan.hpp (revision 0) > +++ src/vsip/opt/fftw3/create_plan.hpp (revision 0) > @@ -0,0 +1,231 @@ > +/* Copyright (c) 2007 by CodeSourcery. All rights reserved. > + > + This file is available for license from CodeSourcery, Inc. under the terms > + of a commercial license and under the GPL. It is not part of the VSIPL++ > + reference implementation and is not available under the BSD license. 
> +*/ > +/** @file vsip/opt/fftw3/create_plan.hpp > + @author Assem Salama > + @date 2007-04-13 > + @brief VSIPL++ Library: File that has create_plan struct > +*/ > +#ifndef VSIP_OPT_FFTW3_CREATE_PLAN_HPP > +#define VSIP_OPT_FFTW3_CREATE_PLAN_HPP > + > +#include > + > +#include > + > +namespace vsip > +{ > +namespace impl > +{ > +namespace fftw3 > +{ > + > +// This is a helper strcut to create plans [2] ^^^^^ Spelling > +template > +struct Create_plan; > + > +// interleaved > +template<> > +struct Create_plan > +{ > + > + // create function for complex -> complex > + template + typename T, dimension_type dim> [3] Naming: to be consistent, template parameters should be capitalized, with JavaStyle caps. I.e. plan_type -> PlanT iodim_type -> IodimT dim -> Dim Likewise below. > + static plan_type > + create(std::complex* ptr1, std::complex* ptr2, > + int exp, int flags, Domain const& size) > + { > + int sz[dim],i; > + for(i=0;i + return create_fftw_plan(dim, sz, ptr1,ptr2,exp,flags); > + } > + static rt_complex_type const type = cmplx_inter_fmt; [4] Please use a name other than 'type' for this member variable. Perhaps 'format'? In general, 'type' should be reserved for member type names created by typedefs. 
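A minimal sketch of the naming suggested in [3] and [4], using hypothetical stand-ins rather than the library's real declarations. The enum, `Toy_plan`, and `Create_plan_sketch` are assumptions made only for illustration, and the body of `create` just multiplies the extents instead of building an FFTW plan:

```cpp
#include <cstddef>

// Hypothetical stand-in for the library's complex-format tag
// (the real rt_complex_type is declared elsewhere in VSIPL++).
enum rt_complex_type { cmplx_inter_fmt, cmplx_split_fmt };

// Hypothetical plan type so the sketch compiles without FFTW.
struct Toy_plan
{
  std::size_t total;  // product of the extents handed to create()
};

template <rt_complex_type Format>
struct Create_plan_sketch;

// interleaved
template <>
struct Create_plan_sketch<cmplx_inter_fmt>
{
  // Template parameters are capitalized: PlanT and Dim,
  // not plan_type and dim.
  template <typename PlanT, std::size_t Dim>
  static PlanT
  create(std::size_t const (&size)[Dim])
  {
    std::size_t total = 1;
    for (std::size_t i = 0; i < Dim; ++i)
      total *= size[i];
    return PlanT{total};
  }

  // Member constant is named 'format'; 'type' stays reserved
  // for member typedefs.
  static rt_complex_type const format = cmplx_inter_fmt;
};
```

With this layout a caller would write something like `Create_plan_sketch<cmplx_inter_fmt>::create<Toy_plan>(dims)`, and any nested `type` name remains free for a typedef.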
> + > +}; > + > +// split > +template<> > +struct Create_plan > +{ > + > + // create for complex -> complex > + template + typename T, dimension_type dim> > + static plan_type > + create(std::pair ptr1, std::pair ptr2, > + int exp, int flags, Domain const& size) > + { > + iodim_type iodims[dim]; > + int i; > + Applied_layout::type, > + Stride_unit_dense, Cmplx_split_fmt> > > + app_layout(size); > + > + for(i=0;i + { > + iodims[i].n = app_layout.size(i); > + iodims[i].is = iodims[i].os = app_layout.stride(i); > + } > + > + return create_fftw_plan(dim, iodims, ptr1,ptr2, flags); > + > + } > + > + // create for real -> complex > + template + typename T, dimension_type dim> > + static plan_type > + create(T *ptr1, std::pair ptr2, > + int A, int flags, Domain const& size) > + { > + iodim_type iodims[dim]; > + int i; > + Domain dom = create_dense_domain(extent(size), > + tuple_from_axis(A)); [5] dom is not used. Also, since create_dense_domain is no longer defined, it suggests that this create function is not being tested. Can you check if any of the fft_be, fft, and fft_ext tests cover real -> complex? I think fft_ext should cover this. > + Applied_layout > > + app_layout(Rt_layout(stride_unit_align, > + tuple_from_axis(A), > + cmplx_split_fmt, > + 0), > + size, sizeof(T)); > + > + > + for(i=0;i + { > + iodims[i].n = app_layout.size(i); > + iodims[i].is = iodims[i].os = app_layout.stride(i); > + } > + > + return create_fftw_plan(dim, iodims, ptr1,ptr2, flags); > + } > + > + // create for complex -> real > + template + typename T, dimension_type dim> > + static plan_type > + create(std::pair ptr1, T* ptr2, > + int A, int flags, Domain const& size) > + { > + iodim_type iodims[dim]; > + int i; [6] Does the Applied_layout object not work for complex -> real? Likewise to above, since create_dense_domain is not defined, this create function is not being exercised. Can you check if any of the fft tests cover complex->real? 
> + Domain dom = create_dense_domain(extent(size), > + tuple_from_axis(A)); > + > + > + for(i=0;i + { > + iodims[i].n = dom[i].size(); > + iodims[i].is = iodims[i].os = dom[i].stride(); > + } > + > + return create_fftw_plan(dim, iodims, ptr1,ptr2, flags); > + } > + > + static rt_complex_type const type = cmplx_split_fmt; > +}; > + > + > +} // namespace vsip::impl::fftw3 > +} // namespace vsip::impl > +} // namespace vsip > + > +#endif // VSIP_OPT_FFTW3_CREATE_PLAN_HPP > Index: src/vsip/opt/fftw3/fftw_support.hpp Looks good. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From assem at codesourcery.com Tue May 8 18:33:50 2007 From: assem at codesourcery.com (Assem Salama) Date: Tue, 08 May 2007 14:33:50 -0400 Subject: fftw Message-ID: <4640C28E.9060507@codesourcery.com> Everyone, This patch addresses Jule's comments. Some cleanup, removed create_dense_domain. Thanks, Assem -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: svn.diff.05082007.1.log URL: From assem at codesourcery.com Tue May 8 18:40:09 2007 From: assem at codesourcery.com (Assem Salama) Date: Tue, 08 May 2007 14:40:09 -0400 Subject: fftw3 Message-ID: <4640C409.2010806@codesourcery.com> Everyone, Sorry about last patch, forgot to change something on line 169. Here is new one. Thanks, Assem -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: svn.diff.05082007.1.log URL: From jules at codesourcery.com Tue May 8 22:00:16 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 08 May 2007 18:00:16 -0400 Subject: [vsipl++] [patch] Benchmarking documentation In-Reply-To: <463D06A3.6090403@codesourcery.com> References: <463D06A3.6090403@codesourcery.com> Message-ID: <4640F2F0.6010700@codesourcery.com> Don McCoy wrote: > Attached is a patch to add a new chapter to the Reference section of the > tutorial explaining how to run the benchmarks and interpret their > output. 
It should serve as a good starting point, though many details > still remain to be added. Don, please check this in. We can edit it "in-place". -- Jules > > Also, I'd appreciate suggestions as to how to better organize the table > in the first section. You could use the VSIPL++ specification section to provide structure to the table. I.e.

+----------------------------------------------+
| View Elementwise Functions                   |
+--------+-------------------------------------+
| vma    | fused multiply-add (Z = A*B+C)      |
| vmul   | multiply (Z = A*B)                  |
| ...    | ...                                 |
+--------+-------------------------------------+
| Reductions                                   |
+--------+-------------------------------------+
| maxval | maximum value (Z = maxval(A, idx))  |
| ...    | ...                                 |
+--------+-------------------------------------+

-- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Tue May 8 22:09:39 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 08 May 2007 18:09:39 -0400 Subject: [vsipl++] [patch] install benchmark sources In-Reply-To: <463E5ABD.50403@codesourcery.com> References: <463E5ABD.50403@codesourcery.com> Message-ID: <4640F523.9060004@codesourcery.com> Don McCoy wrote: > As attached. Subsequent patches will be needed to > > 1) install the HPEC Benchmark sources > 2) install benchmark executables > 3) clean up the standalone makefile > 4) update tutorial and quickstart documents > > Plus other revisions for the install directory layout as discussed. > I'll try to make all these changes in small sensible pieces to make it > easier to review. > > Regards, Don, This looks good in principle. Did you and Stefan work out the issue with svn mv not showing the new makefile.standalone.in? Can you post the new file makefile.standalone.in, or comment on the changes? Also, I noticed that you changed the form of some of the makefile variables from ':=' to '='. Was this intentional? 
I'm not the best person to comment on the differences between the two and whether that makes a difference here. Stefan, can you comment if it looks OK. thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Tue May 8 23:38:21 2007 From: don at codesourcery.com (Don McCoy) Date: Tue, 08 May 2007 17:38:21 -0600 Subject: [vsipl++] [patch] install benchmark sources In-Reply-To: <4640F523.9060004@codesourcery.com> References: <463E5ABD.50403@codesourcery.com> <4640F523.9060004@codesourcery.com> Message-ID: <464109ED.9080305@codesourcery.com> Jules Bergmann wrote: > This looks good in principal. Did you and Stefan work out the issue > with svn mv not showing the new makefile.standalone.in? Can you post > the new file makefile.standalone.in, or comment on the changes? > There were no changes, so I'm not sure what happened. I'll try it again on a clean checkout and see if the problem repeats. > Also, I noticed that you changed the form of some of the makefile > variables from ':= 'to '='. Was this intentional? I'm not the best > person to comment on the differences between the two and whether that > makes a difference here. Stefan, can you comment if it looks OK. That was not intentional. Thanks for catching that. The patch has changed quite a bit as I discussed it with Stefan and we determined a better way to do it. Expect a new patch Here Soon. A new question arose however: Some time ago, we chose to put the HPEC benchmarks in a subdirectory of benchmarks/, much like the other specialized benchmarks, yet we gave it its own makefile so that it would stand alone a bit better. This seems odd now. I propose we move it alongside benchmarks/, perhaps as benchmarks_hpec/ to make it a little more obvious than 'hpec_kernel'. We'll then mirror this arrangement upon installation. Is this ok with you? 
Regards, -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 From don at codesourcery.com Tue May 8 23:48:09 2007 From: don at codesourcery.com (Don McCoy) Date: Tue, 08 May 2007 17:48:09 -0600 Subject: [vsipl++] [patch] Benchmarking documentation In-Reply-To: <4640F2F0.6010700@codesourcery.com> References: <463D06A3.6090403@codesourcery.com> <4640F2F0.6010700@codesourcery.com> Message-ID: <46410C39.9070102@codesourcery.com> Jules Bergmann wrote: >> Also, I'd appreciate suggestions as to how to better organize the >> table in the first section. > > > You could use the VSIPL++ specification section to provide structure > to the table. > > I.e. :) I meant to ask about the actual markup for doing so in DocBook... Do we have an example of a more complicated table somewhere handy? If not, I'll find one. But overall, does following the specification sound like a good idea? We could do it any number of other ways, but I thought pointing back at the spec works fairly well. -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 From stefan at codesourcery.com Wed May 9 00:08:05 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Tue, 08 May 2007 20:08:05 -0400 Subject: [vsipl++] [patch] Benchmarking documentation In-Reply-To: <46410C39.9070102@codesourcery.com> References: <463D06A3.6090403@codesourcery.com> <4640F2F0.6010700@codesourcery.com> <46410C39.9070102@codesourcery.com> Message-ID: <464110E5.1040009@codesourcery.com> Don McCoy wrote: > I meant to ask about the actual markup for doing so in DocBook... Do > we have an example of a more complicated table somewhere handy? If not, > I'll find one. A good starting point for docbook vocabulary is the online book by Norman Walsh (the docbook author). 
In particular: http://docbook.org/tdg/en/html/table.html Regards, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 From jules at codesourcery.com Wed May 9 02:16:53 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 08 May 2007 22:16:53 -0400 Subject: [patch] MCOE/GCC Fixes Message-ID: <46412F15.1070105@codesourcery.com> This patch collects and cleans up some of the fixes necessary to use GCC with MCOE. Patch applied to the 1.3 branch and to trunk. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mc-updates.diff URL: From jules at codesourcery.com Wed May 9 10:44:12 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 09 May 2007 06:44:12 -0400 Subject: [vsipl++] [patch] MCOE/GCC Fixes In-Reply-To: <46412F15.1070105@codesourcery.com> References: <46412F15.1070105@codesourcery.com> Message-ID: <4641A5FC.4080500@codesourcery.com> Jules Bergmann wrote: > This patch collects and cleans up some of the fixes necessary to use GCC > with MCOE. Fix a small typo in configure (was failing if std::isfinite not found). Change mcoe-setup.sh to force exceptions (--enable-exceptions) instead of probing when exceptions="y". Patch applied to 1.3 branch and trunk. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mc2.diff URL: From jules at codesourcery.com Wed May 9 14:07:50 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 09 May 2007 10:07:50 -0400 Subject: [patch] Fix faux-complex SIMD trait for GHS Message-ID: <4641D5B6.90205@codesourcery.com> Works around a GHS internal error. Patch applied to 1.3 branch and trunk. 
-- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ghs-simd.diff URL: From jules at codesourcery.com Wed May 9 14:27:18 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 09 May 2007 10:27:18 -0400 Subject: [vsipl++] fftw3 In-Reply-To: <4640C409.2010806@codesourcery.com> References: <4640C409.2010806@codesourcery.com> Message-ID: <4641DA46.1050905@codesourcery.com> Assem Salama wrote: > Everyone, > Sorry about last patch, forgot to change something on line 169. Here is > new one. Assem, This looks good. There is one comment you missed for Create_plan: > > + static rt_complex_type const type = cmplx_inter_fmt; > > [4] Please use a name other than 'type' for this member variable. > Perhaps 'format'? > > In general, 'type' should be reserved for member type names create by > typedefs. Once you address that, please check it in. thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Wed May 9 14:37:28 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 09 May 2007 10:37:28 -0400 Subject: [vsipl++] [patch] install benchmark sources In-Reply-To: <464109ED.9080305@codesourcery.com> References: <463E5ABD.50403@codesourcery.com> <4640F523.9060004@codesourcery.com> <464109ED.9080305@codesourcery.com> Message-ID: <4641DCA8.1060809@codesourcery.com> > A new question arose however: Some time ago, we chose to put the HPEC > benchmarks in a subdirectory of benchmarks/, much like the other > specialized benchmarks, yet we gave it its own makefile so that it would > stand alone a bit better. This seems odd now. I propose we move it > alongside benchmarks/, perhaps as benchmarks_hpec/ to make it a little > more obvious than 'hpec_kernel'. We'll then mirror this arrangement > upon installation. Is this ok with you? 
How would you handle shared files, like loop.hpp and main.cpp? -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Wed May 9 14:39:58 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 09 May 2007 10:39:58 -0400 Subject: [vsipl++] [patch] Benchmarking documentation In-Reply-To: <46410C39.9070102@codesourcery.com> References: <463D06A3.6090403@codesourcery.com> <4640F2F0.6010700@codesourcery.com> <46410C39.9070102@codesourcery.com> Message-ID: <4641DD3E.4090603@codesourcery.com> > I meant to ask about the actual markup for doing so in DocBook... Do > we have an example of a more complicated table somewhere handy? If not, > I'll find one. Unfortunately, no. > > But overall, does following the specification sound like a good idea? > We could do it any number of other ways, but I thought pointing back at > the spec works fairly well. Yes I think so. The specification has a good logical grouping of functionality. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From assem at codesourcery.com Wed May 9 16:04:59 2007 From: assem at codesourcery.com (Assem Salama) Date: Wed, 09 May 2007 12:04:59 -0400 Subject: Generator_expr_block Message-ID: <4641F12B.5050707@codesourcery.com> Everyone, Added Choose_subblock for Global_map and Replicated_map. Thanks, Assem -------------- next part -------------- A non-text attachment was scrubbed... 
Name: svn.diff.05092007.1.log Type: text/x-log Size: 2382 bytes Desc: not available URL: From don at codesourcery.com Wed May 9 16:18:14 2007 From: don at codesourcery.com (Don McCoy) Date: Wed, 09 May 2007 10:18:14 -0600 Subject: [vsipl++] [patch] install benchmark sources In-Reply-To: <4641DCA8.1060809@codesourcery.com> References: <463E5ABD.50403@codesourcery.com> <4640F523.9060004@codesourcery.com> <464109ED.9080305@codesourcery.com> <4641DCA8.1060809@codesourcery.com> Message-ID: <4641F446.6050405@codesourcery.com> Jules Bergmann wrote: >> ... I propose we move it alongside benchmarks/, perhaps as >> benchmarks_hpec/ to make it a little more obvious than >> 'hpec_kernel'. We'll then mirror this arrangement upon >> installation. Is this ok with you? > > How would you handle shared files, like loop.hpp and main.cpp? > > Well, we could duplicate them upon install, but then again, it's probably not going to improve things. Scratch that idea. Attached is a new patch that handles the installation of the source files a bit better. It also includes a few fixes for things discovered when testing with Intel's IPP/MKL and Mercury's SAL libraries. One other thing to note regarding the renaming of the standalone makefile: svn diff does not show clearly that the file was simply renamed without any changes. So far, it was simply renamed. Regards, -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchmake2.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchmake2.diff URL: From assem at codesourcery.com Wed May 9 16:50:48 2007 From: assem at codesourcery.com (Assem Salama) Date: Wed, 09 May 2007 12:50:48 -0400 Subject: SIMD threshold Message-ID: <4641FBE8.5080508@codesourcery.com> Everyone, This patch implements a SIMD threshold using the ite operator. 
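For reference, a scalar sketch of what such a threshold computes, assuming ite is an elementwise if-then-else applied in the form ite(A>B,A,k). The function name `threshold_ite` is hypothetical, the sketch assumes b holds at least as many elements as a, and it is a plain loop with no SIMD, so it is not the code in the attached patch:

```cpp
#include <cstddef>
#include <vector>

// Scalar reference for the elementwise threshold ite(A > B, A, k):
// keep a[i] where it exceeds b[i], otherwise substitute the constant k.
// Assumes b.size() >= a.size().
std::vector<float>
threshold_ite(std::vector<float> const& a,
              std::vector<float> const& b,
              float k)
{
  std::vector<float> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = (a[i] > b[i]) ? a[i] : k;
  return out;
}
```

A SIMD version would typically compute the comparison mask for several elements at once and select between the A lanes and a broadcast k; a scalar loop like this one is what such a kernel would usually be checked against in a test.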
Thanks, Assem -------------- next part -------------- A non-text attachment was scrubbed... Name: svn.diff.05092007.2.log Type: text/x-log Size: 9916 bytes Desc: not available URL: From don at codesourcery.com Wed May 9 17:15:35 2007 From: don at codesourcery.com (Don McCoy) Date: Wed, 09 May 2007 11:15:35 -0600 Subject: [vsipl++] [patch] Benchmarking documentation In-Reply-To: <4640F2F0.6010700@codesourcery.com> References: <463D06A3.6090403@codesourcery.com> <4640F2F0.6010700@codesourcery.com> Message-ID: <464201B7.6020807@codesourcery.com> Jules Bergmann wrote: > Don McCoy wrote: >> Attached is a patch to add a new chapter to the Reference section of >> the tutorial explaining how to run the benchmarks and interpret their >> output. It should serve as a good starting point, though many >> details still remain to be added. > > Don, please check this in. We can edit it "in-place". -- Jules This is now checked in. -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 From don at codesourcery.com Wed May 9 22:54:35 2007 From: don at codesourcery.com (Don McCoy) Date: Wed, 09 May 2007 16:54:35 -0600 Subject: [patch] missing install directory Message-ID: <4642512B.2090202@codesourcery.com> As attached. Also contains small changes to the stand-alone benchmark makefile. Ok to commit? Regards, -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mi.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: mi.diff URL: From stefan at codesourcery.com Wed May 9 23:00:45 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Wed, 09 May 2007 19:00:45 -0400 Subject: [vsipl++] [patch] missing install directory In-Reply-To: <4642512B.2090202@codesourcery.com> References: <4642512B.2090202@codesourcery.com> Message-ID: <4642529D.8070605@codesourcery.com> Don McCoy wrote: > 2007-05-09 Don McCoy > > * GNUmakefile.in: Install missing directory. > * src/vsip/GNUmakefile.inc.in: Likewise. This looks as if 'make install' wasn't complete, prior to this patch. Is that correct ? if so, should this patch be backported to 1.3 ? Thanks, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 From don at codesourcery.com Thu May 10 03:15:15 2007 From: don at codesourcery.com (Don McCoy) Date: Wed, 09 May 2007 21:15:15 -0600 Subject: [vsipl++] [patch] missing install directory In-Reply-To: <4642529D.8070605@codesourcery.com> References: <4642512B.2090202@codesourcery.com> <4642529D.8070605@codesourcery.com> Message-ID: <46428E43.9070106@codesourcery.com> Stefan Seefeld wrote: > Don McCoy wrote: > > >> 2007-05-09 Don McCoy >> >> * GNUmakefile.in: Install missing directory. >> * src/vsip/GNUmakefile.inc.in: Likewise. >> > > This looks as if 'make install' wasn't complete, prior to this > patch. Is that correct ? if so, should this patch be backported > to 1.3 ? > Yes, just for those two files. -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 From jules at codesourcery.com Thu May 10 11:35:21 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 10 May 2007 07:35:21 -0400 Subject: [vsipl++] [patch] missing install directory In-Reply-To: <46428E43.9070106@codesourcery.com> References: <4642512B.2090202@codesourcery.com> <4642529D.8070605@codesourcery.com> <46428E43.9070106@codesourcery.com> Message-ID: <46430379.1040600@codesourcery.com> >> This looks as if 'make install' wasn't complete, prior to this >> patch. 
Is that correct ? if so, should this patch be backported >> to 1.3 ? >> > Yes, just for those two files. > Good thinking, however 1.3 did not have src/vsip/opt/reductions, so a backport isn't necessary. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Thu May 10 12:18:31 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 10 May 2007 08:18:31 -0400 Subject: [vsipl++] [patch] missing install directory In-Reply-To: <4642512B.2090202@codesourcery.com> References: <4642512B.2090202@codesourcery.com> Message-ID: <46430D97.6030607@codesourcery.com> Don McCoy wrote: > As attached. Also contains small changes to the stand-alone benchmark > makefile. > > Ok to commit? Don, this looks good, please commit. thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Thu May 10 15:34:17 2007 From: don at codesourcery.com (Don McCoy) Date: Thu, 10 May 2007 09:34:17 -0600 Subject: [vsipl++] [patch] HPEC Challenge Benchmark, Firbank enhancement In-Reply-To: <4637B2E8.5050906@codesourcery.com> References: <46365F2D.7000104@codesourcery.com> <463661B6.5020607@codesourcery.com> <4637B2E8.5050906@codesourcery.com> Message-ID: <46433B79.3020908@codesourcery.com> Jules Bergmann wrote: > Stefan Seefeld wrote: >> Could you please consistently either put the access specifier >> ('public') everywhere >> or nowhere ? (I'd prefer nowhere, as for structs it is implied.) > > Sounds good to me. > > Otherwise this looks good. Please check it in. > Revised as suggested. I also corrected the parallel case after doing proper testing and discovering some problems. It is not necessary to pass local views to this->firbank(), as this is done inside that function for the cases tagged 'Full' and 'Fast'. This is not an error, per se, it only adds a slight amount of unnecessary overhead. 
For the new 'Expr' case, taking the local view is not needed because it is taken after the evaluator dispatches the expression. Specifically, in the Cell case the local view is taken before splitting it up amongst the SPEs and in the fall-back case, it is taken within the FFT workspace objects. Secondly, I made sure that when taking the local view that the macro version LOCAL() was used in place of the member .local(), so that when PARALLEL_FIRBANK is not defined, it still works correctly. This should allow it to work when compiled against the reference implementation (or other). This was tested as well. Regards, -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: hbb2.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: hbb2.diff URL: From assem at codesourcery.com Thu May 10 20:20:53 2007 From: assem at codesourcery.com (Assem Salama) Date: Thu, 10 May 2007 16:20:53 -0400 Subject: ite SIMD with loop fusion Message-ID: <46437EA5.2000308@codesourcery.com> Everyone, This patch implements ite(A>B,A,k). There is a serial dispatch and also loop fusion support. Thanks, Assem From jules at codesourcery.com Thu May 10 21:42:50 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 10 May 2007 17:42:50 -0400 Subject: [vsipl++] [patch] HPEC Challenge Benchmark, Firbank enhancement In-Reply-To: <46433B79.3020908@codesourcery.com> References: <46365F2D.7000104@codesourcery.com> <463661B6.5020607@codesourcery.com> <4637B2E8.5050906@codesourcery.com> <46433B79.3020908@codesourcery.com> Message-ID: <464391DA.4070709@codesourcery.com> Don McCoy wrote: > Jules Bergmann wrote: >> Stefan Seefeld wrote: >>> Could you please consistently either put the access specifier >>> ('public') everywhere >>> or nowhere ? (I'd prefer nowhere, as for structs it is implied.) >> >> Sounds good to me. 
>> >> Otherwise this looks good. Please check it in. >> > Revised as suggested. Don, If you haven't already, please check this in. thanks, -- Jules > > I also corrected the parallel case after doing proper testing and > discovering some problems. It is not necessary to pass local views to > this->firbank(), as this is done inside that function for the cases > tagged 'Full' and 'Fast'. This is not an error, per se, it only adds a > slight amount of unnecessary overhead. For the new 'Expr' case, taking > the local view is not needed because it is taken after the evaluator > dispatches the expression. Specifically, in the Cell case the local > view is taken before splitting it up amongst the SPEs and in the > fall-back case, it is taken within the FFT workspace objects. Sounds good. > > Secondly, I made sure that when taking the local view that the macro > version LOCAL() was used in place of the member .local(), so that when > PARALLEL_FIRBANK is not defined, it still works correctly. This should > allow it to work when compiled against the reference implementation (or > other). This was tested as well. We can safely phase out the LOCAL() macro now, in favor of .local(). The new reference implementation supports parallel VSIPL++. The old reference implementation has been decommissioned. Don't worry about expunging LOCAL() right away (i.e. check this in), but let's deprecate it going forward. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Thu May 10 22:16:32 2007 From: don at codesourcery.com (Don McCoy) Date: Thu, 10 May 2007 16:16:32 -0600 Subject: [patch] install benchmark executables Message-ID: <464399C0.9060200@codesourcery.com> This patch builds upon the 5/9 patch 'install benchmark sources'. The diff only shows the cumulative makefile changes though. Is this along with the sources patch ok to check in?
Regards, -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchinst.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: benchinst.diff URL: From assem at codesourcery.com Thu May 10 23:00:21 2007 From: assem at codesourcery.com (Assem Salama) Date: Thu, 10 May 2007 19:00:21 -0400 Subject: ite SIMD with loop fusion Message-ID: <4643A405.9060900@codesourcery.com> Everyone, forgot patch :) Thanks, Assem -------------- next part -------------- A non-text attachment was scrubbed... Name: svn.diff.05102007.1.log Type: text/x-log Size: 17123 bytes Desc: not available URL: From jules at codesourcery.com Fri May 11 13:08:45 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Fri, 11 May 2007 09:08:45 -0400 Subject: [vsipl++] [patch] install benchmark sources In-Reply-To: <4641F446.6050405@codesourcery.com> References: <463E5ABD.50403@codesourcery.com> <4640F523.9060004@codesourcery.com> <464109ED.9080305@codesourcery.com> <4641DCA8.1060809@codesourcery.com> <4641F446.6050405@codesourcery.com> Message-ID: <46446ADD.6040600@codesourcery.com> > > Attached is a new patch that handles the installation of the source > files a bit better. It also includes a few fixes for things discovered > when testing with Intel's IPP/MKL and Mercury's SAL libraries. Don, This looks good, please check it in. 
thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Fri May 11 19:52:48 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Fri, 11 May 2007 15:52:48 -0400 Subject: [vsipl++] SIMD threshold In-Reply-To: <4641FBE8.5080508@codesourcery.com> References: <4641FBE8.5080508@codesourcery.com> Message-ID: <4644C990.4090300@codesourcery.com> Assem Salama wrote: > Everyone, > This patch implements a SIMD threshold using the ite operator. Assem, This looks good. I have a couple of comments below. Please address those, then check it in. Next, can you benchmark this on belgarath, our PowerPC system? In particular, can you compare performance with and without the evaluator? thanks, -- Jules > > ------------------------------------------------------------------------ > > Index: src/vsip/opt/simd/threshold.hpp > =================================================================== > +// Simd function to do threshold only when K is 0 > +template > +void simd_thresh0(T* Z, T* A, T* B, int n) [1] For defensive programming, it is a good idea to make A and B 'const' so they aren't accidentally modified. Likewise below. Hmmm, I notice that vmul also has them non-const (who wrote that? :). Also, for coding standard consistency, the function name should be at the start of the line, i.e. void simd_thresh0(... Likewise below.
> +{ > + typedef Simd_traits simd; > + typedef Simd_traits simdi; > + typedef typename simd::simd_type simd_type; > + typedef typename simdi::simd_type simd_itype; [2] Since you're requiring the caller to guarantee that Z, A, and B are all SIMD aligned (which is OK by the way), you should: a) document it b) check it with an assert > + > + simd::enter(); > + > + while (n >= simd::vec_size) > + { > + n -= simd::vec_size; > + > + simd_type A_v = simd::load(A); > + simd_type B_v = simd::load(B); > + simd_itype mask = simd_itype(simd::gt(A_v,B_v)); > + simd_itype nmask = simdi::bnot(mask); > + simd_itype res = simdi::band(simd_itype(A_v),nmask); > + simd::store(Z,simd_type(res)); > + > + A += simd::vec_size; > + B += simd::vec_size; > + Z += simd::vec_size; > + } > + > + simd::exit(); > +} > +template > +struct Simd_threshold > +{ > + static void exec(T* Z, T* A, T* B, T k, int n) > + { > + typedef Simd_traits simd; > + typedef Simd_traits simdi; > + typedef typename simd::simd_type simd_type; > + typedef typename simdi::simd_type simd_itype; > + > + // handle mis-aligned vectors > + if (simd::alignment_of(A) != simd::alignment_of(B) || > + simd::alignment_of(Z) != simd::alignment_of(A)) > + { > + Simd_threshold::exec(Z,A,B,k,n); > + return; > + } > + > + // clean up initial unaligned values > + while (simd::alignment_of(A) != 0) > + { > + if(*A > *B) *Z = *A; > + else *Z = k; > + A++;B++;Z++; > + n--; > + } > + > + if (n == 0) return; > + > + > + if(k != T(0)) { > + simd_thresh0(Z,A,B,n); > + } else { > + simd_thresh(Z,A,B,k,n); > + } > + > + // handle last bits > + while(n) [3] And 'n' is what at this point? :) This cleanup code is going to recompute all the work done by simd_thresh0 or simd_thresh. You either need to push this cleanup code into simd_thresh (which knows the value of 'n' after it has done as much SIMD work as possible), or have simd_thresh return the new value of n. My preference would be to push this down into simd_thresh.
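(A minimal sketch of the "push the cleanup down" option from [3], not the patch itself. It assumes a hypothetical fixed width kVecSize in place of Simd_traits::vec_size and uses plain scalar code where the real routine would use SIMD loads and stores; the name thresh_with_cleanup is made up for illustration. The point is only that the routine doing the vector-width loop also owns the tail, so the tail sees the reduced value of n.)

```cpp
#include <cstddef>

// Hypothetical stand-in for the SIMD width; the library would take this
// from its SIMD traits class instead.
constexpr std::size_t kVecSize = 4;

// Computes Z[i] = (A[i] > B[i]) ? A[i] : k for i in [0, n).
// The vector-width loop and the scalar tail live in the same function,
// so the tail only touches elements the main loop did not process.
template <typename T>
void thresh_with_cleanup(T* Z, T const* A, T const* B, T k, std::size_t n)
{
  // Main loop: one "vector" of kVecSize elements per iteration.
  // (Scalar inner loop here; a real version would use SIMD operations.)
  while (n >= kVecSize)
  {
    for (std::size_t i = 0; i < kVecSize; ++i)
      Z[i] = (A[i] > B[i]) ? A[i] : k;
    A += kVecSize;
    B += kVecSize;
    Z += kVecSize;
    n -= kVecSize;
  }

  // Tail: n is now the number of leftover elements (0 .. kVecSize-1).
  while (n > 0)
  {
    *Z = (*A > *B) ? *A : k;
    ++A; ++B; ++Z;
    --n;
  }
}
```

(With this shape the caller passes n by value and needs no cleanup loop of its own.)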
The sharing here is nice, but pushing it down makes simd_thresh more self-contained. I wouldn't pass n by reference. The compiler might not generate optimal code if it is worried about n's value changing out from underneath. > + { > + if(*A > *B) *Z = *A; > + else *Z = k; > + A++;B++;Z++; > + n--; > + } > + > + } > +}; -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Sun May 13 16:28:03 2007 From: don at codesourcery.com (Don McCoy) Date: Sun, 13 May 2007 10:28:03 -0600 Subject: [vsipl++] [patch] missing install directory In-Reply-To: <46430D97.6030607@codesourcery.com> References: <4642512B.2090202@codesourcery.com> <46430D97.6030607@codesourcery.com> Message-ID: <46473C93.5040406@codesourcery.com> Jules Bergmann wrote: > Don, this looks good, please commit. thanks, -- Jules > This is now committed. -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 From don at codesourcery.com Sun May 13 16:35:43 2007 From: don at codesourcery.com (Don McCoy) Date: Sun, 13 May 2007 10:35:43 -0600 Subject: [vsipl++] [patch] install benchmark sources In-Reply-To: <46446ADD.6040600@codesourcery.com> References: <463E5ABD.50403@codesourcery.com> <4640F523.9060004@codesourcery.com> <464109ED.9080305@codesourcery.com> <4641DCA8.1060809@codesourcery.com> <4641F446.6050405@codesourcery.com> <46446ADD.6040600@codesourcery.com> Message-ID: <46473E5F.4040001@codesourcery.com> Jules Bergmann wrote: > > Don, This looks good, please check it in. thanks, -- Jules > This is checked in now. It became a bit blended with the make install patch, for reasons I do not know. This resulted in four actual checkins: the two patches, and two corrections. All should be well now. 
-- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 From assem at codesourcery.com Mon May 14 17:03:46 2007 From: assem at codesourcery.com (Assem Salama) Date: Mon, 14 May 2007 17:03:46 +0000 Subject: SIMD threshold with loop fusion Message-ID: <46489672.7050100@codesourcery.com> Everyone, This patch is just a preliminary look at the loop fusion patch with lt and gt support. I still have some comments in there which I will take out soon. Thanks, Assem From assem at codesourcery.com Mon May 14 21:23:11 2007 From: assem at codesourcery.com (Assem Salama) Date: Mon, 14 May 2007 17:23:11 -0400 Subject: timezone Message-ID: <4648D33F.2020108@codesourcery.com> sorry about my e-mails with wrong time. I think I fixed it. Thanks, Assem From jules at codesourcery.com Tue May 15 01:14:20 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Mon, 14 May 2007 21:14:20 -0400 Subject: [patch] Minor fixes. Message-ID: <4649096C.2080108@codesourcery.com> Patch applied. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fix.diff URL: From jules at codesourcery.com Tue May 15 01:34:36 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Mon, 14 May 2007 21:34:36 -0400 Subject: [patch] Fft diag, fft_ext test fixes, benchmark changes Message-ID: <46490E2C.2090102@codesourcery.com> This patch: - Adds new FFT diagnosis routine (diag/fft.hpp), similar in spirit to the expr diagnosis routines. The diagnosis is used in the FFT benchmarks out-of-place diag() function. It produces output identifying the FFT BE used, like so: dim: 1 rm : ref be : fft-backend-1D-complex inter_fastpath_ok : yes split_fastpath_ok : no Ext_data_cost : 0 Ext_data_cost : 0 - Minor fixes for fft_ext (use test_assert instead of assert, abort if input files not found, return EXIT_FAILURE on failure). 
This test fails on MCOE, but qmtest was being confused by the return value and wasn't printing out enough debug info. - Minor benchmark changes collecting dust in my working copy :) - single-line cell/fastconv case that uses huge pages, - split-complex FFTW3 benchmark case, - fix fftm benchmark to compute riob/wiob, - scaled variants of vmmul that were interesting at some point. Ok to apply? Stefan do the FFT diag bits look OK? Don, do the benchmark changes look OK? thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: misc.diff URL: From don at codesourcery.com Tue May 15 16:08:29 2007 From: don at codesourcery.com (Don McCoy) Date: Tue, 15 May 2007 10:08:29 -0600 Subject: [patch] Message-ID: <4649DAFD.80208@codesourcery.com> These changes are necessary to integrate with the new ALF version 1.1 that came with the Cell SDK 2.1. The new ALF sources are already checked in. Ok to commit? -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: alf_update.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: alf_update.diff URL: From jules at codesourcery.com Tue May 15 16:18:34 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 15 May 2007 12:18:34 -0400 Subject: [vsipl++] [patch] In-Reply-To: <4649DAFD.80208@codesourcery.com> References: <4649DAFD.80208@codesourcery.com> Message-ID: <4649DD5A.2000108@codesourcery.com> Don McCoy wrote: > These changes are necessary to integrate with the new ALF version 1.1 > that came with the Cell SDK 2.1. The new ALF sources are already > checked in. > > Ok to commit? Don, Looks good, please commit. 
thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From assem at codesourcery.com Tue May 15 16:41:29 2007 From: assem at codesourcery.com (Assem Salama) Date: Tue, 15 May 2007 12:41:29 -0400 Subject: SIMD threshold with loop fusion Message-ID: <4649E2B9.9060300@codesourcery.com> Everyone, I forgot to attach patch. Thanks, Assem -------------- next part -------------- A non-text attachment was scrubbed... Name: svn.diff.05142007.1.log Type: text/x-log Size: 19138 bytes Desc: not available URL: From jules at codesourcery.com Tue May 15 20:28:56 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 15 May 2007 16:28:56 -0400 Subject: [vsipl++] SIMD threshold with loop fusion In-Reply-To: <4649E2B9.9060300@codesourcery.com> References: <4649E2B9.9060300@codesourcery.com> Message-ID: <464A1808.20306@codesourcery.com> Assem Salama wrote: > Everyone, > I forgot to attach patch. Assem, This looks good. Pushing the Binary_operator_map::apply down into simd_thresh really increases the coverage of the routine. Going forward, I would like to be able to support the following. In some contexts, A > B should evaluate to a bool SIMD value. I.e. Vector Z; Vector A, B, C, D; Z = A > B; Z = A > B && C > D; Right now A > B evaluates to an int SIMD value that holds a bitmask. This makes sense when used as the predicate for an ite() operator. Any ideas on how to support both? - We could force all A > B exprs to be treated as bool, and then have ite expand the bool back out to a bitmask, but this would be inefficient. That seems like a bad idea. - We could have the return type of A > B be determined by how it is used. I.e. for 'X = ite(A>B, ...)' the return type of 'A > B' would be int bitmask, but for 'Z = A > B' it would be bool. - We could do all logic as int bitmasks, then force the 'Z =' to convert an int mask into a bool at assignment.
That might sacrifice a bit of efficiency in some cases (like Z = A > B && C > D), but might be a decent solution. However, I don't think the current work prevents that, so let's check it in once you've addressed the feedback below. Also, check with Stefan for feedback too. -- Jules > ------------------------------------------------------------------------ > > Index: src/vsip/opt/simd/simd.hpp > =================================================================== > --- src/vsip/opt/simd/simd.hpp (revision 165174) > +++ src/vsip/opt/simd/simd.hpp (working copy) > @@ -167,6 +167,9 @@ > static simd_type gt(simd_type const& v1, simd_type const& v2) > { return (v1 > v2) ? simd_type(1) : simd_type(0); } > > + static simd_type lt(simd_type const& v1, simd_type const& v2) > + { return (v1 < v2) ? simd_type(1) : simd_type(0); } > + [1] This looks good. However, do you think faux-SIMD should have the same "API" as the real SIMD functions below? For example, AltiVec vgt returns 0xFFFFFFFF or 0x00000000 for each position. That can be used as a mask. (What does SSE do?) Since faux SIMD returns 1 or 0, it can't be used as mask. A generic routine that uses vgt may not work with faux-simd if it expects vgt/vlt to return a value valid for a mask. > static simd_type pack(simd_type const&, simd_type const&) > { assert(0); } > > @@ -998,6 +1019,7 @@ > struct Alg_vbor; > struct Alg_vbxor; > struct Alg_vbnot; > +struct Alg_threshold; [2] Isn't 'Alg_threshold' already checked in? I'm confused. > > template bool IsSplit, > Index: src/vsip/opt/simd/threshold.hpp > =================================================================== > --- src/vsip/opt/simd/threshold.hpp (revision 171195) > +++ src/vsip/opt/simd/threshold.hpp (working copy) > @@ -15,6 +15,7 @@ > #define VSIP_OPT_SIMD_THRESHOLD_HPP > > #include > +#include [3] I'm a little wary about including expr_iterator since it might pull in a lot of unnecessary dependencies.
However, we can fix that later by pushing Binary_operator_map into a separate header file. > #include > > /*********************************************************************** > @@ -47,19 +48,22 @@ > // Class for threshold > > template + template class O, [4] Please use a slightly more descriptive parameter name, such as "Op", or document. > bool Is_vectorized> > struct Simd_threshold; > > > Index: src/vsip/opt/simd/expr_iterator.hpp > =================================================================== > +// Proxy for ternary access traits for ite functor > +template > +class Proxy > [5] This is OK. However, since the behavior is governed by Ternary_operator_map<..., ite_functor>, this could be generalized to take ite_functor as an arbitrary template parameter. That way, in the future, when you add other Ternary_access_traits specializations, this specialization could apply too. > +{ > + typedef typename A::access_traits access_traits; > + typedef typename access_traits::value_type value_type; > + typedef typename Simd_traits::simd_type simd_type; > + > +public: > + Proxy(A const &a, B const &b, C const &c) > + : a_(a), b_(b), c_(c) {} > + > + simd_type load() const > + { > + typedef typename A::access_traits::return_type return_type; > + typedef typename A::access_traits::value_type value_type; > + typedef typename Simd_traits::simd_type simd_ret_type; > + typedef typename Simd_traits::simd_type simd_val_type; > + > + simd_ret_type a_ret = a_.load(); // this is the mask > + simd_val_type b = b_.load(); // if true > + simd_val_type c = c_.load(); // if false > + // apply the mask > + return Ternary_operator_map::apply(a_ret,b,c); > + } > + > + void increment(length_type n = 1) > + { > + a_.increment(n); > + b_.increment(n); > + c_.increment(n); > + } > + > +private: > + A a_; > + B b_; > + C c_; > +}; > + > template > struct Iterator > { -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Tue May 15 20:56:22 2007
From: don at codesourcery.com (Don McCoy) Date: Tue, 15 May 2007 14:56:22 -0600 Subject: [vsipl++] [patch] install benchmark executables In-Reply-To: <464399C0.9060200@codesourcery.com> References: <464399C0.9060200@codesourcery.com> Message-ID: <464A1E76.9010907@codesourcery.com> This is now checked in. Don McCoy wrote: > This patch builds upon the 5/9 patch 'install benchmark sources'. The > diff only shows the cumulative makefile changes though. > > Is this along with the sources patch ok to check in? > > Regards, > > ------------------------------------------------------------------------ > > 2007-05-10 Don McCoy > > * benchmarks/GNUmakefile.inc.in: Now installs benchmarks (binaries). > * benchmarks/hpec_kernel/GNUmakefile.inc.in: Likewise. > * benchmarks/hpec_kernel/make.standalone: Removed (again -- was > removed previously on 2006-07-07). > From assem at codesourcery.com Tue May 15 21:23:13 2007 From: assem at codesourcery.com (Assem Salama) Date: Tue, 15 May 2007 17:23:13 -0400 Subject: [vsipl++] SIMD threshold with loop fusion In-Reply-To: <464A1808.20306@codesourcery.com> References: <4649E2B9.9060300@codesourcery.com> <464A1808.20306@codesourcery.com> Message-ID: <464A24C1.5080005@codesourcery.com> Jules Bergmann wrote: > > [1] This looks good. However, do you think faux-SIMD should have the > same "API" as the real SIMD functions below? > > For example, AltiVec vgt returns 0xFFFFFFFF or 0x00000000 for each > position. That can be used as a mask. (What does SSE do?) SSE is the same thing because there is a website that has a cross-reference for altivec and sse instructions. > > Since faux SIMD returns 1 or 0, it can't be used as mask. A generic > routine that uses vgt may not work with faux-simd if it expects > vgt/vlt to return a value valid for a mask. Why not? I use normal bit operations on the return values. If I and '1' with another value, I get the value, right? > > [2] Isn't 'Alg_threshold' already checked in? I'm confused. 
I did check in simd.hpp. I will look and see why this is still a change... From jules at codesourcery.com Wed May 16 03:43:25 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 15 May 2007 23:43:25 -0400 Subject: [vsipl++] SIMD threshold with loop fusion In-Reply-To: <464A24C1.5080005@codesourcery.com> References: <4649E2B9.9060300@codesourcery.com> <464A1808.20306@codesourcery.com> <464A24C1.5080005@codesourcery.com> Message-ID: <464A7DDD.9090905@codesourcery.com> >> [1] This looks good. However, do you think faux-SIMD should have the >> same "API" as the real SIMD functions below? >> >> For example, AltiVec vgt returns 0xFFFFFFFF or 0x00000000 for each >> position. That can be used as a mask. (What does SSE do?) > SSE is the same thing because there is a website that has a > cross-reference for altivec and sse instructions. >> >> Since faux SIMD returns 1 or 0, it can't be used as mask. A generic >> routine that uses vgt may not work with faux-simd if it expects >> vgt/vlt to return a value valid for a mask. > Why not? I use normal bit operations on the return values. If I and '1' > with another value, I get the value, right? It depends on whether the 'and' is binary or logical. I.e. if you do something like mask = simd::vgt(a, b); result = simd::band(mask, a); For AltiVec and SSE, this does the right thing because mask[i] is 0xffffffff when a[i] > b[i]. For faux-simd, mask[0] is 0x00000001 when a[0] > b[0]. That will pull just the lowest order bit out of a[0], not the entire value. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Wed May 16 15:05:39 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 16 May 2007 11:05:39 -0400 Subject: Characterization scripts Message-ID: <464B1DC3.2070302@codesourcery.com> Here's a HOWTO on using the char.pl and graph.pl scripts, along with some generated graphs. 
-- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: howto-graph URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fft-svn-inter.png Type: image/png Size: 6470 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fft-svn-split.png Type: image/png Size: 6051 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fft-op-planning-svn-inter.png Type: image/png Size: 5955 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fft-op-vendor-svn-inter.png Type: image/png Size: 7002 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: evo-fft-op-1.3-vs-svn.png Type: image/png Size: 5652 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: evo-fft-op-split-vs-inter.png Type: image/png Size: 5894 bytes Desc: not available URL: From jules at codesourcery.com Wed May 16 15:23:38 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 16 May 2007 11:23:38 -0400 Subject: [patch] Re: [vsipl++] Characterization scripts In-Reply-To: <464B1DC3.2070302@codesourcery.com> References: <464B1DC3.2070302@codesourcery.com> Message-ID: <464B21FA.8030109@codesourcery.com> This patch provides graph.pl and graph.db as described in the howto. Jules Bergmann wrote: > Here's a HOWTO on using the char.pl and graph.pl scripts, along with > some generated graphs. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: char.diff URL: From assem at codesourcery.com Wed May 16 15:49:33 2007 From: assem at codesourcery.com (Assem Salama) Date: Wed, 16 May 2007 11:49:33 -0400 Subject: SIMD threshold benchmark Message-ID: <464B280D.9040602@codesourcery.com> Everyone, This benchmark benchmarks the benchmark :) Actually, it benchmarks ite expressions using different tags. Thanks, Assem -------------- next part -------------- A non-text attachment was scrubbed... Name: svn.diff.05162007.1.log Type: text/x-log Size: 5453 bytes Desc: not available URL: From don at codesourcery.com Wed May 16 15:56:08 2007 From: don at codesourcery.com (Don McCoy) Date: Wed, 16 May 2007 09:56:08 -0600 Subject: [vsipl++] [patch] Fft diag, fft_ext test fixes, benchmark changes In-Reply-To: <46490E2C.2090102@codesourcery.com> References: <46490E2C.2090102@codesourcery.com> Message-ID: <464B2998.9070002@codesourcery.com> Jules Bergmann wrote: > > - Minor benchmark changes collecting dust in my working copy :) > - single-line cell/fastconv case that uses huge pages, > - split-complex FFTW3 benchmark case, > - fix fftm benchmark to compute riob/wiob, > - scaled variants of vmmul that were interesting at some point. > > Ok to apply? Stefan do the FFT diag bits look OK? Don, do the > benchmark changes look OK? They look fine to me. -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 From jules at codesourcery.com Thu May 17 13:03:47 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 17 May 2007 09:03:47 -0400 Subject: [vsipl++] SIMD threshold with loop fusion In-Reply-To: <4649E2B9.9060300@codesourcery.com> References: <4649E2B9.9060300@codesourcery.com> Message-ID: <464C52B3.7000308@codesourcery.com> Assem, I'm sorry, but I missed this earlier. 
> Index: src/vsip/opt/simd/eval_generic.hpp > =================================================================== > @@ -658,8 +663,10 @@ > return(ext_dst.stride(0) == 1 && > ext_a.stride(0) == 1 && > ext_b.stride(0) == 1 && > - // make sure (A > B, A, k) > - (&(src.first().left()) == &(src.second()))); > + // make sure (A op B, A, k) > + (&(src.first().left()) == &(src.second())) && > + // make sure op is supported > + simd::Binary_operator_map::is_supported); If possible, Binary_operator_map<...>::is_supported should be part of the compile-time check (part of ct_valid), rather than run-time. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Thu May 17 13:23:01 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 17 May 2007 09:23:01 -0400 Subject: [patch] SIMD load_unaligned, scalar load optimization Message-ID: <464C5735.4090000@codesourcery.com> This patch contains two unrelated SIMD items: - Implements a SIMD load_unaligned function for unaligned loads, and adds unit test. - Optimizes SIMD loop fusion handling of scalar values to load the value into a SIMD register once, rather than each time it is accessed. On a PPC 970FX, this improves floating-point scalar * vector performance at 2048 points from 241 MFLOP/s to 1942 MFLOP/s. Patch applied. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: simd.diff URL: From jules at codesourcery.com Thu May 17 16:23:19 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 17 May 2007 12:23:19 -0400 Subject: [patch] Fix package.py describe Message-ID: <464C8177.8060009@codesourcery.com> This patch fixes package.py describe to work again (in conjunction the source configs patch). Patch applied. 
From your checkout directory, you should now be able to do: % scripts/package.py describe --configdir=scripts --configfile=scripts/trunk-gpl-snapshot.cfg --package Mondo suffix : -par-builtin-amd64-debug options : CXXFLAGS="-g -W -Wall" --with-g2c-copy=/usr/lib/gcc/x86_64-redhat-linux/3.4.3/libg2c.a --enable-fft=builtin --with-lapack=fortran-builtin --with-atlas-tarball=/home/jules/csl/atlas/atlas3.6.0_Linux_HAMMER64SSE2.tar.gz --with-atlas-cfg-opts="--with-mach=HAMMER64 --with-isa=SSE2 --with-int-type=int --with-string-convention=sun" --enable-mpi --enable-timer=x86_64_tsc ... lots more ... -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pkg.diff URL: From don at codesourcery.com Thu May 17 19:03:41 2007 From: don at codesourcery.com (Don McCoy) Date: Thu, 17 May 2007 13:03:41 -0600 Subject: [patch] fix for generating ALF dependencies Message-ID: <464CA70D.6060908@codesourcery.com> This patch corrects a problem generating the dependencies for ALF source files when building for Cell/B.E. The problem did not affect the way the source files were built, but the fix eliminates an error complaining about missing header files. Ok to commit? -- Don McCoy don (at) CodeSourcery (888) 776-0262 / (650) 331-3385, x712 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ad.changes URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: ad.diff URL: From stefan at codesourcery.com Thu May 17 19:06:44 2007 From: stefan at codesourcery.com (Stefan Seefeld) Date: Thu, 17 May 2007 15:06:44 -0400 Subject: [vsipl++] [patch] fix for generating ALF dependencies In-Reply-To: <464CA70D.6060908@codesourcery.com> References: <464CA70D.6060908@codesourcery.com> Message-ID: <464CA7C4.8030609@codesourcery.com> Don McCoy wrote: > This patch corrects a problem generating the dependencies for ALF source > files when building for Cell/B.E. The problem did not affect the way > the source files were built, but the fix eliminates an error complaining > about missing header files. > > Ok to commit? Looks good. Thanks, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 From jules at codesourcery.com Fri May 18 20:49:03 2007 From: jules at codesourcery.com (Jules Bergmann) Date: Fri, 18 May 2007 16:49:03 -0400 Subject: [vsipl++] [patch] fix for generating ALF dependencies In-Reply-To: <464CA70D.6060908@codesourcery.com> References: <464CA70D.6060908@codesourcery.com> Message-ID: <464E113F.8000500@codesourcery.com> Don McCoy wrote: > This patch corrects a problem generating the dependencies for ALF source > files when building for Cell/B.E. The problem did not affect the way > the source files were built, but the fix eliminates an error complaining > about missing header files. > > Ok to commit? Sorry for the late review, I know this has already been checked in. It looks like this moves the C dependency rule from the top-level makefile to the ALF makefile, and customizes it for the ALF flags. IIUC This only works if the only C files we have in the library are for ALF (which is currently the case -- we have C files in FFTW, ATLAS, and CLAPACK, but they get built with separate makefiles).
In retrospect, we should either: - qualify this rule to only apply for C files in ALF, or - keep the original general rule and use the fancy per-directory variable thing "$(call dir_var,$(dir $<),CFLAGS)" to use the ALF flags for C files in the ALF directories. I think the former solution is the easiest (and it is consistent with how we build the ALF C files). Replacing: > +%.d: %.c > + $(make_alf_dep) > + With: alf_depends := $(patsubst $(srcdir)/%.c, %.d, $(alf_src)) $(alf_depends): %d: %c $(make_alf_dep) should do the trick. Does that sound OK? If so, I'll fold it into a patch that fixes C++ dependencies on MCOE. (This is purely defensive programming -- we've gone 2+ years before adding C files to the library, it will probably be another 2+ years before we add more and really have to fix this!) -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Fri May 18 22:20:07 2007 From: don at codesourcery.com (Don McCoy) Date: Fri, 18 May 2007 16:20:07 -0600 Subject: [vsipl++] [patch] fix for generating ALF dependencies In-Reply-To: <464E113F.8000500@codesourcery.com> References: <464CA70D.6060908@codesourcery.com> <464E113F.8000500@codesourcery.com> Message-ID: <464E2697.6040409@codesourcery.com> Jules Bergmann wrote: > > In retrospect, we should either: > - qualify this rule to only apply for C files in ALF, or I like this approach. Thanks for pointing this out. I was a little concerned, but unsure what to do about it. 

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

From stefan at codesourcery.com  Sat May 19 04:17:30 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Sat, 19 May 2007 00:17:30 -0400
Subject: [vsipl++] [patch] fix for generating ALF dependencies
In-Reply-To: <464E113F.8000500@codesourcery.com>
References: <464CA70D.6060908@codesourcery.com> <464E113F.8000500@codesourcery.com>
Message-ID: <464E7A5A.3030906@codesourcery.com>

Jules Bergmann wrote:

> Replacing:
>
>> +%.d: %.c
>> +	$(make_alf_dep)
>> +
>
> With:
>
>     alf_depends := $(patsubst $(srcdir)/%.c, %.d, $(alf_src))
>
>     $(alf_depends): %d: %c
>         $(make_alf_dep)
>
> should do the trick.
>
> Does that sound OK?

Yes, I was going to suggest that enhancement, too.

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718

From assem at codesourcery.com  Mon May 21 00:14:41 2007
From: assem at codesourcery.com (Assem Salama)
Date: Sun, 20 May 2007 20:14:41 -0400
Subject: SIMD Loop fusion support for unaligned vectors
Message-ID: <4650E471.1060700@codesourcery.com>

Everyone,
 This patch adds support for unaligned vectors using loop fusion.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.05202007.1.log
Type: text/x-log
Size: 21513 bytes
Desc: not available
URL:

From jules at codesourcery.com  Mon May 21 15:32:27 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 21 May 2007 11:32:27 -0400
Subject: [vsipl++] SIMD Loop fusion support for unaligned vectors
In-Reply-To: <4650E471.1060700@codesourcery.com>
References: <4650E471.1060700@codesourcery.com>
Message-ID: <4651BB8B.2080707@codesourcery.com>

Assem Salama wrote:
> Everyone,
> This patch adds support for unaligned vectors using loop fusion.

Assem,

This looks promising. I have a couple of design comments below. Can
you address those and then send out an updated patch? Let me know if
you have any questions.

				-- Jules

>
> Index: src/vsip/opt/simd/simd.hpp
> ===================================================================
> --- src/vsip/opt/simd/simd.hpp (revision 171547)
> +++ src/vsip/opt/simd/simd.hpp (working copy)
> @@ -143,6 +143,10 @@
>    static simd_type load_unaligned(value_type const* addr)
>    { return *addr; }

[1] I see what you are trying to do with this variant of
load_unaligned, however, there are a couple of problems with the
design:

 - First the name 'load_unaligned'. It implies that a load is being
   done. In faux-SIMD and SSE/SSE2 cases this is true (although in
   those cases it is confusing what x0 and x1 are for since they
   aren't used). In the AltiVec case, no load is being done (but x0
   and x1 are being used).

   From below, it looks like the new 'load_unaligned' is intended to
   be used for AltiVec (where no load is being done). The faux-SIMD
   and SSE/SSE2 versions are provided to avoid compilation errors.

   The function's name should give a good indication of what it is to
   be used for. Instead I would call it something like
   "shift_unaligned" or "extract_unaligned".

 - Second, if the function can't easily be defined for SSE/SSE2,
   instead of giving it different behavior (doing a load instead of a
   permute), it would be better to either leave it undefined, or have
   it assert(0).

 - Finally, the permutation vector (sh) can be reused. Given this, it
   would expose more efficiency to have two functions:

	static simd_itype unaligned_permutation(value_type const* addr)
	{ return vec_lvsl(0, (value_type*)addr); }

	static simd_type permute(simd_type x0, simd_type x1, simd_itype sh)
	{ return vec_perm(x0, x1, sh); }

Does SSE/SSE2 have an equivalent permute? If so, let's use it.

If not, let's either:

 - have permute and unaligned_permutation assert(0), and define a
   static bool in the simd traits as to the presence of permute

	static bool const has_permute = true/false;

   By defining the functions to assert 0, the 'has_permute' check can
   either be performed at run-time or compile-time.

 - fake it with a union.

>
> +  static simd_type load_unaligned(simd_type x0, simd_type x1,
> +                                  value_type const* addr)
> +  { return *addr; }
> +
>    static simd_type load_scalar(value_type value)
>    { return value; }
>
> @@ -262,6 +266,13 @@
>      return vec_perm(x0, x1, sh);
>    }
>
> +  static simd_type load_unaligned(simd_type x0, simd_type x1,
> +                                  value_type const* addr)
> +  {
> +    __vector unsigned char sh = vec_lvsl(0, (value_type*)addr);
> +    return vec_perm(x0, x1, sh);
> +  }
> +
>    static simd_type load_scalar(value_type value)
>    {
>      union
> @@ -646,6 +676,9 @@
>    static simd_type load_unaligned(value_type* addr)
>    { return _mm_loadu_si128((simd_type*)addr); }
>
> +  static simd_type load_unaligned(simd_type x0, simd_type x1, value_type* addr)
> +  { return _mm_loadu_si128((simd_type*)addr); }
> +
>    static simd_type load_scalar(value_type value)
>    { return _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0,
>                          0, 0, 0, 0, 0, 0, 0, value); }
> Index: src/vsip/opt/simd/expr_evaluator.hpp
> ===================================================================
> --- src/vsip/opt/simd/expr_evaluator.hpp (revision 171353)
> +++ src/vsip/opt/simd/expr_evaluator.hpp (working copy)
> @@ -43,11 +43,11 @@
>  namespace simd
>  {
>
> -template
> +template

[2] what is A? Aligned? Document, or pick a more descriptive name (or
both :) You use "IsAligned" in expr_iterator.hpp. That would be good
here too.

>  struct Proxy_factory
>  {
>    typedef Direct_access_traits access_traits;
> -  typedef Proxy proxy_type;
> +  typedef Proxy proxy_type;
>    typedef typename Adjust_layout_dim<
>      1, typename Block_layout::layout_type>::type
>      layout_type;
> @@ -62,7 +62,15 @@
>      return dda.stride(0) == 1 &&
>        Simd_traits::alignment_of(dda.data()) == 0;
>    }
> -  static proxy_type
> +  static bool
> +  is_aligned(BlockT const& b)
> +  {
> +    Ext_data dda(b, SYNC_IN);
> +    return
> +      Simd_traits::alignment_of(dda.data()) == 0;
> +  }

[3] I don't see how this works.
Since rt_valid still checks alignment, rt_valid will be false for
unaligned data. There is no way to tell whether rt_valid is false
because data is unaligned, or because it doesn't have unit stride.

Rather than add an 'is_aligned()' method to each proxy, you should
modify 'rt_valid' to account for IsAligned. When IsAligned (aka A) is
true, it should check alignment and stride. When IsAligned is false,
it should only check stride:

	static bool
	rt_valid(BlockT const &b)
	{
	  Ext_data dda(b, SYNC_IN);
	  return dda.stride(0) == 1 &&
	    (!IsAligned ||
	     Simd_traits::alignment_of(dda.data()) == 0);
	}

> @@ -221,20 +262,17 @@
>      return (dda.stride(0) == 1 &&
>              simd::Simd_traits::
>                alignment_of(dda.data()) == 0 &&
> -            simd::Proxy_factory::rt_valid(rhs));
> +            simd::Proxy_factory::rt_valid(rhs));

[4] Again, I don't see how this works for unaligned data. rt_valid
will be false if any data is unaligned, preventing this evaluator from
being used. The is_aligned check below will always be true, because
exec() will only be called when rt_valid is true, i.e. when the data
is aligned.

With the rt_valid change suggested above, you could have two
evaluators: Simd_loop_fusion and Simd_loop_fusion_unaligned. They
could share a common base class that takes alignment as a template
parameter.

(You could have a single evaluator, but that requires either checking
the Proxy's rt_valid a third time (bad), or keeping state between the
evaluator's rt_valid and exec (difficult, since no evaluator object is
created). Using two evaluators captures that state in the conditional.)
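
A minimal, self-contained sketch of the two-evaluator idea from [4],
under stated assumptions: Block, alignment_of, Loop_fusion_base and
select_evaluator are hypothetical stand-ins invented for illustration,
not the library's Ext_data / Simd_traits / evaluator machinery, and a
16-byte vector alignment is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Simd_traits<T>::alignment_of: byte offset of
// `p` past the previous 16-byte boundary (0 means aligned).
inline std::size_t alignment_of(void const* p)
{
  return reinterpret_cast<std::uintptr_t>(p) % 16;
}

// Hypothetical minimal block: a data pointer plus an element stride.
struct Block
{
  float const*   data;
  std::ptrdiff_t stride;
};

// Common base: alignment is a compile-time parameter, so rt_valid only
// insists on aligned data when IsAligned is true.
template <bool IsAligned>
struct Loop_fusion_base
{
  static bool rt_valid(Block const& b)
  {
    return b.stride == 1 &&
           (!IsAligned || alignment_of(b.data) == 0);
  }
};

struct Loop_fusion_aligned   : Loop_fusion_base<true>  {};
struct Loop_fusion_unaligned : Loop_fusion_base<false> {};

// Try the aligned evaluator first, then the unaligned one.
// Returns 0 (aligned), 1 (unaligned), or 2 (neither; generic path).
inline int select_evaluator(Block const& b)
{
  if (Loop_fusion_aligned::rt_valid(b))   return 0;
  if (Loop_fusion_unaligned::rt_valid(b)) return 1;
  return 2;
}
```

The point of the sketch is that which evaluator accepted the
expression already records whether the data was aligned, so exec()
would not need to re-check it.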

>    }
>
>    static void exec(LB& lhs, RB const& rhs)
>    {
>      typedef typename simd::LValue_access_traits WAT;
> -    typedef typename simd::Proxy_factory::access_traits EAT;
>      length_type const vec_size =
>        simd::Simd_traits::vec_size;
>      Ext_data dda(lhs, SYNC_OUT);
>      length_type const size = dda.size(0);
>      length_type n = size;
> -    simd::Proxy lp(dda.data());
> -    simd::Proxy rp(simd::Proxy_factory::create(rhs));
>  #if 0
>      // simple iterator-based loop. It has the most concise syntax,
>      // but generates suboptimal code with gcc 3.4
> @@ -271,12 +309,35 @@
>      // loop using proxy interface. This generates the best code
>      // with gcc 3.4 (with gcc 4.1 the difference to the first case
>      // above is negligible).
> -    while (n >= vec_size)
> -    {
> -      lp.store(rp.load());
> -      n -= vec_size;
> -      lp.increment();
> -      rp.increment();
> +
> +    // If any of the blocks are unaligned, we treat all of the blocks as
> +    // unaligned
> +    if(simd::Proxy_factory::is_aligned(rhs)) {
> +      typedef typename simd::Proxy_factory::access_traits EAT;
> +
> +      simd::Proxy lp(dda.data());
> +      simd::Proxy rp(simd::Proxy_factory::create(rhs));
> +
> +      while (n >= vec_size)
> +      {
> +        lp.store(rp.load());
> +        n -= vec_size;
> +        lp.increment();
> +        rp.increment();
> +      }
> +    } else {

[5] Coding standards: put braces on separate lines for consistency

> +      typedef typename simd::Proxy_factory::access_traits EAT;
> +
> +      simd::Proxy lp(dda.data());
> +      simd::Proxy rp(simd::Proxy_factory::create(rhs));
> +
> +      while (n >= vec_size)
> +      {
> +        lp.store(rp.load());
> +        n -= vec_size;
> +        lp.increment();
> +        rp.increment();
> +      }
>      }
>  #endif
>      // Process the remainder, using simple loop fusion.
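
For readers unfamiliar with the loop shape in the hunk above, here is
a rough scalar sketch of it, assuming a made-up vector width of 4 and
a made-up expression (dst = 2 * src). fused_assign and vec_size are
hypothetical names; the inner for loop only stands in for
lp.store(rp.load()), it is not SIMD code.

```cpp
#include <cassert>
#include <cstddef>

// Assumed number of elements per "vector" (hypothetical).
std::size_t const vec_size = 4;

// Evaluates dst[i] = 2 * src[i] for i in [0, size).
// Returns how many elements were handled by the chunked main loop.
inline std::size_t fused_assign(float* dst, float const* src, std::size_t size)
{
  std::size_t n       = size;
  std::size_t chunked = 0;

  // Main loop: one full vector of vec_size elements per iteration.
  while (n >= vec_size)
  {
    for (std::size_t i = 0; i != vec_size; ++i)  // stands in for lp.store(rp.load())
      dst[i] = 2.0f * src[i];
    n       -= vec_size;
    dst     += vec_size;                         // lp.increment()
    src     += vec_size;                         // rp.increment()
    chunked += vec_size;
  }

  // Remainder: fewer than vec_size elements are left, one at a time.
  for (; n != 0; --n)
    *dst++ = 2.0f * *src++;

  return chunked;
}
```

With 10 elements, the main loop would cover the first 8 and the
remainder loop the last 2.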
> Index: src/vsip/opt/simd/expr_iterator.hpp
> ===================================================================
> +template
> +class Proxy,false >
> +{
> +public:
> +  typedef T value_type;
> +  typedef Simd_traits simd;
> +  typedef typename simd::simd_type simd_type;
> +
> +  Proxy(value_type const *ptr) : ptr_unaligned_(ptr)
> +  {
> +    ptr_aligned_ = (simd_type*)((intptr_t)ptr & ~(simd::alignment-1));
> +
> +    // We do not need x0 and x1 if we are using sse because sse has
> +    // a uload intruction.
> +#if !defined(__SSE__) and !defined(_SSE2__)
> +    x0_ = simd::load((value_type*)ptr_aligned_);
> +    x1_ = simd::load((value_type*)(ptr_aligned_+simd::vec_size));
> +#endif

[6] Don't mix the pre-processor and traits like this! The simd traits
class abstracts the interface to the SIMD ISA. Using the preprocessor
on top of the simd traits creates redundancy that makes the code more
difficult to maintain and understand.

> +  }
> +
> +  simd_type load() const
> +  {
> +    return simd::load_unaligned(x0_, x1_, ptr_unaligned_);
> +  }
> +
> +  void set_x0(simd_type x0) { x0_ = x0; }

[7] what is set_x0 for?

> +
> +  void increment(length_type n = 1)
> +  {
> +    ptr_unaligned_ += n * Simd_traits::vec_size;
> +    ptr_aligned_ += n;
> +
> +    // We do not need x0 and x1 if we are using sse because sse has
> +    // a uload intruction.
> +#if !defined(__SSE__) and !defined(_SSE2__)
> +    // update x0
> +    x0_ = (n == 1)? x1_:simd::load((value_type*)ptr_aligned_);
> +
> +    // update x1
> +    x1_ = simd::load((value_type*)(ptr_aligned_+simd::vec_size));
> +#endif
> +  }
> +
> +private:
> +  simd_type x0_;
> +  simd_type x1_;
> +
> +  simd_type const *ptr_aligned_;
> +  value_type const *ptr_unaligned_;
> +};

[8] You're using the preprocessor because some SIMD ISAs support
permute (AltiVec), and some don't (SSE).

Instead, let's add a static const 'has_permute' that indicates the
presence of permute, as suggested above in [1]. Then you can make the
use of permute conditional.

Because using permute requires storage for x0 and x1, the best time to
make the decision is at compile-time with a template parameter:

	template
	class Proxy, false>
	  : public Proxy_direct_access_helper::has_permute>
	{
	  ...
	};

Then you can specialize Proxy_direct_access_helper to use permute if
it is available

	template
	struct Proxy_direct_access_helper;

	template
	struct Proxy_direct_access_helper
	{
	  Proxy(value_type const *ptr)
	    : ptr_unaligned_(ptr),
	      ptr_aligned_ ((simd_type*)((intptr_t)ptr & ~(simd::alignment-1))),
	      x1_ (simd::load((value_type*)(ptr_aligned_+simd::vec_size))),
	      perm_ (simd::unaligned_permutation((value_type*)(ptr)))
	  {}

	  simd_type load() const
	  {
	    // update x0
	    x0_ = (n == 1)? x1_:simd::load((value_type*)ptr_aligned_);

	    // update x1
	    x1_ = simd::load((value_type*)(ptr_aligned_+simd::vec_size));

	    return simd::permute(x0_, x1_, perm_);
	  }

	  void increment(length_type n = 1)
	  {
	    ptr_unaligned_ += n * Simd_traits::vec_size;
	    ptr_aligned_ += n;
	  }

	private:
	  simd_type x0_;
	  simd_type x1_;
	  simd_type perm_;

	  simd_type const *ptr_aligned_;
	  value_type const *ptr_unaligned_;
	};

> @@ -522,7 +578,7 @@
>    B b_;
>    C c_;
>  };
> -
> +/*
>  template
>  struct Iterator
>  {
> @@ -548,7 +604,7 @@
>    r += n;
>    return r;
>  }
> -
> +*/

[9] Why is Iterator being commented out?

>  } // namespace vsip::impl::simd
>  } // namespace vsip::impl
>  } // namespace vsip

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com  Tue May 22 15:49:21 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 22 May 2007 11:49:21 -0400
Subject: [patch] Misc bits; prep for updating svn:externals
Message-ID: <46531101.7030602@codesourcery.com>

I'm clearing out my SVN checkout in preparation for removing FFTW from
svn:externals (I'm also going to move the ref-impl tests from
svn:externals to the main repo).

This patch:
 - Fixes dependency files on MCOE (GreenHills curiously uses .o in the
   dependency file, even though it generates .oppc files).
 - Guards the ALF dependency rule.
 - Fixes configure to recognize Ubuntu 7.04's ATLAS (zelda)
 - Fixes simd.hpp to not use altivec intrinsics that don't exist
   (altivec only defines vec_cmpge and vec_cmple for float. But
   vec_cmplt and vec_cmpgt are still available).

Patch applied.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com  Tue May 22 15:54:03 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 22 May 2007 11:54:03 -0400
Subject: [vsipl++] [patch] Misc bits; prep for updating svn:externals
In-Reply-To: <46531101.7030602@codesourcery.com>
References: <46531101.7030602@codesourcery.com>
Message-ID: <4653121B.6070907@codesourcery.com>

Jules Bergmann wrote:
> I'm clearing out my SVN checkout in preparation for removing FFTW from
> svn:externals (I'm also going to move the ref-impl tests from
> svn:externals to the main repo).
>
> This patch:
>  - Fixes dependency files on MCOE (GreenHills curiously uses .o in the
>    dependency file, even though it generates .oppc files).
>  - Guards the ALF dependency rule.
>  - Fixes configure to recognize Ubuntu 7.04's ATLAS (zelda)
>  - Fixes simd.hpp to not use altivec intrinsics that don't exist
>    (altivec only defines vec_cmpge and vec_cmple for float. But
>    vec_cmplt and vec_cmpgt are still available).
>
> Patch applied.
>

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cfg.diff
URL:

From stefan at codesourcery.com  Tue May 22 16:27:03 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Tue, 22 May 2007 12:27:03 -0400
Subject: patch: Fix build failure in scripting code.
Message-ID: <465319D7.7050406@codesourcery.com>

The attached patch fixes a build failure in the (python) scripting bindings.
Patch is applied.

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scripting.patch
Type: text/x-patch
Size: 714 bytes
Desc: not available
URL:

From stefan at codesourcery.com  Tue May 22 16:33:36 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Tue, 22 May 2007 12:33:36 -0400
Subject: [vsipl++] [patch] Misc bits; prep for updating svn:externals
In-Reply-To: <4653121B.6070907@codesourcery.com>
References: <46531101.7030602@codesourcery.com> <4653121B.6070907@codesourcery.com>
Message-ID: <46531B60.3090100@codesourcery.com>

Jules Bergmann wrote:

> Index: configure.ac
> ===================================================================
> --- configure.ac (revision 171903)
> +++ configure.ac (working copy)

> @@ -1976,12 +1985,12 @@
>      echo "HOST: $host BUILD: $build"
>      if test "$host" != "$build"; then
>        # Can't cross-compile builtin atlas
> -      lapack_packages="atlas generic1 generic2 simple-builtin"
> +      lapack_packages="atlas generic_wo_blas generic_with_blas generic_v3_wo_blas generic_v3_with_blas simple-builtin"
>      else
> -      lapack_packages="atlas generic1 generic2 builtin"
> +      lapack_packages="atlas generic_wo_blas generic_with_blas generic_v3_wo_blas generic_v3_with_blas builtin"
>      fi
>    elif test "$with_lapack" == "generic"; then
> -    lapack_packages="generic1 generic2"
> +    lapack_packages="generic_wo_blas generic_with_blas generic_v3_wo_blas generic_v3_with_blas"
>    elif test "$with_lapack" == "simple-builtin"; then
>      lapack_packages="simple-builtin";
>    else

Jules,

some weeks ago we were discussing the possibility to allow users to pass
a comma-separated list of backends to --with-lapack, obsoleting some of
the (compound) options above. Do you still think we may do that ?

(Also, some documentation concerning the meaning of the various options
may help to illustrate the parameter space.
I'd offer to write that, but each time I spend more than two minutes
thinking about it I'm getting confused again. :-) )

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718

From jules at codesourcery.com  Tue May 22 16:36:46 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 22 May 2007 12:36:46 -0400
Subject: [vsipl++] [patch] Misc bits; prep for updating svn:externals
In-Reply-To: <4653121B.6070907@codesourcery.com>
References: <46531101.7030602@codesourcery.com> <4653121B.6070907@codesourcery.com>
Message-ID: <46531C1E.7050101@codesourcery.com>

>> I'm clearing out my SVN checkout in preparation for removing FFTW from
>> svn:externals (I'm also going to move the ref-impl tests from
>> svn:externals to the main repo).

In the future, you will need to manually check out FFTW into vendor/fftw.

Before your next update, you will need to remove your tests/ref-impl
directory (otherwise you may see an error "Failed to add directory
'tests/ref-impl': object of the same name already exists").

Patch applied.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ext.diff
URL:

From jules at codesourcery.com  Tue May 22 19:30:57 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 22 May 2007 15:30:57 -0400
Subject: [vsipl++] [patch] Misc bits; prep for updating svn:externals
In-Reply-To: <46531B60.3090100@codesourcery.com>
References: <46531101.7030602@codesourcery.com> <4653121B.6070907@codesourcery.com> <46531B60.3090100@codesourcery.com>
Message-ID: <465344F1.50302@codesourcery.com>

Stefan Seefeld wrote:
> Jules Bergmann wrote:
>
>> Index: configure.ac
>> ===================================================================
>> --- configure.ac (revision 171903)
>> +++ configure.ac (working copy)
>
>> @@ -1976,12 +1985,12 @@
>>      echo "HOST: $host BUILD: $build"
>>      if test "$host" != "$build"; then
>>        # Can't cross-compile builtin atlas
>> -      lapack_packages="atlas generic1 generic2 simple-builtin"
>> +      lapack_packages="atlas generic_wo_blas generic_with_blas generic_v3_wo_blas generic_v3_with_blas simple-builtin"
>>      else
>> -      lapack_packages="atlas generic1 generic2 builtin"
>> +      lapack_packages="atlas generic_wo_blas generic_with_blas generic_v3_wo_blas generic_v3_with_blas builtin"
>>      fi
>>    elif test "$with_lapack" == "generic"; then
>> -    lapack_packages="generic1 generic2"
>> +    lapack_packages="generic_wo_blas generic_with_blas generic_v3_wo_blas generic_v3_with_blas"
>>    elif test "$with_lapack" == "simple-builtin"; then
>>      lapack_packages="simple-builtin";
>>    else
>
> Jules,
>
> some weeks ago we were discussing the possibility to allow users to pass
> a comma-separated list of backends to --with-lapack, obsoleting some of
> the (compound) options above. Do you still think we may do that ?

Yes, definitely. This was meant to be a step in that direction
(replacing cryptic 'generic1' with less-cryptic 'generic_wo_blas', etc).

I would like to allow users to specify a comma separated list of
backends, but I would like to keep some of the current smarts, such as
avoiding ATLAS when cross-compiling (since ATLAS can't be cross
compiled) and searching for the different permutations of lapack
({with/without BLAS} x {lapack.a / lapack-3.a}).

I don't know why there is so much variation in lapack/atlas/mkl. What
I would *really* like is for those libraries to provide us with a
pkg-config file :)

> (Also, some documentation concerning the meaning of the various options
> may help to illustrate the parameter space. I'd offer to write that, but
> each time I spend more than two minutes thinking about it I'm getting
> confused again. :-) )

Ok. Let's start with the documentation embedded in configure. Right
now it documents the following lapack choices ('--with-lapack=PKG'):

  mkl             -- Intel Math Kernel Library
  acml            -- AMD Core Math Library
  atlas           -- System ATLAS/LAPACK installation
  generic         -- system generic LAPACK installation
  builtin         -- Sourcery VSIPL++'s builtin ATLAS/C-LAPACK
  fortran-builtin -- Sourcery VSIPL++'s builtin ATLAS/LAPACK
  simple-builtin  -- Lapack that doesn't require atlas.

I think it is fairly descriptive. We could clarify the builtin
variants, perhaps:

  builtin         -- Sourcery VSIPL++'s builtin ATLAS/C-LAPACK.
                     Requires only C compiler. Cannot be
                     cross-compiled.
  fortran-builtin -- Sourcery VSIPL++'s builtin ATLAS/C-LAPACK.
                     Requires C and FORTRAN compilers. Cannot be
                     cross-compiled. Typically has better
                     performance than 'builtin'.
  simple-builtin  -- Sourcery VSIPL++'s builtin C-LAPACK. Does not
                     use ATLAS so performance is lower than
                     'builtin' and 'fortran-builtin'. However, only
                     requires C compiler and can be cross-compiled.

Although I'm not sure we need that level of detail in configure
itself. It could certainly go into the quickstart though.

What do you think? What additional clarification would be useful?

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From assem at codesourcery.com  Tue May 22 20:58:01 2007
From: assem at codesourcery.com (Assem Salama)
Date: Tue, 22 May 2007 16:58:01 -0400
Subject: SIMD loop fusion support for unaligned
Message-ID: <46535959.5070809@codesourcery.com>

Everyone,
 This patch adds support for unaligned vectors in SIMD loop fusion.

Thanks,
Assem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: svn.diff.05222007.1.log
Type: text/x-log
Size: 25847 bytes
Desc: not available
URL:

From jules at codesourcery.com  Tue May 22 21:09:08 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 22 May 2007 17:09:08 -0400
Subject: [patch] Fix benchmark install; fix ref-impl tests for MCOE
Message-ID: <46535BF4.9000405@codesourcery.com>

Patch applied.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: misc.diff
URL:

From don at codesourcery.com  Tue May 22 21:25:37 2007
From: don at codesourcery.com (Don McCoy)
Date: Tue, 22 May 2007 15:25:37 -0600
Subject: [vsipl++] [patch] Fix benchmark install; fix ref-impl tests for MCOE
In-Reply-To: <46535BF4.9000405@codesourcery.com>
References: <46535BF4.9000405@codesourcery.com>
Message-ID: <46535FD1.6020300@codesourcery.com>

Jules Bergmann wrote:
> Index: benchmarks/GNUmakefile.inc.in
> ===================================================================
> --- benchmarks/GNUmakefile.inc.in (revision 171918)
> +++ benchmarks/GNUmakefile.inc.in (working copy)
> @@ -80,7 +80,7 @@
>  	rm -f $(benchmarks_targets) $(benchmarks_static_targets)
>
>  # Install benchmark source code and executables
> -install::
> +install:: benchmarks
>  	$(INSTALL) -d $(DESTDIR)$(pkgdatadir)/benchmarks
>  	$(INSTALL) -d $(DESTDIR)$(pkgdatadir)/benchmarks/lapack
>  	$(INSTALL) -d $(DESTDIR)$(pkgdatadir)/benchmarks/ipp
>

Similarly, will we need something like this for
benchmarks/hpec_kernel/GNUmakefile.inc.in?

    install:: hpec_kernel

Idly, I wonder if 'make' shouldn't make whatever 'make install'
needs... Or is there no hard-and-fast rule there?

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

From jules at codesourcery.com  Wed May 23 11:15:23 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 23 May 2007 07:15:23 -0400
Subject: [vsipl++] [patch] Fix benchmark install; fix ref-impl tests for MCOE
In-Reply-To: <46535FD1.6020300@codesourcery.com>
References: <46535BF4.9000405@codesourcery.com> <46535FD1.6020300@codesourcery.com>
Message-ID: <4654224B.40001@codesourcery.com>

> Similarly, will we need something like this for
> benchmarks/hpec_kernel/GNUmakefile.inc.in?
>
>    install:: hpec_kernel

Thanks, patch applied.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: misc.diff
URL:

From don at codesourcery.com  Wed May 23 22:49:32 2007
From: don at codesourcery.com (Don McCoy)
Date: Wed, 23 May 2007 16:49:32 -0600
Subject: [patch] more cleanup with benchmarks
Message-ID: <4654C4FC.40300@codesourcery.com>

This patch is really two separate ones. The configure change was as a
result of today's discussion, but the other changes were from last week
-- apparently I forgot to post this.

Ok to commit?

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ms.changes
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ms.diff
URL:

From stefan at codesourcery.com  Wed May 23 22:58:14 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Wed, 23 May 2007 18:58:14 -0400
Subject: [vsipl++] [patch] more cleanup with benchmarks
In-Reply-To: <4654C4FC.40300@codesourcery.com>
References: <4654C4FC.40300@codesourcery.com>
Message-ID: <4654C706.3070601@codesourcery.com>

Don McCoy wrote:
> This patch is really two separate ones. The configure change was as a
> result of today's discussion, but the other changes were from last week
> -- apparently I forgot to post this.
>
> Ok to commit?
>
> Regards,
>
>
> ------------------------------------------------------------------------
>
> 2007-05-23 Don McCoy
>
> 	* configure.ac: Search for the compilers that come with the Cell SDK
> 	  over the default system ones (gcc and g++).
> 	* benchmarks/cell/bw.cpp: Remove debug code.
> 	* benchmarks/makefile.standalone.in: Allow configuration
> 	  parameters to set build variables to default values.
> 	* benchmarks/hpec_kernel/GNUmakefile.inc.in: Conditionalize
> 	  building of SVD benchmark on whether or not LAPACK is available.
>
>
> ------------------------------------------------------------------------
>
> Index: configure.ac
> ===================================================================
> --- configure.ac (revision 172077)
> +++ configure.ac (working copy)
> @@ -230,6 +230,11 @@
>
>    AC_CHECK_PROGS(CC_SPU, [spu-gcc])
>    AC_CHECK_PROGS(EMBED_SPU, [ppu-embedspu embedspu])
> +  AC_PROG_CXX(ppu-g++ g++)
> +  AC_PROG_CC(ppu-gcc gcc)
> +  CXXFLAGS="-m32 $CXXFLAGS"
> +  LDFLAGS="-m32 $LDFLAGS"
> +  CFLAGS="-m32 $CFLAGS"
>
>    AC_DEFINE_UNQUOTED(VSIP_IMPL_CBE_SDK, 1,
>      [Set to 1 to support Cell Broadband Engine.])

I'm not sure the above is valid. At least I would very carefully check
autoconf docs, because:

* We already call AC_PROG_CXX (et al.) earlier in configure.ac, and I
  wouldn't be surprised if there were side-effects introduced by this
  double appearance.

* You only present a very limited choice of compilers (admittedly still
  sufficient for that particular platform, at this time).

Regards,
		Stefan

PS: Please make sure checkins remain atomic, i.e. don't address
    unrelated issues. This makes it easier to roll back, or merge,
    if ever we must.

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718

From stefan at codesourcery.com  Thu May 24 01:52:11 2007
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Wed, 23 May 2007 21:52:11 -0400
Subject: [vsipl++] [patch] more cleanup with benchmarks
In-Reply-To: <4654C706.3070601@codesourcery.com>
References: <4654C4FC.40300@codesourcery.com> <4654C706.3070601@codesourcery.com>
Message-ID: <4654EFCB.5010501@codesourcery.com>

Don,

Stefan Seefeld wrote:
> Don McCoy wrote:

>> Index: configure.ac
>> ===================================================================
>> --- configure.ac (revision 172077)
>> +++ configure.ac (working copy)
>> @@ -230,6 +230,11 @@
>>
>>    AC_CHECK_PROGS(CC_SPU, [spu-gcc])
>>    AC_CHECK_PROGS(EMBED_SPU, [ppu-embedspu embedspu])
>> +  AC_PROG_CXX(ppu-g++ g++)
>> +  AC_PROG_CC(ppu-gcc gcc)
>> +  CXXFLAGS="-m32 $CXXFLAGS"
>> +  LDFLAGS="-m32 $LDFLAGS"
>> +  CFLAGS="-m32 $CFLAGS"
>>
>>    AC_DEFINE_UNQUOTED(VSIP_IMPL_CBE_SDK, 1,
>>      [Set to 1 to support Cell Broadband Engine.])
>
> I'm not sure the above is valid. At least I would very carefully check
> autoconf docs, because:
>
> * We already call AC_PROG_CXX (et al.) earlier in configure.ac, and I
>   wouldn't be surprised if there were side-effects introduced by this double appearance.

Here is a slightly reworked patch (for configure.ac) that addresses my concern:
AC_PROG_CXX and AC_PROG_CC are called once, but with context-dependent arguments.
I can't pretend to fully understand the involved m4 macros, or even the generated
configure script, but the general conditional logic appears to be fine now.

What do you think ?

Thanks,
		Stefan

-- 
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718
-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure.ac.diff
Type: text/x-patch
Size: 1057 bytes
Desc: not available
URL:

From don at codesourcery.com  Thu May 24 03:33:25 2007
From: don at codesourcery.com (Don McCoy)
Date: Wed, 23 May 2007 21:33:25 -0600
Subject: [vsipl++] [patch] more cleanup with benchmarks
In-Reply-To: <4654EFCB.5010501@codesourcery.com>
References: <4654C4FC.40300@codesourcery.com> <4654C706.3070601@codesourcery.com> <4654EFCB.5010501@codesourcery.com>
Message-ID: <46550785.1050200@codesourcery.com>

Stefan Seefeld wrote:
> Here is a slightly reworked patch (for configure.ac) that addresses my concern:
> AC_PROG_CXX and AC_PROG_CC are called once, but with context-dependent arguments.
> I can't pretend to fully understand the involved m4 macros, or even the generated
> configure script, but the general conditional logic appears to be fine now.
>
> What do you think ?
>

Looks good to me. Thanks!

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

From don at codesourcery.com  Thu May 24 20:22:41 2007
From: don at codesourcery.com (Don McCoy)
Date: Thu, 24 May 2007 14:22:41 -0600
Subject: [vsipl++] [patch] more cleanup with benchmarks
In-Reply-To: <4654C4FC.40300@codesourcery.com>
References: <4654C4FC.40300@codesourcery.com>
Message-ID: <4655F411.60307@codesourcery.com>

Same patch, factoring out the configure change. Ok now?

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ms2.changes
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ms2.diff
URL:

From jules at codesourcery.com  Wed May 30 14:00:12 2007
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 30 May 2007 10:00:12 -0400
Subject: [vsipl++] [patch] more cleanup with benchmarks
In-Reply-To: <4655F411.60307@codesourcery.com>
References: <4654C4FC.40300@codesourcery.com> <4655F411.60307@codesourcery.com>
Message-ID: <465D836C.8050000@codesourcery.com>

Don McCoy wrote:
> Same patch, factoring out the configure change.
>
> Ok now?

Don, this looks good, please check it in.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705