From jules at codesourcery.com Thu Oct 2 19:26:53 2008 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 02 Oct 2008 15:26:53 -0400 Subject: [patch] CBE split-complex vmmul Message-ID: <48E5207D.5010404@codesourcery.com> This adds CBE dispatch for split-complex vmmul. This is a bit late in the release, but is important to a customer. Tested with cbe32/debug vmmul related tests (regr-vmmul.cpp, vmmul.cpp, fastconv.cpp, fc1-parallel.cpp) and benchmarks (vmmul.cpp). It also includes a new option (--svpp-tm-verbose) that makes the task_manager print out when a new task is loaded. This was useful for optimizing SSAR. This requires a CML patch to work. Without the CML patch it won't do anything. Ok to apply? -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: split-vmmul.diff URL: From stefan at codesourcery.com Thu Oct 2 20:34:56 2008 From: stefan at codesourcery.com (Stefan Seefeld) Date: Thu, 02 Oct 2008 16:34:56 -0400 Subject: [vsipl++] [patch] CBE split-complex vmmul In-Reply-To: <48E5207D.5010404@codesourcery.com> References: <48E5207D.5010404@codesourcery.com> Message-ID: <48E53070.4060909@codesourcery.com> Jules Bergmann wrote: > This adds CBE dispatch for split-complex vmmul. This is a bit late in > the release, but is important to a customer. > > Tested with cbe32/debug vmmul related tests (regr-vmmul.cpp, vmmul.cpp, > fastconv.cpp, fc1-parallel.cpp) and benchmarks (vmmul.cpp). > > It also includes a new option (--svpp-tm-verbose) that makes the > task_manager print out when a new task is loaded. This was useful for > optimizing SSAR. This requires a CML patch to work. Without the CML > patch it won't do anything. The patch looks good, though I do have some comments. While I consider them important, I don't think they are important enough to hold up the release, so please check this patch in as is, then let's discuss how to address these points: * What is the use case for the 'verbose' flag ? How does it differ from profiling ? Couldn't it be integrated into the profiler, such that users would see task initialization / finalization in the profile logs ? > - int argc = 3; > - char* argv[3]; > - char number[256]; > + int argc = 5; > + char* argv[5]; > + char number[256], t_verbose[256]; > sprintf(number, "%u", num_accelerators); > + sprintf(t_verbose, "%u", verbose ? 1 : 0); * AFAIU, we agreed to using stdio for SPE code, and std::iostream for anything else. Shouldn't we thus use std::stringstream here ? * Compiling this and similar code with GCC 4.3 results in warnings related to conversion from char const* to char* (the assignment of string literals to argv), and I wondered (quickly) how to get around that. We may want to get back to that when preparing the 2.1 release on Fedora 9, which uses GCC 4.3. Thanks, Stefan -- Stefan Seefeld CodeSourcery stefan at codesourcery.com (650) 331-3385 x718 From mark at codesourcery.com Thu Oct 2 20:47:16 2008 From: mark at codesourcery.com (Mark Mitchell) Date: Thu, 02 Oct 2008 13:47:16 -0700 Subject: [vsipl++] [patch] CBE split-complex vmmul In-Reply-To: <48E53070.4060909@codesourcery.com> References: <48E5207D.5010404@codesourcery.com> <48E53070.4060909@codesourcery.com> Message-ID: <48E53354.4010500@codesourcery.com> Stefan Seefeld wrote: > * Compiling this and similar code with GCC 4.3 results in warnings > related to conversion from char const* to char* (the assignment of > string literals to argv) That is indeed deprecated. It was grandfathered in for compatibility back to K&R C, but, obviously, string literals are const, so assignments from "char const *" to "char *" require a cast -- and you'd better be sure nobody is going to try to modify the string, as that will cause a SEGV. I generally declare argv as: const char *argv[] because you generally can't expect to change the strings in argv. (Remember that they're coming from the OS via black magic; who knows where they live.) You can, however, expect to change the array elements. So: argv[1] = "foo" is sensible, but: argv[1][3] = 'a' is weird. FWIW, -- Mark Mitchell CodeSourcery mark at codesourcery.com (650) 331-3385 x713 From jules at codesourcery.com Thu Oct 2 21:08:13 2008 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 02 Oct 2008 17:08:13 -0400 Subject: [vsipl++] [patch] CBE split-complex vmmul In-Reply-To: <48E53070.4060909@codesourcery.com> References: <48E5207D.5010404@codesourcery.com> <48E53070.4060909@codesourcery.com> Message-ID: <48E5383D.8010304@codesourcery.com> > * What is the use case for the 'verbose' flag ? How does it differ from > profiling ? Couldn't it be integrated into the profiler, such that users > would see task initialization / finalization in the profile logs ? The use case is determining what kernels are being used/loaded/reused during an execution. For example: cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: svpp_kernels.so alf_vcopy_s_spu reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse cached_alf_task_create: spu_fftw_kernels.so fftwf_spu_fftw reuse This information can in turn drive optimization. As a library developer, if you see a kernel being used repeatedly, but not being reused, it may be case where you can standardize the block sizes across invocations. As a user, if you see a low-performance kernel (for example vmul) causing a high-performance kernel (for example fft) to reload, you can look for fusion opportunities or somehow disabling the low-performanc kernel. Yes, it could go into the profiler somehow, but some design needs to be done. The main reason I included it in the patch was because split vmmul required changes to task_manager.hpp. I'll separate those patches. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From jules at codesourcery.com Fri Oct 3 14:09:57 2008 From: jules at codesourcery.com (Jules Bergmann) Date: Fri, 03 Oct 2008 10:09:57 -0400 Subject: [vsipl++] [patch] CBE split-complex vmmul In-Reply-To: <48E5383D.8010304@codesourcery.com> References: <48E5207D.5010404@codesourcery.com> <48E53070.4060909@codesourcery.com> <48E5383D.8010304@codesourcery.com> Message-ID: <48E627B5.6070705@codesourcery.com> > The main reason I included it in the patch was because split vmmul > required changes to task_manager.hpp. I'll separate those patches. Patch applied as attached. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: split-vmmul.diff URL: From jules at codesourcery.com Mon Oct 6 15:29:33 2008 From: jules at codesourcery.com (Jules Bergmann) Date: Mon, 06 Oct 2008 11:29:33 -0400 Subject: [patch] Dump FFT BE configury from check_config Message-ID: <48EA2EDD.2080300@codesourcery.com> Patch applied. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: chkcfg.diff URL: From jules at codesourcery.com Mon Oct 6 15:36:16 2008 From: jules at codesourcery.com (Jules Bergmann) Date: Mon, 06 Oct 2008 11:36:16 -0400 Subject: [patch] Fix fft_be to check VSIP_IMPL_CBE_SDK_FFT Message-ID: <48EA3070.8060704@codesourcery.com> Fft_be was checking VSIP_IMPL_CBE_SDK instead of VSIP_IMPL_CBE_SDK_FFT. This causes unexpected failures when CBE SDK is enabled, but the CBE SDK FFT BE is not. Patch applied. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fft_be.diff URL: From jules at codesourcery.com Tue Oct 7 21:26:41 2008 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 07 Oct 2008 17:26:41 -0400 Subject: [patch] Fix ALF kernels for 64-bit addresses Message-ID: <48EBD411.5000306@codesourcery.com> This fixes ALF kernels to avoid DMA lists when 64-bit addresses are possible. Otherwise it is possible that the upper 32-bits of address may differ. It also relaxes the is_positive check for solver-cholesky. Patch applied. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: alf64.diff URL: