From stefan at codesourcery.com Wed Mar 1 16:28:55 2006
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Wed, 01 Mar 2006 11:28:55 -0500
Subject: patch: Fixes for the SAL FFT backend, some speedup.
Message-ID: <4405CBC7.4010308@codesourcery.com>

The attached patch fixes failures in the fft.cpp tests revealed when we
discovered that some (most ?) tests weren't actually executed due to a typo.

Regards,
Stefan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch
URL:

From jules at codesourcery.com Thu Mar 2 19:49:54 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Thu, 02 Mar 2006 14:49:54 -0500
Subject: [vsipl++] patch: Fixes for the SAL FFT backend, some speedup.
In-Reply-To: <4405CBC7.4010308@codesourcery.com>
References: <4405CBC7.4010308@codesourcery.com>
Message-ID: <44074C62.9040304@codesourcery.com>

Stefan Seefeld wrote:
> The attached patch fixes failures in the fft.cpp tests revealed when we
> discovered that some (most ?) tests weren't actually executed due to a
> typo.
>

Stefan, this looks good. How is FFT/FFTM looking?

-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Fri Mar 3 18:14:41 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 03 Mar 2006 13:14:41 -0500
Subject: [patch] Memory allocation cleanup, disable non power-of-2 FFT tests when using SAL.
Message-ID: <44088791.6040902@codesourcery.com>

This patch

 * Closes a memory leak (Fft objects were creating a reference counted
   Fft_core object, but the initial reference count was 2, making it
   impossible for the count to go to 0).

 * Changes the FFT/FFTM/FFTM-par tests to disable non-power-of-2 FFT
   sizes when using SAL, and other cleanup.

 * Changes the window test to disable non-power-of-2 Chebychev test
   when using SAL.
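
The leak in the first item can be sketched as follows. This is only an illustration under stated assumptions: `Core`, `release` and `freed_after_single_owner` are hypothetical names, not the library's Fft_core code, and the "free" is modelled with a flag rather than real heap memory.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted core object.
// (Illustrative only; not the actual Fft_core class.)
struct Core
{
  int  count;      // current reference count
  bool destroyed;  // set once the count reaches zero

  explicit Core(int initial_count)
    : count(initial_count), destroyed(false) {}
};

// Drop one reference; "destroy" the core when the count reaches zero.
inline void release(Core& core)
{
  assert(core.count > 0);
  if (--core.count == 0)
    core.destroyed = true;
}

// Model a single owner that creates a core and later releases its
// one reference.  Returns true if the core was destroyed as a result.
inline bool freed_after_single_owner(int initial_count)
{
  Core core(initial_count);
  release(core);  // the owner's only reference goes away
  return core.destroyed;
}
```

With an initial count of 1 the single release brings the count to 0 and the core is destroyed; with an initial count of 2 the count stops at 1, so nothing ever frees the object.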
-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fft.diff
URL:

From don at codesourcery.com Fri Mar 3 21:56:11 2006
From: don at codesourcery.com (Don McCoy)
Date: Fri, 03 Mar 2006 14:56:11 -0700
Subject: [vsipl++] [patch] fixes for profile timer 'realtime' option
In-Reply-To: <43FD708F.8020100@codesourcery.com>
References: <43FC8D57.9090407@codesourcery.com> <43FC958C.2090507@codesourcery.com> <43FD708F.8020100@codesourcery.com>
Message-ID: <4408BB7B.4030305@codesourcery.com>

This patch adds the two member functions 'zero' and 'ticks' to the
No_time struct. I inadvertently left these out of the original patch.

Committed.

-- 
Don McCoy
CodeSourcery
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pt3.diff
URL:

From jules at codesourcery.com Mon Mar 6 04:47:59 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Sun, 05 Mar 2006 23:47:59 -0500
Subject: [patch] FFT fixes, GHS/Mercury fixes
Message-ID: <440BBEFF.3030107@codesourcery.com>

This patch cleans up several bugs in the FFT IPP backend.

First, for 2-D FFTs the IPP backend was constructing a plan with rows
and columns swapped. This was previously hidden by the sizeof bug in
the fft test.

Second, for FFTMs, plans were being destroyed with the wrong-dimension
function. This was previously hidden by the Fft_imp memory leak.

In addition, this patch changes the FFT IPP backend to use the new
alloc_align signature.

This patch fixes some problems with the FFT test cleanup that were not
exposed testing against SAL.

On the Mercury side, this patch adds template instantiation pragmas to
signal-window for template functions it uses.
This is necessary because Greenhills uses an automated template
instantiation algorithm that instantiates the necessary templates at
link time and assigns them to a single object file (to avoid multiple
definitions) (these are the "prelinker" messages). Since signal-window
is compiled as part of the library, its source is not available when
the application is linked.

I made some functions used by signal-window inline instead of adding
pragmas (in particular small functions in signal-fft, and operators in
fns_elementwise).

This patch disables all uses of SAL mat_mul when VSIP_IMPL_USE_MAT_MUL
is 0 (even when no alternative SAL routine exists).

This patch adds support for split-complex convolution.

This patch adds configure tests to check if acosh is provided
(greenhills cmath defines it, but mercury's libc does not provide it).

Stefan, do the FFT changes look OK? Also, is it OK to inline the
operators / functions in fns_elementwise?

Don, do the SAL changes look OK?

thanks,
-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mc3.diff
URL:

From jules at codesourcery.com Mon Mar 6 16:46:57 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 06 Mar 2006 11:46:57 -0500
Subject: [patch] Quickstart section on Mercury configuration
Message-ID: <440C6781.5070808@codesourcery.com>

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: quickstart.diff
URL:

From mark at codesourcery.com Mon Mar 6 17:13:39 2006
From: mark at codesourcery.com (Mark Mitchell)
Date: Mon, 06 Mar 2006 09:13:39 -0800
Subject: [vsipl++] [patch] Quickstart section on Mercury configuration
In-Reply-To: <440C6781.5070808@codesourcery.com>
References: <440C6781.5070808@codesourcery.com>
Message-ID: <440C6DC3.7020909@codesourcery.com>

Jules Bergmann wrote:

> + processing objects (including FFT). SAL is a propreitary library,

(Does anyone know of a spell-checker for DocBook?)

"proprietary"

+ Use MPI/Pro flavor of MPI. This option is necessary
+ when using MPI/Pro on the Mercury platform.

How about "Use Verari's MPI/Pro.", to give credit where credit is due?

> + By default, GreenHills? will only consider functions composed

The question mark is a typo, I think.

-- 
Mark Mitchell
CodeSourcery
mark at codesourcery.com
(650) 331-3385 x713

From don at codesourcery.com Mon Mar 6 22:47:26 2006
From: don at codesourcery.com (Don McCoy)
Date: Mon, 06 Mar 2006 15:47:26 -0700
Subject: [vsipl++] [patch] FFT fixes, GHS/Mercury fixes
In-Reply-To: <440BBEFF.3030107@codesourcery.com>
References: <440BBEFF.3030107@codesourcery.com>
Message-ID: <440CBBFE.4090505@codesourcery.com>

Jules Bergmann wrote:
> Don, do the SAL changes look OK?
>

Yes. They look good to me.

>+// We need to make sure that we don't force the instantiation of
>+// the same templates in multiply library files because that will
>
>
typo? ... multiple library files...

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712

From stefan at codesourcery.com Tue Mar 7 04:14:15 2006
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Mon, 06 Mar 2006 23:14:15 -0500
Subject: patch: FFT refactored
Message-ID: <440D0897.2060602@codesourcery.com>

Please find attached a patch containing a first step towards a
refactored FFT implementation.

This patch factors out the different backends into their respective
implementations (and subdirectories, for simpler maintenance).
Once finished, different backends can be enabled via configure at the
same time, and a compile-/runtime-dispatcher will instantiate the
appropriate backend for a given FFT(M) object.

Here is a short list of the new files:

src/vsip/impl/fft.hpp           : Contains the new public Fft(m) API.
src/vsip/impl/fft/backend.hpp   : Contains the backend interface definition.
src/vsip/impl/fft/factory.hpp   : Contains the generic backend factory bits.
src/vsip/impl/fft/util.hpp      : Contains some utility templates.
src/vsip/impl/fft/workspace.hpp : Contains the code responsible for temporary buffers.
src/vsip/impl/fftw3/            : Directory containing the fftw3 bridge (eventually).
src/vsip/impl/ipp/              : Directory containing IPP glue code (eventually).
src/vsip/impl/sal/              : Directory containing SAL glue code (eventually).

The SAL binding is complete as far as the fft.cpp and fftm.cpp tests
are concerned (these new bindings directly support split complex
transforms). However, a number of stubs are still empty, or even wrong.
To fill / fix them I would prefer to start by writing more tests to get
better coverage of all the supported parameters (non-square matrices,
notably, as well as subviews where strides differ from sizes), before
moving forward.

This new code is mostly independent of existing files, i.e. it can
coexist and even be tested with minimal changes to the existing
sources / build system.

Thanks,
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft.patch
Type: text/x-patch
Size: 116685 bytes
Desc: not available
URL:

From jules at codesourcery.com Tue Mar 7 20:08:37 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 07 Mar 2006 15:08:37 -0500
Subject: [patch] benchmark cleanup
Message-ID: <440DE845.8090503@codesourcery.com>

Primarily add 'mem_per_point()' function to benchmarks missing it.

Patch applied.
-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bm.diff
URL:

From jules at codesourcery.com Wed Mar 8 01:19:20 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 07 Mar 2006 20:19:20 -0500
Subject: [patch] Fix MPI configuration - probe by default
Message-ID: <440E3118.7000506@codesourcery.com>

There was a mismatch between configure's documentation and its
behavior. This patch makes configure probe for MPI unless told not to.

Patch applied.

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: confmpi.diff
URL:

From stefan at codesourcery.com Wed Mar 8 16:11:12 2006
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Wed, 08 Mar 2006 11:11:12 -0500
Subject: patch: Fix in-place documentation building
Message-ID: <440F0220.6060306@codesourcery.com>

The attached patch fixes an error when building inside the source
directory. The images were only copied into the html tree when
builddir != sourcedir.

Regards,
Stefan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch
URL:

From jules at codesourcery.com Wed Mar 8 16:18:36 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Wed, 08 Mar 2006 11:18:36 -0500
Subject: [vsipl++] patch: Fix in-place documentation building
In-Reply-To: <440F0220.6060306@codesourcery.com>
References: <440F0220.6060306@codesourcery.com>
Message-ID: <440F03DC.2060609@codesourcery.com>

Stefan Seefeld wrote:
> The attached patch fixes an error when building inside the source
> directory.
> The images were only copied into the html tree when builddir != sourcedir.
>

Stefan,

Patch looks good, please commit!
-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From stefan at codesourcery.com Thu Mar 9 05:50:36 2006
From: stefan at codesourcery.com (Stefan Seefeld)
Date: Thu, 09 Mar 2006 00:50:36 -0500
Subject: [vsipl++] [patch] Fix MPI configuration - probe by default
In-Reply-To: <440E3118.7000506@codesourcery.com>
References: <440E3118.7000506@codesourcery.com>
Message-ID: <440FC22C.5030503@codesourcery.com>

Jules Bergmann wrote:

> +enable_mpi=probe
> AC_ARG_ENABLE([mpi],
> AS_HELP_STRING([--disable-mpi],
> [don't use MPI (default is to use it if found)]),,
> [enable_mpi=no])

The above doesn't quite work, as the 'enable_mpi' variable is used by
the AC_ARG_ENABLE macro internally, so presetting it will confuse
configure. The result is that MPI is always probed, and the buildbot's
serial builds fail. The attached patch fixes this.

Committed.

Regards,
Stefan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch
URL:

From don at codesourcery.com Thu Mar 16 17:57:53 2006
From: don at codesourcery.com (Don McCoy)
Date: Thu, 16 Mar 2006 10:57:53 -0700
Subject: [patch] stand-alone makefile for benchmarks
Message-ID: <4419A721.70007@codesourcery.com>

The attached patch changes the stand-alone makefile for building
benchmarks. These minor changes should allow it to work correctly with
any implementation of VSIPL++ that provides the .pc file needed for
pkg-config. Instructions for invoking it are included below.

Also attached is the pkg-config file needed for building against the
reference implementation.

Building the full reference implementation requires three things: the
TASP C-VSIPL code (tvcpp0p8.tgz, available at www.vsipl.org), FFTW 2.15
(www.fftw.org/) and the reference implementation code (in CVS as
vsipl++). C-VSIPL should be put into the vsipl++/ directory
(side-by-side with the implementation/ folder that contains the
ref-impl code) in order for pkg-config to work correctly.
Building the reference implementation is summarized here:

1) Build FFTW for single precision and install it somewhere.
2) Build C-VSIPL (there is no install option)
3) Place a symlink in the tvcpp0p8/lib/ folder to the installed fftw library
4) Build the reference implementation library (named vsippp instead of vsip)
5) Ensure the reference implementation folder has a sub-directory lib/
   containing pkgconfig/vsipl++.pc.

To link the benchmarks against it, you must invoke make with a path
that will allow it to find the correct vsipl++.pc file (i.e. the whole
path up to, but not including, lib/). To verify that things are
installed correctly, type

   make -f make.standalone PREFIX=~/vsipl++/implementation vars

And it should display something like:

echo "PKG-CFG : " env PKG_CONFIG_PATH=/home/don/vsipl++/implementation/lib/pkgconfig pkg-config --define-variable=prefix=/home/don/vsipl++/implementation vsipl++
echo "CXX : " g++
echo "CXXFLAGS: " -I/home/don/work/ref-impl -I/home/don/vsipl++/implementation/../tvcpp0p8/include -O2 -DNDEBUG -funswitch-loops -fgcse-after-reload --param max-inline-insns-single=2000 --param large-function-insns=6000 --param large-function-growth=800 --param inline-unit-growth=300 -m64 -mtune=nocona -mmmx -msse -msse2 -msse3
echo "LIBS : " -L/home/don/vsipl++/implementation/vsip -L/home/don/vsipl++/implementation/../tvcpp0p8/lib -lvsippp -lvsip -lfftw

Use the base name of the benchmark you want to build in place of 'vars'
above to build the desired benchmark.

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ms.changes
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ms.diff
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: vsipl++.pc
URL:

From mark at codesourcery.com Sat Mar 18 09:11:02 2006
From: mark at codesourcery.com (Mark Mitchell)
Date: Sat, 18 Mar 2006 01:11:02 -0800
Subject: PATCH: Put URLs in footnotes
Message-ID: <200603180911.k2I9B2fm005865@sethra.codesourcery.com>

When generating PDFs, putting URLs inline in the text causes ugly
linebreaks and bad character spacing. This patch moves the URLs to
footnotes, which looks much better.

Applied.

-- 
Mark Mitchell
CodeSourcery
mark at codesourcery.com
(650) 331-3385 x713

2006-03-18 Mark Mitchell

 * xsl/fo/csl.xsl (ulink.footnotes): Set to 1.

Index: docs/csl-docbook/xsl/fo/csl.xsl
===================================================================
RCS file: /home/cvs/Repository/csl-docbook/xsl/fo/csl.xsl,v
retrieving revision 1.4
diff -c -5 -p -r1.4 csl.xsl
*** docs/csl-docbook/xsl/fo/csl.xsl 19 Dec 2005 05:17:18 -0000 1.4
--- docs/csl-docbook/xsl/fo/csl.xsl 18 Mar 2006 09:02:08 -0000
***************
*** 118,127 ****
--- 118,131 ----
 -->
 1
 1
+
+ 1
+

From don at codesourcery.com Sun Mar 19 23:01:39 2006
From: don at codesourcery.com (Don McCoy)
Date: Sun, 19 Mar 2006 16:01:39 -0700
Subject: [patch] vmul benchmark reorganization
Message-ID: <441DE2D3.9030805@codesourcery.com>

The primary purpose of this patch is to separate out some
implementation-specific functionality that was making it difficult to
compile/run tests against the reference implementation of VSIPL++.

This reorganization includes several ideas discussed recently, such as
the new macro VSIP_IMPL_SOURCERY_VPP to allow applications to check if
they are running CodeSourcery's version of the library.

This macro is also used to choose between the parallel version of
benchmarks/loop.hpp and a new serial-only version (loop_ser.hpp). I
have some reservations about doing it this way, but it seemed the best
in that it keeps the code in loop.hpp readable.

This patch does not yet include the splitting of this benchmark into
serial, parallel and impl-specific parts.
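
The macro-based selection described above can be sketched roughly as below. This is a hedged illustration, not code from the patch: it assumes VSIP_IMPL_SOURCERY_VPP is simply defined or undefined by the library's headers, and `BENCHMARK_LOOP_KIND` and `loop_kind()` are hypothetical names standing in for the actual choice between including loop.hpp and loop_ser.hpp.

```cpp
#include <string>

// Assumption: VSIP_IMPL_SOURCERY_VPP is defined by the library's own
// headers when building against Sourcery VSIPL++, and left undefined
// by other implementations (e.g. the reference implementation).
#ifdef VSIP_IMPL_SOURCERY_VPP
   // Sourcery VSIPL++: the real code would pull in the parallel
   // driver here, i.e. #include "loop.hpp".
#  define BENCHMARK_LOOP_KIND "parallel"
#else
   // Any other implementation: the real code would pull in the
   // serial-only driver here, i.e. #include "loop_ser.hpp".
#  define BENCHMARK_LOOP_KIND "serial"
#endif

// Hypothetical helper reporting which variant was compiled in.
inline std::string loop_kind()
{
  return BENCHMARK_LOOP_KIND;
}
```

Compiled as-is, with no VSIPL++ header defining the macro, the serial branch is taken; passing -DVSIP_IMPL_SOURCERY_VPP on the command line selects the parallel branch instead.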
Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rb.changes
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rb.diff
URL:

From jules at codesourcery.com Mon Mar 20 19:20:29 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 20 Mar 2006 14:20:29 -0500
Subject: [vsipl++] [patch] vmul benchmark reorganization
In-Reply-To: <441DE2D3.9030805@codesourcery.com>
References: <441DE2D3.9030805@codesourcery.com>
Message-ID: <441F007D.7090509@codesourcery.com>

Don McCoy wrote:
> The primary purpose of this patch is to separate out some
> implementation-specific functionality that was making it difficult to
> compile/run tests against the reference implementation of VSIPL++.
>
> This reorganization includes several ideas discussed recently, such as
> the new macro VSIP_IMPL_SOURCERY_VPP to allow applications to check if
> they are running CodeSourcery's version of the library.
>
> This macro is also used to choose between the parallel version of
> benchmarks/loop.hpp and a new serial-only version (loop_ser.hpp). I
> have some reservations about doing it this way, but it seemed the best
> in that it keeps the code in loop.hpp readable.

Don,

There is a lot in common between loop.hpp and loop_ser.hpp. Let's see
if we can avoid a complete copy.

Otherwise the patch looks good.

-- Jules

>
> This patch does not yet include the splitting of this benchmark into
> serial, parallel and impl-specific parts.
>
> Regards,
>

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Mon Mar 20 19:36:07 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Mon, 20 Mar 2006 14:36:07 -0500
Subject: [vsipl++] [patch] vmul benchmark reorganization
In-Reply-To: <441F007D.7090509@codesourcery.com>
References: <441DE2D3.9030805@codesourcery.com> <441F007D.7090509@codesourcery.com>
Message-ID: <441F0427.6050408@codesourcery.com>

Jules Bergmann wrote:
>
> There is a lot in common between loop.hpp and loop_ser.hpp. Let's see
> if we can avoid a complete copy.
>

Don,

How about something like this?

-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: loop.diff
URL:

From don at codesourcery.com Tue Mar 21 00:46:57 2006
From: don at codesourcery.com (Don McCoy)
Date: Mon, 20 Mar 2006 17:46:57 -0700
Subject: [vsipl++] [patch] vmul benchmark reorganization
In-Reply-To: <441F0427.6050408@codesourcery.com>
References: <441DE2D3.9030805@codesourcery.com> <441F007D.7090509@codesourcery.com> <441F0427.6050408@codesourcery.com>
Message-ID: <441F4D01.1060204@codesourcery.com>

Jules Bergmann wrote:
>>
>> There is a lot in common between loop.hpp and loop_ser.hpp. Let's see
>> if we can avoid a complete copy.
>
> How about something like this?
>

Revised with your suggestions. Thank you.

I would commit this, but I'm having CVS problems at the moment. Will do
so as soon as possible.

Regards,

-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rb2.diff
URL:

From assem at codesourcery.com Tue Mar 21 14:18:27 2006
From: assem at codesourcery.com (Assem Salama)
Date: Tue, 21 Mar 2006 09:18:27 -0500
Subject: clapack
Message-ID: <44200B33.3070609@codesourcery.com>

Everyone,
I have added CLAPACK to the repository. I had to change a header file
to redefine integer from 64 to 32 bits. I ran make and make check and
it seems OK. Attached are cvs diffs and a tar of the new files.

Assem Salama
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cvs.diff.03212006.log
Type: text/x-log
Size: 2882 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: new_files.03212006.tar
Type: application/x-tar
Size: 40960 bytes
Desc: not available
URL:

From jules at codesourcery.com Tue Mar 21 15:03:32 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 21 Mar 2006 10:03:32 -0500
Subject: Feedback on clapack changes
Message-ID: <442015C4.9040902@codesourcery.com>

Assem,

This looks good. A few comments below on the Makefiles. I'll send a
separate reply for the 'cvs diff' changes.

Can you send out a diff for changes to vendor/GNUmakefile.inc.in? I
expect the changes are minor (s/lapack/clapack/) but it would be good
to verify. Are there any other files changed outside of that?

Finally, you'll need to create a ChangeLog entry for this patch. Take a
look at the other entries in ChangeLog to get a sense of the level of
detail/brevity necessary. Something like:

DATE name

 Fit CLAPACK into autoconf build.
 * vendor/clapack/make.inc.in: New file, CLAPACK make include
 template.
 ...

would be great.

-- Jules

Comments for make.inc.in:

> ####################################################################
> # LAPACK make include file. #
> # LAPACK, Version 3.0 #
> # June 30, 1999 #

Add a few descriptive lines to the header, something like:

# Modified to build inside of Sourcery VSIPL++ source tree.
#
# Assem Salama, DATE

> ####################################################################
> #
> SHELL = /bin/sh

SHELL = @SH@

> #
> # The machine (platform) identifier to append to the library names
> #
> # leave PLAT empty for now

A more descriptive comment

# We don't use the platform name for Sourcery VSIPL++, leave it empty.

> PLAT =

Get rid of the old platform line. (See comment below).

> #PLAT = _LINUX
> #
> # Modify the CC and CFLAGS definitions to refer to the
> # compiler and desired compiler options for your machine. NOOPT
> # refers to the compiler options desired when NO OPTIMIZATION is
> # selected. Define LOADER and LOADOPTS to refer to the loader and
> # desired load options for your machine.
> #

Remove the commented out variables. In general, commented out code
without a comment about why it is commented out has the potential to
create confusion. In this case, if we need to find out their old
values, we could use CVS or diff against the original make.inc.

> #CC = gcc
> #CFLAGS = -funroll-all-loops -O3
> #LOADER = gcc
> CC = @CC@
> CFLAGS = @CFLAGS@
> LOADER = $(CC)
> LOADOPTS = $(CFLAGS)
> NOOPT =
> DRVCFLAGS = $(CFLAGS)
> F2CCFLAGS = $(CFLAGS)
> #
> # The archiver and the flag(s) to use when building archive (library)
> # If you system has no ranlib, set RANLIB = echo.
> #

Remove these:

> #ARCH = ar
> #ARCHFLAGS= cr
> #RANLIB = ranlib
> ARCH = @AR@
> ARCHFLAGS= @ARFLAGS@
> RANLIB = @RANLIB@
> #
> # The location of the libraries to which you will link. (The
> # machine-specific, optimized BLAS library should be used whenever
> # possible.)
> #
> BLASLIB = ../../blas$(PLAT).a
> LAPACKLIB = lapack$(PLAT).a
> F2CLIB = ../../F2CLIBS/libF77.a ../../F2CLIBS/libI77.a
> TMGLIB = tmglib$(PLAT).a
> EIGSRCLIB = eigsrc$(PLAT).a
> LINSRCLIB = linsrc$(PLAT).a
>

Comments for GNUmakefile.in

 - Do we still need this file at all?

   For FORTRAN LAPACK, we just did a make directly in LAPACK/SRC, i.e.
   vendor/GNUmakefile.inc.in has:

      $(MAKE) -C vendor/lapack/SRC all

   Now that you moved cblaswr into SRC, that should be enough for
   CLAPACK too.

 - Also, why did the BLAS directory name change to blas?

For SRC/GNUmakefile.in

 - Looks good, just add a few lines to the header, similar to
   make.inc.in.

Since we don't need libF77 or libI77 for building clapack/SRC, we
shouldn't need F2CLIBS/{libF77,libI77}/GNUmakefile.in either, right?

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Tue Mar 21 15:12:30 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 21 Mar 2006 10:12:30 -0500
Subject: [vsipl++] clapack
In-Reply-To: <44200B33.3070609@codesourcery.com>
References: <44200B33.3070609@codesourcery.com>
Message-ID: <442017DE.8020802@codesourcery.com>

Assem,

Assem Salama wrote:

> Index: SRC/f2c.h
> ===================================================================
> RCS file: /home/cvs/Repository/clapack/SRC/f2c.h,v
> retrieving revision 1.1
> retrieving revision 1.2
> diff -u -r1.1 -r1.2
> --- SRC/f2c.h 16 Mar 2006 23:11:40 -0000 1.1
> +++ SRC/f2c.h 21 Mar 2006 13:23:25 -0000 1.2
> @@ -7,7 +7,9 @@
> #ifndef F2C_INCLUDE
> #define F2C_INCLUDE
>
> -typedef long int integer;

A more descriptive comment here would be good. Something like:

// The original clapack header defined 'integer' to be a 'long int'.
// This creates a problem on 64-bit architectures, in particular the
// em64t, because 'long int' is 64-bits, while a FORTRAN 'integer' is
// only 32-bits. This causes programs compiled for use with the FORTRAN
// lapack to not work properly with clapack.
//
// Defining 'integer' to be an 'int' fixes this problem.
typedef int integer;

Also, no need to leave the old typedef around in commented out form. If
necessary we can use CVS to see the old version.

> +// Assem: we don't want integer to be 64 bits!!
> +//typedef long int integer;
> +typedef int integer;
> typedef unsigned long uinteger;
> typedef char *address;
> typedef short int shortint;

> Index: GNUmakefile.inc.in
> ===================================================================
> RCS file: /home/cvs/Repository/vpp/vendor/GNUmakefile.inc.in,v

Changes to this file look good.

-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From assem at codesourcery.com Tue Mar 21 17:34:54 2006
From: assem at codesourcery.com (Assem Salama)
Date: Tue, 21 Mar 2006 12:34:54 -0500
Subject: clapack
Message-ID: <4420393E.2040205@codesourcery.com>

Everyone,
Here is a new diff of the files that Jules suggested. They just have
some more descriptive comments.

Thanks,
Assem Salama
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cvs.diff.03212006.2.log
Type: text/x-log
Size: 2031 bytes
Desc: not available
URL:

From assem at codesourcery.com Tue Mar 21 18:17:04 2006
From: assem at codesourcery.com (Assem Salama)
Date: Tue, 21 Mar 2006 13:17:04 -0500
Subject: CLAPACK
Message-ID: <44204320.5030302@codesourcery.com>

Everyone,
New diff file better comments and ChangeLog entry.

Assem Salama
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cvs.diff.03212006.2.log
Type: text/x-log
Size: 4990 bytes
Desc: not available
URL:

From jules at codesourcery.com Tue Mar 21 18:23:15 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 21 Mar 2006 13:23:15 -0500
Subject: [patch] Fix for dependency loop in 4.0/4.1
Message-ID: <44204493.3010003@codesourcery.com>

Stefan,

This reverses my earlier patch that included from vector.hpp. This
should fix the problems we are seeing with GCC 4.0 and 4.1.

Also, would you look at the configure.ac changes?
These are meant to fix the problem with configuring MPI/Pro in a
non-standard directory (--with-mpi-prefix blowing away the value of
enable_mpi, which was set to 'mpipro' by '--enable-mpi=mpipro' option).

-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: vec.diff
URL:

From jules at codesourcery.com Tue Mar 21 18:35:33 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 21 Mar 2006 13:35:33 -0500
Subject: [vsipl++] CLAPACK
In-Reply-To: <44204320.5030302@codesourcery.com>
References: <44204320.5030302@codesourcery.com>
Message-ID: <44204775.1010105@codesourcery.com>

Assem,

Looks good. Comments below.

-- Jules

>
> ------------------------------------------------------------------------
>
> Index: make.inc.in
> ===================================================================
> RCS file: /home/cvs/Repository/clapack/make.inc.in,v
> retrieving revision 1.1
> diff -u -r1.1 make.inc.in
> --- make.inc.in 21 Mar 2006 13:38:28 -0000 1.1
> +++ make.inc.in 21 Mar 2006 17:31:33 -0000
> @@ -8,9 +8,10 @@
> #
> # The machine (platform) identifier to append to the library names
> #
> -# leave PLAT empty for now
> +# Assem: we are now using configure to make this makefile. PLAT is used
> +# as a postfix for the library names. We want the library names to be the same
> +# regardless of platform, so, we will leave it empty.

Putting your name by the comment is OK, but not everyone might be able
to figure out what "Assem:" means (or who to blame :) if they have a
problem).

Instead of just "Assem:", could you end the comment with "(Assem
Salama, CSI)" or "(Assem Salama, CodeSourcery)"?

> Index: ChangeLog
> ===================================================================
> RCS file: /home/cvs/Repository/vpp/ChangeLog,v
> retrieving revision 1.411
> diff -u -r1.411 ChangeLog
> --- ChangeLog 16 Mar 2006 03:27:10 -0000 1.411
> +++ ChangeLog 21 Mar 2006 18:15:15 -0000
> @@ -1,3 +1,18 @@
> +2006-03-21 Assem Salama
> + * vendor: added CLAPACK library. CLAPACK now sits in clapack
> + * CVSROOT/modules: added a line to automatically checkout CLAPACK when
> + vpp is checked out

Technically, CVSROOT/modules isn't part of the VSIPL++ repository.
Instead, could you say:

 * vendor/clapack: New directory, contains 'clapack' module,
 incorporated using CVSROOT/modules.

> + * vendor/GNUmakefile.inc: to make CLAPACK instead of LAPACK. Also added
> + a command for clean to also clean out CLAPACK when make clean
> + is invoked
> + * vendor/clapack/SRC/GNUmakefile.in: added this file to allow configure
> + to make this directory. This file orginally was Makefile.

This is OK, but for new files it is nice if the phrase "new file"
starts the comment. Also, it is not clear which 'Makefile' the comment
refers to. How about:

 * vendor/clapack/SRC/GNUmakefile.in: New file, allow configure
 to make this directory. This file is derived from the clapack
 Makefile in the same directory.

> + * vendor/clapack/make.inc.in: added this file to have configure
> + atuomatically fill compile variables
> + * vendor/clapack/SRC/f2c.h: modified typedef of integer. integer used to
> + be defined as 64 bits. Original FORTRAN code had integer defined
> + for 32 bits.
> +
> 2006-03-15 Stefan Seefeld
>
> * tests/*: Move various tests into subdirectories.
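
The f2c.h entry above is about integer width: a buffer written as 64-bit integers does not look the same to code that reads it as 32-bit integers. The following is only a sketch of that effect, assuming fixed-width types from <cstdint> in place of 'long int' and 'int'; `reinterpret_buffer` is a hypothetical helper, not part of CLAPACK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy the raw bytes of a buffer of `From`
// elements into a buffer of `To` elements, the way a callee sees the
// caller's memory when the two disagree about the integer width.
template <typename To, typename From>
std::vector<To> reinterpret_buffer(std::vector<From> const& in)
{
  std::vector<To> out(in.size() * sizeof(From) / sizeof(To));
  if (!out.empty())
    std::memcpy(&out[0], &in[0], out.size() * sizeof(To));
  return out;
}

// Three 64-bit integers, as a caller built against the unmodified
// f2c.h ('long int' on a 64-bit target) would lay them out, seen as
// the 32-bit integers a FORTRAN-compatible routine expects.
inline std::vector<std::int32_t> as_seen_by_32bit_callee()
{
  std::vector<std::int64_t> caller_data(3, 1);  // {1, 1, 1}
  return reinterpret_buffer<std::int32_t>(caller_data);
}
```

The callee sees six elements instead of three, with the values interleaved with zeros (in an order that depends on byte order), which is why the two sides have to agree on a 32-bit 'integer'.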

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From mark at codesourcery.com Tue Mar 21 18:38:40 2006
From: mark at codesourcery.com (Mark Mitchell)
Date: Tue, 21 Mar 2006 10:38:40 -0800
Subject: [vsipl++] CLAPACK
In-Reply-To: <44204775.1010105@codesourcery.com>
References: <44204320.5030302@codesourcery.com> <44204775.1010105@codesourcery.com>
Message-ID: <44204830.9040305@codesourcery.com>

Jules Bergmann wrote:

> Instead of just "Assem:", could you end the comment with "(Assem Salama,
> CSI)" or "(Assem Salama, CodeSourcery)"?

And, in general, we don't do that; ChangeLogs and revision control let
us figure out who did what, so we know who to ask for help anyhow. :-)

-- 
Mark Mitchell
CodeSourcery
mark at codesourcery.com
(650) 331-3385 x713

From jules at codesourcery.com Tue Mar 21 18:38:45 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Tue, 21 Mar 2006 13:38:45 -0500
Subject: [vsipl++] CLAPACK
In-Reply-To: <44204830.9040305@codesourcery.com>
References: <44204320.5030302@codesourcery.com> <44204775.1010105@codesourcery.com> <44204830.9040305@codesourcery.com>
Message-ID: <44204835.1050001@codesourcery.com>

Mark Mitchell wrote:
> Jules Bergmann wrote:
>
>> Instead of just "Assem:", could you end the comment with "(Assem Salama,
>> CSI)" or "(Assem Salama, CodeSourcery)"?
>
> And, in general, we don't do that; ChangeLogs and revision control let
> us figure out who did what, so we know who to ask for help anyhow. :-)
>

Even better!
-- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From assem at codesourcery.com Tue Mar 21 18:45:10 2006 From: assem at codesourcery.com (Assem Salama) Date: Tue, 21 Mar 2006 13:45:10 -0500 Subject: [vsipl++] CLAPACK In-Reply-To: <44204835.1050001@codesourcery.com> References: <44204320.5030302@codesourcery.com> <44204775.1010105@codesourcery.com> <44204830.9040305@codesourcery.com> <44204835.1050001@codesourcery.com> Message-ID: <442049B6.9090408@codesourcery.com> No problem. Jules Bergmann wrote: > Mark Mitchell wrote: > >> Jules Bergmann wrote: >> >>> Instead of just "Assem:", could you end the comment with "(Assem Salama, >>> CSI)" or "(Assem Salama, CodeSourcery)"? >> >> >> And, in general, we don't do that; ChangeLogs and revision control let >> us figure out who did what, so we know who to ask for help anyhow. :-) >> > > Even better! > From assem at codesourcery.com Tue Mar 21 19:46:18 2006 From: assem at codesourcery.com (Assem Salama) Date: Tue, 21 Mar 2006 14:46:18 -0500 Subject: CLAPACK Message-ID: <4420580A.9090809@codesourcery.com> Everyone, CLAPACK is now in the repository and works with vsipl. This eliminates the need for the older LAPACK, which uses Fortran. CLAPACK is an f2c'ed version of LAPACK. Thanks, Assem Salama From assem at codesourcery.com Wed Mar 22 00:13:27 2006 From: assem at codesourcery.com (Assem Salama) Date: Tue, 21 Mar 2006 19:13:27 -0500 Subject: CLAPACK Message-ID: <442096A7.7010105@codesourcery.com> Everyone, I have changed the modules file to now only load SRC dir of clapack. I have also added an option to configure to allow a user to specify CFLAGS for building clapack. The option is --with-clapack-cflags. It seems like the option -funroll-all-loops works well with clapack so this is something the user might want to specify. If this option is not used, the default CFLAGS is used. Thanks, Assem Salama -------------- next part -------------- A non-text attachment was scrubbed... 
Name: cvs.diff.03212006.3.log Type: text/x-log Size: 2421 bytes Desc: not available URL: From jules at codesourcery.com Wed Mar 22 03:29:40 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 21 Mar 2006 22:29:40 -0500 Subject: [vsipl++] CLAPACK In-Reply-To: <442096A7.7010105@codesourcery.com> References: <442096A7.7010105@codesourcery.com> Message-ID: <4420C4A4.9030202@codesourcery.com> Assem Salama wrote: > Everyone, > I have changed the modules file to now only load SRC dir of clapack. I > have also added an option to configure to allow a user to specify CFLAGS > for building clapack. The option is --with-clapack-cflags. It seems like > the option -funroll-all-loops works well with clapack so this is > something the user might want to specify. If this option is not used, > the default CFLAGS is used. > > Thanks, > Assem Salama Assem, Looks good. Please address the comments below and then check it in. thanks, -- Jules > > > ------------------------------------------------------------------------ > > Index: configure.ac > =================================================================== > RCS file: /home/cvs/Repository/vpp/configure.ac,v > retrieving revision 1.88 > diff -u -r1.88 configure.ac > --- configure.ac 21 Mar 2006 15:52:23 -0000 1.88 > +++ configure.ac 22 Mar 2006 00:05:39 -0000 > @@ -170,6 +170,21 @@ > > > # LAPACK and related libraries (Intel MKL) > + > +# this option allows the user to OVERRIDE the default CFLAGS for CLAPACK > +# it seams that when -funroll-all-loops is specified, it runs a little better > +# it is up to the user to try specifying his own set of CFLAGS. If this option > +# is not used, CLAPACK_CLFAGS defaults to CFLAGS. .in files will find this > +# value in CLAPACK_CFLAGS This comment says a little too much. We're not really sure if -funroll-all-loops will make a noticeable performance difference (it may be that the longest-running routines are all in ATLAS, which this wouldn't affect, i.e. Amdahl's law). 
How about: This option allows the user to OVERRIDE the default CFLAGS for CLAPACK. If this option is not used, CLAPACK_CFLAGS defaults to CFLAGS. .in files will find this value in CLAPACK_CFLAGS. Also, don't forget punctuation, grammar, etc. In general, comments should be grammatically correct sentences. It makes them easier to read and understand. > +AC_ARG_WITH(clapack-cflags, > + AS_HELP_STRING([--with-clapack-cflags=CLAPACK_CFLAGS], > + [Specify CFLAGS to use when building builtin clapack. > + Only used if --with-lapack=builtin.]), > + CLAPACK_CFLAGS=$withval, > + CLAPACK_CFLAGS=no) We have grouped the argument processing in configure.ac separately from the logic. It's not strictly necessary to do this, but it makes finding things in configure.ac easier. Also, it is good to keep related logic together. You should move this AC_SUBST line to ---> > +# let's not forget AC_SUBST! > +AC_SUBST(CLAPACK_CFLAGS) > + > AC_ARG_WITH([lapack], > AS_HELP_STRING([--with-lapack\[=PKG\]], > [enable use of LAPACK if found > @@ -317,6 +332,11 @@ > AC_SUBST(CXXDEP) > AC_LANG(C++) > > +# assign cflags to CLAPACK_CFLAGS if the user didn't use --with-clapack-cflags > +if test "$CLAPACK_CFLAGS" == "no"; then ^^ Use "=" instead of "==". It is more portable. > + CLAPACK_CFLAGS=$CFLAGS > +fi ---> move AC_SUBST line here > + > AC_MSG_CHECKING([for FORTRAN float return type]) > if test "$host_cpu" == "x86_64"; then > AC_DEFINE_UNQUOTED(VSIP_IMPL_FORTRAN_FLOAT_RETURN, double, > Index: vendor/clapack/SRC/make.inc.in > =================================================================== > RCS file: /home/cvs/Repository/clapack/SRC/make.inc.in,v > retrieving revision 1.1 > diff -u -r1.1 make.inc.in > --- vendor/clapack/SRC/make.inc.in 21 Mar 2006 21:38:49 -0000 1.1 > +++ vendor/clapack/SRC/make.inc.in 22 Mar 2006 00:05:40 -0000 > @@ -19,9 +19,12 @@ > # selected. Define LOADER and LOADOPTS to refer to the loader and > # desired load options for your machine. 
> # Likewise, it is not necessary to mention that -funroll-all-loops might work well in this comment. > -# configure will now substitute correct values for these variables > +# configure will now substitute correct values for these variables. > +# we added a variable called CLAPACK_CFLAGS that will allow someone to > +# specify special flags that could make CLAPACK run faster. It seams > +# like -funroll-all-loops works very well. > CC = @CC@ > -CFLAGS = @CFLAGS@ > +CFLAGS = @CLAPACK_CFLAGS@ > LOADER = $(CC) > LOADOPTS = $(CFLAGS) > NOOPT = -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From assem at codesourcery.com Wed Mar 22 18:24:44 2006 From: assem at codesourcery.com (assem) Date: Wed, 22 Mar 2006 13:24:44 -0500 Subject: CLAPACK Message-ID: <4421966C.1010904@codesourcery.com> Everyone, I have changed the CVS module file to now pull in just the SRC directory of CLAPACK. I also added an option to configure to allow users to override default CFLAGS. Attached are the patches and the ChangeLog. Thanks, Assem Salama -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ChangeLog.03222006 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cvs.diff.03222006.2.log Type: text/x-log Size: 2914 bytes Desc: not available URL: From assem at codesourcery.com Wed Mar 22 18:52:11 2006 From: assem at codesourcery.com (assem) Date: Wed, 22 Mar 2006 13:52:11 -0500 Subject: CLAPACK Message-ID: <44219CDB.3010901@codesourcery.com> Everyone, Just added a few things to the ChangeLog that I forgot to include earlier. Assem Salama -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: ChangeLog.03222006 URL: From jules at codesourcery.com Wed Mar 22 19:53:23 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 22 Mar 2006 14:53:23 -0500 Subject: [vsipl++] CLAPACK In-Reply-To: <4421966C.1010904@codesourcery.com> References: <4421966C.1010904@codesourcery.com> Message-ID: <4421AB33.4000009@codesourcery.com> assem wrote: > Everyone, > I have changed the CVS module file to now pull in just the SRC > directory of CLAPACK. I also added an option to configure to allow users > to override default CFLAGS. Attached are the patches and the ChangeLog. > > Thanks, > Assem Salama Assem, This looks good, please check it in. thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From stefan at codesourcery.com Thu Mar 23 15:04:59 2006 From: stefan at codesourcery.com (Stefan Seefeld) Date: Thu, 23 Mar 2006 10:04:59 -0500 Subject: patch: setup parallel service for all tests, if MPI is enabled. In-Reply-To: <4422B5AC.70606@codesourcery.com> References: <4422B5AC.70606@codesourcery.com> Message-ID: <4422B91B.10906@codesourcery.com> The attached patch makes parallel_service a global resource for all tests, not just those under parallel/. The patch is checked in. Regards, Stefan -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: patch URL: From assem at codesourcery.com Thu Mar 23 20:29:28 2006 From: assem at codesourcery.com (Assem Salama) Date: Thu, 23 Mar 2006 15:29:28 -0500 Subject: CLAPACK Message-ID: <44230528.6000808@codesourcery.com> Everyone, I have modified the cblas wrapper. We now have a define NO_INLINE_WRAP that will control whether or not we want to inline the wrapper functions. Attached is the patch. Assem Salama -------------- next part -------------- A non-text attachment was scrubbed... 
Name: cvs.diff.03232006.1.log Type: text/x-log Size: 47193 bytes Desc: not available URL: From jules at codesourcery.com Fri Mar 24 12:29:49 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Fri, 24 Mar 2006 07:29:49 -0500 Subject: [patch] Fix for benchmarks/loop.hpp Message-ID: <4423E63D.5000809@codesourcery.com> Patch applied. -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: bench.diff URL: From jules at codesourcery.com Fri Mar 24 18:58:29 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Fri, 24 Mar 2006 13:58:29 -0500 Subject: [patch] Parallel support function updates Message-ID: <44244155.70009@codesourcery.com> This patch updates the parallel support functions provided by the library to match what is covered in the specification. This includes both adding new functions and extending some existing functions to work with more distribution types (in particular, the global-to-local index conversions now work with cyclic distributions). It cleans up the map's internal parallel support API to be more consistent. It implements the Replicated_map map, which is similar to the existing Global_map but allows replication over a subset of processors. This patch fixes a bug with creating local views of sliced subviews of distributed objects (subviews using Sliced_block), and adds a regression test. In particular, if 'distributed_matrix' is a distributed matrix, the local view of a row or column subview: distributed_matrix.row(0).local() would be invalid on every processor other than the one(s) owning row(0). -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: psf.diff URL: From jules at codesourcery.com Tue Mar 28 14:50:47 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 28 Mar 2006 09:50:47 -0500 Subject: [patch] Fix 32-bit clapack build, test cleanup Message-ID: <44294D47.1010005@codesourcery.com> This patch adds a new configure.ac variable, CLAPACK_NOOPT. It is used to pass the -m32/-m64 flags to the CLAPACK make.inc.in for compiling non-optimized files. (This is similar to the LAPACK_NOOPT flags for Fortran lapack). This also fixes a bug in release.sh, and removes debug code from the replicated_data.cpp test. Patch applied. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: cla.diff URL: From assem at codesourcery.com Tue Mar 28 20:00:53 2006 From: assem at codesourcery.com (Assem Salama) Date: Tue, 28 Mar 2006 15:00:53 -0500 Subject: CLAPACK Message-ID: <442995F5.5040708@codesourcery.com> Everyone, I have changed CLAPACK to use inlines for cblas wrappers. Also, I have changed lapack.hpp to include the rest of the functions in a cblas wrapper. Thanks, Assem Salama -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: cvs.diff.03282006.log URL: From mark at codesourcery.com Tue Mar 28 20:24:28 2006 From: mark at codesourcery.com (Mark Mitchell) Date: Tue, 28 Mar 2006 12:24:28 -0800 Subject: [vsipl++] CLAPACK In-Reply-To: <442995F5.5040708@codesourcery.com> References: <442995F5.5040708@codesourcery.com> Message-ID: <44299B7C.1050302@codesourcery.com> Assem Salama wrote: > +/* define an overloaded function that helps as pas some scalars to the cblas > + functions. Some cblas functions require the argument to be passed as a > + pointer when it is complex but by reference otherwise. 
This makes the > + defines a little easier to look at */ Here's where I give my standard speech about coding style: We're building a source product; that means customers will read the source code. As a result, they will judge us by little details. They may not understand what the source code does, but they will judge us based on the bits they do understand -- including comments. The key is to think about the source code as if it were a marketing flyer: worry about it the same way that you'd worry about colors, fonts, and layout on a brochure. So, you need to proofread comments carefully. Use complete sentences. Start every sentence with a capital letter ("define" should be "Define"). Check spelling: "as pas" should be "us pass". Check punctuation: you need a period after "to look at". Similar comments apply to the comment in blaswrap.h. I know this all seems pedantic; just say nasty things under your breath, and do it anyhow. :-) :-) Thanks, -- Mark Mitchell CodeSourcery mark at codesourcery.com (650) 331-3385 x713 From jules at codesourcery.com Tue Mar 28 20:47:03 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Tue, 28 Mar 2006 15:47:03 -0500 Subject: [vsipl++] CLAPACK In-Reply-To: <442995F5.5040708@codesourcery.com> References: <442995F5.5040708@codesourcery.com> Message-ID: <4429A0C7.7050800@codesourcery.com> Assem Salama wrote: > Everyone, > I have changed CLAPACK to use inlines for cblas wrappers. Also, I have > changed lapack.hpp to include the rest of the functions in a cblas wrapper. > Assem, this looks good, modulo fixing the comments as Mark points out. Please check it in. In the future, can you please include the ChangeLog entry along with the patch? 
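For readers who do not have the scrubbed patch at hand, the idea behind the quoted comment (one overloaded helper that picks how a scalar is handed to a BLAS-style routine) can be sketched roughly as follows. This is only an illustration, not the code from the patch: the names as_blas_scalar, mock_real_scale and mock_complex_scale are hypothetical, and the two mock routines stand in for real cblas entry points so that the sketch compiles without a BLAS library.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper (not from the patch): a non-complex scalar is
// forwarded unchanged.
template <typename T>
inline T const& as_blas_scalar(T const& value)
{
  return value;
}

// Hypothetical overload: a complex scalar is handed over as an untyped
// pointer. Partial ordering makes this the preferred overload whenever
// the argument is a std::complex<T>.
template <typename T>
inline void const* as_blas_scalar(std::complex<T> const& value)
{
  return &value;
}

// Stand-in for a real-valued routine that takes alpha directly.
inline float mock_real_scale(float alpha, float x)
{
  return alpha * x;
}

// Stand-in for a complex routine that takes alpha through a void pointer.
inline std::complex<float>
mock_complex_scale(void const* alpha, std::complex<float> x)
{
  return *static_cast<std::complex<float> const*>(alpha) * x;
}
```

With a helper like this, a wrapper macro or template can write as_blas_scalar(alpha) at every call site and let overload resolution choose the calling convention, which is presumably what keeps the defines easier to read.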
thanks, -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From assem at codesourcery.com Wed Mar 29 00:25:31 2006 From: assem at codesourcery.com (Assem Salama) Date: Tue, 28 Mar 2006 19:25:31 -0500 Subject: CLAPACK Message-ID: <4429D3FB.2050902@codesourcery.com> Everyone, Fixed up comments. Thanks, Assem Salama -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: cvs.diff.03282006.log URL: From jules at codesourcery.com Wed Mar 29 13:09:03 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Wed, 29 Mar 2006 08:09:03 -0500 Subject: [patch] Generic SIMD rscvmul Message-ID: <442A86EF.7060007@codesourcery.com> This patch implements rscvmul (real-scalar * complex-vector element-wise multiply (!)) using our generic SIMD framework and adds expression evaluators to use it. On the GTRI Xeon machines, this boosts the performance of float rscvmul from ~140 MFLOPS to ~2500 MFLOPS. Since rscvmul gets used for scaling with the FFTW backend, this boosts FFT w/scaling performance from ~2100 MFLOPS to ~5000 MFLOPS. This patch also reverts to using non-streaming SIMD stores in the vmul routine. The streaming stores get better performance (~10%) for very large vectors that do not fit in cache, while the non-streaming stores get much better performance (~100%) for vectors that do fit into the caches. Patch applied. -- Jules -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: rscvmul.diff URL: From assem at codesourcery.com Wed Mar 29 16:22:09 2006 From: assem at codesourcery.com (Assem Salama) Date: Wed, 29 Mar 2006 11:22:09 -0500 Subject: CLAPACK Message-ID: <442AB431.6040108@codesourcery.com> Everyone, This is the ChangeLog for the last changes. I forgot to attach it the first time around. 
Thanks, Assem Salama -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ChangeLog.03292006 URL: From assem at codesourcery.com Thu Mar 30 17:02:45 2006 From: assem at codesourcery.com (Assem Salama) Date: Thu, 30 Mar 2006 12:02:45 -0500 Subject: Index and Length Message-ID: <442C0F35.90302@codesourcery.com> Everyone, This patch takes out the use of Point and replaces it with Index and Length. Thanks, Assem Salama -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ChangeLog.03302006 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: cvs.diff.03302006.1.log URL: From assem at codesourcery.com Thu Mar 30 17:14:06 2006 From: assem at codesourcery.com (Assem Salama) Date: Thu, 30 Mar 2006 12:14:06 -0500 Subject: Index and Length Message-ID: <442C11DE.3040800@codesourcery.com> Everyone, As per Jules's suggestion, I changed the << operator in tests/output.hpp to also operate on Index instead of Point. Thanks, Assem Salama -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ChangeLog.03302006 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: cvs.diff.03302006.1.log URL: From jules at codesourcery.com Thu Mar 30 18:01:24 2006 From: jules at codesourcery.com (Jules Bergmann) Date: Thu, 30 Mar 2006 13:01:24 -0500 Subject: [vsipl++] Index and Length In-Reply-To: <442C0F35.90302@codesourcery.com> References: <442C0F35.90302@codesourcery.com> Message-ID: <442C1CF4.3090305@codesourcery.com> Assem, The code changes look good, but some of the comments appear unnecessary. In particular, we don't need the comments about how some piece of code used to use "Point" but now uses "Length/Index". Once "Point" is gone from a particular section of code, there is no general need to mention how it used to be implemented. 
We can get that history from the changelog and from cvs. Moreover, once we've removed Point from the library altogether, mentioning it will be confusing. I don't see any substantive changes to length.hpp. You should revert your local changes to this file and not commit it. Some comments misspell 'deprecated'. Finally, I've given some thought to where these functions should go. The reason that the Point stuff was split out into separate files point.hpp and point-fcn.hpp was to keep the Point class definition relatively lightweight so that it could be included without pulling in the rest of the library. Some of the functions in point-fcn require the definition of the views (i.e. the extent functions). Including vector.hpp ends up including quite a bit. On the surface, worrying about whether point.hpp includes vector.hpp might seem silly, since any reasonable VSIPL++ program will end up including vector.hpp anyway. However, keeping the header dependencies simple helps improve maintainability. Every so often we make a change that creates a circular requirement that takes some effort to sort out. Here's what I'd like to do: - Move the view get/put functions from point-fcn.hpp to the respective view headers. I.e. get(const_Vector, Index) should go into vector.hpp. This makes sense because if you're going to use get on a view, you will have to include that view's header. The view headers should already include domain.hpp, so this doesn't create a new dependency. - Move everything else from point.hpp (next, valid, block get/put) and point-fcn (domain_nth, first, domain constructor helper) into domain-utils. How does that sound? -- Jules Assem Salama wrote: > Everyone, > This patch takes out the use of Point and replaces it with Index and > Length. 
> > Thanks, > Assem Salama > > > ------------------------------------------------------------------------ > > 2006-03-30 Assem Salama > * src/vsip/impl/length.hpp: Changed the extent function in this file > to return Length instead of point. This extent takes a Block. > * src/vsip/impl/par-util.hpp: Changed the foreach_point function to > work on Index instead of Point. > * src/vsip/impl/point-fcn.hpp: Added new extent functions to return > Length instead of Point. > * src/vsip/impl/point.hpp: Added new functions to make Index and > Length work correctly. The new functions are get,put,next,valid, > and domain_nth. > * tests/appmap.cpp: Converted this test to use Length and Index. > * tests/fast-block.cpp: Same as appmap.cpp > * tests/us-block.cpp: Same as above. > * tests/user_storage.cpp: Same as above. > * tests/util-par.hpp: Same as above. > * tests/view.cpp: Same as above. > * tests/vmmul.cpp: Same as above. > > > > > Index: src/vsip/impl/length.hpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/src/vsip/impl/length.hpp,v > retrieving revision 1.3 > diff -u -r1.3 length.hpp > --- src/vsip/impl/length.hpp 15 Sep 2005 14:49:25 -0000 1.3 > +++ src/vsip/impl/length.hpp 30 Mar 2006 17:00:12 -0000 > @@ -15,8 +15,10 @@ > ***********************************************************************/ > > #include > +#include Why is it necessary to include domain.hpp from length.hpp? Domain doesn't appear to be used in any of the changes to length. > #include > > + > namespace vsip > { > namespace impl > @@ -47,6 +49,9 @@ > : Vertex(x, y, z) {} > }; > > +// This function used to return a Point. Now it returns a Length. The use of > +// Point is depricated. > + This comment doesn't really add any value. Once we've removed Point, talking about it will create confusion. This kind of history can be extracted from the ChangeLog and cvs if necessary. 
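For reference, the valid/next iteration idiom that this patch moves from Point to Index and Length can be sketched in isolation roughly as below. This is a minimal, hedged sketch: the Index and Length structs in namespace sketch are simplified stand-ins rather than the library's types, only the 2-dimensional case is shown, and visit_all is a hypothetical helper added purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch
{

// Simplified stand-ins for the library's Index and Length types.
template <std::size_t Dim>
struct Index
{
  std::size_t c[Dim];
  std::size_t& operator[](std::size_t d) { return c[d]; }
  std::size_t operator[](std::size_t d) const { return c[d]; }
};

template <std::size_t Dim>
struct Length
{
  std::size_t c[Dim];
  std::size_t operator[](std::size_t d) const { return c[d]; }
};

/// Check that every coordinate of idx lies inside extent.
template <std::size_t Dim>
inline bool valid(Length<Dim> const& extent, Index<Dim> const& idx)
{
  for (std::size_t d = 0; d < Dim; ++d)
    if (idx[d] >= extent[d])
      return false;
  return true;
}

/// Advance a 2-dim index, with dimension 0 varying fastest.
/// After the last element both coordinates equal the extent,
/// so valid() reports false and a loop terminates.
inline Index<2>& next(Length<2> const& extent, Index<2>& idx)
{
  if (++idx[0] == extent[0])
  {
    if (++idx[1] != extent[1])
      idx[0] = 0;
  }
  return idx;
}

/// Hypothetical helper: collect every index visited by the loop.
inline std::vector<Index<2> > visit_all(Length<2> const& extent)
{
  std::vector<Index<2> > visited;
  for (Index<2> idx{}; valid(extent, idx); next(extent, idx))
    visited.push_back(idx);
  return visited;
}

} // namespace sketch
```

Using valid(ext, idx) as the loop condition, rather than comparing the index against the extent with !=, also behaves sensibly when one dimension has length zero: the loop body is simply never entered.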
> template typename B> > inline Length > @@ -69,7 +74,6 @@ > return size; > } > > - > } // namespace vsip::impl > } // namespace vsip > > Index: src/vsip/impl/par-util.hpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/src/vsip/impl/par-util.hpp,v > retrieving revision 1.8 > diff -u -r1.8 par-util.hpp > --- src/vsip/impl/par-util.hpp 27 Mar 2006 23:19:34 -0000 1.8 > +++ src/vsip/impl/par-util.hpp 30 Mar 2006 17:00:12 -0000 > @@ -21,6 +21,7 @@ > #include > #include > #include > +#include > > > > @@ -113,10 +114,11 @@ > Domain ldom = local_domain(view, sb, p); > Domain gdom = global_domain(view, sb, p); > > - for (Point idx; valid(extent_old(ldom), idx); next(extent_old(ldom), idx)) > + Length ext = extent(ldom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > - Point l_idx = domain_nth(ldom, idx); > - Point g_idx = domain_nth(gdom, idx); > + Index l_idx = domain_nth(ldom, idx); > + Index g_idx = domain_nth(gdom, idx); > > put(local_view, l_idx, fcn(get(local_view, l_idx), l_idx, g_idx)); > } > Index: src/vsip/impl/point-fcn.hpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/src/vsip/impl/point-fcn.hpp,v > retrieving revision 1.3 > diff -u -r1.3 point-fcn.hpp > --- src/vsip/impl/point-fcn.hpp 9 Sep 2005 11:55:00 -0000 1.3 > +++ src/vsip/impl/point-fcn.hpp 30 Mar 2006 17:00:12 -0000 > @@ -53,6 +53,24 @@ > } > > > +// This function is like domain_nth function above but returns an Index > +// instead of a point This comment should stand on its own. That way we can remove the above function. > +template > +Index > +domain_nth( > + Domain const& dom, > + Index const& idx) > +{ > + Index res; > + > + for (dimension_type d=0; d + res[d] = dom[d].impl_nth(idx[d]); > + > + return res; > +} > + > + > + > > /// Get the first index of a domain. > > @@ -69,7 +87,8 @@ > > > > -/// Get the extent of a domain, as a point. 
> +/// Get the extent of a domain, as a point. This function is now depricated. > +/// We should use Length now instead. > > template > Point > @@ -84,9 +103,24 @@ > return res; > } > > +/// Get the extent of a domain as a Length. > > +template > +Length > +extent( > + Domain const& dom) > +{ > + Length res; > > -/// Get the extent of a vector view, as a point. > + for (dimension_type d=0; d + res[d] = dom[d].length(); > + > + return res; > +} > + > + > +/// Get the extent of a vector view, as a point. This function is depricated. > +/// We should use Length now instead. > > template typename Block> > @@ -96,9 +130,21 @@ > return Point<1>(v.size(0)); > } > > +/// Get the extent of a vector view, as a Length. > + > +template + typename Block> > +Length<1> > +extent(const_Vector v) > +{ > + return Length<1>(v.size(0)); > +} > + > + > > > -/// Get the extent of a matrix view, as a point. > +/// Get the extent of a matrix view, as a point. This function is depricated. > +/// We should use Length now instead of point. > > template typename Block> > @@ -109,6 +155,16 @@ > } > > > +/// Get the extent of a matrix view, as a Length. > + > +template + typename Block> > +Length<2> > +extent(const_Matrix v) > +{ > + return Length<2>(v.size(0), v.size(1)); > +} > + > > /// Construct a 1-dim domain with an offset and a size (implicit > /// stride of 1) > Index: src/vsip/impl/point.hpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/src/vsip/impl/point.hpp,v > retrieving revision 1.8 > diff -u -r1.8 point.hpp > --- src/vsip/impl/point.hpp 7 Mar 2006 02:15:22 -0000 1.8 > +++ src/vsip/impl/point.hpp 30 Mar 2006 17:00:12 -0000 > @@ -15,6 +15,7 @@ > ***********************************************************************/ > > #include > +#include > > > /*********************************************************************** > @@ -193,6 +194,58 @@ > return idx; > } > > +/* Now let's make the "next" functions that work Index. 
This function is the > + * same as the Point one but it operates on an Index > + */ This comment should stand on its own. Also, for comments like this that precede a function and describe its operation, you should use either C++ style comments '//', or preferably C++ style comments with an extra '/' -> '///'. These comments get recognized by Synopsis for generating documentation. For a function, the comments should be something like this: /// Short description (1-line or so) /// Optional longer description, covering parameters, etc. /// This can be multiple lines. > +inline > +Index<1>& > +next( > + Length<1> const& /*extent*/, > + Index<1>& idx) > +{ > + ++idx[0]; > + return idx; > +} > + > + > + > +inline > +Index<2>& > +next( > + Length<2> const& extent, > + Index<2>& idx) > +{ > + if (++idx[0] == extent[0]) > + { > + if (++idx[1] != extent[1]) > + idx[0] = 0; > + } > + return idx; > +} > + > + > + > +inline > +Index<3>& > +next( > + Length<3> const& extent, > + Index<3>& idx) > +{ > + if (++idx[0] == extent[0]) > + { > + if (++idx[1] == extent[1]) > + { > + if (++idx[2] == extent[2]) > + return idx; > + idx[1] = 0; > + } > + idx[0] = 0; > + } > + > + return idx; > +} > + > + > > template > inline bool > @@ -206,6 +259,21 @@ > return true; > } > > +// This function checks if the index is valid given a certain length. This > +// function works for multiple dimension spaces. > +template > +inline bool > +valid( > + Length const& extent, > + Index const& idx) > +{ > + for(dimension_type d=0;d + if(idx[d] >= extent[d]) > + return false; > + return true; > +} > + > + > > template > inline > @@ -297,7 +365,6 @@ > } > > > - > /// Put a value into a 2-dim block. > > template > @@ -325,6 +392,84 @@ > } > > > +// These functions use an Index instead of a Point. > +// The use Point is depricated > + > +/// Get a value from a 1-dim block. 
> + > +template > +inline typename Block::value_type > +get( > + Block const& block, > + Index<1> const& idx) > +{ > + return block.get(idx[0]); > +} > + > + > + > +/// Get a value from a 2-dim block. > + > +template > +inline typename Block::value_type > +get( > + Block const& block, > + Index<2> const& idx) > +{ > + return block.get(idx[0], idx[1]); > +} > + > + > + > +/// Get a value from a 3-dim block. > + > +template > +inline typename Block::value_type > +get( > + Block const& block, > + Index<3> const& idx) > +{ > + return block.get(idx[0], idx[1], idx[2]); > +} > + > + > + > +/// Put a value into a 1-dim block. > + > +template > +inline void > +put( > + Block& block, > + Index<1> const& idx, > + typename Block::value_type const& val) > +{ > + block.put(idx[0], val); > +} > + > + > +/// Put a value into a 2-dim block. > + > +template > +inline void > +put( > + Block& block, > + Index<2> const& idx, > + typename Block::value_type const& val) > +{ > + block.put(idx[0], idx[1], val); > +} > + > + > +template > +inline void > +put( > + Block& block, > + Index<3> const& idx, > + typename Block::value_type const& val) > +{ > + block.put(idx[0], idx[1], idx[2], val); > +} > + > > > > Index: tests/appmap.cpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/tests/appmap.cpp,v > retrieving revision 1.10 > diff -u -r1.10 appmap.cpp > --- tests/appmap.cpp 27 Mar 2006 23:19:34 -0000 1.10 > +++ tests/appmap.cpp 30 Mar 2006 17:00:12 -0000 > @@ -13,14 +13,15 @@ > #include > #include > #include > +#include > #include "test.hpp" > #include "output.hpp" > > using namespace std; > using namespace vsip; > > -using vsip::impl::Point; > -using vsip::impl::extent_old; > +using vsip::impl::Length; > +using vsip::impl::extent; > using vsip::impl::valid; > using vsip::impl::next; > using vsip::impl::domain_nth; > @@ -95,12 +96,6 @@ > > > > -inline Index<1> as_index(Point<1> const& p) {return Index<1>(p[0]); } > -inline Index<2> 
as_index(Point<2> const& p) {return Index<2>(p[0],p[1]); } > -inline Index<3> as_index(Point<3> const& p) {return Index<3>(p[0],p[1],p[2]); } > - > - > - > // Check that local and global indices within a patch are consistent. > > template @@ -147,16 +142,20 @@ > } > } > > - Point ext = extent_old(gdom); > + /* We can replace this segment of code with one that uses Length and Index. > + * The use of Point is depricated and Length and Index should be used > + * Instead > + */ Comment not necessary > > - for (Point idx; valid(ext, idx); next(ext, idx)) > + Length ext = extent(gdom); > + for(Index idx; valid(ext,idx); next(ext,idx)) > { > - Index g_idx = as_index(domain_nth(gdom, idx)); > - Index l_idx = as_index(domain_nth(ldom, idx)); > - > + Index g_idx = domain_nth(gdom,idx); > + Index l_idx = domain_nth(ldom,idx); > test_assert(map.impl_subblock_from_global_index(g_idx) == sb); > test_assert(map.impl_patch_from_global_index(g_idx) == p); > } > + > } > > > Index: tests/fast-block.cpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/tests/fast-block.cpp,v > retrieving revision 1.6 > diff -u -r1.6 fast-block.cpp > --- tests/fast-block.cpp 20 Dec 2005 12:48:40 -0000 1.6 > +++ tests/fast-block.cpp 30 Mar 2006 17:00:12 -0000 > @@ -16,13 +16,16 @@ > #include > #include > #include > +#include > #include "test.hpp" > > using namespace std; > using namespace vsip; > > -using vsip::impl::Point; > -using vsip::impl::extent_old; > +using vsip::impl::Length; > +using vsip::impl::extent; > +using vsip::impl::valid; > +using vsip::impl::next; > > > > @@ -30,33 +33,35 @@ > Definitions > ***********************************************************************/ > > +/* We no longer use Point. Instead we will use Index and Length. We need a > + * different set of functions that operate on Index and Length instead of > + * Point. 
> + */ comment not necessary > + > template > inline T > identity( > - Point<1> /*extent*/, > - Point<1> idx, > + Length<1> /*extent*/, > + Index<1> idx, > int k) > { > return static_cast(k*idx[0] + 1); > } > > > - > template > inline T > identity( > - Point<2> extent, > - Point<2> idx, > - int k) > + Length<2> extent, > + Index<2> idx, > + int k) > { > - Point<2> offset; > + Index<2> offset; > index_type i = (idx[0]+offset[0])*extent[1] + (idx[1]+offset[1]); > return static_cast(k*i+1); > } > > > - > - > template typename Block> > void > @@ -64,15 +69,14 @@ > { > typedef typename Block::value_type value_type; > > - Point ex = extent_old(blk); > - for (Point idx; idx != ex; next(ex, idx)) > + Length ex = extent(blk); > + for (Index idx; valid(ex,idx); next(ex, idx)) > { > put(blk, idx, identity(ex, idx, k)); > } > } > > > - > template typename Block> > void > @@ -80,8 +84,8 @@ > { > typedef typename Block::value_type value_type; > > - Point ex = extent_old(blk); > - for (Point idx; idx != ex; next(ex, idx)) > + Length ex = extent(blk); > + for (Index idx; valid(ex,idx); next(ex, idx)) > { > test_assert(equal( get(blk, idx), > identity(ex, idx, k))); > @@ -89,7 +93,6 @@ > } > > > - > template typename Block> > void > Index: tests/us-block.cpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/tests/us-block.cpp,v > retrieving revision 1.1 > diff -u -r1.1 us-block.cpp > --- tests/us-block.cpp 10 Feb 2006 22:24:02 -0000 1.1 > +++ tests/us-block.cpp 30 Mar 2006 17:00:12 -0000 > @@ -16,14 +16,15 @@ > #include > #include > #include > +#include > > #include "test.hpp" > > using namespace std; > using namespace vsip; > > -using vsip::impl::Point; > -using vsip::impl::extent_old; > +using vsip::impl::Length; > +using vsip::impl::extent; > > > > @@ -31,32 +32,35 @@ > Definitions > ***********************************************************************/ > > +/* We no longer use Point. 
Instead we will use Index and Length. We need a > + * different set of functions that operate on Index and Length instead of > + * Point. > + */ Comment not necessary > + > template > inline T > identity( > - Point<1> /*extent*/, > - Point<1> idx, > + Length<1> /*extent*/, > + Index<1> idx, > int k) > { > return static_cast(k*idx[0] + 1); > } > > > - > template > inline T > identity( > - Point<2> extent, > - Point<2> idx, > - int k) > + Length<2> extent, > + Index<2> idx, > + int k) > { > - Point<2> offset; > + Index<2> offset; > index_type i = (idx[0]+offset[0])*extent[1] + (idx[1]+offset[1]); > return static_cast(k*i+1); > } > > > - > template typename Block> > void > @@ -64,15 +68,14 @@ > { > typedef typename Block::value_type value_type; > > - Point ex = extent_old(blk); > - for (Point idx; idx != ex; next(ex, idx)) > + Length ex = extent(blk); > + for (Index idx; valid(ex,idx); next(ex, idx)) > { > put(blk, idx, identity(ex, idx, k)); > } > } > > > - > template typename Block> > void > @@ -80,16 +83,17 @@ > { > typedef typename Block::value_type value_type; > > - Point ex = extent_old(blk); > - for (Point idx; idx != ex; next(ex, idx)) > + Length ex = extent(blk); > + for (Index idx; valid(ex,idx); next(ex, idx)) > { > test_assert(equal( get(blk, idx), > - identity(ex, idx, k))); > + identity(ex, idx, k))); > } > } > > > > + > template typename BlockT> > void > Index: tests/user_storage.cpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/tests/user_storage.cpp,v > retrieving revision 1.6 > diff -u -r1.6 user_storage.cpp > --- tests/user_storage.cpp 20 Dec 2005 12:48:41 -0000 1.6 > +++ tests/user_storage.cpp 30 Mar 2006 17:00:12 -0000 > @@ -17,24 +17,24 @@ > #include > #include > #include > +#include > #include "test.hpp" > > using namespace std; > using namespace vsip; > > -using vsip::impl::Point; > +using vsip::impl::Length; > > > > 
/*********************************************************************** > Definitions > ***********************************************************************/ > - > template dimension_type Dim> > index_type > -to_index(Point const& ext, > - Point const& idx) > +to_index(Length const& ext, > + Index const& idx) > { > if (Dim == 1) > return idx[0]; > @@ -47,7 +47,6 @@ > } > > > - > template typename T, > dimension_type Dim, > @@ -58,8 +57,8 @@ > Domain const& dom, > Func fun) > { > - Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext, idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > data[i] = fun(i); > @@ -67,7 +66,6 @@ > } > > > - > template typename T, > dimension_type Dim, > @@ -78,8 +76,8 @@ > Domain const& dom, > Func fun) > { > - Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > if (!equal(data[i], fun(i))) > @@ -89,7 +87,6 @@ > } > > > - > template typename T, > dimension_type Dim, > @@ -100,8 +97,8 @@ > Domain const& dom, > Func fun) > { > - Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > complex val = fun(i); > @@ -111,7 +108,6 @@ > } > > > - > template typename T, > dimension_type Dim, > @@ -122,8 +118,8 @@ > Domain const& dom, > Func fun) > { > - Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > complex val = fun(i); > @@ -135,7 +131,6 @@ > } > > > - > template typename T, > dimension_type Dim, > @@ -147,8 +142,8 @@ > Domain const& dom, > Func fun) > { > - 
Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > complex val = fun(i); > @@ -158,7 +153,6 @@ > } > > > - > template typename T, > dimension_type Dim, > @@ -170,8 +164,8 @@ > Domain const& dom, > Func fun) > { > - Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > complex val = fun(i); > @@ -182,8 +176,6 @@ > return true; > } > > - > - > template typename Block, > dimension_type Dim, > @@ -194,16 +186,14 @@ > Domain const& dom, > Func fun) > { > - Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > put(block, idx, fun(i)); > } > } > > - > - > template typename Block, > dimension_type Dim, > @@ -214,8 +204,8 @@ > Domain const& dom, > Func fun) > { > - Point ext = impl::extent_old(dom); > - for (Point idx; idx != ext; next(ext, idx)) > + Length ext = impl::extent(dom); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > index_type i = to_index(ext, idx); > if (!equal(get(block, idx), fun(i))) > @@ -224,7 +214,6 @@ > return true; > } > > - > template > class Filler > { > Index: tests/util-par.hpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/tests/util-par.hpp,v > retrieving revision 1.8 > diff -u -r1.8 util-par.hpp > --- tests/util-par.hpp 27 Mar 2006 23:19:34 -0000 1.8 > +++ tests/util-par.hpp 30 Mar 2006 17:00:12 -0000 > @@ -20,6 +20,7 @@ > #include > #include > #include > +#include > > #include "output.hpp" > #include "extdata-output.hpp" > @@ -276,8 +277,8 @@ > Increment(T delta) : delta_(delta) {} > > T operator()(T 
value, > - vsip::impl::Point const&, > - vsip::impl::Point const&) > + vsip::Index const&, > + vsip::Index const&) > { return value + delta_; } > > // Member Data > @@ -294,19 +295,22 @@ > class Set_identity > { > public: > + // The Set_identity () operators used to take Point as their argument. The > + // use of Point is depricated. We need to use Index instead. > + > Set_identity(vsip::Domain const& dom, int k = 1, int o = 0) > : dom_(dom), k_(k), o_(o) {} > > template > T operator()(T /*value*/, > - vsip::impl::Point<1> const& /*local*/, > - vsip::impl::Point<1> const& global) > + vsip::Index<1> const& /*local*/, > + vsip::Index<1> const& global) > { return T(k_*global[0] + o_); } > > template > T operator()(T /*value*/, > - vsip::impl::Point<2> const& /*local*/, > - vsip::impl::Point<2> const& global) > + vsip::Index<2> const& /*local*/, > + vsip::Index<2> const& global) > { > vsip::index_type i = global[0]*dom_[1].length()+global[1]; > return T(k_*i+o_); > @@ -314,8 +318,8 @@ > > template > T operator()(T /*value*/, > - vsip::impl::Point<3> const& /*local*/, > - vsip::impl::Point<3> const& global) > + vsip::Index<3> const& /*local*/, > + vsip::Index<3> const& global) > { > vsip::index_type i = global[0]*dom_[1].length()*dom_[2].length() > + global[1]*dom_[2].length() > @@ -343,10 +347,14 @@ > > bool good() { return good_; } > > + // The Check_identity () operators used to take Point as their argument. The > + // use of Point is depricated. We need to use Index instead. 
> + > + > template > T operator()(T value, > - vsip::impl::Point<1> const& /*local*/, > - vsip::impl::Point<1> const& global) > + vsip::Index<1> const& /*local*/, > + vsip::Index<1> const& global) > { > int i = global[0]; > T expected = T(k_*i + o_); > @@ -363,8 +371,8 @@ > > template > T operator()(T value, > - vsip::impl::Point<2> const& /*local*/, > - vsip::impl::Point<2> const& global) > + vsip::Index<2> const& /*local*/, > + vsip::Index<2> const& global) > { > int i = global[0]*dom_[1].length()+global[1]; > T expected = T(k_*i+o_); > @@ -383,8 +391,8 @@ > > template > T operator()(T value, > - vsip::impl::Point<3> const& /*local*/, > - vsip::impl::Point<3> const& global) > + vsip::Index<3> const& /*local*/, > + vsip::Index<3> const& global) > { > int i = global[0]*dom_[1].length()*dom_[2].length() > + global[1]*dom_[2].length() > Index: tests/view.cpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/tests/view.cpp,v > retrieving revision 1.10 > diff -u -r1.10 view.cpp > --- tests/view.cpp 20 Dec 2005 12:48:41 -0000 1.10 > +++ tests/view.cpp 30 Mar 2006 17:00:12 -0000 > @@ -23,14 +23,14 @@ > #include > #include > #include > +#include > #include "test.hpp" > #include "test-storage.hpp" > > using namespace std; > using namespace vsip; > > -using vsip::impl::Point; > -using vsip::impl::extent_old; > +using vsip::impl::Length; > > > > @@ -348,6 +348,9 @@ > > // Check that all elements of a view have the same const values > > +// The use of Point is depricated. This function was converted to use Length > +// and index instead. 
> + > template > bool > check_view_const( > @@ -355,7 +358,8 @@ > typename View::value_type scalar) > { > dimension_type const dim = View::dim; > - for (Point idx; idx != extent_old(view); next(extent_old(view), idx)) > + Length ext = extent(view); > + for (Index idx; valid(ext,idx); next(ext, idx)) > { > if (!equal(get(view, idx), scalar)) > return false; > Index: tests/vmmul.cpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/tests/vmmul.cpp,v > retrieving revision 1.4 > diff -u -r1.4 vmmul.cpp > --- tests/vmmul.cpp 20 Dec 2005 12:48:41 -0000 1.4 > +++ tests/vmmul.cpp 30 Mar 2006 17:00:12 -0000 > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > > #include "test.hpp" > #include "util-par.hpp" > @@ -87,8 +88,8 @@ > > template > T operator()(T value, > - vsip::impl::Point<2> const& /*local*/, > - vsip::impl::Point<2> const& global) > + vsip::Index<2> const& /*local*/, > + vsip::Index<2> const& global) > { > vsip::index_type i = global[0]*dom_[1].length()+global[1]; > T expected = (VecDim == 0) ? T(global[1] * i) : T(global[0] * i); -- Jules Bergmann CodeSourcery jules at codesourcery.com (650) 331-3385 x705 From don at codesourcery.com Fri Mar 31 02:28:30 2006 From: don at codesourcery.com (Don McCoy) Date: Thu, 30 Mar 2006 19:28:30 -0700 Subject: [patch] FIR Filter bank benchmark Message-ID: <442C93CE.2020103@codesourcery.com> The attached patch adds one of the MIT Lincoln Labs' PCA Kernel-Level Benchmarks to VSIPL++ -- the FIR Filter Bank. It also has a minor re-organization of some support functions, moving them from the tests/ directory to the src/vsip_csl/ directory. Actually, copies have been made as I didn't think it would be good to delete the ones in tests/ until all other references to them have been cleaned up. This benchmark defines two sets of parameters for performing a series of convolutions on the input data. 
In each case, M input vectors of length N are convolved with filters of
length K. The two sets of parameters are given as follows:

    Set       1       2
    M        64      20
    N      4096    1024
    K       128      12

The benchmark framework defined for VSIPL++ sweeps N over a range of
values, so the point of interest for each set may be extracted according
to the table above.

Refer to the end of benchmarks/firbank.cpp to see the options used to
select various tests. Note: the last digit of the option value is
always 1 or 2, corresponding to the data set chosen.

In order to use external data files with the benchmark, they must be
located in benchmarks/data/set1 and benchmarks/data/set2. The filenames
must be as follows: inputs_X.matrix, filter.matrix and outputs_X.matrix,
where X denotes the size as a power of two [log2(N)]. The default
starting and ending values for N are 7 and 16, so files corresponding to
those vector sizes must be provided.

Validation is performed with external data. For full convolution, all
values are checked. The FFT-based algorithm is circular rather than
linear though, so values near the beginning and end are not checked. The
number of values that are checked is N - 2 * (K - 2).

Lastly, I had some difficulty getting the right answers to come out due
to the fact that the convolutions are done repeatedly on the same vector
in order to take a more accurate measurement. With the Fir class, the
state_save/state_no_save template parameter *must* be set to 'no_save',
or the results are retained between successive convolutions, thereby
corrupting the results. Not what is desired in this case!

Similarly with fast convolution, a temporary is used. I.e.:

    for (index_type l=0; l [...]
    {
      // Perform FIR convolutions
      for ( length_type i = 0; i < local_M; ++i )
      {
        Vector tmp(N, T());
        fwd_fft(l_inputs.row(i), tmp);
        tmp *= response.row(0); // assume fft already done on response
        inv_fft(tmp, test.row(i));
      }
    }

Moving the declaration and initialization of 'tmp' outside the loop has
the same effect as with 'state_save' because the contents of tmp are not
zeroed between rows.
With it inside the loop (as it should be), performance does not appear
to be affected noticeably, though it should have a slight impact.

Comments and feedback appreciated.

Regards,

--
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb.changes
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fb.diff

From don at codesourcery.com Fri Mar 31 19:02:45 2006
From: don at codesourcery.com (Don McCoy)
Date: Fri, 31 Mar 2006 12:02:45 -0700
Subject: [patch] Fastconv benchmark
Message-ID: <442D7CD5.3060201@codesourcery.com>

The attached patch updates the fast convolution benchmark by using the
new macro VSIP_IMPL_SOURCERY_VPP to separate code dependent on parallel
features of the library. This allows it to be compiled against the
reference implementation for performance comparisons.

Note that the changes to benchmarks.hpp submitted for yesterday's firbank
patch are needed for this as well.

Regards,

--
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fc.changes
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fc.diff

From jules at codesourcery.com Fri Mar 31 19:59:03 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 31 Mar 2006 14:59:03 -0500
Subject: [vsipl++] [patch] FIR Filter bank benchmark
In-Reply-To: <442C93CE.2020103@codesourcery.com>
References: <442C93CE.2020103@codesourcery.com>
Message-ID: <442D8A07.7060906@codesourcery.com>

Don McCoy wrote:
> The attached patch adds one of the MIT Lincoln Labs' PCA Kernel-Level
> Benchmarks to VSIPL++ -- the FIR Filter Bank.
> It also has a minor
> re-organization of some support functions, moving them from the tests/
> directory to the src/vsip_csl/ directory. Actually, copies have been
> made as I didn't think it would be good to delete the ones in tests/
> until all other references to them have been cleaned up.
>
> This benchmark defines two sets of parameters for performing a series of
> convolutions on the input data. In each case, M input vectors of length
> N are convolved with filters of length K. The two sets of parameters
> are given as follows:
>
>     Set       1       2
>     M        64      20
>     N      4096    1024
>     K       128      12
>
> The benchmark framework defined for VSIPL++ sweeps N over a range of
> values, so the point of interest for each set may be extracted according
> to the table above.
>
> Refer to the end of benchmarks/firbank.cpp to see the options used to
> select various tests. Note: the last digit of the option value is
> always 1 or 2, corresponding to the data set chosen.
>
> In order to use external data files with the benchmark, they must be
> located in benchmarks/data/set1 and benchmarks/data/set2. The filenames
> must be as follows: inputs_X.matrix, filter.matrix and outputs_X.matrix,
> where X denotes the size as a power of two [log2(N)]. The default
> starting and ending values for N are 7 and 16, so files corresponding to
> those vector sizes must be provided.
>
> Validation is performed with external data. For full convolution, all
> values are checked. The FFT-based algorithm is circular rather than
> linear though, so values near the beginning and end are not checked. The
> number of values that are checked is N - 2 * (K - 2).
>
>
> Lastly, I had some difficulty getting the right answers to come out due
> to the fact that the convolutions are done repeatedly on the same vector
> in order to take a more accurate measurement.
> With the Fir class, the
> state_save/state_no_save template parameter *must* be set to 'no_save',
> or the results are retained between successive convolutions, thereby
> corrupting the results. Not what is desired in this case!

Actually, using state_no_save isn't all that bad.

In particular for radar systems, data is usually not collected
continuously. Pulses are transmitted at a regular interval. In between
each pulse the received signal is collected. This received data is not
continuous because most systems cannot transmit and receive data
simultaneously (radar signals fall off with the 4th power of distance,
so getting the transmitted signal would blow out the receive
amplifiers); and because each new pulse "resets" the distance
corresponding to the received data.

A system might look something like:

   transmit: *         *           *
   receive:    ......     .......    .......
                    ^     ^
                    |     +- the beginning of this pulse is near
                    |
                    +- this end of this pulse is far

In a cheapo system, each pulse might have the same waveform (which would
simplify the FIRbank into only needing a single set of coefficients).
However, systems often use "waveform diversity" where each pulse is
slightly different. This makes it harder to jam and may increase the
sensitivity of the system. This diversity would require multiple sets
of filter kernels.

>
> Similarly with fast convolution, a temporary is used. I.e.:
>
>   for (index_type l=0; l [...]
>   {
>     // Perform FIR convolutions
>     for ( length_type i = 0; i < local_M; ++i )
>     {
>       Vector tmp(N, T());
>       fwd_fft(l_inputs.row(i), tmp);
>       tmp *= response.row(0); // assume fft already done on response
>       inv_fft(tmp, test.row(i));
>     }
>   }

It should be OK to move the declaration of tmp entirely outside the
loop. If fwd_fft's size is N, it will completely overwrite the values
in 'tmp'.

>
> Moving the declaration and initialization of 'tmp' outside the loop has
> the same effect as with 'state_save' because the contents of tmp are not
> zeroed between rows.
> With it inside the loop (as it should be),
> performance does not appear to be affected noticeably, though it should
> have a slight impact.
>
> Comments and feedback appreciated.
>

Reviewing the patch now ...

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Fri Mar 31 20:16:33 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 31 Mar 2006 15:16:33 -0500
Subject: [vsipl++] [patch] FIR Filter bank benchmark
In-Reply-To: <442C93CE.2020103@codesourcery.com>
References: <442C93CE.2020103@codesourcery.com>
Message-ID: <442D8E21.1090500@codesourcery.com>

Don McCoy wrote:
>
>
>
> ------------------------------------------------------------------------
>
> 2006-03-20 Don McCoy
>
> * benchmarks/benchmarks.hpp: Updated to reflect new location of
>   test.hpp (see below).
> * benchmarks/firbank.cpp: New file. Implements FIR Filter Bank
>   benchmark, one of the MIT/LL PCA Kernel Benchmarks. Demonstrates
>   two algorithms, time-domain convolution and "fast" convolution
>   based on Fourier transforms. Optionally supports using external
>   data files where the computed result is compared to the given
>   output file.
> * src/vsip_csl/test.hpp: Moved from tests/ directory and into the
>   'vsip_csl' namespace.
> * src/vsip_csl/output.hpp: Likewise.
> * src/vsip_csl/load_view.hpp: Likewise. Changed Load_view to
>   accept only constant filenames.

Don,

This patch looks good. The only real change I have is you should put
the output into a global matrix (see below). Let me know if that makes
sense. Once that is changed, please check it in.
thanks -- Jules > > > ------------------------------------------------------------------------ > > > Index: benchmarks/benchmarks.hpp > =================================================================== > RCS file: /home/cvs/Repository/vpp/benchmarks/benchmarks.hpp,v > retrieving revision 1.1 > diff -c -p -r1.1 benchmarks.hpp > *** benchmarks/benchmarks.hpp 21 Mar 2006 15:53:09 -0000 1.1 > --- benchmarks/benchmarks.hpp 31 Mar 2006 01:30:32 -0000 > *************** > *** 18,29 **** > // Sourcery VSIPL++ provides certain resources such as system > // timers that are needed for running the benchmarks. > > #include > ! #include <../tests/test.hpp> > > #else > > ! // when linking with non-sourcery versions of the lib, the > // definitions below provide a minimal set of these resources. > > #include > --- 18,33 ---- > // Sourcery VSIPL++ provides certain resources such as system > // timers that are needed for running the benchmarks. > > + #include > #include > ! #include > ! #include > ! > ! using namespace vsip_csl; > > #else > > ! // When linking with non-Sourcery versions of the lib, the > // definitions below provide a minimal set of these resources. > > #include > *************** typedef P_acc_timer Acc_tim > *** 135,141 **** > > > > - > /// Compare two floating-point values for equality. > /// > /// Algorithm from: > --- 139,144 ---- > Index: benchmarks/firbank.cpp > =================================================================== > RCS file: benchmarks/firbank.cpp > diff -N benchmarks/firbank.cpp > *** /dev/null 1 Jan 1970 00:00:00 -0000 > --- benchmarks/firbank.cpp 31 Mar 2006 01:30:32 -0000 > *************** > *** 0 **** > --- 1,466 ---- > + /* Copyright (c) 2006 by CodeSourcery. All rights reserved. */ > + > + /** @file firbank.cpp > + @author Don McCoy > + @date 2006-01-26 > + @brief VSIPL++ Library: FIR Filter Bank - MIT Lincoln Labs > + Polymorphous Computing Architecture Kernel-Level Benchmarks Let's jump ahead and use "HPEC" instead of "PCA". 
> + */
> +
> +
> +   t1.start();
> +   for (index_type l=0; l [...]
> +   {
> +     // Perform FIR convolutions
> +     for ( length_type i = 0; i < local_M; ++i )
> +     {
> +       Vector tmp(N, T());
> +       fwd_fft(l_inputs.row(i), tmp);
> +       tmp *= response.row(0); // assume fft already done on response
> +       inv_fft(tmp, test.row(i));

Why don't you put the result directly in outputs.local()?

I see, you're using outputs to pass the expected result in. That's fine.

Instead of declaring 'test' to be a local matrix, can you instead
declare a global results matrix, use the local view of that matrix here,
and then check the local portion below?

It is functionally the same as what you're doing here, but it is closer
to what applications will look like. After doing a FIRbank, an
application will probably want to reorganize the data for the next
operation. Having the data in a global view makes that possible.

Can you make a similar change to the full convolution too?

...

Nice diffs (for the following files)! You did this manually, right? I
didn't think CVS handled renaming of files. Thanks!

> *** tests/test.hpp 2006-03-06 18:15:23.000000000 -0800
> --- src/vsip_csl/test.hpp 2006-03-30 15:51:36.850324000 -0800
> ***************

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705

From jules at codesourcery.com Fri Mar 31 20:23:39 2006
From: jules at codesourcery.com (Jules Bergmann)
Date: Fri, 31 Mar 2006 15:23:39 -0500
Subject: [vsipl++] [patch] Fastconv benchmark
In-Reply-To: <442D7CD5.3060201@codesourcery.com>
References: <442D7CD5.3060201@codesourcery.com>
Message-ID: <442D8FCB.5000908@codesourcery.com>

Don McCoy wrote:
> The attached patch updates the fast convolution benchmark by using the
> new macro VSIP_IMPL_SOURCERY_VPP to separate code dependent on parallel
> features of the library. This allows it to be compiled against the
> reference implementation for performance comparisons.
>
> Note that the changes to benchmarks.hpp submitted for yesterday's firbank
> patch are needed for this as well.
>

Don, patch looks good, please commit.

thanks,
-- Jules

--
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705
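
The guard pattern described for the fastconv benchmark can be sketched
roughly as follows. This is only an illustration, not the benchmark's
code: the macro name VSIP_IMPL_SOURCERY_VPP is the only part taken from
the thread, while benchmark_variant() and the strings it returns are
hypothetical names made up for this example. It assumes the macro is
defined when building against Sourcery VSIPL++ and left undefined when
building against the reference implementation.

```cpp
#include <string>

// Hypothetical helper (not part of the library or the benchmark):
// reports which code path was compiled in.
inline std::string benchmark_variant()
{
#ifdef VSIP_IMPL_SOURCERY_VPP
  // Code that depends on the parallel features of the library
  // would be placed in this branch.
  return "parallel";
#else
  // Fallback branch for builds where the macro is not defined;
  // only the serial API would be used here.
  return "serial";
#endif
}
```

With the macro undefined, benchmark_variant() returns "serial"; passing
-DVSIP_IMPL_SOURCERY_VPP to the compiler selects the other branch. How
the real build defines the macro is not stated in the thread.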