From mark at codesourcery.com Thu Aug 5 16:42:47 2004
From: mark at codesourcery.com (Mark Mitchell)
Date: Thu, 05 Aug 2004 09:42:47 -0700
Subject: pooma compilation problems
In-Reply-To:
References:
Message-ID: <41126387.9060207@codesourcery.com>

Steve Nolen wrote:

>i am having a problem compiling pooma on a Linux cluster. i am using
>gcc_3.4.0 and it is giving a host of errors saying it is encountering
>multiply undefined variables. the problem seems to be with derived
>templated classes being unable to see the templated member variables of
>their parent classes. it can be resolved by explicitly scoping the
>variables to their proper base class, but this solution seems suboptimal.
>am i doing something incorrect or is gcc304 enforcing a stricter adherence
>to the standard?
>

The latter. I believe that the current sources for POOMA, in CVS, work
with GCC 3.4.x.

Thanks,

--
Mark Mitchell
CodeSourcery, LLC
(916) 791-8304
mark at codesourcery.com

From drnuke at lanl.gov Mon Aug 9 17:15:32 2004
From: drnuke at lanl.gov (Steve Nolen)
Date: Mon, 9 Aug 2004 11:15:32 -0600
Subject: error in downloadable cheetah package
Message-ID:

line 224 of MatchingHandler/MatchingAction.h should have "extra_m", not
"extra"

From oldham at codesourcery.com Mon Aug 9 22:28:04 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Mon, 09 Aug 2004 15:28:04 -0700
Subject: [pooma-dev] error in downloadable cheetah package
In-Reply-To:
References:
Message-ID: <4117FA74.8070209@codesourcery.com>

Steve Nolen wrote:

>line 224 of MatchingHandler/MatchingAction.h should have "extra_m", not
>"extra"
>

Thank you for using Cheetah and for the error report. Cheetah is not
under active development so incorporating this correction in the release
is unlikely to happen in the near future.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Wed Aug 11 17:26:12 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Wed, 11 Aug 2004 10:26:12 -0700
Subject: [pooma-dev] error in downloadable cheetah package
In-Reply-To:
References:
Message-ID: <411A56B4.10502@codesourcery.com>

Steve Nolen wrote:

>should i even be using cheetah for communications? i thought this was the
>preferred path. is pooma's internal message passing sufficient?
>

Cheetah is still supported by POOMA, but it is stable. No development
has happened for several years. Thus, POOMA + Cheetah should still work
and will continue to work for the foreseeable future.

MPI support for POOMA was recently added. To use it, configure POOMA
with the '--mpi' option, not the '--messaging' Cheetah option. One can
search the POOMA source code for 'POOMA_MPI' to see the places that have
been modified.

As a conservative user, I usually recommend using a configuration that
currently works well rather than switching to a newer version. Thus, I
suggest continuing to use Cheetah unless you expect some advantage
unknown to me from switching.

>-----Original Message-----
>From: Jeffrey D. Oldham [mailto:oldham at codesourcery.com]
>Sent: Monday, August 09, 2004 4:28 PM
>To: Steve Nolen
>Cc: Pooma
>Subject: Re: [pooma-dev] error in downloadable cheetah package
>
>
>Steve Nolen wrote:
>
>>line 224 of MatchingHandler/MatchingAction.h should have "extra_m", not
>>"extra"
>>
>Thank you for using Cheetah and for the error report. Cheetah is not
>under active development so incorporating this correction in the release
>is unlikely to happen in the near future.
>

--
Jeffrey D. Oldham
oldham at codesourcery.com

From drnuke at lanl.gov Wed Aug 11 21:21:14 2004
From: drnuke at lanl.gov (Steve Nolen)
Date: Wed, 11 Aug 2004 15:21:14 -0600
Subject: parallel particles wo cheetah
Message-ID:

is there a mixed signal in all of this? i'm trying to get a r2 version
of mc++ running again. i need parallelism with particles. my particles
(neutrons) do not need to interact with fields. the following emails
tell me that i should use cheetah, but when i report a blatant,
uncompilable bug, it gets dismissed because another email informs me
that cheetah is no longer supported. i have tried to compile
examples/Particles/Bounce with standalone mpi, but i get a ton of
errors. is this a known deficiency?

Steve Nolen wrote:

should i even be using cheetah for communications? i thought this was the
preferred path. is pooma's internal message passing sufficient?

Cheetah is still supported by POOMA, but it is stable. No development
has happened for several years. Thus, POOMA + Cheetah should still work
and will continue to work for the foreseeable future.

MPI support for POOMA was recently added. To use it, configure POOMA
with the '--mpi' option, not the '--messaging' Cheetah option. One can
search the POOMA source code for 'POOMA_MPI' to see the places that have
been modified.

As a conservative user, I usually recommend using a configuration that
currently works well rather than switching to a newer version. Thus, I
suggest continuing to use Cheetah unless you expect some advantage
unknown to me from switching.

-----Original Message-----
From: Jeffrey D. Oldham [mailto:oldham at xxxxxxxxxxxxxxxx]
Sent: Monday, August 09, 2004 4:28 PM
To: Steve Nolen
Cc: Pooma
Subject: Re: [pooma-dev] error in downloadable cheetah package

Steve Nolen wrote:

line 224 of MatchingHandler/MatchingAction.h should have "extra_m", not
"extra"

Thank you for using Cheetah and for the error report. Cheetah is not
under active development so incorporating this correction in the release
is unlikely to happen in the near future.

--
Jeffrey D. Oldham
oldham at xxxxxxxxxxxxxxxx

From oldham at codesourcery.com Wed Aug 11 22:09:33 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Wed, 11 Aug 2004 15:09:33 -0700
Subject: Status of Particles and Parallelism
Message-ID: <411A991D.6070205@codesourcery.com>

Richard,

Last month you made progress on particles in the Pooma CVS
repository. Steve Nolen wishes to use particles in the Pooma repository
code and MPI for his work. What is the current state of the particles
codes with MPI. As an alternate, using Cheetah will be acceptable.
Thanks for the information.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Sat Aug 14 14:08:31 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sat, 14 Aug 2004 16:08:31 +0200
Subject: [pooma-dev] Status of Particles and Parallelism
In-Reply-To: <411A991D.6070205@codesourcery.com>
References: <411A991D.6070205@codesourcery.com>
Message-ID: <411E1CDF.5000405@tat.physik.uni-tuebingen.de>

Jeffrey D. Oldham wrote:
> Richard,
>
> Last month you made progress on particles in the Pooma CVS
> repository. Steve Nolen wishes to use particles in the Pooma repository
> code and MPI for his work. What is the current state of the particles
> codes with MPI. As an alternate, using Cheetah will be acceptable.
> Thanks for the information.
>

Sorry for the late reply, I was on vacation. Parallel Particles are not
supported with MPI as I was not able to understand what
PatchParticleSwapLayout (or whatever it is called). If anyone provides
me with some explanation, I'll happily look at what is missing.

Btw. I have some Cheetah fixes myself, I can collect these together and
maybe we could provide at least a patch for download along the cheetah
tarball.

Richard.

From oldham at codesourcery.com Sat Aug 14 14:33:02 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Sat, 14 Aug 2004 07:33:02 -0700
Subject: Cheetah Status
In-Reply-To: <411E1CDF.5000405@tat.physik.uni-tuebingen.de>
References: <411A991D.6070205@codesourcery.com> <411E1CDF.5000405@tat.physik.uni-tuebingen.de>
Message-ID: <411E229E.2090000@codesourcery.com>

Richard Guenther wrote:

> Btw. I have some Cheetah fixes myself, I can collect these together
> and maybe we could provide at least a patch for download along the
> cheetah tarball.
>
> Richard.

Yes, let's put together your changes and Steve Nolen's changes into a
new Cheetah 1.1.5 release. Will you please send them to me? I'll
create a Cheetah CVS repository.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Sat Aug 14 17:00:43 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sat, 14 Aug 2004 19:00:43 +0200
Subject: [pooma-dev] Re: Cheetah Status
In-Reply-To: <411E229E.2090000@codesourcery.com>
References: <411A991D.6070205@codesourcery.com> <411E1CDF.5000405@tat.physik.uni-tuebingen.de> <411E229E.2090000@codesourcery.com>
Message-ID: <411E453B.5040204@tat.physik.uni-tuebingen.de>

Jeffrey D. Oldham wrote:
> Richard Guenther wrote:
>
>> Btw. I have some Cheetah fixes myself, I can collect these together
>> and maybe we could provide at least a patch for download along the
>> cheetah tarball.
>>
>> Richard.
>
>
> Yes, let's put together your changes and Steve Nolen's changes into a
> new Cheetah 1.1.5 release. Will you please send them to me? I'll
> create a Cheetah CVS repository.

I'll send my local changes to you on Monday. But I think we shouldn't
release until others are reporting success with a set of collected
patches. Obviously I only tested MPI and not the various other means of
parallelism in the Cheetah library.

Richard.

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 08:15:47 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 10:15:47 +0200 (CEST)
Subject: [pooma-dev] Re: Cheetah Status
In-Reply-To: <411E453B.5040204@tat.physik.uni-tuebingen.de>
Message-ID:

On Sat, 14 Aug 2004, Richard Guenther wrote:

> Jeffrey D. Oldham wrote:
> > Richard Guenther wrote:
> >
> >> Btw. I have some Cheetah fixes myself, I can collect these together
> >> and maybe we could provide at least a patch for download along the
> >> cheetah tarball.
> >>
> >> Richard.
> >
> >
> > Yes, let's put together your changes and Steve Nolen's changes into a
> > new Cheetah 1.1.5 release. Will you please send them to me? I'll
> > create a Cheetah CVS repository.
>
> I'll send my local changes to you on Monday. But I think we shouldn't
> release until others are reporting success with a set of collected
> patches. Obviously I only tested MPI and not the various other means of
> parallelism in the Cheetah library.

Ok, here's my set of local changes to the Cheetah library.

Sometimes Richard Guenther

    * BUGS: new.
    bin/makeinstall: ignore SCCS dirs.
    config/LINUXGCC.conf: use -g, not -ggdb.
    config/LINUXICC.conf: new.
    configure: don't set shmem_locksrc, build-system
    is broken.
    src/Controller/ControllerFactory.cpp: build factory
    with arg not removed.
    src/Controller/Group.h: reorder initializers.
    src/Controller/Shmem/MM_Allocator.h: remove broken
    method.
    src/Utilities/CheetahRefCountedPtr.h: const pointer
    by value makes no sense.

This patch is against the cheetah-1.1.4 tarball.

Richard.

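A minimal sketch of what the Group.h entry ("reorder initializers") is about, assuming the motivation is the usual C++ rule: non-static data members are initialized in the order they are declared in the class, not in the order they are written in the constructor's initializer list. The names below (note, Reordered, initOrder) are hypothetical and are not part of Cheetah.

```cpp
#include <cassert>
#include <string>

// Records the order in which members are actually initialized.
static std::string g_order;

// Appends a tag to the record and passes the value through.
static int note(char tag, int value) {
  g_order += tag;
  return value;
}

struct Reordered {
  int first_m;   // declared first, so initialized first
  int second_m;  // declared second, so initialized second

  // The list is written "backwards" on purpose. The compiler still
  // initializes first_m before second_m; GCC reports the mismatch
  // with -Wreorder.
  Reordered() : second_m(note('2', 2)), first_m(note('1', 1)) {}
};

// Constructs one object and returns the observed initialization order.
std::string initOrder() {
  g_order.clear();
  Reordered r;
  (void)r;
  return g_order;
}
```

Calling initOrder() shows the declaration order winning over the written order, which is why keeping the initializer list in declaration order, as the patch does, avoids surprises and silences the warning.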
--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
-------------- next part --------------
diff -Nru a/cheetah-1.1.4/BUGS b/cheetah-1.1.4/BUGS
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/cheetah-1.1.4/BUGS 2004-08-16 10:04:52 +02:00
@@ -0,0 +1,9 @@
+From Richard Guenther :
+
+ You may experience problems with passing the controller specification
+ argument using the MPICH mpirun implementation. This one doesnt set
+ up the arguments for the clients before calling MPI_Init.
+ To work around this deficiency store a file called CHEETAH_RUNTIME
+ containing "-mpi" as the first line in the same directory as the
+ executable.
+
diff -Nru a/cheetah-1.1.4/bin/makeinstall b/cheetah-1.1.4/bin/makeinstall
--- a/cheetah-1.1.4/bin/makeinstall 2004-08-16 10:04:52 +02:00
+++ b/cheetah-1.1.4/bin/makeinstall 2004-08-16 10:04:52 +02:00
@@ -46,8 +46,8 @@
 ### copy the source flies
 cd $fromsrc
 echo "Copying source files to $tosrc ..."
-hflist=`find . -name "*.h" -print`
-cflist=`find . -name "*.cpp" -print`
+hflist=`find . -name "*.h" -print | grep -v SCCS`
+cflist=`find . -name "*.cpp" -print | grep -v SCCS`
 tar cf - $hflist $cflist | (cd $tosrc ; tar xvf -)
 
 ### copy the library files
diff -Nru a/cheetah-1.1.4/config/LINUXGCC.conf b/cheetah-1.1.4/config/LINUXGCC.conf
--- a/cheetah-1.1.4/config/LINUXGCC.conf 2004-08-16 10:04:52 +02:00
+++ b/cheetah-1.1.4/config/LINUXGCC.conf 2004-08-16 10:04:52 +02:00
@@ -82,7 +82,7 @@
 
 ### debug or optimized build settings for C++ applications
 
-$cppdbg = "-ggdb";
+$cppdbg = "-g";
 
 $cppopt = "-O3 -funroll-loops";
 
diff -Nru a/cheetah-1.1.4/config/LINUXICC.conf b/cheetah-1.1.4/config/LINUXICC.conf
--- /dev/null Wed Dec 31 16:00:00 196900
+++ b/cheetah-1.1.4/config/LINUXICC.conf 2004-08-16 10:04:52 +02:00
@@ -0,0 +1,128 @@
+###########################################################################
+# Cheetah configuration settings
+#
+# Platform: LINUX
+# Compiler: Intel C++ compiler (icc)
+#
+###########################################################################
+#
+# This file contains specialized settings indicating how to build Cheetah
+# with this platform and compiler. This is a perl script executed by
+# the 'configure' script at the top level of the Cheetah directory structure.
+# This file has two sections:
+# 1) The locations of include files and libraries for external packages.
+# 3) The specialized settings on how to use this platform and compiler.
+# You should edit the lines in section 1) to the proper location of the
+# external packages. Do not edit the lines in section 2) unless you
+# know what you're doing.
+#
+###########################################################################
+
+###########################################################################
+# Section 1: external package locations.
+# Include search directories should have a '-I' prepended.
+# Library search directories should have a '-L' prepended.
+# Library filenames should just list the name or use -l prefix as needed.
+# Required defines should have -a '-D' prepended.
+###########################################################################
+
+### location of MM files, for shmem controller (if available)
+
+$has_shmem = 1;
+$shmem_default_dir = "/home/cheetah/packages/mm/build/linux";
+$shmem_inc = "-I$shmem_default_dir/include";
+$shmem_lib = "-L$shmem_default_dir/lib -lmm";
+$shmem_def = "";
+$shmem_locksrc = "Utilities/i386-lock.s";
+$shmem_lockobj = "i386-lock.o";
+$shmem_as = "as";
+
+### location of ULM files, for ULM controller (if available)
+
+$has_ulm = 0;
+
+
+###########################################################################
+# Section 2: compilation settings
+###########################################################################
+
+
+###################
+### characteristics
+###################
+
+### the name of this architecture
+
+$archtype = "linux";
+$comptype = "icc";
+
+### are shared libraries supported?
+
+$canmakesharedlib = 1;
+$sharedext = "so";
+
+
+################
+### C++ settings
+################
+
+### general settings for using the C++ compiler, for both libs and apps
+
+$cpp = "icc";
+
+$cppargs = "-restrict";
+
+$cppex = "-Kc++eh"; # flag to use exceptions
+$cppnoex = ""; # flag to turn off exceptions
+
+$cppverbose = ""; # flag for verbose compiler output
+
+$cppshare = "-KPIC"; # flag for compiling for shared libs
+
+
+### debug or optimized build settings for C++ applications
+
+$cppdbg = "-g";
+
+$cppopt = "-O3";
+
+
+###################
+### linker settings
+###################
+
+$link = "icc";
+
+$linkargs = "\$(CHEETAH_CXX_ARGS)";
+
+$linkverbose = "";
+
+$linkshare = "";
+
+
+#####################
+### archiver settings
+#####################
+
+$ar = "ar";
+
+$arargs = "rcsl";
+
+$arshare = "icc"; # program to make shared lib
+
+$arshareargs = "-shared -o"; # arguments to make shared lib
+
+
+# ACL:rcsinfo
+# ----------------------------------------------------------------------
+# $RCSfile: LINUXGCC.conf,v $ $Author: rasmussn $
+# $Revision: 1.3 $ $Date: 2000/06/26 22:07:27 $
+# ----------------------------------------------------------------------
+# ACL:rcsinfo
+
+###########################################################################
+# the last line of this file must be a '1' so that Perl sees a non-zero
+# results from this file
+###########################################################################
+1;
+
diff -Nru a/cheetah-1.1.4/configure b/cheetah-1.1.4/configure
--- a/cheetah-1.1.4/configure 2004-08-16 10:04:52 +02:00
+++ b/cheetah-1.1.4/configure 2004-08-16 10:04:52 +02:00
@@ -846,7 +846,7 @@
     {
       # make sure we don't try to set up any assembly file
       # with mutex lock code
-      $shmem_locksrc = "";
+      #$shmem_locksrc = "";
     }
 }
 
diff -Nru a/cheetah-1.1.4/src/Controller/ControllerFactory.cpp b/cheetah-1.1.4/src/Controller/ControllerFactory.cpp
--- a/cheetah-1.1.4/src/Controller/ControllerFactory.cpp 2004-08-16 10:04:52 +02:00
+++ b/cheetah-1.1.4/src/Controller/ControllerFactory.cpp 2004-08-16 10:04:52 +02:00
@@ -121,17 +121,21 @@
         if (p->first == argv[i])
           {
             //
-            // We have a match! Delete this arg from the input
-            // list.
+            // We have a match! Build this factory.
+            //
+            ControllerImpl* impl = p->second(argc, argv);
+
+            //
+            // Delete the arg from the input list.
             //
             for (int j=i+1; jsecond(argc, argv);
+            return impl;
           }
       }
   }
diff -Nru a/cheetah-1.1.4/src/Controller/Group.h b/cheetah-1.1.4/src/Controller/Group.h
--- a/cheetah-1.1.4/src/Controller/Group.h 2004-08-16 10:04:52 +02:00
+++ b/cheetah-1.1.4/src/Controller/Group.h 2004-08-16 10:04:52 +02:00
@@ -49,7 +49,7 @@
   // state which must be corrected with the initialize() function.
   //
 
-  Group() : nContexts_m(-1), myContext_m(NOT_A_MEMBER), ranks_m(0), id_m(0) { }
+  Group() : myContext_m(NOT_A_MEMBER), nContexts_m(-1), ranks_m(0), id_m(0) { }
 
   Group(int nContexts, int myContext, int* ranks = 0, int id = 0);
 
diff -Nru a/cheetah-1.1.4/src/Controller/Shmem/MM_Allocator.h b/cheetah-1.1.4/src/Controller/Shmem/MM_Allocator.h
--- a/cheetah-1.1.4/src/Controller/Shmem/MM_Allocator.h 2004-08-16 10:04:52 +02:00
+++ b/cheetah-1.1.4/src/Controller/Shmem/MM_Allocator.h 2004-08-16 10:04:52 +02:00
@@ -55,9 +55,6 @@
   //
   pointer allocate(int n) { return (pointer)MM_malloc(n*sizeof(T)); }
 
-  template
-  void allocate(int n, P) { return (pointer)MM_malloc(n*sizeof(T)); }
-
   //
   // Free back to shared memory.
   //
diff -Nru a/cheetah-1.1.4/src/Utilities/CheetahRefCountedPtr.h b/cheetah-1.1.4/src/Utilities/CheetahRefCountedPtr.h
--- a/cheetah-1.1.4/src/Utilities/CheetahRefCountedPtr.h 2004-08-16 10:04:52 +02:00
+++ b/cheetah-1.1.4/src/Utilities/CheetahRefCountedPtr.h 2004-08-16 10:04:52 +02:00
@@ -79,7 +79,7 @@
 
   // Assignment operators increment the reference count.
   RefCountedPtr & operator=(const RefCountedPtr &);
-  RefCountedPtr & operator=(T * const);
+  RefCountedPtr & operator=(T *);
 
   //============================================================
   // Accessors and Mutators

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 14:09:09 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 16:09:09 +0200 (CEST)
Subject: [pooma-dev] Re: pooma compilation problems
In-Reply-To: <41126387.9060207@codesourcery.com>
Message-ID:

On Thu, 5 Aug 2004, Mark Mitchell wrote:

> Steve Nolen wrote:
>
> >i am having a problem compiling pooma on a Linux cluster. i am using
> >gcc_3.4.0 and it is giving a host of errors saying it is encountering
> >multiply undefined variables. the problem seems to be with derived
> >templated classes being unable to see the templated member variables of
> >their parent classes. it can be resolved by explicitly scoping the
> >variables to their proper base class, but this solution seems suboptimal.
> >am i doing something incorrect or is gcc304 enforcing a stricter adherence
> >to the standard?
> >
> >
> The latter. I believe that the current sources for POOMA, in CVS, work
> with GCC 3.4.x.

Maybe we can do another release of the POOMA library. These problems seem
common, and just redirecting everyone to CVS is not good. At least maybe
the website could point out possible problems with gcc 3.4 and icpc.

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 14:12:52 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 16:12:52 +0200 (CEST)
Subject: [pooma-dev] Re: Cheetah Status
In-Reply-To:
Message-ID:

On Mon, 16 Aug 2004, Richard Guenther wrote:

> Ok, here's my set of local changes to the Cheetah library.
>
> Sometimes Richard Guenther
>
>     * BUGS: new.
>     bin/makeinstall: ignore SCCS dirs.
>     config/LINUXGCC.conf: use -g, not -ggdb.
>     config/LINUXICC.conf: new.
>     configure: don't set shmem_locksrc, build-system
>     is broken.
>     src/Controller/ControllerFactory.cpp: build factory
>     with arg not removed.
>     src/Controller/Group.h: reorder initializers.
>     src/Controller/Shmem/MM_Allocator.h: remove broken
>     method.
>     src/Utilities/CheetahRefCountedPtr.h: const pointer
>     by value makes no sense.
>
> This patch is against the cheetah-1.1.4 tarball.

A few more, from a different repository:

Sometimes Richard Guenther

    * src/Controller/MPI/cheetah_mpi.h: remove stray semicolon.
    src/MatchingHandler/MatchingAction.h; fix typo.

Richard.

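The ControllerFactory.cpp entry in the changelog quoted above ("build factory with arg not removed") can be pictured with a small sketch: the factory callback runs while the matching argument is still present in argv, and only afterwards is the argument shifted out of the list. Everything here (sawFlag, buildThenStrip, the way a flag such as "-mpi" is matched) is a hypothetical stand-in for illustration, not Cheetah's actual code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical "factory": reports whether it can still see `flag` in argv.
static bool sawFlag(int argc, char **argv, const char *flag) {
  for (int i = 1; i < argc; ++i)
    if (std::strcmp(argv[i], flag) == 0) return true;
  return false;
}

// Finds `flag`, calls the factory on the untouched argument list, and
// only then removes the flag. Returns what the factory observed.
bool buildThenStrip(int &argc, char **argv, const char *flag) {
  for (int i = 1; i < argc; ++i) {
    if (std::strcmp(argv[i], flag) == 0) {
      // Build first, while the argument is still in place.
      bool built = sawFlag(argc, argv, flag);

      // Then delete the argument by shifting the tail down one slot.
      for (int j = i + 1; j < argc; ++j)
        argv[j - 1] = argv[j];
      --argc;
      return built;
    }
  }
  return false;  // flag not present, nothing built
}
```

With an argument list of "prog -mpi input", the factory sees "-mpi", and the caller is left with "prog input" afterwards.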
--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
-------------- next part --------------
diff -ur /tmp/cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h
--- /tmp/cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h Wed Oct 24 22:47:47 2001
+++ cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h Fri Dec 19 10:21:07 2003
@@ -28,7 +28,7 @@
   class Group;
   class MPIGroup;
   class MPIController;
-};
+}
 
 #ifdef __cplusplus
 extern "C"
diff -ur /tmp/cheetah-1.1.4/src/MatchingHandler/MatchingAction.h cheetah-1.1.4/src/MatchingHandler/MatchingAction.h
--- /tmp/cheetah-1.1.4/src/MatchingHandler/MatchingAction.h Mon Apr 17 22:33:03 2000
+++ cheetah-1.1.4/src/MatchingHandler/MatchingAction.h Fri Dec 19 10:21:08 2003
@@ -221,7 +221,7 @@
 
   inline void operator()()
   {
-    handler_m(extra);
+    handler_m(extra_m);
   }
 
   template

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 14:54:03 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 16:54:03 +0200 (CEST)
Subject: [PATCH] kill POOMA_REORDER_ITERATES
Message-ID:

This patch kills POOMA_REORDER_ITERATES, i.e. assumes it is always set.
It was used inconsistently before anyways and the stub scheduler doesn't
reorder iterates anyway and the SerialAsync does so anyway. So this
reduces source code complexity.

Not tested, but looks obvious.

Ok?

Richard.

2004Aug16 Richard Guenther

    * configure: remove traces of POOMA_REORDER_ITERATES.
    src/Evaluator/Evaluator.h: likewise.
    src/Evaluator/MultiArgEvaluator.h: likewise.
    src/Evaluator/Reduction.h: likewise.
    src/Pooma/Pooma.cmpl.cpp: likewise.
    src/Tulip/SendReceive.h: likewise.
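As a rough illustration of what "always reorder iterates" means, here is a toy scheduler sketch: work is handed off as an iterate instead of being evaluated on the spot, and an iterate given priority -1 runs before those left at the default. The class and member names are hypothetical, and the rule that a smaller priority value runs earlier is an assumption made for this sketch, not a statement about how POOMA's schedulers are implemented.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical miniature of an iterate; none of these names are POOMA's.
struct IterateSketch {
  int priority;      // ASSUMPTION: smaller value runs earlier
  int sequence;      // hand-off order, used to break priority ties
  std::string name;  // label so the run order can be inspected
};

// Ordering for std::priority_queue: returns true when `a` should run
// later than `b`, so the iterate that runs earliest ends up on top.
struct RunsLater {
  bool operator()(const IterateSketch &a, const IterateSketch &b) const {
    if (a.priority != b.priority) return a.priority > b.priority;
    return a.sequence > b.sequence;
  }
};

class SchedulerSketch {
 public:
  SchedulerSketch() : next_m(0) {}

  // Queue an iterate instead of running it immediately.
  void handOff(const std::string &name, int priority = 0) {
    IterateSketch it;
    it.priority = priority;
    it.sequence = next_m++;
    it.name = name;
    queue_m.push(it);
  }

  // Drain the queue and report the order in which iterates would run.
  std::vector<std::string> runAll() {
    std::vector<std::string> order;
    while (!queue_m.empty()) {
      order.push_back(queue_m.top().name);
      queue_m.pop();
    }
    return order;
  }

 private:
  std::priority_queue<IterateSketch, std::vector<IterateSketch>, RunsLater>
      queue_m;
  int next_m;
};
```

Handing off "compute A", then "receive" with priority -1, then "compute B" yields the run order receive, compute A, compute B in this sketch, mirroring the intent stated in the SendReceive.h comments that message iterates run ahead of other work.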
-------------- next part --------------
===== r2/configure 1.17 vs edited =====
--- 1.17/r2/configure 2004-07-15 16:22:32 +02:00
+++ edited/r2/configure 2004-08-16 16:47:58 +02:00
@@ -1498,10 +1498,6 @@
     }
 }
 
-  $pooma_reorder_iterates = $threads || ($scheduler eq "serialAsync");
-
-  add_yesno_define("POOMA_REORDER_ITERATES", $pooma_reorder_iterates);
-
 # OpenMP support
 if (scalar @{$arghash{$openmpnm}} > 1)
 {
===== r2/src/Evaluator/Evaluator.h 1.3 vs edited =====
--- 1.3/r2/src/Evaluator/Evaluator.h 2003-10-23 14:41:01 +02:00
+++ edited/r2/src/Evaluator/Evaluator.h 2004-08-16 16:44:43 +02:00
@@ -153,12 +153,8 @@
   void evaluate(const LHS& lhs, const Op& op, const RHS& rhs) const
   {
     typedef typename KernelTag::Kernel_t Kernel_t;
-#if POOMA_REORDER_ITERATES
     Pooma::Iterate_t *iterate = ::generateKernel(lhs, op, rhs, Kernel_t());
     Pooma::scheduler().handOff(iterate);
-#else
-    KernelEvaluator::evaluate(lhs, op, rhs);
-#endif
   }
 };
 
===== r2/src/Evaluator/MultiArgEvaluator.h 1.15 vs edited =====
--- 1.15/r2/src/Evaluator/MultiArgEvaluator.h 2004-07-15 16:47:32 +02:00
+++ edited/r2/src/Evaluator/MultiArgEvaluator.h 2004-08-16 16:45:01 +02:00
@@ -220,14 +220,10 @@
                const Kernel &)
   {
     Kernel kernelf(function, domain);
-#if 1 || POOMA_REORDER_ITERATES
     Pooma::Iterate_t *iterate =
       new MultiArgKernel(a1, kernelf,
                          info.writers(), info.readers());
     Pooma::scheduler().handOff(iterate);
-#else
-    kernelf(a1);
-#endif
   }
 };
 
===== r2/src/Evaluator/Reduction.h 1.10 vs edited =====
--- 1.10/r2/src/Evaluator/Reduction.h 2004-01-29 10:28:58 +01:00
+++ edited/r2/src/Evaluator/Reduction.h 2004-08-16 16:45:20 +02:00
@@ -168,14 +168,9 @@
   {
     typedef typename KernelTag1::Kernel_t Kernel_t;
 
-#if POOMA_REORDER_ITERATES
     Pooma::Iterate_t *iterate =
       new ReductionKernel(ret, op, e, csem);
     Pooma::scheduler().handOff(iterate);
-#else
-    ReductionEvaluator::evaluate(ret, op, e);
-    csem.incr();
-#endif
   }
 
   template
===== r2/src/Pooma/Pooma.cmpl.cpp 1.3 vs edited =====
--- 1.3/r2/src/Pooma/Pooma.cmpl.cpp 2004-01-17 16:20:23 +01:00
+++ edited/r2/src/Pooma/Pooma.cmpl.cpp 2004-08-16 16:47:24 +02:00
@@ -803,10 +803,6 @@
     SystemContext_t::runSomething();
   }
 
-# elif POOMA_REORDER_ITERATES
-
-  CTAssert(NO_SUPPORT_FOR_THREADS_WITH_MESSAGING);
-
 # else
   // we're using the serial scheduler, so we only need to get messages
   while (Pooma::incomingMessages())
===== r2/src/Tulip/SendReceive.h 1.4 vs edited =====
--- 1.4/r2/src/Tulip/SendReceive.h 2004-01-07 09:54:09 +01:00
+++ edited/r2/src/Tulip/SendReceive.h 2004-08-16 16:47:00 +02:00
@@ -93,11 +93,9 @@
 
     hintAffinity(engineFunctor(view_m, DataObjectRequest()));
 
-#if POOMA_REORDER_ITERATES
     // Priority interface was added to r2 version of serial async so that
     // message iterates would run before any other iterates.
     priority(-1);
-#endif
 
     DataObjectRequest writeReq(*this);
     DataObjectRequest readReq(writeReq);
@@ -158,11 +156,9 @@
 
     hintAffinity(engineFunctor(view, DataObjectRequest()));
 
-#if POOMA_REORDER_ITERATES
     // Priority interface was added to r2 version of serial async so that
     // message iterates would run before any other iterates.
     priority(-1);
-#endif
 
     DataObjectRequest writeReq(*this);
     engineFunctor(view, writeReq);
@@ -181,38 +177,12 @@
   // registers a method that gets handled by cheetah when the appropriate
   // message arrives.
 
-#if !POOMA_REORDER_ITERATES
-
-  bool ready_m;
-
-  static void handle(This_t *me, IncomingView &viewMessage)
-  {
-    apply(me->view_m, viewMessage);
-    me->ready_m = true;
-  }
-
-  virtual void run()
-  {
-    ready_m = false;
-    Pooma::remoteEngineHandler()->request(fromContext_m, tag_m,
-                                          This_t::handle, this);
-
-    while (!ready_m)
-    {
-      Pooma::poll();
-    }
-  }
-
-#else
-
   virtual void run()
   {
     Pooma::remoteEngineHandler()->request(fromContext_m, tag_m,
                                           This_t::apply, view_m);
   }
 
-#endif
-
 private:
 
   static void apply(const View &viewLocal, IncomingView &viewMessage)
@@ -302,11 +272,9 @@
 
     hintAffinity(engineFunctor(view_m, DataObjectRequest()));
 
-#if POOMA_REORDER_ITERATES
     // Priority interface was added to r2 version of serial async so that
     // message send iterates would run before any other iterates.
     priority(-1);
-#endif
 
     DataObjectRequest writeReq(*this);
     DataObjectRequest readReq(writeReq);
@@ -384,11 +352,9 @@
 
     hintAffinity(engineFunctor(view, DataObjectRequest()));
 
-#if POOMA_REORDER_ITERATES
     // Priority interface was added to r2 version of serial async so that
     // message receive iterates would run after any other iterates.
     priority(-1);
-#endif
 
     DataObjectRequest writeReq(*this);
     engineFunctor(view, writeReq);

From mark at codesourcery.com Mon Aug 16 15:20:13 2004
From: mark at codesourcery.com (Mark Mitchell)
Date: Mon, 16 Aug 2004 08:20:13 -0700
Subject: [pooma-dev] Re: pooma compilation problems
In-Reply-To:
References:
Message-ID: <4120D0AD.2090200@codesourcery.com>

Richard Guenther wrote:

>On Thu, 5 Aug 2004, Mark Mitchell wrote:
>
>>Steve Nolen wrote:
>>
>>>i am having a problem compiling pooma on a Linux cluster. i am using
>>>gcc_3.4.0 and it is giving a host of errors saying it is encountering
>>>multiply undefined variables. the problem seems to be with derived
>>>templated classes being unable to see the templated member variables of
>>>their parent classes. it can be resolved by explicitly scoping the
>>>variables to their proper base class, but this solution seems suboptimal.
>>>am i doing something incorrect or is gcc304 enforcing a stricter adherence
>>>to the standard?
>>>
>>The latter. I believe that the current sources for POOMA, in CVS, work
>>with GCC 3.4.x.
>>
>
>Maybe we can do another release of the POOMA library. These problems seem
>common, and just redirecting everyone to CVS is not good. At least maybe
>the website could point out possible problems with gcc 3.4 and icpc.
>

Yes, with all of your many improvements, maybe it is time for a POOMA
2.5. I have no particular opinion; you and Jeffrey are better informed
than I.

--
Mark Mitchell
CodeSourcery, LLC
(916) 791-8304
mark at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 15:39:59 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 17:39:59 +0200 (CEST)
Subject: [pooma-dev] Re: pooma compilation problems
In-Reply-To: <4120D0AD.2090200@codesourcery.com>
Message-ID:

On Mon, 16 Aug 2004, Mark Mitchell wrote:

> Richard Guenther wrote:
>
> >On Thu, 5 Aug 2004, Mark Mitchell wrote:
> >
> >>The latter. I believe that the current sources for POOMA, in CVS, work
> >>with GCC 3.4.x.
> >
> >Maybe we can do another release of the POOMA library. These problems seem
> >common, and just redirecting everyone to CVS is not good. At least maybe
> >the website could point out possible problems with gcc 3.4 and icpc.
> >
> Yes, with all of your many improvements, maybe it is time for a POOMA
> 2.5. I have no particular opinion; you and Jeffrey are better informed
> than I.

If we go that route I want to go over my local repositories again to
make sure no fixes are missing and to re-sync the native MPI work.

Richard.

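For readers who hit the same errors: a minimal sketch of the name-lookup rule that GCC 3.4 began enforcing, using hypothetical classes (Base, Derived, DerivedUsing) rather than POOMA code. An unqualified name that does not depend on a template parameter is looked up when the template is defined, and a base class that depends on a template parameter is not searched at that point. Inherited members therefore have to be reached through this->, a qualified name, or a using-declaration.

```cpp
#include <cassert>

// Hypothetical templated base with a data member.
template <class T>
struct Base {
  T value_m;
  explicit Base(T v) : value_m(v) {}
};

template <class T>
struct Derived : Base<T> {
  explicit Derived(T v) : Base<T>(v) {}

  // Writing `return value_m;` here is rejected by a conforming compiler:
  // value_m is a non-dependent name, and the dependent base Base<T> is
  // not searched when the template definition is parsed.

  // Option 1: make the name dependent through `this`.
  T viaThis() const { return this->value_m; }

  // Option 2: qualify with the base class, the workaround described
  // in the original report.
  T viaQualified() const { return Base<T>::value_m; }
};

// Option 3: a using-declaration brings the member into scope once,
// after which the plain name works throughout the class.
template <class T>
struct DerivedUsing : Base<T> {
  using Base<T>::value_m;
  explicit DerivedUsing(T v) : Base<T>(v) {}
  T get() const { return value_m; }
};
```

All three spellings compile with older and newer compilers alike, so code written this way does not need to be conditional on the compiler version.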
--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 19:43:34 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 21:43:34 +0200
Subject: [PATCH] Robustify async MPI request handling
Message-ID: <41210E66.6000305@tat.physik.uni-tuebingen.de>

The following patch fixes an error and robustifies MPI request handling.

Tested by having it in my local tree for a long time.

Ok?

Richard.

2004Aug16 Richard Guenther

    * src/Threads/IterateSchedulers/SerialAsync.h: Guard against
    LAM MPI automatically dragging in C++ support, fix message
    polling return value check, complete messages first, remove
    unused variable.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p
URL:

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 19:57:25 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 21:57:25 +0200
Subject: [PATCH] Improve mpiCC detection
Message-ID: <412111A5.1050704@tat.physik.uni-tuebingen.de>

This patch improves (fixes) mpiCC detection and allows overriding with a
custom compiler (MPICH and LAM MPI supported).

Ok?

Richard.

2004Aug16 Richard Guenther

    * configure: use correct way to detect mpiCC and friends,
    allow overriding of MPICH and LAM used compilers with --cpp.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p2
URL:

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 16 20:16:47 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Aug 2004 22:16:47 +0200
Subject: [RFC] Removing workarounds for pre-ISO C++ compilers
Message-ID: <4121162F.5090104@tat.physik.uni-tuebingen.de>

Would there be any objections to the removal of the workarounds for
pre-ISO C++ compilers like

checking whether we have complete IO manipulators... yes
checking whether we have a standard iosbase class... yes
checking whether we have sstream... yes
checking whether we have a complex inside std... yes
checking whether we support dependent template arguments... yes
checking whether we support delete operators with placement argument... yes
checking whether we handle default args to template functions correct... yes
checking whether we have std::ios_base::fmtflags... yes
checking for support of templated friends... yes
checking numer of template arguments of std::ostream_iterator... 1
checking for std::min(), std::max()... yes
checking for standard conforming iterators... yes

Pretty much any up-to-date compiler handles these correctly today. Also
not all such uses are guarded by the workarounds and I lack a dumb enough
compiler to check their correct usage.

Any thoughts?

Richard.

From oldham at codesourcery.com Mon Aug 16 22:32:14 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Mon, 16 Aug 2004 15:32:14 -0700
Subject: [pooma-dev] Re: Cheetah Status
In-Reply-To:
References:
Message-ID: <412135EE.9070709@codesourcery.com>

Richard Guenther wrote:

>On Sat, 14 Aug 2004, Richard Guenther wrote:
>
>>Jeffrey D. Oldham wrote:
>>
>>>Richard Guenther wrote:
>>>
>>>>Btw. I have some Cheetah fixes myself, I can collect these together
>>>>and maybe we could provide at least a patch for download along the
>>>>cheetah tarball.
>>>>
>>>>Richard.
>>>>
>>>Yes, let's put together your changes and Steve Nolen's changes into a
>>>new Cheetah 1.1.5 release. Will you please send them to me? I'll
>>>create a Cheetah CVS repository.
>>>
>>I'll send my local changes to you on Monday. But I think we shouldn't
>>release until others are reporting success with a set of collected
>>patches. Obviously I only tested MPI and not the various other means of
>>parallelism in the Cheetah library.
>>
>
>Ok, here's my set of local changes to the Cheetah library.
>
>Sometimes Richard Guenther
>
>    * BUGS: new.
>    bin/makeinstall: ignore SCCS dirs.
>    config/LINUXGCC.conf: use -g, not -ggdb.
>    config/LINUXICC.conf: new.
>    configure: don't set shmem_locksrc, build-system
>    is broken.
>    src/Controller/ControllerFactory.cpp: build factory
>    with arg not removed.
>    src/Controller/Group.h: reorder initializers.
>    src/Controller/Shmem/MM_Allocator.h: remove broken
>    method.
>    src/Utilities/CheetahRefCountedPtr.h: const pointer
>    by value makes no sense.
>
>This patch is against the cheetah-1.1.4 tarball.
>

I committed this patch to the new Cheetah repository except for the
configure change. The repository supports shmem because of work I did
this morning.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Mon Aug 16 22:38:13 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Mon, 16 Aug 2004 15:38:13 -0700
Subject: [pooma-dev] Re: Cheetah Status
In-Reply-To:
References:
Message-ID: <41213755.5030409@codesourcery.com>

Richard Guenther wrote:

>On Mon, 16 Aug 2004, Richard Guenther wrote:
>
>
>A few more, from a different repository:
>
>Sometimes Richard Guenther
>
>    * src/Controller/MPI/cheetah_mpi.h: remove stray semicolon.
>    src/MatchingHandler/MatchingAction.h; fix typo.
>
>Richard.
>

Thanks for the patch. I committed the first part. The second part was
already fixed previously by me this morning when testing the newly
created Cheetah repository.

>--
>Richard Guenther
>WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
>
>
>------------------------------------------------------------------------
>
>diff -ur /tmp/cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h
>--- /tmp/cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h Wed Oct 24 22:47:47 2001
>+++ cheetah-1.1.4/src/Controller/MPI/cheetah_mpi.h Fri Dec 19 10:21:07 2003
>@@ -28,7 +28,7 @@
>   class Group;
>   class MPIGroup;
>   class MPIController;
>-};
>+}
>
> #ifdef __cplusplus
> extern "C"
>diff -ur /tmp/cheetah-1.1.4/src/MatchingHandler/MatchingAction.h cheetah-1.1.4/src/MatchingHandler/MatchingAction.h
>--- /tmp/cheetah-1.1.4/src/MatchingHandler/MatchingAction.h Mon Apr 17 22:33:03 2000
>+++ cheetah-1.1.4/src/MatchingHandler/MatchingAction.h Fri Dec 19 10:21:08 2003
>@@ -221,7 +221,7 @@
>
>   inline void operator()()
>   {
>-    handler_m(extra);
>+    handler_m(extra_m);
>   }
>
>   template
>
>

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Mon Aug 16 22:42:24 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Mon, 16 Aug 2004 15:42:24 -0700
Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers
In-Reply-To: <4121162F.5090104@tat.physik.uni-tuebingen.de>
References: <4121162F.5090104@tat.physik.uni-tuebingen.de>
Message-ID: <41213850.5090700@codesourcery.com>

Richard Guenther wrote:

> Would there be any objections to the removal of the workarounds for
> pre-ISO C++ compilers like
>
> checking whether we have complete IO manipulators... yes
> checking whether we have a standard iosbase class... yes
> checking whether we have sstream... yes
> checking whether we have a complex inside std... yes
> checking whether we support dependent template arguments... yes
> checking whether we support delete operators with placement
> argument... yes
> checking whether we handle default args to template functions
> correct...
yes > checking whether we have std::ios_base::fmtflags... yes > checking for support of templated friends... yes > checking numer of template arguments of std::ostream_iterator... 1 > checking for std::min(), std::max()... yes > checking for standard conforming iterators... yes > > Pretty much any up-to-date compiler handles these correctly today. > Also not all such uses are guarded by the workarounds and I lack a > dumb enough compiler to check their correct usage. > > Any thoughts? > > Richard. There are still a lot of gcc 2.95 and related compilers in use today. I prefer to leave them but let them rot unless there is a compelling reason to remove them now. -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Aug 16 22:48:15 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 16 Aug 2004 15:48:15 -0700 Subject: [PATCH] Robustify async MPI request handling In-Reply-To: <41210E66.6000305@tat.physik.uni-tuebingen.de> References: <41210E66.6000305@tat.physik.uni-tuebingen.de> Message-ID: <412139AF.90900@codesourcery.com> Richard Guenther wrote: > The following patch fixes an error and robustifies MPI request handling. > > Tested by having it in my local tree for a long time. > > Ok? > > Richard. > > > 2004Aug16 Richard Guenther > > * src/Threads/IterateSchedulers/SerialAsync.h: Guard against > LAM MPI automatically dragging in C++ support, fix message > polling return value check, complete messages first, remove > unused variable. What problems does mpicxx.h cause? I am both curious and want to know so I can approve the patch. 
>------------------------------------------------------------------------ > >Index: SerialAsync.h >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Threads/IterateSchedulers/SerialAsync.h,v >retrieving revision 1.11 >diff -u -u -r1.11 SerialAsync.h >--- SerialAsync.h 8 Jan 2004 21:45:49 -0000 1.11 >+++ SerialAsync.h 16 Aug 2004 19:22:33 -0000 >@@ -72,6 +72,7 @@ > #include > #include "Pooma/Configuration.h" > #if POOMA_MPI >+# define MPIPP_H // prevent lam mpicxx.h from being included > # include > #endif > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Aug 16 22:50:36 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 16 Aug 2004 15:50:36 -0700 Subject: [PATCH] Improve mpiCC detection In-Reply-To: <412111A5.1050704@tat.physik.uni-tuebingen.de> References: <412111A5.1050704@tat.physik.uni-tuebingen.de> Message-ID: <41213A3C.2020105@codesourcery.com> Richard Guenther wrote: > This patch improves (fixes) mpiCC detection and allows overriding with > a custom compiler (MPICH and LAM MPI supported). > > Ok? > > Richard. > > > 2004Aug16 Richard Guenther > > * configure: use correct way to detect mpiCC and friends, > allow overriding of MPICH and LAM used compilers with > --cpp. Yes, this is useful. I've been working around it recently. -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Aug 16 22:55:07 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 16 Aug 2004 15:55:07 -0700 Subject: [pooma-dev] Re: pooma compilation problems In-Reply-To: References: Message-ID: <41213B4B.4010705@codesourcery.com> Richard Guenther wrote: >On Thu, 5 Aug 2004, Mark Mitchell wrote: > > > >>Steve Nolen wrote: >> >> >> >>>i am having a problem compiling pooma on a Linux cluster. i am using >>>gcc_3.4.0 and it is giving a host of errors saying it is encountering >>>multiply undefined variables. 
the problem seems to be with derived >>>templated classes being unable to see the templated member variables of >>>their parent classes. it can be resolved by explicitly scoping the >>>variables to their proper base class, but this solution seems suboptimal. >>>am i doing something incorrect or is gcc304 enforcing a stricter adherence >>>to the standard? >>> >>> >>> >>> >>The latter. I believe that the current sources for POOMA, in CVS, work >>with GCC 3.4.x. >> >> > >Maybe we can do another release of the POOMA library. These problems seem >common, and just redirecting everyone to CVS is not good. At least maybe >the website could point out possible problems with gcc 3.4 and icpc. > > Yes, it's probably time to bump up the minor version numbers. Let's ensure all the outstanding patches are resolved. Then we'll test the various configurations before making a release. -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Aug 16 23:00:11 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 16 Aug 2004 16:00:11 -0700 Subject: [PATCH] kill POOMA_REORDER_ITERATES In-Reply-To: References: Message-ID: <41213C7B.4050905@codesourcery.com> Richard Guenther wrote: >This patch kills POOMA_REORDER_ITERATES, i.e. assumes it is always set. >It was used inconsistently before anyways and the stub scheduler doesn't >reorder iterates anyway and the SerialAsync does so anyway. So this >reduces source code complexity. > >Not tested, but looks obvious. Ok? > >Richard. > > >2004Aug16 Richard Guenther > > * configure: remove traces of POOMA_REORDER_ITERATES. > src/Evaluator/Evaluator.h: likewise. > src/Evaluator/MultiArgEvaluator.h: likewise. > src/Evaluator/Reduction.h: likewise. > src/Pooma/Pooma.cmpl.cpp: likewise. > src/Tulip/SendReceive.h: likewise. 
> > >------------------------------------------------------------------------ > >===== r2/src/Pooma/Pooma.cmpl.cpp 1.3 vs edited ===== >--- 1.3/r2/src/Pooma/Pooma.cmpl.cpp 2004-01-17 16:20:23 +01:00 >+++ edited/r2/src/Pooma/Pooma.cmpl.cpp 2004-08-16 16:47:24 +02:00 >@@ -803,10 +803,6 @@ > SystemContext_t::runSomething(); > } > >-# elif POOMA_REORDER_ITERATES >- >- CTAssert(NO_SUPPORT_FOR_THREADS_WITH_MESSAGING); >- > # else // we're using the serial scheduler, so we only need to get messages > > while (Pooma::incomingMessages()) > > This change worries me. Doesn't this change the code's meaning? -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Aug 16 23:47:44 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 16 Aug 2004 16:47:44 -0700 Subject: Cheetah CVS Available Message-ID: <412147A0.2020908@codesourcery.com> Cheetah source code is now available via a CVS repository. See http://www.codesourcery.com/pooma/development.html for directions to access the repository. The repository was populated with Cheetah 1.1.4 together with Richard Guenther's patches and my modifications to return support for shared memory parallelism. -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Tue Aug 17 07:12:36 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 17 Aug 2004 09:12:36 +0200 (CEST) Subject: [pooma-dev] Re: [PATCH] Robustify async MPI request handling In-Reply-To: <412139AF.90900@codesourcery.com> Message-ID: On Mon, 16 Aug 2004, Jeffrey D. Oldham wrote: > Richard Guenther wrote: > > > The following patch fixes an error and robustifies MPI request handling. > > > > Tested by having it in my local tree for a long time. > > > > Ok? > > > > Richard. 
> > > > > > 2004Aug16 Richard Guenther > > > > * src/Threads/IterateSchedulers/SerialAsync.h: Guard against > > LAM MPI automatically dragging in C++ support, fix message > > polling return value check, complete messages first, remove > > unused variable. > > What problems does mpicxx.h cause? I am both curious and want to know > so I can approve the patch. The problem is incompatible C++ ABIs for the compiler used to build LAM (gcc 2.95) and the compiler I try to build POOMA with (gcc 3.4), so linking will fail either with ABI problems or missing symbols if not linking the C++ support libraries (as the header somehow manages to pull symbols regardless of not using any of the C++ support). As we don't use any of the MPI C++ API we don't need its declarations either. Other MPI implementations require you to explicitly pull mpicxx.h, but LAM aims to be clever in just doing #ifdef __cplusplus #include #endif which I think is a bug in LAM, but can be easily worked around by us. But I can leave this chunk of the patch out, if you like. Richard. > >------------------------------------------------------------------------ > > > >Index: SerialAsync.h > >=================================================================== > >RCS file: /home/pooma/Repository/r2/src/Threads/IterateSchedulers/SerialAsync.h,v > >retrieving revision 1.11 > >diff -u -u -r1.11 SerialAsync.h > >--- SerialAsync.h 8 Jan 2004 21:45:49 -0000 1.11 > >+++ SerialAsync.h 16 Aug 2004 19:22:33 -0000 > >@@ -72,6 +72,7 @@ > > #include > > #include "Pooma/Configuration.h" > > #if POOMA_MPI > >+# define MPIPP_H // prevent lam mpicxx.h from being included > > # include > > #endif > > > > > > > -- > Jeffrey D. 
Oldham > oldham at codesourcery.com > -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From rguenth at tat.physik.uni-tuebingen.de Tue Aug 17 07:18:39 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 17 Aug 2004 09:18:39 +0200 (CEST) Subject: [PATCH] kill POOMA_REORDER_ITERATES In-Reply-To: <41213C7B.4050905@codesourcery.com> Message-ID: On Mon, 16 Aug 2004, Jeffrey D. Oldham wrote: > Richard Guenther wrote: > > >===== r2/src/Pooma/Pooma.cmpl.cpp 1.3 vs edited ===== > >--- 1.3/r2/src/Pooma/Pooma.cmpl.cpp 2004-01-17 16:20:23 +01:00 > >+++ edited/r2/src/Pooma/Pooma.cmpl.cpp 2004-08-16 16:47:24 +02:00 > >@@ -803,10 +803,6 @@ > > SystemContext_t::runSomething(); > > } > > > >-# elif POOMA_REORDER_ITERATES > >- > >- CTAssert(NO_SUPPORT_FOR_THREADS_WITH_MESSAGING); > >- > > # else // we're using the serial scheduler, so we only need to get messages > > > > while (Pooma::incomingMessages()) > > > > > This change worries me. Doesn't this change the code's meaning? Hm. Looking at the context I suppose not: #if POOMA_CHEETAH # if POOMA_SMARTS_SCHEDULER_SERIALASYNC typedef Smarts::SystemContext SystemContext_t; while (Pooma::incomingMessages() || SystemContext_t::workReady()) { controller_g->poll(); SystemContext_t::runSomething(); } # elif POOMA_REORDER_ITERATES CTAssert(NO_SUPPORT_FOR_THREADS_WITH_MESSAGING); # else // we're using the serial scheduler, so we only need to get messages while (Pooma::incomingMessages()) { controller_g->poll(); } # endif // schedulers #else // !POOMA_CHEETAH mainScheduler_s.blockingEvaluate(); #endif // !POOMA_CHEETAH I think the check for POOMA_REORDER_ITERATES was bogus: with POOMA_SMARTS_SCHEDULER_SERIALASYNC it would have been true, so the first branch is taken, and in the other case (!POOMA_SMARTS_SCHEDULER_SERIALASYNC) it is not set anyway. Checking for threads here would require testing POOMA_THREADS (smarts support) or POOMA_OPENMP (OpenMP support), but this checking is already done at configure time.
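A minimal sketch of the argument above, not POOMA code: it models the #if / #elif / #else chain as a constexpr function and checks that, under the assumption that POOMA_REORDER_ITERATES is set exactly when the SerialAsync scheduler is selected, the CTAssert branch can never be chosen. The names Branch, selectBranch and configured are hypothetical and exist only for this illustration.

```cpp
// Hypothetical model of the scheduler selection in Pooma.cmpl.cpp.
enum class Branch { SerialAsync, ReorderAssert, Serial };

// Mirrors the order of the preprocessor tests:
//   # if POOMA_SMARTS_SCHEDULER_SERIALASYNC
//   # elif POOMA_REORDER_ITERATES
//   # else
constexpr Branch selectBranch(bool serialAsync, bool reorderIterates)
{
  return serialAsync     ? Branch::SerialAsync
       : reorderIterates ? Branch::ReorderAssert
                         : Branch::Serial;
}

// Assumption taken from the discussion: configure sets the reorder flag
// exactly when the SerialAsync scheduler is selected.
constexpr Branch configured(bool serialAsync)
{
  return selectBranch(serialAsync, serialAsync);
}

// Under that assumption, neither configuration reaches the CTAssert branch.
static_assert(configured(true) == Branch::SerialAsync,
              "SerialAsync configuration takes the first branch");
static_assert(configured(false) == Branch::Serial,
              "non-SerialAsync configuration falls through to the serial branch");
```

If the assumption did not hold, i.e. the reorder flag could be set without SerialAsync, selectBranch(false, true) would land in the assert branch, and removing it would change behaviour.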
Also I believe using Cheetah (MPI) with the serial scheduler does not work at all. Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From oldham at codesourcery.com Tue Aug 17 15:25:42 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 17 Aug 2004 08:25:42 -0700 Subject: [pooma-dev] Re: [PATCH] Robustify async MPI request handling In-Reply-To: References: Message-ID: <41222376.5060304@codesourcery.com> Richard Guenther wrote: >On Mon, 16 Aug 2004, Jeffrey D. Oldham wrote: > > > >>Richard Guenther wrote: >> >> >> >>>The following patch fixes an error and robustifies MPI request handling. >>> >>>Tested by having it in my local tree for a long time. >>> >>>Ok? >>> >>>Richard. >>> >>> >>>2004Aug16 Richard Guenther >>> >>> * src/Threads/IterateSchedulers/SerialAsync.h: Guard against >>> LAM MPI automatically dragging in C++ support, fix message >>> polling return value check, complete messages first, remove >>> unused variable. >>> >>> >>What problems does mpicxx.h cause? I am both curious and want to know >>so I can approve the patch. >> >> > >The problem is incompatible C++ ABIs for the compiler used to build LAM >(gcc 2.95) and the compiler I try to build POOMA with (gcc 3.4), so >linking will fail either with ABI problems or missing symbols if not >linking the C++ support libraries (as the header somehow manages to pull >symbols regardless of not using any of the C++ support). As we don't use >any of the MPI C++ API we don't need its declarations either. Other >MPI implementations require you to explicitly pull mpicxx.h, but LAM aims >to be clever in just doing > >#ifdef __cplusplus >#include >#endif > >which I think is a bug in LAM, but can be easily worked around by us. > >But I can leave this chunk of the patch out, if you like. > > I now understand: o mpicxx.h contains the C++ interface to MPI. o Pooma does not use this MPI interface. I am confused about LAM and gcc 2.95 since I sometimes use LAM with gcc 3.x.y. 
If your problem goes away by using gcc 3.4 with LAM, let's omit this special-purpose code and commit the rest of this patch. Otherwise, the entire patch is fine. >Richard. > > > >>>------------------------------------------------------------------------ >>> >>>Index: SerialAsync.h >>>=================================================================== >>>RCS file: /home/pooma/Repository/r2/src/Threads/IterateSchedulers/SerialAsync.h,v >>>retrieving revision 1.11 >>>diff -u -u -r1.11 SerialAsync.h >>>--- SerialAsync.h 8 Jan 2004 21:45:49 -0000 1.11 >>>+++ SerialAsync.h 16 Aug 2004 19:22:33 -0000 >>>@@ -72,6 +72,7 @@ >>>#include >>>#include "Pooma/Configuration.h" >>>#if POOMA_MPI >>>+# define MPIPP_H // prevent lam mpicxx.h from being included >>># include >>>#endif >>> >>> >>> >>> >>-- >>Jeffrey D. Oldham >>oldham at codesourcery.com >> >> >> > >-- >Richard Guenther >WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ > > > -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Tue Aug 17 15:31:57 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 17 Aug 2004 17:31:57 +0200 Subject: [pooma-dev] Re: [PATCH] Robustify async MPI request handling In-Reply-To: <41222376.5060304@codesourcery.com> References: <41222376.5060304@codesourcery.com> Message-ID: <412224ED.8010601@tat.physik.uni-tuebingen.de> Jeffrey D. Oldham wrote: > Richard Guenther wrote: > >> On Mon, 16 Aug 2004, Jeffrey D. Oldham wrote: >> >> >> >>> Richard Guenther wrote: >>> >>> >>> >>>> The following patch fixes an error and robustifies MPI request >>>> handling. >>>> >>>> Tested by having it in my local tree for a long time. >>>> >>>> Ok? >>>> >>>> Richard. >>>> >>>> >>>> 2004Aug16 Richard Guenther >>>> >>>> * src/Threads/IterateSchedulers/SerialAsync.h: Guard against >>>> LAM MPI automatically dragging in C++ support, fix message >>>> polling return value check, complete messages first, remove >>>> unused variable. 
>>>> >>> >>> What problems does mpicxx.h cause? I am both curious and want to know >>> so I can approve the patch. >>> >> >> >> The problem is incompatible C++ ABIs for the compiler used to build LAM >> (gcc 2.95) and the compiler I try to build POOMA with (gcc 3.4), so >> linking will fail either with ABI problems or missing symbols if not >> linking the C++ support libraries (as the header somehow manages to pull >> symbols regardless of not using any of the C++ support). As we don't use >> any of the MPI C++ API we don't need its declarations either. Other >> MPI implementations require you to explicitly pull mpicxx.h, but LAM aims >> to be clever in just doing >> >> #ifdef __cplusplus >> #include >> #endif >> >> which I think is a bug in LAM, but can be easily worked around by us. >> >> But I can leave this chunk of the patch out, if you like. >> >> > I now understand: > o mpicxx.h contains the C++ interface to MPI. > o Pooma does not use this MPI interface. > > I am confused about LAM and gcc 2.95 since I sometimes use LAM with gcc > 3.x.y. If your problem goes away by using gcc 3.4 with LAM, let's omit > this special-purpose code and commit the rest of this patch. Otherwise, > the entire patch is fine. The problem should be going away as long as POOMA is built with the same compiler as LAM was. So I'll commit without this chunk for now. Thanks, Richard. From rguenth at tat.physik.uni-tuebingen.de Tue Aug 17 15:37:14 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 17 Aug 2004 17:37:14 +0200 Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers In-Reply-To: <41213850.5090700@codesourcery.com> References: <4121162F.5090104@tat.physik.uni-tuebingen.de> <41213850.5090700@codesourcery.com> Message-ID: <4122262A.8090203@tat.physik.uni-tuebingen.de> Jeffrey D. 
Oldham wrote: > Richard Guenther wrote: > >> Would there be any objections to the removal of the workarounds for >> pre-ISO C++ compilers like >> >> Pretty much any up-to-date compiler handles these correctly today. >> Also not all such uses are guarded by the workarounds and I lack a >> dumb enough compiler to check their correct usage. >> >> Any thoughts? >> >> Richard. > > > There are still a lot of gcc 2.95 and related compilers in use today. I > prefer to leave them but let them rot unless there is a compelling > reason to remove them now. I see. I'd remove them only to unclutter the source and maybe increase maintainability if formally stating we require an ISO conformant compiler. Oh - we do so already: This version incorporates other minor source code changes to support compilation using g++ version 3.1 and some improvements to POOMA Fields. Compilation using g++ version 2.96 is no longer supported. g++ version 3.1 is freely available at http://gcc.gnu.org/. POOMA has also been tested using KAI C++ 4.0e. Richard. From oldham at codesourcery.com Tue Aug 17 16:02:16 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 17 Aug 2004 09:02:16 -0700 Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers In-Reply-To: <4122262A.8090203@tat.physik.uni-tuebingen.de> References: <4121162F.5090104@tat.physik.uni-tuebingen.de> <41213850.5090700@codesourcery.com> <4122262A.8090203@tat.physik.uni-tuebingen.de> Message-ID: <41222C08.9070506@codesourcery.com> Richard Guenther wrote: > Jeffrey D. Oldham wrote: > >> Richard Guenther wrote: >> >>> Would there be any objections to the removal of the workarounds for >>> pre-ISO C++ compilers like >>> >>> Pretty much any up-to-date compiler handles these correctly today. >>> Also not all such uses are guarded by the workarounds and I lack a >>> dumb enough compiler to check their correct usage. >>> >>> Any thoughts? >>> >>> Richard. 
>> >> >> >> There are still a lot of gcc 2.95 and related compilers in use >> today. I prefer to leave them but let them rot unless there is a >> compelling reason to remove them now. > > > I see. I'd remove them only to unclutter the source and maybe > increase maintainability if formally stating we require an ISO > conformant compiler. Oh - we do so already: > > > This version incorporates other minor source code changes to support > compilation using g++ version 3.1 and some improvements to POOMA > Fields. Compilation using g++ version 2.96 is no longer supported. > g++ version 3.1 is freely available at http://gcc.gnu.org/. POOMA has > also been tested using KAI C++ 4.0e. > > > Richard. Good point. Support for gcc 3.4 differs from support for gcc 3.x.y, x < 4, because 3.4 will correctly parse some constructs that gcc 3.x.y does not. What do you prefer we write in the README for a Pooma 2.5 release? That should drive our code changes. Work on VSIPL++ demonstrates that some templated C++ code that gcc 3.4 easily supports still breaks other compilers. For example, IBM Visual Age 6 (xlc++) can have difficulty parsing with template arguments. Intel C++ 8.0 for IA64, which I believe is the descendant of KAI C++, has trouble with template functions defined outside template classes. -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Tue Aug 17 16:59:52 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 17 Aug 2004 18:59:52 +0200 Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers In-Reply-To: <41222C08.9070506@codesourcery.com> References: <4121162F.5090104@tat.physik.uni-tuebingen.de> <41213850.5090700@codesourcery.com> <4122262A.8090203@tat.physik.uni-tuebingen.de> <41222C08.9070506@codesourcery.com> Message-ID: <41223988.2090104@tat.physik.uni-tuebingen.de> Jeffrey D. Oldham wrote: > Richard Guenther wrote: > >> Jeffrey D. 
Oldham wrote: >> >>> There are still a lot of gcc 2.95 and related compilers in use >>> today. I prefer to leave them but let them rot unless there is a >>> compelling reason to remove them now. >> >> >> >> I see. I'd remove them only to unclutter the source and maybe >> increase maintainability if formally stating we require an ISO >> conformant compiler. Oh - we do so already: >> >> >> This version incorporates other minor source code changes to support >> compilation using g++ version 3.1 and some improvements to POOMA >> Fields. Compilation using g++ version 2.96 is no longer supported. >> g++ version 3.1 is freely available at http://gcc.gnu.org/. POOMA has >> also been tested using KAI C++ 4.0e. >> >> >> Richard. > > > Good point. Support for gcc 3.4 differs from support for gcc 3.x.y, x < > 4, because 3.4 will correctly parse some constructs that gcc 3.x.y does > not. What do you prefer we write in the README for a Pooma 2.5 > release? That should drive our code changes. I think we should state that we require an ISO standard conforming compiler and standard library. But we should restrict ourselves to using those parts of the standard that are supported by all recent compilers (gcc 3.3, Intel 7.2). I.e. we don't use template template parameters. But working around a missing std::min/max or std::complex, or having to write code like this in Utilities/Algorithms.h: template inline #if POOMA_NONSTANDARD_ITERATOR typename std::iterator_traits::distance_type #else typename std::iterator_traits::difference_type #endif delete_backfill(DataIterator data_begin, DataIterator data_end, const KillIterator kill_begin, const KillIterator kill_end, #if POOMA_NONSTANDARD_ITERATOR typename std::iterator_traits::distance_type offset = 0) #else typename std::iterator_traits::difference_type offset = 0) #endif { ... doesn't help maintainability either. > Work on VSIPL++ demonstrates that some templated C++ code that gcc 3.4 > easily supports still breaks other compilers.
For example, IBM Visual > Age 6 (xlc++) can have difficulty parsing with template arguments. > Intel C++ 8.0 for IA64, which I believe is the descendant of KAI C++, > has trouble with template functions defined outside template classes. I think we should identify a set of compilers we can test compatibility with ourselves and formally state we require ISO conformance. We then can list a set of tested compilers along with test results for them. A document describing our preferred coding style along with the usable language subset would be greatly appreciated, too. I can start a coding style / conformance document and produce an initial readme for an upcoming release if you like. Richard. From oldham at codesourcery.com Tue Aug 17 17:33:09 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 17 Aug 2004 10:33:09 -0700 Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers In-Reply-To: <41223988.2090104@tat.physik.uni-tuebingen.de> References: <4121162F.5090104@tat.physik.uni-tuebingen.de> <41213850.5090700@codesourcery.com> <4122262A.8090203@tat.physik.uni-tuebingen.de> <41222C08.9070506@codesourcery.com> <41223988.2090104@tat.physik.uni-tuebingen.de> Message-ID: <41224155.1070007@codesourcery.com> Richard Guenther wrote: > Jeffrey D. Oldham wrote: > >> Richard Guenther wrote: >> >>> Jeffrey D. Oldham wrote: >>> >>>> There are still a lot of gcc 2.95 and related compilers in use >>>> today. I prefer to leave them but let them rot unless there is a >>>> compelling reason to remove them now. >>> >>> >>> >>> >>> I see. I'd remove them only to unclutter the source and maybe >>> increase maintainability if formally stating we require an ISO >>> conformant compiler. Oh - we do so already: >>> >>> >>> This version incorporates other minor source code changes to support >>> compilation using g++ version 3.1 and some improvements to POOMA >>> Fields. Compilation using g++ version 2.96 is no longer supported.
>>> g++ version 3.1 is freely available at http://gcc.gnu.org/. POOMA has >>> also been tested using KAI C++ 4.0e. >>> >>> >>> Richard. >> >> >> >> Good point. Support for gcc 3.4 differs from support for gcc 3.x.y, x >> < 4, because 3.4 will correctly parse some constructs that gcc 3.x.y >> does not. What do you prefer we write in the README for a Pooma 2.5 >> release? That should drive our code changes. > > > I think we should state that we require a ISO standard conforming > compiler and standard library. But we should restrict ourselves to > using those parts of the standard that are supported by all recent > compilers (gcc 3.3, Intel 7.2). I.e. we don't use template template > parameters. > > But working around missing std::min/max or std::complex. Requiring to > code like Utilities/Algorithms.h: > > template > inline > #if POOMA_NONSTANDARD_ITERATOR > typename std::iterator_traits::distance_type > #else > typename std::iterator_traits::difference_type > #endif > delete_backfill(DataIterator data_begin, DataIterator data_end, > const KillIterator kill_begin, const KillIterator kill_end, > #if POOMA_NONSTANDARD_ITERATOR > typename std::iterator_traits::distance_type offset = 0) > #else > typename std::iterator_traits::difference_type offset = 0) > #endif > { > ... > > doesn't help maintainability either. > >> Work on VSIPL++ demonstrates that some templated C++ code that gcc 3.4 >> easily supports still breaks other compilers. For example, IBM Visual >> Age 6 (xlc++) can have difficulty parsing with template arguments. >> Intel C++ 8.0 for IA64, which I believe is the descendant of KAI C++, >> has trouble with template functions defined outside template classes. > > > I think we should identify a set of compilers we can test compatibility > with ourselves and formally state we require ISO conformance. We then > can list a set of tested compilers along with test results for them. 
A > document describing our preferred coding style along with usable > language subset would be greatly appreciated, too. > > I can start a coding style / conformance document and produce an initial > readme for an upcoming release if you like. > > Richard. Yes, I think this is a good approach, but it's probably sufficient for now to write one or two paragraphs in the README file describing compilation requirements. A description of compilation and conformance requirements is probably more important, easier to write, and more useful to users than a coding style document, so I would prefer to put our energies into the former if we do not have energy for both. gcc 3.3 and gcc 3.4 differ significantly in parsing because 3.3 uses an LALR-based parser while 3.4 uses recursive descent. The difference is more than just template-template parameters. Despite this, I think we should support gcc 3.2 or 3.3 if we can. The amount of testing is non-trivial since we have several variables: serial v. distributed; distributed: MPI-only, Cheetah+MPI, Cheetah+MM; various compilers. I would prefer to keep the compiler list relatively short, containing only the most popular compliant compilers. Would you be willing to start modifying the README file for a release? -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Tue Aug 17 17:47:06 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 17 Aug 2004 19:47:06 +0200 Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers In-Reply-To: <41224155.1070007@codesourcery.com> References: <4121162F.5090104@tat.physik.uni-tuebingen.de> <41213850.5090700@codesourcery.com> <4122262A.8090203@tat.physik.uni-tuebingen.de> <41222C08.9070506@codesourcery.com> <41223988.2090104@tat.physik.uni-tuebingen.de> <41224155.1070007@codesourcery.com> Message-ID: <4122449A.1080209@tat.physik.uni-tuebingen.de> Jeffrey D. Oldham wrote: > Richard Guenther wrote: > >> Jeffrey D.
Oldham wrote: >> >>> Richard Guenther wrote: >>> >>>> Jeffrey D. Oldham wrote: >>>> >>>>> There are still a lot of gcc 2.95 and related compilers in use >>>>> today. I prefer to leave them but let them rot unless there is a >>>>> compelling reason to remove them now. >>>> >>>> >>>> >>>> >>>> >>>> I see. I'd remove them only to unclutter the source and maybe >>>> increase maintainability if formally stating we require an ISO >>>> conformant compiler. Oh - we do so already: >>>> >>>> >>>> This version incorporates other minor source code changes to support >>>> compilation using g++ version 3.1 and some improvements to POOMA >>>> Fields. Compilation using g++ version 2.96 is no longer supported. >>>> g++ version 3.1 is freely available at http://gcc.gnu.org/. POOMA has >>>> also been tested using KAI C++ 4.0e. >>>> >>>> >>>> Richard. >>> >>> >>> >>> >>> Good point. Support for gcc 3.4 differs from support for gcc 3.x.y, >>> x < 4, because 3.4 will correctly parse some constructs that gcc >>> 3.x.y does not. What do you prefer we write in the README for a >>> Pooma 2.5 release? That should drive our code changes. >> >> >> >> I think we should state that we require a ISO standard conforming >> compiler and standard library. But we should restrict ourselves to >> using those parts of the standard that are supported by all recent >> compilers (gcc 3.3, Intel 7.2). I.e. we don't use template template >> parameters. >> >> But working around missing std::min/max or std::complex. 
Requiring to
>> code like Utilities/Algorithms.h:
>>
>> template
>> inline
>> #if POOMA_NONSTANDARD_ITERATOR
>> typename std::iterator_traits::distance_type
>> #else
>> typename std::iterator_traits::difference_type
>> #endif
>> delete_backfill(DataIterator data_begin, DataIterator data_end,
>>     const KillIterator kill_begin, const KillIterator kill_end,
>> #if POOMA_NONSTANDARD_ITERATOR
>>     typename std::iterator_traits::distance_type offset = 0)
>> #else
>>     typename std::iterator_traits::difference_type offset
>> = 0)
>> #endif
>> {
>> ...
>>
>> doesn't help maintainability either.
>>
>>> Work on VSIPL++ demonstrates that some templated C++ code that gcc
>>> 3.4 easily supports still breaks other compilers.  For example, IBM
>>> Visual Age 6 (xlc++) can have difficulty parsing with template
>>> arguments.  Intel C++ 8.0 for IA64, which I believe is the descendant
>>> of KAI C++, has trouble with template functions defined outside
>>> template classes.
>>
>> I think we should identify a set of compilers we can test
>> compatibility with ourselves and formally state we require ISO
>> conformance.  We then can list a set of tested compilers along with
>> test results for them.  A document describing our preferred coding
>> style along with usable language subset would be greatly appreciated,
>> too.
>>
>> I can start a coding style / conformance document and produce an
>> initial readme for an upcoming release if you like.
>>
>> Richard.
>
> Yes, I think this is a good approach, but it's probably sufficient for
> now to write one or two paragraphs in the README file describing
> compilation requirements.  Compilation and conformance is probably more
> important and easier to write and more useful to user than a coding
> style document so I would prefer to put our energies into the former if
> we do not have energy for both.
>
> gcc 3.3 and gcc 3.4 differ significantly in parsing because 3.3 uses an
> LALR-based parser while 3.4 uses recursive descent.
The difference is
> more than just template-template parameters.  Despite this, I think we
> should support gcc 3.2 or 3.3 if we can.
>
> The amount of testing is non-trivial since we have several variables:
>
>    serial v. distributed
>    distributed: MPI-only, Cheetah+MPI, Cheetah+MM
>    various compilers
>
> I would prefer to keep the compiler list relatively short and containing
> the most popular compliant compilers.
>
> Would you be willing to start modifying the README file for a release?

Yes, I can start writing up something along with removing notes for
releases preceding 2.4 - the information therein is somewhat misleading
now (maybe we can rotate the README file into docs/README-2.3).

Richard.

From oldham at codesourcery.com Tue Aug 17 19:57:08 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Tue, 17 Aug 2004 12:57:08 -0700
Subject: [PATCH] kill POOMA_REORDER_ITERATES
In-Reply-To:
References:
Message-ID: <41226314.2020106@codesourcery.com>

Richard Guenther wrote:

>On Mon, 16 Aug 2004, Jeffrey D. Oldham wrote:
>
>>Richard Guenther wrote:
>>
>>>===== r2/src/Pooma/Pooma.cmpl.cpp 1.3 vs edited =====
>>>--- 1.3/r2/src/Pooma/Pooma.cmpl.cpp 2004-01-17 16:20:23 +01:00
>>>+++ edited/r2/src/Pooma/Pooma.cmpl.cpp 2004-08-16 16:47:24 +02:00
>>>@@ -803,10 +803,6 @@
>>>     SystemContext_t::runSomething();
>>>   }
>>>
>>>-# elif POOMA_REORDER_ITERATES
>>>-
>>>-  CTAssert(NO_SUPPORT_FOR_THREADS_WITH_MESSAGING);
>>>-
>>># else // we're using the serial scheduler, so we only need to get messages
>>>
>>>   while (Pooma::incomingMessages())
>>>
>>This change worries me.  Doesn't this change the code's meaning?
>>
>
>Hm.
Looking at the context I suppose not:
>
>#if POOMA_CHEETAH
>
># if POOMA_SMARTS_SCHEDULER_SERIALASYNC
>
>  typedef Smarts::SystemContext SystemContext_t;
>
>  while (Pooma::incomingMessages() || SystemContext_t::workReady())
>  {
>    controller_g->poll();
>    SystemContext_t::runSomething();
>  }
>
># elif POOMA_REORDER_ITERATES
>
>  CTAssert(NO_SUPPORT_FOR_THREADS_WITH_MESSAGING);
>
># else // we're using the serial scheduler, so we only need to get messages
>
>  while (Pooma::incomingMessages())
>  {
>    controller_g->poll();
>  }
>
># endif // schedulers
>
>#else // !POOMA_CHEETAH
>
>  mainScheduler_s.blockingEvaluate();
>
>#endif // !POOMA_CHEETAH
>
>
>I think the check for POOMA_REORDER_ITERATES was bogus, as for
>POOMA_SMARTS_SCHEDULER_SERIALASYNC it would have been true and
>in the other case (!POOMA_SMARTS_SCHEDULER_SERIALASYNC) not set
>anyways.  To check for threads here would need checking POOMA_THREADS
>(smarts support) or POOMA_OPENMP (OpenMP support), but this checking
>is already done at configure time.
>
>Also I believe using Cheetah (MPI) with the serial scheduler does not work
>at all.
>

I guess it's OK.

-- 
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Wed Aug 18 09:49:29 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Wed, 18 Aug 2004 11:49:29 +0200 (CEST)
Subject: [PATCH] Fix ScalarCode with expression arguments
Message-ID:

This patch fixes passing expressions as (read-only) arguments to
ScalarCode.  Before this patch there were several problems with this:
 - updating of the engine state did not handle expression engines
 - internal guards were not updated correctly

With the above fixed, ScalarCode also gains from the previous guard
layer update optimizations.

Tested with all ScalarCode tests and Evaluator tests.

Ok?

Richard.

Btw. the test is evaluatorTest10 - patches to merge other tests are on
the way.
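
The "internal guards" point can be pictured with a small standalone sketch.
It is not POOMA code: Range, Extent, grow, intersect and cellsRead are
hypothetical 1-D stand-ins for Interval, GuardLayers and the intersector
logic, assumed here only to show the idea that a read-only argument is
looked at over the evaluation domain widened by the stencil extent when
guards are in use, and over the plain domain otherwise.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical 1-D stand-ins, not POOMA's Interval<Dim> or GuardLayers<Dim>.
struct Range  { int first; int last; };    // closed index range [first, last]
struct Extent { int lower; int upper; };   // stencil reach below / above a cell

// Widen a range by the stencil reach on both sides.
inline Range grow(const Range &r, const Extent &e)
{
  assert(e.lower >= 0 && e.upper >= 0);
  Range g = { r.first - e.lower, r.last + e.upper };
  return g;
}

// Overlap of two ranges; the result is empty when last < first.
inline Range intersect(const Range &a, const Range &b)
{
  Range i = { std::max(a.first, b.first), std::min(a.last, b.last) };
  return i;
}

// Cells of a read-only argument that must be valid before evaluating over
// evalDomain: the plain domain when guards are unused, otherwise the domain
// widened by the stencil extent, clipped to what the argument stores.
inline Range cellsRead(const Range &stored, const Range &evalDomain,
                       const Extent &stencil, bool useGuards)
{
  const Range needed = useGuards ? grow(evalDomain, stencil) : evalDomain;
  return intersect(stored, needed);
}
```

For an evaluation domain of [0, 7], a stencil reach of one cell and storage
covering [-1, 8], this sketch asks for [-1, 8] when guards are in use and
for [0, 7] when they are not.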
2004Aug18 Richard Guenther * src/Evaluator/MultiArgEvaluator.h: handle expression engines in EngineWriteNotifier, pass stencil extent to SimpleIntersector. src/Evaluator/SimpleIntersector.h: honour stencil extent, recursively intersect and update expression engines. src/Evaluator/tests/evaluatorTest10.cpp: new. -------------- next part -------------- Index: MultiArgEvaluator.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Evaluator/MultiArgEvaluator.h,v retrieving revision 1.14 diff -u -u -r1.14 MultiArgEvaluator.h --- MultiArgEvaluator.h 21 Nov 2003 17:36:10 -0000 1.14 +++ MultiArgEvaluator.h 18 Aug 2004 09:41:57 -0000 @@ -74,6 +74,8 @@ //----------------------------------------------------------------------------- template struct MultiArgEvaluatorTag; +template class Field; +template class Array; /** * Implements: MultiArgEvaluator::evaluate @@ -111,19 +113,30 @@ } template - void operator()(const A &a, bool f) const + void operator()(const A &a) const { - if (f) - { - // This isn't quite what we want here, because we may want to - // write to a field containing multiple centering engines. - // Need to rewrite notifyEngineWrite as an ExpressionApply, - // and create a version of ExpressionApply that goes through - // all the engines in a field. + // This isn't quite what we want here, because we may want to + // write to a field containing multiple centering engines. + // Need to rewrite notifyEngineWrite as an ExpressionApply, + // and create a version of ExpressionApply that goes through + // all the engines in a field. 
- notifyEngineWrite(a.engine()); - dirtyRelations(a, WrappedInt()); - } + notifyEngineWrite(a.engine()); + dirtyRelations(a, WrappedInt()); + } + + // overload for ExpressionTag engines to not fall on our faces compile time + template + void operator()(const Field >&) const + { + // we must be able to compile this, but never execute + PInsist(false, "writing to expression engine?"); + } + template + void operator()(const Array >&) const + { + // we must be able to compile this, but never execute + PInsist(false, "writing to expression engine?"); } }; @@ -172,7 +185,7 @@ MultiArgEvaluator::evaluate(multiArg, function, domain, info, kernel); - applyMultiArg(multiArg, EngineWriteNotifier(), info.writers()); + applyMultiArgIf(multiArg, EngineWriteNotifier(), info.writers()); Pooma::endExpression(); } @@ -265,7 +278,12 @@ const Kernel &kernel) { typedef SimpleIntersector Inter_t; - Inter_t inter(domain); + GuardLayers extent; + for (int i=0; i Inter_t; - Inter_t inter(domain); + GuardLayers extent; + for (int i=0; i &domain) - : seenFirst_m(false), domain_m(domain) + inline SimpleIntersectorData(const Interval &domain, const GuardLayers &extent) + : seenFirst_m(false), domain_m(domain), extent_m(extent) { } @@ -105,9 +105,10 @@ inline ~SimpleIntersectorData() { } template - void intersect(const Engine &engine) + void intersect(const Engine &engine, bool useGuards) { typedef typename Engine::Layout_t Layout_t; + typedef typename NewEngine >::Type_t NewEngine_t; const Layout_t &layout(engine.layout()); // add an assertion that all layouts have the same base (probably @@ -126,6 +127,15 @@ { shared(layout.ID(), firstID_m); } + // We need to process possible expression engines with different + // guard needs here. Modeled after StencilIntersector. 
+ if (useGuards) { + expressionApply(NewEngine_t(engine, grow(domain_m, extent_m)), + IntersectorTag >(lhsi_m)); + } else { + expressionApply(NewEngine_t(engine, domain_m), + IntersectorTag >(lhsi_m)); + } } inline @@ -149,10 +159,14 @@ INodeContainer_t inodes_m; GlobalIDDataBase gidStore_m; Interval domain_m; + GuardLayers extent_m; + Intersector lhsi_m; }; /** - * This intersector handles matching layouts only. + * This intersector handles matching layouts only. It also assumes you + * know in advance the amount of guards used. But it allows differentiating + * between engines that use or do not use guards. * * It doesnt intersect individual layouts but is done with creating INodes * from the first layout it sees by intersecting with the domain. @@ -179,8 +193,8 @@ enum { dimensions = Dim }; - SimpleIntersector(const Interval &domain) - : pdata_m(new SimpleIntersectorData_t(domain)), useGuards_m(true) + SimpleIntersector(const Interval &domain, const GuardLayers &extent) + : pdata_m(new SimpleIntersectorData_t(domain, extent)), useGuards_m(true) { } SimpleIntersector(const This_t &model) @@ -189,8 +203,10 @@ This_t &operator=(const This_t &model) { - if (this != &model) + if (this != &model) { pdata_m = model.pdata_m; + useGuards_m = model.useGuards_m; + } return *this; } @@ -221,7 +237,8 @@ inline void intersect(const Engine &l) const { - data()->intersect(l); + data()->intersect(l, useGuards()); + } inline @@ -236,7 +253,7 @@ useGuards_m = f; } - // Interface to be used by applyNode() + // Interface to be used by applyMultiArg() template void operator()(const A &a, bool f) const @@ -284,39 +301,39 @@ // with the enclosed intersector. 
//--------------------------------------------------------------------------- -template +template struct LeafFunctor >, - ExpressionApply > > + ExpressionApply > > { typedef int Type_t; static Type_t apply(const Engine > &engine, - const ExpressionApply > &apply) + const ExpressionApply > &apply) { apply.tag().intersect(engine); if (apply.tag().useGuards()) - engine.fillGuards(); + engine.fillGuards(apply.tag().data()->extent_m); return 0; } }; -template +template struct LeafFunctor >, - ExpressionApply > > + ExpressionApply > > { typedef int Type_t; static Type_t apply(const Engine > &engine, - const ExpressionApply > &apply) + const ExpressionApply > &apply) { apply.tag().intersect(engine); if (apply.tag().useGuards()) - engine.fillGuards(); + engine.fillGuards(apply.tag().data()->extent_m); return 0; } --- /dev/null Tue May 18 17:20:27 2004 +++ tests/evaluatorTest10.cpp Wed Aug 18 11:19:58 2004 @@ -0,0 +1,108 @@ +// -*- C++ -*- +// ACL:license +// ---------------------------------------------------------------------- +// This software and ancillary information (herein called "SOFTWARE") +// called POOMA (Parallel Object-Oriented Methods and Applications) is +// made available under the terms described here. The SOFTWARE has been +// approved for release with associated LA-CC Number LA-CC-98-65. +// +// Unless otherwise indicated, this SOFTWARE has been authored by an +// employee or employees of the University of California, operator of the +// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with +// the U.S. Department of Energy. The U.S. Government has rights to use, +// reproduce, and distribute this SOFTWARE. The public may copy, distribute, +// prepare derivative works and publicly display this SOFTWARE without +// charge, provided that this Notice and any statement of authorship are +// reproduced on all copies. 
Neither the Government nor the University +// makes any warranty, express or implied, or assumes any liability or +// responsibility for the use of this SOFTWARE. +// +// If SOFTWARE is modified to produce derivative works, such modified +// SOFTWARE should be clearly marked, so as not to confuse it with the +// version available from LANL. +// +// For more information about POOMA, send e-mail to pooma at acl.lanl.gov, +// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. +// ---------------------------------------------------------------------- +// ACL:license + +//----------------------------------------------------------------------------- +// evaluatorTest5 - testing ScalarCode and expression arguments +//----------------------------------------------------------------------------- + +#include "Pooma/Pooma.h" +#include "Pooma/Arrays.h" +#include "Evaluator/ScalarCode.h" +#include "Utilities/Tester.h" +#include + + +// ScalarCode just evaluating/assigning an expression + +struct EvaluateExpr +{ + EvaluateExpr() {} + + template + inline void operator()(const LHS &a, const RHS &b, const Loc<1> &i) const + { + a(i) = b.read(i); + } + + void scalarCodeInfo(ScalarCodeInfo& i) const + { + i.arguments(2); + i.dimensions(1); + i.write(0, true); + i.write(1, false); + i.useGuards(0, false); + i.useGuards(1, false); + } +}; + + +int main(int argc, char *argv[]) +{ + // Initialize POOMA and output stream, using Tester class + Pooma::initialize(argc, argv); + Pooma::Tester tester(argc, argv); + + Pooma::blockingExpressions(true); + + Interval<1> domain(8); + UniformGridLayout<1> layout(domain, Loc<1>(2), GuardLayers<1>(1), DistributedTag()); + + Array<1, int, MultiPatch > > + a(layout), b(layout), c(layout); + + a = 0; + b = 1; + c = 2; + ScalarCode()(a, c-b); + tester.check("a = c - b", all(a(domain) == 1)); + tester.out() << a(domain) << std::endl; + + a = 0; + ScalarCode()(a, b(domain-1)+c(domain+1)); + tester.check("a = b(i-1) + c(i+1)", all(a(domain) == 
3)); + tester.out() << a(domain) << std::endl; + + tester.out() << "Manually triggering igc fill" << std::endl; + b.engine().fillGuards(); + c.engine().fillGuards(); + a = 0; + ScalarCode()(a, b(domain-1)+c(domain+1)); + tester.check("a = b(i-1) + c(i+1)", all(a(domain) == 3)); + tester.out() << a(domain) << std::endl; + + int retval = tester.results("evaluatorTest10 (ScalarCode with expressions)"); + Pooma::finalize(); + return retval; +} + +// ACL:rcsinfo +// ---------------------------------------------------------------------- +// $RCSfile: evaluatorTest5.cpp,v $ $Author: pooma $ +// $Revision: 1.1 $ $Date: 2003/02/20 16:39:42 $ +// ---------------------------------------------------------------------- +// ACL:rcsinfo From rguenth at tat.physik.uni-tuebingen.de Wed Aug 18 10:03:10 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 18 Aug 2004 12:03:10 +0200 (CEST) Subject: [PATCH] Allow custom evaluation domain for ScalarCode Message-ID: This patch adds the ability to provide a custom evaluation domain for a ScalarCode expression (like including external guards or excluding the boundary from vertex centered fields). This is much less fragile than trying to pass appropriate views as arguments. Tested with Evaluator and ScalarCode tests. Ok? Richard. 2004Aug18 Richard Guenther * src/Evaluator/ScalarCode.h: add variants of operator() with specified evaluation domain. src/Evaluator/tests/evaluatorTest9.cpp: new. -------------- next part -------------- Index: ScalarCode.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Evaluator/ScalarCode.h,v retrieving revision 1.13 diff -u -u -r1.13 ScalarCode.h --- ScalarCode.h 7 Apr 2004 16:38:23 -0000 1.13 +++ ScalarCode.h 18 Aug 2004 09:52:47 -0000 @@ -391,6 +391,19 @@ Interval domain_m; }; + +/** + * ScalarCode is a Stencil like operation that allows for more than one + * field to be operated on. 
Generally the functor is a local (set of) + * function(s) which could be described as + * + * (f1..fM) = op(fM+1..fN) + * + * where fM+1 to fN are input fields read from and f1 to fM are output + * fields written to (this distinction nor its ordering is strictly + * required, but both will result in the least possible surprises). + */ + template struct ScalarCode { @@ -427,113 +440,149 @@ return f.centeringSize() == 1 && f.numMaterials() == 1; } + /// @name Evaluators + /// Evaluate the ScalarCode functor on the fields f1 to fN using the + /// specified evaluation domain. Note that views of the evaluation domain + /// are taken of every field, so domains of the fields should be strictly + /// conforming (in fact, passing views to these operators is a bug unless + /// you really know what you are doing). + /// + /// The evaluation domain defaults to the physical domain of + /// the first field which should usually be (on of) the left hand side(s). + /// If you want the functor to operate on a different domain use the + /// operators with the explicit specified evaluation domain. 
+ //@{ + template - void operator()(const F1 &f1) const + void operator()(const F1 &f1, const Interval &evalDom) const { PAssert(checkValidity(f1, WrappedInt())); - enum { dimensions = F1::dimensions }; MultiArg1 multiArg(f1); - EvaluateLocLoop kernel(function_m, - f1.physicalDomain()); - + EvaluateLocLoop kernel(function_m, evalDom); MultiArgEvaluator:: - evaluate(multiArg, function_m, - f1.physicalDomain(), - kernel); + evaluate(multiArg, function_m, evalDom, kernel); + } + + template + inline void operator()(const F1 &f1) const + { + (*this)(f1, f1.physicalDomain()); } + template - void operator()(const F1 &f1, const F2 &f2) const + void operator()(const F1 &f1, const Interval &evalDom, + const F2 &f2) const { PAssert(checkValidity(f1, WrappedInt())); - enum { dimensions = F1::dimensions }; MultiArg2 multiArg(f1, f2); - EvaluateLocLoop kernel(function_m, - f1.physicalDomain()); - + EvaluateLocLoop kernel(function_m, evalDom); MultiArgEvaluator:: - evaluate(multiArg, function_m, - f1.physicalDomain(), - kernel); + evaluate(multiArg, function_m, evalDom, kernel); } + template + inline void operator()(const F1 &f1, const F2 &f2) const + { + (*this)(f1, f1.physicalDomain(), f2); + } + + template - void operator()(const F1 &f1, const F2 &f2, const F3 &f3) const + void operator()(const F1 &f1, const Interval &evalDom, + const F2 &f2, const F3 &f3) const { PAssert(checkValidity(f1, WrappedInt())); - enum { dimensions = F1::dimensions }; MultiArg3 multiArg(f1, f2, f3); - EvaluateLocLoop kernel(function_m, - f1.physicalDomain()); - + EvaluateLocLoop kernel(function_m, evalDom); MultiArgEvaluator:: - evaluate(multiArg, function_m, - f1.physicalDomain(), - kernel); + evaluate(multiArg, function_m, evalDom, kernel); } + template + inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3) const + { + (*this)(f1, f1.physicalDomain(), f2, f3); + } + + template - void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4) const + void operator()(const F1 &f1, 
const Interval &evalDom, + const F2 &f2, const F3 &f3, const F4 &f4) const { PAssert(checkValidity(f1, WrappedInt())); - enum { dimensions = F1::dimensions }; MultiArg4 multiArg(f1, f2, f3, f4); - EvaluateLocLoop kernel(function_m, - f1.physicalDomain()); - + EvaluateLocLoop kernel(function_m, evalDom); MultiArgEvaluator:: - evaluate(multiArg, function_m, - f1.physicalDomain(), - kernel); + evaluate(multiArg, function_m, evalDom, kernel); } + template + inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4) const + { + (*this)(f1, f1.physicalDomain(), f2, f3, f4); + } + + template - void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, - const F5 &f5) const + void operator()(const F1 &f1, const Interval &evalDom, + const F2 &f2, const F3 &f3, const F4 &f4, const F5 &f5) const { PAssert(checkValidity(f1, WrappedInt())); - enum { dimensions = F1::dimensions }; MultiArg5 multiArg(f1, f2, f3, f4, f5); - EvaluateLocLoop kernel(function_m, - f1.physicalDomain()); - + EvaluateLocLoop kernel(function_m, evalDom); MultiArgEvaluator:: - evaluate(multiArg, function_m, - f1.physicalDomain(), - kernel); + evaluate(multiArg, function_m, evalDom, kernel); } + template + inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, + const F5 &f5) const + { + (*this)(f1, f1.physicalDomain(), f2, f3, f4, f5); + } + + template - void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, - const F5 &f5, const F6 &f6) const + void operator()(const F1 &f1, const Interval &evalDom, + const F2 &f2, const F3 &f3, const F4 &f4, const F5 &f5, + const F6 &f6) const { PAssert(checkValidity(f1, WrappedInt())); - enum { dimensions = F1::dimensions }; MultiArg6 multiArg(f1, f2, f3, f4, f5, f6); - EvaluateLocLoop kernel(function_m, - f1.physicalDomain()); - + EvaluateLocLoop kernel(function_m, evalDom); MultiArgEvaluator:: - evaluate(multiArg, function_m, - f1.physicalDomain(), - kernel); + evaluate(multiArg, function_m, 
evalDom, kernel); } + template + inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, + const F5 &f5, const F6 &f6) const + { + (*this)(f1, f1.physicalDomain(), f2, f3, f4, f5, f6); + } + + template - void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, + void operator()(const F1 &f1, const Interval &evalDom, + const F2 &f2, const F3 &f3, const F4 &f4, const F5 &f5, const F6 &f6, const F7 &f7) const { PAssert(checkValidity(f1, WrappedInt())); - enum { dimensions = F1::dimensions }; MultiArg7 multiArg(f1, f2, f3, f4, f5, f6, f7); - EvaluateLocLoop kernel(function_m, - f1.physicalDomain()); - + EvaluateLocLoop kernel(function_m, evalDom); MultiArgEvaluator:: - evaluate(multiArg, function_m, - f1.physicalDomain(), - kernel); + evaluate(multiArg, function_m, evalDom, kernel); } + + template + inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, + const F5 &f5, const F6 &f6, const F7 &f7) const + { + (*this)(f1, f1.physicalDomain(), f2, f3, f4, f5, f6, f7); + } + + //@} Function function_m; }; --- /dev/null Tue May 18 17:20:27 2004 +++ tests/evaluatorTest9.cpp Wed Aug 18 11:51:07 2004 @@ -0,0 +1,121 @@ +// -*- C++ -*- +// ACL:license +// ---------------------------------------------------------------------- +// This software and ancillary information (herein called "SOFTWARE") +// called POOMA (Parallel Object-Oriented Methods and Applications) is +// made available under the terms described here. The SOFTWARE has been +// approved for release with associated LA-CC Number LA-CC-98-65. +// +// Unless otherwise indicated, this SOFTWARE has been authored by an +// employee or employees of the University of California, operator of the +// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with +// the U.S. Department of Energy. The U.S. Government has rights to use, +// reproduce, and distribute this SOFTWARE. 
The public may copy, distribute, +// prepare derivative works and publicly display this SOFTWARE without +// charge, provided that this Notice and any statement of authorship are +// reproduced on all copies. Neither the Government nor the University +// makes any warranty, express or implied, or assumes any liability or +// responsibility for the use of this SOFTWARE. +// +// If SOFTWARE is modified to produce derivative works, such modified +// SOFTWARE should be clearly marked, so as not to confuse it with the +// version available from LANL. +// +// For more information about POOMA, send e-mail to pooma at acl.lanl.gov, +// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. +// ---------------------------------------------------------------------- +// ACL:license + +//----------------------------------------------------------------------------- +// evaluatorTest9 - testing ScalarCode and custom evaluation domain +//----------------------------------------------------------------------------- + +#include "Pooma/Pooma.h" +#include "Pooma/Arrays.h" +#include "Pooma/Fields.h" // for PerformUpdateTag() only! 
+#include "Evaluator/ScalarCode.h" +#include "Utilities/Tester.h" +#include + + +// dummy operation + +template +struct Copy +{ + Copy(int val) : val_m(val) {} + + template + inline void operator()(const A &a, const Loc &i) const + { + a(i) = val_m; + } + + void scalarCodeInfo(ScalarCodeInfo& i) const + { + i.arguments(1); + i.dimensions(Dim); + i.write(1, true); + i.useGuards(0, false); + } + + const int val_m; +}; + + +int main(int argc, char *argv[]) +{ + // Initialize POOMA and output stream, using Tester class + Pooma::initialize(argc, argv); + Pooma::Tester tester(argc, argv); + + Pooma::blockingExpressions(true); + + Interval<2> domain(16, 16); + Loc<2> blocks(4, 4); + UniformGridLayout<2> layout(domain, blocks, GuardLayers<2>(1), DistributedTag()); + UniformRectilinearMesh<2> mesh(layout); + Centering<2> cell = canonicalCentering<2>(CellType, Continuous); + + Field, int, MultiPatch > > + a(cell, layout, mesh), + b(cell, layout, mesh); + + // initialize with zero + a.all() = 0; + b.all() = 0; + + // do assignments to various subdomains with both expression engine + // and scalar code functor and compare the full results. 
+ Interval<2> I; + + (ScalarCode >(1))(a); + b = 1; + tester.check("default (physical) domain", all(a.all() == b.all())); + + I = Interval<2>(Interval<1>(8, 14), Interval<1>(0, 14)); + (ScalarCode >(2))(a, I); + b(I) = 2; + tester.check("partial set of physical patches", all(a.all() == b.all())); + + I = Interval<2>(Interval<1>(6, 9), Interval<1>(6, 9)); + (ScalarCode >(3))(a, I); + b(I) = 3; + tester.check("arbitrary physical domain", all(a.all() == b.all())); + + I = Interval<2>(Interval<1>(0, 15), Interval<1>(-1, 2)); + (ScalarCode >(4))(a, I); + b(I) = 4; + tester.check("arbitrary domain", all(a.all() == b.all())); + + int retval = tester.results("evaluatorTest9 (ScalarCode, evaluation domain)"); + Pooma::finalize(); + return retval; +} + +// ACL:rcsinfo +// ---------------------------------------------------------------------- +// $RCSfile: evaluatorTest2.cpp,v $ $Author: pooma $ +// $Revision: 1.7 $ $Date: 2003/01/29 19:32:07 $ +// ---------------------------------------------------------------------- +// ACL:rcsinfo From rguenth at tat.physik.uni-tuebingen.de Wed Aug 18 11:50:11 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 18 Aug 2004 13:50:11 +0200 (CEST) Subject: [PATCH] add extensive test for igc updates Message-ID: With igc update optimizations it is important to check if they work correctly in all cases, so here is a testcase that (tires to) enumerate all possible cases. Ok? Richard. 2004Aug18 Richard Guenther * src/Array/tests/array_test30.cpp: new. -------------- next part -------------- --- /dev/null Tue May 18 17:20:27 2004 +++ array_test30.cpp Wed Aug 18 13:47:46 2004 @@ -0,0 +1,143 @@ +// -*- C++ -*- +// ACL:license +// ---------------------------------------------------------------------- +// This software and ancillary information (herein called "SOFTWARE") +// called POOMA (Parallel Object-Oriented Methods and Applications) is +// made available under the terms described here. 
The SOFTWARE has been +// approved for release with associated LA-CC Number LA-CC-98-65. +// +// Unless otherwise indicated, this SOFTWARE has been authored by an +// employee or employees of the University of California, operator of the +// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with +// the U.S. Department of Energy. The U.S. Government has rights to use, +// reproduce, and distribute this SOFTWARE. The public may copy, distribute, +// prepare derivative works and publicly display this SOFTWARE without +// charge, provided that this Notice and any statement of authorship are +// reproduced on all copies. Neither the Government nor the University +// makes any warranty, express or implied, or assumes any liability or +// responsibility for the use of this SOFTWARE. +// +// If SOFTWARE is modified to produce derivative works, such modified +// SOFTWARE should be clearly marked, so as not to confuse it with the +// version available from LANL. +// +// For more information about POOMA, send e-mail to pooma at acl.lanl.gov, +// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. +// ---------------------------------------------------------------------- +// ACL:license + +//----------------------------------------------------------------------------- +// array_test30: verify correctness of igc updates +//----------------------------------------------------------------------------- + +// Include files + +#include "Pooma/Arrays.h" +#include "Utilities/Tester.h" +#include + + +template +bool test(Pooma::Tester& tester, + const A1& a_mp, const A1& b_mp, + const A2& a_sp, const A2& b_sp, + const Loc<2>& delta1, const Loc<2>& delta2, + bool initial_f, const Loc<2>& initial) +{ + static int sequence = 0; + Interval<2> I; + + // initialize rhs arrays, ensure wrong igc values + // via sequence number. 
+ I = b_sp.totalDomain(); + b_sp(I) = sequence + iota(I).comp(0) + I[0].size()*iota(I).comp(1); + b_mp.engine().setGuards(0); + b_mp(I) = b_sp(I); + + // if requested, force initial update of a set of igcs + if (initial_f) { + b_sp(b_sp.physicalDomain()) = b_mp(b_sp.physicalDomain()+initial); + b_sp(I) = sequence + iota(I).comp(0) + I[0].size()*iota(I).comp(1); + Pooma::blockAndEvaluate(); + } + + // do calculation both sp and mp + I = a_sp.physicalDomain(); + a_sp(I) = b_sp(I+delta1) - b_sp(I+delta2); + a_mp(I) = b_mp(I+delta1) - b_mp(I+delta2); + + // check the results are the same everywhere + bool res = all(a_sp(I) == a_mp(I)); + tester.out() << "For deltas " << delta1 << " and " << delta2 << " "; + if (initial_f) + tester.out() << "with initial " << initial << " "; + tester.check("result is", res); + if (!res) { + int n = b_mp.layout().sizeGlobal(); + for (int i=0; i > b(b_mp.engine().globalPatch(i)); + tester.out() << "Brick " << i << " " << intersect(b.domain(), b_mp.physicalDomain()) + << " on context " << b.engine().owningContext() + << " is\n" << b(intersect(b.totalDomain(), b_mp.physicalDomain())) + << std::endl; + } + tester.out() << "Aborting." 
<< std::endl; + return false; + } + + sequence++; + + return true; +} + + +int main(int argc, char *argv[]) +{ + // Initialize POOMA and output stream, using Tester class + Pooma::initialize(argc, argv); + Pooma::Tester tester(argc, argv); + + Interval<2> domain(12, 12); + UniformGridLayout<2> layout_mp(domain, Loc<2>(3, 3), + GuardLayers<2>(2), DistributedTag()); + DomainLayout<2> layout_sp(domain, GuardLayers<2>(2)); + + Array<2, int, MultiPatch > > + a_mp(layout_mp), b_mp(layout_mp); + Array<2, int, Brick> + a_sp(layout_sp), b_sp(layout_sp); + + // all 5^4 == 625 uninitialized cases + for (int d1i = -2; d1i <= 2; ++d1i) + for (int d1j = -2; d1j <= 2; ++d1j) + for (int d2i = -2; d2i <= 2; ++d2i) + for (int d2j = -2; d2j <= 2; ++d2j) + if (!test(tester, a_mp, b_mp, a_sp, b_sp, + Loc<2>(d1i, d1j), Loc<2>(d2i, d2j), + false, Loc<2>(0))) + goto out; + + // all 5^4 == 625 initialized cases with simplified expression + for (int ii = -2; ii <= 2; ++ii) + for (int ij = -2; ij <= 2; ++ij) + for (int d1i = -2; d1i <= 2; ++d1i) + for (int d1j = -2; d1j <= 2; ++d1j) + if (!test(tester, a_mp, b_mp, a_sp, b_sp, + Loc<2>(d1i, d1j), Loc<2>(d1i, d1j), + true, Loc<2>(ii, ij))) + goto out; + + out: + tester.out() << "Best testing is done with all 1 to 9 processes" << std::endl; + + int retval = tester.results("array_test30"); + Pooma::finalize(); + return retval; +} + +// ACL:rcsinfo +// ---------------------------------------------------------------------- +// $RCSfile: array_test29.cpp,v $ $Author: pooma $ +// $Revision: 1.1 $ $Date: 2004/07/20 18:41:00 $ +// ---------------------------------------------------------------------- +// ACL:rcsinfo From rguenth at tat.physik.uni-tuebingen.de Wed Aug 18 12:21:17 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 18 Aug 2004 14:21:17 +0200 (CEST) Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers In-Reply-To: <41224155.1070007@codesourcery.com> Message-ID: On Tue, 17 Aug 2004, Jeffrey 
D. Oldham wrote:

> I would prefer to keep the compiler list relatively short and containing
> the most popular compliant compilers.
>
> Would you be willing to start modifying the README file for a release?

Like the following?

Richard.


Index: README
===================================================================
RCS file: /home/pooma/Repository/r2/README,v
retrieving revision 1.63
diff -u -u -r1.63 README
--- README 11 Jul 2002 21:28:52 -0000 1.63
+++ README 18 Aug 2004 12:20:42 -0000
@@ -1,5 +1,35 @@
 ////////////////////////////////////////////////////////////////////

+RELEASE NOTES v2.4.1
+
+////////////////////////////////////////////////////////////////////
+
+Version 2.4.1 cleans up the codebase to be ISO C++ conformant. As
+such an ISO C++ conforming compiler and standard library is recommended,
+but still compilers close to that may be supported (gcc 3.3 and Intel 7.2
+are).
+
+Most visible enhancements in this release are the addition of native
+MPI support for message passing parallelism and OpenMP support for
+thread parallelism. MPI support was tested with the MPICH and LAM MPI
+implementations, OpenMP support was tested with the Intel compiler
+on ia32 and ia64 architectures. Message passing parallelism through
+using the Cheetah library is still supported.
+
+Numerous restrictions on the use of Arrays, Fields and expressions in
+certain constructs were lifted. Also many bugs were fixed and performance
+was improved.
+
+The status of POOMA particles, especially parallel particles, is
+undetermined. So is the status of thread parallelism based on the
+SMARTS library.
+
+Support libraries for POOMA such as Cheetah, SMARTS and PETE can be
+obtained from http://www.pooma.com/.
+ + +//////////////////////////////////////////////////////////////////// + RELEASE NOTES v2.4.0 //////////////////////////////////////////////////////////////////// Index: INSTALL.unix =================================================================== RCS file: /home/pooma/Repository/r2/INSTALL.unix,v retrieving revision 1.28 diff -u -u -r1.28 INSTALL.unix --- INSTALL.unix 12 Jan 2003 16:16:15 -0000 1.28 +++ INSTALL.unix 18 Aug 2004 12:20:42 -0000 @@ -1,7 +1,7 @@ /******************************************************************* * * * POOMA build and installation instructions for UNIX * - * Version 2.4.0 * + * Version 2.4.1 * * * ******************************************************************* * For release notes, see README. * @@ -26,42 +26,25 @@ SUPPORTED PLATFORMS AND COMPILERS: ---------------------------------- -POOMA version 2.4.0 has been ported to the following platforms and +POOMA version 2.4.1 has been tested on the following platforms and compilers; please find the instructions for your platform within this document and follow the steps. - o SGI IRIX 6.X, with the Kuck and Associates KCC compiler - (v3.3d or later, including 3.4x) - o SGI IRIX 6.X, with the GCC compiler - (v2.95 or greater) - o SGI IRIX 6.X, with SGI C++ 7.3 or later compiler - (without patch 3659!) - o Linux, with the Kuck and Associates KCC compiler - (v3.3d or later, including 3.4x) o Linux, with the GCC compiler - (v2.95 or greater) + (v3.3 or greater) o Linux, with the Intel icpc compiler - (v6.0 or greater) + (v7.2 or greater) More information about the compilers above can be obtained from the following URLs: o GCC Home Page (GCC): http://gcc.gnu.org - o Silicon Graphics (SGI C++): http://www.sgi.com o Intel (icpc): http://www.intel.com - o The Kuck and Associates (KCC) is no longer available. On Unix machines, POOMA can be compiled with one or more optional packages. 
The currently available optional packages are: - o SMARTS, for multithreaded parallelism and dataflow analysis - - o PDT, for static analysis of source code - - o TAU, for automatic source code profiling - - o PAWS, for run-time coupling of parallel data structures with - other parallel programs. + o Cheetah, for message passing or shared memory parallelism When compiling with other packages, be sure to check the section on known problems section at the end of this document. Some combinations of packages @@ -76,7 +59,7 @@ Since you're reading this file, you've successfully expanded the .tgz file you downloaded from the net. -You should notice the following files/folders inside of pooma-2.4.0: +You should notice the following files/folders inside of pooma-2.4.1: configure ..................... used for Unix builds CREDITS ....................... the people who developed POOMA and this CD @@ -120,12 +103,12 @@ directories. To build POOMA for a given "suite" you set the environment variable POOMASUITE to the name of that suite and then execute make at the top level. Basic configurations for various systems and compilers -are found in the config/arch directory (for example LINUXKCC.conf contains -definitions for building with the KAI compiler on Linux systems). You +are found in the config/arch directory (for example LINUXgcc.conf contains +definitions for building with the GCC compiler on Linux systems). You should start by finding one of the .conf files that most closely matches your system and possibly editing the definitions in it where they are -incorrect. (You may also copy a .conf file and use the new name as an -option to configure.) +incorrect. You may also copy a .conf file and use the new name as an +option to configure. In general, the configure command will look like: @@ -144,64 +127,6 @@ In addition, configure may look at several environment variables, SMARTSDIR, CHEETAHDIR, TAUDIR, PDTDIR etc. to find the locations of other -installed packages. 
- -To see some examples of building POOMA for different systems look at -some of the following files. These files are scripts that assume you have -downloaded the tar'd distributions of all the necessary packages to one -location and want to build them in place: - -scripts/buildPoomaLinuxEgcs - build pooma and an example on Linux -scripts/buildPoomaSGI - build pooma and an example on IRIX -scripts/buildPoomaCheetaLinux - build pooma with Cheetah -scripts/buildPoomaCheetaTauSGI - build pooma with Cheetah and Tau - -The SMARTS distribution also comes with some scripts that demonstrate -building pooma with smarts. - - -* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * - - NOTES ON KNOWN PROBLEMS: - ------------------------ - -Linux with RedHat 6 (and possibly other distributions): -------------------------------------------------------- -We have observed an odd problem with gmake on RedHat 6 installations, -where a target will be built successfully, but then gmake will attempt -to build one more incorrectly specified target. The latter should not -happen, but does. It will just print a strange error message and -exit, but will not affect the target that was actually built -properly. If you see such behavior, add the '-r' flag to your 'gmake' -commands, to turn off implicit build rules in GNU make. This should -prevent these errors from happening. - -KCC 3.4d compiler: ------------------- -We have run across some compiler bugs triggered by our developer test -codes. These bugs have been reported but are not yet resolved. - -GCC compiler: --------------- -Some codes may cause the GCC v2.95 compiler to exhaust virtual memory while -compiling, leading to an error message from g++ such as "Cannot allocate -XXX bytes", where XXX is some number. This is especially common when -compiling with optimization turned on. 
Two possible solutions are to
-turn down the level of optimization (by modifying the optimization settings
-in the appropriate .conf file) or to break the code up into smaller
-source code files.
-
-Using more recent versions of GCC (3.2 and up) is recommended and will fix
-this particular and many other problems. Note that GCC v3.0 and v3.1 generate
-wrong code under certain circumstances when using optimization. Use of them
-is not recommended.
-
-
-Using SMARTS:
--------------
-
-SMARTS places particular requirements on the thread-safety of the compilers
-used. See the SMARTS documentation for details, but in some situations this
-may restrict your choice of build options. (For example, with KCC 3.4g, the
-thread-safe version requires that exceptions be turned on.)
+installed packages. Refer to the configure script for further information
+in case of problems.

From oldham at codesourcery.com Wed Aug 18 15:10:56 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Wed, 18 Aug 2004 08:10:56 -0700
Subject: [PATCH] add extensive test for igc updates
In-Reply-To:
References:
Message-ID: <41237180.4070907@codesourcery.com>

Richard Guenther wrote:

>With igc update optimizations it is important to check if they work
>correctly in all cases, so here is a testcase that (tries to)
>enumerate all possible cases.
>
>Ok?
>
>
>
Testing is good. Yes, please add it to the Pooma repository.

>Richard.
>
>
>2004Aug18 Richard Guenther
>
> * src/Array/tests/array_test30.cpp: new.
>
>
>------------------------------------------------------------------------
>
>--- /dev/null Tue May 18 17:20:27 2004
>+++ array_test30.cpp Wed Aug 18 13:47:46 2004
>@@ -0,0 +1,143 @@
>+// -*- C++ -*-
>+// ACL:license
>+// ----------------------------------------------------------------------
>+// This software and ancillary information (herein called "SOFTWARE")
>+// called POOMA (Parallel Object-Oriented Methods and Applications) is
>+// made available under the terms described here.
The SOFTWARE has been
>+// approved for release with associated LA-CC Number LA-CC-98-65.
>+//
>+// Unless otherwise indicated, this SOFTWARE has been authored by an
>+// employee or employees of the University of California, operator of the
>+// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with
>+// the U.S. Department of Energy. The U.S. Government has rights to use,
>+// reproduce, and distribute this SOFTWARE. The public may copy, distribute,
>+// prepare derivative works and publicly display this SOFTWARE without
>+// charge, provided that this Notice and any statement of authorship are
>+// reproduced on all copies. Neither the Government nor the University
>+// makes any warranty, express or implied, or assumes any liability or
>+// responsibility for the use of this SOFTWARE.
>+//
>+// If SOFTWARE is modified to produce derivative works, such modified
>+// SOFTWARE should be clearly marked, so as not to confuse it with the
>+// version available from LANL.
>+//
>+// For more information about POOMA, send e-mail to pooma at acl.lanl.gov,
>+// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/.
>+// ----------------------------------------------------------------------
>+// ACL:license
>+
>+//-----------------------------------------------------------------------------
>+// array_test30: verify correctness of igc updates
>+//-----------------------------------------------------------------------------
>+
>+// Include files
>+
>+#include "Pooma/Arrays.h"
>+#include "Utilities/Tester.h"
>+#include <iostream>
>+
>+
>+template <class A1, class A2>
>+bool test(Pooma::Tester& tester,
>+          const A1& a_mp, const A1& b_mp,
>+          const A2& a_sp, const A2& b_sp,
>+          const Loc<2>& delta1, const Loc<2>& delta2,
>+          bool initial_f, const Loc<2>& initial)
>+{
>+  static int sequence = 0;
>+  Interval<2> I;
>+
>+  // initialize rhs arrays, ensure wrong igc values
>+  // via sequence number.
>+  I = b_sp.totalDomain();
>+  b_sp(I) = sequence + iota(I).comp(0) + I[0].size()*iota(I).comp(1);
>+  b_mp.engine().setGuards(0);
>+  b_mp(I) = b_sp(I);
>+
>+  // if requested, force initial update of a set of igcs
>+  if (initial_f) {
>+    b_sp(b_sp.physicalDomain()) = b_mp(b_sp.physicalDomain()+initial);
>+    b_sp(I) = sequence + iota(I).comp(0) + I[0].size()*iota(I).comp(1);
>+    Pooma::blockAndEvaluate();
>+  }
>+
>+  // do calculation both sp and mp
>+  I = a_sp.physicalDomain();
>+  a_sp(I) = b_sp(I+delta1) - b_sp(I+delta2);
>+  a_mp(I) = b_mp(I+delta1) - b_mp(I+delta2);
>+
>+  // check the results are the same everywhere
>+  bool res = all(a_sp(I) == a_mp(I));
>+  tester.out() << "For deltas " << delta1 << " and " << delta2 << " ";
>+  if (initial_f)
>+    tester.out() << "with initial " << initial << " ";
>+  tester.check("result is", res);
>+  if (!res) {
>+    int n = b_mp.layout().sizeGlobal();
>+    for (int i=0; i<n; ++i) {
>+      Array<2, int, Remote<Brick> > b(b_mp.engine().globalPatch(i));
>+      tester.out() << "Brick " << i << " " << intersect(b.domain(), b_mp.physicalDomain())
>+                   << " on context " << b.engine().owningContext()
>+                   << " is\n" << b(intersect(b.totalDomain(), b_mp.physicalDomain()))
>+                   << std::endl;
>+    }
>+    tester.out() << "Aborting."
<< std::endl;
>+    return false;
>+  }
>+
>+  sequence++;
>+
>+  return true;
>+}
>+
>+
>+int main(int argc, char *argv[])
>+{
>+  // Initialize POOMA and output stream, using Tester class
>+  Pooma::initialize(argc, argv);
>+  Pooma::Tester tester(argc, argv);
>+
>+  Interval<2> domain(12, 12);
>+  UniformGridLayout<2> layout_mp(domain, Loc<2>(3, 3),
>+                                 GuardLayers<2>(2), DistributedTag());
>+  DomainLayout<2> layout_sp(domain, GuardLayers<2>(2));
>+
>+  Array<2, int, MultiPatch<UniformTag, Remote<Brick> > >
>+    a_mp(layout_mp), b_mp(layout_mp);
>+  Array<2, int, Brick>
>+    a_sp(layout_sp), b_sp(layout_sp);
>+
>+  // all 5^4 == 625 uninitialized cases
>+  for (int d1i = -2; d1i <= 2; ++d1i)
>+    for (int d1j = -2; d1j <= 2; ++d1j)
>+      for (int d2i = -2; d2i <= 2; ++d2i)
>+        for (int d2j = -2; d2j <= 2; ++d2j)
>+          if (!test(tester, a_mp, b_mp, a_sp, b_sp,
>+                    Loc<2>(d1i, d1j), Loc<2>(d2i, d2j),
>+                    false, Loc<2>(0)))
>+            goto out;
>+
>+  // all 5^4 == 625 initialized cases with simplified expression
>+  for (int ii = -2; ii <= 2; ++ii)
>+    for (int ij = -2; ij <= 2; ++ij)
>+      for (int d1i = -2; d1i <= 2; ++d1i)
>+        for (int d1j = -2; d1j <= 2; ++d1j)
>+          if (!test(tester, a_mp, b_mp, a_sp, b_sp,
>+                    Loc<2>(d1i, d1j), Loc<2>(d1i, d1j),
>+                    true, Loc<2>(ii, ij)))
>+            goto out;
>+
>+ out:
>+  tester.out() << "Best testing is done with all 1 to 9 processes" << std::endl;
>+
>+  int retval = tester.results("array_test30");
>+  Pooma::finalize();
>+  return retval;
>+}
>+
>+// ACL:rcsinfo
>+// ----------------------------------------------------------------------
>+// $RCSfile: array_test29.cpp,v $ $Author: pooma $
>+// $Revision: 1.1 $ $Date: 2004/07/20 18:41:00 $
>+// ----------------------------------------------------------------------
>+// ACL:rcsinfo
>
>

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Wed Aug 18 15:33:56 2004
From: oldham at codesourcery.com (Jeffrey D.
Oldham)
Date: Wed, 18 Aug 2004 08:33:56 -0700
Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers
In-Reply-To:
References:
Message-ID: <412376E4.6060907@codesourcery.com>

Richard Guenther wrote:

>On Tue, 17 Aug 2004, Jeffrey D. Oldham wrote:
>
>
>
>>I would prefer to keep the compiler list relatively short and containing
>>the most popular compliant compilers.
>>
>>Would you be willing to start modifying the README file for a release?
>>
>>
>
>Like the following?
>
>Richard.
>
>
Yes, like the following. Thanks for making all these changes.

Perhaps, we should add one paragraph describing the purpose of Pooma to
the very beginning of the README file.

POOMA is a C++ library supporting element-wise, data-parallel, and
stencil-based physics computations using one or more processors. The
library automatically handles all interprocessor communication,
obviating the need for any explicit communication code and enabling the
same program to be run on one or thousands of processors. The library
supports high-level syntax close to mathematical or algorithmic syntax,
easing the conversion from algorithms to code. Pooma, originally
developed at Los Alamos National Laboratory to support nuclear
simulations, is now used throughout the physics establishment around the
world.

>Index: README
>===================================================================
>RCS file: /home/pooma/Repository/r2/README,v
>retrieving revision 1.63
>diff -u -u -r1.63 README
>--- README 11 Jul 2002 21:28:52 -0000 1.63
>+++ README 18 Aug 2004 12:20:42 -0000
>@@ -1,5 +1,35 @@
> ////////////////////////////////////////////////////////////////////
>
>+RELEASE NOTES v2.4.1
>+
>+////////////////////////////////////////////////////////////////////
>+
>+Version 2.4.1 cleans up the codebase to be ISO C++ conformant. As
>+such an ISO C++ conforming compiler and standard library is recommended,
>+but still compilers close to that may be supported (gcc 3.3 and Intel 7.2
>+are).
>+
>+Most visible enhancements in this release are the addition of native
>+MPI support for message passing parallelism and OpenMP support for
>+thread parallelism. MPI support was tested with the MPICH and LAM MPI
>+implementations, OpenMP support was tested with the Intel compiler
>+on ia32 and ia64 architectures. Message passing parallelism through
>+using the Cheetah library is still supported.
>+
>+Numerous restrictions on the use of Arrays, Fields and expressions in
>+certain constructs were lifted. Also many bugs were fixed and performance
>+was improved.
>+
>+The status of POOMA particles, especially parallel particles, is
>+undetermined. So is the status of thread parallelism based on the
>+SMARTS library.
>+
>+Support libraries for POOMA such as Cheetah, SMARTS and PETE can be
>+obtained from http://www.pooma.com/.
>+
>+
>+////////////////////////////////////////////////////////////////////
>+
> RELEASE NOTES v2.4.0
>
> ////////////////////////////////////////////////////////////////////
>Index: INSTALL.unix
>===================================================================
>RCS file: /home/pooma/Repository/r2/INSTALL.unix,v
>retrieving revision 1.28
>diff -u -u -r1.28 INSTALL.unix
>--- INSTALL.unix 12 Jan 2003 16:16:15 -0000 1.28
>+++ INSTALL.unix 18 Aug 2004 12:20:42 -0000
>@@ -1,7 +1,7 @@
> /*******************************************************************
>  *                                                                 *
>  *       POOMA build and installation instructions for UNIX        *
>- *                          Version 2.4.0                          *
>+ *                          Version 2.4.1                          *
>  *                                                                 *
>  *******************************************************************
>  *                 For release notes, see README.                  *
>@@ -26,42 +26,25 @@
> SUPPORTED PLATFORMS AND COMPILERS:
> ----------------------------------
>
>-POOMA version 2.4.0 has been ported to the following platforms and
>+POOMA version 2.4.1 has been tested on the following platforms and
> compilers; please find the instructions for your platform within this
> document and follow the steps.
> >- o SGI IRIX 6.X, with the Kuck and Associates KCC compiler >- (v3.3d or later, including 3.4x) >- o SGI IRIX 6.X, with the GCC compiler >- (v2.95 or greater) >- o SGI IRIX 6.X, with SGI C++ 7.3 or later compiler >- (without patch 3659!) >- o Linux, with the Kuck and Associates KCC compiler >- (v3.3d or later, including 3.4x) > o Linux, with the GCC compiler >- (v2.95 or greater) >+ (v3.3 or greater) > o Linux, with the Intel icpc compiler >- (v6.0 or greater) >+ (v7.2 or greater) > > More information about the compilers above can be obtained from > the following URLs: > > o GCC Home Page (GCC): http://gcc.gnu.org >- o Silicon Graphics (SGI C++): http://www.sgi.com > o Intel (icpc): http://www.intel.com >- o The Kuck and Associates (KCC) is no longer available. > > On Unix machines, POOMA can be compiled with one or more optional > packages. The currently available optional packages are: > >- o SMARTS, for multithreaded parallelism and dataflow analysis >- >- o PDT, for static analysis of source code >- >- o TAU, for automatic source code profiling >- >- o PAWS, for run-time coupling of parallel data structures with >- other parallel programs. >+ o Cheetah, for message passing or shared memory parallelism > > When compiling with other packages, be sure to check the section on known > problems section at the end of this document. Some combinations of packages >@@ -76,7 +59,7 @@ > Since you're reading this file, you've successfully expanded the .tgz > file you downloaded from the net. > >-You should notice the following files/folders inside of pooma-2.4.0: >+You should notice the following files/folders inside of pooma-2.4.1: > > configure ..................... used for Unix builds > CREDITS ....................... the people who developed POOMA and this CD >@@ -120,12 +103,12 @@ > directories. To build POOMA for a given "suite" you set the environment > variable POOMASUITE to the name of that suite and then execute make at > the top level. 
Basic configurations for various systems and compilers >-are found in the config/arch directory (for example LINUXKCC.conf contains >-definitions for building with the KAI compiler on Linux systems). You >+are found in the config/arch directory (for example LINUXgcc.conf contains >+definitions for building with the GCC compiler on Linux systems). You > should start by finding one of the .conf files that most closely matches > your system and possibly editing the definitions in it where they are >-incorrect. (You may also copy a .conf file and use the new name as an >-option to configure.) >+incorrect. You may also copy a .conf file and use the new name as an >+option to configure. > > In general, the configure command will look like: > >@@ -144,64 +127,6 @@ > > In addition, configure may look at several environment variables, > SMARTSDIR, CHEETAHDIR, TAUDIR, PDTDIR etc. to find the locations of other >-installed packages. >- >-To see some examples of building POOMA for different systems look at >-some of the following files. These files are scripts that assume you have >-downloaded the tar'd distributions of all the necessary packages to one >-location and want to build them in place: >- >-scripts/buildPoomaLinuxEgcs - build pooma and an example on Linux >-scripts/buildPoomaSGI - build pooma and an example on IRIX >-scripts/buildPoomaCheetaLinux - build pooma with Cheetah >-scripts/buildPoomaCheetaTauSGI - build pooma with Cheetah and Tau >- >-The SMARTS distribution also comes with some scripts that demonstrate >-building pooma with smarts. 
>- >- >-* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * >- >- NOTES ON KNOWN PROBLEMS: >- ------------------------ >- >-Linux with RedHat 6 (and possibly other distributions): >-------------------------------------------------------- >-We have observed an odd problem with gmake on RedHat 6 installations, >-where a target will be built successfully, but then gmake will attempt >-to build one more incorrectly specified target. The latter should not >-happen, but does. It will just print a strange error message and >-exit, but will not affect the target that was actually built >-properly. If you see such behavior, add the '-r' flag to your 'gmake' >-commands, to turn off implicit build rules in GNU make. This should >-prevent these errors from happening. >- >-KCC 3.4d compiler: >------------------- >-We have run across some compiler bugs triggered by our developer test >-codes. These bugs have been reported but are not yet resolved. >- >-GCC compiler: >--------------- >-Some codes may cause the GCC v2.95 compiler to exhaust virtual memory while >-compiling, leading to an error message from g++ such as "Cannot allocate >-XXX bytes", where XXX is some number. This is especially common when >-compiling with optimization turned on. Two possible solutions are to >-turn down the level of optimization (by modifying the optimization settings >-in the appropriate .conf file) or to break the code up into smaller >-source code files. >- >-Using more recent versions of GCC (3.2 and up) is recommended and will fix >-this particular and many other problems. Note that GCC v3.0 and v3.1 generate >-wrong code under certain circumstances when using optimization. Use of them >-is not recommended. >- >- >-Using SMARTS: >-------------- >- >-SMARTS places particular requirements on the thread-safety of the compilers >-used. See the SMARTS documentation for details, but in some situations this >-may restrict your choice of build options. 
(For example, with KCC 3.4g, the
>-thread-safe version requires that exceptions be turned on.)
>+installed packages. Refer to the configure script for further information
>+in case of problems.
>
>
>
>

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Wed Aug 18 15:40:27 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Wed, 18 Aug 2004 08:40:27 -0700
Subject: [PATCH] Allow custom evaluation domain for ScalarCode
In-Reply-To:
References:
Message-ID: <4123786B.8020901@codesourcery.com>

Richard Guenther wrote:

>This patch adds the ability to provide a custom evaluation domain
>for a ScalarCode expression (like including external guards or
>excluding the boundary from vertex centered fields). This is much
>less fragile than trying to pass appropriate views as arguments.
>
>Tested with Evaluator and ScalarCode tests.
>
>Ok?
>
>
>
This is good. The existing interface is maintained but also extended.
Please commit it. I have one small correction about thirty lines below.

>Richard.
>
>
>2004Aug18 Richard Guenther
>
> * src/Evaluator/ScalarCode.h: add variants of operator()
> with specified evaluation domain.
> src/Evaluator/tests/evaluatorTest9.cpp: new.
>
>
>------------------------------------------------------------------------
>
>Index: ScalarCode.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Evaluator/ScalarCode.h,v
>retrieving revision 1.13
>diff -u -u -r1.13 ScalarCode.h
>--- ScalarCode.h 7 Apr 2004 16:38:23 -0000 1.13
>+++ ScalarCode.h 18 Aug 2004 09:52:47 -0000
>@@ -391,6 +391,19 @@
> Interval domain_m;
> };
>
>+
>+/**
>+ * ScalarCode is a Stencil like operation that allows for more than one
>+ * field to be operated on.
Generally the functor is a local (set of)
>+ * function(s) which could be described as
>+ *
>+ *   (f1..fM) = op(fM+1..fN)
>+ *
>
>
I assume commas are needed: (f1, ..., fM) = op(F1, ..., FN)
Also, fM+1 is ambiguous: f_{M+1} or (fM)+1

>+ * where fM+1 to fN are input fields read from and f1 to fM are output
>
>
g1 to gN

>+ * fields written to (neither this distinction nor its ordering is strictly
>+ * required, but both will result in the least possible surprises).
>+ */
>+
> template<class Function>
> struct ScalarCode
> {
>@@ -427,113 +440,149 @@
>     return f.centeringSize() == 1 && f.numMaterials() == 1;
>   }
>
>+  /// @name Evaluators
>+  /// Evaluate the ScalarCode functor on the fields f1 to fN using the
>+  /// specified evaluation domain. Note that views of the evaluation domain
>+  /// are taken of every field, so domains of the fields should be strictly
>+  /// conforming (in fact, passing views to these operators is a bug unless
>+  /// you really know what you are doing).
>+  ///
>+  /// The evaluation domain defaults to the physical domain of
>+  /// the first field which should usually be (one of) the left hand side(s).
>+  /// If you want the functor to operate on a different domain use the
>+  /// operators with the explicit specified evaluation domain.
>+ //@{ >+ > template >- void operator()(const F1 &f1) const >+ void operator()(const F1 &f1, const Interval &evalDom) const > { > PAssert(checkValidity(f1, WrappedInt())); >- enum { dimensions = F1::dimensions }; > MultiArg1 multiArg(f1); >- EvaluateLocLoop kernel(function_m, >- f1.physicalDomain()); >- >+ EvaluateLocLoop kernel(function_m, evalDom); > MultiArgEvaluator:: >- evaluate(multiArg, function_m, >- f1.physicalDomain(), >- kernel); >+ evaluate(multiArg, function_m, evalDom, kernel); >+ } >+ >+ template >+ inline void operator()(const F1 &f1) const >+ { >+ (*this)(f1, f1.physicalDomain()); > } > >+ > template >- void operator()(const F1 &f1, const F2 &f2) const >+ void operator()(const F1 &f1, const Interval &evalDom, >+ const F2 &f2) const > { > PAssert(checkValidity(f1, WrappedInt())); >- enum { dimensions = F1::dimensions }; > MultiArg2 multiArg(f1, f2); >- EvaluateLocLoop kernel(function_m, >- f1.physicalDomain()); >- >+ EvaluateLocLoop kernel(function_m, evalDom); > MultiArgEvaluator:: >- evaluate(multiArg, function_m, >- f1.physicalDomain(), >- kernel); >+ evaluate(multiArg, function_m, evalDom, kernel); > } > >+ template >+ inline void operator()(const F1 &f1, const F2 &f2) const >+ { >+ (*this)(f1, f1.physicalDomain(), f2); >+ } >+ >+ > template >- void operator()(const F1 &f1, const F2 &f2, const F3 &f3) const >+ void operator()(const F1 &f1, const Interval &evalDom, >+ const F2 &f2, const F3 &f3) const > { > PAssert(checkValidity(f1, WrappedInt())); >- enum { dimensions = F1::dimensions }; > MultiArg3 multiArg(f1, f2, f3); >- EvaluateLocLoop kernel(function_m, >- f1.physicalDomain()); >- >+ EvaluateLocLoop kernel(function_m, evalDom); > MultiArgEvaluator:: >- evaluate(multiArg, function_m, >- f1.physicalDomain(), >- kernel); >+ evaluate(multiArg, function_m, evalDom, kernel); > } > >+ template >+ inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3) const >+ { >+ (*this)(f1, f1.physicalDomain(), f2, f3); >+ } >+ >+ > template >- void 
operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4) const >+ void operator()(const F1 &f1, const Interval &evalDom, >+ const F2 &f2, const F3 &f3, const F4 &f4) const > { > PAssert(checkValidity(f1, WrappedInt())); >- enum { dimensions = F1::dimensions }; > MultiArg4 multiArg(f1, f2, f3, f4); >- EvaluateLocLoop kernel(function_m, >- f1.physicalDomain()); >- >+ EvaluateLocLoop kernel(function_m, evalDom); > MultiArgEvaluator:: >- evaluate(multiArg, function_m, >- f1.physicalDomain(), >- kernel); >+ evaluate(multiArg, function_m, evalDom, kernel); > } > >+ template >+ inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4) const >+ { >+ (*this)(f1, f1.physicalDomain(), f2, f3, f4); >+ } >+ >+ > template >- void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, >- const F5 &f5) const >+ void operator()(const F1 &f1, const Interval &evalDom, >+ const F2 &f2, const F3 &f3, const F4 &f4, const F5 &f5) const > { > PAssert(checkValidity(f1, WrappedInt())); >- enum { dimensions = F1::dimensions }; > MultiArg5 multiArg(f1, f2, f3, f4, f5); >- EvaluateLocLoop kernel(function_m, >- f1.physicalDomain()); >- >+ EvaluateLocLoop kernel(function_m, evalDom); > MultiArgEvaluator:: >- evaluate(multiArg, function_m, >- f1.physicalDomain(), >- kernel); >+ evaluate(multiArg, function_m, evalDom, kernel); > } > >+ template >+ inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, >+ const F5 &f5) const >+ { >+ (*this)(f1, f1.physicalDomain(), f2, f3, f4, f5); >+ } >+ >+ > template >- void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, >- const F5 &f5, const F6 &f6) const >+ void operator()(const F1 &f1, const Interval &evalDom, >+ const F2 &f2, const F3 &f3, const F4 &f4, const F5 &f5, >+ const F6 &f6) const > { > PAssert(checkValidity(f1, WrappedInt())); >- enum { dimensions = F1::dimensions }; > MultiArg6 multiArg(f1, f2, f3, f4, f5, f6); >- EvaluateLocLoop kernel(function_m, >- 
f1.physicalDomain()); >- >+ EvaluateLocLoop kernel(function_m, evalDom); > MultiArgEvaluator:: >- evaluate(multiArg, function_m, >- f1.physicalDomain(), >- kernel); >+ evaluate(multiArg, function_m, evalDom, kernel); > } > >+ template >+ inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, >+ const F5 &f5, const F6 &f6) const >+ { >+ (*this)(f1, f1.physicalDomain(), f2, f3, f4, f5, f6); >+ } >+ >+ > template >- void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, >+ void operator()(const F1 &f1, const Interval &evalDom, >+ const F2 &f2, const F3 &f3, const F4 &f4, > const F5 &f5, const F6 &f6, const F7 &f7) const > { > PAssert(checkValidity(f1, WrappedInt())); >- enum { dimensions = F1::dimensions }; > MultiArg7 multiArg(f1, f2, f3, f4, f5, f6, f7); >- EvaluateLocLoop kernel(function_m, >- f1.physicalDomain()); >- >+ EvaluateLocLoop kernel(function_m, evalDom); > MultiArgEvaluator:: >- evaluate(multiArg, function_m, >- f1.physicalDomain(), >- kernel); >+ evaluate(multiArg, function_m, evalDom, kernel); > } >+ >+ template >+ inline void operator()(const F1 &f1, const F2 &f2, const F3 &f3, const F4 &f4, >+ const F5 &f5, const F6 &f6, const F7 &f7) const >+ { >+ (*this)(f1, f1.physicalDomain(), f2, f3, f4, f5, f6, f7); >+ } >+ >+ //@} > > Function function_m; > }; >--- /dev/null Tue May 18 17:20:27 2004 >+++ tests/evaluatorTest9.cpp Wed Aug 18 11:51:07 2004 >@@ -0,0 +1,121 @@ >+// -*- C++ -*- >+// ACL:license >+// ---------------------------------------------------------------------- >+// This software and ancillary information (herein called "SOFTWARE") >+// called POOMA (Parallel Object-Oriented Methods and Applications) is >+// made available under the terms described here. The SOFTWARE has been >+// approved for release with associated LA-CC Number LA-CC-98-65. 
>+// >+// Unless otherwise indicated, this SOFTWARE has been authored by an >+// employee or employees of the University of California, operator of the >+// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with >+// the U.S. Department of Energy. The U.S. Government has rights to use, >+// reproduce, and distribute this SOFTWARE. The public may copy, distribute, >+// prepare derivative works and publicly display this SOFTWARE without >+// charge, provided that this Notice and any statement of authorship are >+// reproduced on all copies. Neither the Government nor the University >+// makes any warranty, express or implied, or assumes any liability or >+// responsibility for the use of this SOFTWARE. >+// >+// If SOFTWARE is modified to produce derivative works, such modified >+// SOFTWARE should be clearly marked, so as not to confuse it with the >+// version available from LANL. >+// >+// For more information about POOMA, send e-mail to pooma at acl.lanl.gov, >+// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. >+// ---------------------------------------------------------------------- >+// ACL:license >+ >+//----------------------------------------------------------------------------- >+// evaluatorTest9 - testing ScalarCode and custom evaluation domain >+//----------------------------------------------------------------------------- >+ >+#include "Pooma/Pooma.h" >+#include "Pooma/Arrays.h" >+#include "Pooma/Fields.h" // for PerformUpdateTag() only! 
>+#include "Evaluator/ScalarCode.h" >+#include "Utilities/Tester.h" >+#include >+ >+ >+// dummy operation >+ >+template >+struct Copy >+{ >+ Copy(int val) : val_m(val) {} >+ >+ template >+ inline void operator()(const A &a, const Loc &i) const >+ { >+ a(i) = val_m; >+ } >+ >+ void scalarCodeInfo(ScalarCodeInfo& i) const >+ { >+ i.arguments(1); >+ i.dimensions(Dim); >+ i.write(1, true); >+ i.useGuards(0, false); >+ } >+ >+ const int val_m; >+}; >+ >+ >+int main(int argc, char *argv[]) >+{ >+ // Initialize POOMA and output stream, using Tester class >+ Pooma::initialize(argc, argv); >+ Pooma::Tester tester(argc, argv); >+ >+ Pooma::blockingExpressions(true); >+ >+ Interval<2> domain(16, 16); >+ Loc<2> blocks(4, 4); >+ UniformGridLayout<2> layout(domain, blocks, GuardLayers<2>(1), DistributedTag()); >+ UniformRectilinearMesh<2> mesh(layout); >+ Centering<2> cell = canonicalCentering<2>(CellType, Continuous); >+ >+ Field, int, MultiPatch > > >+ a(cell, layout, mesh), >+ b(cell, layout, mesh); >+ >+ // initialize with zero >+ a.all() = 0; >+ b.all() = 0; >+ >+ // do assignments to various subdomains with both expression engine >+ // and scalar code functor and compare the full results. 
>+ Interval<2> I; >+ >+ (ScalarCode >(1))(a); >+ b = 1; >+ tester.check("default (physical) domain", all(a.all() == b.all())); >+ >+ I = Interval<2>(Interval<1>(8, 14), Interval<1>(0, 14)); >+ (ScalarCode >(2))(a, I); >+ b(I) = 2; >+ tester.check("partial set of physical patches", all(a.all() == b.all())); >+ >+ I = Interval<2>(Interval<1>(6, 9), Interval<1>(6, 9)); >+ (ScalarCode >(3))(a, I); >+ b(I) = 3; >+ tester.check("arbitrary physical domain", all(a.all() == b.all())); >+ >+ I = Interval<2>(Interval<1>(0, 15), Interval<1>(-1, 2)); >+ (ScalarCode >(4))(a, I); >+ b(I) = 4; >+ tester.check("arbitrary domain", all(a.all() == b.all())); >+ >+ int retval = tester.results("evaluatorTest9 (ScalarCode, evaluation domain)"); >+ Pooma::finalize(); >+ return retval; >+} >+ >+// ACL:rcsinfo >+// ---------------------------------------------------------------------- >+// $RCSfile: evaluatorTest2.cpp,v $ $Author: pooma $ >+// $Revision: 1.7 $ $Date: 2003/01/29 19:32:07 $ >+// ---------------------------------------------------------------------- >+// ACL:rcsinfo > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Wed Aug 18 15:48:35 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Wed, 18 Aug 2004 08:48:35 -0700 Subject: [PATCH] Fix ScalarCode with expression arguments In-Reply-To: References: Message-ID: <41237A53.7040206@codesourcery.com> Richard Guenther wrote: >This patch fixes expression arguments with (read-only) arguments >to ScalarCode. Before this patch there were several problems with >this: >- updating of the engine state did not handle expression engines >- internal guards were not updated correctly > >With fixing the above ScalarCode also gains from the previous guard >layer update optimizations. > >Tested with all ScalarCode tests and Evaluator tests. > >Ok? > > > I guess so. I do not understand, but that is because of my ignorance. >Richard. > >Btw. 
the test is evaluatorTest10 - patches to merge other tests are on the >way. > > >2004Aug18 Richard Guenther > > * src/Evaluator/MultiArgEvaluator.h: handle expression engines > in EngineWriteNotifier, pass stencil extent to SimpleIntersector. > src/Evaluator/SimpleIntersector.h: honour stencil extent, > recursively intersect and update expression engines. > src/Evaluator/tests/evaluatorTest10.cpp: new. > > >------------------------------------------------------------------------ > >Index: MultiArgEvaluator.h >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Evaluator/MultiArgEvaluator.h,v >retrieving revision 1.14 >diff -u -u -r1.14 MultiArgEvaluator.h >--- MultiArgEvaluator.h 21 Nov 2003 17:36:10 -0000 1.14 >+++ MultiArgEvaluator.h 18 Aug 2004 09:41:57 -0000 >@@ -74,6 +74,8 @@ > //----------------------------------------------------------------------------- > > template struct MultiArgEvaluatorTag; >+template class Field; >+template class Array; > > /** > * Implements: MultiArgEvaluator::evaluate >@@ -111,19 +113,30 @@ > } > > template >- void operator()(const A &a, bool f) const >+ void operator()(const A &a) const > { >- if (f) >- { >- // This isn't quite what we want here, because we may want to >- // write to a field containing multiple centering engines. >- // Need to rewrite notifyEngineWrite as an ExpressionApply, >- // and create a version of ExpressionApply that goes through >- // all the engines in a field. >+ // This isn't quite what we want here, because we may want to >+ // write to a field containing multiple centering engines. >+ // Need to rewrite notifyEngineWrite as an ExpressionApply, >+ // and create a version of ExpressionApply that goes through >+ // all the engines in a field. 
> >- notifyEngineWrite(a.engine()); >- dirtyRelations(a, WrappedInt()); >- } >+ notifyEngineWrite(a.engine()); >+ dirtyRelations(a, WrappedInt()); >+ } >+ >+ // overload for ExpressionTag engines to not fall on our faces compile time >+ template >+ void operator()(const Field >&) const >+ { >+ // we must be able to compile this, but never execute >+ PInsist(false, "writing to expression engine?"); >+ } >+ template >+ void operator()(const Array >&) const >+ { >+ // we must be able to compile this, but never execute >+ PInsist(false, "writing to expression engine?"); > } > }; > >@@ -172,7 +185,7 @@ > MultiArgEvaluator::evaluate(multiArg, function, > domain, info, kernel); > >- applyMultiArg(multiArg, EngineWriteNotifier(), info.writers()); >+ applyMultiArgIf(multiArg, EngineWriteNotifier(), info.writers()); > > Pooma::endExpression(); > } >@@ -265,7 +278,12 @@ > const Kernel &kernel) > { > typedef SimpleIntersector Inter_t; >- Inter_t inter(domain); >+ GuardLayers extent; >+ for (int i=0; i+ extent.lower(i) = info.lowerExtent(i); >+ extent.upper(i) = info.upperExtent(i); >+ } >+ Inter_t inter(domain, extent); > > applyMultiArg(multiArg, inter, info.useGuards()); > >@@ -368,7 +386,12 @@ > const Kernel &kernel) > { > typedef SimpleIntersector Inter_t; >- Inter_t inter(domain); >+ GuardLayers extent; >+ for (int i=0; i+ extent.lower(i) = info.lowerExtent(i); >+ extent.upper(i) = info.upperExtent(i); >+ } >+ Inter_t inter(domain, extent); > > applyMultiArg(multiArg, inter, info.useGuards()); > >Index: SimpleIntersector.h >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Evaluator/SimpleIntersector.h,v >retrieving revision 1.6 >diff -u -u -r1.6 SimpleIntersector.h >--- SimpleIntersector.h 22 Oct 2003 20:43:26 -0000 1.6 >+++ SimpleIntersector.h 18 Aug 2004 09:41:57 -0000 >@@ -91,8 +91,8 @@ > > // Default constructor is trival. 
> >- inline SimpleIntersectorData(const Interval &domain) >- : seenFirst_m(false), domain_m(domain) >+ inline SimpleIntersectorData(const Interval &domain, const GuardLayers &extent) >+ : seenFirst_m(false), domain_m(domain), extent_m(extent) > { > } > >@@ -105,9 +105,10 @@ > inline ~SimpleIntersectorData() { } > > template >- void intersect(const Engine &engine) >+ void intersect(const Engine &engine, bool useGuards) > { > typedef typename Engine::Layout_t Layout_t; >+ typedef typename NewEngine >::Type_t NewEngine_t; > const Layout_t &layout(engine.layout()); > > // add an assertion that all layouts have the same base (probably >@@ -126,6 +127,15 @@ > { > shared(layout.ID(), firstID_m); > } >+ // We need to process possible expression engines with different >+ // guard needs here. Modeled after StencilIntersector. >+ if (useGuards) { >+ expressionApply(NewEngine_t(engine, grow(domain_m, extent_m)), >+ IntersectorTag >(lhsi_m)); >+ } else { >+ expressionApply(NewEngine_t(engine, domain_m), >+ IntersectorTag >(lhsi_m)); >+ } > } > > inline >@@ -149,10 +159,14 @@ > INodeContainer_t inodes_m; > GlobalIDDataBase gidStore_m; > Interval domain_m; >+ GuardLayers extent_m; >+ Intersector lhsi_m; > }; > > /** >- * This intersector handles matching layouts only. >+ * This intersector handles matching layouts only. It also assumes you >+ * know in advance the amount of guards used. But it allows differentiating >+ * between engines that use or do not use guards. > * > * It doesnt intersect individual layouts but is done with creating INodes > * from the first layout it sees by intersecting with the domain. 
>@@ -179,8 +193,8 @@ > > enum { dimensions = Dim }; > >- SimpleIntersector(const Interval &domain) >- : pdata_m(new SimpleIntersectorData_t(domain)), useGuards_m(true) >+ SimpleIntersector(const Interval &domain, const GuardLayers &extent) >+ : pdata_m(new SimpleIntersectorData_t(domain, extent)), useGuards_m(true) > { } > > SimpleIntersector(const This_t &model) >@@ -189,8 +203,10 @@ > > This_t &operator=(const This_t &model) > { >- if (this != &model) >+ if (this != &model) { > pdata_m = model.pdata_m; >+ useGuards_m = model.useGuards_m; >+ } > return *this; > } > >@@ -221,7 +237,8 @@ > inline > void intersect(const Engine &l) const > { >- data()->intersect(l); >+ data()->intersect(l, useGuards()); >+ > } > > inline >@@ -236,7 +253,7 @@ > useGuards_m = f; > } > >- // Interface to be used by applyNode() >+ // Interface to be used by applyMultiArg() > > template > void operator()(const A &a, bool f) const >@@ -284,39 +301,39 @@ > // with the enclosed intersector. > //--------------------------------------------------------------------------- > >-template >+template > struct LeafFunctor >, >- ExpressionApply > > >+ ExpressionApply > > > { > typedef int Type_t; > > static Type_t > apply(const Engine > &engine, >- const ExpressionApply > &apply) >+ const ExpressionApply > &apply) > { > apply.tag().intersect(engine); > > if (apply.tag().useGuards()) >- engine.fillGuards(); >+ engine.fillGuards(apply.tag().data()->extent_m); > > return 0; > } > }; > >-template >+template > struct LeafFunctor >, >- ExpressionApply > > >+ ExpressionApply > > > { > typedef int Type_t; > > static Type_t > apply(const Engine > &engine, >- const ExpressionApply > &apply) >+ const ExpressionApply > &apply) > { > apply.tag().intersect(engine); > > if (apply.tag().useGuards()) >- engine.fillGuards(); >+ engine.fillGuards(apply.tag().data()->extent_m); > > return 0; > } >--- /dev/null Tue May 18 17:20:27 2004 >+++ tests/evaluatorTest10.cpp Wed Aug 18 11:19:58 2004 >@@ -0,0 +1,108 @@ >+// -*- C++ 
-*- >+// ACL:license >+// ---------------------------------------------------------------------- >+// This software and ancillary information (herein called "SOFTWARE") >+// called POOMA (Parallel Object-Oriented Methods and Applications) is >+// made available under the terms described here. The SOFTWARE has been >+// approved for release with associated LA-CC Number LA-CC-98-65. >+// >+// Unless otherwise indicated, this SOFTWARE has been authored by an >+// employee or employees of the University of California, operator of the >+// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with >+// the U.S. Department of Energy. The U.S. Government has rights to use, >+// reproduce, and distribute this SOFTWARE. The public may copy, distribute, >+// prepare derivative works and publicly display this SOFTWARE without >+// charge, provided that this Notice and any statement of authorship are >+// reproduced on all copies. Neither the Government nor the University >+// makes any warranty, express or implied, or assumes any liability or >+// responsibility for the use of this SOFTWARE. >+// >+// If SOFTWARE is modified to produce derivative works, such modified >+// SOFTWARE should be clearly marked, so as not to confuse it with the >+// version available from LANL. >+// >+// For more information about POOMA, send e-mail to pooma at acl.lanl.gov, >+// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. 
>+// ---------------------------------------------------------------------- >+// ACL:license >+ >+//----------------------------------------------------------------------------- >+// evaluatorTest5 - testing ScalarCode and expression arguments >+//----------------------------------------------------------------------------- >+ >+#include "Pooma/Pooma.h" >+#include "Pooma/Arrays.h" >+#include "Evaluator/ScalarCode.h" >+#include "Utilities/Tester.h" >+#include >+ >+ >+// ScalarCode just evaluating/assigning an expression >+ >+struct EvaluateExpr >+{ >+ EvaluateExpr() {} >+ >+ template >+ inline void operator()(const LHS &a, const RHS &b, const Loc<1> &i) const >+ { >+ a(i) = b.read(i); >+ } >+ >+ void scalarCodeInfo(ScalarCodeInfo& i) const >+ { >+ i.arguments(2); >+ i.dimensions(1); >+ i.write(0, true); >+ i.write(1, false); >+ i.useGuards(0, false); >+ i.useGuards(1, false); >+ } >+}; >+ >+ >+int main(int argc, char *argv[]) >+{ >+ // Initialize POOMA and output stream, using Tester class >+ Pooma::initialize(argc, argv); >+ Pooma::Tester tester(argc, argv); >+ >+ Pooma::blockingExpressions(true); >+ >+ Interval<1> domain(8); >+ UniformGridLayout<1> layout(domain, Loc<1>(2), GuardLayers<1>(1), DistributedTag()); >+ >+ Array<1, int, MultiPatch > > >+ a(layout), b(layout), c(layout); >+ >+ a = 0; >+ b = 1; >+ c = 2; >+ ScalarCode()(a, c-b); >+ tester.check("a = c - b", all(a(domain) == 1)); >+ tester.out() << a(domain) << std::endl; >+ >+ a = 0; >+ ScalarCode()(a, b(domain-1)+c(domain+1)); >+ tester.check("a = b(i-1) + c(i+1)", all(a(domain) == 3)); >+ tester.out() << a(domain) << std::endl; >+ >+ tester.out() << "Manually triggering igc fill" << std::endl; >+ b.engine().fillGuards(); >+ c.engine().fillGuards(); >+ a = 0; >+ ScalarCode()(a, b(domain-1)+c(domain+1)); >+ tester.check("a = b(i-1) + c(i+1)", all(a(domain) == 3)); >+ tester.out() << a(domain) << std::endl; >+ >+ int retval = tester.results("evaluatorTest10 (ScalarCode with expressions)"); >+ 
Pooma::finalize(); >+ return retval; >+} >+ >+// ACL:rcsinfo >+// ---------------------------------------------------------------------- >+// $RCSfile: evaluatorTest5.cpp,v $ $Author: pooma $ >+// $Revision: 1.1 $ $Date: 2003/02/20 16:39:42 $ >+// ---------------------------------------------------------------------- >+// ACL:rcsinfo > > -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Thu Aug 19 11:11:39 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Thu, 19 Aug 2004 13:11:39 +0200 (CEST) Subject: Compilers (was: Re: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers) In-Reply-To: <412376E4.6060907@codesourcery.com> Message-ID: On Wed, 18 Aug 2004, Jeffrey D. Oldham wrote: > Richard Guenther wrote: > > >-POOMA version 2.4.0 has been ported to the following platforms and > >+POOMA version 2.4.1 has been tested on the following platforms and > > compilers; please find the instructions for your platform within this > > document and follow the steps. > > > >- o SGI IRIX 6.X, with the Kuck and Associates KCC compiler > >- (v3.3d or later, including 3.4x) > >- o SGI IRIX 6.X, with the GCC compiler > >- (v2.95 or greater) > >- o SGI IRIX 6.X, with SGI C++ 7.3 or later compiler > >- (without patch 3659!) > >- o Linux, with the Kuck and Associates KCC compiler > >- (v3.3d or later, including 3.4x) > > o Linux, with the GCC compiler > >- (v2.95 or greater) > >+ (v3.3 or greater) > > o Linux, with the Intel icpc compiler > >- (v6.0 or greater) > >+ (v7.2 or greater) Ok, I just checked which additional compilers/architectures I can test on and unfortunately the SGI and HP machines I had access to were shut down recently (they were old anyways). So the ones above are the only ones I can test on. Does anyone here have access to machines with the SGI or the HP compilers? Does it even matter? What about NAG and PGI? It would be interesting for those who support it to check OpenMP support. 
Before a release I will test with gcc-3.3 and gcc-3.4 on ia32 and with
Intel on ia64 including OpenMP. I'll also do one round of testing with
MPI and Cheetah using gcc-3.4 on ia32.

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From oldham at codesourcery.com Thu Aug 19 15:14:39 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 19 Aug 2004 08:14:39 -0700
Subject: Compilers
In-Reply-To:
References:
Message-ID: <4124C3DF.7030807@codesourcery.com>

Richard Guenther wrote:

>On Wed, 18 Aug 2004, Jeffrey D. Oldham wrote:
>
>>Richard Guenther wrote:
>>
>>>-POOMA version 2.4.0 has been ported to the following platforms and
>>>+POOMA version 2.4.1 has been tested on the following platforms and
>>>compilers; please find the instructions for your platform within this
>>>document and follow the steps.
>>>
>>>- o SGI IRIX 6.X, with the Kuck and Associates KCC compiler
>>>- (v3.3d or later, including 3.4x)
>>>- o SGI IRIX 6.X, with the GCC compiler
>>>- (v2.95 or greater)
>>>- o SGI IRIX 6.X, with SGI C++ 7.3 or later compiler
>>>- (without patch 3659!)
>>>- o Linux, with the Kuck and Associates KCC compiler
>>>- (v3.3d or later, including 3.4x)
>>> o Linux, with the GCC compiler
>>>- (v2.95 or greater)
>>>+ (v3.3 or greater)
>>> o Linux, with the Intel icpc compiler
>>>- (v6.0 or greater)
>>>+ (v7.2 or greater)
>>>
>
>Ok, I just checked which additional compilers/architectures I can test on
>and unfortunately the SGI and HP machines I had access to were shut down
>recently (they were old anyways). So the ones above are the only ones
>I can test on. Does anyone here have access to machines with the
>SGI or the HP compilers? Does it even matter? What about NAG and PGI?
>It would be interesting for those who support it to check OpenMP support.
>
>Before a release I will test with gcc-3.3 and gcc-3.4 on ia32 and with
>Intel on ia64 including OpenMP. I'll also do one round of testing with
>MPI and Cheetah using gcc-3.4 on ia32.
>

I do not think SGI or HP matter right now. We should definitely check
OpenMP. I can check gcc 3.3.4 and 3.4 on ia32 with Cheetah and MM.

Have all outstanding patches been checked in? Are we ready to start
testing Pooma and Cheetah? (Cheetah has a set of tests to check. See
the Cheetah CVS repository.)

--
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Thu Aug 19 20:16:20 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 19 Aug 2004 22:16:20 +0200
Subject: [pooma-dev] Re: Compilers
In-Reply-To: <4124C3DF.7030807@codesourcery.com>
References: <4124C3DF.7030807@codesourcery.com>
Message-ID: <41250A94.3020305@tat.physik.uni-tuebingen.de>

Jeffrey D. Oldham wrote:
> Richard Guenther wrote:
>
>> On Wed, 18 Aug 2004, Jeffrey D. Oldham wrote:
>>
>>> Richard Guenther wrote:
>>>
>>>> -POOMA version 2.4.0 has been ported to the following platforms and
>>>> +POOMA version 2.4.1 has been tested on the following platforms and
>>>> compilers; please find the instructions for your platform within this
>>>> document and follow the steps.
>>>>
>>>> - o SGI IRIX 6.X, with the Kuck and Associates KCC compiler
>>>> - (v3.3d or later, including 3.4x)
>>>> - o SGI IRIX 6.X, with the GCC compiler
>>>> - (v2.95 or greater)
>>>> - o SGI IRIX 6.X, with SGI C++ 7.3 or later compiler
>>>> - (without patch 3659!)
>>>> - o Linux, with the Kuck and Associates KCC compiler
>>>> - (v3.3d or later, including 3.4x)
>>>> o Linux, with the GCC compiler
>>>> - (v2.95 or greater)
>>>> + (v3.3 or greater)
>>>> o Linux, with the Intel icpc compiler
>>>> - (v6.0 or greater)
>>>> + (v7.2 or greater)
>>>>
>>
>> Ok, I just checked which additional compilers/architectures I can test on
>> and unfortunately the SGI and HP machines I had access to were shut down
>> recently (they were old anyways). So the ones above are the only ones
>> I can test on. Does anyone here have access to machines with the
>> SGI or the HP compilers? Does it even matter? What about NAG and PGI?
>> It would be interesting for those who support it to check OpenMP support.
>>
>> Before a release I will test with gcc-3.3 and gcc-3.4 on ia32 and with
>> Intel on ia64 including OpenMP. I'll also do one round of testing with
>> MPI and Cheetah using gcc-3.4 on ia32.
>>
> I do not think SGI or HP matter right now. We should definitely check
> OpenMP. I can check gcc 3.3.4 and 3.4 on ia32 with Cheetah and MM.

Ok.

> Have all outstanding patches been checked in?

I don't know yet - I think I submitted all patches I can remember but
still want to go through a diff of two of my local repositories to
current CVS. That may take another week or so.

Richard.

> Are we ready to start
> testing Pooma and Cheetah? (Cheetah has a set of tests to check. See
> the Cheetah CVS repository.)
>

From oldham at codesourcery.com Thu Aug 19 20:20:41 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 19 Aug 2004 13:20:41 -0700
Subject: [pooma-dev] Re: Compilers
In-Reply-To: <41250A94.3020305@tat.physik.uni-tuebingen.de>
References: <4124C3DF.7030807@codesourcery.com> <41250A94.3020305@tat.physik.uni-tuebingen.de>
Message-ID: <41250B99.7030102@codesourcery.com>

Richard Guenther wrote:

> Jeffrey D. Oldham wrote:
>
>> Richard Guenther wrote:
>>
>>> On Wed, 18 Aug 2004, Jeffrey D. Oldham wrote:
>>>
>>>> Richard Guenther wrote:
>>>>
>>>>> -POOMA version 2.4.0 has been ported to the following platforms and
>>>>> +POOMA version 2.4.1 has been tested on the following platforms and
>>>>> compilers; please find the instructions for your platform within this
>>>>> document and follow the steps.
>>>>>
>>>>> - o SGI IRIX 6.X, with the Kuck and Associates KCC compiler
>>>>> - (v3.3d or later, including 3.4x)
>>>>> - o SGI IRIX 6.X, with the GCC compiler
>>>>> - (v2.95 or greater)
>>>>> - o SGI IRIX 6.X, with SGI C++ 7.3 or later compiler
>>>>> - (without patch 3659!)
>>>>> - o Linux, with the Kuck and Associates KCC compiler
>>>>> - (v3.3d or later, including 3.4x)
>>>>> o Linux, with the GCC compiler
>>>>> - (v2.95 or greater)
>>>>> + (v3.3 or greater)
>>>>> o Linux, with the Intel icpc compiler
>>>>> - (v6.0 or greater)
>>>>> + (v7.2 or greater)
>>>>>
>>>
>>> Ok, I just checked which additional compilers/architectures I can
>>> test on
>>> and unfortunately the SGI and HP machines I had access to were shut
>>> down
>>> recently (they were old anyways). So the ones above are the only ones
>>> I can test on. Does anyone here have access to machines with the
>>> SGI or the HP compilers? Does it even matter? What about NAG and PGI?
>>> It would be interesting for those who support it to check OpenMP
>>> support.
>>>
>>> Before a release I will test with gcc-3.3 and gcc-3.4 on ia32 and with
>>> Intel on ia64 including OpenMP. I'll also do one round of testing with
>>> MPI and Cheetah using gcc-3.4 on ia32.
>>>
>> I do not think SGI or HP matter right now. We should definitely
>> check OpenMP. I can check gcc 3.3.4 and 3.4 on ia32 with Cheetah and
>> MM.
>
> Ok.
>
>> Have all outstanding patches been checked in?
>
> I don't know yet - I think I submitted all patches I can remember but
> still want to go through a diff of two of my local repositories to
> current CVS. That may take another week or so.

OK. I await your confirmation before beginning testing.

> Richard.
>
>> Are we ready to start testing Pooma and Cheetah? (Cheetah has a set
>> of tests to check. See the Cheetah CVS repository.)
>>
>

--
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Thu Aug 19 20:35:03 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 19 Aug 2004 22:35:03 +0200
Subject: [pooma-dev] [RFC] Removing workarounds for pre-ISO C++ compilers
In-Reply-To: <412376E4.6060907@codesourcery.com>
References: <412376E4.6060907@codesourcery.com>
Message-ID: <41250EF7.1050909@tat.physik.uni-tuebingen.de>

Jeffrey D. Oldham wrote:
> Richard Guenther wrote:
>
>> On Tue, 17 Aug 2004, Jeffrey D. Oldham wrote:
>>
>>> I would prefer to keep the compiler list relatively short and containing
>>> the most popular compliant compilers.
>>>
>>> Would you be willing to start modifying the README file for a release?
>>>
>>
>> Like the following?
>>
>> Richard.
>>
>
> Yes, like the following. Thanks for making all these changes. Perhaps,
> we should add one paragraph describing the purpose of Pooma to the very
> beginning of the README file.

Committed like so:

2004Aug19 Richard Guenther

    * README: add general description, add 2.4.1 release notes.
    INSTALL.unix: remove obsolete information, update for 2.4.1.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p
URL:

From rguenth at tat.physik.uni-tuebingen.de Thu Aug 19 21:17:23 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 19 Aug 2004 23:17:23 +0200
Subject: [PATCH] Bounds check only if POOMA_BOUNDS_CHECK
Message-ID: <412518E3.5080001@tat.physik.uni-tuebingen.de>

This patch disables GuardLayers bounds checking if not configured with
bounds checking on.

Ok?

Richard.

2004Aug19 Richard Guenther

    * src/Layout/GuardLayers.h: disable bounds-checking if not
    POOMA_BOUNDS_CHECK.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p
URL:

From rguenth at tat.physik.uni-tuebingen.de Thu Aug 19 21:28:38 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 19 Aug 2004 23:28:38 +0200
Subject: [PATCH] Correct some docs
Message-ID: <41251B86.9050007@tat.physik.uni-tuebingen.de>

This patch corrects hyperlinks in the HTML documents inside docs/ and
makes some minor improvements (as I came across them).

Ok?

Richard.

2004Aug19 Richard Guenther

    * docs/introduction.html: fix references to POOMA homepage
    and mailing list.
    docs/legal.html: likewise.
    docs/reading.html: remove defunct links.
    docs/tut-02.html: minor corrections.
    docs/tut-04.html: likewise.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p
URL:

From oldham at codesourcery.com Thu Aug 19 21:53:53 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 19 Aug 2004 14:53:53 -0700
Subject: [PATCH] Correct some docs
In-Reply-To: <41251B86.9050007@tat.physik.uni-tuebingen.de>
References: <41251B86.9050007@tat.physik.uni-tuebingen.de>
Message-ID: <41252171.3050109@codesourcery.com>

Richard Guenther wrote:

>
> This patch corrects hyperlinks in the HTML documents inside docs/ and
> makes some minor improvements (as I came across them).
>
> Ok?
>
> Richard.
>
>
> 2004Aug19 Richard Guenther
>
>     * docs/introduction.html: fix references to POOMA homepage
>     and mailing list.
>     docs/legal.html: likewise.
>     docs/reading.html: remove defunct links.
>     docs/tut-02.html: minor corrections.
>     docs/tut-04.html: likewise.
>
>

Yes, please commit these good improvements. After you commit these
changes, we should probably use the W3C link checker and HTML validity
checker.

>------------------------------------------------------------------------
>
>? docs/reference/doxygen.log
>? docs/reference/html
>Index: docs/introduction.html
>===================================================================
>RCS file: /home/pooma/Repository/r2/docs/introduction.html,v
>retrieving revision 1.1
>diff -u -u -r1.1 introduction.html
>--- docs/introduction.html 19 Mar 2001 16:11:13 -0000 1.1
>+++ docs/introduction.html 19 Aug 2004 21:23:42 -0000
>@@ -154,11 +154,11 @@
> before proceeding.
>
> You may also wish to look at the
>-href="http://www.acl.lanl.gov/pooma">POOMA web site for updates,
>+href="http://www.pooma.com/">POOMA web site for updates,
> bug fixes, and discussion of the library and how it can be used. If
> you have any questions about POOMA or its terms of use, or if you need
> help downloading or installing POOMA, please send mail to
>-href="mailto:pooma-devel at lanl.gov">pooma-devel at lanl.gov.
>+href="mailto:pooma-dev at pooma.codesourcery.com">pooma-dev at pooma.codesourcery.com.
>
>
>
>Index: docs/legal.html
>===================================================================
>RCS file: /home/pooma/Repository/r2/docs/legal.html,v
>retrieving revision 1.2
>diff -u -u -r1.2 legal.html
>--- docs/legal.html 15 Oct 2001 17:34:29 -0000 1.2
>+++ docs/legal.html 19 Aug 2004 21:23:42 -0000
>@@ -33,9 +33,9 @@
> version available from LANL.
>
> For more information about POOMA, send e-mail to
>-pooma-devel at lanl.gov,
>+pooma-dev at pooma.codesourcery.com,
> or visit the POOMA web page at
>-http://www.acl.lanl.gov/pooma.
>+http://www.pooma.com/.
>
>
>
>Index: docs/reading.html
>===================================================================
>RCS file: /home/pooma/Repository/r2/docs/reading.html,v
>retrieving revision 1.1
>diff -u -u -r1.1 reading.html
>--- docs/reading.html 19 Mar 2001 16:11:13 -0000 1.1
>+++ docs/reading.html 19 Aug 2004 21:23:42 -0000
>@@ -64,12 +64,6 @@
> of these entities can serve as a model of a concept. Using these
> ideas, Austern also provides a complete reference for the STL.
>
>-Finally, see the POOMA web site for
>-
>-on-line presentations and
>-
>-technical papers describing the POOMA framework.
>-
>
> Bibliography
>
>Index: docs/tut-02.html
>===================================================================
>RCS file: /home/pooma/Repository/r2/docs/tut-02.html,v
>retrieving revision 1.2
>diff -u -u -r1.2 tut-02.html
>--- docs/tut-02.html 26 Mar 2001 23:49:59 -0000 1.2
>+++ docs/tut-02.html 19 Aug 2004 21:23:42 -0000
>@@ -377,7 +377,7 @@
> A Note on Expressions
>
> As you may have guessed from the preceding discussion,
>-POOMA expressions are first-class ConstArrays
>+POOMA expressions are first-class non-writable Arrays
> with an expression engine. As a consequence, expressions can be
> subscripted directly, as in:
>
>@@ -419,7 +419,7 @@
> a = sin(iota(n1,n2).comp(0)) + iota(n1,n2).comp(1)*5;
>
>
>-In general, iota(domain) returns a ConstArray
>+In general, iota(domain) returns an Array
> whose elements are vectors, such that iota(domain)(i,j) is
> Vector<2,int>(i,j). These values can be used in
> expressions, or stored in objects, as in:
>Index: docs/tut-04.html
>===================================================================
>RCS file: /home/pooma/Repository/r2/docs/tut-04.html,v
>retrieving revision 1.2
>diff -u -u -r1.2 tut-04.html
>--- docs/tut-04.html 26 Mar 2001 23:49:59 -0000 1.2
>+++ docs/tut-04.html 19 Aug 2004 21:23:43 -0000
>@@ -401,7 +401,7 @@
> arithmetic types like int or double. In particular,
> Vector, Tensor, and complex are explicitly
> supported. Please contact
>-href="mailto:pooma-devel at lanl.gov">pooma-devel at lanl.gov for
>+href="mailto:pooma-dev at pooma.codesourcery.com">pooma-dev at pooma.codesourcery.com for
> information on using other, more complicated types.
>
> The Array::comp() method used on line 16 does
>@@ -1007,8 +1007,8 @@
> a.comp(2) does not copy values out of a
> into temporary storage.
>
>-If the source array of a component view is writable (i.e. not a
>-ConstArray), then that component view can appear on
>+If the source array of a component view is writable,
>+then that component view can appear on
> either side of the assignment operator. For example:
>
>
>@@ -1021,7 +1021,7 @@
> used to make an object to store the view, as in:
>
>
>-ComponentView<Loc<1>, Array<2, Vector<3> > > va = a.comp(1);
>+typename ComponentView<Loc<1>, Array<2, Vector<3> > >::Type_t va = a.comp(1);
>
> >

Here, the argument "Loc<1>" indicates that the component is singly-indexed. > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Thu Aug 19 21:59:09 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Thu, 19 Aug 2004 14:59:09 -0700 Subject: [PATCH] Bounds check only if POOMA_BOUNDS_CHECK In-Reply-To: <412518E3.5080001@tat.physik.uni-tuebingen.de> References: <412518E3.5080001@tat.physik.uni-tuebingen.de> Message-ID: <412522AD.1030906@codesourcery.com> Richard Guenther wrote: > > This patch disables GuardLayers boundschecking if not configured with > bounds checking on. > > Ok? > > Richard. > > > 2004Aug19 Richard Guenther > > * src/Layout/GuardLayers.h: disable bounds-checking if not > POOMA_BOUNDS_CHECK. > Yes, please commit this change. >------------------------------------------------------------------------ > >Index: src/Layout/GuardLayers.h >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Layout/GuardLayers.h,v >retrieving revision 1.10 >diff -u -u -r1.10 GuardLayers.h >--- src/Layout/GuardLayers.h 26 Oct 2003 11:28:11 -0000 1.10 >+++ src/Layout/GuardLayers.h 19 Aug 2004 21:15:39 -0000 >@@ -123,12 +123,16 @@ > > int lower(int i) const > { >+#if POOMA_BOUNDS_CHECK > PInsist(i<Dim&&i>=0," GuardLayers index out of range "); >+#endif > return lower_m[i]; > } > int upper(int i) const > { >+#if POOMA_BOUNDS_CHECK > PInsist(i<Dim&&i>=0," GuardLayers index out of range "); >+#endif > return upper_m[i]; > } > >@@ -138,12 +142,16 @@ > > int &lower(int i) > { >+#if POOMA_BOUNDS_CHECK > PInsist(i<Dim&&i>=0," GuardLayers index out of range "); >+#endif > return lower_m[i]; > } > int &upper(int i) > { >+#if POOMA_BOUNDS_CHECK > PInsist(i<Dim&&i>=0," GuardLayers index out of range "); >+#endif > return upper_m[i]; > } > > > -- Jeffrey D.
Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Fri Aug 20 09:11:55 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Fri, 20 Aug 2004 11:11:55 +0200 (CEST) Subject: [pooma-dev] Re: [PATCH] Correct some docs In-Reply-To: <41252171.3050109@codesourcery.com> Message-ID: On Thu, 19 Aug 2004, Jeffrey D. Oldham wrote: > Richard Guenther wrote: > > > > > This patch corrects hyperrefs of the html documents inside docs/ and > > does some minor improvements (just as I came along). > > > > Ok? > > > > Richard. > > > > > > 2004Aug19 Richard Guenther > > > > * docs/introduction.html: fix references to POOMA homepage > > and mailinglist. > > docs/legal.html: likewise. > > docs/reading.html: remove defunct links. > > docs/tut-02.html: minor corrections. > > docs/tut-04.html: likewise. > > > > > > > Yes, please commit this good improvements. After you commit these > changes, we should probably use the W3C link checker and HTML validity > checker. Hum. The documents seem to be not in a good shape wrt conformance. But I can easily run them through HTML Tidy - manually fixing them will take too much time. Would this be ok? Thanks, Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From oldham at codesourcery.com Fri Aug 20 19:37:52 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Fri, 20 Aug 2004 12:37:52 -0700 Subject: [pooma-dev] Re: [PATCH] Correct some docs In-Reply-To: References: Message-ID: <41265310.3010606@codesourcery.com> Richard Guenther wrote: >On Thu, 19 Aug 2004, Jeffrey D. Oldham wrote: > > > >>Richard Guenther wrote: >> >> >> >>>This patch corrects hyperrefs of the html documents inside docs/ and >>>does some minor improvements (just as I came along). >>> >>>Ok? >>> >>>Richard. >>> >>> >>>2004Aug19 Richard Guenther >>> >>> * docs/introduction.html: fix references to POOMA homepage >>> and mailinglist. >>> docs/legal.html: likewise. 
>>> docs/reading.html: remove defunct links. >>> docs/tut-02.html: minor corrections. >>> docs/tut-04.html: likewise. >>> >>> >>> >>> >>> >>Yes, please commit this good improvements. After you commit these >>changes, we should probably use the W3C link checker and HTML validity >>checker. >> >> > >Hum. The documents seem to be not in a good shape wrt conformance. But I >can easily run them through HTML Tidy - manually fixing them will take >too much time. > >Would this be ok? > I modified the HTML documents in the docs/ subdirectory to achieve HTML 4.0 validity and to also check the links. I used http://validator.w3.org/ and http://validator.w3.org/checklink for this work. No major changes were made except four tables are no longer shifted left. All documents now pass except for links to known missing illustrations (these illustrations have been missing for several years) and incorrect use of ... in background.html. I do not know how to revise this

section to support and maintain HTML 4.0 validity. I learned that HTML should be created by tools to ensure validity. Are these OK to commit to the Pooma CVS repository? (If you want to use tidy on these after we resolve these proposed changes, that's fine, but it's not essential. I'd rather ensure all outstanding patches are resolved and the code works correctly.) -- Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: doc.20Aug.12.4.ChangeLog URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: doc.20Aug.12.4.patch Type: text/x-patch Size: 186907 bytes Desc: not available URL: From rguenth at tat.physik.uni-tuebingen.de Fri Aug 20 19:43:52 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Fri, 20 Aug 2004 21:43:52 +0200 Subject: [pooma-dev] Re: [PATCH] Correct some docs In-Reply-To: References: Message-ID: <41265478.70109@tat.physik.uni-tuebingen.de> Richard Guenther wrote: > On Thu, 19 Aug 2004, Jeffrey D. Oldham wrote: > > >>Richard Guenther wrote: >> >> >>>This patch corrects hyperrefs of the html documents inside docs/ and >>>does some minor improvements (just as I came along). >>> >>>Ok? >>> >>>Richard. >>> >>> >>>2004Aug19 Richard Guenther >>> >>> * docs/introduction.html: fix references to POOMA homepage >>> and mailinglist. >>> docs/legal.html: likewise. >>> docs/reading.html: remove defunct links. >>> docs/tut-02.html: minor corrections. >>> docs/tut-04.html: likewise. >>> >>> >>> >> >>Yes, please commit this good improvements. After you commit these >>changes, we should probably use the W3C link checker and HTML validity >>checker. > > > Hum. The documents seem to be not in a good shape wrt conformance. But I > can easily run them through HTML Tidy - manually fixing them will take > too much time. > > Would this be ok? 
Doing this and manually fixing a few warnings results in a 180kB compressed diff of the pages which I don't want to post and possibly you don't want to review :) Would it be ok to check that changes in and then possibly re-iterate after checking the result? Thanks, Richard. From rguenth at tat.physik.uni-tuebingen.de Fri Aug 20 19:49:12 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Fri, 20 Aug 2004 21:49:12 +0200 Subject: [pooma-dev] Re: [PATCH] Correct some docs In-Reply-To: <41265310.3010606@codesourcery.com> References: <41265310.3010606@codesourcery.com> Message-ID: <412655B8.3090806@tat.physik.uni-tuebingen.de> Jeffrey D. Oldham wrote: > Richard Guenther wrote: > >> On Thu, 19 Aug 2004, Jeffrey D. Oldham wrote: >> >> >> >>> Richard Guenther wrote: >>> >>> >>> >>>> This patch corrects hyperrefs of the html documents inside docs/ and >>>> does some minor improvements (just as I came along). >>>> >>>> Ok? >>>> >>>> Richard. >>>> >>>> >>>> 2004Aug19 Richard Guenther >>>> >>>> * docs/introduction.html: fix references to POOMA homepage >>>> and mailinglist. >>>> docs/legal.html: likewise. >>>> docs/reading.html: remove defunct links. >>>> docs/tut-02.html: minor corrections. >>>> docs/tut-04.html: likewise. >>>> >>>> >>>> >>>> >>> >>> Yes, please commit this good improvements. After you commit these >>> changes, we should probably use the W3C link checker and HTML validity >>> checker. >>> >> >> >> Hum. The documents seem to be not in a good shape wrt conformance. >> But I >> can easily run them through HTML Tidy - manually fixing them will take >> too much time. >> >> Would this be ok? >> > I modified the HTML documents in the docs/ subdirectory to achieve HTML > 4.0 validity and to also check the links. I used > http://validator.w3.org/ and http://validator.w3.org/checklink for this > work. No major changes were made except four tables are no longer > shifted left. 
All documents now pass except for links to known missing > illustrations (these illustrations have been missing for several years) > and incorrect use of ... in background.html. I do not know > how to revise this
section to support and > maintain HTML 4.0 validity. > > I learned that HTML should be created by tools to ensure validity. > > Are these OK to commit to the Pooma CVS repository? I think these are ok - they cover more stuff than I got with simply tidy -m, my manual fixes seem to be contained, too. I'll work on-top of your changes if necessary. Thanks for doing the work, Richard. > (If you want to use tidy on these after we resolve these proposed > changes, that's fine, but it's not essential. I'd rather ensure all > outstanding patches are resolved and the code works correctly.) Yes, me too. From oldham at codesourcery.com Fri Aug 20 20:15:30 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Fri, 20 Aug 2004 13:15:30 -0700 Subject: [pooma-dev] Re: [PATCH] Correct some docs In-Reply-To: <412655B8.3090806@tat.physik.uni-tuebingen.de> References: <41265310.3010606@codesourcery.com> <412655B8.3090806@tat.physik.uni-tuebingen.de> Message-ID: <41265BE2.90008@codesourcery.com> Richard Guenther wrote: > Jeffrey D. Oldham wrote: > >> Richard Guenther wrote: >> >>> On Thu, 19 Aug 2004, Jeffrey D. Oldham wrote: >>> >>> >>> >>>> Richard Guenther wrote: >>>> >>>> >>>> >>>>> This patch corrects hyperrefs of the html documents inside docs/ and >>>>> does some minor improvements (just as I came along). >>>>> >>>>> Ok? >>>>> >>>>> Richard. >>>>> >>>>> >>>>> 2004Aug19 Richard Guenther >>>>> >>>>> * docs/introduction.html: fix references to POOMA homepage >>>>> and mailinglist. >>>>> docs/legal.html: likewise. >>>>> docs/reading.html: remove defunct links. >>>>> docs/tut-02.html: minor corrections. >>>>> docs/tut-04.html: likewise. >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> Yes, please commit this good improvements. After you commit these >>>> changes, we should probably use the W3C link checker and HTML validity >>>> checker. >>>> >>> >>> >>> >>> Hum. The documents seem to be not in a good shape wrt conformance. 
>>> But I >>> can easily run them through HTML Tidy - manually fixing them will take >>> too much time. >>> >>> Would this be ok? >>> >> I modified the HTML documents in the docs/ subdirectory to achieve >> HTML 4.0 validity and to also check the links. I used >> http://validator.w3.org/ and http://validator.w3.org/checklink for >> this work. No major changes were made except four tables are no >> longer shifted left. All documents now pass except for links to >> known missing illustrations (these illustrations have been missing >> for several years) and incorrect use of ... in >> background.html. I do not know how to revise this
>> section to support and maintain HTML 4.0 validity. >> >> I learned that HTML should be created by tools to ensure validity. >> >> Are these OK to commit to the Pooma CVS repository? > > > I think these are ok - they cover more stuff than I got with simply > tidy -m, my manual fixes seem to be contained, too. I'll work on-top > of your changes if necessary. > > Thanks for doing the work, > > Richard. > >> (If you want to use tidy on these after we resolve these proposed >> changes, that's fine, but it's not essential. I'd rather ensure all >> outstanding patches are resolved and the code works correctly.) > > > Yes, me too. > I committed the changes. Now, it's your turn if you wish. -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Sat Aug 21 19:16:31 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Sat, 21 Aug 2004 21:16:31 +0200 Subject: [PATCH] Fix/improve PETSc wrapper Message-ID: <41279F8F.1010107@tat.physik.uni-tuebingen.de> Found in one of my repositories. Ok? Richard. 2004Aug21 Richard Guenther * src/Transform/PETSc.h: handle expression engines for initialization, support periodic setup, fix MP patch computation. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: p URL: From rguenth at tat.physik.uni-tuebingen.de Sat Aug 21 20:13:32 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Sat, 21 Aug 2004 22:13:32 +0200 Subject: [PATCH] Fix reductions for MPI operation Message-ID: <4127ACEC.2020708@tat.physik.uni-tuebingen.de> This patch fixes (works around) a previously discovered problem (remember the WaitingIterate). I'm sure there is a real problem to fix (at least for MPI - I'm not sure about Cheetah), and this is the least intrusive way of fixing it until the right idea for a cross-context csem like mechanism pops up. Without this patch random lockups during reductions may occour. Ok? Richard. 
2004Aug21 Richard Guenther * src/Engine/RemoteEngine.h: For MPI avoid doing blocking operation during reductions while iterates are still pending. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: p2 URL: From rguenth at tat.physik.uni-tuebingen.de Sat Aug 21 22:02:25 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Sun, 22 Aug 2004 00:02:25 +0200 Subject: [PATCH] Fix compiling Doof2d Message-ID: <4127C671.10503@tat.physik.uni-tuebingen.de> Fixes ISO conformance problems with Doof2d benchmark. Ok? Richard. 2004Aug22 Richard Guenther * benchmarks/Doof2d/Doof2d.h: fix ISO conformance. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: p3 URL: From rguenth at tat.physik.uni-tuebingen.de Mon Aug 23 11:17:50 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 23 Aug 2004 13:17:50 +0200 (CEST) Subject: [PATCH] Convert ParticlesDoc.txt to html Message-ID: As subject says. Also adds common header to Layout.html. Ok? Richard. 2004Aug23 Richard Guenther * docs/Layout.html: adjust background color, add head image. docs/index.html: refer to ParticlesDoc.html. docs/ParticlesDoc.html: new. docs/ParticlesDoc.txt: remove. -------------- next part -------------- Index: index.html =================================================================== RCS file: /home/pooma/Repository/r2/docs/index.html,v retrieving revision 1.3 diff -u -u -r1.3 index.html --- index.html 20 Aug 2004 20:14:19 -0000 1.3 +++ index.html 23 Aug 2004 11:14:54 -0000 @@ -23,7 +23,7 @@

Parallelism Models: Messaging and Threads

Layouts

-

New description of Particles

+

New description of Particles

Text I/O

Object I/O

New Tensor functionality

--- /dev/null Tue May 18 17:20:27 2004 +++ ParticlesDoc.html Mon Aug 23 13:13:27 2004 @@ -0,0 +1,1520 @@ + + + + + + Layout and related classes + + + +
POOMA banner
+ + +

POOMA Particles Documentation

+ + +

Introduction

+ +

+Particles are primarily used in one of two ways in large scientific +applications. The first is to track sample particles using Monte +Carlo techniques, for example, to gather statistics that describe the +conditions of a complex physical system. Particles of this kind are +often referred to as "tracers". The second is to perform direct +numerical simulation of systems that contain discrete point-like +entities such as ions or molecules. + +

+In both scenarios, the application contains one or more sets of +particles. Each set has some data associated with it that describes +its members' characteristics, such as mass or momentum. Particles +typically exist in a spatial domain, and they may interact directly +with one another or with field quantities defined on that domain. + +

+This document gives an overview of POOMA's support for particles, +then discusses some implementation details. The classes introduced in +this tutorial are illustrated by two short programs: one that tracks +particles under the influence of a simple one-dimensional harmonic +oscillator potential, and another that models particles bouncing off +the walls of a closed three-dimensional box. Later on, we will show +how particles and fields can interact in a simulation code. + + +

Overview

+ +

+POOMA's Particles class is a container for a heterogeneous collection +of particle attributes. The class uses dynamic storage for particle +data (in the form of a set of POOMA DynamicArray objects), so that +particles can be added or deleted as necessary. It contains a layout +object that manages the distribution of particle data across multiple +patches, and it applies boundary conditions to particles when attribute +data values exceed a prescribed range. In addition, global functions +are provided for interpolating data between particle and field element +positions. + +

+Each Particles object keeps a list of pointers to its elements' +attributes. When an application wants to add or delete particles, it +invokes a method on the Particles object, which delegates the call to +the layout object for the contained attributes. Particles also +provides a member function called sync(), which the application +invokes in order to update the global particle count and numbering, +update the data distribution across patches, and apply the particle +boundary conditions. + +

+Applications define a specific type of particles collection by +deriving from the Particles class. The derived class declares data +members for the attributes needed to characterize this type of +particle. (The types of these data members are discussed below.) +The constructor for this derived class should call the method +Particles::addAttribute() to register each attribute and add +it to the list maintained by Particles. In this way, the Particles +class can be extended by the application to accommodate any sort of +particle description. + +

+The distribution of particle data stored in DynamicArray objects +is directed by a particle layout class. Each particle layout class +employs a particular strategy to determine the patch in which a +particle's data should be stored. For instance, SpatialLayout keeps +each particle in the patch that contains field data for elements +that are nearest to the particle's current spatial position. This +strategy is useful for cases where the particles need to interact +with field data or with neighboring particles. + + +
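The two placement strategies named above can be illustrated with a minimal sketch in plain standard C++. This is not POOMA code: the function names, the one-dimensional domain `[0, length)`, and the equal-width slabs are all assumptions made only to show the difference between choosing a patch by position and choosing it by count.

```cpp
#include <cassert>

// Hypothetical illustration only; these are not POOMA functions.

// Spatial strategy: the patch is the slab of [0, length) that contains the
// particle position, so a particle sits next to the field data it touches.
int spatialPatch(double pos, double length, int numPatches)
{
  assert(numPatches > 0 && length > 0.0 && pos >= 0.0 && pos < length);
  int p = static_cast<int>(pos / length * numPatches);
  return p < numPatches ? p : numPatches - 1;  // guard against round-off
}

// Uniform strategy: position is ignored; each patch just receives a
// near-equal share of the n particles (earlier patches take the remainder).
int uniformCount(int n, int numPatches, int p)
{
  assert(numPatches > 0 && n >= 0 && p >= 0 && p < numPatches);
  return n / numPatches + (p < n % numPatches ? 1 : 0);
}
```

With 10 particles and 4 patches, the uniform rule yields counts of 3, 3, 2, and 2, whereas the spatial rule could place all 10 in one patch if they cluster in one slab.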

Particle Attributes

+ +

+Each particle attribute is implemented as a DynamicArray, a class +derived from the one-dimensional specialization of POOMA's Array +class. DynamicArray extends the notion of a one-dimensional array +to allow applications to add or delete elements at will. When +particles are destroyed, the empty slots left behind can be filled +by moving elements from the end of the list (BackFill) or by sliding +all the remaining elements over and preserving the existing order +(ShiftUp). At the same time, DynamicArray objects can be used in +data-parallel expressions in the same way as ordinary Array objects, +so that the application can update particle attributes such as +position and velocity using either a single statement or a loop over +individual particles. + +
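The BackFill and ShiftUp behaviours described above can be sketched on a `std::vector`. The helper names `destroyBackFill` and `destroyShiftUp` are hypothetical and are not part of DynamicArray; the sketch only mimics what happens to the element order in each case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers, not POOMA API.

// BackFill: overwrite the deleted slot with the last element, then shrink.
// Constant time, but the relative order of the elements changes.
template <class T>
void destroyBackFill(std::vector<T>& a, std::size_t i)
{
  assert(i < a.size());
  a[i] = a.back();
  a.pop_back();
}

// ShiftUp: slide every later element one slot toward the front.
// Linear time, but the existing order is preserved.
template <class T>
void destroyShiftUp(std::vector<T>& a, std::size_t i)
{
  assert(i < a.size());
  a.erase(a.begin() + static_cast<std::ptrdiff_t>(i));
}
```

Deleting index 1 from {10, 11, 12, 13, 14} leaves {10, 14, 12, 13} with the BackFill helper and {10, 12, 13, 14} with the ShiftUp helper.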

+At first glance, it might seem more sensible to have applications +define a struct that stores all the attribute data for one particle +in a single data structure, and then use this as a template argument +to the Particles class, which would store a DynamicArray of values +of this type. POOMA's designers considered this option, but discarded +it. The reason is that most compute-intensive operations in scientific +applications are implemented as loops in which one or more separate +attributes are read or written. In order to make the evaluation of +expressions involving attributes as efficient as possible, it is +therefore important to ensure that data are arranged as separate +one-dimensional arrays for each attribute, rather than as a single +array of structures with one structure per particle. This makes typical +computational loops such as + +

+for (int i=0; i<n; ++i)
+{
+  x[i] += dt * vx[i];
+  y[i] += dt * vy[i];
+}
+
+ +

+run more quickly, as it makes much better use of the data cache. + + +
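For comparison, here is a minimal sketch of the two data arrangements in plain standard C++. The type and function names (`ParticleAoS`, `ParticlesSoA`, `advanceAoS`, `advanceSoA`) and the extra `mass` and `charge` fields are assumptions chosen for illustration; POOMA itself stores each attribute in a DynamicArray rather than a `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structures: one record per particle (the rejected design).
struct ParticleAoS { double x, y, vx, vy, mass, charge; };

// Structure of arrays: one contiguous array per attribute.
struct ParticlesSoA
{
  std::vector<double> x, y, vx, vy, mass, charge;
};

// Every iteration pulls the unused mass and charge fields into the cache
// along with the four values the loop actually needs.
void advanceAoS(std::vector<ParticleAoS>& p, double dt)
{
  for (std::size_t i = 0; i < p.size(); ++i) {
    p[i].x += dt * p[i].vx;
    p[i].y += dt * p[i].vy;
  }
}

// Only the four arrays that are read or written are streamed through memory.
void advanceSoA(ParticlesSoA& p, double dt)
{
  assert(p.x.size() == p.vx.size() && p.y.size() == p.vy.size() &&
         p.x.size() == p.y.size());
  for (std::size_t i = 0; i < p.x.size(); ++i) {
    p.x[i] += dt * p.vx[i];
    p.y[i] += dt * p.vy[i];
  }
}
```

Both functions compute the same positions; the difference lies only in how much unrelated data each loop iteration touches.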

Particle Layout

+ +

+As mentioned above, each Particles object uses a layout object to +determine in which patch a particle's data should be stored. The +layout manages the program's requests to re-arrange particle data. +With SpatialLayout, for example, the application provides a particle +position attribute which is used to determine how particle data +should be distributed. The particle layout then directs the Particles +object to move particle data from one patch to another as dictated by +its strategy. The Particles object in turn delegates this task to the +layout object for the particle attributes, which tells each of the +attributes using this layout to move their data as needed. All of +this is handled by a single call to the method Particles::sync(). + + +

Derivation of Particles Subclass

+ +

+In general, creating a new Particles subclass is a three-step process. +The first step is to declare a traits class with typedef's specifying +the type of engine the particle attributes will use and the type of +particle layout. An example of such a traits class is the following: + +

+struct MyParticleTraits
+{
+  typedef MultiPatch<DynamicTag,Dynamic> AttributeEngineTag_t;
+  typedef UniformLayout                  ParticleLayout_t;
+};
+
+ +

+This traits class will be used to specialize the Particles class +template when an application-specific subclass representing a +concrete set of particles is derived from it. Particles uses public +typedef's to give sensible names to these traits parameters, so that +the derived subclass can access them (as shown below). The traits +approach is used here to provide flexibility in the Particles design +for future extensions. In addition to specifying the attribute engine +and particle layout types, this traits class could also set some +application-specific parameters. + +

+Currently, there is a fairly limited set of valid choices for attribute +engine type and particle layout strategy. Because we require that +the particle attributes share a common layout and remain synchronized, +we must use a MultiPatch engine with a DynamicLayout. The patch +engines inside the MultiPatch engine must have dynamic capabilities. +They can be either of type Dynamic or, when running across multiple +contexts, of type Remote<Dynamic>. As for the particle layout, POOMA +currently provides only two possible strategies: UniformLayout, which +just tries to keep a similar number of particles in each patch; and +SpatialLayout, which organizes the particles into patches based upon +their current spatial position. For the user's convenience, a set of +pre-defined particle traits classes with specific choices of attribute +engine and particle layout type are provided in the header file +Particles/CommonParticleTraits.h. These define all the combinations +of multi-patch dynamic and remote dynamic engines with both uniform +and spatial layouts. Ordinarily, the user can simply choose one of +these pre-defined traits classes for their Particles subclass. + +

+The second step is to actually derive a class from Particles. The +new class can be templated on whatever the developer desires, as long +as a traits class type is provided for the template parameter of the +Particles base class. In the example below, the new class being derived +from Particles is templated on the same traits class as Particles. For +the sake of convenience, typedef's may be provided for the instantiated +parent class and for its layout type. The constructor for the user's +subclass then usually takes a concrete layout object of the type specified +in the typedef above as a constructor argument: + +

+template <class PT>
+class MyParticles : public Particles<PT>
+{
+public:
+  // instantiated type of base class
+  typedef Particles<PT> Base_t;
+
+  // type of particle layout (from traits class via base class)
+  typedef typename Base_t::ParticleLayout_t ParticleLayout_t;
+
+  // type of attribute engine tag (from traits class via base class)
+  typedef typename Base_t::AttributeEngineTag_t EngineTag_t;
+ 
+  // some particle attributes as public data members
+  DynamicArray<double, EngineTag_t> charge;
+  DynamicArray<double, EngineTag_t> mass;
+  DynamicArray<int, EngineTag_t>    count;
+
+  // constructor passes particle layout to base class
+  MyParticles(const ParticleLayout_t &layout)
+  : Particles<PT>(layout)
+  {
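+    // register attributes with base class (this-> is required because
+    // addAttribute() lives in the dependent base class Particles<PT>)
+    this->addAttribute(charge);
+    this->addAttribute(mass);
+    this->addAttribute(count);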
+  }
+};
+
+ +

+Note that the attribute elements in this example have different +element types; i.e., charge and mass are of type double, while +count is of type int. Attribute elements may in general have any +type, including any user-defined type. + +

+Finally, the application-specific class MyParticles is instantiated +with a traits class such as MyParticleTraits to create an actual +set of particles. A particle layout is declared first, and it is +passed as a constructor argument to the instance of the user's class +to control the distribution of particle data between patches. This +layout object typically has one or more constructor arguments that +specify such things as the number of patches the particles are to be +distributed over. Here is an example of creating a MyParticles object: + +

+const int numPatches = 10;
+MyParticleTraits::ParticleLayout_t layout(numPatches);
+MyParticles<MyParticleTraits>      particles(layout);
+
+ +

+While this may seem complex at first, each level of indirection or +generalization is needed in order to provide flexibility. The type of +engine and layout to be used, for example, could be passed directly as +template parameters to Particles, rather than being combined together +in a traits class. However, this would make user-level code fragile +in the face of future changes to the library. If other traits are +needed later, they can be added to the traits class in one place, rather +than needing to be specified every time a new class is derived from +Particles. This bundling also makes it easier to specify the same basic +properties (engine and layout) for two or more interacting Particles +subclasses. + + +

Synchronization and Related Issues

+ +

+For efficiency reasons, Particles does not automatically move particle +data between patches after every operation, but instead waits for the +application to call the method sync(). Particles can also be configured +to cache requests to delete particles, rather than deleting them immediately. + +

+Particles::sync() is a member function template taking a single argument. +This argument must be one of the particle attributes (i.e., a DynamicArray). +SpatialLayout assumes that the attribute given to sync() is the particle +positions, and it uses this to update the distribution of particle data so +that particles are located on the same patch as nearby field data. +Applications must therefore be careful not to mistakenly pass a non-position +attribute, such as temperature or pressure, to SpatialLayout via the +sync() method. + +

+UniformLayout, which divides particles as evenly as possible between patches, +without regard to spatial position, only uses the attribute passed to sync() +as a template for the current distribution of particle data. Any attribute +with the same distribution as the actual particle data can therefore be used. + +

+The use of a parameter in Particles::sync() is one important difference +between the implementation of particles in POOMA I and POOMA II. In the old +design, all Particles classes came with a pre-defined attribute named R that +was the particles' position, and synchronization always referred to this +attribute. The new scheme allows applications to change the attribute that +is used to represent the position; e.g., to switch back and forth in a time +advance algorithm between a "current" position attribute and a "new" position +attribute. It also allows particles to be weighted according to some attribute, +so that the distribution scheme load-balances by weight. + +

+Of course, before particle data can be distributed, the particles themselves +must be created. Particles provides two methods for doing this. The first, +globalCreate(N,renum), creates a specified number of particles N, spread +as evenly as possible across all patches. The particles are normally renumbered +after the creation operation, although this can be overridden by passing the +second parameter (renum) with a value of "false". POOMA automatically uses a +numbering scheme in which the particles are ordered by patch number and labeled +consecutively within a patch. For instance, if patch 0 has 6 particles and +patch 1 has 4 particles, then the particles on patch 0 are labeled 0 through 5, +and the particles on patch 1 are labeled 6 through 9. + +
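The numbering scheme in this paragraph amounts to a running sum over the per-patch particle counts. The following is a small sketch under that reading; `Range` and `numberByPatch` are hypothetical names and do not appear in POOMA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inclusive range of global labels owned by one patch.
// An empty patch is represented by last == first - 1.
struct Range { int first; int last; };

// Hypothetical helper: order particles by patch number and label them
// consecutively within each patch.
std::vector<Range> numberByPatch(const std::vector<int>& counts)
{
  std::vector<Range> ranges;
  ranges.reserve(counts.size());
  int next = 0;  // first unused global label
  for (std::size_t p = 0; p < counts.size(); ++p) {
    assert(counts[p] >= 0);
    Range r = { next, next + counts[p] - 1 };
    ranges.push_back(r);
    next += counts[p];
  }
  return ranges;
}
```

For the counts {6, 4} used in the text, the helper returns the ranges 0 through 5 for patch 0 and 6 through 9 for patch 1.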

+Particles::create(N,patchID,renum), on the other hand, creates a specified +number of particles N on each local context, and adds them to a specific +patch (or to the last local patch if none is specified). Once again, the +particles are renumbered after this operation unless renum is false. Used in +conjunction with the Pooma::context() method, this create() method can be +utilized to allocate a specific number of particles on each context and in +each local patch within a context. If a program contains a series of calls +to the create() method, the user may wish to set renum to false to avoid +renumbering particles until all of the particle creation tasks have been +completed. + +

+After particles have been created (or destroyed), they should be renumbered +to ensure that each has a unique ID and that the global domain of the +particle attributes is consistent. This is critical to the proper behavior +of data-parallel expressions involving attributes. The Particles::renumber() +method surveys all the patches to find out what the current local domain of +each patch is. It then reconstructs a global domain across all the patches, +effectively renumbering the particles from 0 to N-1, where N is the total +number of particles. The more complex sync() method applies the particle +boundary conditions, performs any deferred particle destroy requests, swaps +particles between patches according to the particle layout strategy, and +then renumbers the particles by calling renumber(). Programs can call +renumber() instead of sync() when they have only created or destroyed +particles: that is, when there are no pending deferred destroy requests, +no attribute changes that would require applying boundary conditions (or +no boundary conditions at all), and no need to swap particles across +patches. Note that calls to +globally synchronizing functions such as renumber() or sync() must be made +on all contexts in an SPMD fashion. + +

+If a program does not (implicitly or explicitly) call renumber() after +creating or destroying particles, the global domain for the particle +attributes will be incorrect. If the program then tries to read or write +a view of a particle attribute by indexing with some domain object, it +will not get the right section of the data. This failure could be silent +if the view that the program requests exists. Alternatively, the requested +view could be outside of the (incorrect) global domain, in which case the +layout object for the particle attributes will suffer a run-time assertion +failure. It is the user's responsibility to ensure that the particle +attributes are properly synchronized prior to any data-parallel expression. + +

+There are also two ways to destroy particles. The first way, which +always destroys the particles immediately, is implemented by the +method Particles::destroy(domain,patchID,renum). This function deletes +particles from the local patch indicated by patchID, and domain is +assumed to refer to a subset of the local domain in that patch. If +patchID is not specified, then domain is assumed to refer to a subset +of the global domain of the particle attributes, in which case the +function may delete particles from multiple patches. The renum argument +indicates whether to renumber the particles after the destroy command +is performed, and it is true by default. + +

+The second destruction method is Particles::deferredDestroy(domain,patchID). +The meanings of the two arguments are the same as in the destroy() method. +This is new in POOMA II, and it only does deferred destruction; i.e., caches +the requested indices for use later when performDestroy() is called. Since +this method doesn't actually destroy particles right away, there is no need +for it to call renumber(). This deferredDestroy() method can be useful when +there are many separate destroy requests, because it lumps them all together +and amortizes the expense of having to move particle data around and shrink +the particle attributes. The performDestroy() method, which causes +the cached destruction requests to be executed, always performs renumbering. +The performDestroy() method can be called explicitly by the user in order +to process and flush any cached destroy requests or implicitly by calling +the sync() method. + +
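
The idea of caching destroy requests and executing them together can be sketched without POOMA. The class PatchSketch below is a hypothetical stand-in for one patch of a single attribute; its member names merely echo the POOMA methods discussed above, and its internals are an assumption rather than POOMA's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one patch of a single particle attribute.
class PatchSketch
{
public:
  explicit PatchSketch(const std::vector<double> &values) : data_(values) {}

  // Cache an index for later destruction; no data moves yet.
  void deferredDestroy(std::size_t index)
  {
    assert(index < data_.size());
    pending_.push_back(index);
  }

  // Carry out all cached requests in one pass, then flush the cache.
  void performDestroy()
  {
    // Sort and drop duplicate requests so each index is handled once.
    std::sort(pending_.begin(), pending_.end());
    pending_.erase(std::unique(pending_.begin(), pending_.end()),
                   pending_.end());

    std::vector<double> kept;
    std::size_t next = 0;  // position in the sorted request list
    for (std::size_t i = 0; i < data_.size(); ++i)
    {
      if (next < pending_.size() && pending_[next] == i)
        ++next;                    // element i was requested: drop it
      else
        kept.push_back(data_[i]);  // keep it, preserving order
    }
    data_.swap(kept);
    pending_.clear();
  }

  const std::vector<double> &data() const { return data_; }

private:
  std::vector<double> data_;
  std::vector<std::size_t> pending_;
};
```

However many requests are cached, the attribute data is compacted only once, which is the amortization the text refers to.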

+All particle destroys are implemented using one of two possible methods: +BackFill or ShiftUp. With the BackFill method, the "holes" that are left +behind when particles are deleted are filled with particle data from the +end of the list for the given patch. The ShiftUp method, on the other hand, +slides all of the remaining particles up the list in order to fill holes. +Thus, the ShiftUp destroy method is plainly slower, but it preserves the +existing order of the particles within a given patch. The user can select +the preferred destroy method using the setDestroyMethod() function. + + +
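
The difference between the two strategies is easiest to see for a single hole. The following standalone sketch uses a std::vector<int> in place of a patch; the function names are hypothetical, and POOMA's own handling of many holes at once is more involved than this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// BackFill: overwrite the hole with the last element, then shrink.
// Constant work per destroyed particle, but the order changes.
void destroyBackFill(std::vector<int> &patch, std::size_t hole)
{
  assert(hole < patch.size());
  patch[hole] = patch.back();
  patch.pop_back();
}

// ShiftUp: slide every later element one slot toward the front.
// Work grows with the number of particles after the hole, but the
// relative order of the survivors is preserved.
void destroyShiftUp(std::vector<int> &patch, std::size_t hole)
{
  assert(hole < patch.size());
  patch.erase(patch.begin() + hole);
}
```

Starting from 0 1 2 3 4 and destroying index 1, BackFill leaves 0 4 2 3, while ShiftUp leaves 0 2 3 4.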

Example: Simple Harmonic Oscillator

+ +

+The example for this tutorial simulates the motion of particles under the +influence of a simple one-dimensional harmonic oscillator potential. The +code, a version of which is included in the POOMA II release in the +examples/Particles/Oscillation directory, is as follows: + +

+001  #include <iostream>
+002  #include <stdlib.h>
+003
+004  #include "Pooma/Particles.h"
+005  #include "Pooma/DynamicArrays.h"
+006
+007  // Dimensionality of this problem
+008  static const int PDim = 1;
+009
+010  // A traits class specifying the engine and layout of a Particles class.
+011  template <class EngineTag>
+012  struct PTraits
+013  {
+014    // The type of engine to use in the particle attributes.
+015    typedef EngineTag AttributeEngineTag_t;
+016
+017    // The type of particle layout to use.  Here we use a UniformLayout,
+018    // which divides the particle data up so as to have an equal number
+019    // on each patch.
+020    typedef UniformLayout ParticleLayout_t;
+021  };
+022  
+023  // A Particles subclass that defines position and velocity as
+024  // attributes.
+025  template <class PT>
+026  class Quanta : public Particles<PT>
+027  {
+028  public:
+029    // Useful typedef's to extract from the base class
+030    typedef Particles<PT>                         Base_t;
+031    typedef double                                AxisType_t;
+032    typedef typename Base_t::ParticleLayout_t     ParticleLayout_t;
+033    typedef typename Base_t::AttributeEngineTag_t AttributeEngineTag_t;
+034    enum { dimensions = PDim };
+035  
+036    // Constructor sets up layouts and registers attributes
+037    Quanta(const ParticleLayout_t &pl)
+038    : Particles<PT>(pl)
+039    {
+040      this->addAttribute(x);
+041      this->addAttribute(v);
+042    }
+043  
+044    // X position and velocity are attributes (these could be declared
+045    // private and accessed with public methods, but this gives nice syntax)
+046    DynamicArray< Vector<dimensions,AxisType_t>, AttributeEngineTag_t > x;
+047    DynamicArray< Vector<dimensions,AxisType_t>, AttributeEngineTag_t > v;
+048  };
+049  
+050  // Engine tag type for attributes.  Here we use a MultiPatch engine
+051  // with the patches being Dynamic engines, and a DynamicTag, which allows
+052  // the patches to change sizes during the application.  This is important
+053  // since we may change the number of particles in each patch.
+054  typedef MultiPatch<DynamicTag,Dynamic> AttrEngineTag_t;
+055  
+056  // The particle traits class and layout type for this application
+057  typedef PTraits<AttrEngineTag_t> PTraits_t;
+058  typedef PTraits_t::ParticleLayout_t PLayout_t;
+059  
+060  // Simulation control constants
+061  const double omega      = 2.0;
+062  const double dt         = 1.0 / (50.0 * omega);
+063  const int nParticle     = 100;
+064  const int nPatch        = 4;
+065  const int nIter         = 500;
+066  
+067  // Main simulation routine.
+068  int main(int argc, char *argv[])
+069  {
+070    // Initialize POOMA and Inform object for output to terminal
+071    Pooma::initialize(argc,argv);
+072    Inform out(argv[0]);
+073    out << "Begin Oscillation example code" << std::endl;
+074  
+075    // Create a uniform layout object to control particle positions.
+076    PLayout_t layout(nPatch);
+077  
+078    // Create Particles, using our special subclass and the particle layout
+079    typedef Quanta<PTraits_t> Particles_t;
+080    Particles_t p(layout);
+081  
+082    // Create particles on one patch, then re-distribute (just to show off)
+083    p.create(nParticle, 0);
+084    for (int ip=0; ip<nPatch; ++ip)
+085    {
+086      out << "Current size of patch " << ip << " = "
+087          << p.attributeLayout().patchDomain(ip).size()
+088          << std::endl;
+089    }
+090  
+091    out << "Resyncing particles object ... " << std::endl;
+092    p.sync(p.x);
+093  
+094    // Show re-balanced distribution.
+095    for (int ip=0; ip<nPatch; ++ip)
+096    {
+097      out << "Current size of patch " << ip << " = "
+098          << p.attributeLayout().patchDomain(ip).size()
+099          << std::endl;
+100    }
+101  
+102    // Randomize positions in domain [-1,+1], and set velocities to zero.
+103    // This is done with a loop because POOMA does not yet have parallel RNGs.
+104    typedef Particles_t::AxisType_t Coordinate_t;
+105    Vector<PDim,Coordinate_t> initPos;
+106    srand(12345U);
+107    Coordinate_t ranmax = static_cast<Coordinate_t>(RAND_MAX);
+108    for (int ip=0; ip<nParticle; ++ip)
+109    {
+110      for (int idim=0; idim<PDim; ++idim)
+111      {
+112        initPos(idim) = 2 * (rand() / ranmax) - 1;
+113      }
+114      p.x(ip) = initPos;
+115      p.v(ip) = Vector<PDim,Coordinate_t>(0.0);
+116    }
+117  
+118    // print initial state
+119    out << "Time = 0.0:" << std::endl;
+120    out << "Quanta positions:" << std::endl << p.x << std::endl;
+121    out << "Quanta velocities:" << std::endl << p.v << std::endl;
+122  
+123    // Advance particles in each time step according to:
+124    //         dx/dt = v
+125    //         dv/dt = -omega^2 * x
+126    for (int it=1; it<=nIter; ++it)
+127    {
+128      p.x = p.x + dt * p.v;
+129      p.v = p.v - dt * omega * omega * p.x;
+130      out << "Time = " << it*dt << ":" << std::endl;
+131      out << "Quanta positions:" << std::endl << p.x << std::endl;
+132      out << "Quanta velocities:" << std::endl << p.v << std::endl;
+133    }
+134  
+135    // Finalize POOMA
+136    Pooma::finalize();
+137    return 0;
+138  }
+
+ +

+As discussed earlier, the program begins by creating a traits class that +provides typedef's for the names AttributeEngineTag_t and ParticleLayout_t +(lines 11-21). An application-specific class called Quanta is then derived +from Particles, without specifying the traits to be used (lines 25-48). +This class declares two attributes, to store the particles' x coordinate +and velocity. The body of its constructor (lines 40-41) adds these attributes +to the attribute list, while passing the actual particle layout object +specified by the application up to Particles. + +

+Lines 54, 57 and 58 create some convenience typedef's for the engine and +layout that the application will use. Lines 61-65 then define constants +describing both the physical parameters to the problem (such as the +oscillation frequency) and the computational parameters (the number of +particles, the number of patches, etc.). In a real application, many of +these values would be variables, rather than hard-wired constants. + +

+After the POOMA library is initialized (line 71), an Inform object is +created to manage output. An actual layout is then created (line 76), and +is used to create a set of particles (line 80). The particles themselves +are created by the call to Particles::create() on line 83. The output on +lines 84-89 shows that all particles are initially created in patch 0. + +

+The sync() call on line 92 redistributes particles across the available +patches, using the x coordinate as a template for the current particle +distribution. As the output from lines 95-100 shows, this distributes the +particles across patches as evenly as possible. + +

+The particle positions are randomized on lines 108-116. (A loop is used +here rather than a data-parallel expression because parallel random number +generation has not yet been integrated into the expression evaluation +machinery in this release of POOMA.) After some more output to show the +particles' initial positions and velocities, the application finally enters +the main timestep loop (lines 126-133). In each time step, particle positions +and velocities are updated under the influence of a simple harmonic oscillator +force and then printed out. Once the specified number of timesteps has been +executed, the library is shut down (line 136) and the application exits. + + +
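
For readers who want to experiment with the update rule outside POOMA, the body of the timestep loop can be written for plain std::vector data as in the sketch below. The function name advance is hypothetical; the two statements in its loop mirror lines 128-129 of the listing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One time step of the update used in the listing's main loop:
// positions are advanced with the old velocities, then velocities
// are advanced with the *new* positions.
void advance(std::vector<double> &x, std::vector<double> &v,
             double omega, double dt)
{
  assert(x.size() == v.size());
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    x[i] += dt * v[i];                  // dx/dt = v
    v[i] -= dt * omega * omega * x[i];  // dv/dt = -omega^2 * x
  }
}
```

Because the velocity update uses the already-updated position, this is the semi-implicit (symplectic) Euler scheme. For the listing's parameters, a particle started at rest at x = 1 keeps an amplitude very close to 1 over all 500 steps instead of slowly spiralling outward.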

Boundary Conditions

+ +

+In addition to an AttributeList, each Particles object also stores a +ParticleBCList of boundary conditions to be applied to the attributes. +These are generalized boundary conditions in the sense that they can be +applied not only to a particle position attribute, but to any sort of +attribute or expression involving attributes. POOMA provides typical +particle boundary conditions including periodicity, reflection, absorption, +reversal (reflection of one attribute and negation of another), and kill +(destroying a particle). Boundary conditions can be updated explicitly by +calling Particles::applyBoundaryConditions(), or implicitly by calling +Particles::sync(). + +

+Each boundary condition is assembled by first constructing an instance of +the type of boundary condition desired, then invoking the addBoundaryCondition() +member function of Particles with three parameters: the subject of the +boundary condition (i.e., the attribute or expression to be checked against +a specified range), its object (the attribute to be modified when the subject +is outside the range), and the actual boundary condition object. The boundary +condition is then applied each time applyBoundaryConditions() is invoked. + +

+The subject and object of a boundary condition are usually the same, but this +is not required. In one common case, the subject is an expression involving +particle attributes, while the object is the Particles object itself. For +example, an application's boundary condition might specify that particles are +to be deleted if their kinetic energy goes above some limit. The subject +would be an expression like 0.5*P.m*P.v*P.v, and the object would be P. +The object cannot be the expression 0.5*P.m*P.v*P.v because an expression +contains no actual data and thus cannot be modified. + +
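
The kinetic-energy case can be sketched in standalone form as follows. ParticleSetSketch and applyEnergyKill are hypothetical names, not POOMA classes; the point is only that the subject is a value computed on the fly, while the object is the container that receives the destroy requests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle set with scalar mass and velocity attributes.
struct ParticleSetSketch
{
  std::vector<double> m, v;
  std::vector<std::size_t> destroyList;  // deferred destroy requests
};

// Subject: the expression 0.5*m*v*v, evaluated per particle on the fly.
// Object: the particle set itself, which records a destroy request for
// every particle whose subject value exceeds the limit.
void applyEnergyKill(ParticleSetSketch &p, double limit)
{
  assert(p.m.size() == p.v.size());
  for (std::size_t i = 0; i < p.m.size(); ++i)
  {
    const double kinetic = 0.5 * p.m[i] * p.v[i] * p.v[i];
    if (kinetic > limit)
      p.destroyList.push_back(i);
  }
}
```

Nothing is ever written to the kinetic-energy values, which is why an expression can serve as a subject but not as an object.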

+Another case involves the reversal boundary condition, which is used to make +particles bounce off walls. Bouncing not only reflects the particle position +back inside the wall, but also reverses the particle's velocity component in +that direction. The reversal boundary condition therefore needs an additional +object besides the original subject. + +

+POOMA provides the pre-defined boundary condition classes listed in the table below. + + + + + + + + +
Class        Behavior
AbsorbBC     If an attribute is outside the specified lower or upper bounds, it is reset to the boundary value.
KillBC       If particles go outside the given bounds, they are destroyed by putting their index into the deferred destroy list.
PeriodicBC   Keeps attributes within a given periodic domain.
ReflectBC    If an attribute exceeds a given boundary, its value is reflected back inside that boundary.
ReverseBC    Reflects the value of the subject attribute if it crosses outside the given domain, and reverses (negates) the value of the object attribute.
+ + +
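
The behaviours in the table can be made concrete for a single scalar value. The functions below are a standalone sketch with hypothetical names, not the POOMA classes themselves. Two details are assumptions: the periodic wrap treats the upper bound as exclusive, and the reflection handles only one crossing per call.

```cpp
#include <cassert>
#include <cmath>

// AbsorbBC-like: reset an out-of-range value to the nearest bound.
double absorbValue(double x, double lo, double hi)
{
  return x < lo ? lo : (x > hi ? hi : x);
}

// PeriodicBC-like: wrap the value into [lo, hi).
double wrapPeriodic(double x, double lo, double hi)
{
  assert(hi > lo);
  const double len = hi - lo;
  double r = std::fmod(x - lo, len);  // result has the sign of x - lo
  if (r < 0.0) r += len;
  return lo + r;
}

// ReflectBC-like: mirror the value about the bound that was crossed.
double reflectValue(double x, double lo, double hi)
{
  if (x < lo) return 2.0 * lo - x;
  if (x > hi) return 2.0 * hi - x;
  return x;
}

// ReverseBC-like: reflect the subject (position) and negate the
// object (velocity) when the subject is outside the range.
void reverseBounce(double &pos, double &vel, double lo, double hi)
{
  if (pos < lo || pos > hi)
  {
    pos = reflectValue(pos, lo, hi);
    vel = -vel;
  }
}
```

For the range 0 to 12, a value of 13 is absorbed to 12, wrapped to 1, or reflected to 11, depending on which rule is applied.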

Example: Elastic Collision

+ +

+As an example of how particle boundary conditions are used, consider a set of +particles bouncing around in a box in three dimensions. The sample code in +file examples/Particles/Bounce/Bounce.cpp shows how this can be implemented +using POOMA for the case of perfectly elastic collisions. + +

+001  #include "Pooma/Particles.h"
+002  #include "Pooma/DynamicArrays.h"
+003  #include "Tiny/Vector.h"
+004  #include "Utilities/Inform.h"
+005  #include <iostream>
+006  #include <stdlib.h>
+007
+008  
+009  // Dimensionality of this problem
+010  static const int PDim = 3;
+011  
+012  // Particles subclass with position and velocity
+013  template <class PT>
+014  class Balls : public Particles<PT>
+015  {
+016  public:
+017    // Typedefs
+018    typedef Particles<PT>                          Base_t;
+019    typedef typename Base_t::AttributeEngineTag_t  AttributeEngineTag_t;
+020    typedef typename Base_t::ParticleLayout_t      ParticleLayout_t;
+021    typedef double                                 AxisType_t;
+022    typedef Vector<PDim,AxisType_t>                PointType_t;
+023  
+024    // Constructor: set up layouts, register attributes
+025    Balls(const ParticleLayout_t &pl)
+026    : Particles<PT>(pl)
+027    {
+028      this->addAttribute(pos);
+029      this->addAttribute(vel);
+030    }
+031  
+032    // Position and velocity attributes (as public members)
+033    DynamicArray<PointType_t,AttributeEngineTag_t>  pos;
+034    DynamicArray<PointType_t,AttributeEngineTag_t>  vel;
+035  };
+036  
+037  // Use canned traits class from CommonParticleTraits.h
+038  // MPDynamicUniform provides MultiPatch Dynamic Engine for
+039  // particle attributes and UniformLayout for particle data.
+040  typedef MPDynamicUniform PTraits_t;
+041  
+042  // Type of particle layout
+043  typedef PTraits_t::ParticleLayout_t ParticleLayout_t;
+044  
+045  // Type of actual particles
+046  typedef Balls<PTraits_t> Particle_t;
+047  
+048  // Number of particles in simulation
+049  const int NumPart = 100;
+050  
+051  // Number of timesteps in simulation
+052  const int NumSteps = 100;
+053  
+054  // Number of patches to distribute particles across.
+055  // Typically one would use one patch per processor.
+056  const int numPatches = 16;
+057  
+058  // Main simulation routine
+059  int main(int argc, char *argv[])
+060  {
+061    // Initialize POOMA and output stream
+062    Pooma::initialize(argc,argv);
+063    Inform out(argv[0]);
+064  
+065    out << "Begin Bounce example code" << std::endl;
+066    out << "-------------------------" << std::endl;
+067  
+068    // Create a particle layout object for our use
+069    ParticleLayout_t particleLayout(numPatches);
+070  
+071    // Create the Particles subclass object
+072    Particle_t balls(particleLayout);
+073  
+074    // Create some particles, recompute the global domain, and initialize
+075    // the attributes randomly.  The globalCreate call will create a roughly
+076    // equal number of particles on each patch.  Positions are initialized
+077    // within a 12 X 20 X 28 domain, and the velocity components are all
+078    // in the range -4 to +4.
+079    balls.globalCreate(NumPart);
+080    srand(12345U);
+081    Particle_t::PointType_t initPos, initVel;
+082    typedef Particle_t::AxisType_t Coordinate_t;
+083    Coordinate_t ranmax = static_cast<Coordinate_t>(RAND_MAX);
+084    for (int i = 0; i < NumPart; ++i)
+085    {
+086      for (int d = 0; d < PDim; ++d)
+087      {
+088        initPos(d) = 4 * (2 * (d+1) + 1) * (rand() / ranmax);
+089        initVel(d) = 4 * (2 * (rand() / ranmax) - 1);
+090      }
+091      balls.pos(i) = initPos;
+092      balls.vel(i) = initVel;
+093    }
+094  
+095    // Display the particle positions and velocities.
+096    out << "Timestep 0: " << std::endl;
+097    out << "Ball positions: "  << balls.pos << std::endl;
+098    out << "Ball velocities: " << balls.vel << std::endl;
+099  
+100    // Set up a reversal boundary condition, so that particles will
+101    // bounce off the domain boundaries.
+103    Particle_t::PointType_t lower, upper;
+104    for (int d = 0; d < PDim; ++d)
+105    {
+106      lower(d) = 0.0;
+107      upper(d) = (d+1) * 8.0 + 4.0;
+108    }
+109    ReverseBC<Particle_t::PointType_t> bounce(lower, upper);
+110    balls.addBoundaryCondition(balls.pos, balls.vel, bounce);
+111    
+112    // Simulation timestep loop
+113    for (int it=1; it <= NumSteps; ++it)
+114    {
+115      // Advance ball positions (timestep dt = 1)
+116      balls.pos += balls.vel;
+117  
+118      // Invoke boundary conditions
+119      balls.applyBoundaryConditions();
+120  
+121      // Print out the current particle data
+122      out << "Timestep " << it << ": " << std::endl;
+123      out << "Ball positions: " << balls.pos << std::endl;
+124      out << "Ball velocities: " << balls.vel << std::endl;
+125    }
+126  
+127    // Shut down POOMA and exit
+128    Pooma::finalize();
+129    return 0;
+130  }
+
+ +

+After defining the dimension of the problem (line 10), this program defines +a class Balls, which represents the set of particles (lines 13-35). Its two +attributes represent the particles' positions and velocities (lines 33-34). +Note how the type of engine used for storing these attributes is defined in +terms of the types exported by the traits class with which Balls is instantiated +(AttributeEngineTag_t, line 19). Meanwhile the type used to represent the points +is defined in terms of the dimension of the problem (line 22), rather than being +made dimension-specific. This style of coding makes it much easier to change +the simulation as the program evolves. + +

+Rather than defining a particle traits class explicitly, as the oscillation +example above did, this program uses one of the pre-defined traits classes +given in src/Particles/CommonParticleTraits.h. For the purposes of this +example, a multipatch dynamic engine is used for particle attributes, and +particle data is laid out uniformly. Once again, a typedef is used to create +a symbolic name for this combination, so that the program can be updated by +making a single change in one location. + +

+Lines 43-56 define the types used in the simulation, and the constants that +control the simulation's evolution. The main body of the program follows. +As usual, it begins by initializing the POOMA library, and creating an output +handler of type Inform (lines 62-63). Line 69 then creates a layout object +describing the domain of the problem. + +

+The Particles object itself comes into being on line 72, although the actual +particles aren't created until line 79. Recall that by default, globalCreate() +renumbers the particles by calling the Particles::renumber() method. Lines 80-93 +then randomize the balls' initial positions and velocities. + +

+Lines 103-110 are the most novel part of this simulation, as they create +a reversal boundary condition for the simulation and add it to the +Particles object. Lines 103-108 define where particles bounce; again, this +is done in a dimension-independent fashion in order to make code evolution as +easy as possible. Line 109 turns lower and upper into a reversing boundary +condition, which line 110 then adds to balls. The main simulation loop now +consists of nothing more than advancing the balls in each time step, and +calling applyBoundaryConditions() to enforce the boundary conditions. + + +

Summary of Particles Interface

+ +

+Particles are a fundamental construct in physical calculations. POOMA's +Particles class, and the classes that support it, allow programmers to +create and manage sets of particles both efficiently and flexibly. While +doing this is a multi-step process, the payoff as programs are extended +and updated is considerable. The list below summarizes the most important +aspects of the Particles interface. + +

    +
  • Particles<PT>(layout): +Construct the Particles object with the given particle layout. This +constructor will normally be called by the constructor of the user-defined +Particles subclass. + +
  • initialize(layout): +Initialize the Particles object with the given particle layout. This is +used if the Particles object was created with the default constructor. + +
  • size(): +Return the total number of particles as of the last renumbering. + +
  • domain(): +Return the one-dimensional domain of the particle attributes, represented +as the Interval<1> object storing the interval 0, 1, 2, ... size()-1. + +
  • attributes(): +Return the number of registered attributes. + +
  • addAttribute(attrib): +Add the given attribute (which should be a DynamicArray of the proper +engine type) to the attribute list stored by Particles. + +
  • removeAttribute(attrib): +Remove the given attribute from the attribute list. + +
  • sync(posattrib): +Apply boundary conditions, carry out cached destroys, swap particles +across patches as needed, and renumber the particles. If relevant, the +posattrib is used by the particle layout as a particle position attribute. + +
  • swap(posattrib): +Move particle data between patches as specified by the particle layout +strategy (uniform or spatial) and renumber the particles. This is more +efficient than sync() if the user knows that there are no boundary +conditions or cached destroy requests to carry out. + +
  • applyBoundaryConditions(): +Apply the boundary conditions to the current particle attributes, without +renumbering or destroying any particles. + +
  • performDestroy(): +Destroy any particles that were specified in previous deferredDestroy() +requests. (Note that these requests may be generated by a KillBC.) + +
  • renumber(): +Recalculate the per-patch and total domain of the system by inspecting +the Particles attribute layout and resequencing the particles. + +
  • create(N, patchID, renum): +Create N particles in the specified patch and optionally renumber. If +the patchID and renum arguments are omitted, this creates the particles +in the last patch, so as not to disturb the numbering of existing particles. + +
  • globalCreate(N, renum): +Create N particles with a roughly equal number of particles created in +each patch and optionally renumber. + +
  • setDestroyMethod(method): +Set the preferred destroy method to be either BackFill (default) or ShiftUp. + +
  • destroy(domain, patchID, renum): +Immediately destroy particles in the specified local domain within the +specified patch and optionally renumber. The domain may be an Interval<1> +or IndirectionList<int> of particle index numbers. If the patchID is +omitted, the domain is assumed to be a global domain that may stretch +across multiple patches. + +
  • deferredDestroy(domain, patchID): +Put the indices of the particles in the given domain in the deferred +destroy list of the Particles object, so that they will be destroyed by +the next call to performDestroy(). If patchID is omitted, the domain +is assumed to be a global domain. + +
  • addBoundaryCondition(Subj,Obj,BC) and addBoundaryCondition(Subj,BC): +Add a new boundary condition BC that depends on the subject Subj and +affects the object Obj (if different from Subj). + +
  • removeBoundaryCondition(i) and removeBoundaryConditions(): +Delete the ith boundary condition, or all boundary conditions. +
+ + +

Particles and Fields

+ +

+The previous sections have described how POOMA represents a set of +particles and allows the user to perform typical operations in a +particle simulation. The remainder of this document shows how POOMA +Particles and Fields can be combined to create complete simulations +of complex physical systems. The first section describes how POOMA +interpolates values when gathering and scattering field and +particle data. This is followed by a look at the ins and outs of +data layout, and a medium-sized example that illustrates how these +ideas all fit together. (Note: The current implementation of POOMA +Particles allows interaction with the original version of the Field +abstraction created for POOMA II. Particles have not yet been +modified to work with the new experimental design for POOMA Fields +that is implemented in the src/Field directory. Thus, all the +discussion here of POOMA Fields refers to the original implementation.) + + +

Particle/Field Interpolation

+ +

+POOMA's Particles class is designed to be used in conjunction with the +Field class. Interpolators are the glue that binds these two together, +by specifying how to calculate field values at particle (or other) +locations that lie in between the locations of Field elements. These +interpolators can also be used to go in the opposite direction, +accumulating contributions from particles at arbitrary locations into +the elements of a Field. + +

+Interpolators are used to gather values to specific positions in a +field's spatial domain from nearby field elements, or to scatter +values from such positions into the field. The interpolation stencil +describes how values are translated between field element locations +and arbitrary points in space. An example of using this kind of +interpolation is particle-in-cell (PIC) simulations, in which charged +particles move through a discretized domain. The particle interactions +are determined by scattering the particle charge density into a field, +solving for the self-consistent electric field, and gathering that +field back to the particle positions to compute forces on the particles. +The last code example in this document describes a simulation of this kind. + +

+POOMA currently offers three types of interpolation stencils: +nearest grid point (NGP), cloud-in-cell (CIC), and subtracted dipole +scheme (SUDS). NGP is a zeroth-order interpolation that gathers +from or scatters to the field element nearest the specified location. +CIC is a first-order scheme that performs linear weighting among the +2^D field elements nearest the point in D-dimensional space. SUDS is +also first-order, but it uses just the nearest field element and its +two neighbors along each dimension, so it is only a 7-point stencil in +three dimensions. Other types of interpolation schemes can be added +in a straightforward manner. + +
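
To make the NGP and CIC stencils concrete, here is a one-dimensional standalone sketch. It is not POOMA's implementation: the function names are hypothetical, field elements are assumed to sit at x_i = i * dx, and positions are assumed to lie inside the grid. SUDS is not covered.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Lower neighbour index i and weight w of the upper neighbour, so
// that x = (i + w) * dx with 0 <= w <= 1.
std::size_t lowerNeighbour(std::size_t n, double dx, double x, double &w)
{
  assert(n >= 2 && dx > 0.0 && x >= 0.0 && x <= (n - 1) * dx);
  const double s = x / dx;
  std::size_t i = static_cast<std::size_t>(std::floor(s));
  if (i + 1 >= n) i = n - 2;  // x exactly on the upper edge
  w = s - static_cast<double>(i);
  return i;
}

// NGP gather: take the value of the single nearest field element.
double gatherNGP(const std::vector<double> &f, double dx, double x)
{
  assert(dx > 0.0 && x >= 0.0);
  const std::size_t i = static_cast<std::size_t>(std::floor(x / dx + 0.5));
  assert(i < f.size());
  return f[i];
}

// CIC gather: linear weighting between the 2^1 = 2 nearest elements.
double gatherCIC(const std::vector<double> &f, double dx, double x)
{
  double w;
  const std::size_t i = lowerNeighbour(f.size(), dx, x, w);
  return (1.0 - w) * f[i] + w * f[i + 1];
}

// CIC scatter: deposit q into the same two elements, same weights.
void scatterCIC(std::vector<double> &f, double dx, double x, double q)
{
  double w;
  const std::size_t i = lowerNeighbour(f.size(), dx, x, w);
  f[i]     += (1.0 - w) * q;
  f[i + 1] += w * q;
}
```

For a field holding 0, 10, 20, 30 with dx = 1, gathering at x = 1.25 gives 10 with NGP and 12.5 with CIC. Scattering a charge of 4 at the same position deposits 3 in element 1 and 1 in element 2, so the total is conserved.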

+Interpolation is invoked by calling the global functions gather() and +scatter(), both of which take four arguments: + +

    +
  1. the particle attribute to be gathered to or scattered from +(usually a single DynamicArray, although one could scatter an +expression involving DynamicArray objects as well); + +
  2. the Field to be gathered from or scattered to; + +
  3. the particle positions (normally a DynamicArray that is +a member of a Particles subclass); and + +
  4. an interpolator tag object of type NGP, CIC or SUDS, indicating +which interpolation stencil to use. +
+ +

+An example of using the gather() function is: + +

+gather(P.efd, Efield, P.pos, CIC());
+
+ +

+where P is a Particles subclass object whose attributes include efd +for storing the gathered electric field from the Field Efield and +pos for the particle positions. The default constructor of the +interpolator tag CIC is used to create a temporary instance of the +class to pass to gather(), telling it which interpolation scheme to use. + +

+The particle attribute and position arguments passed to gather() and +scatter() should have the same layout, and the positions must refer to +the geometry of the Field being used. The interpolator will compute +the required interpolated values for the particles on each patch. +These functions assume each particle is only interacting with field +elements in the Field patch that exactly corresponds to the particle's +current patch. Thus, applications must use the SpatialLayout particle +layout strategy and make sure that the Field has enough guard layers +to accommodate the interpolation stencil. + +

+In addition to the basic gather() and scatter() functions, POOMA offers +some variants that optimize other common operations. The first of these, +scatterValue(), scatters a single value into a Field rather than a +particle attribute with different values for each particle. Its first +argument is a single value with a type that is compatible with the Field +element type. + +

+The other three optimized methods are gatherCache(), scatterCache(), and +scatterValueCache(). Each of these methods has two overloaded variants, +which allow applications to cache and reuse interpolated data, such as +the nearest grid point for each particle and the distance from the +particle's position to that grid point. The difference between the +elements of each overloaded pair of methods is that one takes both a +particle position attribute and a particle interpolator cache attribute +among its arguments, while the other takes only the cache attribute. +When the first of these is called, it caches position information in +the provided cache attribute. When the second version is called with +that cache attribute as an argument, it re-uses that information. This +can speed up computation considerably, but it is important to note that +applications can only do this safely when the particle positions are +guaranteed not to have changed since the last interpolation. + + +
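
The caching idea can be sketched in one dimension as follows. CacheEntry, buildCache and gatherCached are hypothetical names, and what POOMA actually stores in its interpolator cache attribute may differ. The sketch assumes a linear (CIC-style) stencil, field elements at x_i = i * dx, and positions strictly inside the grid.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-particle cache entry for a 1-D linear stencil:
// lower field index and weight of the upper neighbour.
struct CacheEntry
{
  std::size_t index;
  double weight;
};

// First variant: positions are supplied, so the cache is (re)built.
std::vector<CacheEntry> buildCache(const std::vector<double> &pos, double dx)
{
  assert(dx > 0.0);
  std::vector<CacheEntry> cache(pos.size());
  for (std::size_t p = 0; p < pos.size(); ++p)
  {
    assert(pos[p] >= 0.0);
    const double s = pos[p] / dx;
    cache[p].index = static_cast<std::size_t>(std::floor(s));
    cache[p].weight = s - std::floor(s);
  }
  return cache;
}

// Second variant: only the cache is supplied; positions are not read.
std::vector<double> gatherCached(const std::vector<double> &f,
                                 const std::vector<CacheEntry> &cache)
{
  std::vector<double> out(cache.size());
  for (std::size_t p = 0; p < cache.size(); ++p)
  {
    const CacheEntry &c = cache[p];
    assert(c.index + 1 < f.size());
    out[p] = (1.0 - c.weight) * f[c.index] + c.weight * f[c.index + 1];
  }
  return out;
}
```

A cache built once can serve several gathers from different fields, but it must be rebuilt as soon as any particle position changes.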

Data Layout for Particles and Fields

+ +

+The use of particles and fields together in a single application brings +up some issues regarding data layout that do not arise when either is +used on its own. There are two characteristics of Engine objects that +must be considered in order to determine whether they can be used for +attributes in Particles objects: + +

    +
  1. +Can the engine use a layout that is "shared" among several engines +of the same category, such that the size and layout of the engine is +synchronized with the other engines using the layout? + +

    +If this is the case, then creation, destruction, repartitioning, and +other operations are done for all the engines sharing the layout. +Particles require all their attributes to use a shared layout, so only +engines that use a shared layout can be used for particle attributes. +The only engine type with this capability in this release of POOMA +(i.e., the only engine that is usable in Particles attributes) is +MultiPatch. + +

    +MultiPatch can use several different types of layouts and single-patch +engines, and all MultiPatch engines use a shared layout. However, only +the MultiPatch<DynamicTag,*> specializations of MultiPatch engines are +useful for Particles attributes, since only that engine type can have +patches of dynamically varying size. + +

  2. Can the engine change size dynamically? + +

    +The engine type used for particle attributes must have dynamic +capabilities. Thus, we should use dynamic single-patch engines +inside of MultiPatch. The only engines available in this release +of POOMA that meet this requirement are Dynamic and Remote<Dynamic>. +Both of these are inherently one-dimensional and support operations +such as create(), destroy() and copy(). Remote<Dynamic> is similar +to Dynamic, but it is context-aware and useful for multi-context codes. + +

    +Implicit in the discussion above is the fact that there are actually +three different types of layout classes that an application programmer +must keep in mind: + +

      +
    1. the layout for the particle attributes; + +
    2. the layout for the Field given to the particle SpatialLayout (which +is used to determine the layout of the space in which the particles +move around); and + +
    3. the actual SpatialLayout that connects the info about the Field +layout to the Particles attribute layout. +
    + +

    +The only thing that needs to match between the attribute and Field +layouts is the number of patches, which must be exactly the same. +The engine type (and thus the layout type) of the attributes +and of the Field do not have to match. Typically, both the attributes +and the Field will have a MultiPatch engine with the same number of +patches, but these engines will have different single-patch engine +types (Dynamic vs. Brick) and use different types of layouts (Dynamic +vs. Grid). + +

    +Note once again that in the simple case of a UniformLayout particle +layout, applications do not need to worry about any Field layout type, +only the particle attribute layout (which still needs to be shared) +and the particle layout. This commonly arises during the early +prototyping (i.e., pre-parallel) stages of application development, +when you might limit an application to a single patch for simplicity. + +

+ + +

Example: Particle-in-Cell Simulation

+ +

+Our third and final example of the Particles class is a
+particle-in-cell program, which simulates the motion of charged
+particles in a static sinusoidal electric field in two dimensions.
+This example brings together the Field classes (discussed elsewhere)
+with the Particles capabilities we have been describing here.
+
+

+Because this example is longer than the others in this document,
+it will be described in sections. For a unified listing of the source
+code, please see the file examples/Particles/PIC2d/PIC2d.cpp.
+
+The first step is to include all of the usual header files:
+
+

+001  #include "Pooma/Particles.h"
+002  #include "Pooma/DynamicArrays.h"
+003  #include "Pooma/Fields.h"
+004  #include "Utilities/Inform.h"
+005  #include <iostream>
+006  #include <stdlib.h>
+007  #include <math.h>
+
+ +

+Once this has been done, the application can define a traits class
+for the Particles object it is going to create. As always, this
+contains typedefs for AttributeEngineTag_t and ParticleLayout_t.
+The traits class for this example also includes an application-specific
+typedef called InterpolatorTag_t, for reasons discussed below.
+
+

+008  template <class EngineTag, class Centering, class MeshType, class FL,
+009            class InterpolatorTag>
+010  struct PTraits
+011  {
+012    // The type of engine to use in the attributes
+013    typedef EngineTag AttributeEngineTag_t;
+014  
+015    // The type of particle layout to use
+016    typedef SpatialLayout< DiscreteGeometry<Centering,MeshType>, FL> 
+017      ParticleLayout_t;
+018  
+019    // The type of interpolator to use
+020    typedef InterpolatorTag InterpolatorTag_t;
+021  };
+
+ +

+The interpolator tag type is included in the traits class because an
+application might want the Particles subclass to provide the type of
+interpolator to use. One example of this is the case in which a
+gather() or scatter() call occurs in a subroutine which is passed an
+object of a Particles subclass. This subroutine could extract the
+desired interpolator type from that object using:
+
+// Particles-derived type Particles_t already defined
+typedef typename Particles_t::InterpolatorTag_t InterpolatorTag_t;
+
+

+In this short example, this is not really necessary because +InterpolatorTag_t is being defined and then used within the +same file scope. Nevertheless, this illustrates a situation in +which the user might want to add new traits to their PTraits class +beyond the required traits AttributeEngineTag_t and ParticleLayout_t. + +
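The extraction pattern described above can be shown in isolation. The following is a minimal sketch, not POOMA code: NearestTag, LinearTag, SketchTraits, SketchParticles, and interpolatorName are stand-ins invented for illustration. It shows a generic routine recovering the interpolator tag from whatever particle type it is handed.

```cpp
#include <cassert>
#include <string>

// Stand-in interpolator tags (hypothetical; POOMA supplies its own, e.g. NGP).
struct NearestTag { static const char *name() { return "nearest"; } };
struct LinearTag  { static const char *name() { return "linear"; } };

// Minimal traits class carrying one extra, application-specific typedef.
template <class InterpolatorTag>
struct SketchTraits {
  typedef InterpolatorTag InterpolatorTag_t;
};

// Stand-in for a Particles subclass that re-exports the trait.
template <class PT>
struct SketchParticles {
  typedef typename PT::InterpolatorTag_t InterpolatorTag_t;
};

// Generic routine: it knows nothing about the concrete particle type,
// but can still recover the interpolator tag from it.
template <class Particles_t>
std::string interpolatorName(const Particles_t &)
{
  typedef typename Particles_t::InterpolatorTag_t InterpolatorTag_t;
  return InterpolatorTag_t::name();
}
```

Changing the template argument of SketchTraits is then the only edit needed to switch interpolators everywhere the routine is used.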

+We can now also define the class which will represent the charged +particles in the simulation. As in other examples, this is derived +from Particles and templated on a traits class, so that such things +as its layout and evaluation engine can be quickly, easily and +reliably changed. This class has three intrinsic properties: the +particle positions R, their velocities V, and their charge/mass +ratios qm. The class also has a fourth attribute called E, which is +used to record the electric field at each particle's position in order +to calculate forces. This calculation will be discussed in greater +detail below. + +

+024  template <class PT>
+025  class ChargedParticles : public Particles<PT>
+026  {
+027  public:
+028    // Typedefs
+029    typedef Particles<PT>                          Base_t;
+030    typedef typename Base_t::AttributeEngineTag_t  AttributeEngineTag_t;
+031    typedef typename Base_t::ParticleLayout_t      ParticleLayout_t;
+032    typedef typename ParticleLayout_t::AxisType_t  AxisType_t;
+033    typedef typename ParticleLayout_t::PointType_t PointType_t;
+034    typedef typename PT::InterpolatorTag_t         InterpolatorTag_t;
+035  
+036    // Dimensionality
+037    static const int dimensions = ParticleLayout_t::dimensions;
+038  
+039    // Constructor: set up layouts, register attributes
+040    ChargedParticles(const ParticleLayout_t &pl)
+041    : Particles<PT>(pl)
+042    {
+043      this->addAttribute(R);
+044      this->addAttribute(V);
+045      this->addAttribute(E);
+046      this->addAttribute(qm);
+047    }
+048  
+049    // Position and velocity attributes (as public members)
+050    DynamicArray<PointType_t,AttributeEngineTag_t> R;
+051    DynamicArray<PointType_t,AttributeEngineTag_t> V;
+052    DynamicArray<PointType_t,AttributeEngineTag_t> E;
+053    DynamicArray<double,     AttributeEngineTag_t> qm;
+054  };
+
+ +
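A note on name lookup in classes like this one: compilers that implement two-phase name lookup (GCC 3.4 is one that enforces it) do not search a dependent base class such as Particles<PT> for unqualified names. Inherited members like addAttribute() therefore have to be qualified, for instance with this->, inside a class template. The sketch below reproduces the situation with stand-in classes (BaseSketch and DerivedSketch are hypothetical names, not POOMA classes).

```cpp
#include <cassert>

// Stand-in for a templated base such as Particles<PT>.
template <class PT>
class BaseSketch {
public:
  BaseSketch() : count_m(0) {}
  void addAttribute(int &) { ++count_m; }   // just counts registrations
  int attributes() const { return count_m; }
private:
  int count_m;
};

template <class PT>
class DerivedSketch : public BaseSketch<PT> {
public:
  DerivedSketch() : R(0), V(0)
  {
    // addAttribute(R);     // rejected by conforming compilers: the
    //                      // dependent base is not searched for the name
    this->addAttribute(R);  // OK: lookup is deferred until instantiation
    this->addAttribute(V);
  }
  int R, V;
};
```

Writing BaseSketch<PT>::addAttribute(R) or adding a using-declaration would work as well. The this-> form is the one used in the test-suite fixes posted to this list.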

+With the two classes that the simulation relies upon defined, the +program next defines the dependent types, constants, and other values +that the application needs. These include the dimensionality of the +problem (which can easily be changed), the type of mesh on which the +calculations are done, the mesh's size, and so on: + +

+058  // Dimensionality of this problem
+059  static const int PDim = 2;
+060  
+061  // Engine tag type for attributes
+062  typedef MultiPatch<DynamicTag,Dynamic> AttrEngineTag_t;
+063  
+064  // Mesh type
+065  typedef UniformRectilinearMesh< PDim, Cartesian<PDim>, double > Mesh_t;
+066  
+067  // Centering of Field elements on mesh
+068  typedef Cell Centering_t;
+069  
+070  // Geometry type for Fields
+071  typedef DiscreteGeometry<Centering_t,Mesh_t> Geometry_t;
+072  
+073  // Field types
+074  typedef Field< Geometry_t, double,
+075                 MultiPatch<UniformTag,Brick> > DField_t;
+076  typedef Field< Geometry_t, Vector<PDim,double>,
+077                 MultiPatch<UniformTag,Brick> > VecField_t;
+078  
+079  // Field layout type, derived from Engine type
+080  typedef DField_t::Engine_t Engine_t;
+081  typedef Engine_t::Layout_t FLayout_t;
+082  
+083  // Type of interpolator
+084  typedef NGP InterpolatorTag_t;
+085  
+086  // Particle traits class
+087  typedef PTraits<AttrEngineTag_t,Centering_t,Mesh_t,FLayout_t,
+088                  InterpolatorTag_t> PTraits_t;
+089  
+090  // Type of particle layout
+091  typedef PTraits_t::ParticleLayout_t PLayout_t;
+092  
+093  // Type of actual particles
+094  typedef ChargedParticles<PTraits_t> Particles_t;
+095  
+096  // Grid sizes
+097  const int nx = 200, ny = 200;
+098  
+099  // Number of particles in simulation
+100  const int NumPart = 400;
+101  
+102  // Number of timesteps in simulation
+103  const int NumSteps = 20;
+104  
+105  // The value of pi (some compilers don't define M_PI)
+106  const double pi = acos(-1.0);
+107  
+108  // Maximum value for particle q/m ratio
+109  const double qmmax = 1.0;
+110  
+111  // Timestep
+112  const double dt = 1.0;
+
+ +

+The preparations above might seem overly elaborate, but the payoff +comes when the main simulation routine is written. After the usual +initialization call, and the creation of an Inform object to handle +output, the program defines one geometry object to represent the +problem domain, and another that includes a guard layer: + +

+115  int main(int argc, char *argv[])
+116  {
+117    // Initialize POOMA and output stream.
+118    Pooma::initialize(argc,argv);
+119    Inform out(argv[0]);
+120  
+121    out << "Begin PIC2d example code" << std::endl;
+122    out << "------------------------" << std::endl;
+123  
+124    // Create mesh and geometry objects for cell-centered fields.
+125    Interval<PDim> meshDomain(nx+1,ny+1);
+126    Mesh_t mesh(meshDomain);
+127    Geometry_t geometry(mesh);
+128  
+129    // Create a second geometry object that includes a guard layer.
+130    GuardLayers<PDim> gl(1);
+131    Geometry_t geometryGL(mesh,gl);
+
+ +

+The program then creates a pair of Field objects. The first, phi, +is a field of doubles and records the electrostatic potential at +points in the mesh. The second, EFD, is a field of two-dimensional +Vectors and stores the electric field at each mesh point. The types +used in these definitions were declared on lines 74-77 above. Note +how these definitions are made in terms of other defined types, such +as Geometry_t, rather than directly in terms of basic types. This +allows the application to be modified quickly and reliably with +minimal changes to the code. + +

+133    // Create field layout objects for our electrostatic potential
+134    // and our electric field.  Decomposition is 4 x 4 and replicated.
+135    Loc<PDim> blocks(4,4);
+136    FLayout_t flayout(geometry.physicalDomain(),blocks,ReplicatedTag()),
+137      flayoutGL(geometryGL.physicalDomain(),blocks,gl,ReplicatedTag());
+138  
+139    // Create and initialize electrostatic potential and electric field.
+140    DField_t phi(geometryGL,flayoutGL);
+141    VecField_t EFD(geometry,flayout);
+
+ +

+The application now adds periodic boundary conditions to the electrostatic
+potential phi, so that particles will not see sharp changes in the potential
+at the edges of the simulation domain. The values of phi and EFD are then
+set: phi is defined explicitly, while EFD is computed as the negative
+gradient of phi.
+
+

+144    // potential phi = phi0 * sin(2*pi*x/Lx) * cos(4*pi*y/Ly)
+145    // Note that phi is a periodic Field
+146    // Electric field EFD = -grad(phi)
+147    phi.addBoundaryConditions(AllPeriodicFaceBC());
+148    double phi0 = 0.01 * nx;
+149    phi = phi0 * sin(2.0*pi*phi.x().comp(0)/nx)
+150               * cos(4.0*pi*phi.x().comp(1)/ny);
+151    EFD = -grad<Centering_t>(phi);
+
+ +
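For reference, the field that the numerical gradient approximates can be written out analytically. This is a worked derivation from the potential defined in the comment on line 144, taking L_x = nx and L_y = ny as in the code. The program itself evaluates the gradient on the mesh rather than using these closed forms.

```latex
\phi(x,y) = \phi_0 \,\sin\!\left(\frac{2\pi x}{L_x}\right)\cos\!\left(\frac{4\pi y}{L_y}\right),
\qquad \mathbf{E} = -\nabla\phi

E_x = -\frac{\partial\phi}{\partial x}
    = -\frac{2\pi\phi_0}{L_x}\,\cos\!\left(\frac{2\pi x}{L_x}\right)\cos\!\left(\frac{4\pi y}{L_y}\right)

E_y = -\frac{\partial\phi}{\partial y}
    = \frac{4\pi\phi_0}{L_y}\,\sin\!\left(\frac{2\pi x}{L_x}\right)\sin\!\left(\frac{4\pi y}{L_y}\right)
```

Both components are periodic over the domain, which is consistent with the periodic boundary condition placed on phi.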

+With the fields in place, the application creates the particles
+whose motions are to be simulated, and adds periodic boundary
+conditions to this object as well. The globalCreate() call
+creates NumPart particles in total, divided evenly among the
+processors.
+
+

+153    // Create a particle layout object for our use
+154    PLayout_t layout(geometry,flayout);
+155  
+156    // Create a Particles object and set periodic boundary conditions
+157    Particles_t P(layout);
+158    Particles_t::PointType_t lower(0.0,0.0), upper(nx,ny);
+159    PeriodicBC<Particles_t::PointType_t> bc(lower,upper);
+160    P.addBoundaryCondition(P.R,bc);
+161  
+162    // Create an equal number of particles on each processor
+163    // and recompute global domain.
+164    P.globalCreate(NumPart);
+
+ +
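The effect of a periodic boundary condition on a single coordinate can be sketched as follows. wrapPeriodic is a hypothetical helper written for illustration and is not POOMA's PeriodicBC class. Whether POOMA treats the upper bound as inclusive or exclusive is not stated here, so the half-open interval is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>

// Map x into the interval [lower, upper) by periodic wrapping.
// Assumes upper > lower.
double wrapPeriodic(double x, double lower, double upper)
{
  const double length = upper - lower;
  double y = std::fmod(x - lower, length);  // y has the sign of (x - lower)
  if (y < 0.0)
    y += length;                            // shift negative remainders up
  return lower + y;
}
```

With the bounds used above (0 to nx = 200), a particle that steps to x = 205 reappears at x = 5, and one that steps to x = -3 reappears at x = 197.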

+Note that the definitions of lower and upper could be made +dimension-independent by defining them with a loop. If ng +is an array of ints of length PDim, then this loop would be: + +

+Particles_t::PointType_t lower, upper;
+for (int d=0; d<PDim; ++d)
+{
+  lower(d) = 0;
+  upper(d) = ng[d];
+}
+
+ +

+The application then randomizes the particles' positions and +charge/mass ratios using a sequential loop (since parallel random +number generation is not yet in POOMA). Once this has finished, the +method swap() is called to redistribute the particles based on their +positions; i.e., to move each particle to its home processor. +The initial positions, velocities, and charge/mass ratios of the +particles are then printed out. + +

+166    // Random initialization for particle positions in nx by ny domain
+167    // Zero initialization for particle velocities
+168    // Random initialization for charge-to-mass ratio from -qmmax to qmmax
+169    P.V = Particles_t::PointType_t(0.0);
+170    srand(12345U);
+171    Particles_t::PointType_t initPos;
+172    typedef Particles_t::AxisType_t Coordinate_t;
+173    Coordinate_t ranmax = static_cast<Coordinate_t>(RAND_MAX);
+174    double ranmaxd = static_cast<double>(RAND_MAX);
+175    for (int i = 0; i < NumPart; ++i)
+176    {
+177      initPos(0) = nx * (rand() / ranmax);
+178      initPos(1) = ny * (rand() / ranmax);
+179      P.R(i) = initPos;
+180      P.qm(i) = qmmax * (2 * (rand() / ranmaxd) - 1);
+181    }
+182  
+183    // Redistribute particle data based on spatial layout
+184    P.swap(P.R);
+185  
+186    out << "PIC2d setup complete." << std::endl;
+187    out << "---------------------" << std::endl;
+188  
+189    // Display the initial particle positions, velocities and qm values.
+190    out << "Initial particle data:" << std::endl;
+191    out << "Particle positions: "  << P.R << std::endl;
+192    out << "Particle velocities: " << P.V << std::endl;
+193    out << "Particle charge-to-mass ratios: " << P.qm << std::endl;
+
+ +

+The application is now able to enter its main timestep loop. +In each time step, the particle positions are updated, and then +sync() is called to invoke boundary conditions, swap particles, +and then renumber. A call is then made to gather() (line 208) to +determine the field at each particle's location. As discussed +earlier, this function uses the interpolator to determine values +that lie off mesh points. Once the field strength is known, the +particle velocities can be updated: + +

+195    // Begin main timestep loop
+196    for (int it=1; it <= NumSteps; ++it)
+197    {
+198      // Advance particle positions
+199      out << "Advance particle positions ..." << std::endl;
+200      P.R = P.R + dt * P.V;
+201  
+202      // Invoke boundary conditions and update particle distribution
+203      out << "Synchronize particles ..." << std::endl;
+204      P.sync(P.R);
+205     
+206      // Gather the E field to the particle positions
+207      out << "Gather E field ..." << std::endl;
+208      gather( P.E, EFD, P.R, Particles_t::InterpolatorTag_t() );
+209  
+210      // Advance the particle velocities
+211      out << "Advance particle velocities ..." << std::endl;
+212      P.V = P.V + dt * P.qm * P.E;
+213    }
+
+ +
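The order of operations in the loop above (move, apply boundary conditions, gather, accelerate) can be followed in a self-contained one-dimensional sketch that does not use POOMA. Particles1D and step are hypothetical names. The field is stored per cell, so the nearest-grid-point gather reduces to looking up the cell that contains the particle.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1D particle container; member names echo the example's
// attributes R, V, E and qm.
struct Particles1D {
  std::vector<double> R, V, E, qm;
};

// One time step. Cell i covers [i, i+1); the domain is [0, field.size())
// and is periodic.
void step(Particles1D &p, const std::vector<double> &field, double dt)
{
  const double length = static_cast<double>(field.size());
  for (std::size_t i = 0; i < p.R.size(); ++i) {
    p.R[i] += dt * p.V[i];                           // advance position
    p.R[i] -= length * std::floor(p.R[i] / length);  // periodic wrap
    std::size_t cell = static_cast<std::size_t>(p.R[i]);
    if (cell >= field.size())
      cell = 0;                    // guard against rounding up to length
    p.E[i] = field[cell];                            // NGP gather
    p.V[i] += dt * p.qm[i] * p.E[i];                 // advance velocity
  }
}
```

In the POOMA version each of these four statements is a single data-parallel expression over all particles, and sync() additionally moves particles between patches, which this serial sketch has no need to do.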

+Finally, the state of the particles at the end of the simulation is +printed out, and the simulation is closed down: + +

+215    // Display the final particle positions, velocities and qm values.
+216    out << "PIC2d timestep loop complete!" << std::endl;
+217    out << "-----------------------------" << std::endl;
+218    out << "Final particle data:" << std::endl;
+219    out << "Particle positions: "  << P.R << std::endl;
+220    out << "Particle velocities: " << P.V << std::endl;
+221    out << "Particle charge-to-mass ratios: " << P.qm << std::endl;
+222  
+223    // Shut down POOMA and exit
+224    out << "End PIC2d example code." << std::endl;
+225    out << "-----------------------" << std::endl;
+226    Pooma::finalize();
+227    return 0;
+
+ + +

Summary

+ +

+This document has shown how POOMA's Field and Particles classes can
+be combined to create complete physical simulations. While more setup
+code is required than with Fortran-77 or C, the payoff is high-performance
+programs that are more flexible and easier to maintain.
+
+
+
Index: Layout.html
===================================================================
RCS file: /home/pooma/Repository/r2/docs/Layout.html,v
retrieving revision 1.3
diff -u -u -r1.3 Layout.html
--- Layout.html 20 Aug 2004 20:14:18 -0000 1.3
+++ Layout.html 23 Aug 2004 11:16:32 -0000
@@ -5,8 +5,11 @@
 Layout and related classes
-
- 
+
+
+

POOMA banner
+

Layouts and related classes:

From rguenth at tat.physik.uni-tuebingen.de Mon Aug 23 15:39:25 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 23 Aug 2004 17:39:25 +0200 (CEST) Subject: [PATCH] Random stuff Message-ID: Collected stuff from another repository and icc -strict_ansi checking. Ok? Richard. 2004Aug23 Richard Guenther * config/arch/LINUXICC.conf: ignore tail padding warnings, specify strict cmdline argument. src/Connect/Lux/tests/lux_test2.cpp: fix compiling without Lux. src/DataBrowser/tests/TestDataBrowser.cpp: fix old Field stuff. src/Domain/NewDomain.h: avoid comparison warnings. src/Layout/LayoutBase.h: likewise. src/Engine/Stencil.h: add missing return to assignment operator. src/Particles/tests/interpolate.cpp: honour two-stage name lookup rules. src/Particles/tests/particle_tests.h: likewise. -------------- next part -------------- ===== r2/config/arch/LINUXICC.conf 1.4 vs edited ===== --- 1.4/r2/config/arch/LINUXICC.conf 2004-01-07 10:02:17 +01:00 +++ edited/r2/config/arch/LINUXICC.conf 2004-08-23 16:29:37 +02:00 @@ -165,13 +165,13 @@ $cppnoex = ""; # flag to use to turn off exceptions $cppverbose = "-v"; # flag for verbose compiler output $cpponeper = ""; # flag to turn on one-instantance-per-obj -$cppstrict = ""; # flag for ANSI conformance checking +$cppstrict = "-strict_ansi"; # flag for ANSI conformance checking ### debug or optimized build settings for C++ applications -$cppdbg_app = "-g -wd161"; -$cppopt_app = "-DNOPAssert -DNOCTAssert -O2 -wd161"; +$cppdbg_app = "-g -wd161,1476"; +$cppopt_app = "-DNOPAssert -DNOCTAssert -O2 -wd161,1476"; ### debug or optimized build settings for C++ libraries ===== r2/src/Connect/Lux/tests/lux_test2.cpp 1.1 vs edited ===== --- 1.1/r2/src/Connect/Lux/tests/lux_test2.cpp 2002-05-13 17:47:28 +02:00 +++ edited/r2/src/Connect/Lux/tests/lux_test2.cpp 2004-08-23 17:16:24 +02:00 @@ -39,7 +39,7 @@ // Traits class for Particles object -template struct PTraits { @@ -47,7 +47,7 @@ typedef EngineTag 
AttributeEngineTag_t; // The type of particle layout to use - typedef SpatialLayout,FL> + typedef SpatialLayout ParticleLayout_t; // The type of interpolator to use @@ -74,10 +74,10 @@ ChargedParticles(const ParticleLayout_t &pl) : Particles(pl) { - addAttribute(R); - addAttribute(V); - addAttribute(E); - addAttribute(qm); + this->addAttribute(R); + this->addAttribute(V); + this->addAttribute(E); + this->addAttribute(qm); } // Position and velocity attributes (as public members) @@ -95,18 +95,12 @@ typedef MultiPatch AttrEngineTag_t; // Mesh type -typedef UniformRectilinearMesh,double> Mesh_t; - -// Centering of Field elements on mesh -typedef Cell Centering_t; - -// Geometry type for Fields -typedef DiscreteGeometry Geometry_t; +typedef UniformRectilinearMesh Mesh_t; // Field types -typedef Field< Geometry_t, double, +typedef Field< Mesh_t, double, MultiPatch > DField_t; -typedef Field< Geometry_t, Vector, +typedef Field< Mesh_t, Vector, MultiPatch > VecField_t; // Field layout type, derived from Engine type @@ -117,7 +111,7 @@ typedef NGP InterpolatorTag_t; // Particle traits class -typedef PTraits PTraits_t; // Type of particle layout ===== r2/src/DataBrowser/tests/TestDataBrowser.cpp 1.1 vs edited ===== --- 1.1/r2/src/DataBrowser/tests/TestDataBrowser.cpp 2002-05-13 17:47:29 +02:00 +++ edited/r2/src/DataBrowser/tests/TestDataBrowser.cpp 2004-08-23 17:08:08 +02:00 @@ -76,8 +76,8 @@ // Global typedefs; useful in making user-defined functions below: // 1D typedef UniformRectilinearMesh<1> Mesh1_t; -typedef Field, double> ScalarField1_t; -typedef Field, Vector<1> > VectorField1_t; +typedef Field ScalarField1_t; +typedef Field > VectorField1_t; typedef Array<1, double, CompressibleBrick> ScalarArray1_t; typedef Array<1, Vector<1>, CompressibleBrick> VectorArray1_t; // 2D @@ -136,17 +136,18 @@ Mesh1_t mesh(vertDomain); // Create the 1D geometry: - DiscreteGeometry geomc(mesh, GuardLayers<1>(2)); + Centering<1> cell = canonicalCentering<1>(CellType, Continuous); + 
DomainLayout<1> layout(vertDomain); fout << std::endl << "=========== 1D ============" << std::endl; // Make some 1D fields: - ScalarField1_t s1(geomc); - VectorField1_t v1(geomc); + ScalarField1_t s1(cell, layout, mesh); + VectorField1_t v1(cell, layout, mesh); // Assign to spatially-varying values: - s1.all() = s1.xAll().comp(0); - v1.all() = v1.xAll(); + s1.all() = positions(s1).comp(0); + v1.all() = positions(v1); // Create some 1D Arrays: ScalarArray1_t sa1(cellDomain); ===== r2/src/Domain/NewDomain.h 1.6 vs edited ===== --- 1.6/r2/src/Domain/NewDomain.h 2003-10-27 11:25:05 +01:00 +++ edited/r2/src/Domain/NewDomain.h 2004-08-23 11:10:02 +02:00 @@ -225,7 +225,7 @@ static void combine(RT &rt, const UT &u, const CT& ct) { CTAssert(DS >= 0 && SliceDS >= 0); CTAssert(DRT > (DS + DCT - 1)); - CTAssert(DUT == DRT); + CTAssert((int)DUT == DRT); for (int i=0; i < DCT; ++i) { DomainTraits::getDomain(rt, DS + i).setWildcardDomain( DomainTraits::getPointDomain(u, DS + i), ===== r2/src/Engine/Stencil.h 1.13 vs edited ===== --- 1.13/r2/src/Engine/Stencil.h 2004-08-21 20:44:21 +02:00 +++ edited/r2/src/Engine/Stencil.h 2004-08-23 16:47:27 +02:00 @@ -346,6 +346,7 @@ domain_m[d] = model.domain()[d]; offset_m[d] = model.offset(d); } + return *this; } //============================================================ ===== r2/src/Layout/LayoutBase.h 1.7 vs edited ===== --- 1.7/r2/src/Layout/LayoutBase.h 2004-01-17 16:24:21 +01:00 +++ edited/r2/src/Layout/LayoutBase.h 2004-08-23 11:10:31 +02:00 @@ -947,12 +947,12 @@ // Our dimensionality must be the same as the slice's reduced // dimensionality. - CTAssert(DT::sliceDimensions == Dim); + CTAssert((int)DT::sliceDimensions == Dim); // The slice's dimensionality must match that of the previous // view. - CTAssert(DT::dimensions == LV::dimensions); + CTAssert((int)DT::dimensions == LV::dimensions); // The layout passed in must be initialized. 
===== r2/src/Particles/tests/interpolate.cpp 1.3 vs edited ===== --- 1.3/r2/src/Particles/tests/interpolate.cpp 2004-07-15 11:25:53 +02:00 +++ edited/r2/src/Particles/tests/interpolate.cpp 2004-08-23 16:55:45 +02:00 @@ -93,10 +93,10 @@ MyParticles(const ParticleLayout_t& pl) : Particles(pl) { - addAttribute(pos); - addAttribute(efield); - addAttribute(charge); - addAttribute(cache); + this->addAttribute(pos); + this->addAttribute(efield); + this->addAttribute(charge); + this->addAttribute(cache); } // List of attributes; we'll just make them public data members here, ===== r2/src/Particles/tests/particle_tests.h 1.1 vs edited ===== --- 1.1/r2/src/Particles/tests/particle_tests.h 2002-05-13 17:47:41 +02:00 +++ edited/r2/src/Particles/tests/particle_tests.h 2004-08-23 16:30:54 +02:00 @@ -153,10 +153,10 @@ void addAllAttributes() { - addAttribute(pos); - addAttribute(mom); - addAttribute(ad); - addAttribute(ai); + this->addAttribute(pos); + this->addAttribute(mom); + this->addAttribute(ad); + this->addAttribute(ai); } }; From oldham at codesourcery.com Mon Aug 23 15:39:38 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 23 Aug 2004 08:39:38 -0700 Subject: [PATCH] Fix/improve PETSc wrapper In-Reply-To: <41279F8F.1010107@tat.physik.uni-tuebingen.de> References: <41279F8F.1010107@tat.physik.uni-tuebingen.de> Message-ID: <412A0FBA.4020808@codesourcery.com> Richard Guenther wrote: > > Found in one of my repositories. > > Ok? > > Richard. > > > 2004Aug21 Richard Guenther > > * src/Transform/PETSc.h: handle expression engines for > initialization, support periodic setup, fix MP patch > computation. Yes, this looks fine to commit. Does changing from e(I) to e.read(I) yield measurable changes? 
>------------------------------------------------------------------------ > >Index: src/Transform/PETSc.h >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Transform/PETSc.h,v >retrieving revision 1.1 >diff -u -u -r1.1 PETSc.h >--- src/Transform/PETSc.h 24 Mar 2004 18:26:32 -0000 1.1 >+++ src/Transform/PETSc.h 21 Aug 2004 19:13:58 -0000 >@@ -109,7 +109,7 @@ > int idx=0; > Interval<1> d(e.domain()); > for (int I=d.first(); I<=d.last(); ++I) >- pa[idx++] = e(I); >+ pa[idx++] = e.read(I); > VecRestoreArray(v, &pa); > } > template >@@ -138,7 +138,7 @@ > Interval<2> d(e.domain()); > for (int J=d[1].first(); J<=d[1].last(); ++J) > for (int I=d[0].first(); I<=d[0].last(); ++I) >- pa[idx++] = e(I, J); >+ pa[idx++] = e.read(I, J); > VecRestoreArray(v, &pa); > } > template >@@ -169,7 +169,7 @@ > for (int K=d[2].first(); K<=d[2].last(); ++K) > for (int J=d[1].first(); J<=d[1].last(); ++J) > for (int I=d[0].first(); I<=d[0].last(); ++I) >- pa[idx++] = e(I, J, K); >+ pa[idx++] = e.read(I, J, K); > VecRestoreArray(v, &pa); > } > template >@@ -197,12 +197,27 @@ > template > struct PoomaDA { > >- /// Creates a PETSc DA from the specified layout. >+ /// Creates a PETSc DA from the specified array/field/layout. > /// Extra arguments are like DACreateNd, namely the periodicity > /// and stencil type and the stencil width. > >+ template >+ PoomaDA(const Array &a, DAPeriodicType pt, DAStencilType st, int sw) >+ { >+ initialize(a.physicalDomain(), pt, st, sw); >+ } >+ >+ template >+ PoomaDA(const Field &f, DAPeriodicType pt, DAStencilType st, int sw) >+ { >+ initialize(f.physicalDomain(), pt, st, sw); >+ } >+ > template >- PoomaDA(const Layout &l, DAPeriodicType pt, DAStencilType st, int sw); >+ PoomaDA(const Layout &l, DAPeriodicType pt, DAStencilType st, int sw) >+ { >+ initialize(l.innerDomain(), pt, st, sw); >+ } > > ~PoomaDA() > { >@@ -216,6 +231,15 @@ > operator DA() const { return da; } > > >+ /// Access PeriodicType. 
>+ >+ DAPeriodicType periodicType() const { return info[0].pt; } >+ >+ /// Access StencilType. >+ >+ DAStencilType stencilType() const { return info[0].st; } >+ >+ > /// Assign from POOMA engine to PETSc vector. > > template >@@ -234,6 +258,7 @@ > template > void assign(Vec v, const Field &f) > { >+ forEach(f, PerformUpdateTag(), NullCombine()); > this->assign(v, f.fieldEngine().engine()); > } > >@@ -257,8 +282,12 @@ > void assign(const Field &f, Vec v) > { > this->assign(f.fieldEngine().engine(), v); >+ f.notifyPostWrite(); > } > >+protected: >+ void initialize(const Interval &d, DAPeriodicType pt, DAStencilType st, int sw); >+ > > private: > DA da; >@@ -270,11 +299,10 @@ > > > template >-template >-PoomaDA::PoomaDA(const Layout &l, DAPeriodicType pt, DAStencilType st, int sw) >- : offset(Loc(0)) >+void PoomaDA::initialize(const Interval &d, DAPeriodicType pt, DAStencilType st, int sw) > { >- Interval domain = l.innerDomain(); >+ offset = Loc(0); >+ Interval domain = d; > if (pt != DA_XPERIODIC > && pt != DA_XYPERIODIC > && pt != DA_XYZPERIODIC >@@ -370,7 +398,7 @@ > Interval lPatch(PoomaDAGetDomain::innerDomain(this->info[i])); > Array > a; > a.engine() = Engine >(i, lPatch); >- Array e_array(ViewEngine_t(e, lPatch - this->offset)); >+ Array e_array(ViewEngine_t(e, lPatch + this->offset)); > a = e_array; > > // remember local engine >@@ -414,7 +442,7 @@ > > // distribute the copy > Array e_array; >- e_array.engine() = ViewEngine_t(e, lPatch - this->offset); >+ e_array.engine() = ViewEngine_t(e, lPatch + this->offset); > e_array = a; > } > } > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Aug 23 15:45:37 2004 From: oldham at codesourcery.com (Jeffrey D. 
Oldham) Date: Mon, 23 Aug 2004 08:45:37 -0700 Subject: [PATCH] Fix compiling Doof2d In-Reply-To: <4127C671.10503@tat.physik.uni-tuebingen.de> References: <4127C671.10503@tat.physik.uni-tuebingen.de> Message-ID: <412A1121.5030300@codesourcery.com> Richard Guenther wrote: > Fixes ISO conformance problems with Doof2d benchmark. > > Ok? > > Richard. > > > 2004Aug22 Richard Guenther > > * benchmarks/Doof2d/Doof2d.h: fix ISO conformance. Thanks for fixing this. Please commit the changes. I like Doof2D. It's relatively simple but still interesting. It's good to have it work again. >------------------------------------------------------------------------ > >--- pooma-bk/r2/benchmarks/Doof2d/Doof2d.h 2003-11-23 23:19:54.000000000 +0100 >+++ pooma-bib/r2/benchmarks/Doof2d/Doof2d.h 2004-08-22 00:00:48.000000000 +0200 >@@ -346,7 +346,7 @@ > const char* qualification() const > { > typedef typename Store::Engine_t Engine_t; >- return ::qualification(a_m).c_str(); >+ return ::qualification(this->a_m).c_str(); > } > > void run() >@@ -367,27 +367,27 @@ > { > for (i = 2; i <= this->n_m - 1; i++) > { >- a_m(i,j) = fact * >- (b_m(i+1,j+1) + b_m(i+1,j ) + b_m(i+1,j-1) + >- b_m(i ,j+1) + b_m(i ,j ) + b_m(i ,j-1) + >- b_m(i-1,j+1) + b_m(i-1,j ) + b_m(i-1,j-1)); >+ this->a_m(i,j) = fact * >+ (this->b_m(i+1,j+1) + this->b_m(i+1,j ) + this->b_m(i+1,j-1) + >+ this->b_m(i ,j+1) + this->b_m(i ,j ) + this->b_m(i ,j-1) + >+ this->b_m(i-1,j+1) + this->b_m(i-1,j ) + this->b_m(i-1,j-1)); > } > } > for (j = 2; j <= this->n_m-1; j++) > { > for (i = 2; i <= this->n_m-1; i++) > { >- b_m(i,j) = fact * >- (a_m(i+1,j+1) + a_m(i+1,j ) + a_m(i+1,j-1) + >- a_m(i ,j+1) + a_m(i ,j ) + a_m(i ,j-1) + >- a_m(i-1,j+1) + a_m(i-1,j ) + a_m(i-1,j-1)); >+ this->b_m(i,j) = fact * >+ (this->a_m(i+1,j+1) + this->a_m(i+1,j ) + this->a_m(i+1,j-1) + >+ this->a_m(i ,j+1) + this->a_m(i ,j ) + this->a_m(i ,j-1) + >+ this->a_m(i-1,j+1) + this->a_m(i-1,j ) + this->a_m(i-1,j-1)); > } > } > } > > // Save result for checking. 
> >- this->check_m = b_m(this->n_m / 2, this->n_m / 2); >+ this->check_m = this->b_m(this->n_m / 2, this->n_m / 2); > } > > void runSetup() >@@ -398,11 +398,11 @@ > { > for (int i = 1; i <= this->n_m; i++) > { >- a_m(i,j) = 0.0; >- b_m(i,j) = 0.0; >+ this->a_m(i,j) = 0.0; >+ this->b_m(i,j) = 0.0; > } > } >- b_m(this->n_m/2,this->n_m/2) = 1000.0; >+ this->b_m(this->n_m/2,this->n_m/2) = 1000.0; > } > }; > >@@ -431,7 +431,7 @@ > { > typedef typename Store::Engine_t Engine_t; > >- std::string qual = ::qualification(a_m); >+ std::string qual = ::qualification(this->a_m); > > if (guarded_m) > { >@@ -458,31 +458,31 @@ > > for (k = 0; k < 5; ++k) > { >- a_m(I,J) = fact * >- (b_m(I+1,J+1) + b_m(I+1,J ) + b_m(I+1,J-1) + >- b_m(I ,J+1) + b_m(I ,J ) + b_m(I ,J-1) + >- b_m(I-1,J+1) + b_m(I-1,J ) + b_m(I-1,J-1)); >- b_m(I,J) = fact * >- (a_m(I+1,J+1) + a_m(I+1,J ) + a_m(I+1,J-1) + >- a_m(I ,J+1) + a_m(I ,J ) + a_m(I ,J-1) + >- a_m(I-1,J+1) + a_m(I-1,J ) + a_m(I-1,J-1)); >+ this->a_m(this->I,this->J) = fact * >+ (this->b_m(this->I+1,this->J+1) + this->b_m(this->I+1,this->J ) + this->b_m(this->I+1,this->J-1) + >+ this->b_m(this->I ,this->J+1) + this->b_m(this->I ,this->J ) + this->b_m(this->I ,this->J-1) + >+ this->b_m(this->I-1,this->J+1) + this->b_m(this->I-1,this->J ) + this->b_m(this->I-1,this->J-1)); >+ this->b_m(this->I,this->J) = fact * >+ (this->a_m(this->I+1,this->J+1) + this->a_m(this->I+1,this->J ) + this->a_m(this->I+1,this->J-1) + >+ this->a_m(this->I ,this->J+1) + this->a_m(this->I ,this->J ) + this->a_m(this->I ,this->J-1) + >+ this->a_m(this->I-1,this->J+1) + this->a_m(this->I-1,this->J ) + this->a_m(this->I-1,this->J-1)); > } > > Pooma::blockAndEvaluate(); > > // Save result for checking. > >- this->check_m = b_m(this->n_m / 2, this->n_m / 2); >+ this->check_m = this->b_m(this->n_m / 2, this->n_m / 2); > } > > void runSetup() > { > // Run setup. 
> >- a_m = 0.0; >- b_m = 0.0; >+ this->a_m = 0.0; >+ this->b_m = 0.0; > Pooma::blockAndEvaluate(); >- b_m(this->n_m/2,this->n_m/2) = 1000.0; >+ this->b_m(this->n_m/2,this->n_m/2) = 1000.0; > } > > private: >@@ -535,7 +535,7 @@ > const char* qualification() const > { > typedef typename Store::Engine_t Engine_t; >- std::string qual = ::qualification(a_m); >+ std::string qual = ::qualification(this->a_m); > > if (guarded_m) > { >@@ -551,7 +551,7 @@ > void run() > { > int k; >- Interval<2> IJ(I,J); >+ Interval<2> IJ(this->I,this->J); > > // Run setup. > >@@ -561,30 +561,30 @@ > > for (k = 0; k < 5; ++k) > { >- a_m(IJ) = stencil_m(b_m,IJ); >+ this->a_m(IJ) = stencil_m(this->b_m,IJ); > > // Note we use this form of the stencil since adding guard cells can > // add external guard cells so the domain of a_m might be bigger than > // we expect, in which case stencil_m(a_m) would be bigger than IJ. > >- b_m(IJ) = stencil_m(a_m,IJ); >+ this->b_m(IJ) = stencil_m(this->a_m,IJ); > } > > Pooma::blockAndEvaluate(); > > // Save result for checking. > >- this->check_m = b_m(this->n_m / 2, this->n_m / 2); >+ this->check_m = this->b_m(this->n_m / 2, this->n_m / 2); > } > > void runSetup() > { > // Run setup. > >- a_m = 0.0; >- b_m = 0.0; >+ this->a_m = 0.0; >+ this->b_m = 0.0; > Pooma::blockAndEvaluate(); >- b_m(this->n_m/2,this->n_m/2) = 1000.0; >+ this->b_m(this->n_m/2,this->n_m/2) = 1000.0; > > } > > > -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Mon Aug 23 15:48:01 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 23 Aug 2004 17:48:01 +0200 (CEST) Subject: [pooma-dev] Re: [PATCH] Fix/improve PETSc wrapper In-Reply-To: <412A0FBA.4020808@codesourcery.com> Message-ID: On Mon, 23 Aug 2004, Jeffrey D. Oldham wrote: > Richard Guenther wrote: > > > > > Found in one of my repositories. > > > > Ok? > > > > Richard. 
> > > > > > 2004Aug21 Richard Guenther > > > > * src/Transform/PETSc.h: handle expression engines for > > initialization, support periodic setup, fix MP patch > > computation. > > > Yes, this looks fine to commit. Does changing from e(I) to e.read(I) > yield measurable changes? It's always necessary to use the read() versions if (possibly) targeting read-only engines such as the ExpressionEngine. Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From oldham at codesourcery.com Mon Aug 23 15:48:10 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 23 Aug 2004 08:48:10 -0700 Subject: [PATCH] Random stuff In-Reply-To: References: Message-ID: <412A11BA.8030503@codesourcery.com> Richard Guenther wrote: >Collected stuff from another repository and icc -strict_ansi checking. > >Ok? > >Richard. > > >2004Aug23 Richard Guenther > > * config/arch/LINUXICC.conf: ignore tail padding warnings, > specify strict cmdline argument. > src/Connect/Lux/tests/lux_test2.cpp: fix compiling without > Lux. > src/DataBrowser/tests/TestDataBrowser.cpp: fix old Field > stuff. > src/Domain/NewDomain.h: avoid comparison warnings. > src/Layout/LayoutBase.h: likewise. > src/Engine/Stencil.h: add missing return to assignment > operator. > src/Particles/tests/interpolate.cpp: honour two-stage name > lookup rules. > src/Particles/tests/particle_tests.h: likewise. > Yes, please commit this. 
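[Editor's sketch] The "two-stage name lookup" entries in the ChangeLog above describe the same rule that GCC 3.4 began enforcing: inside a class template, an unqualified name is not looked up in a base class that depends on a template parameter, so inherited members have to be reached through `this->` or an explicit base qualifier. The following is a minimal sketch of that rule; the names `Base`, `Derived`, `attrs_m` and `count()` are hypothetical and are not taken from the POOMA sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical base class template with a member function and a data member.
template <class T>
class Base {
public:
  void addAttribute(const T &a) { attrs_m.push_back(a); }
  std::size_t count() const { return attrs_m.size(); }

protected:
  std::vector<T> attrs_m;
};

// Base<T> is a dependent base: its members are invisible to unqualified
// lookup inside Derived<T>.
template <class T>
class Derived : public Base<T> {
public:
  Derived() {
    // addAttribute(T());        // rejected by a conforming compiler
    this->addAttribute(T());     // OK: lookup is deferred to instantiation
    Base<T>::addAttribute(T());  // OK: explicit qualification
  }

  // Data members of the dependent base need the same treatment.
  std::size_t stored() const { return this->attrs_m.size(); }
};
```

Either spelling works; the patches in this thread use `this->` throughout, which keeps the change mechanical and does not require naming the base class.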
>------------------------------------------------------------------------ > >===== r2/config/arch/LINUXICC.conf 1.4 vs edited ===== >--- 1.4/r2/config/arch/LINUXICC.conf 2004-01-07 10:02:17 +01:00 >+++ edited/r2/config/arch/LINUXICC.conf 2004-08-23 16:29:37 +02:00 >@@ -165,13 +165,13 @@ > $cppnoex = ""; # flag to use to turn off exceptions > $cppverbose = "-v"; # flag for verbose compiler output > $cpponeper = ""; # flag to turn on one-instantance-per-obj >-$cppstrict = ""; # flag for ANSI conformance checking >+$cppstrict = "-strict_ansi"; # flag for ANSI conformance checking > > > ### debug or optimized build settings for C++ applications > >-$cppdbg_app = "-g -wd161"; >-$cppopt_app = "-DNOPAssert -DNOCTAssert -O2 -wd161"; >+$cppdbg_app = "-g -wd161,1476"; >+$cppopt_app = "-DNOPAssert -DNOCTAssert -O2 -wd161,1476"; > > > ### debug or optimized build settings for C++ libraries >===== r2/src/Connect/Lux/tests/lux_test2.cpp 1.1 vs edited ===== >--- 1.1/r2/src/Connect/Lux/tests/lux_test2.cpp 2002-05-13 17:47:28 +02:00 >+++ edited/r2/src/Connect/Lux/tests/lux_test2.cpp 2004-08-23 17:16:24 +02:00 >@@ -39,7 +39,7 @@ > > > // Traits class for Particles object >-template +template class InterpolatorTag> > struct PTraits > { >@@ -47,7 +47,7 @@ > typedef EngineTag AttributeEngineTag_t; > > // The type of particle layout to use >- typedef SpatialLayout,FL> >+ typedef SpatialLayout > ParticleLayout_t; > > // The type of interpolator to use >@@ -74,10 +74,10 @@ > ChargedParticles(const ParticleLayout_t &pl) > : Particles(pl) > { >- addAttribute(R); >- addAttribute(V); >- addAttribute(E); >- addAttribute(qm); >+ this->addAttribute(R); >+ this->addAttribute(V); >+ this->addAttribute(E); >+ this->addAttribute(qm); > } > > // Position and velocity attributes (as public members) >@@ -95,18 +95,12 @@ > typedef MultiPatch AttrEngineTag_t; > > // Mesh type >-typedef UniformRectilinearMesh,double> Mesh_t; >- >-// Centering of Field elements on mesh >-typedef Cell Centering_t; >- >-// 
Geometry type for Fields >-typedef DiscreteGeometry Geometry_t; >+typedef UniformRectilinearMesh Mesh_t; > > // Field types >-typedef Field< Geometry_t, double, >+typedef Field< Mesh_t, double, > MultiPatch > DField_t; >-typedef Field< Geometry_t, Vector, >+typedef Field< Mesh_t, Vector, > MultiPatch > VecField_t; > > // Field layout type, derived from Engine type >@@ -117,7 +111,7 @@ > typedef NGP InterpolatorTag_t; > > // Particle traits class >-typedef PTraits+typedef PTraits InterpolatorTag_t> PTraits_t; > > // Type of particle layout >===== r2/src/DataBrowser/tests/TestDataBrowser.cpp 1.1 vs edited ===== >--- 1.1/r2/src/DataBrowser/tests/TestDataBrowser.cpp 2002-05-13 17:47:29 +02:00 >+++ edited/r2/src/DataBrowser/tests/TestDataBrowser.cpp 2004-08-23 17:08:08 +02:00 >@@ -76,8 +76,8 @@ > // Global typedefs; useful in making user-defined functions below: > // 1D > typedef UniformRectilinearMesh<1> Mesh1_t; >-typedef Field, double> ScalarField1_t; >-typedef Field, Vector<1> > VectorField1_t; >+typedef Field ScalarField1_t; >+typedef Field > VectorField1_t; > typedef Array<1, double, CompressibleBrick> ScalarArray1_t; > typedef Array<1, Vector<1>, CompressibleBrick> VectorArray1_t; > // 2D >@@ -136,17 +136,18 @@ > Mesh1_t mesh(vertDomain); > > // Create the 1D geometry: >- DiscreteGeometry geomc(mesh, GuardLayers<1>(2)); >+ Centering<1> cell = canonicalCentering<1>(CellType, Continuous); >+ DomainLayout<1> layout(vertDomain); > > fout << std::endl << "=========== 1D ============" << std::endl; > > // Make some 1D fields: >- ScalarField1_t s1(geomc); >- VectorField1_t v1(geomc); >+ ScalarField1_t s1(cell, layout, mesh); >+ VectorField1_t v1(cell, layout, mesh); > > // Assign to spatially-varying values: >- s1.all() = s1.xAll().comp(0); >- v1.all() = v1.xAll(); >+ s1.all() = positions(s1).comp(0); >+ v1.all() = positions(v1); > > // Create some 1D Arrays: > ScalarArray1_t sa1(cellDomain); >===== r2/src/Domain/NewDomain.h 1.6 vs edited ===== >--- 
1.6/r2/src/Domain/NewDomain.h 2003-10-27 11:25:05 +01:00 >+++ edited/r2/src/Domain/NewDomain.h 2004-08-23 11:10:02 +02:00 >@@ -225,7 +225,7 @@ > static void combine(RT &rt, const UT &u, const CT& ct) { > CTAssert(DS >= 0 && SliceDS >= 0); > CTAssert(DRT > (DS + DCT - 1)); >- CTAssert(DUT == DRT); >+ CTAssert((int)DUT == DRT); > for (int i=0; i < DCT; ++i) { > DomainTraits::getDomain(rt, DS + i).setWildcardDomain( > DomainTraits::getPointDomain(u, DS + i), >===== r2/src/Engine/Stencil.h 1.13 vs edited ===== >--- 1.13/r2/src/Engine/Stencil.h 2004-08-21 20:44:21 +02:00 >+++ edited/r2/src/Engine/Stencil.h 2004-08-23 16:47:27 +02:00 >@@ -346,6 +346,7 @@ > domain_m[d] = model.domain()[d]; > offset_m[d] = model.offset(d); > } >+ return *this; > } > > //============================================================ >===== r2/src/Layout/LayoutBase.h 1.7 vs edited ===== >--- 1.7/r2/src/Layout/LayoutBase.h 2004-01-17 16:24:21 +01:00 >+++ edited/r2/src/Layout/LayoutBase.h 2004-08-23 11:10:31 +02:00 >@@ -947,12 +947,12 @@ > // Our dimensionality must be the same as the slice's reduced > // dimensionality. > >- CTAssert(DT::sliceDimensions == Dim); >+ CTAssert((int)DT::sliceDimensions == Dim); > > // The slice's dimensionality must match that of the previous > // view. > >- CTAssert(DT::dimensions == LV::dimensions); >+ CTAssert((int)DT::dimensions == LV::dimensions); > > // The layout passed in must be initialized. 
> >===== r2/src/Particles/tests/interpolate.cpp 1.3 vs edited ===== >--- 1.3/r2/src/Particles/tests/interpolate.cpp 2004-07-15 11:25:53 +02:00 >+++ edited/r2/src/Particles/tests/interpolate.cpp 2004-08-23 16:55:45 +02:00 >@@ -93,10 +93,10 @@ > MyParticles(const ParticleLayout_t& pl) > : Particles(pl) > { >- addAttribute(pos); >- addAttribute(efield); >- addAttribute(charge); >- addAttribute(cache); >+ this->addAttribute(pos); >+ this->addAttribute(efield); >+ this->addAttribute(charge); >+ this->addAttribute(cache); > } > > // List of attributes; we'll just make them public data members here, >===== r2/src/Particles/tests/particle_tests.h 1.1 vs edited ===== >--- 1.1/r2/src/Particles/tests/particle_tests.h 2002-05-13 17:47:41 +02:00 >+++ edited/r2/src/Particles/tests/particle_tests.h 2004-08-23 16:30:54 +02:00 >@@ -153,10 +153,10 @@ > > void addAllAttributes() > { >- addAttribute(pos); >- addAttribute(mom); >- addAttribute(ad); >- addAttribute(ai); >+ this->addAttribute(pos); >+ this->addAttribute(mom); >+ this->addAttribute(ad); >+ this->addAttribute(ai); > } > }; > > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Aug 23 16:10:43 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 23 Aug 2004 09:10:43 -0700 Subject: [PATCH] Convert ParticlesDoc.txt to html In-Reply-To: References: Message-ID: <412A1703.3030508@codesourcery.com> Richard Guenther wrote: >As subject says. Also adds common header to Layout.html. > >Ok? > >Richard. > > >2004Aug23 Richard Guenther > > * docs/Layout.html: adjust background color, add head image. > docs/index.html: refer to ParticlesDoc.html. > docs/ParticlesDoc.html: new. > docs/ParticlesDoc.txt: remove. 
> > >------------------------------------------------------------------------ > > >--- /dev/null Tue May 18 17:20:27 2004 >+++ ParticlesDoc.html Mon Aug 23 13:13:27 2004 >@@ -0,0 +1,1520 @@ >+ >+ >+ >+ >+ >+ Layout and related classes > > This title should probably be "POOMA Particles Documentation". Other than that, everything looks great. It is nice to have documentation of the particles. >+ >+ >+ >+
>+
>+POOMA Particles Documentation
>+
>+Introduction
>+
>+Particles are primarily used in one of two ways in large scientific
>+applications. The first is to track sample particles using Monte
>+Carlo techniques, for example, to gather statistics that describe the
>+conditions of a complex physical system. Particles of this kind are
>+often referred to as "tracers". The second is to perform direct
>+numerical simulation of systems that contain discrete point-like
>+entities such as ions or molecules.
>+
>+In both scenarios, the application contains one or more sets of
>+particles. Each set has some data associated with it that describes
>+its members' characteristics, such as mass or momentum. Particles
>+typically exist in a spatial domain, and they may interact directly
>+with one another or with field quantities defined on that domain.
>+
>+This document gives an overview of POOMA's support for particles,
>+then discusses some implementation details. The classes introduced in
>+this tutorial are illustrated by two short programs: one that tracks
>+particles under the influence of a simple one-dimensional harmonic
>+oscillator potential, and another that models particles bouncing off
>+the walls of a closed three-dimensional box. Later on, we will show
>+how particles and fields can interact in a simulation code.
>+

>+Overview
>+
>+POOMA's Particles class is a container for a heterogeneous collection
>+of particle attributes. The class uses dynamic storage for particle
>+data (in the form of a set of POOMA DynamicArray objects), so that
>+particles can be added or deleted as necessary. It contains a layout
>+object that manages the distribution of particle data across multiple
>+patches, and it applies boundary conditions to particles when attribute
>+data values exceed a prescribed range. In addition, global functions
>+are provided for interpolating data between particle and field element
>+positions.
>+
>+Each Particles object keeps a list of pointers to its elements'
>+attributes. When an application wants to add or delete particles, it
>+invokes a method on the Particles object, which delegates the call to
>+the layout object for the contained attributes. Particles also
>+provides a member function called sync(), which the application
>+invokes in order to update the global particle count and numbering,
>+update the data distribution across patches, and apply the particle
>+boundary conditions.
>+
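[Editor's sketch] The overview quoted above describes a container that keeps pointers to its attributes, resizes them together when particles are created or destroyed, and applies boundary conditions in sync(). A rough single-process illustration of that shape follows. It assumes hypothetical names (`ToyParticles`, `Attribute`) and a periodic boundary; it is not the POOMA Particles API, and it does not model layouts, patches or global renumbering.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-particle attribute (not POOMA's DynamicArray).
typedef std::vector<double> Attribute;

// Toy container: holds pointers to its attributes and resizes them in step.
class ToyParticles {
public:
  // Register an attribute so that create/destroy keep it consistent.
  void addAttribute(Attribute &a) { attrs_m.push_back(&a); }

  // Append n zero-initialised particles to every registered attribute.
  void create(std::size_t n) {
    for (std::size_t k = 0; k < attrs_m.size(); ++k)
      attrs_m[k]->resize(attrs_m[k]->size() + n, 0.0);
  }

  // Delete particle i from every attribute by moving the last one into its slot.
  void destroy(std::size_t i) {
    for (std::size_t k = 0; k < attrs_m.size(); ++k) {
      Attribute &a = *attrs_m[k];
      a[i] = a.back();
      a.pop_back();
    }
  }

  // Toy "sync": wrap positions periodically into [lo, hi).
  void sync(Attribute &pos, double lo, double hi) const {
    const double len = hi - lo;
    for (std::size_t i = 0; i < pos.size(); ++i) {
      while (pos[i] >= hi) pos[i] -= len;
      while (pos[i] < lo) pos[i] += len;
    }
  }

private:
  std::vector<Attribute *> attrs_m;  // non-owning pointers to the attributes
};
```

The point of the pointer list is that the application never resizes an individual attribute: every create or destroy goes through the container, so all attributes stay the same length.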

> > -- Jeffrey D. Oldham oldham at codesourcery.com From cummings at linkline.com Mon Aug 23 18:13:03 2004 From: cummings at linkline.com (Julian C. Cummings) Date: Mon, 23 Aug 2004 11:13:03 -0700 Subject: [pooma-dev] Status of Particles and Parallelism In-Reply-To: <411E1CDF.5000405@tat.physik.uni-tuebingen.de> Message-ID: <007f01c4893c$d53722e0$6401a8c0@JULES> Hello, I've been away on vacation for a while, just now getting back to my e-mail. I'm a bit puzzled by this. The whole point of Cheetah was to provide a uniform parallel interface, so that one could use MPI or a SHMEM-like system interchangeably without worrying about the exact implementation details of the parallelism. So I don't understand how the particles code could work with Cheetah but not with MPI. In any case, my recollection is that the PatchSwapLayout code was used to exchange info between processors sharing a virtual node boundary about how many particles would be coming or going. Once this info is exchanged, the actual particle data can be efficiently exchanged using the Cheetah messaging protocols. That's the idea, anyway. I don't know that I would be able to help you diagnose specific bugs/problems in the code at this point, having been away from it for so very long, but feel free to fire questions at me and I will do what I can to help. Regards, Julian C. Dr. Julian C. Cummings Staff Scientist, CACR/Caltech (626) 395-2543 cummings at cacr.caltech.edu > -----Original Message----- > From: Richard Guenther [mailto:rguenth at tat.physik.uni-tuebingen.de] > Sent: Saturday, August 14, 2004 7:09 AM > To: Jeffrey D. Oldham > Cc: pooma-dev at pooma.codesourcery.com; drnuke at lanl.gov > Subject: Re: [pooma-dev] Status of Particles and Parallelism > > > Jeffrey D. Oldham wrote: > > Richard, > > > > Last month you made progress on particles in the Pooma CVS > > repository. Steve Nolen wishes to use particles in the > Pooma repository > > code and MPI for his work. 
What is the current state of > the particles > > codes with MPI. As an alternate, using Cheetah will be > acceptable. > > Thanks for the information. > > > > Sorry for the late reply, I was on vacation. Parallel > Particles are not > supported with MPI as I was not able to understand what > PatchParticleSwapLayout (or whatever it is called). If > anyone provides > me with some explanation, I'll happily look at what is missing. > > Btw. I have some Cheetah fixes myself, I can collect these > together and > maybe we could provide at least a patch for download along > the cheetah > tarball. > > Richard. > From rguenth at tat.physik.uni-tuebingen.de Mon Aug 23 19:19:16 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 23 Aug 2004 21:19:16 +0200 Subject: Status Message-ID: <412A4334.2010905@tat.physik.uni-tuebingen.de> Apart from the "Fix reductions for MPI operation" patch everything necessary seems to be committed. We're all seeing Field/ExpressionTest failing since a while - I'm currently investigating the reason and my brain is hurting trying to second-guess what is taking what view in which case :) The good news is, I have a workaround for applying to the testcase (apart from disabling the failing part again). It's appended below. The bad news is, I don't know yet what is going wrong. Richard. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: p URL: From rguenth at tat.physik.uni-tuebingen.de Mon Aug 23 21:15:06 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 23 Aug 2004 23:15:06 +0200 Subject: [RFH] ExpressionEngine _not_ zero-based? Message-ID: <412A5E5A.1050705@tat.physik.uni-tuebingen.de> Despite it says so template class Engine > { public: ... /// Expression-engines are zero-based. enum { zeroBased = true }; it is _not_ zero-based - at least not in all cases. 
Example:

  Array<1, int> a(8, GuardLayers<1>(1)), b(8, GuardLayers<1>(1));
  std::cout << (a+b).domain() << std::endl;

prints [-1:8:1] while it should have printed [0:9:1]! or not? The same
wrapped into a dummy stencil with zero extent

  std::cout << Stencil()(a+b).domain() << std::endl;

yields the expected result. StencilEngines seem to be really zero-based
(well, yes - they do it the strange way - not taking a view of the
expression, but keeping an offset). While I suspect StencilEngine and
ExpressionEngine need to be very similar in principle I don't know how
to best fix this deficiency.

Any ideas?

Richard.

From oldham at codesourcery.com Mon Aug 23 23:59:39 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Mon, 23 Aug 2004 16:59:39 -0700
Subject: [PATCH] Fix reductions for MPI operation
In-Reply-To: <4127ACEC.2020708@tat.physik.uni-tuebingen.de>
References: <4127ACEC.2020708@tat.physik.uni-tuebingen.de>
Message-ID: <412A84EB.3070902@codesourcery.com>

Richard Guenther wrote:

>
> This patch fixes (works around) a previously discovered problem
> (remember the WaitingIterate). I'm sure there is a real problem
> to fix (at least for MPI - I'm not sure about Cheetah), and this
> is the least intrusive way of fixing it until the right idea for
> a cross-context csem like mechanism pops up.
>
> Without this patch random lockups during reductions may occur.
>
> Ok?
>
> Richard.
>
>
> 2004Aug21 Richard Guenther
>
> * src/Engine/RemoteEngine.h: For MPI avoid doing blocking
> operation during reductions while iterates are still pending.

Yes, this is fine.
>------------------------------------------------------------------------
>
>Index: src/Engine/RemoteEngine.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Engine/RemoteEngine.h,v
>retrieving revision 1.42
>diff -u -u -r1.42 RemoteEngine.h
>--- src/Engine/RemoteEngine.h 19 Jan 2004 22:04:33 -0000 1.42
>+++ src/Engine/RemoteEngine.h 21 Aug 2004 20:10:06 -0000
>@@ -2065,6 +2065,11 @@
> Pooma::scheduler().endGeneration();
>
> csem.wait();
>+#if POOMA_MPI
>+ // The above single thread waiting has the same problem as with
>+ // the MultiPatch variant. So fix it.
>+ Pooma::blockAndEvaluate();
>+#endif
>
> RemoteProxy globalRet(ret, computationContext);
> ret = globalRet;
>@@ -2186,6 +2191,27 @@
>
> Pooma::scheduler().endGeneration();
> csem.wait();
>+#if POOMA_MPI
>+ // We need to wait for Reductions on _all_ contexts to complete
>+ // here, as we may else miss to issue a igc update send iterate that a
>+ // remote context waits for. Consider the 2-patch setup
>+ // a,b | g| | g|
>+ // with the expressions
>+ // a(I) = b(I+1);
>+ // bool res = all(a(I) == 0);
>+ // here we issue the following iterates:
>+ // 0: guard receive from 1 (write request b)
>+ // 1: guard send to 0 (read request b)
>+ // 0/1: expression iterate (read request b, write request a)
>+ // 0/1: reduction (read request a)
>+ // 0/1: blocking MPI_XXX
>+ // here the guard send from 1 to 0 can be skipped starting the
>+ // blocking MPI operation prematurely while context 0 needs to
>+ // wait for this send to complete in order to execute the expression.
>+ //
>+ // The easiest way (and the only available) is to blockAndEvaluate().
>+ Pooma::blockAndEvaluate();
>+#endif
>
> if (n > 0)
> {
>
>

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Tue Aug 24 01:27:29 2004
From: oldham at codesourcery.com (Jeffrey D.
Oldham) Date: Mon, 23 Aug 2004 18:27:29 -0700 Subject: Status In-Reply-To: <412A4334.2010905@tat.physik.uni-tuebingen.de> References: <412A4334.2010905@tat.physik.uni-tuebingen.de> Message-ID: <412A9981.2040402@codesourcery.com> Richard Guenther wrote: > Apart from the "Fix reductions for MPI operation" patch everything > necessary seems to be committed. We're all seeing > Field/ExpressionTest failing since a while - I'm currently > investigating the reason and my brain is hurting trying to > second-guess what is taking what view in which case :) The good news > is, I have a workaround for applying to > the testcase (apart from disabling the failing part again). It's > appended below. The bad news is, I don't know yet what is going wrong. > > Richard. The failure first occurred 20Jul. It occurs on an assertion in src/Field/Field.h, line 443. If I remember correctly, it was because of a sizable patch you committed, but I did not remember more. :( It might be this patch: http://www.codesourcery.com/archives/pooma-dev/msg01707.html . I would prefer to fix the problem, not apply a bandage. If you can argue that the code below is wrong and this patch fixes it, we can use this patch. Otherwise, let's get Pooma right first. 
>------------------------------------------------------------------------ > >Index: ExpressionTest.cpp >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Field/tests/ExpressionTest.cpp,v >retrieving revision 1.3 >diff -u -u -r1.3 ExpressionTest.cpp >--- ExpressionTest.cpp 19 Jul 2004 18:20:41 -0000 1.3 >+++ ExpressionTest.cpp 23 Aug 2004 19:18:50 -0000 >@@ -257,12 +257,12 @@ > Centering inputCentering_m; > }; > >-template >-typename FieldStencilSimple, typename View1, Dom>::Type_t >::Type_t >-twoPt(const Field& expr, const Dom &domain) >+template >+typename FieldStencilSimple, F>::Type_t >+twoPt(const F& expr) > { >- typedef FieldStencilSimple, typename View1, Dom>::Type_t > Ret_t; >- return Ret_t::make(TwoPt(expr), expr(domain)); >+ typedef FieldStencilSimple, F> Ret_t; >+ return Ret_t::make(TwoPt(expr), expr); > } > > template >@@ -290,7 +290,7 @@ > a2(i) = initial(i) + a1(i-1) + a1(i); > } > >- a4(I) = initial(I) + twoPt(a3, I); >+ a4(I) = initial(I) + twoPt(a3)(I); > > Pooma::blockAndEvaluate(); > >@@ -322,7 +322,7 @@ > a2(i) = initial(i) + 1.0 + a1(i-1) + 1.0 + a1(i); > } > >- a4(I) = initial(I) + twoPt(1.0 + a3, I); >+ a4(I) = initial(I) + twoPt(1.0 + a3)(I); > > Pooma::blockAndEvaluate(); > >@@ -464,7 +464,7 @@ > test1(tester, 1, ca1, ca2, ca3, ca4, cinit, cellInterior); > // test2(tester, 2, ca1, ca2, ca3, ca4, cinit, cellInterior); > test3(tester, 3, ca1, ca2, ca3, ca4, cinit, cellInterior); >- //test4(tester, 4, ca1, ca2, ca3, ca4, cinit, cellInterior); >+ // test4(tester, 4, ca1, ca2, ca3, ca4, cinit, cellInterior); > > > int ret = tester.results("ExpressionTest"); > > -- Jeffrey D. 
Oldham oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Tue Aug 24 09:59:17 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 24 Aug 2004 11:59:17 +0200 (CEST)
Subject: [pooma-dev] Re: Status
In-Reply-To: <412A9981.2040402@codesourcery.com>
Message-ID:

On Mon, 23 Aug 2004, Jeffrey D. Oldham wrote:

> Richard Guenther wrote:
>
> > Apart from the "Fix reductions for MPI operation" patch everything
> > necessary seems to be committed. We're all seeing
> > Field/ExpressionTest failing since a while - I'm currently
> > investigating the reason and my brain is hurting trying to
> > second-guess what is taking what view in which case :) The good news
> > is, I have a workaround for applying to
> > the testcase (apart from disabling the failing part again). It's
> > appended below. The bad news is, I don't know yet what is going wrong.
> >
> > Richard.
>
> The failure first occurred 20Jul. It occurs on an assertion in
> src/Field/Field.h, line 443. If I remember correctly, it was because of
> a sizable patch you committed, but I did not remember more. :( It might
> be this patch:
> http://www.codesourcery.com/archives/pooma-dev/msg01707.html .

I thought so, too, but reverting that patch doesn't fix the problem. The
question really is what is the difference between using the stencils
(expr, domain) vs. (expr) constructor and taking the view afterwards.
There seems to be some inconsistency (which may be caused by the
ExpressionEngine not really being zeroBased in all cases - but I'm not
sure yet) wrt. Stencils/Expressions and domains.

I did invent some artificial testcase to test assumptions we (may) have
about how Stencils and Expressions work wrt domains and views and it
fails with the same error as we see with the ExpressionTest. What I do
not know at the time is whether my assertions on how things work are
correct or not.
For reference, I attached the testcase that is in parts failing, in other parts triggering bounds checking error. The bounds checking error is always within the stencil functor in checking the expressions component, i.e. for (a+b).read(i) we check one time for the expression domain of (a+b) which succeeds and then proceed to check for the domain of a for which the assertion triggers. In ExpressionTest we fail at 293: a4(I) = initial(I) + twoPt(a3, I); where I is [2:7:1] and the (total) domain of a3 is [-1:9:1]. The question now is, what is the difference of using FieldStencilSimple::make(stencil, expr(domain)) and FieldStencilSimple::make(stencil, expr)(domain) (apart from making it work). The difference may be that we use physicalDomain() on the expr in make() and that doesn't work (or doesn't make a difference) for Field Views. If checking both versions like tester.out() << twoPt(a3, I).domain() << twoPt(a3, I).engine().field().domain() << std::endl; tester.out() << twoPt(a3)(I).domain() << twoPt(a3)(I).engine().field().domain() << std::endl; we end up with Pooma> [0:5:1][0:5:1] Pooma> [0:5:1][0:8:1] which may explain why we do not fail for the second case. But of course we cannot "ban" views from make() as that would ban expressions... All the analysis above is true with and without the patch we thought may be causing the failure. Oh well, Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From rguenth at tat.physik.uni-tuebingen.de Tue Aug 24 12:50:55 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 24 Aug 2004 14:50:55 +0200 (CEST) Subject: [PATCH] Fix ExpressionTest Message-ID: This fixes ExpressionTest by deleting all the strange stuff from FieldStencilSimple and replacing it with something that resembles Engine/Stencil.h View1/View2. In turn we use the FieldStencilSimple make() that takes a domain and was modeled after View2 in twoPt() in the test. 
Tested with the two FieldStencil tests that are available (though that is not much...), ExpressionTest (the twoPt stuff) and StencilTests (tests divVertToCell). More testcases which show what is expected to work appreciated, because I don't really know the desired semantics of FieldStencilSimple (and Stencil) wrt views and domains. Obviously I don't use Stencils myself. Ok? Richard. 2004Aug24 Richard Guenther * src/Engine/Stencil.h: do bounds check only with POOMA_BOUNDS_CHECK. src/Field/DiffOps/FieldStencil.h: rewrite make(stencil, expr), add make(stencil, expr, domain), kill similar broken Accumulate stuff, update documentation. src/Field/tests/ExpressionTest.cpp: use FieldStencilSimple::make with domain argument, don't take view ourselves. -------------- next part -------------- Index: Engine/Stencil.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Engine/Stencil.h,v retrieving revision 1.53 diff -u -c -r1.53 Stencil.h *** Engine/Stencil.h 23 Aug 2004 18:44:17 -0000 1.53 --- Engine/Stencil.h 24 Aug 2004 12:42:03 -0000 *************** *** 432,438 **** --- 432,440 ---- inline int first(int i) const { + #if POOMA_BOUNDS_CHECK PAssert(i >= 0 && i < D); + #endif return 0; } Index: Field/DiffOps/FieldStencil.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/DiffOps/FieldStencil.h,v retrieving revision 1.6 diff -u -c -r1.6 FieldStencil.h *** Field/DiffOps/FieldStencil.h 22 Jul 2004 17:29:58 -0000 1.6 --- Field/DiffOps/FieldStencil.h 24 Aug 2004 12:42:03 -0000 *************** *** 60,126 **** * and stick ONE stencil engine into it. Maybe this class can be generalized * for fields that contain multiple stencil engines. * - * From the old r1 documentation: - * * FieldStencil is used to wrap a user-defined field-based stencil class. * The idea is to encapsulate the majority of the crazy type manipulations ! 
* required to generate the output ConstField and the calculation of the ! * new number of guard layers. * * To create a stencil, users must create a class similar to the one below, * which computes a central difference divergence of a vertex-centered Field * and maps it to a cell-centered Field: * *

!  * template
!  * class Div { };
!  *  
!  * template
!  * class DivVertToCell, UniformRectilinear >
   * {
   * public:
!  * 
!  *   typedef T2 OutputElement_t;
!  * 
!  *   Centering outputCentering() const
   *   {
!  *     return canonicalCentering(CellType, Continuous);
   *   }
   *
!  *   int lowerExtent(int) const
!  *     {
!  *       return 1;
!  *     }
!  * 
!  *   int upperExtent(int) const
!  *     {
!  *       return 1;
!  *     }
!  *         
!  *   template
!  *   inline OutputElement_t
!  *   operator()(const F &f, int i1) const
!  *     {
!  *       return (f(i1 + 1)(0) - f(i1 - 1)(0)) / 
!  *         f.geometry().mesh().meshSpacing(0);
!  *     }
   * 
!  *   template
!  *   inline OutputElement_t
!  *   operator()(const F &f, int i1, int i2) const
!  *     {
!  *       return (f(i1 + 1, i2)(0) - f(i1 - 1, i2)(0)) / 
!  *         f.geometry().mesh().meshSpacing()(0) +
!  *         (f(i1, i2 + 1)(1) - f(i1, i2 - 1)(1)) / 
!  *         f.geometry().mesh().meshSpacing()(1);
!  *     }
   * };
   * 
* ! * There are 2 required typedefs: OutputCentering_t and OutputElement_t. ! * These export the type of the output centering and the type resulting * from applying the stencil at a point. * * Then, there are two accessors: lowerExtent(int dir) and * upperExtent(int dir). These return the extent of the stencil as a function * of direction. As another example, a forward difference would have a lower --- 60,139 ---- * and stick ONE stencil engine into it. Maybe this class can be generalized * for fields that contain multiple stencil engines. * * FieldStencil is used to wrap a user-defined field-based stencil class. * The idea is to encapsulate the majority of the crazy type manipulations ! * required to generate the output Field. * * To create a stencil, users must create a class similar to the one below, * which computes a central difference divergence of a vertex-centered Field * and maps it to a cell-centered Field: * *
!  * template
!  * class DivVertToCell, UniformRectilinearMesh >
   * {
   * public:
!  *  
!  * typedef T2   OutputElement_t;
!  *     
!  * Centering outputCentering() const 
!  * {
!  *   return canonicalCentering(CellType, Continuous, AllDim);
!  * }
!  *
!  * Centering inputCentering() const 
!  * {
!  *   return canonicalCentering(VertexType, Continuous, AllDim);
!  * }
!  *                           
!  * // Constructors.
!  *
!  * // default version is required by default stencil engine constructor.
!  *
!  * DivVertToCell()
!  * {
!  *   for (int d = 0; d < Dim; ++d)
   *   {
!  *      fact_m(d) = 1.0;
   *   }
+  * }
   *
!  * template
!  * DivVertToCell(const FE &fieldEngine)
!  * {
!  *   for (int d = 0; d < Dim; ++d)
!  *   {
!  *      fact_m(d) = 1 / fieldEngine.mesh().spacings()(d);
!  *   }
!  * }
!  *
!  * // Methods.
!  *
!  * int lowerExtent(int d) const { return 0; }
!  * int upperExtent(int d) const { return 1; }
!  *
!  * template
!  * inline OutputElement_t
!  * operator()(const F &f, int i1) const
!  * {
!  *   return OutputElement_t
!  *     (fact_m(0)*(f.read(i1+1)(0) - f.read(i1)(0)));
!  * }
!  *
!  * // and versions for 2d and 3d
   * 
!  * private:
!  * Vector fact_m;
   * };
   * 
* ! * There is one required typedefs: OutputElement_t. ! * These export the type of the type resulting * from applying the stencil at a point. * + * There are two required methods returning the input and + * output centering. + * * Then, there are two accessors: lowerExtent(int dir) and * upperExtent(int dir). These return the extent of the stencil as a function * of direction. As another example, a forward difference would have a lower *************** *** 128,139 **** * functions, which take a Field of some sort and a set indices, must be * supplied. This is what actually computes the stencil. * ! * A ConstField that contains an ApplyFieldStencil-engine that operates on ! * a Field f, is constructed by using operator()() for FieldStencil: * ! * View1 >, ! * ConstField >::make( ! * Div(), f); */ template --- 141,151 ---- * functions, which take a Field of some sort and a set indices, must be * supplied. This is what actually computes the stencil. * ! * A Field that contains a StencilEngine that operates on ! * a Field f, is constructed by using make() from FieldStencilSimple: * ! * FieldStencilSimple, Field > ! * ::make(DivVertToCell(f.fieldEngine()), f); */ template *************** *** 152,226 **** static inline Type_t make(const Functor &stencil, const Expression &f) { ! Type_t h(stencil.outputCentering(), f.layout(), f.mesh()); ! h.fieldEngine().physicalCellDomain() = f.fieldEngine().physicalCellDomain(); ! // FIXME: need to add comparison for centerings. ! // PAssert(f.centering() == stencil.inputCentering()); ! GuardLayers og(f.fieldEngine().guardLayers()); ! for (int d = 0; d < outputDim; d++) ! { ! og.lower(d) -= stencil.lowerExtent(d); ! og.upper(d) -= stencil.upperExtent(d); ! ! // FIXME: Need to think about adjusting the guards. I don't ! // believe the old version: ! // if (inputCentering[d].first() == 0 && ! // outputCentering[d].first() == 1) ! // og.upper(d)++; ! // if (inputCentering[d].first() == 1 && ! // outputCentering[d].first() == 0) ! 
// og.upper(d)--; ! } ! h.fieldEngine().guardLayers() = og; ! h.fieldEngine().engine() = SEngine_t(stencil, f, h.physicalDomain()); ! return h; } - template static inline ! Type_t make(const Expression &f, ! const std::vector > &nn, ! const Centering &outputCentering, ! Accumulate accumulate = Accumulate()) { ! PAssert(nn.size() == outputCentering.size()); ! Type_t h(outputCentering, f.layout(), f.mesh()); ! h.fieldEngine().physicalCellDomain() = f.fieldEngine().physicalCellDomain(); ! // FIXME: The guard layers are wrong; we need to find the maximum ! // offsets from all the functors below. (Should the individual ! // sub-fields have their own guard layers???) ! ! h.fieldEngine().guardLayers() = f.fieldEngine().guardLayers(); ! ! if (outputCentering.size() == 1) ! { ! h.fieldEngine().engine() ! = SEngine_t(Functor(nn[0], outputCentering, f.centering(), ! accumulate), ! f, h.physicalDomain()); ! } ! else ! { ! int oc; ! ! for (oc = 0; oc < nn.size(); ++oc) ! { ! h[oc].fieldEngine().guardLayers() = f.fieldEngine().guardLayers(); ! h[oc].fieldEngine().engine() ! = SEngine_t(Functor(nn[oc], outputCentering[oc], f.centering(), ! accumulate), ! f, h[oc].physicalDomain()); ! } ! } ! return h; } }; --- 164,203 ---- static inline Type_t make(const Functor &stencil, const Expression &f) { ! // FIXME: need to add comparison for centerings. ! // PAssert(f.centering() == stencil.inputCentering()); ! // We need to use the centering, layout, mesh constructor. ! // The FieldEngine part initializes physicalCellDomain ! // and guards from the layout. ! Type_t h(stencil.outputCentering(), f.layout(), f.mesh()); + // Initialize engine with appropriate StencilEngine ! Interval domain = insetDomain(stencil, f.physicalDomain()); ! h.fieldEngine().engine() = SEngine_t(stencil, f, domain); ! return h; } static inline ! Type_t make(const Functor &stencil, const Expression &f, const Interval &domain) { ! // FIXME: need to add comparison for centerings. ! 
// PAssert(f.centering() == stencil.inputCentering()); ! ! // We need to use the centering, layout, mesh constructor. ! // The FieldEngine part initializes physicalCellDomain ! // and guards from the layout. ! ! Type_t h(stencil.outputCentering(), f.layout(), f.mesh()); ! // Initialize engine with appropriate StencilEngine ! h.fieldEngine().engine() = SEngine_t(stencil, f, domain); ! return h; } }; Index: Field/tests/ExpressionTest.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/ExpressionTest.cpp,v retrieving revision 1.3 diff -u -c -r1.3 ExpressionTest.cpp *** Field/tests/ExpressionTest.cpp 19 Jul 2004 18:20:41 -0000 1.3 --- Field/tests/ExpressionTest.cpp 24 Aug 2004 12:42:04 -0000 *************** *** 257,268 **** Centering inputCentering_m; }; ! template ! typename FieldStencilSimple, typename View1, Dom>::Type_t >::Type_t ! twoPt(const Field& expr, const Dom &domain) { ! typedef FieldStencilSimple, typename View1, Dom>::Type_t > Ret_t; ! return Ret_t::make(TwoPt(expr), expr(domain)); } template --- 257,268 ---- Centering inputCentering_m; }; ! template ! typename FieldStencilSimple, F>::Type_t ! twoPt(const F& expr, const Dom &domain) { ! typedef FieldStencilSimple, F> Ret_t; ! return Ret_t::make(TwoPt(expr), expr, domain); } template From rguenth at tat.physik.uni-tuebingen.de Tue Aug 24 14:37:26 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 24 Aug 2004 16:37:26 +0200 (CEST) Subject: [PATCH] Fix PackUnpack bug Message-ID: PackUnpack.h pack/unpack use a functor with LoopApplyEvaluator that violate the assumption of independent iterations. Thus the Field/LocalPatch testcase fails for OpenMP. We have the required infrastructure for ordered LoopApply in RemoteEngine.h. This patch uses that, but before needs to fix it as it works with zeroBased domain only due to a bug. Tested partly with OpenMP, MPI and serial. Ok? Richard. 
2004Aug24 Richard Guenther * src/Functions/PackUnpack.h: use EngineBlockSerialize from RemoteEngine.h. src/Engine/RemoteEngine.h: fix EngineBlockSerialize for non-zerobased domains. -------------- next part -------------- Index: Engine/RemoteEngine.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Engine/RemoteEngine.h,v retrieving revision 1.42 diff -u -u -r1.42 RemoteEngine.h --- Engine/RemoteEngine.h 19 Jan 2004 22:04:33 -0000 1.42 +++ Engine/RemoteEngine.h 24 Aug 2004 14:23:53 -0000 @@ -1055,8 +1055,8 @@ { CTAssert(Domain::unitStride == 1); int f0 = domain[0].first(); - int e0 = domain[0].length(); - for (int i0 = f0; i0 block) - : field_m(field), block_m(block) + PackLocalPatches(RefCountedBlockPtr block) + : block_m(block) { } - void operator()(int i0) const + inline void operator()(const Element_t &t) { - *block_m = field_m.read(i0); + *block_m = t; ++block_m; } - void operator()(int i0, int i1) const - { - *block_m = field_m.read(i0, i1); - ++block_m; - } - - void operator()(int i0, int i1, int i2) const - { - *block_m = field_m.read(i0, i1, i2); - ++block_m; - } - - void operator()(int i0, int i1, int i2, int i3) const - { - *block_m = field_m.read(i0, i1, i2, i3); - ++block_m; - } - - InputField field_m; - mutable RefCountedBlockPtr block_m; + RefCountedBlockPtr block_m; + int total_m; }; template @@ -149,8 +131,8 @@ { typedef typename Patch::Type_t PatchField_t; PatchField_t patch = field.patchLocal(i); - PackLocalPatches packFunctor(patch, current); - LoopApplyEvaluator::evaluate(packFunctor, patch.domain()); + PackLocalPatches packFunctor(current); + EngineBlockSerialize::apply(packFunctor, patch, patch.domain()); current += patch.domain().size(); } @@ -168,44 +150,27 @@ { typedef typename InputField::Element_t Element_t; - UnPackLocalPatches(const InputField &field, - RefCountedBlockPtr block) - : field_m(field), block_m(block) - { - } - - void operator()(int i0) const + 
UnPackLocalPatches(RefCountedBlockPtr block) + : block_m(block) { - field_m(i0) = *block_m; - ++block_m; - } - - void operator()(int i0, int i1) const - { - field_m(i0, i1) = *block_m; - ++block_m; } - void operator()(int i0, int i1, int i2) const + inline void operator()(Element_t &t) { - field_m(i0, i1, i2) = *block_m; + t = *block_m; ++block_m; } - void operator()(int i0, int i1, int i2, int i3) const - { - field_m(i0, i1, i2, i3) = *block_m; - ++block_m; - } - - InputField field_m; - mutable RefCountedBlockPtr block_m; + RefCountedBlockPtr block_m; + int total_m; }; template void unpack(const InputField &field, RefCountedBlockPtr block) { + Pooma::blockAndEvaluate(); + int i; RefCountedBlockPtr current = block; @@ -214,8 +179,8 @@ { typedef typename Patch::Type_t PatchField_t; PatchField_t patch = field.patchLocal(i); - UnPackLocalPatches unpackFunctor(patch, current); - LoopApplyEvaluator::evaluate(unpackFunctor, patch.physicalDomain()); + UnPackLocalPatches unpackFunctor(current); + EngineBlockSerialize::apply(unpackFunctor, patch, patch.physicalDomain()); current += patch.physicalDomain().size(); } } From rguenth at tat.physik.uni-tuebingen.de Tue Aug 24 14:46:01 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 24 Aug 2004 16:46:01 +0200 (CEST) Subject: OpenMP status Message-ID: Together with the last fixes OpenMP with the Intel Compiler 8.0 on a 4-processor Itanium passes all regression tests in optimized mode apart from: - array_test5: compiler problem, if compiling with -mp it's fine - ScalarCode: compiler problem, sometimes works, sometimes generates unaligned access and abort()s (look for kernel messages) Richard. 
-- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From rguenth at tat.physik.uni-tuebingen.de Tue Aug 24 14:48:10 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 24 Aug 2004 16:48:10 +0200 (CEST) Subject: Serial status Message-ID: Serial is fine on ia32 with gcc 3.4 and Intel Compiler 7.1 and on amd64 with gcc 3.4. No regressions apart from gcc failing on MP DynamicArray w/ shared layouts (dynamic_array_test5.cpp) due to the known STL iterator problem. Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From rguenth at tat.physik.uni-tuebingen.de Tue Aug 24 22:12:31 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 25 Aug 2004 00:12:31 +0200 Subject: [PATCH] Reorder #includes for CollectFromContextTest Message-ID: <412BBD4F.70207@tat.physik.uni-tuebingen.de> As the subject says. Makes it compile for MPI. Ok? Richard. 2004Aug25 Richard Guenther * src/Tulip/tests/CollectFromContextsTest.cpp: reorder #includes. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: p URL: From rguenth at tat.physik.uni-tuebingen.de Tue Aug 24 22:23:09 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 25 Aug 2004 00:23:09 +0200 Subject: [pooma-dev] [PATCH] Fix ExpressionTest In-Reply-To: References: Message-ID: <412BBFCD.9080700@tat.physik.uni-tuebingen.de> Richard Guenther wrote: > 2004Aug24 Richard Guenther > > * src/Engine/Stencil.h: do bounds check only with > POOMA_BOUNDS_CHECK. > src/Field/DiffOps/FieldStencil.h: rewrite make(stencil, > expr), add make(stencil, expr, domain), kill similar > broken Accumulate stuff, update documentation. ^^^^^^^^^^^^^^^^^^^^^^ actually that part is broken. Leaving it in avoids breaking OffsetReduction test. Ok with that part retained? With this and the last patch MPI is also regression-free (without looking at particle tests). Richard. 
> src/Field/tests/ExpressionTest.cpp: use FieldStencilSimple::make > with domain argument, don't take view ourselves. From oldham at codesourcery.com Wed Aug 25 00:47:48 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 24 Aug 2004 17:47:48 -0700 Subject: [PATCH] Reorder #includes for CollectFromContextTest In-Reply-To: <412BBD4F.70207@tat.physik.uni-tuebingen.de> References: <412BBD4F.70207@tat.physik.uni-tuebingen.de> Message-ID: <412BE1B4.1050908@codesourcery.com> Richard Guenther wrote: > As the subject says. Makes it compile for MPI. > > Ok? > > Richard. > > > 2004Aug25 Richard Guenther > > * src/Tulip/tests/CollectFromContextsTest.cpp: reorder > #includes. > > > Yes. >------------------------------------------------------------------------ > >Index: CollectFromContextsTest.cpp >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Tulip/tests/CollectFromContextsTest.cpp,v >retrieving revision 1.2 >diff -u -u -r1.2 CollectFromContextsTest.cpp >--- CollectFromContextsTest.cpp 25 Dec 2003 11:26:35 -0000 1.2 >+++ CollectFromContextsTest.cpp 24 Aug 2004 22:10:36 -0000 >@@ -32,9 +32,9 @@ > > // Include files > >+#include "Pooma/Pooma.h" > #include "Tulip/Messaging.h" > #include "Tulip/CollectFromContexts.h" >-#include "Pooma/Pooma.h" > #include "Utilities/Tester.h" > > > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Wed Aug 25 01:05:54 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 24 Aug 2004 18:05:54 -0700 Subject: [pooma-dev] [PATCH] Fix PackUnpack bug In-Reply-To: References: Message-ID: <412BE5F2.5070802@codesourcery.com> Richard Guenther wrote: >PackUnpack.h pack/unpack use a functor with LoopApplyEvaluator >that violate the assumption of independent iterations. Thus >the Field/LocalPatch testcase fails for OpenMP. > >We have the required infrastructure for ordered LoopApply in >RemoteEngine.h. 
This patch uses that, but before needs to fix >it as it works with zeroBased domain only due to a bug. > > I do not understand this last sentence. I also do not understand the patch, mainly because of my ignorance. Each portion of the patch is internally consistent, but why should a local patch ignore its field? >Tested partly with OpenMP, MPI and serial. > >Ok? > >Richard. > > >2004Aug24 Richard Guenther > > * src/Functions/PackUnpack.h: use EngineBlockSerialize from > RemoteEngine.h. > src/Engine/RemoteEngine.h: fix EngineBlockSerialize for > non-zerobased domains. > > >------------------------------------------------------------------------ > >Index: Engine/RemoteEngine.h >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Engine/RemoteEngine.h,v >retrieving revision 1.42 >diff -u -u -r1.42 RemoteEngine.h >--- Engine/RemoteEngine.h 19 Jan 2004 22:04:33 -0000 1.42 >+++ Engine/RemoteEngine.h 24 Aug 2004 14:23:53 -0000 >@@ -1055,8 +1055,8 @@ > { > CTAssert(Domain::unitStride == 1); > int f0 = domain[0].first(); >- int e0 = domain[0].length(); >- for (int i0 = f0; i0+ int e0 = domain[0].last(); >+ for (int i0 = f0; i0<=e0; ++i0) > op(engine(i0)); > return op.total_m; > } >@@ -1068,10 +1068,10 @@ > CTAssert(Domain::unitStride == 1); > int f0 = domain[0].first(); > int f1 = domain[1].first(); >- int e0 = domain[0].length(); >- int e1 = domain[1].length(); >- for (int i1 = f1; i1- for (int i0 = f0; i0+ int e0 = domain[0].last(); >+ int e1 = domain[1].last(); >+ for (int i1 = f1; i1<=e1; ++i1) >+ for (int i0 = f0; i0<=e0; ++i0) > op(engine(i0,i1)); > return op.total_m; > } >@@ -1084,12 +1084,12 @@ > int f0 = domain[0].first(); > int f1 = domain[1].first(); > int f2 = domain[2].first(); >- int e0 = domain[0].length(); >- int e1 = domain[1].length(); >- int e2 = domain[2].length(); >- for (int i2 = f2; i2- for (int i1 = f1; i1- for (int i0 = f0; i0+ int e0 = domain[0].last(); >+ int e1 = domain[1].last(); >+ int e2 
= domain[2].last(); >+ for (int i2 = f2; i2<=e2; ++i2) >+ for (int i1 = f1; i1<=e1; ++i1) >+ for (int i0 = f0; i0<=e0; ++i0) > op(engine(i0,i1,i2)); > return op.total_m; > } >@@ -1103,14 +1103,14 @@ > int f1 = domain[1].first(); > int f2 = domain[2].first(); > int f3 = domain[3].first(); >- int e0 = domain[0].length(); >- int e1 = domain[1].length(); >- int e2 = domain[2].length(); >- int e3 = domain[3].length(); >- for (int i3 = f3; i3- for (int i2 = f2; i2- for (int i1 = f1; i1- for (int i0 = f0; i0+ int e0 = domain[0].last(); >+ int e1 = domain[1].last(); >+ int e2 = domain[2].last(); >+ int e3 = domain[3].last(); >+ for (int i3 = f3; i3<=e3; ++i3) >+ for (int i2 = f2; i2<=e2; ++i2) >+ for (int i1 = f1; i1<=e1; ++i1) >+ for (int i0 = f0; i0<=e0; ++i0) > op(engine(i0,i1,i2,i3)); > return op.total_m; > } >@@ -1125,16 +1125,16 @@ > int f2 = domain[2].first(); > int f3 = domain[3].first(); > int f4 = domain[4].first(); >- int e0 = domain[0].length(); >- int e1 = domain[1].length(); >- int e2 = domain[2].length(); >- int e3 = domain[3].length(); >- int e4 = domain[4].length(); >- for (int i4 = f4; i4- for (int i3 = f3; i3- for (int i2 = f2; i2- for (int i1 = f1; i1- for (int i0 = f0; i0+ int e0 = domain[0].last(); >+ int e1 = domain[1].last(); >+ int e2 = domain[2].last(); >+ int e3 = domain[3].last(); >+ int e4 = domain[4].last(); >+ for (int i4 = f4; i4<=e4; ++i4) >+ for (int i3 = f3; i3<=e3; ++i3) >+ for (int i2 = f2; i2<=e2; ++i2) >+ for (int i1 = f1; i1<=e1; ++i1) >+ for (int i0 = f0; i0<=e0; ++i0) > op(engine(i0,i1,i2,i3,i4)); > return op.total_m; > } >@@ -1150,18 +1150,18 @@ > int f3 = domain[3].first(); > int f4 = domain[4].first(); > int f5 = domain[5].first(); >- int e0 = domain[0].length(); >- int e1 = domain[1].length(); >- int e2 = domain[2].length(); >- int e3 = domain[3].length(); >- int e4 = domain[4].length(); >- int e5 = domain[5].length(); >- for (int i5 = f5; i5- for (int i4 = f4; i4- for (int i3 = f3; i3- for (int i2 = f2; i2- for (int i1 = f1; 
i1- for (int i0 = f0; i0+ int e0 = domain[0].last(); >+ int e1 = domain[1].last(); >+ int e2 = domain[2].last(); >+ int e3 = domain[3].last(); >+ int e4 = domain[4].last(); >+ int e5 = domain[5].last(); >+ for (int i5 = f5; i5<=e5; ++i5) >+ for (int i4 = f4; i4<=e4; ++i4) >+ for (int i3 = f3; i3<=e3; ++i3) >+ for (int i2 = f2; i2<=e2; ++i2) >+ for (int i1 = f1; i1<=e1; ++i1) >+ for (int i0 = f0; i0<=e0; ++i0) > op(engine(i0,i1,i2,i3,i4,i5)); > return op.total_m; > } >@@ -1178,20 +1178,20 @@ > int f4 = domain[4].first(); > int f5 = domain[5].first(); > int f6 = domain[6].first(); >- int e0 = domain[0].length(); >- int e1 = domain[1].length(); >- int e2 = domain[2].length(); >- int e3 = domain[3].length(); >- int e4 = domain[4].length(); >- int e5 = domain[5].length(); >- int e6 = domain[6].length(); >- for (int i6 = f6; i6- for (int i5 = f5; i5- for (int i4 = f4; i4- for (int i3 = f3; i3- for (int i2 = f2; i2- for (int i1 = f1; i1- for (int i0 = f0; i0+ int e0 = domain[0].last(); >+ int e1 = domain[1].last(); >+ int e2 = domain[2].last(); >+ int e3 = domain[3].last(); >+ int e4 = domain[4].last(); >+ int e5 = domain[5].last(); >+ int e6 = domain[6].last(); >+ for (int i6 = f6; i6<=e6; ++i6) >+ for (int i5 = f5; i5<=e5; ++i5) >+ for (int i4 = f4; i4<=e4; ++i4) >+ for (int i3 = f3; i3<=e3; ++i3) >+ for (int i2 = f2; i2<=e2; ++i2) >+ for (int i1 = f1; i1<=e1; ++i1) >+ for (int i0 = f0; i0<=e0; ++i0) > op(engine(i0,i1,i2,i3,i4,i5,i6)); > return op.total_m; > } >Index: Functions/PackUnpack.h >=================================================================== >RCS file: /home/pooma/Repository/r2/src/Functions/PackUnpack.h,v >retrieving revision 1.5 >diff -u -u -r1.5 PackUnpack.h >--- Functions/PackUnpack.h 25 Oct 2003 12:06:55 -0000 1.5 >+++ Functions/PackUnpack.h 24 Aug 2004 14:23:53 -0000 >@@ -59,6 +59,7 @@ > //----------------------------------------------------------------------------- > > #include "Utilities/RefCountedBlockPtr.h" >+#include "Engine/RemoteEngine.h" > 
#include "Pooma/Pooma.h" > > //----------------------------------------------------------------------------- >@@ -93,38 +94,19 @@ > { > typedef typename InputField::Element_t Element_t; > >- PackLocalPatches(const InputField &field, >- RefCountedBlockPtr block) >- : field_m(field), block_m(block) >+ PackLocalPatches(RefCountedBlockPtr block) >+ : block_m(block) > { > } > >- void operator()(int i0) const >+ inline void operator()(const Element_t &t) > { >- *block_m = field_m.read(i0); >+ *block_m = t; > ++block_m; > } > >- void operator()(int i0, int i1) const >- { >- *block_m = field_m.read(i0, i1); >- ++block_m; >- } >- >- void operator()(int i0, int i1, int i2) const >- { >- *block_m = field_m.read(i0, i1, i2); >- ++block_m; >- } >- >- void operator()(int i0, int i1, int i2, int i3) const >- { >- *block_m = field_m.read(i0, i1, i2, i3); >- ++block_m; >- } >- >- InputField field_m; >- mutable RefCountedBlockPtr block_m; >+ RefCountedBlockPtr block_m; >+ int total_m; > }; > > template >@@ -149,8 +131,8 @@ > { > typedef typename Patch::Type_t PatchField_t; > PatchField_t patch = field.patchLocal(i); >- PackLocalPatches packFunctor(patch, current); >- LoopApplyEvaluator::evaluate(packFunctor, patch.domain()); >+ PackLocalPatches packFunctor(current); >+ EngineBlockSerialize::apply(packFunctor, patch, patch.domain()); > current += patch.domain().size(); > } > >@@ -168,44 +150,27 @@ > { > typedef typename InputField::Element_t Element_t; > >- UnPackLocalPatches(const InputField &field, >- RefCountedBlockPtr block) >- : field_m(field), block_m(block) >- { >- } >- >- void operator()(int i0) const >+ UnPackLocalPatches(RefCountedBlockPtr block) >+ : block_m(block) > { >- field_m(i0) = *block_m; >- ++block_m; >- } >- >- void operator()(int i0, int i1) const >- { >- field_m(i0, i1) = *block_m; >- ++block_m; > } > >- void operator()(int i0, int i1, int i2) const >+ inline void operator()(Element_t &t) > { >- field_m(i0, i1, i2) = *block_m; >+ t = *block_m; > ++block_m; > } > 
>- void operator()(int i0, int i1, int i2, int i3) const >- { >- field_m(i0, i1, i2, i3) = *block_m; >- ++block_m; >- } >- >- InputField field_m; >- mutable RefCountedBlockPtr block_m; >+ RefCountedBlockPtr block_m; >+ int total_m; > }; > > template > void > unpack(const InputField &field, RefCountedBlockPtr block) > { >+ Pooma::blockAndEvaluate(); >+ > int i; > > RefCountedBlockPtr current = block; >@@ -214,8 +179,8 @@ > { > typedef typename Patch::Type_t PatchField_t; > PatchField_t patch = field.patchLocal(i); >- UnPackLocalPatches unpackFunctor(patch, current); >- LoopApplyEvaluator::evaluate(unpackFunctor, patch.physicalDomain()); >+ UnPackLocalPatches unpackFunctor(current); >+ EngineBlockSerialize::apply(unpackFunctor, patch, patch.physicalDomain()); > current += patch.physicalDomain().size(); > } > } > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Wed Aug 25 01:50:27 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 24 Aug 2004 18:50:27 -0700 Subject: [pooma-dev] [PATCH] Fix ExpressionTest In-Reply-To: <412BBFCD.9080700@tat.physik.uni-tuebingen.de> References: <412BBFCD.9080700@tat.physik.uni-tuebingen.de> Message-ID: <412BF063.5070903@codesourcery.com> Richard Guenther wrote: > Richard Guenther wrote: > >> 2004Aug24 Richard Guenther >> >> * src/Engine/Stencil.h: do bounds check only with >> POOMA_BOUNDS_CHECK. >> src/Field/DiffOps/FieldStencil.h: rewrite make(stencil, >> expr), add make(stencil, expr, domain), kill similar >> broken Accumulate stuff, update documentation. > > ^^^^^^^^^^^^^^^^^^^^^^ > > actually that part is broken. Leaving it in avoids breaking > OffsetReduction test. > > Ok with that part retained? With this and the last patch MPI is also > regression-free (without looking at particle tests). > > Richard. I guess it's OK to commit. I tried to follow the logic, but you understand this much more than I. Tomorrow morning, we'll see if the regression disappears. 
> >> src/Field/tests/ExpressionTest.cpp: use FieldStencilSimple::make >> with domain argument, don't take view ourselves. > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Wed Aug 25 02:09:27 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 24 Aug 2004 19:09:27 -0700 Subject: [pooma-dev] OpenMP status In-Reply-To: References: Message-ID: <412BF4D7.8020404@codesourcery.com> Richard Guenther wrote: >Together with the last fixes OpenMP with the Intel Compiler 8.0 >on a 4-processor Itanium passes all regression tests in optimized >mode apart from: > >- array_test5: compiler problem, if compiling with -mp it's fine >- ScalarCode: compiler problem, sometimes works, sometimes generates > unaligned access and abort()s (look for kernel messages) > > If it sometimes works, are we sure it is a compiler problem? Is it instead a race condition? >Richard. > >-- >Richard Guenther >WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ > > > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Wed Aug 25 02:13:26 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 24 Aug 2004 19:13:26 -0700 Subject: POOMA and Cheetah Release Testing Information Message-ID: <412BF5C6.8090508@codesourcery.com> I collected all the testing information into a table. Let's try to keep it up-to-date as we test. I'll try to do some testing tomorrow. Thanks. -- Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 2.4.1-testing URL: From rguenth at tat.physik.uni-tuebingen.de Wed Aug 25 07:43:22 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 25 Aug 2004 09:43:22 +0200 (CEST) Subject: [pooma-dev] OpenMP status In-Reply-To: <412BF4D7.8020404@codesourcery.com> Message-ID: On Tue, 24 Aug 2004, Jeffrey D. 
Oldham wrote: > Richard Guenther wrote: > > >Together with the last fixes OpenMP with the Intel Compiler 8.0 > >on a 4-processor Itanium passes all regression tests in optimized > >mode apart from: > > > >- array_test5: compiler problem, if compiling with -mp it's fine > >- ScalarCode: compiler problem, sometimes works, sometimes generates > > unaligned access and abort()s (look for kernel messages) > > > > > If it sometimes works, are we sure it is a compiler problem? Is it > instead a race condition? I'm sure it's not a race condition, but a problem in the generated code as that does unaligned memory access which the Itanium seems to do not like: > dmesg ScalarCode(23238): unaligned access to 0x2000000001200ca5, ip=0x20000000003fe670 ScalarCode(23238): unaligned access to 0x2000000001200cad, ip=0x20000000003fe671 ScalarCode(23238): unaligned access to 0x2000000001200c9d, ip=0x20000000003fe690 ScalarCode(23267): unaligned access to 0x2000000000566b06, ip=0x20000000003fe7a1 It's at the lowest optimization level the compiler does any OpenMP stuff, so I can't really check otherwise. >From gdb I see it's (gdb) run Starting program: /net/alwazn/home/rguenth/src/pooma-bk/r2/src/Field/tests/LINUXICC/ScalarCode [Thread debugging using libthread_db enabled] [New Thread 2305843009213887952 (LWP 23722)] [New Thread 2305843009219836112 (LWP 23723)] [New Thread 2305843009224030416 (LWP 23724)] [New Thread 2305843009228224720 (LWP 23725)] [New Thread 2305843009232419024 (LWP 23726)] ScalarCode(23722): unaligned access to 0x2000000000566b07, ip=0x20000000003fe7a1 Program received signal SIGSEGV, Segmentation fault. 
[Switching to Thread 2305843009213887952 (LWP 23722)] 0x20000000003fd630 in _int_malloc () from /lib/tls/libc.so.6.1 (gdb) bt #0 0x20000000003fd630 in _int_malloc () from /lib/tls/libc.so.6.1 #1 0x20000000003fb760 in malloc () from /lib/tls/libc.so.6.1 #2 0x2000000000268d10 in operator new () from /usr/local/lib/libcxa.so.6 #3 0x4000000000119c90 in _ZN22UniformRectilinearMeshILi2EdEC9ERKS0_RK8IntervalILi2EE () #4 0x40000000001a3640 in _ZN11FieldEngineI22UniformRectilinearMeshILi2EdEd9BrickViewEC9Id10MultiPatchI7GridTag5BrickEEERKS_IS1_T_T0_ERK5INodeILi2EE () #5 0x400000000018aab0 in View1Implementation, double, MultiPatch >, INode<2>, false>::make, CombineDomainOpt, INode<2> >, false> > () #6 0x400000000018adb0 in View1, double, MultiPatch >, INode<2> >::make () #7 0x40000000001cd3a0 in MultiArgEvaluator::evaluate, double, MultiPatch >, Field, double, Brick> >, AllFaceToCellAverage<2>, 2, EvaluateLocLoop, 2> > () #8 0x40000000001374e0 in MultiArgEvaluator::evaluate, double, MultiPatch >, Field, double, Brick> >, AllFaceToCellAverage<2>, 2, EvaluateLocLoop, 2> > () #9 0x40000000000d9bd0 in ScalarCode >::operator(), double, MultiPatch >, Field, double, Brick> > () #10 0x40000000000082d0 in main () not inside any OpenMP parallelized region (only maybe directly preceding). And it may be bad interaction between the installed libc and the Intel Compiler. Who knows. Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From rguenth at tat.physik.uni-tuebingen.de Wed Aug 25 07:49:12 2004 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 25 Aug 2004 09:49:12 +0200 (CEST) Subject: [pooma-dev] [PATCH] Fix PackUnpack bug In-Reply-To: <412BE5F2.5070802@codesourcery.com> Message-ID: On Tue, 24 Aug 2004, Jeffrey D. Oldham wrote: > Richard Guenther wrote: > > >PackUnpack.h pack/unpack use a functor with LoopApplyEvaluator > >that violate the assumption of independent iterations. 
Thus > >the Field/LocalPatch testcase fails for OpenMP. > > > >We have the required infrastructure for ordered LoopApply in > >RemoteEngine.h. This patch uses that, but before needs to fix > >it as it works with zeroBased domain only due to a bug. > > > > > I do not understand this last sentence. I also do not understand the > patch, mainly because of my ignorance. > > Each portion of the patch is internally consistent, but why should a > local patch ignore its field? I'll try to go through the patch with some explanations. > >Index: Engine/RemoteEngine.h > >=================================================================== > >RCS file: /home/pooma/Repository/r2/src/Engine/RemoteEngine.h,v > >retrieving revision 1.42 > >diff -u -u -r1.42 RemoteEngine.h > >--- Engine/RemoteEngine.h 19 Jan 2004 22:04:33 -0000 1.42 > >+++ Engine/RemoteEngine.h 24 Aug 2004 14:23:53 -0000 > >@@ -1055,8 +1055,8 @@ > > { > > CTAssert(Domain::unitStride == 1); > > int f0 = domain[0].first(); > >- int e0 = domain[0].length(); > >- for (int i0 = f0; i0 >+ int e0 = domain[0].last(); > >+ for (int i0 = f0; i0<=e0; ++i0) > > op(engine(i0)); > > return op.total_m; > > } This was clearly wrong. The loop should run either from zero to domain[0].length() or from domain[0].first() to domain[0].last(). We had not caught this because every past use had domain[0].first() == 0.
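The bounds mix-up described above can be illustrated with a small standalone sketch. It does not use POOMA: Range, visitMixed and visitFixed are hypothetical names invented for this illustration, and because the loop condition of the removed line is not legible in the archived diff, the form "i < length()" used in visitMixed is an assumption.

```cpp
#include <vector>

// Hypothetical stand-in for one dimension of a domain (not a POOMA type).
struct Range {
  int first_;
  int length_;
  int first() const { return first_; }
  int length() const { return length_; }
  int last() const { return first_ + length_ - 1; }
};

// Mixed conventions (assumed form of the old loop): the index starts at
// first() but is bounded by length(), which is only right when first() == 0.
inline std::vector<int> visitMixed(const Range &r) {
  std::vector<int> seen;
  for (int i = r.first(); i < r.length(); ++i)
    seen.push_back(i);
  return seen;
}

// Consistent convention, as in the patch: first() through last(), inclusive.
inline std::vector<int> visitFixed(const Range &r) {
  std::vector<int> seen;
  const int e = r.last();
  for (int i = r.first(); i <= e; ++i)
    seen.push_back(i);
  return seen;
}
```

For a zero-based range both functions visit the same indices, which is consistent with the bug staying hidden; for a range starting at 3 with length 4, the mixed version visits only index 3 while the fixed version visits 3 through 6.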
> >Index: Functions/PackUnpack.h > >=================================================================== > >RCS file: /home/pooma/Repository/r2/src/Functions/PackUnpack.h,v > >retrieving revision 1.5 > >diff -u -u -r1.5 PackUnpack.h > >--- Functions/PackUnpack.h 25 Oct 2003 12:06:55 -0000 1.5 > >+++ Functions/PackUnpack.h 24 Aug 2004 14:23:53 -0000 > >@@ -59,6 +59,7 @@ > > //----------------------------------------------------------------------------- > > > > #include "Utilities/RefCountedBlockPtr.h" > >+#include "Engine/RemoteEngine.h" > > #include "Pooma/Pooma.h" > > > > //----------------------------------------------------------------------------- > >@@ -93,38 +94,19 @@ > > { > > typedef typename InputField::Element_t Element_t; > > > >- PackLocalPatches(const InputField &field, > >- RefCountedBlockPtr block) > >- : field_m(field), block_m(block) > >+ PackLocalPatches(RefCountedBlockPtr block) > >+ : block_m(block) > > { > > } > > > >- void operator()(int i0) const > >+ inline void operator()(const Element_t &t) > > { > >- *block_m = field_m.read(i0); > >+ *block_m = t; > > ++block_m; > > } > > > >- void operator()(int i0, int i1) const > >- { > >- *block_m = field_m.read(i0, i1); > >- ++block_m; > >- } > >- > >- void operator()(int i0, int i1, int i2) const > >- { > >- *block_m = field_m.read(i0, i1, i2); > >- ++block_m; > >- } > >- > >- void operator()(int i0, int i1, int i2, int i3) const > >- { > >- *block_m = field_m.read(i0, i1, i2, i3); > >- ++block_m; > >- } > >- > >- InputField field_m; > >- mutable RefCountedBlockPtr block_m; > >+ RefCountedBlockPtr block_m; > >+ int total_m; > > }; Rewrite the functor to work with EngineBlockSerialize instead of LoopApplyEvaluator. 
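As a rough illustration of the functor rewrite, here is a minimal sketch that does not depend on POOMA. PackInto, applyInOrder and packRange are hypothetical names; applyInOrder only stands in for the role EngineBlockSerialize::apply plays in the patch and is not its real signature. The point is that a functor carrying a write cursor receives element values rather than indices, and is only correct when the elements are delivered one at a time, in order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical functor in the spirit of the patched PackLocalPatches:
// it is handed element values and advances its own write cursor.
struct PackInto {
  explicit PackInto(double *out) : out_(out) {}
  void operator()(const double &t) {
    *out_ = t;  // store the value at the cursor
    ++out_;     // sequential state: relies on ordered, one-at-a-time calls
  }
  double *out_;
};

// Hypothetical ordered apply: a plain serial loop over [first, last],
// so the cursor inside op advances deterministically.
template <class Op>
void applyInOrder(Op &op, const std::vector<double> &patch,
                  std::size_t first, std::size_t last) {
  for (std::size_t i = first; i <= last; ++i)
    op(patch[i]);
}

// Pack the inclusive index range [first, last] of patch into a new buffer.
inline std::vector<double> packRange(const std::vector<double> &patch,
                                     std::size_t first, std::size_t last) {
  std::vector<double> buf(last - first + 1);
  PackInto op(buf.data());
  applyInOrder(op, patch, first, last);
  return buf;
}
```

Because the functor never looks up elements itself, it no longer needs a reference to the patch; only the apply routine does.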
> > template > >@@ -149,8 +131,8 @@ > > { > > typedef typename Patch::Type_t PatchField_t; > > PatchField_t patch = field.patchLocal(i); > >- PackLocalPatches packFunctor(patch, current); > >- LoopApplyEvaluator::evaluate(packFunctor, patch.domain()); > >+ PackLocalPatches packFunctor(current); > >+ EngineBlockSerialize::apply(packFunctor, patch, patch.domain()); > > current += patch.domain().size(); > > } Use EngineBlockSerialize with the PackLocalPatches functor. EngineBlockSerialize passes the read value to the functor, not the current index as LoopApplyEvaluator does. So we don't need the local patch in the functor, but only in EngineBlockSerialize. And likewise for Pack. The problem with OpenMP and LoopApplyEvaluator was that the loop for that evaluator is OpenMP parallelized - and guess what happens to the block_m member of the functor if it's incremented and written to by multiple threads in parallel... Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From oldham at codesourcery.com Wed Aug 25 13:39:44 2004 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Wed, 25 Aug 2004 06:39:44 -0700 Subject: [pooma-dev] OpenMP status In-Reply-To: References: Message-ID: <412C96A0.1000103@codesourcery.com> Richard Guenther wrote: >On Tue, 24 Aug 2004, Jeffrey D. Oldham wrote: > > > >>Richard Guenther wrote: >> >> >> >>>Together with the last fixes OpenMP with the Intel Compiler 8.0 >>>on a 4-processor Itanium passes all regression tests in optimized >>>mode apart from: >>> >>>- array_test5: compiler problem, if compiling with -mp it's fine >>>- ScalarCode: compiler problem, sometimes works, sometimes generates >>> unaligned access and abort()s (look for kernel messages) >>> >>> >>> >>> >>If it sometimes works, are we sure it is a compiler problem? Is it >>instead a race condition? 
>> >> > >I'm sure it's not a race condition, but a problem in the generated >code as that does unaligned memory access which the Itanium seems to >do not like: > > > >>dmesg >> >> >ScalarCode(23238): unaligned access to 0x2000000001200ca5, >ip=0x20000000003fe670 >ScalarCode(23238): unaligned access to 0x2000000001200cad, >ip=0x20000000003fe671 >ScalarCode(23238): unaligned access to 0x2000000001200c9d, >ip=0x20000000003fe690 >ScalarCode(23267): unaligned access to 0x2000000000566b06, >ip=0x20000000003fe7a1 > >It's at the lowest optimization level the compiler does any >OpenMP stuff, so I can't really check otherwise. > >From gdb I see it's > >(gdb) run >Starting program: >/net/alwazn/home/rguenth/src/pooma-bk/r2/src/Field/tests/LINUXICC/ScalarCode >[Thread debugging using libthread_db enabled] >[New Thread 2305843009213887952 (LWP 23722)] >[New Thread 2305843009219836112 (LWP 23723)] >[New Thread 2305843009224030416 (LWP 23724)] >[New Thread 2305843009228224720 (LWP 23725)] >[New Thread 2305843009232419024 (LWP 23726)] >ScalarCode(23722): unaligned access to 0x2000000000566b07, >ip=0x20000000003fe7a1 > >Program received signal SIGSEGV, Segmentation fault. 
>[Switching to Thread 2305843009213887952 (LWP 23722)]
>0x20000000003fd630 in _int_malloc () from /lib/tls/libc.so.6.1
>(gdb) bt
>#0 0x20000000003fd630 in _int_malloc () from /lib/tls/libc.so.6.1
>#1 0x20000000003fb760 in malloc () from /lib/tls/libc.so.6.1
>#2 0x2000000000268d10 in operator new () from /usr/local/lib/libcxa.so.6
>#3 0x4000000000119c90 in
>_ZN22UniformRectilinearMeshILi2EdEC9ERKS0_RK8IntervalILi2EE ()
>#4 0x40000000001a3640 in
>_ZN11FieldEngineI22UniformRectilinearMeshILi2EdEd9BrickViewEC9Id10MultiPatchI7GridTag5BrickEEERKS_IS1_T_T0_ERK5INodeILi2EE
>()
>#5 0x400000000018aab0 in
>View1Implementation, double,
>MultiPatch >, INode<2>, false>::make,
>CombineDomainOpt, INode<2> >, false> > ()
>#6 0x400000000018adb0 in View1,
>double, MultiPatch >, INode<2> >::make ()
>#7 0x40000000001cd3a0 in
>MultiArgEvaluator::evaluatedouble>, double, MultiPatch >,
>Field, double, Brick> >,
>AllFaceToCellAverage<2>, 2, EvaluateLocLoop, 2> >
>()
>#8 0x40000000001374e0 in
>MultiArgEvaluator::evaluatedouble>, double, MultiPatch >,
>Field, double, Brick> >,
>AllFaceToCellAverage<2>, 2, EvaluateLocLoop, 2> >
>()
>#9 0x40000000000d9bd0 in ScalarCode > >
>>::operator(), double,
>>
>MultiPatch >, Field,
>double, Brick> > ()
>#10 0x40000000000082d0 in main ()
>
>not inside any OpenMP parallelized region (only maybe directly preceding).
>And it may be bad interaction between the installed libc and the Intel
>Compiler. Who knows.
>

OK. Thanks for the analysis.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Wed Aug 25 20:34:37 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Wed, 25 Aug 2004 22:34:37 +0200
Subject: CVS Status
Message-ID: <412CF7DD.90305@tat.physik.uni-tuebingen.de>

Everything is in CVS now that I can think of. Only maybe adjusting of
the compiler flags in the LINUXgcc.conf file could be done.
-fno-default-inline hurts performance and -fstrict-aliasing is default
with the gcc we're recommending. I'd also change -O3 to -O2
everywhere as -O3 is usually worse.

Or not?

Richard.

From oldham at codesourcery.com Wed Aug 25 20:57:24 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Wed, 25 Aug 2004 13:57:24 -0700
Subject: CVS Status
In-Reply-To: <412CF7DD.90305@tat.physik.uni-tuebingen.de>
References: <412CF7DD.90305@tat.physik.uni-tuebingen.de>
Message-ID: <412CFD34.2050507@codesourcery.com>

Richard Guenther wrote:

> Everything is in CVS now that I can think of. Only maybe adjusting of
> the compiler flags in the LINUXgcc.conf file could be done.
> -fno-default-inline hurts performance and -fstrict-aliasing is default
> with the gcc we're recommending. I'd also change -O3 to -O2
> everywhere as -O3 is usually worse.
>
> Or not?
>
> Richard.

Yes, that sounds good.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Thu Aug 26 16:47:52 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 26 Aug 2004 09:47:52 -0700
Subject: Cheetah Documentation: HTML Conformance
In-Reply-To:
References:
Message-ID: <412E1438.7060105@codesourcery.com>

The attached patch moves the Cheetah documentation up to HTML 4.01
standards. It also updated one dead link www.acl.lanl.gov/cheetah to
www.pooma.com and removed one dead link.

Is this patch acceptable to commit to the Cheetah CVS repository?

--
Jeffrey D. Oldham
oldham at codesourcery.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: docs.26Aug.09.7.ChangeLog
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: docs.26Aug.09.7.patch
Type: text/x-patch
Size: 47162 bytes
Desc: not available
URL:

From oldham at codesourcery.com Thu Aug 26 16:50:36 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 26 Aug 2004 09:50:36 -0700
Subject: Cheetah New Release Number?
In-Reply-To: <412E1438.7060105@codesourcery.com>
References: <412E1438.7060105@codesourcery.com>
Message-ID: <412E14DC.4040706@codesourcery.com>

What number do you desire for the next Cheetah release? The two
previous release numbers are 1.0 and 1.1.4. Should we use 1.1.5? This
release has ?two? error changes and HTML-conformant documentation.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Thu Aug 26 17:18:34 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 26 Aug 2004 10:18:34 -0700
Subject: Pooma and Cheetah Testing Status
Message-ID: <412E1B6A.6070608@codesourcery.com>

The testing status document has been changed into HTML and updated to
reflect Richard Guenther's testing.

--
Jeffrey D. Oldham
oldham at codesourcery.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rguenth at tat.physik.uni-tuebingen.de Thu Aug 26 20:48:53 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 26 Aug 2004 22:48:53 +0200
Subject: Cheetah Documentation: HTML Conformance
In-Reply-To: <412E1438.7060105@codesourcery.com>
References: <412E1438.7060105@codesourcery.com>
Message-ID: <412E4CB5.5090206@tat.physik.uni-tuebingen.de>

Jeffrey D. Oldham wrote:

> The attached patch moves the Cheetah documentation up to HTML 4.01
> standards. It also updated one dead link www.acl.lanl.gov/cheetah to
> www.pooma.com and removed one dead link.
>
> Is this patch acceptable to commit to the Cheetah CVS repository?

Yes, thanks.

Richard.

>
> ------------------------------------------------------------------------
>
> 2004-Aug-26 Jeffrey D. Oldham
>
> * ControllerGuide.html: Move the HTML up to the 4.01 standards.
> * ControllerImplReference.html: Likewise.
> * ControllerReference.html: Likewise.
> * MatchingHandlerGuide.html: Likewise.
> * MatchingHandlerReference.html: Likewise.
> * Overview.html: Likewise.
> Correct links.
> * SerializationGuide.html: Move the HTML up to the 4.01 standards.
> * SerializationReference.html: Likewise.
> * index.html: Likewise.
> Correct links.

From rguenth at tat.physik.uni-tuebingen.de Thu Aug 26 20:50:42 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 26 Aug 2004 22:50:42 +0200
Subject: [pooma-dev] Cheetah New Release Number?
In-Reply-To: <412E14DC.4040706@codesourcery.com>
References: <412E1438.7060105@codesourcery.com> <412E14DC.4040706@codesourcery.com>
Message-ID: <412E4D22.1020706@tat.physik.uni-tuebingen.de>

Jeffrey D. Oldham wrote:

> What number do you desire for the next Cheetah release? The two
> previous release numbers are 1.0 and 1.1.4. Should we use 1.1.5? This
> release has ?two? error changes and HTML-conformant documentation.

I guess 1.1.5 would be ok. Even 1.1.4pl1 or the like would be fine to
denote active development has ended (has it?) and only bugfixes are
applied. Either of the two.

Richard.

From rguenth at tat.physik.uni-tuebingen.de Fri Aug 27 15:26:21 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 27 Aug 2004 17:26:21 +0200 (CEST)
Subject: [PATCH] fix indirectionlist_test1.cpp for serialAsync scheduler
Message-ID:

This patch fixes the indirectionlist_test1 Domain test for
non-blocking schedulers. Very obvious.

Ok?

Richard.

2004Aug27 Richard Guenther

 * src/Domain/tests/indirectionlist_test1.cpp: add
 Pooma::blockAndEvaluate() where necessary.
-------------- next part --------------
Index: indirectionlist_test1.cpp
===================================================================
RCS file: /home/pooma/Repository/r2/src/Domain/tests/indirectionlist_test1.cpp,v
retrieving revision 1.7
diff -u -u -r1.7 indirectionlist_test1.cpp
--- indirectionlist_test1.cpp 21 Dec 2003 12:59:57 -0000 1.7
+++ indirectionlist_test1.cpp 27 Aug 2004 15:24:22 -0000
@@ -57,8 +57,9 @@
   Array<1,int,Brick> klist(foo);
-
   klist = 1;
+  Pooma::blockAndEvaluate();
+
   for(int i=1;i<7;i++)
     klist(i) = klist(i-1)+i;
   klist(2)=3;

From rguenth at tat.physik.uni-tuebingen.de Fri Aug 27 15:42:45 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 27 Aug 2004 17:42:45 +0200 (CEST)
Subject: [PATCH] Fix some of the MPI particle failures
Message-ID:

Doh! I missed one #if POOMA_CHEETAH at the POOMA_CHEETAH ->
POOMA_MESSAGING conversion (I think even on purpose, but the particle
stuff wasn't compiling at that time anyways).

Ok?

Richard.

2004Aug27 Richard Guenther

 * src/Engine/DynamicEngine.h: include pack/unpack methods for
 POOMA_MESSAGING, not just POOMA_CHEETAH.
-------------- next part --------------
Index: Engine/DynamicEngine.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Engine/DynamicEngine.h,v
retrieving revision 1.19
diff -u -u -r1.19 DynamicEngine.h
--- Engine/DynamicEngine.h 22 Oct 2003 19:38:07 -0000 1.19
+++ Engine/DynamicEngine.h 27 Aug 2004 15:40:51 -0000
@@ -309,7 +309,7 @@
   void sync(const Domain_t & d);

-#if POOMA_CHEETAH
+#if POOMA_MESSAGING

   template
   inline int packSize(const Dom &) const

From rguenth at tat.physik.uni-tuebingen.de Fri Aug 27 15:47:24 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 27 Aug 2004 17:47:24 +0200 (CEST)
Subject: [PATCH] don't bench too much for boundschecking
Message-ID:

This reduces particle benchmarking if POOMA_BOUNDS_CHECK is on to one
time with 100 particles (it takes an awful lot of time).

Ok?

Richard.

2004Aug27 Richard Guenther

 * src/Particles/tests/particle_tests.h: for POOMA_BOUNDS_CHECK
 reduce default problem size(s).
-------------- next part --------------
Index: Particles/tests/particle_tests.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_tests.h,v
retrieving revision 1.22
diff -u -u -r1.22 particle_tests.h
--- Particles/tests/particle_tests.h 23 Aug 2004 18:44:17 -0000 1.22
+++ Particles/tests/particle_tests.h 27 Aug 2004 15:45:38 -0000
@@ -400,8 +400,13 @@
   // Default parameters for the benchmark.
   int iters = 1000;
+#if POOMA_BOUNDS_CHECK
+  int startnumparticles = 100;
+  int endnumparticles = 100;
+#else
   int startnumparticles = 100;
   int endnumparticles = 10000;
+#endif
   int multnumparticles = 10;
   double movefrac = 0.1;
   bool usesync = false;

From rguenth at tat.physik.uni-tuebingen.de Fri Aug 27 16:09:38 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 27 Aug 2004 18:09:38 +0200 (CEST)
Subject: [PATCH] fix domain error in particle test
Message-ID:

This patch fixes the last bug in particle tests to let MPI
parallelized versions pass on _one_ processor. As update for the
truly parallel testruns, the only ones failing are now bctest3,
spatial, uniform, destroy, particle_test1-4, particle_bench1-4 and
interpolate - all due to the pAbort because "Cross-context particles
not supported for MPI".

Ok?

2004Aug27 Richard Guenther

 * src/Particles/tests/interpolate.cpp: initialize physical cell
 domain, not vertex domain.
-------------- next part --------------
Index: interpolate.cpp
===================================================================
RCS file: /home/pooma/Repository/r2/src/Particles/tests/interpolate.cpp,v
retrieving revision 1.23
diff -u -u -r1.23 interpolate.cpp
--- interpolate.cpp 23 Aug 2004 18:44:17 -0000 1.23
+++ interpolate.cpp 27 Aug 2004 15:59:17 -0000
@@ -282,7 +282,7 @@
   // Initialize the field values
   tester.out() << "Initializing Field values ..." << std::endl;
-  Interval dom = flayout.domain();
+  Interval dom = electric.physicalDomain();
   for (int i = dom[0].first(); i <= dom[0].last(); ++i)
     for (int j = dom[1].first(); j <= dom[1].last(); ++j)
       electric(i,j) = Particles_t::PointType_t(i+j,i-j);
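
The vertex/cell distinction behind the interpolate.cpp change can be
shown with a small standalone sketch. It does not use POOMA: Range1D,
cellDomain and fillCells are hypothetical names made up for this
illustration, and it assumes the usual rectilinear-mesh convention that
N vertices along an axis bound N-1 cells. Under that assumption, a loop
that walks a cell-centered array over the vertex range runs one element
past the end, which is presumably the kind of domain error a
bounds-checked build reports.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical inclusive index range, standing in for one axis of a domain.
struct Range1D {
  int first;
  int last;
  int size() const { return last - first + 1; }
};

// Assumption: N vertices along an axis bound N-1 cells, so the cell
// range is the vertex range with its last index dropped.
inline Range1D cellDomain(const Range1D& vertexDomain) {
  return Range1D{vertexDomain.first, vertexDomain.last - 1};
}

// Fill a cell-centered array by looping over the given range.
// Returns false (and writes nothing) if the range does not fit the
// array, instead of writing out of bounds.
inline bool fillCells(std::vector<double>& cells, const Range1D& dom) {
  if (dom.first < 0 || dom.last >= static_cast<int>(cells.size()))
    return false;
  for (int i = dom.first; i <= dom.last; ++i)
    cells[static_cast<std::size_t>(i)] = 2.0 * i;
  return true;
}
```

With vertices 0..4, cellDomain yields 0..3. For an array sized to those
four cells, fillCells rejects the vertex range and accepts the cell
range.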