From gtkacik at Princeton.EDU Mon Feb 3 04:43:18 2003
From: gtkacik at Princeton.EDU (Gasper Tkacik)
Date: Sun, 02 Feb 2003 23:43:18 -0500
Subject: [Q]: How to fill an array with random values?
Message-ID: <3E3DF366.9080504@princeton.edu>

Hello!

I am a first-year graduate student at Princeton and am considering using
Pooma for cosmological simulations. I have just downloaded and compiled it
successfully and run some very simple examples.

I have the following question: how do I efficiently (without explicit
looping) fill an array with random values?

I browsed through the manuals, and the best I could come up with was to
define a function Random, overload its () operator, create a UserFunction
rf and apply it to the array A like this:

A = rf(A);

However, I do not know if this is a good solution. Random does not need any
parameter, but I was forced to pass something to conform to the () syntax,
so I just passed the same array A. Is there a more elegant solution? I hope
the compiler does not generate a temporary copy in the above example?

BTW: Is this list appropriate for such questions, or are only development
issues discussed here? If the latter, I apologize.

Best regards,
Gasper.

From k-egg at gmx.de Mon Feb 3 09:08:47 2003
From: k-egg at gmx.de (Andreas Vitz)
Date: Mon, 3 Feb 2003 10:08:47 +0100
Subject: Compiling problems
Message-ID:

Dear List,

I'm new to Pooma, and I ran into some problems during compilation.

I use gcc version 2.96.

During compilation I got the following error:

#>make CXXToSuite...
See src/Connect/Lux/LINUXgcc-opt/LuxAppPointer.cmpl.o_1.info
make: *** [/scratch/pooma/pooma-2.4.0/src/Connect/Lux/LINUXgcc-opt/LuxAppPointer.cmpl.o] Error 1

The contents of that file:

Mon Feb 3 09:50:34 CET 2003
Compiler location: /usr/bin/g++
cd /scratch/pooma/pooma-2.4.0; \
TMPDIR=/tmp/LINUXgcc-opt; \
/usr/bin/time g++ -c /scratch/pooma/pooma-2.4.0/src/Connect/Lux/LuxAppPointer.cmpl.cpp \
-o /scratch/pooma/pooma-2.4.0/src/Connect/Lux/LINUXgcc-opt/LuxAppPointer.cmpl.o \
-DNOCheetahCTAssert -DNOCheetahRTAssert -ftemplate-depth-60 -Drestrict=__restrict__ -DNOPAssert -DNOCTAssert -O2 -fno-default-inline -funroll-loops -fstrict-aliasing \
-I/scratch/pooma/pooma-2.4.0/src \
-I/scratch/pooma/pooma-2.4.0/lib/LINUXgcc-opt \
-I/opt/score/mpi/mpich-1.2.0/i386-redhat7-linux2_4/include/ \
-I/scratch/pooma/cheetah-1.0/build/cheetah-1.0.devel/linux/src \
-I/scratch/pooma/cheetah-1.0/build/cheetah-1.0.devel/linux/lib/g++-ex

In file included from /scratch/pooma/pooma-2.4.0/src/Connect/Lux/LuxAppPointer.cmpl.cpp:34:
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: type specifier omitted for parameter
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: parse error before `&'
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: `::ios_base' undeclared (first use here)
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: parse error before `)'
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: `operator<<' declared as function returning a function
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: `operator<< (...)' must have an argument of class or enumerated type
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: `operator<< (...)' must take exactly two arguments
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h: In function `Inform &operator<< (Inform &, const T &)':
/scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:462: confused by earlier errors, bailing out

Do I have to upgrade my compiler, or where is the mistake?

Thank you very much for helping.

Yours,
Andreas Vitz
Germany

From rguenth at tat.physik.uni-tuebingen.de Mon Feb 3 09:16:56 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 3 Feb 2003 10:16:56 +0100 (CET)
Subject: [pooma-dev] [Q]: How to fill an array with random values?
In-Reply-To: <3E3DF366.9080504@princeton.edu>
Message-ID:

On Sun, 2 Feb 2003, Gasper Tkacik wrote:

> I browsed through the manuals and the best I could come up with was to
> define a function Random, overload its () operator, create a
> UserFunction rf and apply it to the array A like this
>
> A = rf(A);
>
> However, I do not know if this is a good solution. Random does not need
> any parameter, but I was forced to put in something to conform to the ()
> syntax, so I just put in the same A array. Is there a more elegant
> solution? I hope the compiler does not generate a temporary copy in the
> above example?

It's OK to do the above; the temporary copy the compiler introduces is only
for metadata. If you want to avoid the syntactically confusing passing of A,
you may want to look at PatchFunction, which can do something like

Array A;
PatchFunction()(A);

> BTW: Is this list appropriate for such questions or are there only
> development issues discussed? If the latter, I apologize.

I think the list is to be shared between developers and users.

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Mon Feb 3 09:20:01 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 3 Feb 2003 10:20:01 +0100 (CET)
Subject: [pooma-dev] Compiling problems
In-Reply-To: <200302030901.h1391qJ27511@neptun.tat.physik.uni-tuebingen.de>
Message-ID:

On Mon, 3 Feb 2003, Andreas Vitz wrote:

> I'm new to Pooma, and I ran into some problems during compilation.
>
> I use gcc version 2.96.
>
> During compilation I got the following error:
>
>
> In file included from
> /scratch/pooma/pooma-2.4.0/src/Connect/Lux/LuxAppPointer.cmpl.cpp:34:
> /scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: type specifier omitted
> for parameter
> /scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: parse error before `&'
> /scratch/pooma/pooma-2.4.0/src/Utilities/Inform.h:425: `::ios_base'
> undeclared (first use here)

You need to fix src/Utilities/Inform.h at the specified location to suit
your compiler version.

> Do I have to upgrade my compiler, or where is the mistake?

I think 2.96 is a bad choice, and you should try to upgrade to a recent
gcc 3.2. You may also want to update your Pooma version from CVS, as later
versions are capable of autodetecting this problem.

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From gtkacik at Princeton.EDU Thu Feb 6 20:34:36 2003
From: gtkacik at Princeton.EDU (Gasper Tkacik)
Date: Thu, 06 Feb 2003 15:34:36 -0500
Subject: FFT
Message-ID: <3E42C6DC.9020803@princeton.edu>

Hello everybody!

Is there any support for FFT operations in Pooma? For example, given an
Array, is there any preferred way of doing an FFT?
If not, is there any way of doing it in parallel with the constructs
that Pooma provides?

Best regards,
Gasper.

From rguenth at tat.physik.uni-tuebingen.de Thu Feb 6 21:27:36 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 6 Feb 2003 22:27:36 +0100 (CET)
Subject: [pooma-dev] FFT
In-Reply-To: <3E42C6DC.9020803@princeton.edu>
Message-ID:

On Thu, 6 Feb 2003, Gasper Tkacik wrote:

> Hello everybody!
>
> Is there any support for FFT operations in Pooma? For example, given an
> Array, is there any preferred way of doing an FFT?
> If not, is there any way of doing it in parallel with the constructs
> that Pooma provides?

Try looking at src/Transform/WrapFFTW.h, which seems to make use of libfftw
to do FFT.

Richard.
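
When experimenting with an FFT wrapper like the one mentioned above, a small
reference transform can help to check results on tiny inputs. The following
is only a minimal sketch in plain C++ and uses neither the POOMA nor the
FFTW API; the names naiveDft and isHermitian are hypothetical helpers
introduced for illustration. It computes the O(n^2) discrete Fourier
transform of real data and checks the Hermitian symmetry X[n-k] = conj(X[k])
that every transform of real input satisfies.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of POOMA or FFTW): O(n^2) reference DFT
// of real input, X[k] = sum_j x[j] * exp(-2*pi*i*j*k/n).
std::vector<std::complex<double> > naiveDft(const std::vector<double>& x)
{
  const std::size_t n = x.size();
  assert(n > 0);
  const double twoPi = 2.0 * std::acos(-1.0);
  std::vector<std::complex<double> > out(n);
  for (std::size_t k = 0; k < n; ++k) {
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
      const double angle =
        -twoPi * static_cast<double>(j * k) / static_cast<double>(n);
      sum += x[j] * std::complex<double>(std::cos(angle), std::sin(angle));
    }
    out[k] = sum;
  }
  return out;
}

// Hypothetical helper: true if X[n-k] equals conj(X[k]) within tol for
// all k >= 1, which holds for the spectrum of any real input.
bool isHermitian(const std::vector<std::complex<double> >& X, double tol)
{
  const std::size_t n = X.size();
  for (std::size_t k = 1; k < n; ++k) {
    if (std::abs(X[n - k] - std::conj(X[k])) > tol) {
      return false;
    }
  }
  return true;
}
```

Comparing the output of a library transform against naiveDft on a handful of
points is slow but makes layout or normalization differences visible
quickly.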

From gtkacik at Princeton.EDU Sat Feb 8 15:33:50 2003
From: gtkacik at Princeton.EDU (Gasper Tkacik)
Date: Sat, 08 Feb 2003 10:33:50 -0500
Subject: [pooma-dev] FFT
In-Reply-To:
References:
Message-ID: <3E45235E.4090707@princeton.edu>

Hi Guenther!

Thank you for the advice. I took a look at the code. However, I somehow
cannot get correct results when I use the real-to-complex transform
(rfftw) on 2D arrays. Is it possible that this is because the algorithm
does an in-place transform and lays out the values in the half-complex
packing? Has this been tested with rfftw, or has it only been tested with
the complex-to-complex transforms? I am afraid to use complex data because
of the overhead.

Best regards,
Gasper.

Richard Guenther wrote:

>On Thu, 6 Feb 2003, Gasper Tkacik wrote:
>
>
>
>>Hello everybody!
>>
>>Is there any support for FFT operations in Pooma? For example, given an
>>Array, is there any preferred way of doing an FFT?
>>If not, is there any way of doing it in parallel with the constructs
>>that Pooma provides?
>>
>>
>
>Try looking at src/Transform/WrapFFTW.h which seems to make use of libfftw
>to do FFT.
>

From rguenth at tat.physik.uni-tuebingen.de Sat Feb 8 23:19:37 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sun, 9 Feb 2003 00:19:37 +0100 (CET)
Subject: [PATCH] Reference documentation structure
Message-ID:

The following patch introduces a structure for grouping source files. The
result will look like the modules page you can find here:
http://www.tat.physik.uni-tuebingen.de/~rguenth/pooma/reference/modules.html

Below I include the summaries of the groups for which I have written one.

2003Feb09  Richard Guenther

	* docs/reference/array.doxygen: new file.
	docs/reference/connect.doxygen: new file.
	docs/reference/databrowser.doxygen: new file.
	docs/reference/domain.doxygen: new file.
	docs/reference/engine.doxygen: new file.
	docs/reference/evaluator.doxygen: new file.
	docs/reference/field.doxygen: new file.
	docs/reference/internal.doxygen: new file.
	docs/reference/io.doxygen: new file.
	docs/reference/layout.doxygen: new file.
	docs/reference/objects.doxygen: new file.
	docs/reference/particles.doxygen: new file.
	docs/reference/partition.doxygen: new file.
	docs/reference/pete.doxygen: new file.
	docs/reference/pooma.doxygen: new file.
	docs/reference/threads.doxygen: new file.
	docs/reference/tiny.doxygen: new file.
	docs/reference/tulip.doxygen: new file.
	docs/reference/unused.doxygen: new file.
	docs/reference/utility.doxygen: new file.

/**
 * @defgroup Connect
 * @ingroup Utilities
 * Classes/Files for managing connections to external agents such as
 * visualization tools.
 *
 * Support for using the PAWS library is available. PAWS provides a
 * mechanism for sharing multidimensional array and simple scalar
 * variables between separate parallel programs.
 *
 * Support for using the Lux library for run-time visualization is
 * available.
 *
 */

/**
 * @defgroup Lux
 * @ingroup Connect
 *
 * Support for using the Lux library for run-time visualization.
 *
 */

/**
 * @defgroup Paws
 * @ingroup Connect
 *
 * Support for using the PAWS library. PAWS provides a
 * mechanism for sharing multidimensional array and simple scalar
 * variables between separate parallel programs.
 *
 */

/**
 * @defgroup Domain Domain Objects and Modifiers
 * @ingroup Structuring
 * These are classes which handle specifying a domain and modifying it.
 *
 * The types of domains can be divided into two classes, namely integer
 * domains and continuous domains.
 *
 * Integer domains include
 * - int
 * - Loc
 * - Interval
 * - Range
 * - Grid
 *
 * The only continuous domain class is Region. There are a few special
 * domain classes, namely NullDomain, AllDomain and ErrorDomain.
 *
 * All domain classes come with their corresponding traits class
 * DomainTraits which is used to manage operations on domains
 * such as
 * - splitting domains with the split function
 * - querying if domains overlap using the touches function
 * - querying the intersection of two domains using the intersect function
 * - growing/shrinking an Interval using the shrinkRight, growRight,
 *   shrinkLeft and growLeft functions
 * - finding the equivalent subset for a domain from a given transformation
 *   using the equivSubset function
 * - removing the overlap between two Intervals and returning a vector of
 *   the resulting domains using the DomainRemoveOverlap function
 * - querying if one domain contains another using the contains function
 *
 * Domains can be modified by arithmetic operations and new domains can
 * be constructed using the helper classes in NewDomain.h and the LeftDomain
 * and RightDomain wildcard classes.
 *
 * You can iterate through domains by FIXME.
 *
 * Some domains can be sliced FIXME.
 *
 */

/**
 * @defgroup Engine Engines
 * @ingroup Objects
 *
 * Engine related classes/files.
 *
 * Engines provide the storage for Arrays and Fields and handle
 * domain decomposition, taking subviews and accessing components
 * transparently by providing a common interface to their users.
 *
 * Engines are usually defined recursively. Engines fall into three
 * categories: engines allocating storage for data, engines that provide
 * access to computed data, and engines providing modified access to other
 * engines.
 *
 * For the first category, the storage engines, the following are
 * available:
 * - Engine < Dim, T, Brick >
 * - Engine < Dim, T, CompressibleBrick >
 * - Engine < Dim, T, Dynamic >
 * - Engine < Dim, T, ConstantFunction >
 *
 * For the second category, the computation engines, the following
 * are available:
 * - Engine < Dim, T, IndexFunction < Functor > >
 * - Engine < Dim, T, StencilEngine < Function, Expression > >
 * - Engine < Dim, T, UserFunctionEngine < UserFunction, Expression > >
 * - Engine < Dim, T, ExpressionTag < Expr > >
 *
 * For the third category, the most important engine types include:
 * - Engine < Dim, T, Remote < Tag > >
 * - Engine < Dim, T, MultiPatch < LayoutTag, PatchTag > >
 * - Engine < Dim, T, CompFwd < Eng, N > >
 * - Engine < Dim, T, IndirectionTag < A1, A2 > >
 * - Engines of the category
 *   Engine < Dim, T, ViewEngineType >, where
 *   ViewEngineType is one of
 *   BrickView,
 *   MultiPatchView,
 *   DynamicView
 *   and ViewEngine
 *
 * Views of Engines can be constructed by using the NewEngine<> traits
 * class, which takes the type of the engine to be viewed and the
 * subsetting domain type as template parameters. NewEngine<> then defines
 * the type of the ViewEngine as its Type_t typedef member.
 *
 * FIXME: Introduce NewEngineEngine<>, NewEngineDomain<> and
 * newEngineEngine() and newEngineDomain() with their concepts.
 */

/**
 * @defgroup Evaluator
 * @ingroup Objects
 *
 * The evaluators present different ways to operate on Arrays and Fields.
 *
 * This includes Fortran-like manual looping over the patches of the
 * data using the PatchFunction<> mechanism, which supports multiple
 * input data but only one output. (PatchFunction.h, PatchKernel.h)
 *
 * Another way to operate is using the ScalarCode<> facility, which
 * presents something like an n-argument stencil operation with some
 * of the arguments being the output. (ScalarCode.h, ScalarCodeInfo.h,
 * MultiArgKernel.h, MultiArgEvaluator.h)
 *
 * Evaluating a functor at a whole domain is done using the LoopApplyEvaluator.
 * (LoopApply.h)
 *
 * The simplest way is to use POOMA expressions, which are evaluated
 * via ExpressionKernel objects. (ExpressionKernel.h)
 *
 * The internal evaluator objects are templated on the patch type, which
 * gets constructed from the expression node types using the
 * EvaluatorCombine<> traits class and produces the tags
 * RemoteMultiPatchEvaluatorTag, MultiPatchEvaluatorTag,
 * RemoteSinglePatchEvaluatorTag and SinglePatchEvaluatorTag. The
 * MainEvaluatorTag specialized class is the root of any evaluation.
 *
 */

/**
 * @defgroup Field Fields
 * @ingroup Objects
 *
 * Field related classes/files. Important classes include Field, FieldEngine,
 * Centering and CanonicalCentering.
 *
 */

/**
 * @defgroup Mesh Field Meshes
 * @ingroup Field
 *
 * Meshes provide a way to attach coordinate information to the
 * vertices of a grid. You can choose between different mesh types
 * which constrain positions in different ways. All available meshes
 * are rectilinear, i.e. they are constructed as the outer product of
 * spacing vectors.
 *
 * To query a mesh for its positions or similar properties use the
 * functions positions(), outwardNormals(), coordinateNormals(), cellVolumes(),
 * faceAreas() and edgeLengths().
 *
 * Predefined mesh types are:
 * - UniformRectilinearMesh which defines a uniformly spaced rectilinear mesh,
 * - RectilinearMesh which defines an arbitrarily spaced rectilinear mesh,
 * - NoMesh which defines a placeholder without any mesh information.
 *
 * Meshes are completed by one of the Cartesian, Cylindrical or Spherical
 * coordinate system classes. Complete mesh types can be constructed
 * using the MeshTraits traits class and the appropriate tag classes for
 * the mesh type and the coordinate system type.
 *
 */

/**
 * @defgroup DiffOps
 * @ingroup Field
 *
 */

/**
 * @defgroup Relations Field Relations
 * @ingroup Field
 *
 * Relations are FIXME.
 *
 * Usable predefined relations include boundary conditions, of which
 * the following are available:
 * - ConstantFaceBC
 * - PeriodicFaceBC
 * - PosReflectFaceBC
 *
 */

/**
 * @defgroup Layout Layouts - Laying out Domains
 * @ingroup Structuring
 *
 * A layout combines domain information and guard cell (both internal
 * and external) information and maps the domain to a specified partition.
 * So related topics are the \ref Domain and \ref Partition groups.
 *
 * All layouts operate with domains based on Interval, i.e. a
 * contiguous integer domain. This domain gets distributed/tiled in a
 * different way for different layout classes, namely
 * - DomainLayout does not tile/distribute the domain, hence it is applicable
 *   for local computation only
 * - GridLayout tiles/distributes the Dim-dimensional domain by constructing
 *   a (possibly) non-uniform rectilinear grid of subdomains (GridTag)
 * - UniformGridLayout tiles/distributes the domain using a uniform rectilinear
 *   grid of subdomains (UniformTag)
 * - SparseTileLayout tiles/distributes the domain using non-overlapping
 *   subdomains that need not cover the whole domain (SparseTileTag)
 * - DynamicLayout tiles/distributes the domain using a grid partition and
 *   handles dynamic domains such as those coming from DynamicArray
 *   (DynamicTag)
 *
 * To actually use any of the above layouts on a MultiPatch engine you need
 * to specify the appropriate layout tag, which is the one given in
 * parentheses alongside each layout class above.
 *
 * Specifying the actual tiling is done by several constructors dealing with
 * the most important tiling types. The generic way to specify tiling is to
 * use a partitioner, see \ref Partition for reference.
 *
 * Mapping the domain to the tiling can be done in two different ways,
 * namely distributed and replicated, which is specified by passing an
 * instance of the DistributedTag or ReplicatedTag class to the layout
 * constructors.
 * Distributed means the domain is distributed over the tiling, replicated
 * means the entire domain is replicated over the tiling. Note that remote
 * engines do not make sense in conjunction with replicated layouts, and this
 * will trigger a runtime error.
 *
 */

/**
 * @defgroup Objects Data object and container classes
 * Objects for storing data and for doing computation with.
 *
 * Note that for complex objects like Fields and Arrays placed inside
 * structures the compiler will generate default copy constructors and
 * assignment operators that usually violate the principle of least
 * surprise in that they will invoke the object's assignment operator,
 * which will cause a PETE expression to be evaluated, and this may
 * lead to cryptic compiler error messages, for example for IndexFunction
 * engine objects. To work around this you need to explicitly provide
 * copy constructors and assignment operators that use obj.initialize()
 * instead of an assignment. This will catch you if you are using
 * IndexFunction-like objects, such as Stencils and FieldStencils. For
 * writable objects, be prepared to get assertion failures because
 * uninitialized objects created by a default constructor are later
 * initialized by default copy/assignment.
 */

/**
 * @defgroup Partition Partitioning Domains
 * @ingroup Structuring
 *
 * These files deal with domain partitioning and mapping the partition
 * to computation nodes. The following partitioners are available:
 * - GridPartition
 * - UniformGridPartition
 * - TilePartition
 *
 * You usually don't need to interact with the mappers, which are
 * LocalMapper, ContextMapper, BisectionMapper, ContiguousMapper,
 * DistributedMapper and UniformMapper. They deal with mapping patches
 * to nodes, where LocalMapper is used for ReplicatedTag tagged layouts
 * and the others, derived from the base ContextMapper, for DistributedTag
 * tagged layouts.
 *
 */

From rguenth at tat.physik.uni-tuebingen.de Sun Feb 9 22:49:24 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sun, 9 Feb 2003 23:49:24 +0100 (CET)
Subject: [pooma-dev] FFT
In-Reply-To: <3E45235E.4090707@princeton.edu>
Message-ID:

On Sat, 8 Feb 2003, Gasper Tkacik wrote:

> Hi Guenther!
>
> Thank you for the advice. I took a look at the code. However, I somehow
> cannot get correct results when I use the real-to-complex transform
> (rfftw) on 2D arrays. Is it possible that this is because the
> algorithm does an in-place transform and lays out the values in the
> half-complex packing? Has this been tested with rfftw, or has it only been
> tested with the complex-to-complex transforms? I am afraid to use complex
> data because of the overhead.

I don't know at all; I have never used the FFTW wrapper myself. But reading
your description, I think your analysis is right that Pooma cannot know
about the n*real to n/2*complex data transform. Maybe it does cope with
that, but it would need to be handled explicitly in the wrapper. I'd suggest
looking at the source and maybe fixing/extending it.

Richard.

> Best regards, Gasper.
>
> Richard Guenther wrote:
>
> >On Thu, 6 Feb 2003, Gasper Tkacik wrote:
> >
> >
> >
> >>Hello everybody!
> >>
> >>Is there any support for FFT operations in Pooma? For example, given an
> >>Array, is there any preferred way of doing an FFT?
> >>If not, is there any way of doing it in parallel with the constructs
> >>that Pooma provides?
> >>
> >>
> >
> >Try looking at src/Transform/WrapFFTW.h which seems to make use of libfftw
> >to do FFT.
> >
>

From jcrotinger at proximation.com Mon Feb 10 05:48:35 2003
From: jcrotinger at proximation.com (James Crotinger)
Date: Sun, 9 Feb 2003 22:48:35 -0700
Subject: [pooma-dev] [PATCH] Reference documentation structure
Message-ID:

Hi Richard,

Is this structure a rearrangement of the source? I do not think that would
be a good idea, since CVS doesn't support moving files, and losing the
history is bad.

	Jim

-----Original Message-----
From: Richard Guenther [mailto:rguenth at tat.physik.uni-tuebingen.de]
Sent: Saturday, February 08, 2003 4:20 PM
To: Jeffrey Oldham
Cc: pooma-dev at pooma.codesourcery.com
Subject: [pooma-dev] [PATCH] Reference documentation structure

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From rguenth at tat.physik.uni-tuebingen.de Mon Feb 10 09:08:02 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 10 Feb 2003 10:08:02 +0100 (CET)
Subject: [pooma-dev] [PATCH] Reference documentation structure
In-Reply-To:
Message-ID:

On Sun, 9 Feb 2003, James Crotinger wrote:

> Hi Richard,
>
> Is this structure a rearrangement of the source? I do not think this would
> be a good idea since CVS doesn't support moving files, and losing the
> history is bad.

No, it's not a rearrangement of the source. It just structures the source
in the doxygen way, by specifying "groups" into which I can later put
files. I originally had these group files in the src/ directory, but as they
contain introductory documentation for a (sometimes) large part of the
code, I thought placing them under the docs/reference directory was a better
idea.

Just to give you an idea, the plan is to add a per-source-file overview,
together with a group assignment, to every source file; e.g. for
FieldCentering.h this would look like

/** @file
 * @ingroup Field
 * @brief
 * specifies value locations within a field's cell
 *
 * Centering
 * - specifies value locations within a field's cell
 * CanonicalCentering
 * - yields some canonical centerings
 * canonicalCentering(type, discontinuous, dimension)
 * - yields the specified canonical centering
 */

I'm open to suggestions regarding the structure itself, as I'm not really
satisfied with it. Better suggestions for placing group-related
documentation are also welcome (though I'm satisfied with placing it under
docs/reference - just maybe not in so many files?).

Richard.

From rguenth at tat.physik.uni-tuebingen.de Thu Feb 13 12:46:06 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 13 Feb 2003 13:46:06 +0100 (CET)
Subject: [BUG] ScalarCode with Fields not honouring relations!
Message-ID:

Hi!

ScalarCode in its current form does not honour Fields relations. I.e. it does not dirty them on write.
Triggering them on reads seems to work. Testcase below.

I suppose we should add something like

  enum { hasRelations = true };

to Field<> and

  enum { hasRelations = false };

to Array<>, so we can handle this in ScalarCode and maybe related places.

Anyway, the Relation machinery seems to be suboptimal for doing boundary
conditions (but hey - those tend to be cheap compared to communicating
the internal guards).

Any ideas?

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

//-----------------------------------------------------------------------------
// evaluatorTest5 - testing ScalarCode and boundary update
//-----------------------------------------------------------------------------

#include "Pooma/Pooma.h"
#include "Pooma/Arrays.h"
#include "Pooma/Fields.h" // for PerformUpdateTag() only!
#include "Evaluator/ScalarCode.h"
#include "Utilities/Tester.h"
#include <iostream>

// dummy operation

struct DirtyRelations
{
  DirtyRelations() {}

  template<class A>
  inline void operator()(const A &a, const Loc<1> &i) const
  {
  }

  void scalarCodeInfo(ScalarCodeInfo& i) const
  {
    i.arguments(1);
    i.dimensions(1);
    i.lowerExtent(0) = 0;
    i.upperExtent(0) = 0;
    i.write(0, true);
    i.useGuards(0, false);
  }
};

struct TriggerRelations
{
  TriggerRelations() {}

  template<class A>
  inline void operator()(const A &a, const Loc<1> &i) const
  {
  }

  void scalarCodeInfo(ScalarCodeInfo& i) const
  {
    i.arguments(1);
    i.dimensions(1);
    i.lowerExtent(0) = 1;
    i.upperExtent(0) = 1;
    i.write(0, false);
    i.useGuards(0, true);
  }
};

struct TriggerAndDirtyRelations
{
  TriggerAndDirtyRelations() {}

  template<class A>
  inline void operator()(const A &a, const Loc<1> &i) const
  {
  }

  void scalarCodeInfo(ScalarCodeInfo& i) const
  {
    i.arguments(1);
    i.dimensions(1);
    i.lowerExtent(0) = 1;
    i.upperExtent(0) = 1;
    i.write(0, true); // umm - _and_ read...
    i.useGuards(0, true);
  }
};

// boundary condition just incrementing a global counter

static int bupd = 0;

class DummyBC
{
public:
  DummyBC() {}
  DummyBC(const DummyBC &) {}
  template<class Target>
  DummyBC(const DummyBC &, const Target &) {}
  DummyBC& operator=(const DummyBC&) { return *this; }
  template<class Target>
  void operator()(const Target&) const
  {
    bupd++;
  }
};

int main(int argc, char *argv[])
{
  // Initialize POOMA and output stream, using Tester class
  Pooma::initialize(argc, argv);
  Pooma::Tester tester(argc, argv);

  Pooma::blockingExpressions(true);

  int size = 120;

  Interval<1> domain(size);
  DomainLayout<1> layout(domain, GuardLayers<1>(1));
  UniformRectilinearMesh<1> mesh(layout);
  Centering<1> cell = canonicalCentering<1>(CellType, Continuous);

  Field<UniformRectilinearMesh<1>, double, Brick>
    a(cell, layout, mesh), b(cell, layout, mesh);

  tester.out() << "Adding relation\n";
  Pooma::newRelation(DummyBC(), a);
  RelationListItem *rel = a.fieldEngine().data(0, 0).relations()(0);

  tester.check("a has dirty relation", rel->dirty());
  tester.check("a did not have relations applied", bupd == 0);

  bupd = 0;
  rel->setDirty();
  tester.out() << "Applying DirtyRelations()\n";
  ScalarCode<DirtyRelations>()(a);
  tester.check("a did not have relations applied", bupd == 0);
  tester.check("a has dirty relation", rel->dirty());

  bupd = 0;
  rel->setDirty();
  tester.out() << "Applying TriggerRelations()\n";
  ScalarCode<TriggerRelations>()(a);
  tester.check("a did have relations applied", bupd == 1);
  tester.check("a has clean relation", !rel->dirty());

  bupd = 0;
  rel->clearDirty();
  tester.out() << "Applying TriggerAndDirtyRelations()\n";
  ScalarCode<TriggerAndDirtyRelations>()(a);
  tester.check("a did not have relations applied", bupd == 0);
  tester.check("a has dirty relation", rel->dirty());

  rel->setDirty();
  tester.out() << "Reading from a.all()\n";
  b.all() = a.all();
  tester.check("a did have relations applied", bupd == 1);
  tester.check("a has clean relation", !rel->dirty());
  bupd = 0;

  int retval = tester.results("evaluatorTest5 (ScalarCode)");
  Pooma::finalize();
  return retval;
}

From rguenth at tat.physik.uni-tuebingen.de Fri Feb
14 15:47:43 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 14 Feb 2003 16:47:43 +0100 (CET)
Subject: [PATCH] Fix ScalarCode wrt Field Relations
Message-ID:

Hi!

The following patch fixes ScalarCode, which is not dirtying Field Relations
on arguments written to. Adds a new testcase.

Tested on x86-serial-linux with no regressions in Field, Evaluator and
Array.

Ok?

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

diff -Nru a/r2/src/Array/Array.h b/r2/src/Array/Array.h
--- a/r2/src/Array/Array.h Fri Feb 14 16:26:13 2003
+++ b/r2/src/Array/Array.h Fri Feb 14 16:26:13 2003
@@ -1433,6 +1433,9 @@
   enum { dimensions = Engine_t::dimensions };
   enum { rank = Engine_t::dimensions };
+  // Arrays don't support relations attached to them.
+
+  enum { hasRelations = false };
   //===========================================================================
   // Constructors
diff -Nru a/r2/src/Evaluator/MultiArgEvaluator.h b/r2/src/Evaluator/MultiArgEvaluator.h
--- a/r2/src/Evaluator/MultiArgEvaluator.h Fri Feb 14 16:26:13 2003
+++ b/r2/src/Evaluator/MultiArgEvaluator.h Fri Feb 14 16:26:13 2003
@@ -106,6 +106,17 @@
   }
   template<class A>
+  inline void dirtyRelations(const A &a, const WrappedInt<true>&) const
+  {
+    a.setDirty();
+  }
+
+  template<class A>
+  inline void dirtyRelations(const A &, const WrappedInt<false>&) const
+  {
+  }
+
+  template<class A>
   void operator()(const A &a, bool f) const
   {
     if (f)
@@ -117,6 +128,7 @@
       // all the engines in a field.
       notifyEngineWrite(a.engine());
+      dirtyRelations(a, WrappedInt<A::hasRelations>());
     }
   }
 };
diff -Nru a/r2/src/Evaluator/tests/evaluatorTest5.cpp b/r2/src/Evaluator/tests/evaluatorTest5.cpp
--- /dev/null Wed Dec 31 16:00:00 1969
+++ b/r2/src/Evaluator/tests/evaluatorTest5.cpp Fri Feb 14 16:26:13 2003
@@ -0,0 +1,185 @@
+// -*- C++ -*-
+// ACL:license
+// ----------------------------------------------------------------------
+// This software and ancillary information (herein called "SOFTWARE")
+// called POOMA (Parallel Object-Oriented Methods and Applications) is
+// made available under the terms described here. The SOFTWARE has been
+// approved for release with associated LA-CC Number LA-CC-98-65.
+//
+// Unless otherwise indicated, this SOFTWARE has been authored by an
+// employee or employees of the University of California, operator of the
+// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with
+// the U.S. Department of Energy. The U.S. Government has rights to use,
+// reproduce, and distribute this SOFTWARE. The public may copy, distribute,
+// prepare derivative works and publicly display this SOFTWARE without
+// charge, provided that this Notice and any statement of authorship are
+// reproduced on all copies. Neither the Government nor the University
+// makes any warranty, express or implied, or assumes any liability or
+// responsibility for the use of this SOFTWARE.
+//
+// If SOFTWARE is modified to produce derivative works, such modified
+// SOFTWARE should be clearly marked, so as not to confuse it with the
+// version available from LANL.
+//
+// For more information about POOMA, send e-mail to pooma at acl.lanl.gov,
+// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/.
+// ----------------------------------------------------------------------
+// ACL:license
+
+//-----------------------------------------------------------------------------
+// evaluatorTest5 - testing ScalarCode and boundary update
+//-----------------------------------------------------------------------------
+
+#include "Pooma/Pooma.h"
+#include "Pooma/Arrays.h"
+#include "Pooma/Fields.h" // for PerformUpdateTag() only!
+#include "Evaluator/ScalarCode.h"
+#include "Utilities/Tester.h"
+#include <iostream>
+
+
+// dummy operation
+
+struct DirtyRelations
+{
+  DirtyRelations() {}
+
+  template<class A>
+  inline void operator()(const A &a, const Loc<1> &i) const
+  {
+  }
+
+  void scalarCodeInfo(ScalarCodeInfo& i) const
+  {
+    i.arguments(1);
+    i.dimensions(1);
+    i.lowerExtent(0) = 0;
+    i.upperExtent(0) = 0;
+    i.write(0, true);
+    i.useGuards(0, false);
+  }
+};
+struct TriggerRelations
+{
+  TriggerRelations() {}
+
+  template<class A>
+  inline void operator()(const A &a, const Loc<1> &i) const
+  {
+  }
+
+  void scalarCodeInfo(ScalarCodeInfo& i) const
+  {
+    i.arguments(1);
+    i.dimensions(1);
+    i.lowerExtent(0) = 1;
+    i.upperExtent(0) = 1;
+    i.write(0, false);
+    i.useGuards(0, true);
+  }
+};
+struct TriggerAndDirtyRelations
+{
+  TriggerAndDirtyRelations() {}
+
+  template<class A>
+  inline void operator()(const A &a, const Loc<1> &i) const
+  {
+  }
+
+  void scalarCodeInfo(ScalarCodeInfo& i) const
+  {
+    i.arguments(1);
+    i.dimensions(1);
+    i.lowerExtent(0) = 1;
+    i.upperExtent(0) = 1;
+    i.write(0, true); // umm - _and_ read...
+    i.useGuards(0, true);
+  }
+};
+
+// boundary condition just incrementing a global counter
+
+static int bupd = 0;
+
+class DummyBC
+{
+public:
+  DummyBC() {}
+  DummyBC(const DummyBC &) {}
+  template<class Target>
+  DummyBC(const DummyBC &, const Target &) {}
+  DummyBC& operator=(const DummyBC&) { return *this; }
+  template<class Target>
+  void operator()(const Target&) const
+  {
+    bupd++;
+  }
+};
+
+
+int main(int argc, char *argv[])
+{
+  // Initialize POOMA and output stream, using Tester class
+  Pooma::initialize(argc, argv);
+  Pooma::Tester tester(argc, argv);
+
+  Pooma::blockingExpressions(true);
+
+  int size = 120;
+
+  Interval<1> domain(size);
+  DomainLayout<1> layout(domain, GuardLayers<1>(1));
+  UniformRectilinearMesh<1> mesh(layout);
+  Centering<1> cell = canonicalCentering<1>(CellType, Continuous);
+
+  Field<UniformRectilinearMesh<1>, double, Brick>
+    a(cell, layout, mesh), b(cell, layout, mesh);
+
+  tester.out() << "Adding relation\n";
+  Pooma::newRelation(DummyBC(), a);
+  RelationListItem *rel = a.fieldEngine().data(0, 0).relations()(0);
+
+  tester.check("a has dirty relation", rel->dirty());
+  tester.check("a did not have relations applied", bupd == 0);
+
+  bupd = 0;
+  rel->setDirty();
+  tester.out() << "Applying DirtyRelations()\n";
+  ScalarCode<DirtyRelations>()(a);
+  // not applying relations here is an optimization we're not able to do right now
+  //tester.check("a did not have relations applied", bupd == 0);
+  tester.check("a has dirty relation", rel->dirty());
+
+  bupd = 0;
+  rel->setDirty();
+  tester.out() << "Applying TriggerRelations()\n";
+  ScalarCode<TriggerRelations>()(a);
+  tester.check("a did have relations applied", bupd == 1);
+  tester.check("a has clean relation", !rel->dirty());
+
+  bupd = 0;
+  rel->clearDirty();
+  tester.out() << "Applying TriggerAndDirtyRelations()\n";
+  ScalarCode<TriggerAndDirtyRelations>()(a);
+  tester.check("a did not have relations applied", bupd == 0);
+  tester.check("a has dirty relation", rel->dirty());
+
+  bupd = 0;
+  rel->setDirty();
+  tester.out() << "Reading from a.all()\n";
+  b.all() = a.all();
+  tester.check("a did have
relations applied", bupd == 1); + tester.check("a has clean relation", !rel->dirty()); + + int retval = tester.results("evaluatorTest5 (ScalarCode)"); + Pooma::finalize(); + return retval; +} + +// ACL:rcsinfo +// ---------------------------------------------------------------------- +// $RCSfile: evaluatorTest2.cpp,v $ $Author: pooma $ +// $Revision: 1.7 $ $Date: 2003/01/29 19:32:07 $ +// ---------------------------------------------------------------------- +// ACL:rcsinfo diff -Nru a/r2/src/Evaluator/tests/makefile b/r2/src/Evaluator/tests/makefile --- a/r2/src/Evaluator/tests/makefile Fri Feb 14 16:26:13 2003 +++ b/r2/src/Evaluator/tests/makefile Fri Feb 14 16:26:13 2003 @@ -36,6 +36,7 @@ TESTS = compressibleTest1 \ evaluatorTest1 evaluatorTest2 evaluatorTest3 evaluatorTest4 \ + evaluatorTest5 \ ReductionTest1 ReductionTest2 ReductionTest3 ReductionTest4 default:: build diff -Nru a/r2/src/Field/Field.h b/r2/src/Field/Field.h --- a/r2/src/Field/Field.h Fri Feb 14 16:26:13 2003 +++ b/r2/src/Field/Field.h Fri Feb 14 16:26:13 2003 @@ -1132,6 +1132,10 @@ typedef Centering Centering_t; + // Fields may have relations attached to them. + + enum { hasRelations = true }; + //--------------------------------------------------------------------------- // User-callable constructors. These ctors are meant to be called by users. From rguenth at tat.physik.uni-tuebingen.de Fri Feb 14 22:07:46 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Fri, 14 Feb 2003 23:07:46 +0100 (CET) Subject: [PATCH] Extend ScalarCode to allow writing to guards Message-ID: Hi! The following patch adds write-to-(external)-guards capability to ScalarCode. This allows one to use the same functor for ScalarCode and the external guards filler. Ideally this would allow for n-stage computation without internal guards update for n*m sized internal guards, too, but this needs prevention of internal guards fill (working on it). 
The problem remains that ScalarCodeInfo::lower_m/upper_m are not per
argument, which prevents storing into a full 1-guard-layer field while
reading from a 2-guard-layer field (one cannot take the necessary view of
the 1-guard-layer field).

The extension comes with a new test.

Ok?

Richard.

===== MultiArgEvaluator.h 1.5 vs edited =====
--- 1.5/r2/src/Evaluator/MultiArgEvaluator.h Fri Feb 14 21:39:39 2003
+++ edited/MultiArgEvaluator.h Fri Feb 14 23:05:22 2003
@@ -173,6 +173,8 @@
   Pooma::beginExpression();
+  // Should use info.readers() here - but readers/writers are mutually
+  // exclusive so this (or the latter) seems not to be a great idea.
   applyMultiArg(multiArg, UpdateNotifier());
   MultiArgEvaluator::evaluate(multiArg, function,
===== ScalarCodeInfo.h 1.1 vs edited =====
--- 1.1/r2/src/Evaluator/ScalarCodeInfo.h Mon May 13 17:47:34 2002
+++ edited/ScalarCodeInfo.h Fri Feb 14 22:56:48 2003
@@ -105,12 +105,16 @@
     dimensions_m = n;
     lower_m.resize(n);
     upper_m.resize(n);
+    wlower_m.resize(n);
+    wupper_m.resize(n);
     int i;
     for (i = 0; i < n; ++i)
     {
       lower_m[i] = 0;
       upper_m[i] = 0;
+      wlower_m[i] = 0;
+      wupper_m[i] = 0;
     }
   }
@@ -123,6 +127,21 @@
   {
     return upper_m[i];
   }
+
+  // Extend the evaluation domain into the guard layers by the amount
+  // specified with g. Needs to be less or equal the guard layer specification
+  // with lowerExtent() and upperExtent().
+
+  template<int Dim>
+  void evaluationExtent(const GuardLayers<Dim> &g)
+  {
+    PAssert(dimensions_m == Dim);
+    for (int i = 0; i < Dim; ++i)
+    {
+      wlower_m[i] = g.lower(i);
+      wupper_m[i] = g.upper(i);
+    }
+  }
   void write(int i, bool f)
   {
@@ -175,8 +194,8 @@
     {
       ret[d] = Interval<1>(
-        lower_m[d],
-        domain[d].last() - domain[d].first() + lower_m[d]
+        lower_m[d]-wlower_m[d],
+        domain[d].last() - domain[d].first() + lower_m[d] + wupper_m[d]
       );
     }
     return ret;
@@ -194,6 +213,8 @@
   int dimensions_m;
   Extents_t upper_m;
   Extents_t lower_m;
+  Extents_t wupper_m;
+  Extents_t wlower_m;
   BoolVector_t useGuards_m;
   BoolVector_t writers_m;
   BoolVector_t readers_m;
===== tests/makefile 1.7 vs edited =====
--- 1.7/r2/src/Evaluator/tests/makefile Fri Feb 14 21:39:39 2003
+++ edited/tests/makefile Fri Feb 14 21:57:09 2003
@@ -36,7 +36,7 @@
 TESTS = compressibleTest1 \
   evaluatorTest1 evaluatorTest2 evaluatorTest3 evaluatorTest4 \
-  evaluatorTest5 \
+  evaluatorTest5 evaluatorTest6 \
   ReductionTest1 ReductionTest2 ReductionTest3 ReductionTest4
 default:: build

// -*- C++ -*-
// ACL:license
// ----------------------------------------------------------------------
// This software and ancillary information (herein called "SOFTWARE")
// called POOMA (Parallel Object-Oriented Methods and Applications) is
// made available under the terms described here. The SOFTWARE has been
// approved for release with associated LA-CC Number LA-CC-98-65.
//
// Unless otherwise indicated, this SOFTWARE has been authored by an
// employee or employees of the University of California, operator of the
// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with
// the U.S. Department of Energy. The U.S. Government has rights to use,
// reproduce, and distribute this SOFTWARE. The public may copy, distribute,
// prepare derivative works and publicly display this SOFTWARE without
// charge, provided that this Notice and any statement of authorship are
// reproduced on all copies.
Neither the Government nor the University
// makes any warranty, express or implied, or assumes any liability or
// responsibility for the use of this SOFTWARE.
//
// If SOFTWARE is modified to produce derivative works, such modified
// SOFTWARE should be clearly marked, so as not to confuse it with the
// version available from LANL.
//
// For more information about POOMA, send e-mail to pooma at acl.lanl.gov,
// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/.
// ----------------------------------------------------------------------
// ACL:license

//-----------------------------------------------------------------------------
// evaluatorTest6 - testing ScalarCode and direct external bounds computation
//-----------------------------------------------------------------------------

#include "Pooma/Pooma.h"
#include "Pooma/Arrays.h"
#include "Pooma/Fields.h" // for PerformUpdateTag() only!
#include "Evaluator/ScalarCode.h"
#include "Utilities/Tester.h"
#include <iostream>

// stores val into every point of the evaluation domain, which is extended
// into the guard layers by extent on each side

template<int extent>
struct Store
{
  Store(int val) : val_m(val) {}

  template<class A>
  inline void operator()(const A &a, const Loc<1> &i) const
  {
    a(i) = val_m;
  }

  void scalarCodeInfo(ScalarCodeInfo& i) const
  {
    i.arguments(1);
    i.dimensions(1);
    i.lowerExtent(0) = 2;
    i.upperExtent(0) = 2;
    i.evaluationExtent(GuardLayers<1>(extent));
    i.write(0, true);
    i.useGuards(0, true);
  }

  const int val_m;
};

int main(int argc, char *argv[])
{
  // Initialize POOMA and output stream, using Tester class
  Pooma::initialize(argc, argv);
  Pooma::Tester tester(argc, argv);

  Pooma::blockingExpressions(true);

  Interval<1> domain(0, 7);
  DomainLayout<1> layout(domain, GuardLayers<1>(2));
  UniformRectilinearMesh<1> mesh(layout);
  Centering<1> cell = canonicalCentering<1>(VertexType, Continuous);

  Field<UniformRectilinearMesh<1>, int, Brick> a(cell, layout, mesh);

  // init all of a
  a.all() = 0;
  tester.out() << a.all() << std::endl;
  tester.check("guard init", a(-2) == 0 && a(-1) == 0 && a(8) == 0 && a(9) == 0);

  // zero-extent Store into a
  (ScalarCode<Store<0> >(1))(a);
  tester.out() << a.all() << std::endl;
  tester.check("guard init", a(-2) == 0 && a(-1) == 0 && a(8) == 0 && a(9) == 0);

  // one-extent Store into a
  (ScalarCode<Store<1> >(2))(a);
  tester.out() << a.all() << std::endl;
  tester.check("guard init", a(-2) == 0 && a(-1) == 2 && a(8) == 2 && a(9) == 0);

  // two-extent Store into a
  (ScalarCode<Store<2> >(3))(a);
  tester.out() << a.all() << std::endl;
  tester.check("guard init", a(-2) == 3 && a(-1) == 3 && a(8) == 3 && a(9) == 3);

  int retval = tester.results("evaluatorTest6 (ScalarCode)");
  Pooma::finalize();
  return retval;
}

// ACL:rcsinfo
// ----------------------------------------------------------------------
// $RCSfile: evaluatorTest2.cpp,v $ $Author: pooma $
// $Revision: 1.7 $ $Date: 2003/01/29 19:32:07 $
// ----------------------------------------------------------------------
// ACL:rcsinfo

From gtkacik at Princeton.EDU Mon Feb 17 20:59:44 2003
From: gtkacik at Princeton.EDU (Gasper Tkacik)
Date: Mon, 17 Feb 2003 15:59:44 -0500
Subject: Warnings when compiling Pooma
Message-ID: <3E514D40.4030703@princeton.edu>

Hi all!

When I compile Pooma (I currently only use Array<2, double> and no other
fancy stuff), I get warnings like:

/development/external/r2/src/Domain/Grid.h:360: warning: base class `class
Domain<1, DomainTraits > >' should be explicitly initialized in the
copy constructor
In file included from /development/external/r2/src/Pooma/Arrays.h:46,
from /development/utilities/cframework/alg_rfg.h:21,
from main.cpp:22:
/development/external/r2/src/Engine/RemoteEngine.h: In copy constructor
`GatherContexts::GatherContextsData::GatherContextsData(const
GatherContexts::GatherContextsData&)':

or

/development/external/r2/src/Domain/DomainTraits.Interval.h:264:
warning: comparison
between signed and unsigned integer expressions
/development/external/r2/src/Domain/DomainTraits.Interval.h: In static
member function `static void DomainTraits >::setDomain(int (&)[2],
const T1&, const T2&) [with T1 = int, T2 = long unsigned int]':

I.e.
mostly complaining about class initialization or comparison between signed / unsigned types. Is this normal in the sense that you folks use Pooma and don't see all these warnings or am I passing somewhere a strange template parameter? I tried to follow the examples... I am running gcc 3.2.20020903 on Linux RH8.0, from KDevelop tool (-Wall). I just checked - I get the warnings even if I only compile the following piece: #ifdef HAVE_CONFIG_H #include #endif #include "Pooma/Pooma.h" #include "Pooma/Arrays.h" #include #include #include int main(int argc, char *argv[]) { Pooma::initialize(argc, argv); Pooma::finalize(); return EXIT_SUCCESS; } Best regards, Gasper. From rguenth at tat.physik.uni-tuebingen.de Mon Feb 17 22:28:07 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 17 Feb 2003 23:28:07 +0100 (CET) Subject: [pooma-dev] Warnings when compiling Pooma In-Reply-To: <3E514D40.4030703@princeton.edu> Message-ID: On Mon, 17 Feb 2003, Gasper Tkacik wrote: > Hi all! > > When I compile Pooma (I currently only use Array<2, double> and no other > fancy stuff), I get warnings like: > > /development/external/r2/src/Domain/Grid.h:360: warning: base class `class > Domain<1, DomainTraits > >' should be explicitly initialized in the > copy constructor This one seems to be a bug - but I'm currently not able to provide a quick fix, as the obvious one is just segfaulting on me. Maybe I'll investigate tomorrow. > or > > /development/external/r2/src/Domain/DomainTraits.Interval.h:264: > warning: comparison > between signed and unsigned integer expressions There's a lot of these around and fixing them is not high priority - just use -Wno-sign-compare. > Is this normal in the sense that you folks use Pooma and don't see all > these warnings or am I passing somewhere a strange template parameter? I > tried to follow the examples... I am running gcc 3.2.20020903 on Linux > RH8.0, from KDevelop tool (-Wall). 
We're not seeing them, as at least I'm not using -Wall all the time and most of the compilation output is hidden from us due to the makefile structure. Richard. From rguenth at tat.physik.uni-tuebingen.de Tue Feb 18 10:47:09 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 18 Feb 2003 11:47:09 +0100 (CET) Subject: Evaluator/ReductionEvaluator.h question Message-ID: Hi! Why is the result of ReductionEvaluator<>::evaluate() initialized to Expr.read(0) and op never applied to it? This seems to be wrong, f.i. if the operation is void op(double &res, double val) { double tmp = std::sqrt(val); if (tmp > res) res = tmp; } It seems to be more natural to just use the "current" value of ret for initialization of answer, so an appropriate starting value can be provided by the users. Was there any reason in the current implementation? Thanks for clarifying, Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From rguenth at tat.physik.uni-tuebingen.de Wed Feb 19 10:54:47 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 19 Feb 2003 11:54:47 +0100 (CET) Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question In-Reply-To: Message-ID: On Tue, 18 Feb 2003, Richard Guenther wrote: > Why is the result of ReductionEvaluator<>::evaluate() initialized > to Expr.read(0) and op never applied to it? This seems to be wrong, > f.i. if the operation is > > void op(double &res, double val) > { > double tmp = std::sqrt(val); > if (tmp > res) > res = tmp; > } I see now that the current implementation does make sense for all reduction operators I can think of, but looking at the evaluation loops they seem to be hard to optimize for the compiler, so may I propose the following patch? Richard. 
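
To illustrate the two initialization schemes discussed above, here is a minimal, self-contained sketch in plain C++ on a std::vector. It is not POOMA code: the names SqrtMaxAssign, reduceSeededWithFirst and reduceFromCallerValue are made up for this example. It contrasts seeding the result with the first element against starting from a value supplied by the caller and applying op to every element.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// The example operation from the question: keep the largest square root
// seen so far. (Hypothetical functor name, chosen for this sketch.)
struct SqrtMaxAssign
{
  void operator()(double &res, double val) const
  {
    double tmp = std::sqrt(val);
    if (tmp > res)
      res = tmp;
  }
};

// Scheme 1: seed the answer with element 0 and apply op to elements 1..n-1.
// Element 0 itself is never passed through op. Requires a non-empty vector.
template<class Op>
double reduceSeededWithFirst(const std::vector<double> &v, const Op &op)
{
  double answer = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    op(answer, v[i]);
  return answer;
}

// Scheme 2: start from the value the caller put into ret and apply op to
// every element, including element 0.
template<class Op>
void reduceFromCallerValue(double &ret, const std::vector<double> &v,
                           const Op &op)
{
  double answer = ret;
  for (std::size_t i = 0; i < v.size(); ++i)
    op(answer, v[i]);
  ret = answer;
}
```

With the values 16, 4 and 9, reduceSeededWithFirst never passes 16 through op and therefore returns 16 rather than the largest square root, while reduceFromCallerValue with ret preset to 0 yields 4. One point to keep in mind when choosing seeds: for floating-point T, std::numeric_limits<T>::min() is the smallest positive normalized value, so a max reduction over possibly negative data would need a seed such as -std::numeric_limits<T>::max() instead.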
===== src/Array/Reductions.h 1.1 vs edited ===== --- 1.1/r2/src/Array/Reductions.h Mon May 13 17:47:27 2002 +++ edited/src/Array/Reductions.h Tue Feb 18 12:59:28 2003 @@ -64,7 +64,7 @@ template T sum(const Array &a) { - T ret; + T ret = T(0); Reduction().evaluate(ret, OpAddAssign(), a); return ret; } @@ -74,7 +74,7 @@ template T prod(const Array &a) { - T ret; + T ret = T(1); Reduction().evaluate(ret, OpMultiplyAssign(), a); return ret; } @@ -84,7 +84,7 @@ template T min(const Array &a) { - T ret; + T ret = std::numeric_limits::max(); Reduction().evaluate(ret, FnMinAssign(), a); return ret; } @@ -94,7 +94,7 @@ template T max(const Array &a) { - T ret; + T ret = std::numeric_limits::min(); Reduction().evaluate(ret, FnMaxAssign(), a); return ret; } @@ -104,7 +104,7 @@ template bool all(const Array &a) { - bool ret; + bool ret = true; Reduction().evaluate(ret, FnAndAssign(), a); return ret; } @@ -114,7 +114,7 @@ template bool any(const Array &a) { - bool ret; + bool ret = false; Reduction().evaluate(ret, FnOrAssign(), a); return ret; } @@ -124,7 +124,7 @@ template T bitOr(const Array &a) { - T ret; + T ret = static_cast(0ULL); Reduction().evaluate(ret, OpBitwiseOrAssign(), a); return ret; } @@ -134,7 +134,7 @@ template T bitAnd(const Array &a) { - T ret; + T ret = static_cast(~0ULL); Reduction().evaluate(ret, OpBitwiseAndAssign(), a); return ret; } ===== src/Engine/RemoteEngine.h 1.1 vs edited ===== --- 1.1/r2/src/Engine/RemoteEngine.h Mon May 13 17:47:32 2002 +++ edited/src/Engine/RemoteEngine.h Tue Feb 18 13:06:14 2003 @@ -2090,6 +2090,7 @@ { if (computationalContext[j] == Pooma::context()) { + vals[k] = ret; EngineView view; Reduction(). evaluate(vals[k++], op, ===== src/Evaluator/Reduction.h 1.1 vs edited ===== --- 1.1/r2/src/Evaluator/Reduction.h Mon May 13 17:47:34 2002 +++ edited/src/Evaluator/Reduction.h Tue Feb 18 13:04:26 2003 @@ -226,6 +226,7 @@ int j = 0; while (j < n) { + vals[j] = ret; Reduction(). 
evaluate(vals[j], op, e(*i), csem); ++i; ++j; ===== src/Evaluator/ReductionEvaluator.h 1.1 vs edited ===== --- 1.1/r2/src/Evaluator/ReductionEvaluator.h Mon May 13 17:47:34 2002 +++ edited/src/Evaluator/ReductionEvaluator.h Tue Feb 18 11:56:09 2003 @@ -127,8 +127,8 @@ Expr localExpr(e); int e0 = domain[0].length(); - T answer(localExpr.read(0)); - for (int i0 = 1; i0 < e0; ++i0) + T answer(ret); + for (int i0 = 0; i0 < e0; ++i0) op(answer, localExpr.read(i0)); ret = answer; @@ -145,22 +145,10 @@ int e0 = domain[0].length(); int e1 = domain[1].length(); - int i00; - bool firstLoop = true; - - T answer(localExpr.read(0, 0)); + T answer(ret); for (int i1 = 0; i1 < e1; ++i1) - { - if (firstLoop) - { - firstLoop = false; - i00 = 1; - } - else - i00 = 0; - for (int i0 = i00; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1)); - } + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1)); ret = answer; } @@ -177,24 +165,12 @@ int e0 = domain[0].length(); int e1 = domain[1].length(); int e2 = domain[2].length(); - - int i00; - bool firstLoop = true; - T answer(localExpr.read(0, 0, 0)); + T answer(ret); for (int i2 = 0; i2 < e2; ++i2) for (int i1 = 0; i1 < e1; ++i1) - { - if (firstLoop) - { - firstLoop = false; - i00 = 1; - } - else - i00 = 0; - for (int i0 = i00; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2)); - } + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2)); ret = answer; } @@ -213,25 +189,13 @@ int e1 = domain[1].length(); int e2 = domain[2].length(); int e3 = domain[3].length(); - - int i00; - bool firstLoop = true; - T answer(localExpr.read(0, 0, 0, 0)); + T answer(ret); for (int i3 = 0; i3 < e3; ++i3) for (int i2 = 0; i2 < e2; ++i2) for (int i1 = 0; i1 < e1; ++i1) - { - if (firstLoop) - { - firstLoop = false; - i00 = 1; - } - else - i00 = 0; - for (int i0 = i00; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3)); - } + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2, i3)); ret = 
answer; } @@ -252,26 +216,14 @@ int e2 = domain[2].length(); int e3 = domain[3].length(); int e4 = domain[4].length(); - - int i00; - bool firstLoop = true; - T answer(localExpr.read(0, 0, 0, 0, 0)); + T answer(ret); for (int i4 = 0; i4 < e4; ++i4) for (int i3 = 0; i3 < e3; ++i3) for (int i2 = 0; i2 < e2; ++i2) for (int i1 = 0; i1 < e1; ++i1) - { - if (firstLoop) - { - firstLoop = false; - i00 = 1; - } - else - i00 = 0; - for (int i0 = i00; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3, i4)); - } + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2, i3, i4)); ret = answer; } @@ -294,27 +246,15 @@ int e3 = domain[3].length(); int e4 = domain[4].length(); int e5 = domain[5].length(); - - int i00; - bool firstLoop = true; - T answer(localExpr.read(0, 0, 0, 0, 0, 0)); + T answer(ret); for (int i5 = 0; i5 < e5; ++i5) for (int i4 = 0; i4 < e4; ++i4) for (int i3 = 0; i3 < e3; ++i3) for (int i2 = 0; i2 < e2; ++i2) for (int i1 = 0; i1 < e1; ++i1) - { - if (firstLoop) - { - firstLoop = false; - i00 = 1; - } - else - i00 = 0; - for (int i0 = i00; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3, i4, i5)); - } + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2, i3, i4, i5)); ret = answer; } @@ -340,27 +280,15 @@ int e5 = domain[5].length(); int e6 = domain[6].length(); - int i00; - bool firstLoop = true; - - T answer(localExpr.read(0, 0, 0, 0, 0, 0, 0)); + T answer(ret); for (int i6 = 0; i6 < e6; ++i6) for (int i5 = 0; i5 < e5; ++i5) for (int i4 = 0; i4 < e4; ++i4) for (int i3 = 0; i3 < e3; ++i3) for (int i2 = 0; i2 < e2; ++i2) for (int i1 = 0; i1 < e1; ++i1) - { - if (firstLoop) - { - firstLoop = false; - i00 = 1; - } - else - i00 = 0; - for (int i0 = i00; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3, i4, i5, i6)); - } + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2, i3, i4, i5, i6)); ret = answer; } @@ -384,9 +312,10 @@ struct CompressibleReduce { template - inline static 
void evaluate(T &ret, const Op &, const T1 &val, int) + inline static void evaluate(T &ret, const Op &op, const T1 &val, int) { - ret = static_cast(val); + // Works for op that is constrained to op(op(ret, val), val) == op(ret, val). + op(ret, static_cast(val)); } }; ===== src/Evaluator/tests/ReductionTest1.cpp 1.1 vs edited ===== --- 1.1/r2/src/Evaluator/tests/ReductionTest1.cpp Mon May 13 17:47:34 2002 +++ edited/src/Evaluator/tests/ReductionTest1.cpp Tue Feb 18 12:12:52 2003 @@ -48,22 +48,27 @@ int ret; bool bret; + ret = 0; Reduction().evaluate(ret, OpAddAssign(), a); tester.check("sum", ret, 55); tester.out() << ret << std::endl; + ret = 1; Reduction().evaluate(ret, OpMultiplyAssign(), a(Interval<1>(9))); tester.check("prod", ret, 362880); tester.out() << ret << std::endl; + ret = std::numeric_limits::max(); Reduction().evaluate(ret, FnMinAssign(), a - 2); tester.check("min", ret, -1); tester.out() << ret << std::endl; + bret = true; Reduction().evaluate(bret, FnAndAssign(), a - 1); tester.check("all", bret, false); tester.out() << bret << std::endl; + ret = static_cast(0ULL); Reduction().evaluate(ret, OpBitwiseOrAssign(), a); tester.check("bitOr", ret, 15); tester.out() << ret << std::endl; ===== src/Evaluator/tests/ReductionTest2.cpp 1.1 vs edited ===== --- 1.1/r2/src/Evaluator/tests/ReductionTest2.cpp Mon May 13 17:47:34 2002 +++ edited/src/Evaluator/tests/ReductionTest2.cpp Tue Feb 18 12:13:48 2003 @@ -52,22 +52,27 @@ int ret; bool bret; + ret = 0; Reduction().evaluate(ret, OpAddAssign(), a); tester.check("sum", ret, 55); tester.out() << ret << std::endl; + ret = 1; Reduction().evaluate(ret, OpMultiplyAssign(), a(Interval<1>(9))); tester.check("prod", ret, 362880); tester.out() << ret << std::endl; + ret = std::numeric_limits::max(); Reduction().evaluate(ret, FnMinAssign(), a - 2); tester.check("min", ret, -1); tester.out() << ret << std::endl; + bret = true; Reduction().evaluate(bret, FnAndAssign(), a - 1); tester.check("all", bret, false); tester.out() 
<< bret << std::endl; + ret = static_cast(0ULL); Reduction().evaluate(ret, OpBitwiseOrAssign(), a); tester.check("bitOr", ret, 15); tester.out() << ret << std::endl; ===== src/Evaluator/tests/ReductionTest3.cpp 1.1 vs edited ===== --- 1.1/r2/src/Evaluator/tests/ReductionTest3.cpp Mon May 13 17:47:34 2002 +++ edited/src/Evaluator/tests/ReductionTest3.cpp Tue Feb 18 12:13:39 2003 @@ -47,22 +47,27 @@ int ret; bool bret; + ret = 0; Reduction().evaluate(ret, OpAddAssign(), a); tester.check("sum", ret, int(2 * a.domain().size())); tester.out() << ret << std::endl; + ret = 1; Reduction().evaluate(ret, OpMultiplyAssign(), a); tester.check("prod", ret, 65536); tester.out() << ret << std::endl; + ret = std::numeric_limits::max(); Reduction().evaluate(ret, FnMinAssign(), a); tester.check("min", ret, 2); tester.out() << ret << std::endl; + bret = true; Reduction().evaluate(bret, FnAndAssign(), a); tester.check("all", bret, true); tester.out() << bret << std::endl; + ret = static_cast(0ULL); Reduction().evaluate(ret, OpBitwiseOrAssign(), a); tester.check("bitOr", ret, 2); tester.out() << ret << std::endl; ===== src/Evaluator/tests/ReductionTest4.cpp 1.3 vs edited ===== --- 1.3/r2/src/Evaluator/tests/ReductionTest4.cpp Thu Dec 19 10:38:23 2002 +++ edited/src/Evaluator/tests/ReductionTest4.cpp Tue Feb 18 12:15:06 2003 @@ -65,47 +65,56 @@ // Test various sorts of reductions with a single array. 
+ ret = 0; Reduction().evaluate(ret, OpAddAssign(), a); tester.check("sum", ret, 55); tester.out() << ret << std::endl; + ret = 1; Reduction().evaluate(ret, OpMultiplyAssign(), a(Interval<1>(9))); tester.check("prod", ret, 362880); tester.out() << ret << std::endl; + ret = std::numeric_limits::max(); Reduction().evaluate(ret, FnMinAssign(), a - 2); tester.check("min", ret, -1); tester.out() << ret << std::endl; + bret = true; Reduction().evaluate(bret, FnAndAssign(), a - 1); tester.check("all", bret, false); tester.out() << bret << std::endl; + ret = static_cast(0ULL); Reduction().evaluate(ret, OpBitwiseOrAssign(), a); tester.check("bitOr", ret, 15); tester.out() << ret << std::endl; // Test something with an expression engine (remote2 + remote5). + ret = 0; Reduction().evaluate(ret, OpAddAssign(), a + b); tester.check("sum(a + b)", ret, 55 + 90); tester.out() << ret << std::endl; // Test something with an expression engine (remote5 + remote2). + ret = 0; Reduction().evaluate(ret, OpAddAssign(), b + a); tester.check("sum(b + a)", ret, 90 + 55); tester.out() << ret << std::endl; // Test something with a brick (remote2 + remote5 + brick). + ret = 0; Reduction().evaluate(ret, OpAddAssign(), a + b + c); tester.check("sum(a + b + c)", ret, 90 + 55 + 20); tester.out() << ret << std::endl; // Test something with a brick (brick + remote5 + remote2). 
+ ret = 0; Reduction().evaluate(ret, OpAddAssign(), c + b + a); tester.check("sum(c + b + a)", ret, 20 + 55 + 90); tester.out() << ret << std::endl; ===== src/Field/FieldReductions.h 1.1 vs edited ===== --- 1.1/r2/src/Field/FieldReductions.h Mon May 13 17:47:35 2002 +++ edited/src/Field/FieldReductions.h Tue Feb 18 12:59:01 2003 @@ -73,7 +73,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - T ret; + T ret = T(0); Reduction().evaluate(ret, OpAddAssign(), f); return ret; } @@ -92,7 +92,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - T ret; + T ret = T(1); Reduction().evaluate(ret, OpMultiplyAssign(), f); return ret; } @@ -111,7 +111,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - T ret; + T ret = std::numeric_limits::max(); Reduction().evaluate(ret, FnMinAssign(), f); return ret; } @@ -130,7 +130,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - T ret; + T ret = std::numeric_limits::min(); Reduction().evaluate(ret, FnMaxAssign(), f); return ret; } @@ -149,7 +149,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - bool ret; + bool ret = true; Reduction().evaluate(ret, FnAndAssign(), f); return ret; } @@ -168,7 +168,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - bool ret; + bool ret = false; Reduction().evaluate(ret, FnOrAssign(), f); return ret; } @@ -187,7 +187,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - T ret; + T ret = static_cast(0ULL); Reduction().evaluate(ret, OpBitwiseOrAssign(), f); return ret; } @@ -206,7 +206,7 @@ forEach(f, PerformUpdateTag(), NullCombine()); - T ret; + T ret = static_cast(~0ULL); Reduction().evaluate(ret, OpBitwiseAndAssign(), f); return ret; } From rguenth at tat.physik.uni-tuebingen.de Wed Feb 19 11:07:38 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 19 Feb 2003 12:07:38 +0100 (CET) Subject: r2 Mesh and Pooma::positions() Message-ID: Hi! Is Pooma::positions() supposed to return positions or coordinates for the mesh? 
They are the same for the only implemented coordinate system (Cartesian),
but for other coordinate systems we need to distinguish between them.
There are also the cellContaining()/vertexPosition() methods, which take or
return points: are those supposed to be in coordinates or positions? I.e., is

  m.vertexPosition(cellContaining(positions(m)(Loc()))) == positions(m)(Loc())

supposed to be true for all coordinate systems?

Any ideas for good names for methods distinguishing between both worlds?
E.g. Mesh::vertexPosition() vs. Mesh::vertexCoordinate(), or is the term
"coordinate" tied to integral indices?

Thanks, Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Fri Feb 21 17:36:07 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 21 Feb 2003 18:36:07 +0100 (CET)
Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question
In-Reply-To:
Message-ID:

On Wed, 19 Feb 2003, Richard Guenther wrote:

> On Tue, 18 Feb 2003, Richard Guenther wrote:
>
> > Why is the result of ReductionEvaluator<>::evaluate() initialized
> > to Expr.read(0) and op never applied to it? This seems to be wrong,
> > e.g. if the operation is
> >
> > void op(double &res, double val)
> > {
> >   double tmp = std::sqrt(val);
> >   if (tmp > res)
> >     res = tmp;
> > }
>
> I see now that the current implementation does make sense for all
> reduction operators I can think of, but looking at the evaluation
> loops they seem to be hard to optimize for the compiler, so may I
> propose the following patch?

I expected some criticism of the patch, namely the following...
(so I delay committing this).

> +++ edited/src/Array/Reductions.h Tue Feb 18 12:59:28 2003
> @@ -84,7 +84,7 @@
> template
> T min(const Array &a)
> {
> - T ret;
> + T ret = std::numeric_limits<T>::max();

What about types that don't have a std::numeric_limits<> specialization?
Are there any that we care about? Tiny::Zero<> probably?
> @@ -124,7 +124,7 @@
> template
> T bitOr(const Array &a)
> {
> - T ret;
> + T ret = static_cast<T>(0ULL);

Does this work for all types we care about? Do we need to use memset() here?
What about FP types - can this initial value be an SNaN or some other
trapping value we will choke on later?

I'll go on adding two testcases, one for Arrays, one for Fields, and use
memset for the all-bits-1 and all-bits-0 initial values. I don't know what
to do about the numeric_limits<> issues, or whether to care about them.

Any ideas, comments?

Richard.

From mark at codesourcery.com Fri Feb 21 17:47:02 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: Fri, 21 Feb 2003 09:47:02 -0800
Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question
In-Reply-To:
Message-ID: <54560000.1045849622@warlock.codesourcery.com>

--On Friday, February 21, 2003 06:36:07 PM +0100 Richard Guenther wrote:

> I expected some criticism of the patch, namely the following...
> (so I delay committing this).

At one point, I think people wanted POOMA to be able to work on any types
that had normal arithmetic properties (roughly speaking, those that form a
field, in the algebraic sense.) So, I'm not sure if explicitly initializing
things to "1" and such makes sense.

But, I don't really know much about this -- it could be I'm totally off.

--
Mark Mitchell mark at codesourcery.com
CodeSourcery, LLC http://www.codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Fri Feb 21 18:08:39 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 21 Feb 2003 19:08:39 +0100 (CET)
Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question
In-Reply-To: <54560000.1045849622@warlock.codesourcery.com>
Message-ID:

On Fri, 21 Feb 2003, Mark Mitchell wrote:

> --On Friday, February 21, 2003 06:36:07 PM +0100 Richard Guenther
> wrote:
>
> > I expected some criticism of the patch, namely the following...
> > (so I delay committing this).
>
> At one point, I think people wanted POOMA to be able to work on any types
> that had normal arithmetic properties (roughly speaking, those that form a
> field, in the algebraic sense.) So, I'm not sure if explicitly initializing
> things to "1" and such makes sense.

Hmm - at least for such a type, constructing from a 1 or 0 element as the
neutral element of multiplication/addition does make sense. Requiring a
numeric_limits<> specialization maybe, too.

> But, I don't really know much about this -- it could be I'm totally off.

Me too - maybe someone can think of a problematic case here. At least the
patch brings a nice speedup and code size improvement for gcc3.2.

Richard.

From rguenth at tat.physik.uni-tuebingen.de Fri Feb 21 18:18:39 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 21 Feb 2003 19:18:39 +0100 (CET)
Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question
In-Reply-To:
Message-ID:

On Fri, 21 Feb 2003, Richard Guenther wrote:

> On Fri, 21 Feb 2003, Mark Mitchell wrote:
>
> > --On Friday, February 21, 2003 06:36:07 PM +0100 Richard Guenther
> > wrote:
> >
> > > I expected some criticism of the patch, namely the following...
> > > (so I delay committing this).
> >
> > At one point, I think people wanted POOMA to be able to work on any types
> > that had normal arithmetic properties (roughly speaking, those that form a
> > field, in the algebraic sense.) So, I'm not sure if explicitly initializing
> > things to "1" and such makes sense.
>
> Hmm - at least for such a type, constructing from a 1 or 0 element as the
> neutral element of multiplication/addition does make sense. Requiring a
> numeric_limits<> specialization maybe, too.
>
> > But, I don't really know much about this -- it could be I'm totally off.
>
> Me too - maybe someone can think of a problematic case here. At least the
> patch brings a nice speedup and code size improvement for gcc3.2.
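
The idea debated in this thread, seeding a reduction with the operator's
neutral element instead of the first array element, can be sketched outside
POOMA roughly as follows. This is only an illustration under stated
assumptions, not the POOMA implementation: Identity, AddAssign, MinAssign and
reduceWithIdentity are hypothetical names, and the sketch assumes the element
type is either constructible from 0 or has a std::numeric_limits<>
specialization, which is exactly the requirement being questioned.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical operator tags, standing in for functors such as
// OpAddAssign or FnMinAssign; these are not the POOMA classes.
struct AddAssign {
  template <class T>
  void operator()(T &res, const T &val) const { res += val; }
};

struct MinAssign {
  template <class T>
  void operator()(T &res, const T &val) const {
    if (val < res) res = val;
  }
};

// Hypothetical trait: the neutral element of Op for element type T.
// A type without a suitable identity simply gets no specialization.
template <class Op, class T> struct Identity;

template <class T> struct Identity<AddAssign, T> {
  static T value() { return T(0); }  // assumes T is constructible from 0
};

template <class T> struct Identity<MinAssign, T> {
  // assumes std::numeric_limits<T> is specialized for T
  static T value() { return std::numeric_limits<T>::max(); }
};

// Uniform loop: op is applied to every element, including the first,
// so no special case for the first iteration is needed.
template <class Op, class T>
T reduceWithIdentity(const Op &op, const std::vector<T> &data) {
  T answer = Identity<Op, T>::value();
  for (std::size_t i = 0; i < data.size(); ++i)
    op(answer, data[i]);
  return answer;
}
```

For a std::vector<int> holding 1..10, reduceWithIdentity(AddAssign(), a)
gives 55 and reduceWithIdentity(MinAssign(), a) gives 1. Keeping the identity
in a trait means a type such as a matrix class could supply its own neutral
element without needing an "int" constructor.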
Just stumbled over Field/FieldOffset.h, which does reductions on
FieldOffsetLists the same way the old implementation did. But here all
loops are 1D, so no performance/code impact.

Richard.

From mark at codesourcery.com Fri Feb 21 18:21:39 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: Fri, 21 Feb 2003 10:21:39 -0800
Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question
In-Reply-To:
Message-ID: <63300000.1045851698@warlock.codesourcery.com>

--On Friday, February 21, 2003 07:08:39 PM +0100 Richard Guenther wrote:

> Hmm - at least for such a type constructing from a 1 or 0 element as
> neutral element of multiplication/addition does make sense.

Well, in an algebraic sense there are going to be additive and
multiplicative identities -- but they might not be the integers 1 and 0.
You might not have an "int" constructor for, say, a matrix class.

--
Mark Mitchell mark at codesourcery.com
CodeSourcery, LLC http://www.codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Fri Feb 21 18:40:24 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 21 Feb 2003 19:40:24 +0100 (CET)
Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question
In-Reply-To: <63300000.1045851698@warlock.codesourcery.com>
Message-ID:

On Fri, 21 Feb 2003, Mark Mitchell wrote:

> --On Friday, February 21, 2003 07:08:39 PM +0100 Richard Guenther
> wrote:
>
> > Hmm - at least for such a type constructing from a 1 or 0 element as
> > neutral element of multiplication/addition does make sense.
>
> Well, in an algebraic sense there are going to be additive and
> multiplicative identities -- but they might not be the integers 1 and 0.
> You might not have an "int" constructor for, say, a matrix class.

Well, constructing from int 1 and int 0 would make sense, but a generic
from-int constructor of course would not. Were the reductions designed to
allow for such complex objects? prod() and sum() would surely have worked
for almost all of them.
Also, for these you won't have meaningful comparison operators, so at least
min()/max() will not work for them.

Richard.

From mark at codesourcery.com Fri Feb 21 18:57:36 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: Fri, 21 Feb 2003 10:57:36 -0800
Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question
In-Reply-To:
Message-ID: <70410000.1045853856@warlock.codesourcery.com>

--On Friday, February 21, 2003 07:40:24 PM +0100 Richard Guenther wrote:

> Also, for these you won't have meaningful comparison operators, so at least
> min()/max() will not work for them.

Yes. I must admit that I am pretty well making things up.

I will let someone who really knows say something more intelligent. :-)

--
Mark Mitchell mark at codesourcery.com
CodeSourcery, LLC http://www.codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Fri Feb 21 19:31:47 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 21 Feb 2003 20:31:47 +0100 (CET)
Subject: [PATCH] Streamline Engine types
Message-ID:

Hi!

The following patch adds layout() accessors to engines missing one and
adds some inline qualifiers to methods that have one in other engines.

Ok?

Richard.

2003Feb21 Richard Guenther

 * src/Engine/ViewEngine.h: add layout() accessor, mark small methods inline.
 src/Engine/Stencil.h: add layout() accessor, make domain() return const reference.
 src/Engine/DynamicEngine.h: mark small methods inline.
 src/Engine/ForwardingEngine.h: correct Layout_t, add layout() accessor, mark small methods inline.

diff -Nru a/r2/src/Engine/DynamicEngine.h b/r2/src/Engine/DynamicEngine.h
--- a/r2/src/Engine/DynamicEngine.h Fri Feb 21 20:26:06 2003
+++ b/r2/src/Engine/DynamicEngine.h Fri Feb 21 20:26:06 2003
@@ -235,11 +235,11 @@
   // Create a layout and return a copy.
- inline const Layout_t layout() const { return Layout_t(domain_m); }
+ inline Layout_t layout() const { return Layout_t(domain_m); }
   // Return whether the block controlled by this engine is shared.
- bool isShared() const { return data_m.isValid() && data_m.count() > 1; } + inline bool isShared() const { return data_m.isValid() && data_m.count() > 1; } // Get a private copy of data viewed by this Engine. @@ -247,14 +247,14 @@ // Provide access to the data object. - Pooma::DataObject_t *dataObject() const { return data_m.dataObject(); } + inline Pooma::DataObject_t *dataObject() const { return data_m.dataObject(); } // Return access to our internal data block. This is ref-counted, // so a copy is fine. But you should really know what you're doing // if you call this method. - const DataBlockPtr & dataBlock() const { return data_m; } - DataBlockPtr dataBlock() { return data_m; } + inline const DataBlockPtr & dataBlock() const { return data_m; } + inline DataBlockPtr dataBlock() { return data_m; } //============================================================ // Dynamic interface methods. @@ -308,7 +308,7 @@ // sync() function is a no-op for a single-patch engine. // This version of sync() may be called via the DynamicArray interface. - void sync() { } + inline void sync() { } // Modify the domain (but not the size) of this engine. // This version of sync() may be called by MultiPatchEngine on its patches. 
@@ -318,19 +318,19 @@ #if POOMA_CHEETAH template - int packSize(const Dom &) const + inline int packSize(const Dom &) const { PInsist(false,"packSize() called on non-remote Dynamic Engine!!"); return 0; } - int pack(const IndirectionList &, char *, bool = true) const + inline int pack(const IndirectionList &, char *, bool = true) const { PInsist(false,"pack() called on non-remote Dynamic Engine!!"); return 0; } - int unpack(const Interval<1> &, char *, bool = true) + inline int unpack(const Interval<1> &, char *, bool = true) { PInsist(false,"unpack() called on non-remote Dynamic Engine!!"); return 0; @@ -509,26 +509,26 @@ // Return the domain: - const Domain_t &domain() const { return domain_m; } + inline const Domain_t &domain() const { return domain_m; } // Return a DomainLayout built from our domain - const Layout_t layout() const { return Layout_t(domain_m); } + inline Layout_t layout() const { return Layout_t(domain_m); } // Return the stride. - int stride() const { return stride_m; } + inline int stride() const { return stride_m; } // Provide access to the data object. - Pooma::DataObject_t *dataObject() const { return data_m.dataObject(); } + inline Pooma::DataObject_t *dataObject() const { return data_m.dataObject(); } // Return access to our internal data block. This is ref-counted, // so a copy is fine. But you should really know what you're doing // if you call this method. - const DataBlockPtr & dataBlock() const { return data_m; } - DataBlockPtr dataBlock() { return data_m; } + inline const DataBlockPtr & dataBlock() const { return data_m; } + inline DataBlockPtr dataBlock() { return data_m; } private: diff -Nru a/r2/src/Engine/Stencil.h b/r2/src/Engine/Stencil.h --- a/r2/src/Engine/Stencil.h Fri Feb 21 20:26:06 2003 +++ b/r2/src/Engine/Stencil.h Fri Feb 21 20:26:06 2003 @@ -400,7 +400,13 @@ // Return the output domain. 
//============================================================ - inline Domain_t domain() const { return domain_m; } + inline const Domain_t &domain() const { return domain_m; } + + //============================================================ + // Return the output layout. + //============================================================ + + inline Layout_t layout() const { return Layout_t(domain_m); } //============================================================ // Return the first output index value for the specified direction diff -Nru a/r2/src/Engine/ViewEngine.h b/r2/src/Engine/ViewEngine.h --- a/r2/src/Engine/ViewEngine.h Fri Feb 21 20:26:06 2003 +++ b/r2/src/Engine/ViewEngine.h Fri Feb 21 20:26:06 2003 @@ -263,7 +263,12 @@ //--------------------------------------------------------------------------- // Return the domain. - const Domain_t &domain() const { return indexer_m.domain(); } + inline const Domain_t &domain() const { return indexer_m.domain(); } + + //--------------------------------------------------------------------------- + // Return the layout. + + inline Layout_t layout() const { return Layout_t(domain()); } //--------------------------------------------------------------------------- // Return the first value for the specified direction (always zero since this @@ -278,8 +283,8 @@ //--------------------------------------------------------------------------- // Accessors. - const ViewedEngine_t &viewedEngine() const { return eng_m; } - const Indexer_t &indexer() const { return indexer_m; } + inline const ViewedEngine_t &viewedEngine() const { return eng_m; } + inline const Indexer_t &indexer() const { return indexer_m; } //--------------------------------------------------------------------------- // Need to pass lock requests to the contained engine. 
diff -Nru a/r2/src/Engine/ForwardingEngine.h b/r2/src/Engine/ForwardingEngine.h --- a/r2/src/Engine/ForwardingEngine.h Fri Feb 21 20:28:10 2003 +++ b/r2/src/Engine/ForwardingEngine.h Fri Feb 21 20:28:10 2003 @@ -96,7 +96,7 @@ typedef typename CompAccess_t::ElementRef_t ElementRef_t; typedef typename Eng::Domain_t Domain_t; typedef CompFwd Tag_t; - typedef DomainLayout Layout_t; + typedef typename Eng::Layout_t Layout_t; //--------------------------------------------------------------------------- // required constants @@ -251,9 +251,22 @@ } //--------------------------------------------------------------------------- + // Returns the layout, which is acquired from the contained engine. + + inline const Layout_t& layout() const + { + return elemEngine().layout(); + } + + inline Layout_t& layout() + { + return elemEngine().layout(); + } + + //--------------------------------------------------------------------------- // Returns the domain, which is acquired from the contained engine. - Domain_t domain() const { return elemEngine().domain(); } + inline const Domain_t& domain() const { return elemEngine().domain(); } //--------------------------------------------------------------------------- // Return the first value for the specified direction. From leopardi at bigpond.net.au Fri Feb 21 21:32:26 2003 From: leopardi at bigpond.net.au (Paul C. Leopardi) Date: Sat, 22 Feb 2003 08:32:26 +1100 Subject: [pooma-dev] Evaluator/ReductionEvaluator.h question In-Reply-To: <70410000.1045853856@warlock.codesourcery.com> References: <70410000.1045853856@warlock.codesourcery.com> Message-ID: <200302220832.26893.leopardi@bigpond.net.au> Hi all, I'm not sure whether I'm saying anything intelligent, but the GluCat library - http://glucat.sf.net - is designed to implement a numeric type for use with template libraries such as POOMA. GluCat implements Clifford algebras. Still on the TODO list for GluCat are numeric_limits<> and type_traits<>. 
So your discussion is very relevant to the future direction of GluCat.

Best regards

On Sat, 22 Feb 2003 05:57, Mark Mitchell wrote:
> --On Friday, February 21, 2003 07:40:24 PM +0100 Richard Guenther
> wrote:
>
> > Also for these you won't have meaningful comparison operators, so at least
> > min()/max() will not work for them.
>
> Yes. I must admit that I am pretty well making things up.
>
> I will let someone who really knows say something more intelligent. :-)

From renard1 at llnl.gov Tue Feb 25 16:10:07 2003
From: renard1 at llnl.gov (Paul A. Renard)
Date: Tue, 25 Feb 2003 08:10:07 -0800
Subject: KCC versus icc
Message-ID: <5.1.0.14.2.20030225080717.00b10fd0@popout.llnl.gov>

An HTML attachment was scrubbed...
URL:

From oldham at codesourcery.com Wed Feb 26 02:38:28 2003
From: oldham at codesourcery.com (Jeffrey Oldham)
Date: Tue, 25 Feb 2003 18:38:28 -0800
Subject: [pooma-dev] KCC versus icc
References: <5.1.0.14.2.20030225080717.00b10fd0@popout.llnl.gov>
Message-ID: <3E5C28A4.6090702@codesourcery.com>

Paul A. Renard wrote:
> Hope I'm asking the correct crowd...
>
> Given the following:
>
> const int N=128;
> Array<2,complex > u(N,N);
> Iota<2>::Iota_t ij(iota(u.domain()));
> Iota<2>::Index_t I(ij.comp(0));
> Iota<2>::Index_t J(ij.comp(1));
> Array<1,complex > cx(N), cy(N);
>
> // Values for u, cx, cy are filled elsewhere.
>
> // Then the following is called:
> void compute(){
>   u *= cx(I)*cy(J); // runs 4X slower with icc than KCC
> }
>
> When I time this routine, I find that it runs about 4X slower when
> compiled with Intel's icc (Version 7, -O3 -DNOPAssert -DNOCTASSERT) than
> with KCC (version 4.0f, +K3 -DNOPAssert, -DNOCTAssert). As expected,
> the KCC version runs as fast as hand-written loops.
>
> Do others observe this same sluggish behavior with icc? Am I missing
> some obvious compile flag?
>
> thanks
> Paul

This is the first such report on icc's bad behavior. Perhaps someone else
will have some idea.
Arch Robison (robison at kai.com) moved to work on icc. He would probably
be interested in the smallest test case you can construct. If you send him
information, would you please consider cc'ing me?

Thanks,
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Wed Feb 26 19:27:12 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Wed, 26 Feb 2003 20:27:12 +0100 (CET)
Subject: [pooma-dev] KCC versus icc
In-Reply-To: <5.1.0.14.2.20030225080717.00b10fd0@popout.llnl.gov>
Message-ID:

Hi!

I remember problems with the inliner, i.e. it refused to inline some of the
expression template machinery. You might want to search for an option
letting you tune the inlining behavior, or try profile-directed
optimizations. With standard -O3, icc is not always faster than gcc3.2.2
with -O3.

Richard.

On Tue, 25 Feb 2003, Paul A. Renard wrote:

> Hope I'm asking the correct crowd...
>
> Given the following:
>
>  const int N=128;
>  Array<2,complex > u(N,N);
>  Iota<2>::Iota_t ij(iota(u.domain()));
>  Iota<2>::Index_t I(ij.comp(0));
>  Iota<2>::Index_t J(ij.comp(1));
>  Array<1,complex > cx(N), cy(N);
>
>  // Values for u, cx, cy are filled elsewhere.
>
>  // Then the following is called:
>  void compute(){
>    u *= cx(I)*cy(J);    // runs 4X slower with icc than KCC
>  }
>
>  When I time this routine, I find that it runs about 4X slower when compiled with
> Intel's icc (Version 7, -O3 -DNOPAssert -DNOCTASSERT) than with KCC (version 4.0f,
> +K3 -DNOPAssert, -DNOCTAssert). As expected, the KCC version runs as fast as
> hand-written loops.
>
> Do others observe this same sluggish behavior with icc? Am I missing some obvious
> compile flag?
>
> thanks
> Paul
>
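
For anyone constructing the small test case asked for in this thread, a
hand-written loop baseline for the quoted u *= cx(I)*cy(J) statement might
look roughly like the sketch below. It is plain C++ with no POOMA types; the
names Complex and computeByHand are hypothetical, the element type is assumed
to be std::complex<double> (the template argument is missing from the quoted
code), and the flat storage with the first index varying fastest is also an
assumption.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Assumed element type; the quoted code does not show it.
typedef std::complex<double> Complex;

// Hand-written equivalent of u(i,j) *= cx(i) * cy(j).
// u holds cx.size() * cy.size() elements in a flat vector,
// with index i (the first dimension) varying fastest.
void computeByHand(std::vector<Complex> &u,
                   const std::vector<Complex> &cx,
                   const std::vector<Complex> &cy)
{
  const std::size_t nx = cx.size();
  const std::size_t ny = cy.size();
  for (std::size_t j = 0; j < ny; ++j)
    for (std::size_t i = 0; i < nx; ++i)
      u[i + nx * j] *= cx[i] * cy[j];
}
```

Timing this routine and the expression-template version with the same
compiler and flags separates the cost of the arithmetic itself from any
abstraction overhead left behind by the inliner.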