From oldham at codesourcery.com Mon Dec 1 16:19:35 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Mon, 01 Dec 2003 08:19:35 -0800
Subject: [pooma-dev] POOMA Namespace Pollution
In-Reply-To:
References:
Message-ID: <3FCB6A17.1000904@codesourcery.com>

James Crotinger wrote:
> Hi All,
>
> I thought that the various global (non-Pooma::) functions all had
> Pooma:: objects as arguments, which should usually be enough to avoid
> collisions with other people's stuff. What are the problem functions?
>
> I added namespace support to PETE a long time ago, but I believe it is
> an option on the generator program that is used to generate the operator
> files. Does CodeSourcery maintain the separate PETE repository? I don't
> think this stuff was ever part of the Pooma distribution - we just
> generated the operator includes and checked those in.

CodeSourcery does not maintain a PETE repository. We never had access
to the original CVS tree, and it has not undergone development during
the past few years.

> At any rate, we
> didn't put the Pooma operators in a namespace because, at the time, some
> of our compilers (probably most, in fact) didn't do Koenig lookup
> correctly.

-- 
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 2 17:17:20 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 2 Dec 2003 18:17:20 +0100 (CET)
Subject: [PATCH] Fix ReduceOverContexts
Message-ID:

Hi!

The following patch fixes ReduceOverContexts wrt "invalid" participants of
the reduction (I don't think I ever came across these, though...).

Seems obviously correct, but it isn't used extensively (only user of the
validity argument seems to be Reduction and the case of invalidity is not
covered in the testsuite).

The problem is that if the owning context has an invalid value, we're
calling the reduction operator on an uninitialized value. Ugh.

Ok?

Richard.

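The failure mode described above can be shown outside POOMA. What follows is a minimal standalone sketch, not the ReduceOverContexts implementation: the class name ValidityAwareReducer and its receive() interface are hypothetical, and the only assumption is that every contribution arrives together with a validity flag. The first valid contribution initializes the result; later valid ones are combined with the reduction operator, so the operator never sees a value that was not set.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a reduction target; this is not the POOMA class.
template <class T, class Op>
class ValidityAwareReducer
{
public:
  ValidityAwareReducer() : value_m(), valid_m(false), op_m() {}

  // Fold in one contribution. Invalid contributions are skipped. The first
  // valid contribution initializes the result instead of being combined
  // with a value that was never meaningfully set.
  void receive(const T &v, bool valid)
  {
    if (!valid)
      return;

    if (!valid_m)
      {
        value_m = v;
        valid_m = true;
      }
    else
      {
        value_m = op_m(value_m, v);
      }
  }

  // Whether at least one valid contribution has been received.
  bool valid() const { return valid_m; }

  // The reduced value; only meaningful once valid() is true.
  const T &value() const
  {
    assert(valid_m);
    return value_m;
  }

private:
  T value_m;     // The running result.
  bool valid_m;  // If it's valid.
  Op op_m;       // The reduction operator (assumed default-constructible).
};
```

With std::multiplies<int> as the operator, feeding an invalid contribution followed by the valid values 7 and 3 yields 21; a reducer that started from an unset value and combined unconditionally could return anything.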
===== ReduceOverContexts.h 1.2 vs edited =====
--- 1.2/r2/src/Tulip/ReduceOverContexts.h Thu Oct 23 14:41:05 2003
+++ edited/ReduceOverContexts.h Tue Dec 2 18:07:24 2003
@@ -34,7 +34,8 @@
 /** @file
  * @ingroup Tulip
  * @brief
- * Undocumented.
+ * ReduceOverContexts encapsulates functionality like MPI_Reduce
+ * and MPI_Allreduce by means of the ReduceOverContexts::broadcast() method.
  */
 
 #ifndef POOMA_CHEETAH_REDUCEOVERCONTEXTS_H
@@ -196,11 +197,8 @@
 #if POOMA_CHEETAH
 
   ReduceOverContexts(const T &val, int toContext = 0, bool valid = true)
-    : toContext_m(toContext)
+    : valid_m(false), toContext_m(toContext)
   {
-    if (valid)
-      value_m = val;
-
     int tagBase = tagBase_m;
     tagBase_m += Pooma::contexts();
 
@@ -267,9 +265,10 @@
   {
     if (v.valid())
       {
-        if (me->toReceive_m == Pooma::contexts())
+        if (!valid_m)
           {
             me->value_m = v.value();
+            me->valid_m = true;
           }
         else
           {
@@ -280,9 +279,13 @@
     me->toReceive_m--;
   }
 
-  // The actual value we're reducing.
+  // The actual value we're reducing.
 
   T value_m;
+
+  // If its valid.
+
+  bool valid_m;
 
   // The number of messages we're receiving.

From oldham at codesourcery.com Tue Dec 2 17:49:37 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Tue, 02 Dec 2003 09:49:37 -0800
Subject: [PATCH] Fix ReduceOverContexts
In-Reply-To:
References:
Message-ID: <3FCCD0B1.1010707@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> The following patch fixes ReduceOverContexts wrt "invalid" participants of
> the reduction (I don't think I ever came across these, though...).
>
> Seems obviously correct, but it isn't used extensively (only user of the
> validity argument seems to be Reduction and
> the case of invalidity is not covered in the testsuite).
>
> The problem is that if the owning context has an invalid value, we're
> calling the reduction operator on an uninitialized value. Ugh.
>
> Ok?

Yes. Please correct the spelling noted below.

> @@ -280,9 +279,13 @@
>      me->toReceive_m--;
>    }
>
> -  // The actual value we're reducing.
> +  // The actual value we're reducing.
>
>    T value_m;
> +
> +  // If its valid.

Will you please change the spelling to:

    If it's valid.

-- 
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 2 19:22:37 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 2 Dec 2003 20:22:37 +0100 (CET)
Subject: [pooma-dev] Re: [PATCH] Fix ReduceOverContexts
In-Reply-To: <3FCCD0B1.1010707@codesourcery.com>
References: <3FCCD0B1.1010707@codesourcery.com>
Message-ID:

On Tue, 2 Dec 2003, Jeffrey D. Oldham wrote:

> Richard Guenther wrote:
> > Hi!
> >
> > The following patch fixes ReduceOverContexts wrt "invalid" participants of
> > the reduction (I don't think I ever came across these, though...).
> >
> > Seems obviously correct, but it isn't used extensively (only user of the
> > validity argument seems to be Reduction and
> > the case of invalidity is not covered in the testsuite).
> >
> > The problem is that if the owning context has an invalid value, we're
> > calling the reduction operator on an uninitialized value. Ugh.
> >
> > Ok?
>
> Yes. Please correct the spelling noted below.

> >    T value_m;
> > +
> > +  // If its valid.
>
> Will you please change the spelling to:
>
>     If it's valid.

Ok, for the record, I committed the patch below which has an additional
typo fixed (s/valid_m/me->valid_m/) revealed during a compile check.

Richard.


2003Dec02 Richard Guenther

    * src/Tulip/ReduceOverContexts.h: handle case that reduction context
    has invalid value.

Index: ReduceOverContexts.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Tulip/ReduceOverContexts.h,v
retrieving revision 1.9
diff -u -u -r1.9 ReduceOverContexts.h
--- ReduceOverContexts.h 21 Oct 2003 18:47:59 -0000 1.9
+++ ReduceOverContexts.h 2 Dec 2003 19:14:00 -0000
@@ -34,7 +34,8 @@
 /** @file
  * @ingroup Tulip
  * @brief
- * Undocumented.
+ * ReduceOverContexts encapsulates functionality like MPI_Reduce
+ * and MPI_Allreduce by means of the ReduceOverContexts::broadcast() method.
  */
 
 #ifndef POOMA_CHEETAH_REDUCEOVERCONTEXTS_H
@@ -196,11 +197,8 @@
 #if POOMA_CHEETAH
 
   ReduceOverContexts(const T &val, int toContext = 0, bool valid = true)
-    : toContext_m(toContext)
+    : valid_m(false), toContext_m(toContext)
   {
-    if (valid)
-      value_m = val;
-
     int tagBase = tagBase_m;
     tagBase_m += Pooma::contexts();
 
@@ -267,9 +265,10 @@
   {
     if (v.valid())
       {
-        if (me->toReceive_m == Pooma::contexts())
+        if (!me->valid_m)
           {
             me->value_m = v.value();
+            me->valid_m = true;
           }
         else
           {
@@ -280,9 +279,13 @@
     me->toReceive_m--;
   }
 
-  // The actual value we're reducing.
+  // The actual value we're reducing.
 
   T value_m;
+
+  // If it's valid.
+
+  bool valid_m;
 
   // The number of messages we're receiving.

From rguenth at tat.physik.uni-tuebingen.de Wed Dec 3 20:50:35 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Wed, 3 Dec 2003 21:50:35 +0100 (CET)
Subject: [PATCH] Add missing methods to DomainLayout
Message-ID:

Hi!

For interoperability, the methods first(int) and blocks() need to be added
to DomainLayout. This also (unrelated) moves the touches() method out of
line.

Tested by being in my tree for a long time.

Ok?

Richard.


2003Dec03 Richard Guenther

    * src/Layout/DomainLayout.h: add first(int) and blocks(). Move
    touches() out of line.

Index: DomainLayout.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Layout/DomainLayout.h,v
retrieving revision 1.29
diff -u -u -r1.29 DomainLayout.h
--- DomainLayout.h 26 Oct 2003 11:28:11 -0000 1.29
+++ DomainLayout.h 3 Dec 2003 20:42:50 -0000
@@ -193,6 +193,10 @@
     return domain().initialized();
   }
 
+  // d'th component of the lower left of the inner domain.
+
+  inline int first(int d) const { return innerDomain()[d].first(); }
+
   // A reference to our node object
 
   inline Value_t &node()
@@ -205,6 +209,10 @@
     return node_m;
   }
 
+  // Number of blocks in each dimension.
+
+  inline Loc blocks() const { return Loc(1); }
+
   // Return the global domain.
 
   inline const Domain_t &domain() const
@@ -436,37 +444,7 @@
   // either pointers or objects.
 
   template
-  int touches(const OtherDomain &d, OutIter o, ConstructTag ctag) const
-  {
-    int i, count = 0;
-
-    // type of output elements
-
-    typedef typename IntersectReturnType::Type_t
-      OutDomain_t;
-    typedef Node OutNode_t;
-
-    // find the intersection of our domain and the given one
-
-    OutDomain_t outDomain = intersect(d, domain());
-
-    // add in touching domain if there is anything that intersects
-
-    if (!outDomain.empty())
-      {
-        ++count;
-        *o = touchesConstruct(outDomain,
-                              node().affinity(),
-                              node().context(),
-                              node().globalID(),
-                              node().localID(),
-                              ctag);
-      }
-
-    // return the number of non-empty domains we found
-
-    return count;
-  }
+  int touches(const OtherDomain &d, OutIter o, ConstructTag ctag) const;
 
   // Find local subdomains that touch on a given domain, and insert the
   // intersection of these subdomains into the given output iterator.
Return
@@ -535,6 +513,41 @@
   Value_t node_m;
 };
 
+
+template
+template
+int DomainLayout::touches(const OtherDomain &d, OutIter o,
+                          ConstructTag ctag) const
+{
+  int i, count = 0;
+
+  // type of output elements
+
+  typedef typename IntersectReturnType::Type_t
+    OutDomain_t;
+  typedef Node OutNode_t;
+
+  // find the intersection of our domain and the given one
+
+  OutDomain_t outDomain = intersect(d, domain());
+
+  // add in touching domain if there is anything that intersects
+
+  if (!outDomain.empty())
+    {
+      ++count;
+      *o = touchesConstruct(outDomain,
+                            node().affinity(),
+                            node().context(),
+                            node().globalID(),
+                            node().localID(),
+                            ctag);
+    }
+
+  // return the number of non-empty domains we found
+
+  return count;
+}
 
 template

From rguenth at tat.physik.uni-tuebingen.de Wed Dec 3 20:53:42 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Wed, 3 Dec 2003 21:53:42 +0100 (CET)
Subject: [PATCH] Move print() methods ool
Message-ID:

Hi!

The following patch moves print() methods out of the class bodies of
GridLayout and UniformGridLayout. It also moves out of line code to the
appropriate .cpp file instead of cluttering the header.

Ok?

Richard.

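For readers unfamiliar with the syntax, moving a templated print() member out of a class template's body might look roughly like the sketch below. This is an illustration under assumptions, not POOMA code: ToyLayout, its members, and the describe() helper are hypothetical names. The point it shows is that the out-of-line definition needs two template headers, one for the class and one for the member.

```cpp
#include <sstream>
#include <string>

// Hypothetical layout-like class; Dim and the members are illustrative only.
template <int Dim>
class ToyLayout
{
public:
  explicit ToyLayout(int id) : id_m(id) {}

  int ID() const { return id_m; }

  // Declaration only; the body is defined outside the class.
  template <class Ostream>
  void print(Ostream &ostr) const;

private:
  int id_m;
};

// Out-of-line definition: the first template header belongs to the class,
// the second to the member template.
template <int Dim>
template <class Ostream>
void ToyLayout<Dim>::print(Ostream &ostr) const
{
  ostr << "ToyLayout " << this->ID() << " with " << Dim
       << " dimensions" << '\n';
}

// Hypothetical helper that captures the printed text in a string.
template <int Dim>
std::string describe(const ToyLayout<Dim> &layout)
{
  std::ostringstream out;
  layout.print(out);
  return out.str();
}
```

Because these are templates, the definition still has to be visible to, or explicitly instantiated for, every translation unit that calls print(); how the .cpp files are made visible in the POOMA build is not stated in this message.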
Index: GridLayout.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Layout/GridLayout.cpp,v retrieving revision 1.89 diff -u -u -r1.89 GridLayout.cpp --- GridLayout.cpp 11 Mar 2003 21:30:44 -0000 1.89 +++ GridLayout.cpp 3 Dec 2003 20:45:42 -0000 @@ -97,6 +97,13 @@ } template +GridLayoutData::~GridLayoutData() +{ + for (typename List_t::iterator a = this->all_m.begin(); a != this->all_m.end(); ++a) + delete (*a); +} + +template template inline void GridLayoutData::initialize(const Grid &gdom, const Partitioner &gpar, @@ -1127,6 +1134,33 @@ return count; } +template +template +void GridLayoutData::print(Out & ostr) +{ + int i; + ostr << " hasInternalGuards_m, hasExternalGuards_m " + << this->hasInternalGuards_m << ' ' << this->hasExternalGuards_m + << "\n internalGuards_m "; + for (i=0; iinternalGuards_m.upper(i) << '-' + << this->internalGuards_m.lower(i) << ' '; + ostr << "\n externalGuards_m "; + for (i=0; iexternalGuards_m.upper(i) << '-' + << this->externalGuards_m.lower(i) << ' '; + ostr << '\n'; + FillIterator_t gstart = this->gcFillList_m.begin(); + FillIterator_t gend = this->gcFillList_m.end(); + ostr << " this->gcFillList_m\n"; + for(; gstart!=gend; ++gstart) + ostr << " " + << gstart->domain_m << ' ' + << gstart->ownedID_m << ' ' + << gstart->guardID_m << '\n'; + ostr << std::flush; +} + //============================================================ @@ -1732,6 +1766,47 @@ { this->pdata_m->initialize(gdom, gpar, cmap); } + +template +template +void GridLayout::print(Ostream &ostr) const +{ + ostr << "GridLayout " << this->ID() << " on global domain " + << this->domain() << ":" << '\n'; + ostr << " Total subdomains: " << this->sizeGlobal() << '\n'; + ostr << " Local subdomains: " << this->sizeLocal() << '\n'; + ostr << " Remote subdomains: " << this->sizeRemote() << '\n'; + ostr << " Grid blocks: " << this->blocks() << '\n'; + typename GridLayout::const_iterator a; + for (a = this->beginGlobal(); 
a != this->endGlobal(); ++a) + ostr << " Global subdomain = " << *a << '\n'; + for (a = this->beginLocal(); a != this->endLocal(); ++a) + ostr << " Local subdomain = " << *a << '\n'; + for (a = this->beginRemote(); a != this->endRemote(); ++a) + ostr << " Remote subdomain = " << *a << '\n'; + this->pdata_m->print(ostr); +} + +template +template +void GridLayoutView::print(Ostream &ostr) const +{ + ostr << "GridLayoutView " << this->ID() << " on global domain " + << this->domain() << ':' << '\n'; + ostr << " Base ID: " << this->baseID() << '\n'; + ostr << " Base domain: " << this->baseDomain() << '\n'; + ostr << " Total subdomains: " << this->sizeGlobal() << '\n'; + ostr << " Local subdomains: " << this->sizeLocal() << '\n'; + ostr << " Remote subdomains: " << this->sizeRemote() << '\n'; + const_iterator a; + for (a = this->beginGlobal(); a != this->endGlobal(); ++a) + ostr << " Global subdomain = " << *a << '\n'; + for (a = this->beginLocal(); a != this->endLocal(); ++a) + ostr << " Local subdomain = " << *a << '\n'; + for (a = this->beginRemote(); a != this->endRemote(); ++a) + ostr << " Remote subdomain = " << *a << '\n'; +} + // } // namespace POOMA Index: GridLayout.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Layout/GridLayout.h,v retrieving revision 1.110 diff -u -u -r1.110 GridLayout.h --- GridLayout.h 26 Oct 2003 11:28:11 -0000 1.110 +++ GridLayout.h 3 Dec 2003 20:45:44 -0000 @@ -207,11 +207,7 @@ /// which case we need to delete our nodes. The Observable destructor /// will broadcast messages up to all observers of the Layout. - ~GridLayoutData() - { - for (typename List_t::iterator a = this->all_m.begin(); a != this->all_m.end(); ++a) - delete (*a); - } + ~GridLayoutData(); //============================================================ // Mutators @@ -307,34 +303,13 @@ /// can build either pointers or objects. 
template - int touchesAlloc(const OtherDomain &fulld, OutIter o, - const ConstructTag &ctag) const; + int touchesAlloc(const OtherDomain &fulld, OutIter o, + const ConstructTag &ctag) const; void sync(); template - void print(Out & ostr) - { - int i; - ostr << " hasInternalGuards_m, hasExternalGuards_m " << - this->hasInternalGuards_m <<" " << this->hasExternalGuards_m <internalGuards_m.upper(i)<<"-"<internalGuards_m.lower(i)<<" "; - ostr <externalGuards_m.upper(i)<<"-"<externalGuards_m.lower(i)<<" "; - ostr <gcFillList_m.begin(); - FillIterator_t gend = this->gcFillList_m.end(); - ostr<< " this->gcFillList_m " <domain_m<<" " - <ownedID_m<<" " - <guardID_m< - void print(Ostream &ostr) const { - ostr << "GridLayout " << this->ID() << " on global domain " - << this->domain() << ":" << '\n'; - ostr << " Total subdomains: " << this->sizeGlobal() << '\n'; - ostr << " Local subdomains: " << this->sizeLocal() << '\n'; - ostr << " Remote subdomains: " << this->sizeRemote() << '\n'; - ostr << " Grid blocks: " << this->blocks() << '\n'; - typename GridLayout::const_iterator a; - for (a = this->beginGlobal(); a != this->endGlobal(); ++a) - ostr << " Global subdomain = " << *a << '\n'; - for (a = this->beginLocal(); a != this->endLocal(); ++a) - ostr << " Local subdomain = " << *a << '\n'; - for (a = this->beginRemote(); a != this->endRemote(); ++a) - ostr << " Remote subdomain = " << *a << '\n'; - this->pdata_m->print(ostr); - } + void print(Ostream &ostr) const; #if !POOMA_NO_TEMPLATE_FRIENDS @@ -1041,23 +1001,7 @@ // Print a GridLayoutView on an output stream template - void print(Ostream &ostr) const - { - ostr << "GridLayoutView " << this->ID() << " on global domain " - << this->domain() << ":" << '\n'; - ostr << " Base ID: " << this->baseID() << '\n'; - ostr << " Base domain: " << this->baseDomain() << '\n'; - ostr << " Total subdomains: " << this->sizeGlobal() << '\n'; - ostr << " Local subdomains: " << this->sizeLocal() << '\n'; - ostr << " Remote subdomains: " << 
this->sizeRemote() << '\n'; - const_iterator a; - for (a = this->beginGlobal(); a != this->endGlobal(); ++a) - ostr << " Global subdomain = " << *a << '\n'; - for (a = this->beginLocal(); a != this->endLocal(); ++a) - ostr << " Local subdomain = " << *a << '\n'; - for (a = this->beginRemote(); a != this->endRemote(); ++a) - ostr << " Remote subdomain = " << *a << '\n'; - } + void print(Ostream &ostr) const; #if !POOMA_NO_TEMPLATE_FRIENDS Index: UniformGridLayout.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Layout/UniformGridLayout.cpp,v retrieving revision 1.40 diff -u -u -r1.40 UniformGridLayout.cpp --- UniformGridLayout.cpp 11 Mar 2003 21:30:44 -0000 1.40 +++ UniformGridLayout.cpp 3 Dec 2003 20:45:49 -0000 @@ -279,7 +279,7 @@ //----------------------------------------------------------------------------- // // template -// void UniformGridLayout::calcGCFillList() +// void UniformGridLayoutData::calcGCFillList() // // Calculates the cached information needed by MultiPatch Engine to // fill the guard cells. @@ -1182,6 +1182,950 @@ // Return the number of non-empty domains we found. return count; +} + + +//============================================================================= +// UniformGridLayout & UniformGridLayoutData inline method definitions +//============================================================================= + +//----------------------------------------------------------------------------- +// +// Constructors and Initialize methods +// +//----------------------------------------------------------------------------- + +// See comments in class definition above. 
+ +template +inline UniformGridLayout:: +UniformGridLayout() +: LayoutBase > + (new LayoutData_t()), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const DistributedTag& t) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(), + DistributedMapper(UniformGridPartition()))), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const ReplicatedTag & t) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(), + LocalMapper())), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const GuardLayers_t &gcs, + const DistributedTag &) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(gcs), + DistributedMapper(UniformGridPartition(gcs)))), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const GuardLayers_t &gcs, + const ReplicatedTag & ) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(gcs), + LocalMapper())), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Loc &blocks, + const DistributedTag & ) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(blocks), + DistributedMapper( + UniformGridPartition(blocks)))), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Loc &blocks, + const ReplicatedTag & t) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(blocks), + LocalMapper())), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Loc &blocks, + const 
GuardLayers_t &igcs, + const DistributedTag &) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(blocks,igcs), + DistributedMapper( + UniformGridPartition(blocks,igcs)))), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Loc &blocks, + const GuardLayers_t &igcs, + const ReplicatedTag &) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(blocks,igcs), + LocalMapper())), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Loc &blocks, + const GuardLayers_t &igcs, + const GuardLayers_t &egcs, + const DistributedTag &) + +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(blocks,igcs,egcs), + DistributedMapper( + UniformGridPartition(blocks,igcs,egcs)))), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Loc &blocks, + const GuardLayers_t &igcs, + const GuardLayers_t &egcs, + const ReplicatedTag &t) +: LayoutBase > + (new LayoutData_t(gdom, + UniformGridPartition(blocks,igcs,egcs), + LocalMapper())), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Partitioner &gpar, + const DistributedTag & ) +: LayoutBase > + (new LayoutData_t(gdom,gpar,DistributedMapper(gpar))), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Partitioner &gpar, + const ReplicatedTag &) +: LayoutBase > + (new LayoutData_t(gdom,gpar,LocalMapper())), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +template +inline UniformGridLayout:: +UniformGridLayout(const Domain_t &gdom, + const Partitioner &gpar, + const ContextMapper & cmap) +: 
LayoutBase > + (new LayoutData_t(gdom,gpar,cmap)), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout:: +UniformGridLayout(const This_t &model) +: LayoutBase >(model.pdata_m), + Observable(*this) +{ + this->pdata_m->attach(*this); +} + +template +inline UniformGridLayout & UniformGridLayout:: +operator=(const This_t &model) +{ + if (this != &model) + { + this->pdata_m->detach(*this); + this->pdata_m = model.pdata_m; + this->pdata_m->attach(*this); + } + return *this; +} + +// Initialize methods... + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const DistributedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. + + this->pdata_m->domain_m = gdom; + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->partition(UniformGridPartition(), + DistributedMapper(UniformGridPartition())); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const ReplicatedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. + + this->pdata_m->domain_m = gdom; + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->partition(UniformGridPartition(), + LocalMapper()); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const GuardLayers_t &gcs, + const DistributedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->partition(UniformGridPartition(gcs), + DistributedMapper(UniformGridPartition(gcs) )); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const GuardLayers_t &gcs, + const ReplicatedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. 
+ this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->partition(UniformGridPartition(gcs), + LocalMapper()); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Loc &blocks, + const DistributedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->partition(UniformGridPartition(blocks), + DistributedMapper(UniformGridPartition(blocks))); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Loc &blocks, + const ReplicatedTag &) +{ + PAssert(!this->initialized()); + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->partition(UniformGridPartition(blocks), + LocalMapper()); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Loc &blocks, + const GuardLayers_t &gcs, + const DistributedTag &) +{ + PAssert(!this->initialized()); + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->partition(UniformGridPartition(blocks, gcs), + DistributedMapper( + UniformGridPartition(blocks, gcs))); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Loc &blocks, + const GuardLayers_t &gcs, + const ReplicatedTag &) +{ + PAssert(!this->initialized()); + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->partition(UniformGridPartition(blocks, gcs), + LocalMapper()); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Loc &blocks, + const GuardLayers_t &igcs, + const GuardLayers_t &egcs, + const DistributedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. 
+ this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->partition(UniformGridPartition(blocks, igcs, egcs), + DistributedMapper( + UniformGridPartition(blocks, igcs, egcs))); +} + +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Loc &blocks, + const GuardLayers_t &igcs, + const GuardLayers_t &egcs, + const ReplicatedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->blocks_m = blocks; + this->pdata_m->partition(UniformGridPartition(blocks, igcs, egcs), + LocalMapper()); +} + + +template +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Partitioner &p, + const DistributedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. + + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->blocks_m = p.blocks(); + this->pdata_m->partition(p,DistributedMapper(p)); +} + +template +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Partitioner &p, + const ReplicatedTag &) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. + + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->blocks_m = p.blocks(); + this->pdata_m->partition(p,LocalMapper()); +} +template +template +inline void +UniformGridLayout:: +initialize(const Domain_t &gdom, + const Partitioner &p, + const ContextMapper &cmap) +{ + PAssert(!this->initialized()); + + // Initialize our global domain, and then do the partitioning. 
+ + this->pdata_m->innerdomain_m = gdom; + this->pdata_m->domain_m = gdom; + this->pdata_m->blocks_m = p.blocks(); + this->pdata_m->partition(p,cmap); +} + +// This initializer is intented to be used by the I/O system + +template +void UniformGridLayout::initialize(const Domain_t& idom, + const List_t& nodes, + const Loc& blocks, + bool hasIG, bool hasEG, + const GuardLayers_t& ig, + const GuardLayers_t& eg) +{ + this->pdata_m->initialize(idom,nodes,blocks,hasIG,hasEG,ig,eg); +} + +// Here are the implementations for globalID: + +template +inline int +UniformGridLayoutData::globalID(const Loc &loc) const +{ + // Make sure the point is in our domain. + PAssert(contains(this->domain_m, loc)); + int currloc; + + if (!this->hasExternalGuards_m) + { + currloc = (loc[0].first() - this->firsti_m[0]) / blocksizes_m[0]; + for (int d = 1; d < Dim; ++d) + currloc += blockstride_m[d] * + ((loc[d].first() - this->firsti_m[d]) / blocksizes_m[d]); + } + else + { + currloc = 0; + for (int d = 0; d < Dim; ++d) + { + int l = loc[d].first(); + + // If l < this->firsti_m[0], currloc is unchanged. + + if (l >= this->firsti_m[d]) + { + if (l <= this->innerdomain_m[d].last()) + { + // The usual expression in this direction. + + currloc += blockstride_m[d] * + ((l - this->firsti_m[d]) / blocksizes_m[d]); + } + else + { + // Must be in the last block in this direction. + + currloc += blockstride_m[d] * allDomain_m[d].last(); + } + } + } + } + + // Return the globalID for the currloc's node + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + +template +inline int +UniformGridLayoutData::globalID(int i0) const +{ + PAssert(Dim == 1); + PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); + + // Compute fortran-order index from position in block grid + // See the Loc version for comments. 
+ + int currloc; + if (!this->hasExternalGuards_m) + { + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + } + else + { + currloc = 0; + if (i0 >= this->firsti_m[0]) { + if (i0 <= this->innerdomain_m[0].last()) + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + else + currloc = allDomain_m[0].last(); + } + } + + // Return the globalID for the currloc's node. + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + +template +inline int +UniformGridLayoutData::globalID(int i0, int i1) const +{ + PAssert(Dim == 2); + PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); + PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); + + // Compute fortran-order index from position in block grid + + int currloc; + if (!this->hasExternalGuards_m) + { + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] + + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); + } + else + { + currloc = 0; + if (i0 >= this->firsti_m[0]) { + if (i0 <= this->innerdomain_m[0].last()) + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + else + currloc = allDomain_m[0].last(); + } + if (i1 >= this->firsti_m[1]) { + if (i1 <= this->innerdomain_m[1].last()) + currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); + else + currloc += blockstride_m[1] * allDomain_m[1].last(); + } + } + + // Return the globalID for the currloc's node + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + +template +inline int +UniformGridLayoutData::globalID(int i0, int i1, int i2) const +{ + PAssert(Dim == 3); + PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); + PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); + PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); + + // Compute fortran-order index from position in block grid + + int currloc; + if (!this->hasExternalGuards_m) + { + currloc = (i0 - 
this->firsti_m[0]) / blocksizes_m[0] + + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) + + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); + } + else + { + currloc = 0; + if (i0 >= this->firsti_m[0]) { + if (i0 <= this->innerdomain_m[0].last()) + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + else + currloc = allDomain_m[0].last(); + } + if (i1 >= this->firsti_m[1]) { + if (i1 <= this->innerdomain_m[1].last()) + currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); + else + currloc += blockstride_m[1] * allDomain_m[1].last(); + } + if (i2 >= this->firsti_m[2]) { + if (i2 <= this->innerdomain_m[2].last()) + currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); + else + currloc += blockstride_m[2] * allDomain_m[2].last(); + } + } + + // Return the globalID for the currloc's node + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + +template +inline int +UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3) const +{ + PAssert(Dim == 4); + PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); + PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); + PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); + PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); + + // Compute fortran-order index from position in block grid + + int currloc; + if (!this->hasExternalGuards_m) + { + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] + + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) + + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) + + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); + } + else + { + currloc = 0; + if (i0 >= this->firsti_m[0]) { + if (i0 <= this->innerdomain_m[0].last()) + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + else + currloc = allDomain_m[0].last(); + } + if (i1 >= 
this->firsti_m[1]) { + if (i1 <= this->innerdomain_m[1].last()) + currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); + else + currloc += blockstride_m[1] * allDomain_m[1].last(); + } + if (i2 >= this->firsti_m[2]) { + if (i2 <= this->innerdomain_m[2].last()) + currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); + else + currloc += blockstride_m[2] * allDomain_m[2].last(); + } + if (i3 >= this->firsti_m[3]) { + if (i3 <= this->innerdomain_m[3].last()) + currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); + else + currloc += blockstride_m[3] * allDomain_m[3].last(); + } + } + + // Return the globalID for the currloc's node + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + +template +inline int +UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3, + int i4) const +{ + PAssert(Dim == 5); + PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); + PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); + PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); + PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); + PAssert(i4 >= this->domain_m[4].first() && i4 <= this->domain_m[4].last()); + + // Compute fortran-order index from position in block grid + + int currloc; + if (!this->hasExternalGuards_m) + { + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] + + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) + + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) + + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]) + + blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); + } + else + { + currloc = 0; + if (i0 >= this->firsti_m[0]) { + if (i0 <= this->innerdomain_m[0].last()) + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + else + currloc = allDomain_m[0].last(); + } + if (i1 >= this->firsti_m[1]) { + if (i1 <= 
this->innerdomain_m[1].last()) + currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); + else + currloc += blockstride_m[1] * allDomain_m[1].last(); + } + if (i2 >= this->firsti_m[2]) { + if (i2 <= this->innerdomain_m[2].last()) + currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); + else + currloc += blockstride_m[2] * allDomain_m[2].last(); + } + if (i3 >= this->firsti_m[3]) { + if (i3 <= this->innerdomain_m[3].last()) + currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); + else + currloc += blockstride_m[3] * allDomain_m[3].last(); + } + if (i4 >= this->firsti_m[4]) { + if (i4 <= this->innerdomain_m[4].last()) + currloc += blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); + else + currloc += blockstride_m[4] * allDomain_m[4].last(); + } + } + + // Return the globalID for the currloc's node + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + +template +inline int +UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3, + int i4, int i5) const +{ + PAssert(Dim == 6); + PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); + PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); + PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); + PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); + PAssert(i4 >= this->domain_m[4].first() && i4 <= this->domain_m[4].last()); + PAssert(i5 >= this->domain_m[5].first() && i5 <= this->domain_m[5].last()); + + // Compute fortran-order index from position in block grid + + int currloc; + if (!this->hasExternalGuards_m) + { + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] + + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) + + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) + + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]) + + blockstride_m[4] * ((i4 - this->firsti_m[4]) 
/ blocksizes_m[4]) + + blockstride_m[5] * ((i5 - this->firsti_m[5]) / blocksizes_m[5]); + } + else + { + currloc = 0; + if (i0 >= this->firsti_m[0]) { + if (i0 <= this->innerdomain_m[0].last()) + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + else + currloc = allDomain_m[0].last(); + } + if (i1 >= this->firsti_m[1]) { + if (i1 <= this->innerdomain_m[1].last()) + currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); + else + currloc += blockstride_m[1] * allDomain_m[1].last(); + } + if (i2 >= this->firsti_m[2]) { + if (i2 <= this->innerdomain_m[2].last()) + currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); + else + currloc += blockstride_m[2] * allDomain_m[2].last(); + } + if (i3 >= this->firsti_m[3]) { + if (i3 <= this->innerdomain_m[3].last()) + currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); + else + currloc += blockstride_m[3] * allDomain_m[3].last(); + } + if (i4 >= this->firsti_m[4]) { + if (i4 <= this->innerdomain_m[4].last()) + currloc += blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); + else + currloc += blockstride_m[4] * allDomain_m[4].last(); + } + if (i5 >= this->firsti_m[5]) { + if (i5 <= this->innerdomain_m[5].last()) + currloc += blockstride_m[5] * ((i5 - this->firsti_m[5]) / blocksizes_m[5]); + else + currloc += blockstride_m[5] * allDomain_m[5].last(); + } + } + + // Return the globalID for the currloc's node + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + +template +inline int +UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3, + int i4, int i5, int i6) const +{ + PAssert(Dim == 7); + PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); + PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); + PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); + PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); + PAssert(i4 
>= this->domain_m[4].first() && i4 <= this->domain_m[4].last()); + PAssert(i5 >= this->domain_m[5].first() && i5 <= this->domain_m[5].last()); + PAssert(i6 >= this->domain_m[6].first() && i6 <= this->domain_m[6].last()); + + // Compute fortran-order index from position in block grid + + int currloc; + if (!this->hasExternalGuards_m) + { + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] + + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) + + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) + + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]) + + blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]) + + blockstride_m[5] * ((i5 - this->firsti_m[5]) / blocksizes_m[5]) + + blockstride_m[6] * ((i6 - this->firsti_m[6]) / blocksizes_m[6]); + } + else + { + currloc = 0; + if (i0 >= this->firsti_m[0]) { + if (i0 <= this->innerdomain_m[0].last()) + currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; + else + currloc = allDomain_m[0].last(); + } + if (i1 >= this->firsti_m[1]) { + if (i1 <= this->innerdomain_m[1].last()) + currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); + else + currloc += blockstride_m[1] * allDomain_m[1].last(); + } + if (i2 >= this->firsti_m[2]) { + if (i2 <= this->innerdomain_m[2].last()) + currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); + else + currloc += blockstride_m[2] * allDomain_m[2].last(); + } + if (i3 >= this->firsti_m[3]) { + if (i3 <= this->innerdomain_m[3].last()) + currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); + else + currloc += blockstride_m[3] * allDomain_m[3].last(); + } + if (i4 >= this->firsti_m[4]) { + if (i4 <= this->innerdomain_m[4].last()) + currloc += blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); + else + currloc += blockstride_m[4] * allDomain_m[4].last(); + } + if (i5 >= this->firsti_m[5]) { + if (i5 <= this->innerdomain_m[5].last()) + currloc += blockstride_m[5] * 
((i5 - this->firsti_m[5]) / blocksizes_m[5]); + else + currloc += blockstride_m[5] * allDomain_m[5].last(); + } + if (i6 >= this->firsti_m[6]) { + if (i6 <= this->innerdomain_m[6].last()) + currloc += blockstride_m[6] * ((i6 - this->firsti_m[6]) / blocksizes_m[6]); + else + currloc += blockstride_m[6] * allDomain_m[6].last(); + } + } + + // Return the globalID for the currloc's node + + PAssert(currloc >= 0 && currloc < this->all_m.size()); + return currloc; +} + + +template +template +void UniformGridLayout::print(Ostream &ostr) const +{ + ostr << "UniformGridLayout " << this->ID() << " on global domain " + << this->domain() << ":" << '\n'; + ostr << " Total subdomains: " << this->sizeGlobal() << '\n'; + ostr << " Local subdomains: " << this->sizeLocal() << '\n'; + ostr << " Remote subdomains: " << this->sizeRemote() << '\n'; + ostr << " Grid blocks: " << this->blocks() << '\n'; + typename UniformGridLayout::const_iterator a; + for (a = this->beginGlobal(); a != this->endGlobal(); ++a) + ostr << " Global subdomain = " << *a << '\n'; + for (a = this->beginLocal(); a != this->endLocal(); ++a) + ostr << " Local subdomain = " << *a << '\n'; + for (a = this->beginRemote(); a != this->endRemote(); ++a) + ostr << " Remote subdomain = " << *a << '\n'; +} + +template +template +void UniformGridLayoutView::print(Ostream &ostr) const +{ + ostr << "UniformGridLayoutView " << this->ID() << " on global domain " + << this->domain() << ":" << '\n'; + ostr << " Base ID: " << this->baseID() << '\n'; + ostr << " Base domain: " << this->baseDomain() << '\n'; + ostr << " Total subdomains: " << this->sizeGlobal() << '\n'; + ostr << " Local subdomains: " << this->sizeLocal() << '\n'; + ostr << " Remote subdomains: " << this->sizeRemote() << '\n'; + const_iterator a; + for (a = this->beginGlobal(); a != this->endGlobal(); ++a) + ostr << " Global subdomain = " << *a << '\n'; + for (a = this->beginLocal(); a != this->endLocal(); ++a) + ostr << " Local subdomain = " << *a << '\n'; + for (a 
= this->beginRemote(); a != this->endRemote(); ++a)
+ ostr << " Remote subdomain = " << *a << '\n';
 }
Index: UniformGridLayout.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Layout/UniformGridLayout.h,v
retrieving revision 1.87
diff -u -u -r1.87 UniformGridLayout.h
--- UniformGridLayout.h 26 Oct 2003 11:28:11 -0000 1.87
+++ UniformGridLayout.h 3 Dec 2003 20:45:51 -0000
@@ -596,21 +596,7 @@
 // Print a UniformGridLayout on an output stream
 template
- void print(Ostream &ostr) const {
- ostr << "UniformGridLayout " << this->ID() << " on global domain "
- << this->domain() << ":" << '\n';
- ostr << " Total subdomains: " << this->sizeGlobal() << '\n';
- ostr << " Local subdomains: " << this->sizeLocal() << '\n';
- ostr << " Remote subdomains: " << this->sizeRemote() << '\n';
- ostr << " Grid blocks: " << this->blocks() << '\n';
- typename UniformGridLayout::const_iterator a;
- for (a = this->beginGlobal(); a != this->endGlobal(); ++a)
- ostr << " Global subdomain = " << *a << '\n';
- for (a = this->beginLocal(); a != this->endLocal(); ++a)
- ostr << " Local subdomain = " << *a << '\n';
- for (a = this->beginRemote(); a != this->endRemote(); ++a)
- ostr << " Remote subdomain = " << *a << '\n';
- }
+ void print(Ostream &ostr) const;
 #if !POOMA_NO_TEMPLATE_FRIENDS
@@ -881,23 +867,7 @@
 // Print a UniformGridLayoutView on an output stream
 template
- void print(Ostream &ostr) const
- {
- ostr << "UniformGridLayoutView " << this->ID() << " on global domain "
- << this->domain() << ":" << '\n';
- ostr << " Base ID: " << this->baseID() << '\n';
- ostr << " Base domain: " << this->baseDomain() << '\n';
- ostr << " Total subdomains: " << this->sizeGlobal() << '\n';
- ostr << " Local subdomains: " << this->sizeLocal() << '\n';
- ostr << " Remote subdomains: " << this->sizeRemote() << '\n';
- const_iterator a;
- for (a = this->beginGlobal(); a != this->endGlobal(); ++a)
- ostr << " Global subdomain = " << *a << '\n';
-
for (a = this->beginLocal(); a != this->endLocal(); ++a) - ostr << " Local subdomain = " << *a << '\n'; - for (a = this->beginRemote(); a != this->endRemote(); ++a) - ostr << " Remote subdomain = " << *a << '\n'; - } + void print(Ostream &ostr) const; #if !POOMA_NO_TEMPLATE_FRIENDS @@ -922,909 +892,6 @@ }; - -//============================================================================= -// UniformGridLayout & UniformGridLayoutData inline method definitions -//============================================================================= - -//----------------------------------------------------------------------------- -// -// Constructors and Initialize methods -// -//----------------------------------------------------------------------------- - -// See comments in class definition above. - -template -inline UniformGridLayout:: -UniformGridLayout() -: LayoutBase > - (new LayoutData_t()), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const DistributedTag& t) -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(), - DistributedMapper(UniformGridPartition()))), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const ReplicatedTag & t) -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(), - LocalMapper())), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const GuardLayers_t &gcs, - const DistributedTag &) -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(gcs), - DistributedMapper(UniformGridPartition(gcs)))), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const GuardLayers_t &gcs, - const ReplicatedTag & ) -: LayoutBase > - (new LayoutData_t(gdom, - 
UniformGridPartition(gcs), - LocalMapper())), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Loc &blocks, - const DistributedTag & ) -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(blocks), - DistributedMapper( - UniformGridPartition(blocks)))), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Loc &blocks, - const ReplicatedTag & t) -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(blocks), - LocalMapper())), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &igcs, - const DistributedTag &) -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(blocks,igcs), - DistributedMapper( - UniformGridPartition(blocks,igcs)))), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &igcs, - const ReplicatedTag &) -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(blocks,igcs), - LocalMapper())), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &igcs, - const GuardLayers_t &egcs, - const DistributedTag &) - -: LayoutBase > - (new LayoutData_t(gdom, - UniformGridPartition(blocks,igcs,egcs), - DistributedMapper( - UniformGridPartition(blocks,igcs,egcs)))), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &igcs, - const GuardLayers_t &egcs, - const ReplicatedTag &t) -: LayoutBase > - (new LayoutData_t(gdom, - 
UniformGridPartition(blocks,igcs,egcs), - LocalMapper())), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Partitioner &gpar, - const DistributedTag & ) -: LayoutBase > - (new LayoutData_t(gdom,gpar,DistributedMapper(gpar))), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Partitioner &gpar, - const ReplicatedTag &) -: LayoutBase > - (new LayoutData_t(gdom,gpar,LocalMapper())), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -template -inline UniformGridLayout:: -UniformGridLayout(const Domain_t &gdom, - const Partitioner &gpar, - const ContextMapper & cmap) -: LayoutBase > - (new LayoutData_t(gdom,gpar,cmap)), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout:: -UniformGridLayout(const This_t &model) -: LayoutBase >(model.pdata_m), - Observable(*this) -{ - this->pdata_m->attach(*this); -} - -template -inline UniformGridLayout & UniformGridLayout:: -operator=(const This_t &model) -{ - if (this != &model) - { - this->pdata_m->detach(*this); - this->pdata_m = model.pdata_m; - this->pdata_m->attach(*this); - } - return *this; -} - -// Initialize methods... - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const DistributedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. - - this->pdata_m->domain_m = gdom; - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->partition(UniformGridPartition(), - DistributedMapper(UniformGridPartition())); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const ReplicatedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. 
- - this->pdata_m->domain_m = gdom; - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->partition(UniformGridPartition(), - LocalMapper()); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const GuardLayers_t &gcs, - const DistributedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->partition(UniformGridPartition(gcs), - DistributedMapper(UniformGridPartition(gcs) )); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const GuardLayers_t &gcs, - const ReplicatedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->partition(UniformGridPartition(gcs), - LocalMapper()); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Loc &blocks, - const DistributedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. 
- this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->partition(UniformGridPartition(blocks), - DistributedMapper(UniformGridPartition(blocks))); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Loc &blocks, - const ReplicatedTag &) -{ - PAssert(!this->initialized()); - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->partition(UniformGridPartition(blocks), - LocalMapper()); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &gcs, - const DistributedTag &) -{ - PAssert(!this->initialized()); - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->partition(UniformGridPartition(blocks, gcs), - DistributedMapper( - UniformGridPartition(blocks, gcs))); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &gcs, - const ReplicatedTag &) -{ - PAssert(!this->initialized()); - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->partition(UniformGridPartition(blocks, gcs), - LocalMapper()); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &igcs, - const GuardLayers_t &egcs, - const DistributedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. 
- this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->partition(UniformGridPartition(blocks, igcs, egcs), - DistributedMapper( - UniformGridPartition(blocks, igcs, egcs))); -} - -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Loc &blocks, - const GuardLayers_t &igcs, - const GuardLayers_t &egcs, - const ReplicatedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->blocks_m = blocks; - this->pdata_m->partition(UniformGridPartition(blocks, igcs, egcs), - LocalMapper()); -} - - -template -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Partitioner &p, - const DistributedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. - - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->blocks_m = p.blocks(); - this->pdata_m->partition(p,DistributedMapper(p)); -} - -template -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Partitioner &p, - const ReplicatedTag &) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. - - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->blocks_m = p.blocks(); - this->pdata_m->partition(p,LocalMapper()); -} -template -template -inline void -UniformGridLayout:: -initialize(const Domain_t &gdom, - const Partitioner &p, - const ContextMapper &cmap) -{ - PAssert(!this->initialized()); - - // Initialize our global domain, and then do the partitioning. 
- - this->pdata_m->innerdomain_m = gdom; - this->pdata_m->domain_m = gdom; - this->pdata_m->blocks_m = p.blocks(); - this->pdata_m->partition(p,cmap); -} - -// This initializer is intented to be used by the I/O system - -template -void UniformGridLayout::initialize(const Domain_t& idom, - const List_t& nodes, - const Loc& blocks, - bool hasIG, bool hasEG, - const GuardLayers_t& ig, - const GuardLayers_t& eg) -{ - this->pdata_m->initialize(idom,nodes,blocks,hasIG,hasEG,ig,eg); -} - -// Here are the implementations for globalID: - -template -inline int -UniformGridLayoutData::globalID(const Loc &loc) const -{ - // Make sure the point is in our domain. - PAssert(contains(this->domain_m, loc)); - int currloc; - - if (!this->hasExternalGuards_m) - { - currloc = (loc[0].first() - this->firsti_m[0]) / blocksizes_m[0]; - for (int d = 1; d < Dim; ++d) - currloc += blockstride_m[d] * - ((loc[d].first() - this->firsti_m[d]) / blocksizes_m[d]); - } - else - { - currloc = 0; - for (int d = 0; d < Dim; ++d) - { - int l = loc[d].first(); - - // If l < this->firsti_m[0], currloc is unchanged. - - if (l >= this->firsti_m[d]) - { - if (l <= this->innerdomain_m[d].last()) - { - // The usual expression in this direction. - - currloc += blockstride_m[d] * - ((l - this->firsti_m[d]) / blocksizes_m[d]); - } - else - { - // Must be in the last block in this direction. - - currloc += blockstride_m[d] * allDomain_m[d].last(); - } - } - } - } - - // Return the globalID for the currloc's node - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} - -template -inline int -UniformGridLayoutData::globalID(int i0) const -{ - PAssert(Dim == 1); - PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); - - // Compute fortran-order index from position in block grid - // See the Loc version for comments. 
- - int currloc; - if (!this->hasExternalGuards_m) - { - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - } - else - { - currloc = 0; - if (i0 >= this->firsti_m[0]) { - if (i0 <= this->innerdomain_m[0].last()) - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - else - currloc = allDomain_m[0].last(); - } - } - - // Return the globalID for the currloc's node. - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} - -template -inline int -UniformGridLayoutData::globalID(int i0, int i1) const -{ - PAssert(Dim == 2); - PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); - PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); - - // Compute fortran-order index from position in block grid - - int currloc; - if (!this->hasExternalGuards_m) - { - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] - + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); - } - else - { - currloc = 0; - if (i0 >= this->firsti_m[0]) { - if (i0 <= this->innerdomain_m[0].last()) - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - else - currloc = allDomain_m[0].last(); - } - if (i1 >= this->firsti_m[1]) { - if (i1 <= this->innerdomain_m[1].last()) - currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); - else - currloc += blockstride_m[1] * allDomain_m[1].last(); - } - } - - // Return the globalID for the currloc's node - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} - -template -inline int -UniformGridLayoutData::globalID(int i0, int i1, int i2) const -{ - PAssert(Dim == 3); - PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); - PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); - PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); - - // Compute fortran-order index from position in block grid - - int currloc; - if (!this->hasExternalGuards_m) - { - currloc = (i0 - 
this->firsti_m[0]) / blocksizes_m[0] - + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) - + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); - } - else - { - currloc = 0; - if (i0 >= this->firsti_m[0]) { - if (i0 <= this->innerdomain_m[0].last()) - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - else - currloc = allDomain_m[0].last(); - } - if (i1 >= this->firsti_m[1]) { - if (i1 <= this->innerdomain_m[1].last()) - currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); - else - currloc += blockstride_m[1] * allDomain_m[1].last(); - } - if (i2 >= this->firsti_m[2]) { - if (i2 <= this->innerdomain_m[2].last()) - currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); - else - currloc += blockstride_m[2] * allDomain_m[2].last(); - } - } - - // Return the globalID for the currloc's node - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} - -template -inline int -UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3) const -{ - PAssert(Dim == 4); - PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); - PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); - PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); - PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); - - // Compute fortran-order index from position in block grid - - int currloc; - if (!this->hasExternalGuards_m) - { - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] - + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) - + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) - + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); - } - else - { - currloc = 0; - if (i0 >= this->firsti_m[0]) { - if (i0 <= this->innerdomain_m[0].last()) - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - else - currloc = allDomain_m[0].last(); - } - if (i1 >= 
this->firsti_m[1]) { - if (i1 <= this->innerdomain_m[1].last()) - currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); - else - currloc += blockstride_m[1] * allDomain_m[1].last(); - } - if (i2 >= this->firsti_m[2]) { - if (i2 <= this->innerdomain_m[2].last()) - currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); - else - currloc += blockstride_m[2] * allDomain_m[2].last(); - } - if (i3 >= this->firsti_m[3]) { - if (i3 <= this->innerdomain_m[3].last()) - currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); - else - currloc += blockstride_m[3] * allDomain_m[3].last(); - } - } - - // Return the globalID for the currloc's node - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} - -template -inline int -UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3, - int i4) const -{ - PAssert(Dim == 5); - PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); - PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); - PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); - PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); - PAssert(i4 >= this->domain_m[4].first() && i4 <= this->domain_m[4].last()); - - // Compute fortran-order index from position in block grid - - int currloc; - if (!this->hasExternalGuards_m) - { - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] - + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) - + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) - + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]) - + blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); - } - else - { - currloc = 0; - if (i0 >= this->firsti_m[0]) { - if (i0 <= this->innerdomain_m[0].last()) - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - else - currloc = allDomain_m[0].last(); - } - if (i1 >= this->firsti_m[1]) { - if (i1 <= 
this->innerdomain_m[1].last()) - currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); - else - currloc += blockstride_m[1] * allDomain_m[1].last(); - } - if (i2 >= this->firsti_m[2]) { - if (i2 <= this->innerdomain_m[2].last()) - currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); - else - currloc += blockstride_m[2] * allDomain_m[2].last(); - } - if (i3 >= this->firsti_m[3]) { - if (i3 <= this->innerdomain_m[3].last()) - currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); - else - currloc += blockstride_m[3] * allDomain_m[3].last(); - } - if (i4 >= this->firsti_m[4]) { - if (i4 <= this->innerdomain_m[4].last()) - currloc += blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); - else - currloc += blockstride_m[4] * allDomain_m[4].last(); - } - } - - // Return the globalID for the currloc's node - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} - -template -inline int -UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3, - int i4, int i5) const -{ - PAssert(Dim == 6); - PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); - PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); - PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); - PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); - PAssert(i4 >= this->domain_m[4].first() && i4 <= this->domain_m[4].last()); - PAssert(i5 >= this->domain_m[5].first() && i5 <= this->domain_m[5].last()); - - // Compute fortran-order index from position in block grid - - int currloc; - if (!this->hasExternalGuards_m) - { - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] - + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) - + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) - + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]) - + blockstride_m[4] * ((i4 - this->firsti_m[4]) 
/ blocksizes_m[4]) - + blockstride_m[5] * ((i5 - this->firsti_m[5]) / blocksizes_m[5]); - } - else - { - currloc = 0; - if (i0 >= this->firsti_m[0]) { - if (i0 <= this->innerdomain_m[0].last()) - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - else - currloc = allDomain_m[0].last(); - } - if (i1 >= this->firsti_m[1]) { - if (i1 <= this->innerdomain_m[1].last()) - currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); - else - currloc += blockstride_m[1] * allDomain_m[1].last(); - } - if (i2 >= this->firsti_m[2]) { - if (i2 <= this->innerdomain_m[2].last()) - currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); - else - currloc += blockstride_m[2] * allDomain_m[2].last(); - } - if (i3 >= this->firsti_m[3]) { - if (i3 <= this->innerdomain_m[3].last()) - currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); - else - currloc += blockstride_m[3] * allDomain_m[3].last(); - } - if (i4 >= this->firsti_m[4]) { - if (i4 <= this->innerdomain_m[4].last()) - currloc += blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); - else - currloc += blockstride_m[4] * allDomain_m[4].last(); - } - if (i5 >= this->firsti_m[5]) { - if (i5 <= this->innerdomain_m[5].last()) - currloc += blockstride_m[5] * ((i5 - this->firsti_m[5]) / blocksizes_m[5]); - else - currloc += blockstride_m[5] * allDomain_m[5].last(); - } - } - - // Return the globalID for the currloc's node - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} - -template -inline int -UniformGridLayoutData::globalID(int i0, int i1, int i2, int i3, - int i4, int i5, int i6) const -{ - PAssert(Dim == 7); - PAssert(i0 >= this->domain_m[0].first() && i0 <= this->domain_m[0].last()); - PAssert(i1 >= this->domain_m[1].first() && i1 <= this->domain_m[1].last()); - PAssert(i2 >= this->domain_m[2].first() && i2 <= this->domain_m[2].last()); - PAssert(i3 >= this->domain_m[3].first() && i3 <= this->domain_m[3].last()); - PAssert(i4 
>= this->domain_m[4].first() && i4 <= this->domain_m[4].last()); - PAssert(i5 >= this->domain_m[5].first() && i5 <= this->domain_m[5].last()); - PAssert(i6 >= this->domain_m[6].first() && i6 <= this->domain_m[6].last()); - - // Compute fortran-order index from position in block grid - - int currloc; - if (!this->hasExternalGuards_m) - { - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0] - + blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]) - + blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]) - + blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]) - + blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]) - + blockstride_m[5] * ((i5 - this->firsti_m[5]) / blocksizes_m[5]) - + blockstride_m[6] * ((i6 - this->firsti_m[6]) / blocksizes_m[6]); - } - else - { - currloc = 0; - if (i0 >= this->firsti_m[0]) { - if (i0 <= this->innerdomain_m[0].last()) - currloc = (i0 - this->firsti_m[0]) / blocksizes_m[0]; - else - currloc = allDomain_m[0].last(); - } - if (i1 >= this->firsti_m[1]) { - if (i1 <= this->innerdomain_m[1].last()) - currloc += blockstride_m[1] * ((i1 - this->firsti_m[1]) / blocksizes_m[1]); - else - currloc += blockstride_m[1] * allDomain_m[1].last(); - } - if (i2 >= this->firsti_m[2]) { - if (i2 <= this->innerdomain_m[2].last()) - currloc += blockstride_m[2] * ((i2 - this->firsti_m[2]) / blocksizes_m[2]); - else - currloc += blockstride_m[2] * allDomain_m[2].last(); - } - if (i3 >= this->firsti_m[3]) { - if (i3 <= this->innerdomain_m[3].last()) - currloc += blockstride_m[3] * ((i3 - this->firsti_m[3]) / blocksizes_m[3]); - else - currloc += blockstride_m[3] * allDomain_m[3].last(); - } - if (i4 >= this->firsti_m[4]) { - if (i4 <= this->innerdomain_m[4].last()) - currloc += blockstride_m[4] * ((i4 - this->firsti_m[4]) / blocksizes_m[4]); - else - currloc += blockstride_m[4] * allDomain_m[4].last(); - } - if (i5 >= this->firsti_m[5]) { - if (i5 <= this->innerdomain_m[5].last()) - currloc += blockstride_m[5] * 
((i5 - this->firsti_m[5]) / blocksizes_m[5]); - else - currloc += blockstride_m[5] * allDomain_m[5].last(); - } - if (i6 >= this->firsti_m[6]) { - if (i6 <= this->innerdomain_m[6].last()) - currloc += blockstride_m[6] * ((i6 - this->firsti_m[6]) / blocksizes_m[6]); - else - currloc += blockstride_m[6] * allDomain_m[6].last(); - } - } - - // Return the globalID for the currloc's node - - PAssert(currloc >= 0 && currloc < this->all_m.size()); - return currloc; -} //============================================================ // NewDomain1 traits classes for UniformGridLayout and From rguenth at tat.physik.uni-tuebingen.de Wed Dec 3 20:57:06 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 3 Dec 2003 21:57:06 +0100 (CET) Subject: [PATCH] Obvious stuff Message-ID: Hi! This is the last one... fix a typo in the toplevel Makefile and remove a bogus #include in an example. Ok? Richard. Index: makefile =================================================================== RCS file: /home/pooma/Repository/r2/makefile,v retrieving revision 1.11 diff -u -u -r1.11 makefile --- makefile 28 Dec 2002 17:54:43 -0000 1.11 +++ makefile 3 Dec 2003 20:50:12 -0000 @@ -95,7 +95,7 @@ .PHONY: examples examplesclean $(EXAMPLEDIRS) examplesclean:: - @for i in $(EXAMPLESDIRS); do pushd $$i; make cleansuite; popd; done + @for i in $(EXAMPLEDIRS); do pushd $$i; make cleansuite; popd; done @$(MAKE) dirs examples:: $(EXAMPLEDIRS) Index: examples/Field/Caramana/ComputeVolumes.h =================================================================== RCS file: /home/pooma/Repository/r2/examples/Field/Caramana/ComputeVolumes.h,v retrieving revision 1.1 diff -u -u -r1.1 ComputeVolumes.h --- examples/Field/Caramana/ComputeVolumes.h 30 Aug 2001 01:14:23 -0000 1.1 +++ examples/Field/Caramana/ComputeVolumes.h 3 Dec 2003 20:50:32 -0000 @@ -38,7 +38,6 @@ //----------------------------------------------------------------------------- #include "Field/Field.h" -#include 
"Field/FieldInitializers.h" #include "Field/DiffOps/FieldStencil.h" #include "Product.h" From oldham at codesourcery.com Wed Dec 3 22:09:21 2003 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Wed, 03 Dec 2003 14:09:21 -0800 Subject: [PATCH] Obvious stuff In-Reply-To: References: Message-ID: <3FCE5F11.60601@codesourcery.com> Richard Guenther wrote: > Hi! > > This is the last one... fix a typo in the toplevel Makefile and remove a > bogous #include in an example. > > Ok? Yes. Thank you. > Richard. > > Index: makefile > =================================================================== > RCS file: /home/pooma/Repository/r2/makefile,v > retrieving revision 1.11 > diff -u -u -r1.11 makefile > --- makefile 28 Dec 2002 17:54:43 -0000 1.11 > +++ makefile 3 Dec 2003 20:50:12 -0000 > @@ -95,7 +95,7 @@ > .PHONY: examples examplesclean $(EXAMPLEDIRS) > > examplesclean:: > - @for i in $(EXAMPLESDIRS); do pushd $$i; make cleansuite; popd; done > + @for i in $(EXAMPLEDIRS); do pushd $$i; make cleansuite; popd; done > @$(MAKE) dirs > > examples:: $(EXAMPLEDIRS) > Index: examples/Field/Caramana/ComputeVolumes.h > =================================================================== > RCS file: /home/pooma/Repository/r2/examples/Field/Caramana/ComputeVolumes.h,v > retrieving revision 1.1 > diff -u -u -r1.1 ComputeVolumes.h > --- examples/Field/Caramana/ComputeVolumes.h 30 Aug 2001 01:14:23 -0000 1.1 > +++ examples/Field/Caramana/ComputeVolumes.h 3 Dec 2003 20:50:32 -0000 > @@ -38,7 +38,6 @@ > //----------------------------------------------------------------------------- > > #include "Field/Field.h" > -#include "Field/FieldInitializers.h" > #include "Field/DiffOps/FieldStencil.h" > #include "Product.h" > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Wed Dec 3 23:04:25 2003 From: oldham at codesourcery.com (Jeffrey D. 
Oldham) Date: Wed, 03 Dec 2003 15:04:25 -0800 Subject: [PATCH] Add missing methods to DomainLayout In-Reply-To: References: Message-ID: <3FCE6BF9.6000603@codesourcery.com> Richard Guenther wrote: > Hi! > > For interoperability, the methods first(int) and blocks() need to be added > to DomainLayout. This also (unrelated) moves the touches() method out of > line. > > Tested by being in my tree for a long time. > > Ok? Yes. > Richard. > > > 2003Dec03 Richard Guenther > > * src/Layout/DomainLayout.h: add first(int) and blocks(). > Move touches() out of line. > > Index: DomainLayout.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Layout/DomainLayout.h,v > retrieving revision 1.29 > diff -u -u -r1.29 DomainLayout.h > --- DomainLayout.h 26 Oct 2003 11:28:11 -0000 1.29 > +++ DomainLayout.h 3 Dec 2003 20:42:50 -0000 > @@ -193,6 +193,10 @@ > return domain().initialized(); > } > > + // d'th component of the lower left of the inner domain. > + > + inline int first(int d) const { return innerDomain()[d].first(); } > + > // A reference to our node object > > inline Value_t &node() > @@ -205,6 +209,10 @@ > return node_m; > } > > + // Number of blocks in each dimension. > + > + inline Loc blocks() const { return Loc(1); } > + > // Return the global domain. > > inline const Domain_t &domain() const > @@ -436,37 +444,7 @@ > // either pointers or objects. 
> > template > - int touches(const OtherDomain &d, OutIter o, ConstructTag ctag) const > - { > - int i, count = 0; > - > - // type of output elements > - > - typedef typename IntersectReturnType::Type_t > - OutDomain_t; > - typedef Node OutNode_t; > - > - // find the intersection of our domain and the given one > - > - OutDomain_t outDomain = intersect(d, domain()); > - > - // add in touching domain if there is anything that intersects > - > - if (!outDomain.empty()) > - { > - ++count; > - *o = touchesConstruct(outDomain, > - node().affinity(), > - node().context(), > - node().globalID(), > - node().localID(), > - ctag); > - } > - > - // return the number of non-empty domains we found > - > - return count; > - } > + int touches(const OtherDomain &d, OutIter o, ConstructTag ctag) const; > > // Find local subdomains that touch on a given domain, and insert the > // intersection of these subdomains into the given output iterator. Return > @@ -535,6 +513,41 @@ > > Value_t node_m; > }; > + > +template > +template > +int DomainLayout::touches(const OtherDomain &d, OutIter o, > + ConstructTag ctag) const > +{ > + int i, count = 0; > + > + // type of output elements > + > + typedef typename IntersectReturnType::Type_t > + OutDomain_t; > + typedef Node OutNode_t; > + > + // find the intersection of our domain and the given one > + > + OutDomain_t outDomain = intersect(d, domain()); > + > + // add in touching domain if there is anything that intersects > + > + if (!outDomain.empty()) > + { > + ++count; > + *o = touchesConstruct(outDomain, > + node().affinity(), > + node().context(), > + node().globalID(), > + node().localID(), > + ctag); > + } > + > + // return the number of non-empty domains we found > + > + return count; > +} > > > template -- Jeffrey D. 
Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Wed Dec 3 20:47:11 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 3 Dec 2003 21:47:11 +0100 (CET) Subject: [PATCH] Guard Vector constructors Message-ID: Hi! I just had a sweep over my local changes and found this (and others, see following mails). It guards the Vector constructors by dimensionality checks. I suppose this bit me at some point. Checked by being in my tree for a long time (don't know if that counts, though ;)). Ok? Richard. Index: Vector.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Tiny/Vector.h,v retrieving revision 1.31 diff -u -u -r1.31 Vector.h --- Vector.h 21 Oct 2003 19:50:04 -0000 1.31 +++ Vector.h 3 Dec 2003 20:39:37 -0000 @@ -291,12 +291,14 @@ template inline VectorEngine(const X1& x, const X2& y) { + CTAssert(D == 2); x_m[0] = x; x_m[1] = y; } template inline VectorEngine(const X1& x, const X2& y, const X3& z) { + CTAssert(D == 3); x_m[0] = x; x_m[1] = y; x_m[2] = z; @@ -305,6 +307,7 @@ inline VectorEngine(const X1& x, const X2& y, const X3& z, const X4& a) { + CTAssert(D == 4); x_m[0] = x; x_m[1] = y; x_m[2] = z; From oldham at codesourcery.com Thu Dec 4 02:43:15 2003 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Wed, 03 Dec 2003 18:43:15 -0800 Subject: [PATCH] Guard Vector constructors In-Reply-To: References: Message-ID: <3FCE9F43.9080501@codesourcery.com> Richard Guenther wrote: > Hi! > > I just had a sweep over my local changes and found this (and others, see > following mails). It guards the Vector constructors by dimensionality > checks. I suppose this bit me at some point. > > Checked by being in my tree for a long time (don't know if that counts, > though ;)). > > Ok? OK. We'll watch for regressions in the nightly POOMA tests. Would you be willing to also send a ChangeLog entry with your patches? It will facilitate reviewing the patches. > Richard. 
> > Index: Vector.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Tiny/Vector.h,v > retrieving revision 1.31 > diff -u -u -r1.31 Vector.h > --- Vector.h 21 Oct 2003 19:50:04 -0000 1.31 > +++ Vector.h 3 Dec 2003 20:39:37 -0000 > @@ -291,12 +291,14 @@ > template > inline VectorEngine(const X1& x, const X2& y) > { > + CTAssert(D == 2); > x_m[0] = x; > x_m[1] = y; > } > template > inline VectorEngine(const X1& x, const X2& y, const X3& z) > { > + CTAssert(D == 3); > x_m[0] = x; > x_m[1] = y; > x_m[2] = z; > @@ -305,6 +307,7 @@ > inline VectorEngine(const X1& x, const X2& y, const X3& z, > const X4& a) > { > + CTAssert(D == 4); > x_m[0] = x; > x_m[1] = y; > x_m[2] = z; -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Mon Dec 8 12:31:12 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 8 Dec 2003 13:31:12 +0100 (CET) Subject: [PATCH] Fix and test (unused) CollectFromContexts Message-ID: Hi! This patch fixes bugs in CollectFromContexts and adds a testcase for it. It's currently unused, but see the next patch. Ok? Richard. 2003Dec08 Richard Guenther * src/Tulip/CollectFromContexts.h: pack/unpack CollectionValue correctly, cleanup object, if valid. * src/Tulip/tests/CollectFromContextsTest.cpp: new. * src/Tulip/tests/makefile: add CollectFromContextsTest. ===== CollectFromContexts.h 1.3 vs edited ===== --- 1.3/r2/src/Tulip/CollectFromContexts.h Wed Dec 3 12:30:45 2003 +++ edited/CollectFromContexts.h Mon Dec 8 13:21:43 2003 @@ -34,7 +34,7 @@ /** @file * @ingroup Tulip * @brief - * Undocumented. + * CollectFromContext encapsulates functionality like MPI_Gather. 
*/ #ifndef POOMA_MESSAGING_COLLECTFROMCONTEXTS_H @@ -136,7 +136,7 @@ static inline int pack(const CollectionValue &v, char *buffer) { int nBytes = Serialize::pack(v.valid(), buffer); - nBytes += Serialize::pack(v.context(), buffer); + nBytes += Serialize::pack(v.context(), buffer + nBytes); if (v.valid()) { @@ -154,7 +154,7 @@ int nBytes = Serialize::unpack(pvalid, buffer); - nBytes += Serialize::unpack(pcon, buffer); + nBytes += Serialize::unpack(pcon, buffer + nBytes); if (*pvalid) { @@ -163,6 +163,9 @@ vp = new CollectionValue(*pvalid, *pval, *pcon); + if (*pvalid) + Serialize::cleanup(pval); + return nBytes; } --- /dev/null Fri Mar 14 14:07:09 2003 +++ tests/CollectFromContextsTest.cpp Mon Dec 8 12:50:01 2003 @@ -0,0 +1,82 @@ +// -*- C++ -*- +// ACL:license +// ---------------------------------------------------------------------- +// This software and ancillary information (herein called "SOFTWARE") +// called POOMA (Parallel Object-Oriented Methods and Applications) is +// made available under the terms described here. The SOFTWARE has been +// approved for release with associated LA-CC Number LA-CC-98-65. +// +// Unless otherwise indicated, this SOFTWARE has been authored by an +// employee or employees of the University of California, operator of the +// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with +// the U.S. Department of Energy. The U.S. Government has rights to use, +// reproduce, and distribute this SOFTWARE. The public may copy, distribute, +// prepare derivative works and publicly display this SOFTWARE without +// charge, provided that this Notice and any statement of authorship are +// reproduced on all copies. Neither the Government nor the University +// makes any warranty, express or implied, or assumes any liability or +// responsibility for the use of this SOFTWARE. 
+// +// If SOFTWARE is modified to produce derivative works, such modified +// SOFTWARE should be clearly marked, so as not to confuse it with the +// version available from LANL. +// +// For more information about POOMA, send e-mail to pooma at acl.lanl.gov, +// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. +// ---------------------------------------------------------------------- +// ACL:license + +//----------------------------------------------------------------------------- +// Test of PatchSizeSyncer +//----------------------------------------------------------------------------- + +// Include files + +#include "Tulip/Messaging.h" +#include "Tulip/CollectFromContexts.h" +#include "Pooma/Pooma.h" +#include "Utilities/Tester.h" + + +int main(int argc, char *argv[]) +{ + Pooma::initialize(argc, argv); + Pooma::Tester tester(argc, argv); + + const int numContexts = Pooma::contexts(); + const int myContext = Pooma::context(); + + tester.out() << "Running with " << numContexts << " contexts." << std::endl; + + CollectFromContexts ranks(2*(Pooma::context()+1)); + if (Pooma::context() == 0) { + bool check = true; + for (int i=0; i ranks2(Pooma::context()+1, 0, + Pooma::context() > 0 + && Pooma::context() < Pooma::contexts()-1); + if (Pooma::context() == 0) { + bool check = true; + for (int i=1; i Hi! This patch makes use of CollectFromContexts inside PatchSizeSyncer::calcGlobalGrid(), instead of handcrafting a Cheetah based implementation. This reduces explicit Cheetah dependence to fewer places (to aid adding a native MPI implementation). Tested with a native MPI implementation. Ok? Richard. 2003Dec08 Richard Guenther * src/Tulip/PatchSizeSyncer.cmpl.cpp: use CollectFromContexts for gather operation. 
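For readers skimming the archive, the gather that replaces the hand-written Cheetah messaging can be pictured roughly as below. This is only a sketch under stated assumptions: `Grid`, `Elem`, `gatherOnRoot` and `sortedGridList` are hypothetical stand-ins rather than POOMA names, and the gather is simulated inside one process instead of moving data between contexts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not POOMA types: a grid is reduced to a list of
// patch boundaries, and every context contributes one (key, grid) pair.
typedef std::vector<int> Grid;
typedef std::pair<int, Grid> Elem;

// Orders gathered entries by key only.
struct ElemCompare
{
  bool operator()(const Elem &a, const Elem &b) const
  {
    return a.first < b.first;
  }
};

// Simulated gather: slot c of the argument is what context c contributed.
// In a real run this step is where data would cross process boundaries.
std::vector<Elem> gatherOnRoot(const std::vector<Elem> &contributions)
{
  return contributions;
}

// What the root context does after the gather: copy every slot into its
// list (its own contribution included) and sort the list by key.
std::vector<Elem> sortedGridList(const std::vector<Elem> &contributions)
{
  const std::vector<Elem> collection = gatherOnRoot(contributions);
  std::vector<Elem> gridList;
  for (std::size_t j = 0; j < collection.size(); ++j)
    gridList.push_back(Elem(collection[j].first, collection[j].second));
  std::sort(gridList.begin(), gridList.end(), ElemCompare());
  return gridList;
}
```

Calling `sortedGridList` with one entry per context returns the entries ordered by key. The root no longer treats its own grid as a special case or counts incoming messages; it reads all slots uniformly once the gather has completed.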
===== PatchSizeSyncer.cmpl.cpp 1.1 vs edited ===== --- 1.1/r2/src/Tulip/PatchSizeSyncer.cmpl.cpp Mon May 13 17:47:45 2002 +++ edited/PatchSizeSyncer.cmpl.cpp Fri Dec 5 16:46:01 2003 @@ -34,19 +34,16 @@ // Includes: //----------------------------------------------------------------------------- +#include "Tulip/Messaging.h" #include "Tulip/PatchSizeSyncer.h" #include "Tulip/RemoteProxy.h" +#include "Tulip/CollectFromContexts.h" +#include #include namespace Pooma { -int PatchSizeSyncer::tag_s = 0; - -#if POOMA_CHEETAH -Cheetah::MatchingHandler *PatchSizeSyncer::handler_s = 0; -#endif - //----------------------------------------------------------------------------- // PatchSize constructor & destructor... //----------------------------------------------------------------------------- @@ -68,19 +65,6 @@ } //----------------------------------------------------------------------------- -// PatchSizeSyncer::receiveGrid -// -// This function is passed on to the matching-handler and is invoked when -// a message is received. -//----------------------------------------------------------------------------- - -void PatchSizeSyncer::receiveGrid(std::pair &incoming) -{ - gridList_m.push_back( - std::make_pair(incoming.first,new Grid_t(incoming.second))); -} - -//----------------------------------------------------------------------------- // PatchSizeSyncer::calcGlobalGrid // // Does a reduction of the grids, sending all the local grids to @@ -108,40 +92,12 @@ { #if POOMA_CHEETAH - // Each context will send their local Grid to context 0. - // We'll offset the base tag by the context number - 1 to - // generate the tags for this. 
- - int tagbase = tag_s; - tag_s += numContexts_m - 1; - Grid<1> result; - if (myContext_m != 0) - { - handler_s->send(0, tagbase + myContext_m - 1, - std::make_pair(localKey_m,localGrid_m)); - } - else + CollectFromContexts > collection + (std::make_pair(localKey_m,localGrid_m)); + if (myContext_m == 0) { - // Push the context 0 grid onto the list: - - gridList_m.push_back(std::make_pair(localKey_m,new Grid_t(localGrid_m))); - - // Request messages from the other contexts, which - // will result in receiveGrid being invoked and - // the remainder of gridList_m being filled. - - for (int i = 1; i < numContexts_m; ++i) - { - handler_s->request(i, tagbase + i - 1, - &PatchSizeSyncer::receiveGrid, - this); - } - - while (gridList_m.size() < numContexts_m) - Pooma::poll(); - // The grid list is full. We sort it and then renormalize the // domains to make them globally consistent. The // renormalization is done by looking through the list and @@ -149,6 +105,10 @@ // have been added on the previous grids. We simultaneously // calculate the total number of points, needed to form the // global result. + + for (int j = 0; j < numContexts_m; ++j) + gridList_m.push_back(Elem_t(collection[j].first, + new Grid_t(collection[j].second))); std::sort(gridList_m.begin(),gridList_m.end(),ElemCompare()); From rguenth at tat.physik.uni-tuebingen.de Mon Dec 8 14:04:13 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 8 Dec 2003 15:04:13 +0100 (CET) Subject: [Q] ReceiveIterate asymmetry bug? Message-ID: Hi! ReceiveIterate looks suspiciously asymmetric wrt POOMA_REORDER_ITERATES, as a write request is always allocated, but for !POOMA_REORDER_ITERATES the request is not released. May I suggest doing the release in the destructor, as it is done for the SendIterate? I suppose this may be because the run() method doesn't block in all cases? 
In which case there would be a bug, as the ReceiveIterate object would get destroyed, but the static apply method still gets arguments referencing it? How is this supposed to work? Thanks, Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From rguenth at tat.physik.uni-tuebingen.de Mon Dec 8 14:11:36 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Mon, 8 Dec 2003 15:11:36 +0100 (CET) Subject: [pooma-dev] [Q] ReceiveIterate asymmetry bug? In-Reply-To: References: Message-ID: On Mon, 8 Dec 2003, Richard Guenther wrote: > Hi! > > ReceiveIterate looks suspiciously asymmetric wrt POOMA_REORDER_ITERATES, > as a write request is always allocated, but for !POOMA_REORDER_ITERATES > the request is not released. May I suggest doing the release in the > destructor, as it is done for the SendIterate? I suppose this may be > because the run() method doesn't block in all cases? In which case there > would be a bug, as the ReceiveIterate object would get destroyed, but the > static apply method still gets arguments referencing it? > > How is this supposed to work? Just to mention it, I see random repeatable deadlocks and faults with blockingExpressions(false) and the serial async scheduler using cheetah that are cured using blockingExpressions(true). Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From oldham at codesourcery.com Mon Dec 8 16:04:40 2003 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 08 Dec 2003 08:04:40 -0800 Subject: [PATCH] Fix and test (unused) CollectFromContexts In-Reply-To: References: Message-ID: <3FD4A118.90704@codesourcery.com> Richard Guenther wrote: > Hi! > > This patch fixes bugs in CollectFromContexts and adds a testcase for it. > It's currently unused, but see the next patch. > > Ok? Yes. > Richard. > > > 2003Dec08 Richard Guenther > > * src/Tulip/CollectFromContexts.h: pack/unpack CollectionValue > correctly, cleanup object, if valid. 
> * src/Tulip/tests/CollectFromContextsTest.cpp: new. > * src/Tulip/tests/makefile: add CollectFromContextsTest. > > ===== CollectFromContexts.h 1.3 vs edited ===== > --- 1.3/r2/src/Tulip/CollectFromContexts.h Wed Dec 3 12:30:45 2003 > +++ edited/CollectFromContexts.h Mon Dec 8 13:21:43 2003 > @@ -34,7 +34,7 @@ > /** @file > * @ingroup Tulip > * @brief > - * Undocumented. > + * CollectFromContext encapsulates functionality like MPI_Gather. > */ > > #ifndef POOMA_MESSAGING_COLLECTFROMCONTEXTS_H > @@ -136,7 +136,7 @@ > static inline int pack(const CollectionValue &v, char *buffer) > { > int nBytes = Serialize::pack(v.valid(), buffer); > - nBytes += Serialize::pack(v.context(), buffer); > + nBytes += Serialize::pack(v.context(), buffer + nBytes); > > if (v.valid()) > { > @@ -154,7 +154,7 @@ > > int nBytes = Serialize::unpack(pvalid, buffer); > > - nBytes += Serialize::unpack(pcon, buffer); > + nBytes += Serialize::unpack(pcon, buffer + nBytes); > > if (*pvalid) > { > @@ -163,6 +163,9 @@ > > vp = new CollectionValue(*pvalid, *pval, *pcon); > > + if (*pvalid) > + Serialize::cleanup(pval); > + > return nBytes; > } > > --- /dev/null Fri Mar 14 14:07:09 2003 > +++ tests/CollectFromContextsTest.cpp Mon Dec 8 12:50:01 2003 > @@ -0,0 +1,82 @@ > +// -*- C++ -*- > +// ACL:license > +// ---------------------------------------------------------------------- > +// This software and ancillary information (herein called "SOFTWARE") > +// called POOMA (Parallel Object-Oriented Methods and Applications) is > +// made available under the terms described here. The SOFTWARE has been > +// approved for release with associated LA-CC Number LA-CC-98-65. > +// > +// Unless otherwise indicated, this SOFTWARE has been authored by an > +// employee or employees of the University of California, operator of the > +// Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with > +// the U.S. Department of Energy. The U.S. 
Government has rights to use, > +// reproduce, and distribute this SOFTWARE. The public may copy, distribute, > +// prepare derivative works and publicly display this SOFTWARE without > +// charge, provided that this Notice and any statement of authorship are > +// reproduced on all copies. Neither the Government nor the University > +// makes any warranty, express or implied, or assumes any liability or > +// responsibility for the use of this SOFTWARE. > +// > +// If SOFTWARE is modified to produce derivative works, such modified > +// SOFTWARE should be clearly marked, so as not to confuse it with the > +// version available from LANL. > +// > +// For more information about POOMA, send e-mail to pooma at acl.lanl.gov, > +// or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. > +// ---------------------------------------------------------------------- > +// ACL:license > + > +//----------------------------------------------------------------------------- > +// Test of PatchSizeSyncer > +//----------------------------------------------------------------------------- > + > +// Include files > + > +#include "Tulip/Messaging.h" > +#include "Tulip/CollectFromContexts.h" > +#include "Pooma/Pooma.h" > +#include "Utilities/Tester.h" > + > + > +int main(int argc, char *argv[]) > +{ > + Pooma::initialize(argc, argv); > + Pooma::Tester tester(argc, argv); > + > + const int numContexts = Pooma::contexts(); > + const int myContext = Pooma::context(); > + > + tester.out() << "Running with " << numContexts << " contexts." 
<< std::endl; > + > + CollectFromContexts ranks(2*(Pooma::context()+1)); > + if (Pooma::context() == 0) { > + bool check = true; > + for (int i=0; i + if (ranks[i] != 2*(i+1)) { > + tester.out() << "[" << i << "] should be " > + << 2*(i+1) << ", but is " << ranks[i] << "\n"; > + check = false; > + } > + tester.check("Collecting ranks", check); > + } > + > + CollectFromContexts ranks2(Pooma::context()+1, 0, > + Pooma::context() > 0 > + && Pooma::context() < Pooma::contexts()-1); > + if (Pooma::context() == 0) { > + bool check = true; > + for (int i=1; i + if (ranks2[i] != i+1) { > + tester.out() << "[" << i << "] should be " > + << (i+1) << ", but is " << ranks[i] << "\n"; > + check = false; > + } > + tester.check("Collecting ranks, but not first and last", check); > + } > + > + int ret = tester.results("CollectFromContextsTest"); > + Pooma::finalize(); > + > + return ret; > +} > + > ===== tests/makefile 1.3 vs edited ===== > --- 1.3/r2/src/Tulip/tests/makefile Wed Jan 8 10:27:36 2003 > +++ edited/tests/makefile Fri Dec 5 16:03:32 2003 > @@ -36,7 +36,7 @@ > > TESTS = ReduceOverContextsTest GridMessageTest \ > GridBroadcastTest PatchSizeSyncerTest \ > - VectorBroadcastTest > + VectorBroadcastTest CollectFromContextsTest > > default:: build > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Mon Dec 8 16:05:03 2003 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 08 Dec 2003 08:05:03 -0800 Subject: [PATCH] Make PatchSizeSyncer::calcGlobalGrid() generic In-Reply-To: References: Message-ID: <3FD4A12F.6060706@codesourcery.com> Richard Guenther wrote: > Hi! > > This patch makes use of CollectFromContexts inside > PatchSizeSyncer::calcGlobalGrid(), instead of handcrafting a Cheetah based > implementation. This reduces explicit Cheetah dependence to fewer places > (to aid adding a native MPI implementation). > > Tested with a native MPI implementation. > > Ok? Yes. Thanks. > Richard. 
> > > 2003Dec08 Richard Guenther > > * src/Tulip/PatchSizeSyncer.cmpl.cpp: use CollectFromContexts for > gather operation. > > ===== PatchSizeSyncer.cmpl.cpp 1.1 vs edited ===== > --- 1.1/r2/src/Tulip/PatchSizeSyncer.cmpl.cpp Mon May 13 17:47:45 2002 > +++ edited/PatchSizeSyncer.cmpl.cpp Fri Dec 5 16:46:01 2003 > @@ -34,19 +34,16 @@ > // Includes: > //----------------------------------------------------------------------------- > > +#include "Tulip/Messaging.h" > #include "Tulip/PatchSizeSyncer.h" > #include "Tulip/RemoteProxy.h" > +#include "Tulip/CollectFromContexts.h" > > +#include > #include > > namespace Pooma { > > -int PatchSizeSyncer::tag_s = 0; > - > -#if POOMA_CHEETAH > -Cheetah::MatchingHandler *PatchSizeSyncer::handler_s = 0; > -#endif > - > //----------------------------------------------------------------------------- > // PatchSize constructor & destructor... > //----------------------------------------------------------------------------- > @@ -68,19 +65,6 @@ > } > > //----------------------------------------------------------------------------- > -// PatchSizeSyncer::receiveGrid > -// > -// This function is passed on to the matching-handler and is invoked when > -// a message is received. > -//----------------------------------------------------------------------------- > - > -void PatchSizeSyncer::receiveGrid(std::pair &incoming) > -{ > - gridList_m.push_back( > - std::make_pair(incoming.first,new Grid_t(incoming.second))); > -} > - > -//----------------------------------------------------------------------------- > // PatchSizeSyncer::calcGlobalGrid > // > // Does a reduction of the grids, sending all the local grids to > @@ -108,40 +92,12 @@ > { > #if POOMA_CHEETAH > > - // Each context will send their local Grid to context 0. > - // We'll offset the base tag by the context number - 1 to > - // generate the tags for this. 
> - > - int tagbase = tag_s; > - tag_s += numContexts_m - 1; > - > Grid<1> result; > > - if (myContext_m != 0) > - { > - handler_s->send(0, tagbase + myContext_m - 1, > - std::make_pair(localKey_m,localGrid_m)); > - } > - else > + CollectFromContexts > collection > + (std::make_pair(localKey_m,localGrid_m)); > + if (myContext_m == 0) > { > - // Push the context 0 grid onto the list: > - > - gridList_m.push_back(std::make_pair(localKey_m,new Grid_t(localGrid_m))); > - > - // Request messages from the other contexts, which > - // will result in receiveGrid being invoked and > - // the remainder of gridList_m being filled. > - > - for (int i = 1; i < numContexts_m; ++i) > - { > - handler_s->request(i, tagbase + i - 1, > - &PatchSizeSyncer::receiveGrid, > - this); > - } > - > - while (gridList_m.size() < numContexts_m) > - Pooma::poll(); > - > // The grid list is full. We sort it and then renormalize the > // domains to make them globally consistent. The > // renormalization is done by looking through the list and > @@ -149,6 +105,10 @@ > // have been added on the previous grids. We simultaneously > // calculate the total number of points, needed to form the > // global result. > + > + for (int j = 0; j < numContexts_m; ++j) > + gridList_m.push_back(Elem_t(collection[j].first, > + new Grid_t(collection[j].second))); > > std::sort(gridList_m.begin(),gridList_m.end(),ElemCompare()); > -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Tue Dec 9 10:40:52 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 9 Dec 2003 11:40:52 +0100 (CET) Subject: [pooma-dev] [Q] ReceiveIterate asymmetry bug? In-Reply-To: References: Message-ID: It seems nobody is interested in this - but to summarize, with messaging using Cheetah the message requesting machinery in Tulip/SendReceive.h uses possibly stale memory if reordering iterates is allowed (which is the default for --messaging config). 
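The hazard summarized above can be shown in isolation with a small stand-alone sketch. It is not the patch and uses no POOMA or Cheetah code: `request`, `poll`, `Pending`, `Receiver` and the `requested_m` flag are hypothetical stand-ins, and the queue merely imitates a handler that fires some time after it was registered. It illustrates the simpler of the two remedies, where the object that registered the handler waits in its destructor until the handler has run.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-in for the messaging layer: a request is queued and its
// handler only runs later, when poll() is called.
struct Receiver;
typedef void (*Handler)(Receiver *, int);

struct Pending
{
  Handler fn;
  Receiver *target;
  int payload;   // stands for the message contents that arrive later
};

static std::deque<Pending> pendingRequests;

void request(Handler fn, Receiver *target, int payload)
{
  Pending p = { fn, target, payload };
  pendingRequests.push_back(p);
}

// Deliver one queued message, if there is one.
void poll()
{
  if (pendingRequests.empty())
    return;
  Pending p = pendingRequests.front();
  pendingRequests.pop_front();
  p.fn(p.target, p.payload);
}

// The handler receives a pointer to the Receiver, so the Receiver must stay
// alive until the handler has run. The destructor enforces this by polling.
struct Receiver
{
  explicit Receiver(int *destination)
    : destination_m(destination), requested_m(false), ready_m(false) {}

  ~Receiver()
  {
    // Without this loop the queued handler would later write through a
    // pointer to an object that no longer exists.
    while (requested_m && !ready_m)
      poll();
  }

  // Only registers the handler; the copy itself happens in apply().
  void run(int incoming)
  {
    requested_m = true;
    request(&Receiver::apply, this, incoming);
  }

  static void apply(Receiver *me, int value)
  {
    *me->destination_m = value;
    me->ready_m = true;
  }

  int *destination_m;
  bool requested_m;   // addition of this sketch: avoids waiting if run() was never called
  bool ready_m;
};
```

If a `Receiver` is created in a block, `run(42)` is called, and the block ends, the destination holds 42 afterwards because the destructor drained the queue first. The cost is that destruction now blocks, which is why this approach gives up out-of-order processing of such messages.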
A patch like the following is needed (which basically kills out of order processing of these messages), or a more elaborate fix like constructing the needed view in extra memory and freeing that inside the matching handler. Richard. ===== SendReceive.h 1.4 vs edited ===== --- 1.4/r2/src/Tulip/SendReceive.h Tue Dec 2 18:40:12 2003 +++ edited/SendReceive.h Tue Dec 9 11:36:34 2003 @@ -134,7 +134,7 @@ * ReceiveIterate requests a write lock on a piece of data. When that lock * is granted, we register the data with the cheetah matching handler which * will fill the block when a message arrives. The write lock is released - * by the matching handler. + * by the destructor after ensuring we're finished with processing. */ template @@ -166,65 +166,41 @@ engineFunctor(view, writeReq); Pooma::addIncomingMessage(); + ready_m = false; } virtual ~ReceiveIterate() { - } - - // If we're using cheetah, but don't support out-of-order execution, then - // the run method of this iterate must block until the message has been - // received. Unlike typical iterates, the work implied by this iterate - // isn't actually performed in the run method. The run method merely - // registers a method that gets handled by cheetah when the appropriate - // message arrives. - -#if !POOMA_REORDER_ITERATES - - bool ready_m; - - static void handle(This_t *me, IncomingView &viewMessage) - { - apply(me->view_m, viewMessage); - me->ready_m = true; - } - - virtual void run() - { - ready_m = false; - Pooma::remoteEngineHandler()->request(fromContext_m, tag_m, - This_t::handle, this); - + // Be sure we have received the data. while (!ready_m) - { Pooma::poll(); - } + + // Release the received block: + DataObjectRequest writeReq; + engineFunctor(viewLocal, writeReq); + Pooma::gotIncomingMessage(); } -#else + // Unlike typical iterates, the work implied by this iterate + // isn't actually performed in the run method. 
The run method merely + // registers a method that gets handled by cheetah when the appropriate + // message arrives. So we need to be careful we finished processing + // before we destruct the iterate. virtual void run() { Pooma::remoteEngineHandler()->request(fromContext_m, tag_m, - This_t::apply, view_m); + This_t::apply, this); } -#endif - private: - static void apply(const View &viewLocal, IncomingView &viewMessage) + static void apply(This_t *me, IncomingView &viewMessage) { // For now, we just copy the message into the brick accepting the data. - - KernelEvaluator::evaluate(viewLocal, OpAssign(), + KernelEvaluator::evaluate(me->view_m, OpAssign(), viewMessage); - - // Release the received block: - DataObjectRequest writeReq; - engineFunctor(viewLocal, writeReq); - - Pooma::gotIncomingMessage(); + me->ready_m = true; } // Context we're sending the data to. @@ -239,6 +215,10 @@ // engine).; View view_m; + + // Flag if we have received the data. + + bool ready_m; }; /** From rguenth at tat.physik.uni-tuebingen.de Tue Dec 9 12:46:15 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 9 Dec 2003 13:46:15 +0100 (CET) Subject: [PATCH] Fix Tiny t1 test Message-ID: Hi! This patch properly initializes the t1 test to allow for checking in a parallel environment. Ok? Richard. 2003Dec09 Richard Guenther * src/Tiny/tests/t1.cpp: initialize pooma library. 
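The ordering the t1 patch relies on can be sketched stand-alone. This is only an illustration under stated assumptions, not POOMA code: `fakeInitialize`, `fakeFinalize`, `fakeResults` and `runTest` are hypothetical stand-ins for `Pooma::initialize()`, `Pooma::finalize()`, `Tester::results()` and the test's `main()`, and they only record call order. The point is that the test result is stored before finalization and returned afterwards, rather than being returned directly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins (assumptions, not the POOMA API): each one only
// records that it was called, so the call order can be inspected.
static std::vector<std::string> callLog;

void fakeInitialize() { callLog.push_back("initialize"); }
void fakeFinalize()   { callLog.push_back("finalize"); }
int  fakeResults()    { callLog.push_back("results"); return 0; }

// Mirrors the shape of the patched main(): initialize first, store the
// test result, finalize, and only then return the stored result.
int runTest()
{
  fakeInitialize();
  int ret = fakeResults();
  fakeFinalize();
  return ret;
}
```

Calling `runTest()` once should leave `callLog` holding the three entries in the order initialize, results, finalize.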

===== t1.cpp 1.2 vs edited =====
--- 1.2/r2/src/Tiny/tests/t1.cpp	Thu Jan 30 22:35:02 2003
+++ edited/t1.cpp	Tue Dec 9 13:43:19 2003
@@ -32,6 +32,7 @@
 #include
 #include
 
+#include "Pooma/Pooma.h"
 #include "Utilities/Tester.h"
 #include "Tiny/TinyMatrix.h"
 #include "Tiny/Vector.h"
@@ -495,6 +496,7 @@
 
 int main(int argc, char **argv)
 {
+  Pooma::initialize(argc, argv);
   tester = new Pooma::Tester(argc, argv);
 
   testTinyMatrixDot();
@@ -519,7 +521,9 @@
   testBoundsChecking();
 #endif
 
-  return tester->results("t1");
+  int ret = tester->results("t1");
+  Pooma::finalize();
+  return ret;
 }
 
 // ACL:rcsinfo

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 9 13:03:40 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 9 Dec 2003 14:03:40 +0100 (CET)
Subject: [PATCH] Fix CollectFromContexts for Cheetah
Message-ID:

Hi!

This patch fixes CollectFromContexts for Cheetah use, fixes failures
of CollectFromContextsTest and PatchSizeSyncerTest.

Ok?

Richard.


2003Dec09  Richard Guenther

	* src/Tulip/CollectFromContexts.h: fix typo in assignment, include
	PAssert.h.

===== CollectFromContexts.h 1.4 vs edited =====
--- 1.4/r2/src/Tulip/CollectFromContexts.h	Tue Dec 9 12:16:08 2003
+++ edited/CollectFromContexts.h	Tue Dec 9 13:55:08 2003
@@ -45,6 +45,7 @@
 //-----------------------------------------------------------------------------
 
 #include "Tulip/Messaging.h"
+#include "Utilities/PAssert.h"
 
 #include
 
@@ -329,7 +330,7 @@
   {
     if (v.valid())
       {
-	me->data_m[v.context()] == v.value();
+	me->data_m[v.context()] = v.value();
       }
 
     me->toReceive_m--;

From jcrotinger at proximation.com Tue Dec 9 15:50:46 2003
From: jcrotinger at proximation.com (James Crotinger)
Date: Tue, 9 Dec 2003 08:50:46 -0700
Subject: [pooma-dev] [Q] ReceiveIterate asymmetry bug?
Message-ID:

Hi Richard,

I'm interested, but very busy at the moment. This stuff was tested fairly
strenuously back in '97, including purified, so if there is a resource bug,
it has snuck in since.
Unfortunately, the out-of-order execution details involving multiple contexts are more than a little rusty in my brain, and I don't see that I'll have time to review this soon. I'm pretty sure that out-of-order handling of these messages is critical if you want to get any advantage of out-of-order execution. Jim -----Original Message----- From: Richard Guenther [mailto:rguenth at tat.physik.uni-tuebingen.de] Sent: Tuesday, December 09, 2003 3:41 AM To: pooma-dev at pooma.codesourcery.com Subject: Re: [pooma-dev] [Q] ReceiveIterate asymmetry bug? It seems nobody is interested in this - but to summarize, with messaging using Cheetah the message requesting machinery in Tulip/SendReceive.h uses possibly stale memory if reordering iterates is allowed (which is the default for --messaging config). A patch like the following is needed (which basically kills out of order processing of these messages), or a more elaborate fix like constructing the needed view in extra memory and freeing that inside the matching handler. Richard. ===== SendReceive.h 1.4 vs edited ===== --- 1.4/r2/src/Tulip/SendReceive.h Tue Dec 2 18:40:12 2003 +++ edited/SendReceive.h Tue Dec 9 11:36:34 2003 @@ -134,7 +134,7 @@ * ReceiveIterate requests a write lock on a piece of data. When that lock * is granted, we register the data with the cheetah matching handler which * will fill the block when a message arrives. The write lock is released - * by the matching handler. + * by the destructor after ensuring we're finished with processing. */ template @@ -166,65 +166,41 @@ engineFunctor(view, writeReq); Pooma::addIncomingMessage(); + ready_m = false; } virtual ~ReceiveIterate() { - } - - // If we're using cheetah, but don't support out-of-order execution, then - // the run method of this iterate must block until the message has been - // received. Unlike typical iterates, the work implied by this iterate - // isn't actually performed in the run method. 
The run method merely - // registers a method that gets handled by cheetah when the appropriate - // message arrives. - -#if !POOMA_REORDER_ITERATES - - bool ready_m; - - static void handle(This_t *me, IncomingView &viewMessage) - { - apply(me->view_m, viewMessage); - me->ready_m = true; - } - - virtual void run() - { - ready_m = false; - Pooma::remoteEngineHandler()->request(fromContext_m, tag_m, - This_t::handle, this); - + // Be sure we have received the data. while (!ready_m) - { Pooma::poll(); - } + + // Release the received block: + DataObjectRequest writeReq; + engineFunctor(viewLocal, writeReq); + Pooma::gotIncomingMessage(); } -#else + // Unlike typical iterates, the work implied by this iterate + // isn't actually performed in the run method. The run method merely + // registers a method that gets handled by cheetah when the appropriate + // message arrives. So we need to be careful we finished processing + // before we destruct the iterate. virtual void run() { Pooma::remoteEngineHandler()->request(fromContext_m, tag_m, - This_t::apply, view_m); + This_t::apply, this); } -#endif - private: - static void apply(const View &viewLocal, IncomingView &viewMessage) + static void apply(This_t *me, IncomingView &viewMessage) { // For now, we just copy the message into the brick accepting the data. - - KernelEvaluator::evaluate(viewLocal, OpAssign(), + KernelEvaluator::evaluate(me->view_m, OpAssign(), viewMessage); - - // Release the received block: - DataObjectRequest writeReq; - engineFunctor(viewLocal, writeReq); - - Pooma::gotIncomingMessage(); + me->ready_m = true; } // Context we're sending the data to. @@ -239,6 +215,10 @@ // engine).; View view_m; + + // Flag if we have received the data. + + bool ready_m; }; /** -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rguenth at tat.physik.uni-tuebingen.de Tue Dec 9 16:24:09 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 9 Dec 2003 17:24:09 +0100 (CET) Subject: [pooma-dev] [Q] ReceiveIterate asymmetry bug? In-Reply-To: References: Message-ID: On Tue, 9 Dec 2003, James Crotinger wrote: > Hi Richard, > > I'm interested, but very busy at the moment. This stuff was tested fairly > strenuously back in '97, including purified, so if there is a resource bug, > it has snuck in since. Unfortunately, the out-of-order execution details > involving multiple contexts are more than a little rusty in my brain, and I > don't see that I'll have time to review this soon. I'm pretty sure that > out-of-order handling of these messages is critical if you want to get any > advantage of out-of-order execution. Fair enough. I'm seeing "random" testresults, f.i. for the Particle destroy test, sometimes segfaulting, sometimes passing, sometimes failing, and this _seems_ to be fixed with this patch. But of course this kills performance too much. I just thought I'm missing some critical part of the code where it should magically work ;) Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From oldham at codesourcery.com Tue Dec 9 16:37:13 2003 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 09 Dec 2003 08:37:13 -0800 Subject: [PATCH] Fix Tiny t1 test In-Reply-To: References: Message-ID: <3FD5FA39.7070007@codesourcery.com> Richard Guenther wrote: > Hi! > > This patch properly initializes the t1 test to allow for checking in > a parallel environment. > > Ok? Yes. > Richard. > > > 2003Dec09 Richard Guenther > > * src/Tiny/tests/t1.cpp: initialize pooma library. 
> > ===== t1.cpp 1.2 vs edited ===== > --- 1.2/r2/src/Tiny/tests/t1.cpp Thu Jan 30 22:35:02 2003 > +++ edited/t1.cpp Tue Dec 9 13:43:19 2003 > @@ -32,6 +32,7 @@ > #include > #include > > +#include "Pooma/Pooma.h" > #include "Utilities/Tester.h" > #include "Tiny/TinyMatrix.h" > #include "Tiny/Vector.h" > @@ -495,6 +496,7 @@ > > int main(int argc, char **argv) > { > + Pooma::initialize(argc, argv); > tester = new Pooma::Tester(argc, argv); > > testTinyMatrixDot(); > @@ -519,7 +521,9 @@ > testBoundsChecking(); > #endif > > - return tester->results("t1"); > + int ret = tester->results("t1"); > + Pooma::finalize(); > + return ret; > } > > // ACL:rcsinfo -- Jeffrey D. Oldham oldham at codesourcery.com From jcrotinger at proximation.com Tue Dec 9 16:45:18 2003 From: jcrotinger at proximation.com (James Crotinger) Date: Tue, 9 Dec 2003 09:45:18 -0700 Subject: [pooma-dev] [Q] ReceiveIterate asymmetry bug? Message-ID: FYI, I don't think cross-context particles were ever working right. That was an item on our "to-redesign" list, probably for 2.5. Jim -----Original Message----- From: Richard Guenther [mailto:rguenth at tat.physik.uni-tuebingen.de] Sent: Tuesday, December 09, 2003 9:24 AM To: James Crotinger Cc: pooma-dev at pooma.codesourcery.com Subject: RE: [pooma-dev] [Q] ReceiveIterate asymmetry bug? On Tue, 9 Dec 2003, James Crotinger wrote: > Hi Richard, > > I'm interested, but very busy at the moment. This stuff was tested fairly > strenuously back in '97, including purified, so if there is a resource bug, > it has snuck in since. Unfortunately, the out-of-order execution details > involving multiple contexts are more than a little rusty in my brain, and I > don't see that I'll have time to review this soon. I'm pretty sure that > out-of-order handling of these messages is critical if you want to get any > advantage of out-of-order execution. Fair enough. I'm seeing "random" testresults, f.i. 
for the Particle destroy test, sometimes segfaulting, sometimes passing,
sometimes failing, and this _seems_ to be fixed with this patch. But of
course this kills performance too much. I just thought I'm missing some
critical part of the code where it should magically work ;)

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 9 19:39:43 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 9 Dec 2003 20:39:43 +0100 (CET)
Subject: [PATCH] Fix ReduceOverContexts wrt WhereProxy
Message-ID:

Hi!

This fixes a missing Unwrap<> for ReduceOverContexts. Caught by regression
testing with Cheetah today.

Ok?

Richard.


2003Dec09  Richard Guenther

	* src/Tulip/ReduceOverContexts.h: unwrap reduction op.

Index: ReduceOverContexts.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Tulip/ReduceOverContexts.h,v
retrieving revision 1.10
diff -u -u -r1.10 ReduceOverContexts.h
--- ReduceOverContexts.h	2 Dec 2003 19:15:04 -0000	1.10
+++ ReduceOverContexts.h	9 Dec 2003 19:32:15 -0000
@@ -48,6 +48,7 @@
 #include "Pooma/Pooma.h"
 #include "Tulip/Messaging.h"
 #include "Tulip/RemoteProxy.h"
+#include "Evaluator/OpMask.h"
 
 
 /**
@@ -272,7 +273,7 @@
 	  }
 	else
 	  {
-	    ReductionOp()(me->value_m, v.value());
+	    Unwrap::Op_t()(me->value_m, v.value());
 	  }
       }
 

From oldham at codesourcery.com Tue Dec 9 21:50:20 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Tue, 09 Dec 2003 13:50:20 -0800
Subject: [PATCH] Fix ReduceOverContexts wrt WhereProxy
In-Reply-To:
References:
Message-ID: <3FD6439C.5080304@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> This fixes a missing Unwrap<> for ReduceOverContexts. Caught by regression
> testing with Cheetah today.
>
> Ok?

Yes. Thanks for the regression testing.

> Richard.
> > > 2003Dec09 Richard Guenther > > * src/Tulip/ReduceOverContexts.h: unwrap reduction op. > > Index: ReduceOverContexts.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Tulip/ReduceOverContexts.h,v > retrieving revision 1.10 > diff -u -u -r1.10 ReduceOverContexts.h > --- ReduceOverContexts.h 2 Dec 2003 19:15:04 -0000 1.10 > +++ ReduceOverContexts.h 9 Dec 2003 19:32:15 -0000 > @@ -48,6 +48,7 @@ > #include "Pooma/Pooma.h" > #include "Tulip/Messaging.h" > #include "Tulip/RemoteProxy.h" > +#include "Evaluator/OpMask.h" > > > /** > @@ -272,7 +273,7 @@ > } > else > { > - ReductionOp()(me->value_m, v.value()); > + Unwrap::Op_t()(me->value_m, v.value()); > } > } > -- Jeffrey D. Oldham oldham at codesourcery.com From oldham at codesourcery.com Tue Dec 9 21:52:10 2003 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Tue, 09 Dec 2003 13:52:10 -0800 Subject: [PATCH] Fix CollectFromContexts for Cheetah In-Reply-To: References: Message-ID: <3FD6440A.3030500@codesourcery.com> Richard Guenther wrote: > Hi! > > This patch fixes CollectFromContexts for Cheetah use, fixes failures > of CollectFromContextsTest and PatchSizeSyncerTest. > > Ok? Yes. > Richard. > > > 2003Dec09 Richard Guenther > > * src/Tulip/CollectFromContexts.h: fix typo in assignment, include > PAssert.h. > > ===== CollectFromContexts.h 1.4 vs edited ===== > --- 1.4/r2/src/Tulip/CollectFromContexts.h Tue Dec 9 12:16:08 2003 > +++ edited/CollectFromContexts.h Tue Dec 9 13:55:08 2003 > @@ -45,6 +45,7 @@ > //----------------------------------------------------------------------------- > > #include "Tulip/Messaging.h" > +#include "Utilities/PAssert.h" > > #include > > @@ -329,7 +330,7 @@ > { > if (v.valid()) > { > - me->data_m[v.context()] == v.value(); > + me->data_m[v.context()] = v.value(); > } > > me->toReceive_m--; -- Jeffrey D. 
Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Thu Dec 11 10:46:35 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Thu, 11 Dec 2003 11:46:35 +0100 (CET) Subject: [PATCH] Use Pooma::begin/endExpression() where appropriate Message-ID: Hi! This uses Pooma::begin/endExpression() at places where it makes no difference. Ok? Richard. 2003Dec11 Richard Guenther * src/Evaluator/PatchFunction.h: replace begin/endGeneration() calls with begin/endExpression() calls where appropriate. ===== PatchFunction.h 1.3 vs edited ===== --- 1.3/r2/src/Evaluator/PatchFunction.h Thu Oct 23 14:41:02 2003 +++ edited/PatchFunction.h Thu Dec 11 11:44:25 2003 @@ -138,16 +138,10 @@ { typedef typename EvaluatorTag1::Evaluator_t Evaluator_t; PatchEvaluator evaluator; - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); - scheduler.beginGeneration(); + Pooma::beginExpression(); evaluator.evaluate(a1(), function); notifyEngineWrite(a1.engine()); - scheduler.endGeneration(); - - if (Pooma::blockingExpressions()) - { - Pooma::blockAndEvaluate(); - } + Pooma::endExpression(); } template @@ -155,15 +149,9 @@ { typedef typename EvaluatorTag1::Evaluator_t Evaluator_t; PatchEvaluator evaluator; - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); - scheduler.beginGeneration(); + Pooma::beginExpression(); evaluator.evaluateRead(a1(), function); - scheduler.endGeneration(); - - if (Pooma::blockingExpressions()) - { - Pooma::blockAndEvaluate(); - } + Pooma::endExpression(); } template @@ -172,16 +160,10 @@ { typedef typename EvaluatorTag::Evaluator_t Eval_t; PatchEvaluator evaluator; - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); - scheduler.beginGeneration(); + Pooma::beginExpression(); evaluator.evaluate2(a1(), a2(), function); notifyEngineWrite(a1.engine()); - scheduler.endGeneration(); - - if (Pooma::blockingExpressions()) - { - Pooma::blockAndEvaluate(); - } + Pooma::endExpression(); } template @@ -195,16 +177,10 @@ typedef typename 
EvaluatorCombine::Evaluator_t Eval_t; PatchEvaluator evaluator; - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); - scheduler.beginGeneration(); + Pooma::beginExpression(); evaluator.evaluate3(a1(), a2(), a3(), function); notifyEngineWrite(a1.engine()); - scheduler.endGeneration(); - - if (Pooma::blockingExpressions()) - { - Pooma::blockAndEvaluate(); - } + Pooma::endExpression(); } private: @@ -372,7 +348,7 @@ const PatchParticle1 &) const { Pooma::Scheduler_t &scheduler = Pooma::scheduler(); - scheduler.beginGeneration(); + Pooma::beginExpression(); int n = a1.numPatchesLocal(); int i; @@ -389,12 +365,7 @@ notifyEngineWrite(a1.engine(), WrappedInt()); - scheduler.endGeneration(); - - if (Pooma::blockingExpressions()) - { - Pooma::blockAndEvaluate(); - } + Pooma::endExpression(); } template @@ -441,7 +412,7 @@ const PatchParticle2 &) const { Pooma::Scheduler_t &scheduler = Pooma::scheduler(); - scheduler.beginGeneration(); + Pooma::beginExpression(); int n1 = a1.numPatchesLocal(); int n2 = a2.numPatchesLocal(); @@ -463,12 +434,7 @@ notifyEngineWrite(a1.engine(), WrappedInt()); notifyEngineWrite(a2.engine(), WrappedInt()); - scheduler.endGeneration(); - - if (Pooma::blockingExpressions()) - { - Pooma::blockAndEvaluate(); - } + Pooma::endExpression(); } template @@ -520,7 +486,7 @@ const PatchParticle3 &) const { Pooma::Scheduler_t &scheduler = Pooma::scheduler(); - scheduler.beginGeneration(); + Pooma::beginExpression(); int n1 = a1.numPatchesLocal(); int n2 = a2.numPatchesLocal(); @@ -546,12 +512,7 @@ notifyEngineWrite(a2.engine(), WrappedInt()); notifyEngineWrite(a3.engine(), WrappedInt()); - scheduler.endGeneration(); - - if (Pooma::blockingExpressions()) - { - Pooma::blockAndEvaluate(); - } + Pooma::endExpression(); } template References: Message-ID: On Tue, 9 Dec 2003, Richard Guenther wrote: > On Tue, 9 Dec 2003, James Crotinger wrote: > > > Hi Richard, > > > > I'm interested, but very busy at the moment. 
This stuff was tested fairly > > strenuously back in '97, including purified, so if there is a resource bug, > > it has snuck in since. Unfortunately, the out-of-order execution details > > involving multiple contexts are more than a little rusty in my brain, and I > > don't see that I'll have time to review this soon. I'm pretty sure that > > out-of-order handling of these messages is critical if you want to get any > > advantage of out-of-order execution. > > Fair enough. I'm seeing "random" testresults, f.i. for the Particle > destroy test, sometimes segfaulting, sometimes passing, sometimes failing, > and this _seems_ to be fixed with this patch. But of course this kills > performance too much. I just thought I'm missing some critical part of the > code where it should magically work ;) Another patch - this time with no predicted change in performance. Just to keep the view life until use. Looks obviously correct to me, but didn't solve my reproducable SCore deadlock's in global reductions with blocking expressions off. Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ ===== SendReceive.h 1.3 vs edited ===== --- 1.3/r2/src/Tulip/SendReceive.h Thu Oct 23 14:41:05 2003 +++ edited/SendReceive.h Thu Dec 11 11:58:06 2003 @@ -149,7 +149,7 @@ : Pooma::Iterate_t(Pooma::scheduler()), fromContext_m(fromContext), tag_m(tag), - view_m(view) + view_m(new View(view)) { PAssert(fromContext >= 0); @@ -213,18 +213,19 @@ private: - static void apply(const View &viewLocal, IncomingView &viewMessage) + static void apply(const View *viewLocal, IncomingView &viewMessage) { // For now, we just copy the message into the brick accepting the data. 
-    KernelEvaluator::evaluate(viewLocal, OpAssign(),
+    KernelEvaluator::evaluate(*viewLocal, OpAssign(),
 			      viewMessage);
 
     // Release the received block:
     DataObjectRequest writeReq;
-    engineFunctor(viewLocal, writeReq);
+    engineFunctor(*viewLocal, writeReq);
 
     Pooma::gotIncomingMessage();
+    delete viewLocal;
   }
 
   // Context we're sending the data to.
@@ -238,7 +239,7 @@
   // The place to put the data we're receiving (typically a view of the
   // engine).;
 
-  View view_m;
+  View *view_m;
 };
 
 /**

From tveldhui at osl.iu.edu Thu Dec 11 16:41:37 2003
From: tveldhui at osl.iu.edu (Todd Veldhuizen)
Date: Thu, 11 Dec 2003 11:41:37 -0500 (EST)
Subject: [oon-list] What is the state of play with C++ and number crunching? (fwd)
Message-ID:

Hi all, this is a recent post on oon-list inquiring about POOMA and
its status. You can reply to oon-list at oonumerics.org if you like.

And "hi" to anyone here I know... :)

Cheers,
Todd

--
Todd Veldhuizen / tveldhui at acm.org / Indiana University Computer Science

---------- Forwarded message ----------
Date: Thu, 11 Dec 2003 09:33:29 +0100
From: Drew McCormack
To: paul.leopardi at unsw.edu.au
Cc: oon-list at oonumerics.org
Subject: Re: [oon-list] What is the state of play with C++ and number crunching?

> POOMA is still under active development. See
> http://www.codesourcery.com/pooma/pooma and the pooma-dev mailing list:
> http://www.codesourcery.com/pooma/pooma_development

This leads to another question: Is there anything wrong with POOMA?
Despite the fact that it seems to be a very extensive library, it
doesn't get mentioned much. Is it slow? Difficult to use?

I see that it even has automatic parallelization. Has anyone tried
this? Is performance OK?

Drew

_______________________________________________
oon-list mailing list
oon-list at oonumerics.org
http://www.oonumerics.org/mailman/listinfo.cgi/oon-list

From rguenth at tat.physik.uni-tuebingen.de Thu Dec 11 17:05:03 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 11 Dec 2003 18:05:03 +0100 (CET)
Subject: What is the state of play with C++ and number crunching? (fwd)
In-Reply-To:
References:
Message-ID:

On Thu, 11 Dec 2003, Todd Veldhuizen wrote:

> Hi all, this is a recent post on oon-list inquiring about POOMA and
> its status. You can reply to oon-list at oonumerics.org if you like.

Uh, oh, cross-list posting follows :)

> ---------- Forwarded message ----------
> Date: Thu, 11 Dec 2003 09:33:29 +0100
> From: Drew McCormack
> To: paul.leopardi at unsw.edu.au
> Cc: oon-list at oonumerics.org
> Subject: Re: [oon-list] What is the state of play with C++ and number
> crunching?
>
> > POOMA is still under active development. See
> > http://www.codesourcery.com/pooma/pooma and the pooma-dev mailing list:
> > http://www.codesourcery.com/pooma/pooma_development
>
> This leads to another question: Is there anything wrong with POOMA?
> Despite the fact that it seems to be a very extensive library, it
> doesn't get mentioned much. Is it slow? Difficult to use?

I've been playing with POOMA for about a year and a half now and am still
at it. It's easy to use if you stick to the basic features like data
parallel expressions and arrays; it gets tricky if you want to explore
the advanced features like the Field infrastructure, as that seems not to
be in very good shape.

For performance - if you have a good optimizing C++ compiler (you
desperately need very good inlining and loop optimization), performance is
no slower than with other OO libraries. Just ask if you want more
details.

It's slow at compiling, it's currently slow (for me) in MPI mode (which
uses the Cheetah library), and the threads package it could use (Smarts)
doesn't compile.
But things will improve, I have some native OpenMP and MPI work done. Richard. -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From oldham at codesourcery.com Mon Dec 15 20:02:42 2003 From: oldham at codesourcery.com (Jeffrey D. Oldham) Date: Mon, 15 Dec 2003 12:02:42 -0800 Subject: [PATCH] Use Pooma::begin/endExpression() where appropriate In-Reply-To: References: Message-ID: <3FDE1362.50609@codesourcery.com> Richard Guenther wrote: > Hi! > > This uses Pooma::begin/endExpression() at places where it makes no > difference. > > Ok? Yes. Good code cleanup. > Richard. > > > 2003Dec11 Richard Guenther > > * src/Evaluator/PatchFunction.h: replace begin/endGeneration() > calls with begin/endExpression() calls where appropriate. > > ===== PatchFunction.h 1.3 vs edited ===== > --- 1.3/r2/src/Evaluator/PatchFunction.h Thu Oct 23 14:41:02 2003 > +++ edited/PatchFunction.h Thu Dec 11 11:44:25 2003 > @@ -138,16 +138,10 @@ > { > typedef typename EvaluatorTag1::Evaluator_t Evaluator_t; > PatchEvaluator evaluator; > - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); > - scheduler.beginGeneration(); > + Pooma::beginExpression(); > evaluator.evaluate(a1(), function); > notifyEngineWrite(a1.engine()); > - scheduler.endGeneration(); > - > - if (Pooma::blockingExpressions()) > - { > - Pooma::blockAndEvaluate(); > - } > + Pooma::endExpression(); > } > > template > @@ -155,15 +149,9 @@ > { > typedef typename EvaluatorTag1::Evaluator_t Evaluator_t; > PatchEvaluator evaluator; > - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); > - scheduler.beginGeneration(); > + Pooma::beginExpression(); > evaluator.evaluateRead(a1(), function); > - scheduler.endGeneration(); > - > - if (Pooma::blockingExpressions()) > - { > - Pooma::blockAndEvaluate(); > - } > + Pooma::endExpression(); > } > > template > @@ -172,16 +160,10 @@ > { > typedef typename EvaluatorTag::Evaluator_t Eval_t; > PatchEvaluator evaluator; > - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); > - 
scheduler.beginGeneration(); > + Pooma::beginExpression(); > evaluator.evaluate2(a1(), a2(), function); > notifyEngineWrite(a1.engine()); > - scheduler.endGeneration(); > - > - if (Pooma::blockingExpressions()) > - { > - Pooma::blockAndEvaluate(); > - } > + Pooma::endExpression(); > } > > template > @@ -195,16 +177,10 @@ > typedef typename EvaluatorCombine::Evaluator_t Eval_t; > > PatchEvaluator evaluator; > - Pooma::Scheduler_t &scheduler = Pooma::scheduler(); > - scheduler.beginGeneration(); > + Pooma::beginExpression(); > evaluator.evaluate3(a1(), a2(), a3(), function); > notifyEngineWrite(a1.engine()); > - scheduler.endGeneration(); > - > - if (Pooma::blockingExpressions()) > - { > - Pooma::blockAndEvaluate(); > - } > + Pooma::endExpression(); > } > > private: > @@ -372,7 +348,7 @@ > const PatchParticle1 &) const > { > Pooma::Scheduler_t &scheduler = Pooma::scheduler(); > - scheduler.beginGeneration(); > + Pooma::beginExpression(); > > int n = a1.numPatchesLocal(); > int i; > @@ -389,12 +365,7 @@ > > notifyEngineWrite(a1.engine(), WrappedInt()); > > - scheduler.endGeneration(); > - > - if (Pooma::blockingExpressions()) > - { > - Pooma::blockAndEvaluate(); > - } > + Pooma::endExpression(); > } > > template > @@ -441,7 +412,7 @@ > const PatchParticle2 &) const > { > Pooma::Scheduler_t &scheduler = Pooma::scheduler(); > - scheduler.beginGeneration(); > + Pooma::beginExpression(); > > int n1 = a1.numPatchesLocal(); > int n2 = a2.numPatchesLocal(); > @@ -463,12 +434,7 @@ > notifyEngineWrite(a1.engine(), WrappedInt()); > notifyEngineWrite(a2.engine(), WrappedInt()); > > - scheduler.endGeneration(); > - > - if (Pooma::blockingExpressions()) > - { > - Pooma::blockAndEvaluate(); > - } > + Pooma::endExpression(); > } > > template > @@ -520,7 +486,7 @@ > const PatchParticle3 &) const > { > Pooma::Scheduler_t &scheduler = Pooma::scheduler(); > - scheduler.beginGeneration(); > + Pooma::beginExpression(); > > int n1 = a1.numPatchesLocal(); > int n2 = a2.numPatchesLocal(); > 
@@ -546,12 +512,7 @@
>      notifyEngineWrite(a2.engine(), WrappedInt());
>      notifyEngineWrite(a3.engine(), WrappedInt());
>
> -    scheduler.endGeneration();
> -
> -    if (Pooma::blockingExpressions())
> -    {
> -      Pooma::blockAndEvaluate();
> -    }
> +    Pooma::endExpression();
>    }
>
>    template

Hi!

I'd like to merge in a native MPI parallelization. Unfortunately the
patch is huge and touches a lot of files. Basically one part of the patch
introduces a new #define POOMA_MESSAGING and replaces occurrences of
POOMA_CHEETAH with POOMA_MESSAGING where appropriate. A lot of tests are
touched by this and also all the remote engine stuff. Second, the common
entry header for messaging support is changed to Tulip/Messaging.h which
can be unconditionally (or conditionally on POOMA_MESSAGING) included
instead of Cheetah/Cheetah.h.

After merging the above changes, the remaining part should only touch
Tulip/ and the remote engines. Unfortunately, the serial async scheduler
is also hacked to support async completion of MPI requests.

Would it be ok to merge the first part unconditionally in if the second
part will be accepted? If it is, I'll try to split the patch
appropriately.

Thanks,

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From oldham at codesourcery.com Thu Dec 18 17:23:32 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 18 Dec 2003 09:23:32 -0800
Subject: [pooma-dev] Merging native MPI parallelization
In-Reply-To:
References:
Message-ID: <3FE1E294.9030808@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> I'd like to merge in a native MPI parallelization. Unfortunately the
> patch is huge and touches a lot of files. Basically one part of the patch
> introduces a new #define POOMA_MESSAGING and replaces occurrences of
> POOMA_CHEETAH with POOMA_MESSAGING where appropriate. A lot of tests are
> touched by this and also all the remote engine stuff.
Second, the common
> entry header for messaging support is changed to Tulip/Messaging.h which
> can be unconditionally (or conditionally on POOMA_MESSAGING) included
> instead of Cheetah/Cheetah.h.
>
> After merging the above changes, the remaining part should only touch
> Tulip/ and the remote engines. Unfortunately, the serial async scheduler
> is also hacked to support async completion of MPI requests.
>
> Would it be ok to merge the first part unconditionally in if the second
> part will be accepted? If it is, I'll try to split the patch
> appropriately.

Native MPI support will be good because it will lower the barrier to
parallel use. At the same time, there are probably still Cheetah users
and we should support them. Can we do so using POOMA_CHEETAH and
POOMA_MESSAGING preprocessor definitions?

Packaging the changes to have reasonable size can be difficult. I think
all changes should be presented as patches for community review.
ChangeLog entries will help understand the changes.

--
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Thu Dec 18 18:36:54 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 18 Dec 2003 19:36:54 +0100 (CET)
Subject: [pooma-dev] Merging native MPI parallelization
In-Reply-To: <3FE1E294.9030808@codesourcery.com>
References: <3FE1E294.9030808@codesourcery.com>
Message-ID:

On Thu, 18 Dec 2003, Jeffrey D. Oldham wrote:

> Richard Guenther wrote:
> > Hi!
> >
> > I'd like to merge in a native MPI parallelization. Unfortunately the
> > patch is huge and touches a lot of files. Basically one part of the patch
> > introduces a new #define POOMA_MESSAGING and replaces occurrences of
> > POOMA_CHEETAH with POOMA_MESSAGING where appropriate. A lot of tests are
> > touched by this and also all the remote engine stuff.
Second, the common > > entry header for messaging support is changed to Tulip/Messaging.h which > > can be unconditionally (or conditionally on POOMA_MESSAGING) included > > instead of Cheetah/Cheetah.h. > > > > After merging the above changes, the remaining part should only touch > > Tulip/ and the remote engines. Unfortunately, the serial async scheduler > > is also hacked to support async completion of MPI requests. > > > > Would it be ok to merge the first part unconditionally in if the second > > part will be accepted? If it is, I'll try to split the patch > > appropriately. > > Native MPI support will be good because it will lower the barrier to > parallel use. Yes, that was my primary goal. Also enhancing performance turned out to be a _lot_ easier with a native MPI interface than with Cheetah. > At the same time, there are probably still Cheetah users > and we should support them. Can we do so using POOMA_CHEETAH and > POOMA_MESSAGING preprocessor definitions? Yes, at the moment I have POOMA_MESSAGING for general support, POOMA_CHEETAH for Cheetah specific parts and POOMA_MPI for MPI specific parts. So, for both POOMA_CHEETAH and POOMA_MPI, POOMA_MESSAGING is defined, but POOMA_CHEETAH and POOMA_MPI are mutually exclusive. > Packaging the changes to have reasonable size can be difficult. I think > all changes should be presented as patches for community review. > ChangeLog entries will help understand the changes. I'll try to start with the patch to introduce POOMA_MESSAGING. That should be touching a lot of files, but in a very obvious way. Thanks, Richard. From rguenth at tat.physik.uni-tuebingen.de Thu Dec 18 19:15:00 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Thu, 18 Dec 2003 20:15:00 +0100 (CET) Subject: [PATCH] Clean up testsuite wrt messaging support Message-ID: Hi! The following fall out during testing the new and old messaging support. 
They mostly fix testsuite deadlocks due to missing finalization or enable the test for serial runs, too. Ok? Richard. 2003Dec18 Richard Guenther * Array/tests/array_test28.cpp: run always, be verbose about what is failing. Domain/tests/IteratorPairDomainTest1.cpp: properly finalize. Domain/tests/IteratorPairDomainTest2.cpp: likewise. Domain/tests/domaintest.cpp: likewise. Domain/tests/indirectionlist_test1.cpp: likewise. Evaluator/tests/ReductionTest4.cpp: run always, block at the right place. Pooma/tests/pabort.cpp: try to properly finalize. Index: Array/tests/array_test28.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Array/tests/array_test28.cpp,v retrieving revision 1.3 diff -u -u -r1.3 array_test28.cpp --- Array/tests/array_test28.cpp 21 Nov 2003 17:35:16 -0000 1.3 +++ Array/tests/array_test28.cpp 18 Dec 2003 19:03:41 -0000 @@ -49,8 +49,6 @@ Pooma::initialize(argc, argv); Pooma::Tester tester(argc, argv); -#if POOMA_CHEETAH - Interval<3> I3(6,6,6); Array<3> a0(I3), b0(I3); Array<3, double, Remote > a1(I3), b1(I3); @@ -68,34 +66,48 @@ b1 = 1.0; b2 = 2.0; b3 = 3.0; - - a0 = b0; tester.check(all(a0 == 0.0)); - a1 = b1; tester.check(all(a1 == 1.0)); - a2 = b2; tester.check(all(a2 == 2.0)); - a3 = b3; tester.check(all(a3 == 3.0)); - - a0 = b1; tester.check(all(a0 == 1.0)); - a1 = b2; tester.check(all(a1 == 2.0)); - a2 = b3; tester.check(all(a2 == 3.0)); - a3 = b0; tester.check(all(a3 == 0.0)); - - a0 = b2; tester.check(all(a0 == 2.0)); - a1 = b3; tester.check(all(a1 == 3.0)); - a2 = b0; tester.check(all(a2 == 0.0)); - a3 = b1; tester.check(all(a3 == 1.0)); - - a0 = b3; tester.check(all(a0 == 3.0)); - a1 = b0; tester.check(all(a1 == 0.0)); - a2 = b1; tester.check(all(a2 == 1.0)); - a3 = b2; tester.check(all(a3 == 2.0)); + + a0 = b0; tester.check("Brick = Brick\n\t", + all(a0 == 0.0)); + a1 = b1; tester.check("Remote = Remote\n\t", + all(a1 == 1.0)); + a2 = b2; tester.check("MultiPatch> = 
MultiPatch>\n\t", + all(a2 == 2.0)); + a3 = b3; tester.check("MultiPatch> = MultiPatch>\n\t", + all(a3 == 3.0)); + + a0 = b1; tester.check("Brick = Remote\n\t", + all(a0 == 1.0)); + a1 = b2; tester.check("Remote = MultiPatch>\n\t", + all(a1 == 2.0)); + a2 = b3; tester.check("MultiPatch> = MultiPatch>\n\t", + all(a2 == 3.0)); + a3 = b0; tester.check("MultiPatch> = Brick\n\t", + all(a3 == 0.0)); + + a0 = b2; tester.check("Brick = MultiPatch>\n\t", + all(a0 == 2.0)); + a1 = b3; tester.check("Remote = MultiPatch>\n\t", + all(a1 == 3.0)); + a2 = b0; tester.check("MultiPatch> = Brick\n\t", + all(a2 == 0.0)); + a3 = b1; tester.check("MultiPatch> = Remote\n\t", + all(a3 == 1.0)); + + a0 = b3; tester.check("Brick = MultiPatch>\n\t", + all(a0 == 3.0)); + a1 = b0; tester.check("Remote = Brick\n\t", + all(a1 == 0.0)); + a2 = b1; tester.check("MultiPatch> = Remote\n\t", + all(a2 == 1.0)); + a3 = b2; tester.check("MultiPatch> = MultiPatch>\n\t", + all(a3 == 2.0)); Array<3, Vector<2, double>, Remote > a4(I3); a4 = Vector<2, double>(1.0, 2.0); - tester.check(all(a4.comp(1) == 2.0)); - -#endif // POOMA_CHEETAH + tester.check("a4.comp(1)", all(a4.comp(1) == 2.0)); int ret = tester.results( "array_test28" ); Pooma::finalize(); Index: Domain/tests/IteratorPairDomainTest1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Domain/tests/IteratorPairDomainTest1.cpp,v retrieving revision 1.1 diff -u -u -r1.1 IteratorPairDomainTest1.cpp --- Domain/tests/IteratorPairDomainTest1.cpp 9 Apr 2001 21:33:04 -0000 1.1 +++ Domain/tests/IteratorPairDomainTest1.cpp 18 Dec 2003 19:03:43 -0000 @@ -179,7 +179,8 @@ tester.out() << "Finished IteratorPairDomain test 1.\n" << endl; - int res = tester.results("IteratorPairDomainTest1 " ); + int res = tester.results("IteratorPairDomainTest1"); + Pooma::finalize(); return res; } Index: Domain/tests/IteratorPairDomainTest2.cpp =================================================================== RCS 
file: /home/pooma/Repository/r2/src/Domain/tests/IteratorPairDomainTest2.cpp,v retrieving revision 1.1 diff -u -u -r1.1 IteratorPairDomainTest2.cpp --- Domain/tests/IteratorPairDomainTest2.cpp 9 Apr 2001 21:33:04 -0000 1.1 +++ Domain/tests/IteratorPairDomainTest2.cpp 18 Dec 2003 19:03:43 -0000 @@ -89,7 +89,8 @@ tester.out() << "Finished IteratorPairDomain test 2.\n" << endl; - int res = tester.results("IteratorPairDomainTest " ); + int res = tester.results("IteratorPairDomainTest2"); + Pooma::finalize(); return res; } Index: Domain/tests/domaintest.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Domain/tests/domaintest.cpp,v retrieving revision 1.17 diff -u -u -r1.17 domaintest.cpp --- Domain/tests/domaintest.cpp 7 Jun 2000 03:21:42 -0000 1.17 +++ Domain/tests/domaintest.cpp 18 Dec 2003 19:03:44 -0000 @@ -553,10 +553,9 @@ tester.out() << " split([3.5,4]) ==> " << a4 << ", " << a5 << std::endl; } - tester.results("domaintest"); + int ret = tester.results("domaintest"); Pooma::finalize(); - - return 0; + return ret; } // ACL:rcsinfo Index: Domain/tests/indirectionlist_test1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Domain/tests/indirectionlist_test1.cpp,v retrieving revision 1.6 diff -u -u -r1.6 indirectionlist_test1.cpp --- Domain/tests/indirectionlist_test1.cpp 22 Jan 2003 23:39:27 -0000 1.6 +++ Domain/tests/indirectionlist_test1.cpp 18 Dec 2003 19:03:45 -0000 @@ -94,7 +94,10 @@ tester.out() << roo << std::endl; tester.out() << "Finished IndirectionList test." 
<< std::endl << std::endl; - return 0; + + int res = tester.results("indirectionlist_test1"); + Pooma::finalize(); + return res; } // ACL:rcsinfo Index: Evaluator/tests/ReductionTest4.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Evaluator/tests/ReductionTest4.cpp,v retrieving revision 1.3 diff -u -u -r1.3 ReductionTest4.cpp --- Evaluator/tests/ReductionTest4.cpp 17 Dec 2002 18:39:04 -0000 1.3 +++ Evaluator/tests/ReductionTest4.cpp 18 Dec 2003 19:03:46 -0000 @@ -41,8 +41,6 @@ Pooma::initialize(argc,argv); Pooma::Tester tester(argc,argv); -#if POOMA_CHEETAH - Loc<1> blocks2(2), blocks5(5); UniformGridPartition<1> partition2(blocks2), partition5(blocks5); UniformGridLayout<1> layout2(Interval<1>(10), partition2, DistributedTag()), @@ -51,8 +49,6 @@ b(layout5); Array<1, int> c(10); - Pooma::blockAndEvaluate(); - for (int i = 0; i < 10; i++) { a(i) = i + 1; @@ -60,6 +56,8 @@ c(i) = i % 5; } + Pooma::blockAndEvaluate(); + int ret; bool bret; @@ -111,8 +109,6 @@ tester.out() << ret << std::endl; // Finish. - -#endif // POOMA_CHEETAH int return_status = tester.results("ReductionTest4"); Index: Pooma/tests/pabort.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Pooma/tests/pabort.cpp,v retrieving revision 1.1 diff -u -u -r1.1 pabort.cpp --- Pooma/tests/pabort.cpp 30 Jan 2003 20:03:53 -0000 1.1 +++ Pooma/tests/pabort.cpp 18 Dec 2003 19:03:47 -0000 @@ -69,6 +69,7 @@ // This test is *expected* to abort. tester->check(handler_ok); int res = tester->results("pAbort"); + Pooma::finalize(); exit(res); } @@ -95,6 +96,7 @@ // If we get here, the call to Pooma::pAbort did not work. 
int res = tester->results("pAbort");
+ Pooma::finalize();
return res;
}

From rguenth at tat.physik.uni-tuebingen.de Thu Dec 18 19:50:05 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 18 Dec 2003 20:50:05 +0100 (CET)
Subject: [PATCH] Introduce POOMA_MESSAGING
Message-ID:

Hi!

This patch introduces POOMA_MESSAGING, which is set for both Cheetah and,
in future, native MPI. It also mechanically changes POOMA_CHEETAH tests to
POOMA_MESSAGING tests where appropriate. Also, the include of
Cheetah/Cheetah.h is replaced by an include of Tulip/Messaging.h (which in
turn includes Cheetah/Cheetah.h and will include mpi.h for native MPI).

Ok?

Richard.

2003Dec18 Richard Guenther

	* configure: add POOMA_MESSAGING define, if Cheetah is configured.
	src/Domain/Grid.h: change #if POOMA_CHEETAH to #if POOMA_MESSAGING
	where appropriate, #include Tulip/Messaging.h rather than
	Cheetah/Cheetah.h.
	src/Engine/RemoteDynamicEngine.h: likewise.
	src/Engine/RemoteEngine.h: likewise.
	src/Engine/tests/dynamiclayout_test1.cpp: likewise.
	src/Engine/tests/makeOwnCopy.cpp: likewise.
	src/Engine/tests/remoteDynamicTest1.cpp: likewise.
	src/Field/tests/ExpressionTest.cpp: likewise.
	src/Field/tests/FieldTour1.cpp: likewise.
	src/Field/tests/Gradient.cpp: likewise.
	src/Field/tests/LocalPatch.cpp: likewise.
	src/Field/tests/OffsetReduction.cpp: likewise.
	src/Field/tests/ScalarCode.cpp: likewise.
	src/Field/tests/StencilTests.cpp: likewise.
	src/Field/tests/VectorTest.cpp: likewise.
	src/Field/tests/WhereTest.cpp: likewise.
	src/IO/tests/FileSetWriterTest1.cpp: likewise.
	src/IO/tests/FileSetWriterTest2.cpp: likewise.
	src/Particles/Attribute.h: likewise.
	src/Particles/Attribute.h: likewise.
	src/Particles/AttributeWrapper.h: likewise.
	src/Particles/PatchSwapLayout.h: likewise.
	src/Particles/tests/attributelist.cpp: likewise.
	src/Particles/tests/bclist.cpp: likewise.
	src/Particles/tests/bctest1.cpp: likewise.
	src/Particles/tests/bctest2.cpp: likewise.
	src/Particles/tests/bctest3.cpp: likewise.
	src/Particles/tests/destroy.cpp: likewise.
	src/Particles/tests/interpolate.cpp: likewise.
	src/Particles/tests/particle_bench1.cpp: likewise.
	src/Particles/tests/particle_bench2.cpp: likewise.
	src/Particles/tests/particle_bench3.cpp: likewise.
	src/Particles/tests/particle_bench4.cpp: likewise.
	src/Particles/tests/spatial.cpp: likewise.
	src/Particles/tests/uniform.cpp: likewise.
	src/Pooma/Pooma.cmpl.cpp: likewise.
	src/Pooma/Pooma.h: likewise.
	src/Tulip/Messaging.h: likewise.
	src/Tulip/PatchSizeSyncer.cmpl.cpp: likewise.
	src/Tulip/tests/GridMessageTest.cpp: likewise.
	src/Tulip/PatchSizeSyncer.h: likewise, remove unused declarations.
	src/Tulip/tests/CollectFromContextsTest.cpp: disable parts of the
	test for serial runs.

Index: configure
===================================================================
RCS file: /home/pooma/Repository/r2/configure,v
retrieving revision 1.111
diff -u -u -r1.111 configure
--- configure 5 Aug 2003 17:45:16 -0000 1.111
+++ configure 18 Dec 2003 19:30:43 -0000
@@ -403,7 +403,9 @@
$fftw_able = 0;
$fftw_default_dir = "";
-### include cheetah usage?
+### include messaging via cheetah/mpi?
+$messaging = 0; +$mpi = 0; $cheetah = 0; $cheetah_able = 0; $cheetah_arch = ""; @@ -1266,7 +1268,7 @@ } -### figure out if we should include the CHEETAH package or not +### figure out if we should include the CHEETAH package or MPI or nothing sub setcheetah { # set $cheetah variable properly @@ -1275,9 +1277,11 @@ $cheetah = 1; } print "Set cheetah = $cheetah\n" if $dbgprnt; + $messaging = $cheetah + $mpi; - # add a define indicating whether CHEETAH is available, and configure + # add a define indicating whether CHEETAH/MPI is available, and configure # extra options to include and define lists + my $defmessaging = $messaging; my $defcheetah = 0; if ($cheetah) { @@ -1311,6 +1315,7 @@ $link = $cheetah_link; } } + add_yesno_define("POOMA_MESSAGING", $defmessaging); add_yesno_define("POOMA_CHEETAH", $defcheetah); } Index: src/Domain/Grid.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Domain/Grid.h,v retrieving revision 1.17 diff -u -u -r1.17 Grid.h --- src/Domain/Grid.h 12 Oct 2003 11:14:38 -0000 1.17 +++ src/Domain/Grid.h 18 Dec 2003 19:30:51 -0000 @@ -501,9 +501,9 @@ // ////////////////////////////////////////////////////////////////////// -#if POOMA_CHEETAH +#if POOMA_MESSAGING -#include "Cheetah/Cheetah.h" +#include "Tulip/Messaging.h" namespace Cheetah { @@ -559,7 +559,7 @@ } // namespace Cheetah -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING #endif // POOMA_DOMAIN_GRID_H Index: src/Engine/RemoteDynamicEngine.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Engine/RemoteDynamicEngine.h,v retrieving revision 1.21 diff -u -u -r1.21 RemoteDynamicEngine.h --- src/Engine/RemoteDynamicEngine.h 22 Oct 2003 19:38:07 -0000 1.21 +++ src/Engine/RemoteDynamicEngine.h 18 Dec 2003 19:30:55 -0000 @@ -337,8 +337,7 @@ domain_m = d; } - // -#if POOMA_CHEETAH +#if POOMA_MESSAGING template int packSize(const Dom &packList) const @@ -758,9 +757,9 @@ } }; 
-#if POOMA_CHEETAH +#if POOMA_MESSAGING -#include "MatchingHandler/Serialize.h" +#include "Tulip/Messaging.h" namespace Cheetah { @@ -835,7 +834,7 @@ } // namespace Cheetah -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING /// checkDynamicID(obj, ID) is a specializable function that is used /// by some classes to check the dynamic ID value stored in the first Index: src/Engine/RemoteEngine.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Engine/RemoteEngine.h,v retrieving revision 1.38 diff -u -u -r1.38 RemoteEngine.h --- src/Engine/RemoteEngine.h 21 Nov 2003 21:30:38 -0000 1.38 +++ src/Engine/RemoteEngine.h 18 Dec 2003 19:30:59 -0000 @@ -1200,9 +1200,9 @@ }; -#if POOMA_CHEETAH +#if POOMA_MESSAGING -#include "MatchingHandler/Serialize.h" +#include "Tulip/Messaging.h" struct EngineElemSerialize { @@ -1593,7 +1593,7 @@ } // namespace Cheetah -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING //----------------------------------------------------------------------------- Index: src/Engine/tests/dynamiclayout_test1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Engine/tests/dynamiclayout_test1.cpp,v retrieving revision 1.5 diff -u -u -r1.5 dynamiclayout_test1.cpp --- src/Engine/tests/dynamiclayout_test1.cpp 6 Jun 2000 20:46:53 -0000 1.5 +++ src/Engine/tests/dynamiclayout_test1.cpp 18 Dec 2003 19:30:59 -0000 @@ -45,7 +45,7 @@ #include using namespace std; -#ifdef POOMA_CHEETAH +#ifdef POOMA_MESSAGING typedef MultiPatch > DynamicMultiPatch_t; #else typedef MultiPatch DynamicMultiPatch_t; Index: src/Engine/tests/makeOwnCopy.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Engine/tests/makeOwnCopy.cpp,v retrieving revision 1.2 diff -u -u -r1.2 makeOwnCopy.cpp --- src/Engine/tests/makeOwnCopy.cpp 13 May 2003 17:43:12 -0000 1.2 +++ src/Engine/tests/makeOwnCopy.cpp 18 Dec 2003 19:31:00 
-0000 @@ -85,7 +85,7 @@ tester.out() << ad << bd << std::endl; -#if POOMA_CHEETAH +#if POOMA_MESSAGING // Create the layouts. @@ -121,7 +121,7 @@ tester.out() << ard << brd << std::endl; -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING int ret = tester.results("makeOwnCopy"); Pooma::finalize(); Index: src/Engine/tests/remoteDynamicTest1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Engine/tests/remoteDynamicTest1.cpp,v retrieving revision 1.8 diff -u -u -r1.8 remoteDynamicTest1.cpp --- src/Engine/tests/remoteDynamicTest1.cpp 16 May 2001 21:21:07 -0000 1.8 +++ src/Engine/tests/remoteDynamicTest1.cpp 18 Dec 2003 19:31:00 -0000 @@ -41,7 +41,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING struct PackObject { Index: src/Field/tests/ExpressionTest.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/ExpressionTest.cpp,v retrieving revision 1.1 diff -u -u -r1.1 ExpressionTest.cpp --- src/Field/tests/ExpressionTest.cpp 30 Aug 2001 01:15:18 -0000 1.1 +++ src/Field/tests/ExpressionTest.cpp 18 Dec 2003 19:31:02 -0000 @@ -57,7 +57,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; typedef Remote CompBrickTag_t; Index: src/Field/tests/FieldTour1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/FieldTour1.cpp,v retrieving revision 1.1 diff -u -u -r1.1 FieldTour1.cpp --- src/Field/tests/FieldTour1.cpp 30 Aug 2001 01:15:18 -0000 1.1 +++ src/Field/tests/FieldTour1.cpp 18 Dec 2003 19:31:03 -0000 @@ -31,7 +31,7 @@ #include "Pooma/Fields.h" -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/Field/tests/Gradient.cpp =================================================================== RCS file: 
/home/pooma/Repository/r2/src/Field/tests/Gradient.cpp,v retrieving revision 1.2 diff -u -u -r1.2 Gradient.cpp --- src/Field/tests/Gradient.cpp 10 Feb 2003 22:13:15 -0000 1.2 +++ src/Field/tests/Gradient.cpp 18 Dec 2003 19:31:03 -0000 @@ -48,7 +48,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/Field/tests/LocalPatch.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/LocalPatch.cpp,v retrieving revision 1.3 diff -u -u -r1.3 LocalPatch.cpp --- src/Field/tests/LocalPatch.cpp 10 Feb 2003 22:13:15 -0000 1.3 +++ src/Field/tests/LocalPatch.cpp 18 Dec 2003 19:31:04 -0000 @@ -32,7 +32,7 @@ #include "Pooma/Fields.h" -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; typedef Remote CompressibleBrickTag_t; Index: src/Field/tests/OffsetReduction.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/OffsetReduction.cpp,v retrieving revision 1.1 diff -u -u -r1.1 OffsetReduction.cpp --- src/Field/tests/OffsetReduction.cpp 30 Aug 2001 01:15:18 -0000 1.1 +++ src/Field/tests/OffsetReduction.cpp 18 Dec 2003 19:31:04 -0000 @@ -50,7 +50,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/Field/tests/ScalarCode.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/ScalarCode.cpp,v retrieving revision 1.2 diff -u -u -r1.2 ScalarCode.cpp --- src/Field/tests/ScalarCode.cpp 14 Oct 2003 16:14:53 -0000 1.2 +++ src/Field/tests/ScalarCode.cpp 18 Dec 2003 19:31:05 -0000 @@ -42,7 +42,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/Field/tests/StencilTests.cpp 
=================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/StencilTests.cpp,v retrieving revision 1.1 diff -u -u -r1.1 StencilTests.cpp --- src/Field/tests/StencilTests.cpp 30 Aug 2001 01:15:18 -0000 1.1 +++ src/Field/tests/StencilTests.cpp 18 Dec 2003 19:31:05 -0000 @@ -54,7 +54,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/Field/tests/VectorTest.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/VectorTest.cpp,v retrieving revision 1.1 diff -u -u -r1.1 VectorTest.cpp --- src/Field/tests/VectorTest.cpp 30 Aug 2001 01:15:18 -0000 1.1 +++ src/Field/tests/VectorTest.cpp 18 Dec 2003 19:31:05 -0000 @@ -57,7 +57,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/Field/tests/WhereTest.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Field/tests/WhereTest.cpp,v retrieving revision 1.3 diff -u -u -r1.3 WhereTest.cpp --- src/Field/tests/WhereTest.cpp 21 Nov 2003 21:31:05 -0000 1.3 +++ src/Field/tests/WhereTest.cpp 18 Dec 2003 19:31:06 -0000 @@ -57,7 +57,7 @@ #include #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/IO/tests/FileSetWriterTest1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/IO/tests/FileSetWriterTest1.cpp,v retrieving revision 1.1 diff -u -u -r1.1 FileSetWriterTest1.cpp --- src/IO/tests/FileSetWriterTest1.cpp 3 Oct 2001 03:25:08 -0000 1.1 +++ src/IO/tests/FileSetWriterTest1.cpp 18 Dec 2003 19:31:07 -0000 @@ -45,7 +45,7 @@ const int dim = 3; -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote 
BrickTag_t; #else Index: src/IO/tests/FileSetWriterTest2.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/IO/tests/FileSetWriterTest2.cpp,v retrieving revision 1.1 diff -u -u -r1.1 FileSetWriterTest2.cpp --- src/IO/tests/FileSetWriterTest2.cpp 3 Oct 2001 03:53:32 -0000 1.1 +++ src/IO/tests/FileSetWriterTest2.cpp 18 Dec 2003 19:31:07 -0000 @@ -46,7 +46,7 @@ const int dim = 3; -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef DistributedTag LayoutTag_t; typedef Remote BrickTag_t; #else Index: src/Particles/Attribute.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/Attribute.h,v retrieving revision 1.12 diff -u -u -r1.12 Attribute.h --- src/Particles/Attribute.h 26 Oct 2003 12:27:36 -0000 1.12 +++ src/Particles/Attribute.h 18 Dec 2003 19:31:07 -0000 @@ -127,7 +127,7 @@ */ -#if POOMA_CHEETAH +#if POOMA_MESSAGING /// packSize, pack and unpack function interface for particle swapping @@ -135,7 +135,7 @@ virtual int pack(int, const IndirectionList &, char *) const = 0; virtual int unpack(int, const Interval<1> &, char *) = 0; -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING }; Index: src/Particles/AttributeWrapper.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/AttributeWrapper.h,v retrieving revision 1.13 diff -u -u -r1.13 AttributeWrapper.h --- src/Particles/AttributeWrapper.h 26 Oct 2003 12:27:36 -0000 1.13 +++ src/Particles/AttributeWrapper.h 18 Dec 2003 19:31:08 -0000 @@ -53,8 +53,8 @@ #include "Utilities/Inform.h" #include "Utilities/PAssert.h" -#if POOMA_CHEETAH -#include "MatchingHandler/Serialize.h" +#if POOMA_MESSAGING +#include "Tulip/Messaging.h" #endif #include @@ -171,7 +171,7 @@ */ -#if POOMA_CHEETAH +#if POOMA_MESSAGING // packSize, pack and unpack functions for particle swapping @@ -193,7 +193,7 @@ return 
array().engine().localPatch(pid).unpack(dom,buffer); } -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING private: // The object that we're wrapping Index: src/Particles/PatchSwapLayout.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/PatchSwapLayout.h,v retrieving revision 1.19 diff -u -u -r1.19 PatchSwapLayout.h --- src/Particles/PatchSwapLayout.h 26 Oct 2003 12:27:36 -0000 1.19 +++ src/Particles/PatchSwapLayout.h 18 Dec 2003 19:31:10 -0000 @@ -719,9 +719,9 @@ }; -#if POOMA_CHEETAH +#if POOMA_MESSAGING -#include "MatchingHandler/Serialize.h" +#include "Tulip/Messaging.h" //----------------------------------------------------------------------------- // @@ -901,7 +901,7 @@ patchInfo(pack->patchID_m).msgReceived() += 1; } -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING // Include out-of-line definitions Index: src/Particles/tests/attributelist.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/attributelist.cpp,v retrieving revision 1.10 diff -u -u -r1.10 attributelist.cpp --- src/Particles/tests/attributelist.cpp 9 Jun 2000 00:41:53 -0000 1.10 +++ src/Particles/tests/attributelist.cpp 18 Dec 2003 19:31:10 -0000 @@ -61,7 +61,7 @@ int blocks = 4; DynamicLayout layout(D,blocks); tester.out() << "DynamicLayout object:\n" << layout << std::endl; -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > EngineTag_t; #else typedef MultiPatch EngineTag_t; Index: src/Particles/tests/bclist.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/bclist.cpp,v retrieving revision 1.5 diff -u -u -r1.5 bclist.cpp --- src/Particles/tests/bclist.cpp 9 Jun 2000 00:41:53 -0000 1.5 +++ src/Particles/tests/bclist.cpp 18 Dec 2003 19:31:11 -0000 @@ -65,7 +65,7 @@ Interval<1> D(10); int blocks = 4; DynamicLayout layout(D,blocks); -#if 
POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > EngineTag_t; #else typedef MultiPatch EngineTag_t; Index: src/Particles/tests/bctest1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/bctest1.cpp,v retrieving revision 1.7 diff -u -u -r1.7 bctest1.cpp --- src/Particles/tests/bctest1.cpp 11 Sep 2001 00:27:29 -0000 1.7 +++ src/Particles/tests/bctest1.cpp 18 Dec 2003 19:31:11 -0000 @@ -52,7 +52,7 @@ #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > EngineTag_t; #else typedef MultiPatch EngineTag_t; Index: src/Particles/tests/bctest2.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/bctest2.cpp,v retrieving revision 1.8 diff -u -u -r1.8 bctest2.cpp --- src/Particles/tests/bctest2.cpp 11 Sep 2001 00:27:29 -0000 1.8 +++ src/Particles/tests/bctest2.cpp 18 Dec 2003 19:31:11 -0000 @@ -52,7 +52,7 @@ #include -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > EngineTag_t; #else typedef MultiPatch EngineTag_t; Index: src/Particles/tests/bctest3.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/bctest3.cpp,v retrieving revision 1.14 diff -u -u -r1.14 bctest3.cpp --- src/Particles/tests/bctest3.cpp 23 Jan 2003 21:29:49 -0000 1.14 +++ src/Particles/tests/bctest3.cpp 18 Dec 2003 19:31:12 -0000 @@ -92,7 +92,7 @@ tester.out() << "Creating Particles object with DynamicArray attributes ..." << std::endl; UniformLayout pl(Pooma::contexts()); -#if POOMA_CHEETAH +#if POOMA_MESSAGING MyParticles P(pl); #else MyParticles P(pl); @@ -151,7 +151,7 @@ // Let's also try a KillBC on a free-standing DynamicArray. tester.out() << "Creating a free-standing DynamicArray ..." 
<< std::endl; -#if POOMA_CHEETAH +#if POOMA_MESSAGING DynamicArray< Vector<2,int>, MultiPatch< DynamicTag, Remote > > a3; #else DynamicArray< Vector<2,int>, MultiPatch > a3; Index: src/Particles/tests/destroy.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/destroy.cpp,v retrieving revision 1.20 diff -u -u -r1.20 destroy.cpp --- src/Particles/tests/destroy.cpp 23 Jan 2003 21:29:49 -0000 1.20 +++ src/Particles/tests/destroy.cpp 18 Dec 2003 19:31:13 -0000 @@ -114,7 +114,7 @@ // Engine tag type for attributes -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; #else typedef MultiPatch AttrEngineTag_t; @@ -126,7 +126,7 @@ // Field type -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef Field< Mesh_t, double, MultiPatch< UniformTag, Remote > > Field_t; #else Index: src/Particles/tests/interpolate.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/interpolate.cpp,v retrieving revision 1.20 diff -u -u -r1.20 interpolate.cpp --- src/Particles/tests/interpolate.cpp 13 Jun 2000 00:38:21 -0000 1.20 +++ src/Particles/tests/interpolate.cpp 18 Dec 2003 19:31:14 -0000 @@ -119,7 +119,7 @@ // Engine tag type for attributes -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; #else typedef MultiPatch AttrEngineTag_t; @@ -140,7 +140,7 @@ // Field type -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef Field< Geometry_t, double, MultiPatch< UniformTag, Remote > > DField_t; typedef Field< Geometry_t, Vector, Index: src/Particles/tests/particle_bench1.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench1.cpp,v retrieving revision 1.8 diff -u -u -r1.8 particle_bench1.cpp --- src/Particles/tests/particle_bench1.cpp 14 Jul 2000 22:55:19 -0000 1.8 +++ 
src/Particles/tests/particle_bench1.cpp 18 Dec 2003 19:31:14 -0000 @@ -45,7 +45,7 @@ // Typedefs for what we are simulating here. -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; typedef MultiPatch< UniformTag, Remote > FieldEngineTag_t; #else @@ -88,7 +88,7 @@ // this example, though, just the layout. Loc<2> blocks(3, 4); -#if POOMA_CHEETAH +#if POOMA_MESSAGING FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); #else FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); Index: src/Particles/tests/particle_bench2.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench2.cpp,v retrieving revision 1.6 diff -u -u -r1.6 particle_bench2.cpp --- src/Particles/tests/particle_bench2.cpp 14 Jul 2000 22:55:19 -0000 1.6 +++ src/Particles/tests/particle_bench2.cpp 18 Dec 2003 19:31:15 -0000 @@ -45,7 +45,7 @@ // Typedefs for what we are simulating here. -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; typedef MultiPatch< UniformTag, Remote > FieldEngineTag_t; #else @@ -88,7 +88,7 @@ // this example, though, just the layout. Loc<2> blocks(3, 4); -#if POOMA_CHEETAH +#if POOMA_MESSAGING FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); #else FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); Index: src/Particles/tests/particle_bench3.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench3.cpp,v retrieving revision 1.5 diff -u -u -r1.5 particle_bench3.cpp --- src/Particles/tests/particle_bench3.cpp 14 Jul 2000 22:55:19 -0000 1.5 +++ src/Particles/tests/particle_bench3.cpp 18 Dec 2003 19:31:15 -0000 @@ -45,7 +45,7 @@ // Typedefs for what we are simulating here. 
-#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; typedef MultiPatch< GridTag, Remote > FieldEngineTag_t; #else @@ -88,7 +88,7 @@ // this example, though, just the layout. Loc<2> blocks(3, 4); -#if POOMA_CHEETAH +#if POOMA_MESSAGING FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); #else FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); Index: src/Particles/tests/particle_bench4.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench4.cpp,v retrieving revision 1.6 diff -u -u -r1.6 particle_bench4.cpp --- src/Particles/tests/particle_bench4.cpp 14 Jul 2000 22:55:19 -0000 1.6 +++ src/Particles/tests/particle_bench4.cpp 18 Dec 2003 19:31:15 -0000 @@ -45,7 +45,7 @@ // Typedefs for what we are simulating here. -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; typedef MultiPatch< GridTag, Remote > FieldEngineTag_t; #else @@ -88,7 +88,7 @@ // this example, though, just the layout. 
Loc<2> blocks(3, 4); -#if POOMA_CHEETAH +#if POOMA_MESSAGING FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); #else FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); Index: src/Particles/tests/spatial.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/spatial.cpp,v retrieving revision 1.23 diff -u -u -r1.23 spatial.cpp --- src/Particles/tests/spatial.cpp 23 Jan 2003 21:29:49 -0000 1.23 +++ src/Particles/tests/spatial.cpp 18 Dec 2003 19:31:16 -0000 @@ -119,7 +119,7 @@ // Engine tag type for attributes -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; #else typedef MultiPatch AttrEngineTag_t; @@ -131,7 +131,7 @@ // Field type -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef Field< Mesh_t, int, MultiPatch< UniformTag, Remote > > Field_t; #else typedef Field< Mesh_t, int, MultiPatch > Field_t; Index: src/Particles/tests/uniform.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Particles/tests/uniform.cpp,v retrieving revision 1.7 diff -u -u -r1.7 uniform.cpp --- src/Particles/tests/uniform.cpp 23 Jan 2003 21:29:49 -0000 1.7 +++ src/Particles/tests/uniform.cpp 18 Dec 2003 19:31:16 -0000 @@ -103,7 +103,7 @@ // Engine tag type for attributes -#if POOMA_CHEETAH +#if POOMA_MESSAGING typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; #else typedef MultiPatch AttrEngineTag_t; Index: src/Pooma/Pooma.cmpl.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Pooma/Pooma.cmpl.cpp,v retrieving revision 1.38 diff -u -u -r1.38 Pooma.cmpl.cpp --- src/Pooma/Pooma.cmpl.cpp 11 Dec 2001 20:43:30 -0000 1.38 +++ src/Pooma/Pooma.cmpl.cpp 18 Dec 2003 19:31:18 -0000 @@ -45,8 +45,8 @@ #include #include -#if POOMA_CHEETAH -# include "Cheetah/Cheetah.h" +#if POOMA_MESSAGING +# include 
"Tulip/Messaging.h" #endif //----------------------------------------------------------------------------- Index: src/Pooma/Pooma.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Pooma/Pooma.h,v retrieving revision 1.33 diff -u -u -r1.33 Pooma.h --- src/Pooma/Pooma.h 21 Oct 2003 20:57:27 -0000 1.33 +++ src/Pooma/Pooma.h 18 Dec 2003 19:31:19 -0000 @@ -105,9 +105,10 @@ #include "Utilities/Inform.h" #include "Utilities/Options.h" -#if POOMA_CHEETAH -# include "Cheetah/Cheetah.h" +#if POOMA_MESSAGING +#include "Tulip/Messaging.h" #endif + //----------------------------------------------------------------------------- // Macro definitions Index: src/Tulip/Messaging.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Tulip/Messaging.h,v retrieving revision 1.7 diff -u -u -r1.7 Messaging.h --- src/Tulip/Messaging.h 21 Oct 2003 18:47:59 -0000 1.7 +++ src/Tulip/Messaging.h 18 Dec 2003 19:31:20 -0000 @@ -31,8 +31,8 @@ // TagGenerator //----------------------------------------------------------------------------- -#ifndef POOMA_CHEETAH_MESSAGING_H -#define POOMA_CHEETAH_MESSAGING_H +#ifndef POOMA_TULIP_MESSAGING_H +#define POOMA_TULIP_MESSAGING_H /** @file * @ingroup Tulip @@ -118,7 +118,7 @@ }; -#if POOMA_CHEETAH +#if POOMA_MESSAGING namespace Cheetah { @@ -183,7 +183,7 @@ } // namespace Cheetah -#endif // #if POOMA_CHEETAH +#endif // #if POOMA_MESSAGING namespace Pooma { @@ -222,7 +222,8 @@ { return particleSwapHandler_g; } -#endif + +#endif // #if POOMA_CHEETAH void initializeCheetahHelpers(int contexts); void finalizeCheetahHelpers(); @@ -248,7 +249,7 @@ } -#endif // POOMA_CHEETAH_MESSAGING_H +#endif // POOMA_TULIP_MESSAGING_H // ACL:rcsinfo // ---------------------------------------------------------------------- Index: src/Tulip/PatchSizeSyncer.cmpl.cpp =================================================================== RCS file: 
/home/pooma/Repository/r2/src/Tulip/PatchSizeSyncer.cmpl.cpp,v retrieving revision 1.6 diff -u -u -r1.6 PatchSizeSyncer.cmpl.cpp --- src/Tulip/PatchSizeSyncer.cmpl.cpp 9 Dec 2003 19:30:07 -0000 1.6 +++ src/Tulip/PatchSizeSyncer.cmpl.cpp 18 Dec 2003 19:31:20 -0000 @@ -90,7 +90,7 @@ void PatchSizeSyncer::calcGlobalGrid(Grid_t &globalGrid) { -#if POOMA_CHEETAH +#if POOMA_MESSAGING Grid<1> result; @@ -142,11 +142,11 @@ RemoteProxy > broadcast(result,0); globalGrid = Grid<1>(broadcast.value()); -#else // POOMA_CHEETAH +#else // !POOMA_MESSAGING globalGrid = localGrid_m; -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING } Index: src/Tulip/PatchSizeSyncer.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Tulip/PatchSizeSyncer.h,v retrieving revision 1.5 diff -u -u -r1.5 PatchSizeSyncer.h --- src/Tulip/PatchSizeSyncer.h 21 Oct 2003 18:47:59 -0000 1.5 +++ src/Tulip/PatchSizeSyncer.h 18 Dec 2003 19:31:21 -0000 @@ -96,11 +96,6 @@ void calcGlobalGrid(Grid_t &globalGrid); - // This is passed to Cheetah and is called when incoming messages - // are received. - - void receiveGrid(std::pair &incoming); - private: //============================================================ @@ -129,25 +124,12 @@ static int tag_s; - // This is the Cheetah stuff. If we don't have Cheetah, this class should - // work in serial (it's a no-op) without sending any messages. All - // Cheetah stuff should compile away. 
- -#if POOMA_CHEETAH - - friend void Pooma::initializeCheetahHelpers(int contexts); - friend void Pooma::finalizeCheetahHelpers(); - - static Cheetah::MatchingHandler *handler_s; - -#endif // POOMA_CHEETAH - }; } // namespace Pooma -#if POOMA_CHEETAH +#if POOMA_MESSAGING namespace Cheetah { @@ -205,7 +187,7 @@ } // namespace Cheetah -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING #endif // POOMA_CHEETAH_PATCHSIZESYNCER_H Index: src/Tulip/tests/CollectFromContextsTest.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Tulip/tests/CollectFromContextsTest.cpp,v retrieving revision 1.1 diff -u -u -r1.1 CollectFromContextsTest.cpp --- src/Tulip/tests/CollectFromContextsTest.cpp 9 Dec 2003 19:27:38 -0000 1.1 +++ src/Tulip/tests/CollectFromContextsTest.cpp 18 Dec 2003 19:31:21 -0000 @@ -60,6 +60,9 @@ tester.check("Collecting ranks", check); } + // We can't do the following test on !MESSAGING, as invalid data on + // context 0 is not supported in this case. 
+#if POOMA_MESSAGING
 CollectFromContexts ranks2(Pooma::context()+1, 0, Pooma::context() > 0 && Pooma::context() < Pooma::contexts()-1);
@@ -73,6 +76,7 @@
 }
 tester.check("Collecting ranks, but not first and last", check);
 }
+#endif

 int ret = tester.results("CollectFromContextsTest");
 Pooma::finalize();
Index: src/Tulip/tests/GridMessageTest.cpp
===================================================================
RCS file: /home/pooma/Repository/r2/src/Tulip/tests/GridMessageTest.cpp,v
retrieving revision 1.6
diff -u -u -r1.6 GridMessageTest.cpp
--- src/Tulip/tests/GridMessageTest.cpp 21 Sep 2001 19:02:18 -0000 1.6
+++ src/Tulip/tests/GridMessageTest.cpp 18 Dec 2003 19:31:21 -0000
@@ -38,8 +38,8 @@
 #include "Domain/Grid.h"
 #include "Domain/Range.h"

-#if POOMA_CHEETAH
-#include "Cheetah/Cheetah.h"
+#if POOMA_MESSAGING
+#include "Tulip/Messaging.h"
 #endif

 #define BARRIER

From oldham at codesourcery.com Thu Dec 18 21:49:12 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 18 Dec 2003 13:49:12 -0800
Subject: [pooma-dev] [PATCH] Clean up testsuite wrt messaging support
In-Reply-To:
References:
Message-ID: <3FE220D8.7080404@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> The following fall out during testing the new and old messaging support.
> They mostly fix testsuite deadlocks due to missing finalization or enable
> the test for serial runs, too.
>
> Ok?
>
> Richard.
>
>
> 2003Dec18 Richard Guenther
>
> * Array/tests/array_test28.cpp: run always, be verbose about
> what is failing.
> Domain/tests/IteratorPairDomainTest1.cpp: properly finalize.
> Domain/tests/IteratorPairDomainTest2.cpp: likewise.
> Domain/tests/domaintest.cpp: likewise.
> Domain/tests/indirectionlist_test1.cpp: likewise.
> Evaluator/tests/ReductionTest4.cpp: run always, block at the
> right place.
> Pooma/tests/pabort.cpp: try to properly finalize.

These are good improvements. Please commit them.

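[Editorial sketch] The "properly finalize" entries in the ChangeLog above all give a test the same shape: take the status from the tester's results call, finalize the runtime, then return that status. The following is only a rough illustration of that ordering, not POOMA code. The namespace sketch, its initialize()/finalize(), the Tester class and run_example_test() are hypothetical stand-ins invented for this example; the real Pooma::initialize, Pooma::finalize and Pooma::Tester have their own signatures and behaviour.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-ins for the runtime and the tester; NOT the POOMA API.
namespace sketch {

bool g_initialized = false;   // true between initialize() and finalize()

void initialize() { g_initialized = true; }
void finalize()   { g_initialized = false; }

class Tester {
public:
  // Record one check; the label is printed only when the check fails.
  bool check(const std::string &label, bool ok) {
    if (!ok) {
      std::cerr << "FAILED: " << label << '\n';
      ++failures_;
    }
    return ok;
  }

  // Print a summary and return the exit status (0 means every check passed).
  int results(const std::string &name) const {
    std::cout << (failures_ == 0 ? "PASSED" : "FAILED")
              << " ..... " << name << '\n';
    return failures_ == 0 ? 0 : 1;
  }

private:
  int failures_ = 0;
};

} // namespace sketch

// Shape of a patched test: results first, then finalize, then return the
// saved status, so no exit path skips finalization.
int run_example_test(bool simulate_failure) {
  sketch::initialize();
  assert(sketch::g_initialized);

  sketch::Tester tester;
  tester.check("1 + 1 == 2", 1 + 1 == 2);
  tester.check("simulated check", !simulate_failure);

  int ret = tester.results("example_test");
  sketch::finalize();
  return ret;
}
```

A main() that simply returns run_example_test(false) would mirror the "int ret = tester.results(...); Pooma::finalize(); return ret;" sequence in the patched tests. Saving the status in a local before finalizing is what lets the test both finalize and report failures through its exit code, which the old domaintest.cpp ("return 0;") did not do.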
> Index: Array/tests/array_test28.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Array/tests/array_test28.cpp,v > retrieving revision 1.3 > diff -u -u -r1.3 array_test28.cpp > --- Array/tests/array_test28.cpp 21 Nov 2003 17:35:16 -0000 1.3 > +++ Array/tests/array_test28.cpp 18 Dec 2003 19:03:41 -0000 > @@ -49,8 +49,6 @@ > Pooma::initialize(argc, argv); > Pooma::Tester tester(argc, argv); > > -#if POOMA_CHEETAH > - > Interval<3> I3(6,6,6); > Array<3> a0(I3), b0(I3); > Array<3, double, Remote > a1(I3), b1(I3); > @@ -68,34 +66,48 @@ > b1 = 1.0; > b2 = 2.0; > b3 = 3.0; > - > - a0 = b0; tester.check(all(a0 == 0.0)); > - a1 = b1; tester.check(all(a1 == 1.0)); > - a2 = b2; tester.check(all(a2 == 2.0)); > - a3 = b3; tester.check(all(a3 == 3.0)); > - > - a0 = b1; tester.check(all(a0 == 1.0)); > - a1 = b2; tester.check(all(a1 == 2.0)); > - a2 = b3; tester.check(all(a2 == 3.0)); > - a3 = b0; tester.check(all(a3 == 0.0)); > - > - a0 = b2; tester.check(all(a0 == 2.0)); > - a1 = b3; tester.check(all(a1 == 3.0)); > - a2 = b0; tester.check(all(a2 == 0.0)); > - a3 = b1; tester.check(all(a3 == 1.0)); > - > - a0 = b3; tester.check(all(a0 == 3.0)); > - a1 = b0; tester.check(all(a1 == 0.0)); > - a2 = b1; tester.check(all(a2 == 1.0)); > - a3 = b2; tester.check(all(a3 == 2.0)); > + > + a0 = b0; tester.check("Brick = Brick\n\t", > + all(a0 == 0.0)); > + a1 = b1; tester.check("Remote = Remote\n\t", > + all(a1 == 1.0)); > + a2 = b2; tester.check("MultiPatch> = MultiPatch>\n\t", > + all(a2 == 2.0)); > + a3 = b3; tester.check("MultiPatch> = MultiPatch>\n\t", > + all(a3 == 3.0)); > + > + a0 = b1; tester.check("Brick = Remote\n\t", > + all(a0 == 1.0)); > + a1 = b2; tester.check("Remote = MultiPatch>\n\t", > + all(a1 == 2.0)); > + a2 = b3; tester.check("MultiPatch> = MultiPatch>\n\t", > + all(a2 == 3.0)); > + a3 = b0; tester.check("MultiPatch> = Brick\n\t", > + all(a3 == 0.0)); > + > + a0 = b2; tester.check("Brick = 
MultiPatch>\n\t", > + all(a0 == 2.0)); > + a1 = b3; tester.check("Remote = MultiPatch>\n\t", > + all(a1 == 3.0)); > + a2 = b0; tester.check("MultiPatch> = Brick\n\t", > + all(a2 == 0.0)); > + a3 = b1; tester.check("MultiPatch> = Remote\n\t", > + all(a3 == 1.0)); > + > + a0 = b3; tester.check("Brick = MultiPatch>\n\t", > + all(a0 == 3.0)); > + a1 = b0; tester.check("Remote = Brick\n\t", > + all(a1 == 0.0)); > + a2 = b1; tester.check("MultiPatch> = Remote\n\t", > + all(a2 == 1.0)); > + a3 = b2; tester.check("MultiPatch> = MultiPatch>\n\t", > + all(a3 == 2.0)); > > Array<3, Vector<2, double>, Remote > a4(I3); > > a4 = Vector<2, double>(1.0, 2.0); > > - tester.check(all(a4.comp(1) == 2.0)); > - > -#endif // POOMA_CHEETAH > + tester.check("a4.comp(1)", all(a4.comp(1) == 2.0)); > > int ret = tester.results( "array_test28" ); > Pooma::finalize(); > Index: Domain/tests/IteratorPairDomainTest1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Domain/tests/IteratorPairDomainTest1.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 IteratorPairDomainTest1.cpp > --- Domain/tests/IteratorPairDomainTest1.cpp 9 Apr 2001 21:33:04 -0000 1.1 > +++ Domain/tests/IteratorPairDomainTest1.cpp 18 Dec 2003 19:03:43 -0000 > @@ -179,7 +179,8 @@ > > tester.out() << "Finished IteratorPairDomain test 1.\n" << endl; > > - int res = tester.results("IteratorPairDomainTest1 " ); > + int res = tester.results("IteratorPairDomainTest1"); > + Pooma::finalize(); > return res; > } > > Index: Domain/tests/IteratorPairDomainTest2.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Domain/tests/IteratorPairDomainTest2.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 IteratorPairDomainTest2.cpp > --- Domain/tests/IteratorPairDomainTest2.cpp 9 Apr 2001 21:33:04 -0000 1.1 > +++ Domain/tests/IteratorPairDomainTest2.cpp 18 Dec 2003 19:03:43 -0000 > @@ -89,7 +89,8 @@ > > tester.out() 
<< "Finished IteratorPairDomain test 2.\n" << endl; > > - int res = tester.results("IteratorPairDomainTest " ); > + int res = tester.results("IteratorPairDomainTest2"); > + Pooma::finalize(); > return res; > } > > Index: Domain/tests/domaintest.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Domain/tests/domaintest.cpp,v > retrieving revision 1.17 > diff -u -u -r1.17 domaintest.cpp > --- Domain/tests/domaintest.cpp 7 Jun 2000 03:21:42 -0000 1.17 > +++ Domain/tests/domaintest.cpp 18 Dec 2003 19:03:44 -0000 > @@ -553,10 +553,9 @@ > tester.out() << " split([3.5,4]) ==> " << a4 << ", " << a5 << std::endl; > } > > - tester.results("domaintest"); > + int ret = tester.results("domaintest"); > Pooma::finalize(); > - > - return 0; > + return ret; > } > > // ACL:rcsinfo > Index: Domain/tests/indirectionlist_test1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Domain/tests/indirectionlist_test1.cpp,v > retrieving revision 1.6 > diff -u -u -r1.6 indirectionlist_test1.cpp > --- Domain/tests/indirectionlist_test1.cpp 22 Jan 2003 23:39:27 -0000 1.6 > +++ Domain/tests/indirectionlist_test1.cpp 18 Dec 2003 19:03:45 -0000 > @@ -94,7 +94,10 @@ > tester.out() << roo << std::endl; > > tester.out() << "Finished IndirectionList test." 
<< std::endl << std::endl; > - return 0; > + > + int res = tester.results("indirectionlist_test1"); > + Pooma::finalize(); > + return res; > } > > // ACL:rcsinfo > Index: Evaluator/tests/ReductionTest4.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Evaluator/tests/ReductionTest4.cpp,v > retrieving revision 1.3 > diff -u -u -r1.3 ReductionTest4.cpp > --- Evaluator/tests/ReductionTest4.cpp 17 Dec 2002 18:39:04 -0000 1.3 > +++ Evaluator/tests/ReductionTest4.cpp 18 Dec 2003 19:03:46 -0000 > @@ -41,8 +41,6 @@ > Pooma::initialize(argc,argv); > Pooma::Tester tester(argc,argv); > > -#if POOMA_CHEETAH > - > Loc<1> blocks2(2), blocks5(5); > UniformGridPartition<1> partition2(blocks2), partition5(blocks5); > UniformGridLayout<1> layout2(Interval<1>(10), partition2, DistributedTag()), > @@ -51,8 +49,6 @@ > b(layout5); > Array<1, int> c(10); > > - Pooma::blockAndEvaluate(); > - > for (int i = 0; i < 10; i++) > { > a(i) = i + 1; > @@ -60,6 +56,8 @@ > c(i) = i % 5; > } > > + Pooma::blockAndEvaluate(); > + > int ret; > bool bret; > > @@ -111,8 +109,6 @@ > tester.out() << ret << std::endl; > > // Finish. > - > -#endif // POOMA_CHEETAH > > int return_status = tester.results("ReductionTest4"); > > Index: Pooma/tests/pabort.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Pooma/tests/pabort.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 pabort.cpp > --- Pooma/tests/pabort.cpp 30 Jan 2003 20:03:53 -0000 1.1 > +++ Pooma/tests/pabort.cpp 18 Dec 2003 19:03:47 -0000 > @@ -69,6 +69,7 @@ > // This test is *expected* to abort. > tester->check(handler_ok); > int res = tester->results("pAbort"); > + Pooma::finalize(); > exit(res); > } > > @@ -95,6 +96,7 @@ > > // If we get here, the call to Pooma::pAbort did not work. > int res = tester->results("pAbort"); > + Pooma::finalize(); > return res; > } > -- Jeffrey D. 
Oldham oldham at codesourcery.com

From oldham at codesourcery.com Thu Dec 18 22:02:09 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Thu, 18 Dec 2003 14:02:09 -0800
Subject: [PATCH] Introduce POOMA_MESSAGING
In-Reply-To:
References:
Message-ID: <3FE223E1.6080102@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> This patch introduces POOMA_MESSAGING which is set for both Cheetah and in
> future native MPI. It also mechanically changes POOMA_CHEETAH to
> POOMA_MESSAGING tests, where appropriate. Also including of
> Cheetah/Cheetah.h is exchanged for including Tulip/Messaging.h (which in
> turn includes Cheetah/Cheetah.h and will include mpi.h for native MPI).
>
> Ok?

Yes, but I have some questions ...

> Richard.
>
>
> 2003Dec18 Richard Guenther
>
> * configure: add POOMA_MESSAGING define, if Cheetah is configured.
> src/Domain/Grid.h: change #if POOMA_CHEETAH to #if POOMA_MESSAGING
> where appropriate, #include Tulip/Messaging.h rather than
> Cheetah/Cheetah.h.
> src/Tulip/Messaging.h: likewise.

There seems to be Cheetah code surrounded by POOMA_MESSAGING. Is this correct?

> src/Tulip/PatchSizeSyncer.cmpl.cpp: likewise.

This may need changing if the next file is changed.

> src/Tulip/PatchSizeSyncer.h: likewise, remove unused declarations.

Same question as for src/Tulip/Messaging.h.

>
> Index: configure
> ===================================================================
> RCS file: /home/pooma/Repository/r2/configure,v
> retrieving revision 1.111
> diff -u -u -r1.111 configure
> --- configure 5 Aug 2003 17:45:16 -0000 1.111
> +++ configure 18 Dec 2003 19:30:43 -0000
> @@ -403,7 +403,9 @@
> $fftw_able = 0;
> $fftw_default_dir = "";
>
> -### include cheetah usage?
> +### include messaging via cheetah/mpi?
> +$messaging = 0; > +$mpi = 0; > $cheetah = 0; > $cheetah_able = 0; > $cheetah_arch = ""; > @@ -1266,7 +1268,7 @@ > } > > > -### figure out if we should include the CHEETAH package or not > +### figure out if we should include the CHEETAH package or MPI or nothing > sub setcheetah > { > # set $cheetah variable properly > @@ -1275,9 +1277,11 @@ > $cheetah = 1; > } > print "Set cheetah = $cheetah\n" if $dbgprnt; > + $messaging = $cheetah + $mpi; > > - # add a define indicating whether CHEETAH is available, and configure > + # add a define indicating whether CHEETAH/MPI is available, and configure > # extra options to include and define lists > + my $defmessaging = $messaging; > my $defcheetah = 0; > if ($cheetah) > { > @@ -1311,6 +1315,7 @@ > $link = $cheetah_link; > } > } > + add_yesno_define("POOMA_MESSAGING", $defmessaging); > add_yesno_define("POOMA_CHEETAH", $defcheetah); > } > > Index: src/Domain/Grid.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Domain/Grid.h,v > retrieving revision 1.17 > diff -u -u -r1.17 Grid.h > --- src/Domain/Grid.h 12 Oct 2003 11:14:38 -0000 1.17 > +++ src/Domain/Grid.h 18 Dec 2003 19:30:51 -0000 > @@ -501,9 +501,9 @@ > // > ////////////////////////////////////////////////////////////////////// > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > -#include "Cheetah/Cheetah.h" > +#include "Tulip/Messaging.h" > > namespace Cheetah { > > @@ -559,7 +559,7 @@ > > } // namespace Cheetah > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > #endif // POOMA_DOMAIN_GRID_H > > Index: src/Engine/RemoteDynamicEngine.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Engine/RemoteDynamicEngine.h,v > retrieving revision 1.21 > diff -u -u -r1.21 RemoteDynamicEngine.h > --- src/Engine/RemoteDynamicEngine.h 22 Oct 2003 19:38:07 -0000 1.21 > +++ src/Engine/RemoteDynamicEngine.h 18 Dec 2003 19:30:55 -0000 > @@ -337,8 
+337,7 @@ > domain_m = d; > } > > - // > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > template > int packSize(const Dom &packList) const > @@ -758,9 +757,9 @@ > } > }; > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > -#include "MatchingHandler/Serialize.h" > +#include "Tulip/Messaging.h" > > namespace Cheetah { > > @@ -835,7 +834,7 @@ > > } // namespace Cheetah > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > /// checkDynamicID(obj, ID) is a specializable function that is used > /// by some classes to check the dynamic ID value stored in the first > Index: src/Engine/RemoteEngine.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Engine/RemoteEngine.h,v > retrieving revision 1.38 > diff -u -u -r1.38 RemoteEngine.h > --- src/Engine/RemoteEngine.h 21 Nov 2003 21:30:38 -0000 1.38 > +++ src/Engine/RemoteEngine.h 18 Dec 2003 19:30:59 -0000 > @@ -1200,9 +1200,9 @@ > > }; > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > -#include "MatchingHandler/Serialize.h" > +#include "Tulip/Messaging.h" > > struct EngineElemSerialize > { > @@ -1593,7 +1593,7 @@ > > } // namespace Cheetah > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > > //----------------------------------------------------------------------------- > Index: src/Engine/tests/dynamiclayout_test1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Engine/tests/dynamiclayout_test1.cpp,v > retrieving revision 1.5 > diff -u -u -r1.5 dynamiclayout_test1.cpp > --- src/Engine/tests/dynamiclayout_test1.cpp 6 Jun 2000 20:46:53 -0000 1.5 > +++ src/Engine/tests/dynamiclayout_test1.cpp 18 Dec 2003 19:30:59 -0000 > @@ -45,7 +45,7 @@ > #include > using namespace std; > > -#ifdef POOMA_CHEETAH > +#ifdef POOMA_MESSAGING > typedef MultiPatch > DynamicMultiPatch_t; > #else > typedef MultiPatch DynamicMultiPatch_t; > Index: src/Engine/tests/makeOwnCopy.cpp > 
=================================================================== > RCS file: /home/pooma/Repository/r2/src/Engine/tests/makeOwnCopy.cpp,v > retrieving revision 1.2 > diff -u -u -r1.2 makeOwnCopy.cpp > --- src/Engine/tests/makeOwnCopy.cpp 13 May 2003 17:43:12 -0000 1.2 > +++ src/Engine/tests/makeOwnCopy.cpp 18 Dec 2003 19:31:00 -0000 > @@ -85,7 +85,7 @@ > > tester.out() << ad << bd << std::endl; > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > // Create the layouts. > > @@ -121,7 +121,7 @@ > > tester.out() << ard << brd << std::endl; > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > int ret = tester.results("makeOwnCopy"); > Pooma::finalize(); > Index: src/Engine/tests/remoteDynamicTest1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Engine/tests/remoteDynamicTest1.cpp,v > retrieving revision 1.8 > diff -u -u -r1.8 remoteDynamicTest1.cpp > --- src/Engine/tests/remoteDynamicTest1.cpp 16 May 2001 21:21:07 -0000 1.8 > +++ src/Engine/tests/remoteDynamicTest1.cpp 18 Dec 2003 19:31:00 -0000 > @@ -41,7 +41,7 @@ > #include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > struct PackObject > { > Index: src/Field/tests/ExpressionTest.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/ExpressionTest.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 ExpressionTest.cpp > --- src/Field/tests/ExpressionTest.cpp 30 Aug 2001 01:15:18 -0000 1.1 > +++ src/Field/tests/ExpressionTest.cpp 18 Dec 2003 19:31:02 -0000 > @@ -57,7 +57,7 @@ > #include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > typedef Remote CompBrickTag_t; > Index: src/Field/tests/FieldTour1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/FieldTour1.cpp,v > retrieving revision 1.1 > diff -u -u 
-r1.1 FieldTour1.cpp > --- src/Field/tests/FieldTour1.cpp 30 Aug 2001 01:15:18 -0000 1.1 > +++ src/Field/tests/FieldTour1.cpp 18 Dec 2003 19:31:03 -0000 > @@ -31,7 +31,7 @@ > > #include "Pooma/Fields.h" > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/Field/tests/Gradient.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/Gradient.cpp,v > retrieving revision 1.2 > diff -u -u -r1.2 Gradient.cpp > --- src/Field/tests/Gradient.cpp 10 Feb 2003 22:13:15 -0000 1.2 > +++ src/Field/tests/Gradient.cpp 18 Dec 2003 19:31:03 -0000 > @@ -48,7 +48,7 @@ > #include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/Field/tests/LocalPatch.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/LocalPatch.cpp,v > retrieving revision 1.3 > diff -u -u -r1.3 LocalPatch.cpp > --- src/Field/tests/LocalPatch.cpp 10 Feb 2003 22:13:15 -0000 1.3 > +++ src/Field/tests/LocalPatch.cpp 18 Dec 2003 19:31:04 -0000 > @@ -32,7 +32,7 @@ > > #include "Pooma/Fields.h" > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > typedef Remote CompressibleBrickTag_t; > Index: src/Field/tests/OffsetReduction.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/OffsetReduction.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 OffsetReduction.cpp > --- src/Field/tests/OffsetReduction.cpp 30 Aug 2001 01:15:18 -0000 1.1 > +++ src/Field/tests/OffsetReduction.cpp 18 Dec 2003 19:31:04 -0000 > @@ -50,7 +50,7 @@ > #include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: 
src/Field/tests/ScalarCode.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/ScalarCode.cpp,v > retrieving revision 1.2 > diff -u -u -r1.2 ScalarCode.cpp > --- src/Field/tests/ScalarCode.cpp 14 Oct 2003 16:14:53 -0000 1.2 > +++ src/Field/tests/ScalarCode.cpp 18 Dec 2003 19:31:05 -0000 > @@ -42,7 +42,7 @@ > #include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/Field/tests/StencilTests.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/StencilTests.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 StencilTests.cpp > --- src/Field/tests/StencilTests.cpp 30 Aug 2001 01:15:18 -0000 1.1 > +++ src/Field/tests/StencilTests.cpp 18 Dec 2003 19:31:05 -0000 > @@ -54,7 +54,7 @@ > #include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/Field/tests/VectorTest.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/VectorTest.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 VectorTest.cpp > --- src/Field/tests/VectorTest.cpp 30 Aug 2001 01:15:18 -0000 1.1 > +++ src/Field/tests/VectorTest.cpp 18 Dec 2003 19:31:05 -0000 > @@ -57,7 +57,7 @@ > #include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/Field/tests/WhereTest.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Field/tests/WhereTest.cpp,v > retrieving revision 1.3 > diff -u -u -r1.3 WhereTest.cpp > --- src/Field/tests/WhereTest.cpp 21 Nov 2003 21:31:05 -0000 1.3 > +++ src/Field/tests/WhereTest.cpp 18 Dec 2003 19:31:06 -0000 > @@ -57,7 +57,7 @@ > 
#include > #include > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/IO/tests/FileSetWriterTest1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/IO/tests/FileSetWriterTest1.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 FileSetWriterTest1.cpp > --- src/IO/tests/FileSetWriterTest1.cpp 3 Oct 2001 03:25:08 -0000 1.1 > +++ src/IO/tests/FileSetWriterTest1.cpp 18 Dec 2003 19:31:07 -0000 > @@ -45,7 +45,7 @@ > > const int dim = 3; > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/IO/tests/FileSetWriterTest2.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/IO/tests/FileSetWriterTest2.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 FileSetWriterTest2.cpp > --- src/IO/tests/FileSetWriterTest2.cpp 3 Oct 2001 03:53:32 -0000 1.1 > +++ src/IO/tests/FileSetWriterTest2.cpp 18 Dec 2003 19:31:07 -0000 > @@ -46,7 +46,7 @@ > > const int dim = 3; > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef DistributedTag LayoutTag_t; > typedef Remote BrickTag_t; > #else > Index: src/Particles/Attribute.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/Attribute.h,v > retrieving revision 1.12 > diff -u -u -r1.12 Attribute.h > --- src/Particles/Attribute.h 26 Oct 2003 12:27:36 -0000 1.12 > +++ src/Particles/Attribute.h 18 Dec 2003 19:31:07 -0000 > @@ -127,7 +127,7 @@ > > */ > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > /// packSize, pack and unpack function interface for particle swapping > > @@ -135,7 +135,7 @@ > virtual int pack(int, const IndirectionList &, char *) const = 0; > virtual int unpack(int, const Interval<1> &, char *) = 0; > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > }; > > Index: 
src/Particles/AttributeWrapper.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/AttributeWrapper.h,v > retrieving revision 1.13 > diff -u -u -r1.13 AttributeWrapper.h > --- src/Particles/AttributeWrapper.h 26 Oct 2003 12:27:36 -0000 1.13 > +++ src/Particles/AttributeWrapper.h 18 Dec 2003 19:31:08 -0000 > @@ -53,8 +53,8 @@ > #include "Utilities/Inform.h" > #include "Utilities/PAssert.h" > > -#if POOMA_CHEETAH > -#include "MatchingHandler/Serialize.h" > +#if POOMA_MESSAGING > +#include "Tulip/Messaging.h" > #endif > > #include > @@ -171,7 +171,7 @@ > > */ > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > // packSize, pack and unpack functions for particle swapping > > @@ -193,7 +193,7 @@ > return array().engine().localPatch(pid).unpack(dom,buffer); > } > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > private: > // The object that we're wrapping > Index: src/Particles/PatchSwapLayout.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/PatchSwapLayout.h,v > retrieving revision 1.19 > diff -u -u -r1.19 PatchSwapLayout.h > --- src/Particles/PatchSwapLayout.h 26 Oct 2003 12:27:36 -0000 1.19 > +++ src/Particles/PatchSwapLayout.h 18 Dec 2003 19:31:10 -0000 > @@ -719,9 +719,9 @@ > }; > > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > -#include "MatchingHandler/Serialize.h" > +#include "Tulip/Messaging.h" > > //----------------------------------------------------------------------------- > // > @@ -901,7 +901,7 @@ > patchInfo(pack->patchID_m).msgReceived() += 1; > } > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > // Include out-of-line definitions > > Index: src/Particles/tests/attributelist.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/attributelist.cpp,v > retrieving revision 1.10 > diff -u -u -r1.10 attributelist.cpp 
> --- src/Particles/tests/attributelist.cpp 9 Jun 2000 00:41:53 -0000 1.10 > +++ src/Particles/tests/attributelist.cpp 18 Dec 2003 19:31:10 -0000 > @@ -61,7 +61,7 @@ > int blocks = 4; > DynamicLayout layout(D,blocks); > tester.out() << "DynamicLayout object:\n" << layout << std::endl; > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > EngineTag_t; > #else > typedef MultiPatch EngineTag_t; > Index: src/Particles/tests/bclist.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/bclist.cpp,v > retrieving revision 1.5 > diff -u -u -r1.5 bclist.cpp > --- src/Particles/tests/bclist.cpp 9 Jun 2000 00:41:53 -0000 1.5 > +++ src/Particles/tests/bclist.cpp 18 Dec 2003 19:31:11 -0000 > @@ -65,7 +65,7 @@ > Interval<1> D(10); > int blocks = 4; > DynamicLayout layout(D,blocks); > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > EngineTag_t; > #else > typedef MultiPatch EngineTag_t; > Index: src/Particles/tests/bctest1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/bctest1.cpp,v > retrieving revision 1.7 > diff -u -u -r1.7 bctest1.cpp > --- src/Particles/tests/bctest1.cpp 11 Sep 2001 00:27:29 -0000 1.7 > +++ src/Particles/tests/bctest1.cpp 18 Dec 2003 19:31:11 -0000 > @@ -52,7 +52,7 @@ > #include > > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > EngineTag_t; > #else > typedef MultiPatch EngineTag_t; > Index: src/Particles/tests/bctest2.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/bctest2.cpp,v > retrieving revision 1.8 > diff -u -u -r1.8 bctest2.cpp > --- src/Particles/tests/bctest2.cpp 11 Sep 2001 00:27:29 -0000 1.8 > +++ src/Particles/tests/bctest2.cpp 18 Dec 2003 19:31:11 -0000 > @@ -52,7 +52,7 @@ > #include > > > -#if 
POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > EngineTag_t; > #else > typedef MultiPatch EngineTag_t; > Index: src/Particles/tests/bctest3.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/bctest3.cpp,v > retrieving revision 1.14 > diff -u -u -r1.14 bctest3.cpp > --- src/Particles/tests/bctest3.cpp 23 Jan 2003 21:29:49 -0000 1.14 > +++ src/Particles/tests/bctest3.cpp 18 Dec 2003 19:31:12 -0000 > @@ -92,7 +92,7 @@ > tester.out() << "Creating Particles object with DynamicArray attributes ..." > << std::endl; > UniformLayout pl(Pooma::contexts()); > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > MyParticles P(pl); > #else > MyParticles P(pl); > @@ -151,7 +151,7 @@ > // Let's also try a KillBC on a free-standing DynamicArray. > > tester.out() << "Creating a free-standing DynamicArray ..." << std::endl; > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > DynamicArray< Vector<2,int>, MultiPatch< DynamicTag, Remote > > a3; > #else > DynamicArray< Vector<2,int>, MultiPatch > a3; > Index: src/Particles/tests/destroy.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/destroy.cpp,v > retrieving revision 1.20 > diff -u -u -r1.20 destroy.cpp > --- src/Particles/tests/destroy.cpp 23 Jan 2003 21:29:49 -0000 1.20 > +++ src/Particles/tests/destroy.cpp 18 Dec 2003 19:31:13 -0000 > @@ -114,7 +114,7 @@ > > // Engine tag type for attributes > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > #else > typedef MultiPatch AttrEngineTag_t; > @@ -126,7 +126,7 @@ > > // Field type > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef Field< Mesh_t, double, MultiPatch< UniformTag, Remote > > > Field_t; > #else > Index: src/Particles/tests/interpolate.cpp > =================================================================== > RCS file: 
/home/pooma/Repository/r2/src/Particles/tests/interpolate.cpp,v > retrieving revision 1.20 > diff -u -u -r1.20 interpolate.cpp > --- src/Particles/tests/interpolate.cpp 13 Jun 2000 00:38:21 -0000 1.20 > +++ src/Particles/tests/interpolate.cpp 18 Dec 2003 19:31:14 -0000 > @@ -119,7 +119,7 @@ > > // Engine tag type for attributes > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > #else > typedef MultiPatch AttrEngineTag_t; > @@ -140,7 +140,7 @@ > > // Field type > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef Field< Geometry_t, double, MultiPatch< UniformTag, Remote > > > DField_t; > typedef Field< Geometry_t, Vector, > Index: src/Particles/tests/particle_bench1.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench1.cpp,v > retrieving revision 1.8 > diff -u -u -r1.8 particle_bench1.cpp > --- src/Particles/tests/particle_bench1.cpp 14 Jul 2000 22:55:19 -0000 1.8 > +++ src/Particles/tests/particle_bench1.cpp 18 Dec 2003 19:31:14 -0000 > @@ -45,7 +45,7 @@ > > // Typedefs for what we are simulating here. > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > typedef MultiPatch< UniformTag, Remote > FieldEngineTag_t; > #else > @@ -88,7 +88,7 @@ > // this example, though, just the layout. 
> > Loc<2> blocks(3, 4); > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); > #else > FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); > Index: src/Particles/tests/particle_bench2.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench2.cpp,v > retrieving revision 1.6 > diff -u -u -r1.6 particle_bench2.cpp > --- src/Particles/tests/particle_bench2.cpp 14 Jul 2000 22:55:19 -0000 1.6 > +++ src/Particles/tests/particle_bench2.cpp 18 Dec 2003 19:31:15 -0000 > @@ -45,7 +45,7 @@ > > // Typedefs for what we are simulating here. > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > typedef MultiPatch< UniformTag, Remote > FieldEngineTag_t; > #else > @@ -88,7 +88,7 @@ > // this example, though, just the layout. > > Loc<2> blocks(3, 4); > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); > #else > FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); > Index: src/Particles/tests/particle_bench3.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench3.cpp,v > retrieving revision 1.5 > diff -u -u -r1.5 particle_bench3.cpp > --- src/Particles/tests/particle_bench3.cpp 14 Jul 2000 22:55:19 -0000 1.5 > +++ src/Particles/tests/particle_bench3.cpp 18 Dec 2003 19:31:15 -0000 > @@ -45,7 +45,7 @@ > > // Typedefs for what we are simulating here. > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > typedef MultiPatch< GridTag, Remote > FieldEngineTag_t; > #else > @@ -88,7 +88,7 @@ > // this example, though, just the layout. 
> > Loc<2> blocks(3, 4); > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); > #else > FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); > Index: src/Particles/tests/particle_bench4.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/particle_bench4.cpp,v > retrieving revision 1.6 > diff -u -u -r1.6 particle_bench4.cpp > --- src/Particles/tests/particle_bench4.cpp 14 Jul 2000 22:55:19 -0000 1.6 > +++ src/Particles/tests/particle_bench4.cpp 18 Dec 2003 19:31:15 -0000 > @@ -45,7 +45,7 @@ > > // Typedefs for what we are simulating here. > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > typedef MultiPatch< GridTag, Remote > FieldEngineTag_t; > #else > @@ -88,7 +88,7 @@ > // this example, though, just the layout. > > Loc<2> blocks(3, 4); > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > FieldLayout_t flayout(geometry.physicalDomain(), blocks, DistributedTag()); > #else > FieldLayout_t flayout(geometry.physicalDomain(), blocks, ReplicatedTag()); > Index: src/Particles/tests/spatial.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/spatial.cpp,v > retrieving revision 1.23 > diff -u -u -r1.23 spatial.cpp > --- src/Particles/tests/spatial.cpp 23 Jan 2003 21:29:49 -0000 1.23 > +++ src/Particles/tests/spatial.cpp 18 Dec 2003 19:31:16 -0000 > @@ -119,7 +119,7 @@ > > // Engine tag type for attributes > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > #else > typedef MultiPatch AttrEngineTag_t; > @@ -131,7 +131,7 @@ > > // Field type > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef Field< Mesh_t, int, MultiPatch< UniformTag, Remote > > Field_t; > #else > typedef Field< Mesh_t, int, MultiPatch > Field_t; > 
Index: src/Particles/tests/uniform.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Particles/tests/uniform.cpp,v > retrieving revision 1.7 > diff -u -u -r1.7 uniform.cpp > --- src/Particles/tests/uniform.cpp 23 Jan 2003 21:29:49 -0000 1.7 > +++ src/Particles/tests/uniform.cpp 18 Dec 2003 19:31:16 -0000 > @@ -103,7 +103,7 @@ > > // Engine tag type for attributes > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > typedef MultiPatch< DynamicTag, Remote > AttrEngineTag_t; > #else > typedef MultiPatch AttrEngineTag_t; > Index: src/Pooma/Pooma.cmpl.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Pooma/Pooma.cmpl.cpp,v > retrieving revision 1.38 > diff -u -u -r1.38 Pooma.cmpl.cpp > --- src/Pooma/Pooma.cmpl.cpp 11 Dec 2001 20:43:30 -0000 1.38 > +++ src/Pooma/Pooma.cmpl.cpp 18 Dec 2003 19:31:18 -0000 > @@ -45,8 +45,8 @@ > #include > #include > > -#if POOMA_CHEETAH > -# include "Cheetah/Cheetah.h" > +#if POOMA_MESSAGING > +# include "Tulip/Messaging.h" > #endif > > //----------------------------------------------------------------------------- > Index: src/Pooma/Pooma.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Pooma/Pooma.h,v > retrieving revision 1.33 > diff -u -u -r1.33 Pooma.h > --- src/Pooma/Pooma.h 21 Oct 2003 20:57:27 -0000 1.33 > +++ src/Pooma/Pooma.h 18 Dec 2003 19:31:19 -0000 > @@ -105,9 +105,10 @@ > #include "Utilities/Inform.h" > #include "Utilities/Options.h" > > -#if POOMA_CHEETAH > -# include "Cheetah/Cheetah.h" > +#if POOMA_MESSAGING > +#include "Tulip/Messaging.h" > #endif > + > > //----------------------------------------------------------------------------- > // Macro definitions > Index: src/Tulip/Messaging.h > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Tulip/Messaging.h,v > retrieving 
revision 1.7 > diff -u -u -r1.7 Messaging.h > --- src/Tulip/Messaging.h 21 Oct 2003 18:47:59 -0000 1.7 > +++ src/Tulip/Messaging.h 18 Dec 2003 19:31:20 -0000 > @@ -31,8 +31,8 @@ > // TagGenerator > //----------------------------------------------------------------------------- > > -#ifndef POOMA_CHEETAH_MESSAGING_H > -#define POOMA_CHEETAH_MESSAGING_H > +#ifndef POOMA_TULIP_MESSAGING_H > +#define POOMA_TULIP_MESSAGING_H > > /** @file > * @ingroup Tulip > @@ -118,7 +118,7 @@ > }; > > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > namespace Cheetah { > > @@ -183,7 +183,7 @@ > > } // namespace Cheetah > > -#endif // #if POOMA_CHEETAH > +#endif // #if POOMA_MESSAGING > > namespace Pooma { > > @@ -222,7 +222,8 @@ > { > return particleSwapHandler_g; > } > -#endif > + > +#endif // #if POOMA_CHEETAH > > void initializeCheetahHelpers(int contexts); > void finalizeCheetahHelpers(); > @@ -248,7 +249,7 @@ > > } > > -#endif // POOMA_CHEETAH_MESSAGING_H > +#endif // POOMA_TULIP_MESSAGING_H > > // ACL:rcsinfo > // ---------------------------------------------------------------------- > Index: src/Tulip/PatchSizeSyncer.cmpl.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Tulip/PatchSizeSyncer.cmpl.cpp,v > retrieving revision 1.6 > diff -u -u -r1.6 PatchSizeSyncer.cmpl.cpp > --- src/Tulip/PatchSizeSyncer.cmpl.cpp 9 Dec 2003 19:30:07 -0000 1.6 > +++ src/Tulip/PatchSizeSyncer.cmpl.cpp 18 Dec 2003 19:31:20 -0000 > @@ -90,7 +90,7 @@ > > void PatchSizeSyncer::calcGlobalGrid(Grid_t &globalGrid) > { > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > Grid<1> result; > > @@ -142,11 +142,11 @@ > RemoteProxy > broadcast(result,0); > globalGrid = Grid<1>(broadcast.value()); > > -#else // POOMA_CHEETAH > +#else // !POOMA_MESSAGING > > globalGrid = localGrid_m; > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > } > > > Index: src/Tulip/PatchSizeSyncer.h > =================================================================== 
> RCS file: /home/pooma/Repository/r2/src/Tulip/PatchSizeSyncer.h,v > retrieving revision 1.5 > diff -u -u -r1.5 PatchSizeSyncer.h > --- src/Tulip/PatchSizeSyncer.h 21 Oct 2003 18:47:59 -0000 1.5 > +++ src/Tulip/PatchSizeSyncer.h 18 Dec 2003 19:31:21 -0000 > @@ -96,11 +96,6 @@ > > void calcGlobalGrid(Grid_t &globalGrid); > > - // This is passed to Cheetah and is called when incoming messages > - // are received. > - > - void receiveGrid(std::pair &incoming); > - > private: > > //============================================================ > @@ -129,25 +124,12 @@ > > static int tag_s; > > - // This is the Cheetah stuff. If we don't have Cheetah, this class should > - // work in serial (it's a no-op) without sending any messages. All > - // Cheetah stuff should compile away. > - > -#if POOMA_CHEETAH > - > - friend void Pooma::initializeCheetahHelpers(int contexts); > - friend void Pooma::finalizeCheetahHelpers(); > - > - static Cheetah::MatchingHandler *handler_s; > - > -#endif // POOMA_CHEETAH > - > }; > > } // namespace Pooma > > > -#if POOMA_CHEETAH > +#if POOMA_MESSAGING > > namespace Cheetah { > > @@ -205,7 +187,7 @@ > > } // namespace Cheetah > > -#endif // POOMA_CHEETAH > +#endif // POOMA_MESSAGING > > #endif // POOMA_CHEETAH_PATCHSIZESYNCER_H > > Index: src/Tulip/tests/CollectFromContextsTest.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Tulip/tests/CollectFromContextsTest.cpp,v > retrieving revision 1.1 > diff -u -u -r1.1 CollectFromContextsTest.cpp > --- src/Tulip/tests/CollectFromContextsTest.cpp 9 Dec 2003 19:27:38 -0000 1.1 > +++ src/Tulip/tests/CollectFromContextsTest.cpp 18 Dec 2003 19:31:21 -0000 > @@ -60,6 +60,9 @@ > tester.check("Collecting ranks", check); > } > > + // We can't do the following test on !MESSAGING, as invalid data on > + // context 0 is not supported in this case. 
> +#if POOMA_MESSAGING > CollectFromContexts ranks2(Pooma::context()+1, 0, > Pooma::context() > 0 > && Pooma::context() < Pooma::contexts()-1); > @@ -73,6 +76,7 @@ > } > tester.check("Collecting ranks, but not first and last", check); > } > +#endif > > int ret = tester.results("CollectFromContextsTest"); > Pooma::finalize(); > Index: src/Tulip/tests/GridMessageTest.cpp > =================================================================== > RCS file: /home/pooma/Repository/r2/src/Tulip/tests/GridMessageTest.cpp,v > retrieving revision 1.6 > diff -u -u -r1.6 GridMessageTest.cpp > --- src/Tulip/tests/GridMessageTest.cpp 21 Sep 2001 19:02:18 -0000 1.6 > +++ src/Tulip/tests/GridMessageTest.cpp 18 Dec 2003 19:31:21 -0000 > @@ -38,8 +38,8 @@ > #include "Domain/Grid.h" > #include "Domain/Range.h" > > -#if POOMA_CHEETAH > -#include "Cheetah/Cheetah.h" > +#if POOMA_MESSAGING > +#include "Tulip/Messaging.h" > #endif > > #define BARRIER -- Jeffrey D. Oldham oldham at codesourcery.com From rguenth at tat.physik.uni-tuebingen.de Fri Dec 19 08:34:49 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Fri, 19 Dec 2003 09:34:49 +0100 (CET) Subject: [PATCH] Introduce POOMA_MESSAGING In-Reply-To: <3FE223E1.6080102@codesourcery.com> References: <3FE223E1.6080102@codesourcery.com> Message-ID: On Thu, 18 Dec 2003, Jeffrey D. Oldham wrote: > Richard Guenther wrote: > > Hi! > > > > This patch introduces POOMA_MESSAGING which is set for both Cheetah and in > > future native MPI. It also mechanically changes POOMA_CHEETAH to > > POOMA_MESSAGING tests, where appropriate. Also including of > > Cheetah/Cheetah.h is exchanged for including Tulip/Messaging.h (which in > > turn includes Cheetah/Cheetah.h and will include mpi.h for native MPI). > > > > Ok? > > Yes, but I have some questions ... > > > Richard. > > > > > > 2003Dec18 Richard Guenther > > > > * configure: add POOMA_MESSAGING define, if Cheetah is configured. 
> >       src/Domain/Grid.h: change #if POOMA_CHEETAH to #if POOMA_MESSAGING
> >       where appropriate, #include Tulip/Messaging.h rather than
> >       Cheetah/Cheetah.h.
> >       src/Tulip/Messaging.h: likewise.
>
> There seems to be Cheetah code surrounded by POOMA_MESSAGING.  Is this
> correct?
>
> >       src/Tulip/PatchSizeSyncer.cmpl.cpp: likewise.
>
> This may need changing if the next file is changed.
>
> >       src/Tulip/PatchSizeSyncer.h: likewise, remove unused declarations.
>
> Same question as for src/Tulip/Messaging.h.

Ah yes, I should have mentioned that the native MPI implementation shares
the Cheetah::Serialize classes with Cheetah. This minimizes the necessary
changes and was the least hassle. I'm going to add the
MatchingHandler/Serialize.h file from the Cheetah distribution as
Tulip/CheetahSerialize.h - but maybe we should check the license first.

Ok with this clarification?

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Fri Dec 19 08:40:34 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 19 Dec 2003 09:40:34 +0100 (CET)
Subject: Cheetah license
Message-ID:

Hi!

Does anybody have an idea what license the Cheetah distribution is under?
There is nothing mentioned at all, not even a copyright notice in any of
the files in the Cheetah distribution. So I suppose we're not allowed to
do anything with it?

Any Cheetah-ers around?

Thanks,

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Sun Dec 21 14:53:15 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sun, 21 Dec 2003 15:53:15 +0100 (CET)
Subject: [PATCH] Track up-to-date faces
Message-ID:

Hi!

This patch moves away from a bool tracking the dirtiness of the internal
guards and instead tracks the individual faces.
This allows for updating only the needed internal guards and vastly
improves performance of (my) CFD codes, as you can see from the top parts
of a flat profile:

before patch (the MultiArgKernels are the actual CFD):

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 26.19     13.59    13.59                             select
  4.99     16.18     2.59     2653     0.00     0.00  int EngineBlockSerialize::apply, Interval<(int)3> >(EngineElemSerialize&, Engine<(int)3, double, BrickView> const&, Interval<(int)3> const&)
  4.82     18.68     2.50                             read
  3.97     20.74     2.06     2653     0.00     0.00  int EngineBlockSerialize::apply, Interval<(int)3> >(EngineElemDeSerialize&, Engine<(int)3, double, BrickView> const&, Interval<(int)3> const&)
  2.58     22.08     1.34                             write
  2.51     23.38     1.30                             memcpy
  2.08     24.46     1.08      762     0.00     0.00  int EngineBlockSerialize::apply, BrickView>, Interval<(int)3> >(EngineElemSerialize&, Engine<(int)3, Vector<(int)3, double, Full>, BrickView> const&, Interval<(int)3> const&)
  1.75     25.37     0.91       10     0.09     0.09  MultiArgKernel >, double, CompFwd, BrickView>, Loc<(int)1> > >, Field >, double, BrickView>, BrickView, BrickView>, EvaluateLocLoop, (int)3> >::run()
  1.73     26.27     0.90       10     0.09     0.09  MultiArgKernel >, double, CompFwd, BrickView>, Loc<(int)1> > >, Field >, double, BrickView>, BrickView, BrickView>, EvaluateLocLoop, (int)3> >::run()
  1.73     27.17     0.90       10     0.09     0.09  MultiArgKernel >, double, CompFwd, BrickView>, Loc<(int)1> > >, Field >, double, BrickView>, BrickView, BrickView>, EvaluateLocLoop, (int)3> >::run()

after patch:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 15.75      4.77     4.77                             select
  2.97      5.67     0.90       10     0.09     0.09  MultiArgKernel >, double, CompFwd, BrickView>, Loc<(int)1> > >, Field >, double, BrickView>, BrickView, BrickView>, EvaluateLocLoop, (int)3> >::run()
  2.97      6.57     0.90       10     0.09     0.09  MultiArgKernel >, double, CompFwd, BrickView>, Loc<(int)1> > >, Field >, double, BrickView>, BrickView, BrickView>, EvaluateLocLoop, (int)3> >::run()
  2.97      7.47     0.90       10     0.09     0.09  MultiArgKernel >, double, CompFwd, BrickView>, Loc<(int)1> > >, Field >, double, BrickView>, BrickView, BrickView>, EvaluateLocLoop, (int)3> >::run()
  2.91      8.35     0.88       10     0.09     0.09  MultiArgKernel >, double, CompFwd, BrickView>, Loc<(int)1> > >, Field >, double, BrickView>, BrickView, BrickView>, EvaluateLocLoop, (int)3> >::run()
...
...
  1.65     20.66     0.50      453     0.00     0.00  int EngineBlockSerialize::apply, Interval<(int)3> >(EngineElemSerialize&, Engine<(int)3, double, BrickView> const&, Interval<(int)3> const&)
  1.16     26.07     0.35      371     0.00     0.00  int EngineBlockSerialize::apply, Interval<(int)3> >(EngineElemDeSerialize&, Engine<(int)3, double, BrickView> const&, Interval<(int)3> const&)
  0.46     28.58     0.14       80     0.00     0.00  int EngineBlockSerialize::apply, BrickView>, Interval<(int)3> >(EngineElemSerialize&, Engine<(int)3, Vector<(int)3, double, Full>, BrickView> const&, Interval<(int)3> const&)
  0.23     29.20     0.07       70     0.00     0.00  int EngineBlockSerialize::apply, BrickView>, Interval<(int)3> >(EngineElemDeSerialize&, Engine<(int)3, Vector<(int)3, double, Full>, BrickView> const&, Interval<(int)3> const&)
                     0.50     0.00     453/453        int EngineBlockSerialize::apply, Interval<(int)3> >(EngineElemSerialize&, Engine<(int)3, double, BrickView> const&, Interval<(int)3> const&) [36]

where the engine serializers are way down the profile (I grepped for them
and appended the first five). Notice the drop in the number of
communications from 2653 down to 453! Timewise this is an improvement of
more than 50%.
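A minimal sketch of the per-face bookkeeping this message describes, assuming one bit per face, with face 2*d for the lower side and face 2*d+1 for the upper side of dimension d (the convention that `p->face_m/2` and `p->face_m & 1` in the patch suggest). `FaceDirtyFlags` and `fillNeededFaces` are hypothetical names used only for illustration, not POOMA classes, and the actual guard-cell copy is replaced by a counter:

```cpp
#include <cassert>

// Hypothetical stand-in for an engine's guard state: one dirty bit per face.
// Assumed numbering: face 2*d is the lower side of dimension d,
// face 2*d+1 is the upper side.
template <int Dim>
class FaceDirtyFlags
{
public:
  enum { faces = 2 * Dim, allFaces = (1 << (2 * Dim)) - 1 };

  // Nothing has been exchanged yet, so every face starts out dirty.
  FaceDirtyFlags() : bits_m(allFaces) { }

  // Mark one face dirty, or all faces if face == -1.
  void setDirty(int face = -1)
  {
    if (face == -1)
      bits_m = allFaces;
    else
      {
        assert(face >= 0 && face < faces);
        bits_m |= (1 << face);
      }
  }

  // Mark one face up-to-date, or all faces if face == -1.
  void clearDirty(int face = -1)
  {
    if (face == -1)
      bits_m = 0;
    else
      {
        assert(face >= 0 && face < faces);
        bits_m &= ~(1 << face);
      }
  }

  // Is the given face dirty?  With face == -1: is any face dirty?
  bool isDirty(int face = -1) const
  {
    if (face == -1)
      return bits_m != 0;
    assert(face >= 0 && face < faces);
    return (bits_m & (1 << face)) != 0;
  }

private:
  int bits_m;
};

// Refresh only those faces that are dirty AND needed by the caller.
// lowerNeeded[d] / upperNeeded[d] give the guard width required on each
// side of dimension d (0 means the face is not read at all).
// Returns the number of faces refreshed.
template <int Dim>
int fillNeededFaces(FaceDirtyFlags<Dim> &flags,
                    const int *lowerNeeded, const int *upperNeeded)
{
  int refreshed = 0;
  for (int face = 0; face < 2 * Dim; ++face)
    {
      int d = face / 2;
      int needed = (face & 1) ? upperNeeded[d] : lowerNeeded[d];
      if (needed == 0 || !flags.isDirty(face))
        continue;

      // A real engine would copy the guard cells of this face here.
      flags.clearDirty(face);
      ++refreshed;
    }
  return refreshed;
}
```

Under these assumptions, an expression that reaches one cell along the first dimension only would refresh two faces of a 3D patch instead of all six, and a repeated evaluation would refresh none until something marks a face dirty again.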
It passes without regressions (but these codepaths are only slightly
tested in the testsuite) and looks like it produces identical results for
my CFD application. But I notice an asymmetry of the
Serialization/Deserialization calls after the patch and need to find out
where this comes from.

But still, is the underlying idea to change bool *pDirty to int *pDirty
and use it as a bitfield ok? I can even go further and track partial
updates, but this will cost memory.

Any comments? Further ideas?

Richard.

Too lazy to do a ChangeLog at the moment.

===== r2/src/Array/tests/makefile 1.4 vs edited ===== --- 1.4/r2/src/Array/tests/makefile Thu Jan 30 22:35:28 2003 +++ edited/r2/src/Array/tests/makefile Sun Dec 21 15:14:56 2003 @@ -39,7 +39,7 @@ array_test12 array_test13 array_test14 array_test15 array_test16 \ array_test17 array_test18 array_test19 array_test20 array_test21 \ array_test22 array_test23 array_test24 array_test25 array_test26 \ - array_test27 array_test28 + array_test27 array_test28 array_test29 default:: build
===== r2/src/Engine/Intersector.h 1.3 vs edited ===== --- 1.3/r2/src/Engine/Intersector.h Thu Oct 23 14:41:01 2003 +++ edited/r2/src/Engine/Intersector.h Sun Dec 21 15:14:56 2003 @@ -145,9 +145,47 @@ // If we've seen this ID before, we're done. if (ids_m[i] == layout.ID()) - { return false; + + // If we've seen the base ID before and the base domain is the same + // we're done. + + if (baseIDs_m[i] == layout.baseID() + && sameBaseDomain(i, layout.baseDomain(), guard)) + { + shared(layout.ID(),ids_m[i]); + + return (!sameBaseDomain(i,layout.baseDomain())); } + } + + // current touches operation works on the owned region, so we don't + // use the guard cells. If we start using touchesAlloc, then you + // need to return true here, and the bypass calculation above + // becomes somewhat more complicated.
+ + touches(layout); + return false; + } + + template + bool intersect(const Engine &engine, const GuardLayers &guard, GuardLayers &usedGuards) + { + CTAssert(Engine::dimensions == Dim); + + // First, we need to check through our list of layout IDs and see if we've + // either seen this layout or another layout with the same baseID before. + + typedef typename Engine::Layout_t Layout_t; + const Layout_t &layout(engine.layout()); + + int n = ids_m.size(); + for (int i = 0; i < n; ++i) + { + // If we've seen this ID before, we're done. + + if (ids_m[i] == layout.ID()) + return false; // If we've seen the base ID before and the base domain is the same // we're done. @@ -157,10 +195,27 @@ { shared(layout.ID(),ids_m[i]); - // In this case we are using the guard cells unless this domain - // is exactly the same as one we've seen before. + // was: return (!sameBaseDomain(i,layout.baseDomain())); - return (!sameBaseDomain(i,layout.baseDomain())); + // We should be able to find out the actual shape of the + // used internal guards here, rather than just returning bool. + // Something like: + + // But what do, if Dim2 > baseDims_m[i]!? 
+ if (baseDims_m[i] < Dim2) + return true; + + bool used = false; + for (int j = 0; j < Dim2; j++) + { + usedGuards.lower(j) = std::max(0, baseDomains_m[i][j].first() - layout.baseDomain()[j].first()); + if (usedGuards.lower(j) != 0) + used = true; + usedGuards.upper(j) = std::max(0, layout.baseDomain()[j].last() - baseDomains_m[i][j].last()); + if (usedGuards.upper(j) != 0) + used = true; + } + return used; } } @@ -440,6 +495,13 @@ bool intersect(const Engine &l, const GuardLayers &guard) { return (data()->intersect(l,guard)); + } + + template + inline + bool intersect(const Engine &l, const GuardLayers &guard, GuardLayers &usedGuards) + { + return (data()->intersect(l,guard,usedGuards)); } private: ===== r2/src/Engine/MultiPatchEngine.cpp 1.3 vs edited ===== --- 1.3/r2/src/Engine/MultiPatchEngine.cpp Wed May 14 09:48:40 2003 +++ edited/r2/src/Engine/MultiPatchEngine.cpp Sun Dec 21 15:14:56 2003 @@ -36,6 +36,7 @@ #include "Tulip/ReduceOverContexts.h" #include "Threads/PoomaCSem.h" #include "Domain/IteratorPairDomain.h" +#include "Domain/Shrink.h" /////////////////////////////////////////////////////////////////////////////// // @@ -77,10 +78,12 @@ Engine(const Layout_t &layout) : layout_m(layout), data_m(layout.sizeGlobal()), - pDirty_m(new bool(true)) + pDirty_m(new int) { typedef typename Layout_t::Value_t Node_t; + setDirty(); + // check for correct match of PatchTag and the mapper used to make the // layout. // THIS IS A HACK! we test on the context of the first patch, and if it @@ -247,7 +250,7 @@ PAssert(data_m.isValid()); if (data_m.isShared()) { data_m.makeOwnCopy(); - pDirty_m = new bool(*pDirty_m); + pDirty_m = new int(*pDirty_m); } return *this; @@ -288,18 +291,89 @@ int src = p->ownedID_m; int dest = p->guardID_m; - // Create patch arrays that see the entire patch: + // Skip face, if not dirty. 
+ + if (isDirty(p->face_m)) { + + // Create patch arrays that see the entire patch: - Array lhs(data()[dest]), rhs(data()[src]); + Array lhs(data()[dest]), rhs(data()[src]); - // Now do assignment from the subdomains. + // Now do assignment from the subdomains. - lhs(p->domain_m) = rhs(p->domain_m); + lhs(p->domain_m) = rhs(p->domain_m); + } + ++p; } - - *pDirty_m = false; + + clearDirty(); +} + +template +void Engine >:: +fillGuardsHandler(const GuardLayers& g, const WrappedInt &) const +{ + if (!isDirty()) return; + + int updated = 0; + typename Layout_t::FillIterator_t p = layout_m.beginFillList(); + + while (p != layout_m.endFillList()) + { + int src = p->ownedID_m; + int dest = p->guardID_m; + + // Skip face, if not dirty. + + if (isDirty(p->face_m)) { + + // Check, if the p->domain_m is a guard which matches the + // needed guard g. + + int d = p->face_m/2; + int guardSizeNeeded = p->face_m & 1 ? g.upper(d) : g.lower(d); + if (!(p->face_m != -1 + && guardSizeNeeded == 0)) { + + // Create patch arrays that see the entire patch: + + Array lhs(data()[dest]), rhs(data()[src]); + + // Shrink domain, if possible. Maybe not that useful, as + // we can't record this update. + + Interval domain = p->domain_m; +#if POOMA_PARTIAL_GUARDS_UPDATE + int s = domain[d].size(); + if (s > guardSizeNeeded) { + if (p->face_m & 1) + domain[d] = shrinkRight(domain[d], s - guardSizeNeeded); + else + domain[d] = shrinkLeft(domain[d], s - guardSizeNeeded); + } +#endif + + // Now do assignment from the subdomains. + + lhs(domain) = rhs(domain); + + // Mark up-to-date, if updated completely. 
+ +#if POOMA_PARTIAL_GUARDS_UPDATE + if (s == guardSizeNeeded) +#endif + updated |= 1<face_m; + + } + + } + + ++p; + } + + *pDirty_m &= ~updated; } @@ -331,7 +405,7 @@ ++p; } - *pDirty_m = true; + setDirty(); } @@ -366,7 +440,7 @@ ++p; } - *pDirty_m = true; + setDirty(); } ===== r2/src/Engine/MultiPatchEngine.h 1.2 vs edited ===== --- 1.2/r2/src/Engine/MultiPatchEngine.h Thu Oct 23 14:41:01 2003 +++ edited/r2/src/Engine/MultiPatchEngine.h Sun Dec 21 15:14:56 2003 @@ -633,9 +633,17 @@ fillGuardsHandler(WrappedInt()); } + inline void fillGuards(const GuardLayers& g) const + { + fillGuardsHandler(g, WrappedInt()); + } + inline void fillGuardsHandler(const WrappedInt&) const { }; void fillGuardsHandler(const WrappedInt&) const ; + inline void fillGuardsHandler(const GuardLayers&, const WrappedInt&) const { }; + void fillGuardsHandler(const GuardLayers&, const WrappedInt&) const ; + //--------------------------------------------------------------------------- /// Set the internal guard cells to a particular value. @@ -650,14 +658,34 @@ /// Set and get the dirty flag (fillGuards is a no-op unless the /// dirty flag is true). - inline void setDirty() const + inline void setDirty(int face = -1) const { - *pDirty_m = true; + if (face == -1) + *pDirty_m = (1<<(Dim*2))-1; + else { + PAssert(face >= 0 && face <= Dim*2-1); + *pDirty_m |= (1<= 0 && face <= Dim*2-1); + *pDirty_m &= ~(1<= 0 && face <= Dim*2-1); + return *pDirty_m & (1<& g) const + { + baseEngine_m.fillGuards(g); + } + //--------------------------------------------------------------------------- /// Set the internal guard cells to a particular value (default zero) @@ -1213,14 +1246,19 @@ /// Set and get the dirty flag (fillGuard is a no-op unless the /// dirty flag is true). 
- inline void setDirty() const + inline void setDirty(int face=-1) const + { + baseEngine_m.setDirty(face); + } + + inline void clearDirty(int face=-1) const { - baseEngine_m.setDirty(); + baseEngine_m.clearDirty(face); } - inline bool isDirty() const + inline bool isDirty(int face=-1) const { - return baseEngine_m.isDirty(); + return baseEngine_m.isDirty(face); } //--------------------------------------------------------------------------- @@ -1694,12 +1732,13 @@ apply(const Engine > &engine, const ExpressionApply > &tag) { + GuardLayers usedGuards; bool useGuards = tag.tag().intersector_m.intersect(engine, - engine.layout().internalGuards()); + engine.layout().internalGuards(), usedGuards); if (useGuards) - engine.fillGuards(); + engine.fillGuards(usedGuards); return 0; } @@ -1725,13 +1764,14 @@ const ExpressionApply > &tag, const WrappedInt &) { + GuardLayers usedGuards; bool useGuards = tag.tag().intersector_m. intersect(engine, - engine.layout().baseLayout().internalGuards()); + engine.layout().baseLayout().internalGuards(), usedGuards); if (useGuards) - engine.fillGuards(); + engine.fillGuards(usedGuards); return 0; } ===== r2/src/Engine/Stencil.h 1.5 vs edited ===== --- 1.5/r2/src/Engine/Stencil.h Thu Oct 23 14:41:01 2003 +++ edited/r2/src/Engine/Stencil.h Sun Dec 21 15:14:56 2003 @@ -752,11 +752,14 @@ StencilIntersector(const This_t &model) : domain_m(model.domain_m), + stencilExtent_m(model.stencilExtent_m), intersector_m(model.intersector_m) { } - StencilIntersector(const Interval &domain, const Intersect &intersect) + StencilIntersector(const Interval &domain, const Intersect &intersect, + const GuardLayers &stencilExtent) : domain_m(domain), + stencilExtent_m(stencilExtent), intersector_m(intersect) { } @@ -766,6 +769,7 @@ { intersector_m = model.intersector_m; domain_m = model.domain_m; + stencilExtent_m = model.stencilExtent_m; } return *this; } @@ -813,8 +817,21 @@ return true; } + template + inline + bool intersect(const Engine &engine, const 
GuardLayers &g, + GuardLayers &usedGuards) + { + intersect(engine); + // FIXME: accumulate used guards from intersect above and + // stencil extent? I.e. allow Stencil<>(a(i-1)+a(i+1))? + usedGuards = stencilExtent_m; + return true; + } + private: Interval domain_m; + GuardLayers stencilExtent_m; Intersect intersector_m; }; @@ -833,8 +850,14 @@ const ExpressionApply > &tag) { typedef StencilIntersector NewIntersector_t; + GuardLayers stencilExtent; + for (int i=0; i(newIntersector)); ===== r2/src/Evaluator/MultiArgEvaluator.h 1.5 vs edited ===== --- 1.5/r2/src/Evaluator/MultiArgEvaluator.h Tue Nov 25 16:39:02 2003 +++ edited/r2/src/Evaluator/MultiArgEvaluator.h Sun Dec 21 15:19:16 2003 @@ -111,19 +111,16 @@ } template - void operator()(const A &a, bool f) const + void operator()(const A &a) const { - if (f) - { - // This isn't quite what we want here, because we may want to - // write to a field containing multiple centering engines. - // Need to rewrite notifyEngineWrite as an ExpressionApply, - // and create a version of ExpressionApply that goes through - // all the engines in a field. + // This isn't quite what we want here, because we may want to + // write to a field containing multiple centering engines. + // Need to rewrite notifyEngineWrite as an ExpressionApply, + // and create a version of ExpressionApply that goes through + // all the engines in a field. 
- notifyEngineWrite(a.engine()); - dirtyRelations(a, WrappedInt()); - } + notifyEngineWrite(a.engine()); + dirtyRelations(a, WrappedInt()); } }; @@ -172,7 +169,7 @@ MultiArgEvaluator::evaluate(multiArg, function, domain, info, kernel); - applyMultiArg(multiArg, EngineWriteNotifier(), info.writers()); + applyMultiArgIf(multiArg, EngineWriteNotifier(), info.writers()); Pooma::endExpression(); } @@ -265,7 +262,12 @@ const Kernel &kernel) { typedef SimpleIntersector Inter_t; - Inter_t inter(domain); + GuardLayers extent; + for (int i=0; i Inter_t; - Inter_t inter(domain); + GuardLayers extent; + for (int i=0; i &domain) - : seenFirst_m(false), domain_m(domain) + inline SimpleIntersectorData(const Interval &domain, const GuardLayers &extent) + : seenFirst_m(false), domain_m(domain), extent_m(extent) { } @@ -149,6 +149,7 @@ INodeContainer_t inodes_m; GlobalIDDataBase gidStore_m; Interval domain_m; + GuardLayers extent_m; }; /** @@ -179,8 +180,8 @@ enum { dimensions = Dim }; - SimpleIntersector(const Interval &domain) - : pdata_m(new SimpleIntersectorData_t(domain)), useGuards_m(true) + SimpleIntersector(const Interval &domain, const GuardLayers &extent) + : pdata_m(new SimpleIntersectorData_t(domain, extent)), useGuards_m(true) { } SimpleIntersector(const This_t &model) @@ -297,7 +298,7 @@ apply.tag().intersect(engine); if (apply.tag().useGuards()) - engine.fillGuards(); + engine.fillGuards(apply.tag().data()->extent_m); return 0; } @@ -316,7 +317,7 @@ apply.tag().intersect(engine); if (apply.tag().useGuards()) - engine.fillGuards(); + engine.fillGuards(apply.tag().data()->extent_m); return 0; } ===== r2/src/Field/DiffOps/FieldStencil.h 1.3 vs edited ===== --- 1.3/r2/src/Field/DiffOps/FieldStencil.h Sun Oct 26 14:35:20 2003 +++ edited/r2/src/Field/DiffOps/FieldStencil.h Sun Dec 21 15:14:57 2003 @@ -614,11 +614,13 @@ // Constructors FieldStencilIntersector(const This_t &model) - : domain_m(model.domain_m), intersector_m(model.intersector_m) + : domain_m(model.domain_m), 
stencilExtent_m(model.stencilExtent_m), + intersector_m(model.intersector_m) { } - FieldStencilIntersector(const Domain_t &dom, const Intersect &intersect) - : domain_m(dom), intersector_m(intersect) + FieldStencilIntersector(const Domain_t &dom, const Intersect &intersect, + const GuardLayers &stencilExtent) + : domain_m(dom), stencilExtent_m(stencilExtent), intersector_m(intersect) { } This_t &operator=(const This_t &model) @@ -626,6 +628,7 @@ if (this != &model) { domain_m = model.domain_m; + stencilExtent_m = model.stencilExtent_m; intersector_m = model.intersector_m; } return *this; @@ -668,10 +671,22 @@ return true; } + template + inline bool intersect(const Engine &engine, const GuardLayers &, + GuardLayers &usedGuards) + { + intersect(engine); + // FIXME: accumulate used guards from intersect above and + // stencil extent? I.e. allow Stencil<>(a(i-1)+a(i+1))? + usedGuards = stencilExtent_m; + return true; + } + private: Interval domain_m; + GuardLayers stencilExtent_m; Intersect intersector_m; }; @@ -699,8 +714,14 @@ // cells results in an error in the multipatch inode view.) typedef FieldStencilIntersector NewIntersector_t; + GuardLayers stencilExtent; + for (int i=0; i(newIntersector)); ===== r2/src/Layout/GridLayout.cpp 1.4 vs edited ===== --- 1.4/r2/src/Layout/GridLayout.cpp Wed May 14 09:51:04 2003 +++ edited/r2/src/Layout/GridLayout.cpp Sun Dec 21 15:14:41 2003 @@ -429,7 +429,7 @@ // Now, push IDs and source into cache... - this->gcFillList_m.push_back(GCFillInfo_t(gcdom, sourceID, destID)); + this->gcFillList_m.push_back(GCFillInfo_t(gcdom, sourceID, destID, d*2)); } } } @@ -481,7 +481,7 @@ // Now, push IDs and source into cache... 
- this->gcFillList_m.push_back(GCFillInfo_t(gcdom, sourceID, destID)); + this->gcFillList_m.push_back(GCFillInfo_t(gcdom, sourceID, destID, d*2+1)); } } } ===== r2/src/Layout/LayoutBase.h 1.3 vs edited ===== --- 1.3/r2/src/Layout/LayoutBase.h Sun Oct 26 14:35:23 2003 +++ edited/r2/src/Layout/LayoutBase.h Sun Dec 21 15:14:41 2003 @@ -119,8 +119,8 @@ struct GCFillInfo { - GCFillInfo(const Domain_t &dom, int ownedID, int guardID) - : domain_m(dom), ownedID_m(ownedID), guardID_m(guardID) { } + GCFillInfo(const Domain_t &dom, int ownedID, int guardID, int face=-1) + : domain_m(dom), ownedID_m(ownedID), guardID_m(guardID), face_m(face) { } // Get a CW warning about this not having a default constructor // when we instantiate the vector below. This never @@ -131,6 +131,7 @@ Domain_t domain_m; // guard layer domain int ownedID_m; // node ID for which domain_m is owned int guardID_m; // node ID for which domain_m is in the guards + int face_m; // destination face of the guard layer (or -1, if unknown) Domain_t & domain() { return domain_m;} int & ownedID() { return ownedID_m;} ===== r2/src/Layout/UniformGridLayout.cpp 1.4 vs edited ===== --- 1.4/r2/src/Layout/UniformGridLayout.cpp Wed May 14 09:51:04 2003 +++ edited/r2/src/Layout/UniformGridLayout.cpp Sun Dec 21 15:14:41 2003 @@ -370,7 +370,7 @@ this->all_m[sourceID]->context() == Pooma::context() || this->all_m[destID]->context() == Pooma::context() ) - this->gcFillList_m.push_back(GCFillInfo_t(gcdom,sourceID,destID)); + this->gcFillList_m.push_back(GCFillInfo_t(gcdom,sourceID,destID,d*2)); } } @@ -417,7 +417,7 @@ this->all_m[sourceID]->context() == Pooma::context() || this->all_m[destID]->context() == Pooma::context() ) - this->gcFillList_m.push_back(GCFillInfo_t(gcdom,sourceID,destID)); + this->gcFillList_m.push_back(GCFillInfo_t(gcdom,sourceID,destID,d*2+1)); } } } From rguenth at tat.physik.uni-tuebingen.de Sun Dec 21 15:38:30 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Sun, 21 Dec 2003 
16:38:30 +0100 (CET)
Subject: [pooma-dev] [PATCH] Track up-to-date faces
In-Reply-To:
References:
Message-ID:

On Sun, 21 Dec 2003, Richard Guenther wrote:

> It passes without regressions (but these codepaths are only slightly tested
> in the testsuite) and looks like it produces identical results for my CFD
> application. But I notice an asymmetry of the
> Serialization/Deserialization calls after the patch and need to find out
> where this comes from.

It turned out to be an asymmetry in the numerical scheme, so no worries.

Now, ok to apply? ChangeLog below.

Richard.


2003Dec21  Richard Guenther

    * src/Array/tests/makefile: add new test.
    src/Array/tests/array_test29.cpp: new test.
    src/Engine/Intersector.h: track used guards.
    src/Engine/MultiPatchEngine.cpp: new fillGuardsHandler updating
    only needed guards. Replace dirty handling.
    src/Engine/MultiPatchEngine.h: new fillGuardsHandler, replace
    bool pDirty with bitfield.
    src/Engine/Stencil.h: track used guards.
    src/Evaluator/MultiArgEvaluator.h: likewise.
    src/Evaluator/SimpleIntersector.h: likewise.
    src/Field/DiffOps/FieldStencil.h: likewise.
    src/Layout/GridLayout.cpp: remember face of guard update.
    src/Layout/LayoutBase.h: likewise.
    src/Layout/UniformGridLayout.cpp: likewise.

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 23 19:17:37 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 23 Dec 2003 20:17:37 +0100 (CET)
Subject: [PATCH] OpenMP loop level parallelism
In-Reply-To:
References:
Message-ID:

On Fri, 28 Nov 2003, Richard Guenther wrote:

> The attached patch adds loop-level parallelism via OpenMP directives. It
> is tested with a full regtest using 2 threads and the Intel compiler 8.0
> on an ia32 machine with no regressions compared to non-OpenMP compilation.
> Performance and scaling were _not_ evaluated yet (I will have a 4 processor
> Itanium available within the next few weeks).

I was just able to check the performance on the 4 processor Itanium and
scaling is very good (though 4 processors is not a lot) for my CFD code:

  1 CPU: 4.9s/iteration
  2 CPU: 2.4s/iteration
  3 CPU: 1.7s/iteration
  4 CPU: 1.3s/iteration

This adds an easy way to explore parallelism in POOMA based applications
and is certainly better than non-compiling SMARTS.

> So this is a request for comments and comparisons with parallelization
> using threads from SMARTS. Anyone interested should report
> success/failure.
>
> Suggested operation is compiling the library in serial mode (with openmp
> enabled, edit the config/arch/ file) and best use single patch engines or
> at least only a single patch with multi-patch engines.
>
> Thanks for any comments,

And there were apparently no objections in general to the patch, so, ok
to apply? (It went through regression testing with OpenMP enabled when I
first submitted it, and many more times since then with OpenMP disabled.)

Richard.


2003Dec23  Richard Guenther

    * Evaluator/InlineEvaluator.h: parallelize loops with
    #pragma omp parallel for.
    Evaluator/LoopApply.h: likewise.
    Evaluator/ReductionEvaluator.h: likewise, do final reduction manually.
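
For readers skimming the ChangeLog, "do final reduction manually" can be pictured roughly as follows. This is only a sketch under assumptions, not the code from the patch: parallelReduce and Sum are hypothetical names, the data lives in a std::vector instead of a POOMA expression, and the per-thread slots are a local vector rather than a static array. Each thread accumulates its own partial answer, and the partial answers are combined serially after the parallel region.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper, not part of POOMA: reduce data[0..n) with op.
// `identity` must be the neutral element of op (0 for a sum, 1 for a product).
template <class T, class Op>
T parallelReduce(const std::vector<T>& data, T identity, Op op)
{
#ifdef _OPENMP
  const int maxThreads = omp_get_max_threads();
#else
  const int maxThreads = 1;   // serial build: the pragmas below are ignored
#endif
  // One slot per possible thread; unused slots keep the identity value.
  std::vector<T> partial(maxThreads, identity);
  const long n = static_cast<long>(data.size());

#pragma omp parallel
  {
#ifdef _OPENMP
    const int me = omp_get_thread_num();
#else
    const int me = 0;
#endif
    T answer = identity;      // thread-private accumulator
#pragma omp for nowait
    for (long i = 0; i < n; ++i)
      op(answer, data[i]);
    partial[me] = answer;     // store this thread's partial result
  }

  // Final reduction done by hand, serially, over the per-thread results.
  T result = identity;
  for (int t = 0; t < maxThreads; ++t)
    op(result, partial[t]);
  return result;
}

// Example operator in the op(accumulator, value) style used in the patch.
struct Sum {
  void operator()(double& acc, double x) const { acc += x; }
};
```

Combining by hand avoids relying on OpenMP's built-in reduction clause, which only covers a fixed set of operators and types, so it also works for user-defined operators. The sketch compiles with or without OpenMP enabled.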
Index: Evaluator/InlineEvaluator.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Evaluator/InlineEvaluator.h,v retrieving revision 1.28 diff -u -u -r1.28 InlineEvaluator.h --- Evaluator/InlineEvaluator.h 22 Oct 2003 20:43:26 -0000 1.28 +++ Evaluator/InlineEvaluator.h 27 Nov 2003 20:57:35 -0000 @@ -149,6 +149,7 @@ LHS localLHS(lhs); RHS localRHS(rhs); int e0 = domain[0].length(); +#pragma omp parallel for if (e0 > 512) for (int i0=0; i0 512) for (int i0 = f0; i0 <= e0; ++i0) op(i0); } @@ -116,6 +117,7 @@ int f1 = domain[1].first(); int e0 = domain[0].last(); int e1 = domain[1].last(); +#pragma omp parallel for for (int i1 = f1; i1 <= e1; ++i1) for (int i0 = f0; i0 <= e0; ++i0) op(i0, i1); @@ -131,6 +133,7 @@ int e0 = domain[0].last(); int e1 = domain[1].last(); int e2 = domain[2].last(); +#pragma omp parallel for for (int i2 = f2; i2 <= e2; ++i2) for (int i1 = f1; i1 <= e1; ++i1) for (int i0 = f0; i0 <= e0; ++i0) @@ -149,6 +152,7 @@ int e1 = domain[1].last(); int e2 = domain[2].last(); int e3 = domain[3].last(); +#pragma omp parallel for for (int i3 = f3; i3 <= e3; ++i3) for (int i2 = f2; i2 <= e2; ++i2) for (int i1 = f1; i1 <= e1; ++i1) @@ -170,6 +174,7 @@ int e2 = domain[2].last(); int e3 = domain[3].last(); int e4 = domain[4].last(); +#pragma omp parallel for for (int i4 = f4; i4 <= e4; ++i4) for (int i3 = f3; i3 <= e3; ++i3) for (int i2 = f2; i2 <= e2; ++i2) @@ -194,6 +199,7 @@ int e3 = domain[3].last(); int e4 = domain[4].last(); int e5 = domain[5].last(); +#pragma omp parallel for for (int i5 = f5; i5 <= e5; ++i5) for (int i4 = f4; i4 <= e4; ++i4) for (int i3 = f3; i3 <= e3; ++i3) @@ -221,6 +227,7 @@ int e4 = domain[4].last(); int e5 = domain[5].last(); int e6 = domain[6].last(); +#pragma omp parallel for for (int i6 = f6; i6 <= e6; ++i6) for (int i5 = f5; i5 <= e5; ++i5) for (int i4 = f4; i4 <= e4; ++i4) Index: Evaluator/ReductionEvaluator.h 
=================================================================== RCS file: /home/pooma/Repository/r2/src/Evaluator/ReductionEvaluator.h,v retrieving revision 1.9 diff -u -u -r1.9 ReductionEvaluator.h --- Evaluator/ReductionEvaluator.h 29 Oct 2003 20:13:27 -0000 1.9 +++ Evaluator/ReductionEvaluator.h 27 Nov 2003 20:57:36 -0000 @@ -108,6 +108,56 @@ }; +/** + * Class to hold static array for partial reduction results + * and routine for final reduction. Two versions, one dummy + * for non-OpenMP, one for OpenMP operation. + */ + +#ifndef _OPENMP +template +struct PartialReduction { + static inline void init() {} + inline void storePartialResult(const T& result) + { + answer = result; + } + template + inline void reduce(T& ret, const Op&) + { + ret = answer; + } + T answer; +}; +#else +template +struct PartialReduction { + static inline void init() + { + if (!answer) + answer = new T[omp_get_max_threads()]; + } + inline void storePartialResult(const T& result) + { + int n = omp_get_thread_num(); + answer[n] = result; + if (n == 0) + num_threads = omp_get_num_threads(); + } + template + inline void reduce(T& ret, const Op& op) + { + T res = answer[0]; + for (int i = 1; i +T *PartialReduction::answer = NULL; +#endif //----------------------------------------------------------------------------- @@ -130,6 +180,7 @@ template<> struct ReductionEvaluator { + //--------------------------------------------------------------------------- // Input an expression and cause it to be evaluated. // All this template function does is extract the domain @@ -139,6 +190,7 @@ inline static void evaluate(T &ret, const Op &op, const Expr &e) { typedef typename Expr::Domain_t Domain_t; + PartialReduction::init(); evaluate(ret, op, e, e.domain(), WrappedInt()); } @@ -171,7 +223,7 @@ // // NOTE: These loops assume that the domain passed in is a unit-stride // domain starting at 0. Assertions are made to make sure this is true. 
- + template inline static void evaluate(T &ret, const Op &op, const Expr &e, const Domain &domain, WrappedInt<1>) @@ -181,9 +233,16 @@ Expr localExpr(e); int e0 = domain[0].length(); - T answer = ReductionTraits::identity(); - for (int i0 = 0; i0 < e0; ++i0) - op(answer, localExpr.read(i0)); + PartialReduction reduction; +#pragma omp parallel if (e0 > 512) + { + T answer = ReductionTraits::identity(); +#pragma omp for nowait + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0)); + reduction.storePartialResult(answer); + } + reduction.reduce(ret, op); ret = answer; } @@ -199,12 +258,17 @@ int e0 = domain[0].length(); int e1 = domain[1].length(); - T answer = ReductionTraits::identity(); - for (int i1 = 0; i1 < e1; ++i1) - for (int i0 = 0; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1)); - - ret = answer; + PartialReduction reduction; +#pragma omp parallel + { + T answer = ReductionTraits::identity(); +#pragma omp for nowait + for (int i1 = 0; i1 < e1; ++i1) + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1)); + reduction.storePartialResult(answer); + } + reduction.reduce(ret, op); } template @@ -220,13 +284,18 @@ int e1 = domain[1].length(); int e2 = domain[2].length(); - T answer = ReductionTraits::identity(); - for (int i2 = 0; i2 < e2; ++i2) - for (int i1 = 0; i1 < e1; ++i1) - for (int i0 = 0; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2)); - - ret = answer; + PartialReduction reduction; +#pragma omp parallel + { + T answer = ReductionTraits::identity(); +#pragma omp for nowait + for (int i2 = 0; i2 < e2; ++i2) + for (int i1 = 0; i1 < e1; ++i1) + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2)); + reduction.storePartialResult(answer); + } + reduction.reduce(ret, op); } template @@ -244,14 +313,19 @@ int e2 = domain[2].length(); int e3 = domain[3].length(); - T answer = ReductionTraits::identity(); - for (int i3 = 0; i3 < e3; ++i3) - for (int i2 = 0; i2 < e2; ++i2) - for (int i1 = 0; i1 < e1; 
++i1) - for (int i0 = 0; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3)); - - ret = answer; + PartialReduction reduction; +#pragma omp parallel + { + T answer = ReductionTraits::identity(); +#pragma omp for nowait + for (int i3 = 0; i3 < e3; ++i3) + for (int i2 = 0; i2 < e2; ++i2) + for (int i1 = 0; i1 < e1; ++i1) + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2, i3)); + reduction.storePartialResult(answer); + } + reduction.reduce(ret, op); } template @@ -271,15 +345,20 @@ int e3 = domain[3].length(); int e4 = domain[4].length(); - T answer = ReductionTraits::identity(); - for (int i4 = 0; i4 < e4; ++i4) - for (int i3 = 0; i3 < e3; ++i3) - for (int i2 = 0; i2 < e2; ++i2) - for (int i1 = 0; i1 < e1; ++i1) - for (int i0 = 0; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3, i4)); - - ret = answer; + PartialReduction reduction; +#pragma omp parallel + { + T answer = ReductionTraits::identity(); +#pragma omp for nowait + for (int i4 = 0; i4 < e4; ++i4) + for (int i3 = 0; i3 < e3; ++i3) + for (int i2 = 0; i2 < e2; ++i2) + for (int i1 = 0; i1 < e1; ++i1) + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2, i3, i4)); + reduction.storePartialResult(answer); + } + reduction.reduce(ret, op); } template @@ -301,16 +380,21 @@ int e4 = domain[4].length(); int e5 = domain[5].length(); - T answer = ReductionTraits::identity(); - for (int i5 = 0; i5 < e5; ++i5) - for (int i4 = 0; i4 < e4; ++i4) - for (int i3 = 0; i3 < e3; ++i3) - for (int i2 = 0; i2 < e2; ++i2) - for (int i1 = 0; i1 < e1; ++i1) - for (int i0 = 0; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3, i4, i5)); - - ret = answer; + PartialReduction reduction; +#pragma omp parallel + { + T answer = ReductionTraits::identity(); +#pragma omp for nowait + for (int i5 = 0; i5 < e5; ++i5) + for (int i4 = 0; i4 < e4; ++i4) + for (int i3 = 0; i3 < e3; ++i3) + for (int i2 = 0; i2 < e2; ++i2) + for (int i1 = 0; i1 < e1; ++i1) + for (int i0 = 0; i0 < e0; 
++i0) + op(answer, localExpr.read(i0, i1, i2, i3, i4, i5)); + reduction.storePartialResult(answer); + } + reduction.reduce(ret, op); } template @@ -334,17 +418,22 @@ int e5 = domain[5].length(); int e6 = domain[6].length(); - T answer = ReductionTraits::identity(); - for (int i6 = 0; i6 < e6; ++i6) - for (int i5 = 0; i5 < e5; ++i5) - for (int i4 = 0; i4 < e4; ++i4) - for (int i3 = 0; i3 < e3; ++i3) - for (int i2 = 0; i2 < e2; ++i2) - for (int i1 = 0; i1 < e1; ++i1) - for (int i0 = 0; i0 < e0; ++i0) - op(answer, localExpr.read(i0, i1, i2, i3, i4, i5, i6)); - - ret = answer; + PartialReduction reduction; +#pragma omp parallel + { + T answer = ReductionTraits::identity(); +#pragma omp for nowait + for (int i6 = 0; i6 < e6; ++i6) + for (int i5 = 0; i5 < e5; ++i5) + for (int i4 = 0; i4 < e4; ++i4) + for (int i3 = 0; i3 < e3; ++i3) + for (int i2 = 0; i2 < e2; ++i2) + for (int i1 = 0; i1 < e1; ++i1) + for (int i0 = 0; i0 < e0; ++i0) + op(answer, localExpr.read(i0, i1, i2, i3, i4, i5, i6)); + reduction.storePartialResult(answer); + } + reduction.reduce(ret, op); } }; From rguenth at tat.physik.uni-tuebingen.de Thu Dec 25 22:13:44 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Thu, 25 Dec 2003 23:13:44 +0100 (CET) Subject: [PATCH] Fix where breakage Message-ID: Hi! With my recent where improvements I did some breakage which is obviously fixed by the following patch. I also extended array_test12 to contain some >1 dim tests. Regtested on the few where tests we have using serial ppc-linux. Ok to apply? Thanks, Richard. 2003Dec25 Richard Guenther * Array/tests/array_test12.cpp: check systematically for d-dimensional array/scalar rhs in where. Evaluator/WhereProxy.h: use EvalLeaf of the dimensionality of the first where argument. 
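
As background for the tests in this patch: the three-argument where() behaves like an element-wise conditional operator. The following is a minimal sketch under assumptions, using std::vector instead of POOMA arrays; where3 is a hypothetical name and is not the POOMA implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 3-argument where(): for each index i the
// result takes a[i] if mask[i] is true and b[i] otherwise.
// All three inputs are assumed to have the same length.
std::vector<double> where3(const std::vector<bool>& mask,
                           const std::vector<double>& a,
                           const std::vector<double>& b)
{
  std::vector<double> result(mask.size());
  for (std::size_t i = 0; i < mask.size(); ++i)
    result[i] = mask[i] ? a[i] : b[i];
  return result;
}
```

The scalar forms exercised by the tests correspond to passing a constant for either branch.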
Index: Array/tests/array_test12.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Array/tests/array_test12.cpp,v retrieving revision 1.14 diff -u -u -r1.14 array_test12.cpp --- Array/tests/array_test12.cpp 21 Nov 2003 21:30:37 -0000 1.14 +++ Array/tests/array_test12.cpp 25 Dec 2003 22:07:47 -0000 @@ -39,6 +39,34 @@ #include +template +void check(Pooma::Tester& tester) +{ + tester.out() << Dim << "-dimensional tests:\n"; + Interval I; + for (int i=0; i(10); + Array a(I), b(I); + a = 1.0; + b = 0.0; + b = where(a == 1.0, a); + tester.check("2-arg where with array rhs", all(b == 1.0)); + b = 0.0; + b = where(a == 1.0, 5.0); + tester.check("2-arg where with scalar rhs", all(b == 5.0)); + b = 0.0; + b = where(a == 1.0, a, a); + tester.check("3-arg where with array/array rhs", all(b == 1.0)); + b = 0.0; + b = where(a == 1.0, a, 3.0); + tester.check("3-arg where with array/scalar rhs", all(b == 1.0)); + b = 0.0; + b = where(a == 1.0, 3.0, a); + tester.check("3-arg where with scalar/array rhs", all(b == 3.0)); + b = 0.0; + b = where(a == 1.0, 1.0, 3.0); + tester.check("3-arg where with scalar/scalar rhs", all(b == 1.0)); +} int main(int argc, char* argv[]) { @@ -114,6 +142,12 @@ tester.check("where reduction", prod(where(d == 0.0, d)) == 0.0); + // generic 2/3-arg where with array/scalar rhs + + check<1>(tester); + check<2>(tester); + check<3>(tester); + int ret = tester.results("array_test12"); Pooma::finalize(); return ret; Index: Evaluator/WhereProxy.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Evaluator/WhereProxy.h,v retrieving revision 1.6 diff -u -u -r1.6 WhereProxy.h --- Evaluator/WhereProxy.h 21 Nov 2003 21:30:38 -0000 1.6 +++ Evaluator/WhereProxy.h 25 Dec 2003 22:07:47 -0000 @@ -86,7 +86,7 @@ typedef typename ConvertWhereProxy::Make_t MakeFromTree_t; typedef typename MakeFromTree_t::Expression_t WhereMask_t; typedef typename 
ForEach::Leaf_t,
- EvalLeaf<1>, OpCombine>::Type_t Element_t;
+ EvalLeaf, OpCombine>::Type_t Element_t;

   inline WhereMask_t whereMask() const

From rguenth at tat.physik.uni-tuebingen.de Fri Dec 26 19:01:26 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 26 Dec 2003 20:01:26 +0100 (CET)
Subject: Further improving guard update
Message-ID:

Hi!

After ensuring we only fill those internal guards we're actually going to
use, the next bottleneck is our lame data flow analysis in the scheduler.
It doesn't detect the case where a write doesn't conflict with a
read/write because it touches a different domain, which happens, for
instance, for the guard layer update. A quick hack using the generation
count to track dependent iterates shows there is much room for
improvement here.

But I'm not sure which way we should go. I can think of these options:
- pass down the evaluation domain to the data object at request time (this
  may be hard, as we're handling views here and need to go back to the
  brick domain)
- do the full guard cell update within a special iterate bypassing all the
  request machinery for the individual updates (sounds like a lot of code
  duplication here, but maybe the biggest gains for the least headaches in
  generic code)
- ??? -- I'm sure I missed the best one ;)

Any ideas? I suspect I'll try to follow the second option, but at least
for stencils in expression form ( b(i) = a(i-1) + a(i+1) ) this still
won't offer the best solution.

Thanks,

Richard.
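
To make the first option concrete, here is a minimal sketch of a domain-aware conflict test. Everything in it is hypothetical (Range, Request, overlaps and conflicts are illustration names, not POOMA or SMARTS types), and it assumes each request carries the index domain it touches in the coordinates of the underlying brick.

```cpp
// Hypothetical closed index range [first, last] in one dimension.
struct Range {
  int first;
  int last;
};

// Hypothetical read or write request on one data object.
template <int Dim>
struct Request {
  Range domain[Dim];  // region of the data object that is touched
  bool write;         // true for a write request, false for a read
};

// Two boxes overlap only if their ranges overlap in every dimension.
template <int Dim>
bool overlaps(const Request<Dim>& a, const Request<Dim>& b)
{
  for (int d = 0; d < Dim; ++d)
    if (a.domain[d].last < b.domain[d].first ||
        b.domain[d].last < a.domain[d].first)
      return false;
  return true;
}

// Requests on the same data object conflict only if at least one of them
// writes and the touched domains overlap.
template <int Dim>
bool conflicts(const Request<Dim>& a, const Request<Dim>& b)
{
  return (a.write || b.write) && overlaps(a, b);
}
```

With a test like this, a write into a guard layer and a read of the interior of the same patch would not be ordered against each other. The hard part mentioned above, mapping a view's domain back to the brick domain, is not addressed here.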

From rguenth at tat.physik.uni-tuebingen.de Sat Dec 27 21:27:39 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sat, 27 Dec 2003 22:27:39 +0100 (CET)
Subject: [pooma-dev] Further improving guard update
In-Reply-To:
References:
Message-ID:

On Fri, 26 Dec 2003, Richard Guenther wrote:

> - pass down the evaluation domain to the data object at request time (this
>   may be hard, as we're handling views here and need to go back to the
>   brick domain)

Ok, I convinced myself that the above should be the way to go. But is it
possible at all? At least one should be able to template it on the domain
type, so we can use AllDomain here for all requests we cannot (now) update
to the new mechanism.

How does SMARTS handle all its data analysis?

Thanks for any suggestions,

Richard.

From jcrotinger at proximation.com Sun Dec 28 19:11:27 2003
From: jcrotinger at proximation.com (James Crotinger)
Date: Sun, 28 Dec 2003 12:11:27 -0700
Subject: [pooma-dev] Further improving guard update
Message-ID:

Hi Richard,

Wish you had been doing this a couple of years ago, when exponential decay
hadn't set in so firmly. :)

I had been looking at similar ideas back in '99. I had considered adding a
Smarts DataObject for each face in order to allow independent face-to-face
dependency tracking (one would need to be careful with the corners here).
There were complications to the idea, though I'm afraid I can't recall what
they were. I still think this is probably the way to go - SMARTs uses these
objects to build a dependency graph and then evaluates that graph in some
"smart" order, hoping to reuse cache, etc. The prioritization algorithm was
something we had planned to play with some more. (There were also some ideas
about ways to produce fewer small iterates as these really kill you, and
guard filling makes a lot of these.)

If I have time in the next week or so (I'm taking a bit of a break over the
holidays), I'll see if I have my old email archive on one of my computers.
There may be some ideas in old email. I don't think these ever reached the
level of a white paper.

There are some published papers on SMARTs. The only one I have on my shelf
is the Proceedings from ICS '99, p. 302. I'm sure there were some
SuperComputing 9x papers as well.

Cheers,

Jim

-----Original Message-----
From: Richard Guenther [mailto:rguenth at tat.physik.uni-tuebingen.de]
Sent: Saturday, December 27, 2003 2:28 PM
To: Richard Guenther
Cc: pooma-dev at pooma.codesourcery.com
Subject: Re: [pooma-dev] Further improving guard update

On Fri, 26 Dec 2003, Richard Guenther wrote:

> - pass down the evaluation domain to the data object at request time (this
>   may be hard, as we're handling views here and need to go back to the
>   brick domain)

Ok, I convinced myself that the above should be the way to go. But is it
possible at all? At least one should be able to template it on the domain
type, so we can use AllDomain here for all requests we cannot (now) update
to the new mechanism.

How does SMARTS handle all its data analysis?

Thanks for any suggestions,

Richard.

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 30 15:05:23 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 30 Dec 2003 16:05:23 +0100 (CET)
Subject: [pooma-dev] Further improving guard update
In-Reply-To:
References:
Message-ID:

On Sun, 28 Dec 2003, James Crotinger wrote:

> Hi Richard,
>
> Wish you had been doing this a couple of years ago, when exponential decay
> hadn't set in so firmly. :)
>
> I had been looking at similar ideas back in '99. I had considered adding a
> Smarts DataObject for each face in order to allow independent face-to-face
> dependency tracking (one would need to be careful with the corners here).

I don't think having DataObjects for the faces would help (and it would
have a cost).
You'd either have to introduce DataObjects for the corners, too
(that's 72 for the 4-dimensional case, already! don't even think about
higher dimensionalities), or have overlapping DataObjects which would
surely confuse the scheduler. Oh, but we have overlapping DataObjects
anyway with the guards (but I think we're just looking at the owned domain
here from the DataObject point of view, no?).

What I'll now try to do is to introduce one extra flag in the dirty mask
to indicate whether we have the corners updated. For this to work
efficiently, we'd need to split the intersector expressionApply into two
phases, one collecting the data about the needed guards (including whether
we need any corners), and one doing the update. If we don't need the
corners, we can do completely independent updates; if we need them, we
have to be clever.

To teach the scheduler which iterates on the same DataObject are
independent, the way to go seems to be tracking of the affected domain for
each iterate/DataObject. I haven't gone through the details, but it should
be possible to do this. Of course, SMARTS would need to be updated for
this, but I'm not interested in SMARTS at all (just concentrating on
native OpenMP and MPI, and maybe hybrid operation).

But I would certainly be happy for any input on this matter.

Richard.

> There were complications to the idea, though I'm afraid I can't recall what
> they were. I still think this is probably the way to go - SMARTs uses these
> objects to build a dependency graph and then evaluates that graph in some
> "smart" order, hoping to reuse cache, etc. The prioritization algorithm was
> something we had planned to play with some more. (There were also some ideas
> about ways to produce fewer small iterates as these really kill you, and
> guard filling makes a lot of these.)
>
> If I have time in the next week or so (I'm taking a bit of a break over the
> holidays), I'll see if I have my old email archive on one of my computers.
> There may be some ideas in old email. I don't think these ever reached the
> level of a white paper.
>
> There are some published papers on SMARTs. The only one I have on my shelf
> is the Proceedings from ICS '99, p. 302. I'm sure there were some
> SuperComputing 9x papers as well.
>
> Cheers,
>
> Jim

From oldham at codesourcery.com Tue Dec 30 16:11:47 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Tue, 30 Dec 2003 08:11:47 -0800
Subject: [PATCH] Fix where breakage
In-Reply-To:
References:
Message-ID: <3FF1A3C3.9040003@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> With my recent where improvements I did some breakage which is obviously
> fixed by the following patch. I also extended array_test12 to contain
> some >1 dim tests.
>
> Regtested on the few where tests we have using serial ppc-linux.
>
> Ok to apply?
>
> Thanks,
>
> Richard.
>
>
> 2003Dec25 Richard Guenther
>
> * Array/tests/array_test12.cpp: check systematically for
> d-dimensional array/scalar rhs in where.
> Evaluator/WhereProxy.h: use EvalLeaf of the dimensionality
> of the first where argument.

Thanks for fixing this. It would be nice if C++ permitted overloading
'?:', but it does not, so 'where' is important.

Yes, please commit this after fixing the ChangeLog typo: "Use EvalLeaf
_on_ the dimensionality ..."

-- 
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 30 17:48:41 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 30 Dec 2003 18:48:41 +0100 (CET)
Subject: [PATCH] Alloc only required # of updates
Message-ID:

Hi!

This patch changes the calcGCFillList routines to reserve only the
required amount of entries in the list. Tested with Layout tests and an
assert checking the resulting size is not larger than the reserved one.

Ok?

Richard.


2003Dec30  Richard Guenther

    * src/Layout/GridLayout.cpp: allocate 2*Dim*local_m.size()
    fill list nodes only.
    src/Layout/UniformGridLayout.cpp: likewise.
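
The reservation and the sanity check described above can be pictured with a toy model. This is a sketch under stated assumptions, not the calcGCFillList code: FillEntry and buildFillList1D are hypothetical names, the patches form a 1-D row, an entry is created only for guard layers of local patches, and the face numbering (0 = lower, 1 = upper) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fill-list entry; the names are not POOMA's.
struct FillEntry {
  int sourceID;  // patch that owns the data
  int destID;    // patch whose guard layer receives it
  int face;      // assumed numbering: 0 = lower face, 1 = upper face
};

// Build the guard fill list for the local patches of a 1-D row of
// numPatches patches. Each local patch has at most 2*Dim neighbours,
// so 2*Dim*localIDs.size() entries are enough under the assumptions above.
std::vector<FillEntry> buildFillList1D(const std::vector<int>& localIDs,
                                       int numPatches)
{
  const int Dim = 1;
  const std::size_t bound = 2 * Dim * localIDs.size();

  std::vector<FillEntry> list;
  list.reserve(bound);

  for (std::size_t k = 0; k < localIDs.size(); ++k) {
    const int id = localIDs[k];
    if (id > 0) {                  // lower neighbour exists
      FillEntry e = { id - 1, id, 0 };
      list.push_back(e);
    }
    if (id + 1 < numPatches) {     // upper neighbour exists
      FillEntry e = { id + 1, id, 1 };
      list.push_back(e);
    }
  }

  // Same kind of check as mentioned in the mail: never exceed the reservation.
  assert(list.size() <= bound);
  return list;
}
```

Reserving by the number of local patches instead of all patches matters mostly in multi-context runs, where each context owns only a fraction of the patches.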

Index: GridLayout.cpp
===================================================================
RCS file: /home/pooma/Repository/r2/src/Layout/GridLayout.cpp,v
retrieving revision 1.89
diff -u -u -r1.89 GridLayout.cpp
--- GridLayout.cpp 11 Mar 2003 21:30:44 -0000 1.89
+++ GridLayout.cpp 30 Dec 2003 17:42:22 -0000
@@ -366,7 +366,7 @@
   // the upward copies first, then the downward copies.
 
   int numPatches = this->all_m.size();
-  this->gcFillList_m.reserve(2*Dim*numPatches);
+  this->gcFillList_m.reserve(2*Dim*this->local_m.size());
 
   // Make sure we have the same number of patches as blocks in the grid
   // (this is just a sanity check).
Index: UniformGridLayout.cpp
===================================================================
RCS file: /home/pooma/Repository/r2/src/Layout/UniformGridLayout.cpp,v
retrieving revision 1.40
diff -u -u -r1.40 UniformGridLayout.cpp
--- UniformGridLayout.cpp 11 Mar 2003 21:30:44 -0000 1.40
+++ UniformGridLayout.cpp 30 Dec 2003 17:42:25 -0000
@@ -299,7 +299,7 @@
 
   int numPatches = this->all_m.size();
 
-  this->gcFillList_m.reserve(2*Dim*numPatches); // a bit extra
+  this->gcFillList_m.reserve(2*Dim*this->local_m.size());
 
   for (d = 0; d < Dim; ++d)
   {

From oldham at codesourcery.com Tue Dec 30 18:08:50 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Tue, 30 Dec 2003 10:08:50 -0800
Subject: [PATCH] Alloc only required # of updates
In-Reply-To:
References:
Message-ID: <3FF1BF32.9070602@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> This patch changes the calcGCFillList routines to reserve only the
> required amount of entries in the list. Tested with Layout tests and an
> assert checking the resulting size is not larger than the reserved one.
>
> Ok?

Yes.

> Richard.
>
>
> 2003Dec30 Richard Guenther
>
> * src/Layout/GridLayout.cpp: allocate 2*Dim*local_m.size()
> fill list nodes only.
> src/Layout/UniformGridLayout.cpp: likewise.
>
> Index: GridLayout.cpp
> ===================================================================
> RCS file: /home/pooma/Repository/r2/src/Layout/GridLayout.cpp,v
> retrieving revision 1.89
> diff -u -u -r1.89 GridLayout.cpp
> --- GridLayout.cpp 11 Mar 2003 21:30:44 -0000 1.89
> +++ GridLayout.cpp 30 Dec 2003 17:42:22 -0000
> @@ -366,7 +366,7 @@
>    // the upward copies first, then the downward copies.
>
>    int numPatches = this->all_m.size();
> -  this->gcFillList_m.reserve(2*Dim*numPatches);
> +  this->gcFillList_m.reserve(2*Dim*this->local_m.size());
>
>    // Make sure we have the same number of patches as blocks in the grid
>    // (this is just a sanity check).
> Index: UniformGridLayout.cpp
> ===================================================================
> RCS file: /home/pooma/Repository/r2/src/Layout/UniformGridLayout.cpp,v
> retrieving revision 1.40
> diff -u -u -r1.40 UniformGridLayout.cpp
> --- UniformGridLayout.cpp 11 Mar 2003 21:30:44 -0000 1.40
> +++ UniformGridLayout.cpp 30 Dec 2003 17:42:25 -0000
> @@ -299,7 +299,7 @@
>
>    int numPatches = this->all_m.size();
>
> -  this->gcFillList_m.reserve(2*Dim*numPatches); // a bit extra
> +  this->gcFillList_m.reserve(2*Dim*this->local_m.size());
>
>    for (d = 0; d < Dim; ++d)
>      {

--
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Tue Dec 30 19:52:15 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 30 Dec 2003 20:52:15 +0100 (CET)
Subject: [PATCH] Back to using Cheetah::CHEETAH for serialization
Message-ID:

Hi!

This patch is a partial reversion of a previous patch that made us use
Cheetah::DELEGATE serialization for RemoteProxy. It also brings us a
Cheetah::CHEETAH serialization for std::string, which was previously
missing. One step more for the MPI merge.

Tested together with all other MPI changes with serial, Cheetah and MPI.

Ok?

Richard.
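
The size/pack/unpack/cleanup shape of a std::string serializer, as described in this message, can be sketched without Cheetah. StringSerializer below is a hypothetical stand-in and its length-prefixed buffer layout is an assumption made for illustration. The actual patch delegates to Cheetah's own Serialize machinery and its wire format may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a string serializer (not the Cheetah API).
// Assumed buffer layout: an int length followed by the raw characters.
struct StringSerializer {
    // Number of bytes pack() will write for this string.
    static int size(const std::string& str) {
        return static_cast<int>(sizeof(int) + str.length());
    }

    // Writes length and characters into buffer; returns bytes written.
    static int pack(const std::string& str, char* buffer) {
        const int n = static_cast<int>(str.length());
        std::memcpy(buffer, &n, sizeof(int));
        std::memcpy(buffer + sizeof(int), str.data(),
                    static_cast<std::size_t>(n));
        return static_cast<int>(sizeof(int)) + n;
    }

    // Allocates a new string from the buffer; returns bytes consumed.
    // The caller releases the string with cleanup().
    static int unpack(std::string*& str, const char* buffer) {
        int n = 0;
        std::memcpy(&n, buffer, sizeof(int));
        assert(n >= 0);
        str = new std::string(buffer + sizeof(int),
                              static_cast<std::size_t>(n));
        return static_cast<int>(sizeof(int)) + n;
    }

    static void cleanup(std::string* str) { delete str; }
};
```

Constructing the string from a pointer and an explicit length, as the patch does with std::string(ptr, size), keeps embedded NUL characters intact.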
2003Dec30 Richard Guenther * src/Tulip/RemoteProxy.h: use Cheetah::CHEETAH for serialization, add std::string serializer. Index: RemoteProxy.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Tulip/RemoteProxy.h,v retrieving revision 1.19 diff -u -u -r1.19 RemoteProxy.h --- RemoteProxy.h 21 Oct 2003 18:47:59 -0000 1.19 +++ RemoteProxy.h 30 Dec 2003 19:45:26 -0000 @@ -70,63 +70,35 @@ #if POOMA_CHEETAH namespace Cheetah { - template - class DelegateType > { - public: - enum { delegate = false }; - }; - - template - class DelegateType > { - public: - enum { delegate = false }; - }; - - template - class DelegateType > { - public: - enum { delegate = false }; - }; - - template - class DelegateType > { - public: - enum { delegate = false }; - }; - /** - * DELEGATE specializations for STL vectors. + * CHEETAH specializations for STL strings */ - template - class Serialize< ::Cheetah::DELEGATE, std::vector > - { + template<> + class Serialize< ::Cheetah::CHEETAH, std::string> + { public: - - static inline int size(const std::vector& v) + static inline int size(const std::string& str) { - return Serialize::size(0, v.size()); + return Serialize::size(0, str.length()); } - - static int pack(const std::vector &v, char* buffer) + + static int pack(const std::string &str, char* buffer) { - CTAssert(!DelegateType::delegate); - return Serialize::pack(&v[0], buffer, v.size()); + return Serialize::pack(str.data(), buffer, str.length()); } - static int unpack(std::vector* &v, char* buffer) + static int unpack(std::string* &str, char* buffer) { - T* ptr; + char* ptr; int size; - int n = Serialize::unpack(ptr, buffer, size); - v = new std::vector(size); - for (int i=0; i::unpack(ptr, buffer, size); + str = new std::string(ptr, size); return n; } - static void cleanup(std::vector* v) { delete v; } + static void cleanup(std::string* str) { delete str; } }; } // namespace Cheetah @@ -190,7 +162,7 @@ { if (toContext != 
Pooma::context()) { - Pooma::indexHandler()->sendWith(Cheetah::DELEGATE(), toContext, tag, val); + Pooma::indexHandler()->sendWith(Cheetah::CHEETAH(), toContext, tag, val); } } #endif @@ -203,7 +175,7 @@ RemoteProxyBase::ready_m = false; - Pooma::indexHandler()->requestWith(Cheetah::DELEGATE(), owningContext, tag, + Pooma::indexHandler()->requestWith(Cheetah::CHEETAH(), owningContext, tag, This_t::receive, this); while (!RemoteProxyBase::ready_m) From rguenth at tat.physik.uni-tuebingen.de Tue Dec 30 20:17:54 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 30 Dec 2003 21:17:54 +0100 (CET) Subject: [PATCH] Add MPI variants for RemoteProxy, CollectFromContexts and ReduceOverContexts Message-ID: Hi! This patch adds native MPI variants of the above messaging abstractions. These patches were tested together with the remaining changes with serial, Cheetah and MPI. As POOMA_MPI is never defined (for now), this shouldn't introduce regressions there, too. But of course for it alone, this patch is useless. More to follow. Ok? Richard. 2003Dec30 Richard Guenther * src/Tulip/Messaging.cmpl.cpp: initialize static members for POOMA_CHEETAH only. src/Tulip/CollectFromContexts.h: add MPI variant. src/Tulip/ReduceOverContexts.h: likewise. src/Tulip/RemoteProxy.h: likewise. --- Messaging.cmpl.cpp 2003-12-09 20:30:07.000000000 +0100 +++ /tmp/Messaging.cmpl.cpp 2003-12-30 21:11:27.000000000 +0100 @@ -38,12 +38,15 @@ #include "Tulip/ReduceOverContexts.h" #include "Tulip/RemoteProxy.h" #include "Tulip/PatchSizeSyncer.h" +#include "Tulip/SendReceive.h" +#if POOMA_CHEETAH int ReduceOverContextsBase::tagBase_m = 0; int CollectFromContextsBase::tagBase_m = 0; - bool RemoteProxyBase::ready_m; int RemoteProxyBase::tag_m = 0; +#endif + //----------------------------------------------------------------------------- // Tag generator creates a set of tags for global use in r2. 
There is a --- CollectFromContexts.h 2003-12-09 20:27:38.000000000 +0100 +++ /tmp/CollectFromContexts.h 2003-12-30 21:11:27.000000000 +0100 @@ -44,12 +44,46 @@ // Includes: //----------------------------------------------------------------------------- -#include "Pooma/Pooma.h" #include "Tulip/Messaging.h" +#include "Utilities/PAssert.h" #include +#if !POOMA_MESSAGING + +template +class CollectFromContexts +{ +public: + + CollectFromContexts(const T &val, int context = 0, bool valid = true) + { + PAssert(valid); + PAssert(context == 0); + value_m = val; + } + + T &operator[](int i) + { + PAssert(i == 0); + return value_m; + } + + T operator[](int i) const + { + PAssert(i == 0); + return value_m; + } + +private: + T value_m; +}; + + +#else // POOMA_MESSAGING + + /** * This class associates a value with a flag that indicates whether or not * it is valid. It takes special care to not read the value if it is invalid. @@ -108,11 +142,8 @@ }; -#if POOMA_CHEETAH - namespace Cheetah { - /** * This class is used to serialize CollectionValue objects, taking care * not to send invalid values. @@ -177,9 +208,8 @@ } // namespace Cheetah -#endif - +#if POOMA_CHEETAH /** * This struct holds a few static quantities that are shared by all * instantiations of CollectFromContexts. In particular, we want to @@ -192,14 +222,14 @@ static int tagBase_m; }; - +#endif /** * This class is used to collect all valid values from all contexts. */ template -class CollectFromContexts : public CollectFromContextsBase +class CollectFromContexts { typedef CollectFromContexts This_t; @@ -215,13 +245,50 @@ // to read 'val' unless it is a valid value. Values don't have to // valid because not all contexts necessarily contribute to the collection. 
-#if POOMA_CHEETAH +#if POOMA_MPI + + CollectFromContexts(const T &val, int toContext = 0, bool valid = true) + : toContext_m(toContext), data_m(Pooma::contexts()) + { + typedef Cheetah::Serialize > Serialize_t; + CollectionValue v(valid, val); + // We need to get at the maximum size we need to transfer per context. + // With the valid/invalid mechanism we can't use size(v) for this, and + // for dynamic types like Grid<> we can't use CV(true, T()) either. + // So for these cases we need to communicate the maximum size needed + // (but we might be able to optimize this with appropriate type tags). + int thislength = Serialize_t::size(v); + thislength = (thislength+7)&~7; // round to qword size + int length; + MPI_Allreduce(&thislength, &length, 1, MPI_INT, MPI_MAX, MPI_COMM_WORLD); + char *buffer = new char[length]; + char *recvbuffer = NULL; + if (Pooma::context() == toContext) + recvbuffer = new char[length*Pooma::contexts()]; + Serialize_t::pack(v, buffer); + MPI_Gather(buffer, length, MPI_CHAR, + recvbuffer, length, MPI_CHAR, + toContext, MPI_COMM_WORLD); + delete[] buffer; + if (Pooma::context() == toContext) { + for (int i=0; i *v2; + Serialize_t::unpack(v2, recvbuffer+i*length); + if (v2->valid()) + data_m[i] = v2->value(); + Serialize_t::cleanup(v2); + } + delete[] recvbuffer; + } + } + +#elif POOMA_CHEETAH CollectFromContexts(const T &val, int toContext = 0, bool valid = true) : toContext_m(toContext), data_m(Pooma::contexts()) { - int tagBase = tagBase_m; - tagBase_m += Pooma::contexts(); + int tagBase = CollectFromContextsBase::tagBase_m; + CollectFromContextsBase::tagBase_m += Pooma::contexts(); if (Pooma::context() == toContext) { @@ -255,20 +322,6 @@ send(toContext, tagBase + Pooma::context(), v); } } - - T &operator[](int i) - { - PAssert(Pooma::context() == toContext_m); - PAssert(i >= 0 and i < Pooma::contexts()); - return data_m[i]; - } - - T operator[](int i) const - { - PAssert(Pooma::context() == toContext_m); - PAssert(i >= 0 and i < 
Pooma::contexts()); - return data_m[i]; - } private: @@ -283,47 +336,43 @@ me->toReceive_m--; } - - // The actual value we're reducing. - - std::vector data_m; // The number of messages we're receiving. int toReceive_m; - // The context we're reducing on. - - int toContext_m; - -#else +#endif + +public: - CollectFromContexts(const T &val, int = 0, bool valid = true) - { - PAssert(valid); - value_m = val; - } - T &operator[](int i) { - PAssert(i == 0); - return value_m; + PAssert(Pooma::context() == toContext_m); + PAssert(i >= 0 and i < Pooma::contexts()); + return data_m[i]; } T operator[](int i) const { - PAssert(i == 0); - return value_m; + PAssert(Pooma::context() == toContext_m); + PAssert(i >= 0 and i < Pooma::contexts()); + return data_m[i]; } private: - T value_m; + // The actual value we're reducing. + + std::vector data_m; + + // The context we're reducing on. -#endif // POOMA_CHEETAH + int toContext_m; }; +#endif // POOMA_MESSAGING + #endif // POOMA_MESSAGING_COLLECTFROMCONTEXTS_H // ACL:rcsinfo --- ReduceOverContexts.h 2003-12-09 22:49:22.000000000 +0100 +++ /tmp/ReduceOverContexts.h 2003-12-30 21:11:27.000000000 +0100 @@ -45,8 +45,8 @@ // Includes: //----------------------------------------------------------------------------- -#include "Pooma/Pooma.h" #include "Tulip/Messaging.h" +#include "Evaluator/OpMask.h" #include "Tulip/RemoteProxy.h" #include "Evaluator/OpMask.h" @@ -89,6 +89,7 @@ bool valid() const { return valid_m; } const T &value() const { PAssert(valid()); return val_m; } + T &value() { PAssert(valid()); return val_m; } private: @@ -97,7 +98,7 @@ }; -#if POOMA_CHEETAH +#if POOMA_MESSAGING namespace Cheetah { @@ -146,6 +147,9 @@ vp = new ReductionValue(*pvalid, *pval); + if (*pvalid) + Serialize::cleanup(pval); + return nBytes; } @@ -159,6 +163,8 @@ #endif + +#if POOMA_CHEETAH /** * This struct holds a few static quantities that are shared by all * instantiations of ReduceOverContexts. 
In particular, we want to @@ -171,7 +177,7 @@ static int tagBase_m; }; - +#endif /** * This class is used to implement the final reduction over contexts used @@ -179,7 +185,7 @@ */ template -class ReduceOverContexts : public ReduceOverContextsBase +class ReduceOverContexts { typedef ReduceOverContexts This_t; @@ -195,13 +201,15 @@ // to read 'val' unless it is a valid value. Values don't have to // valid because not all contexts necessarily contribute to the reduction. +#if POOMA_MESSAGING + #if POOMA_CHEETAH ReduceOverContexts(const T &val, int toContext = 0, bool valid = true) : valid_m(false), toContext_m(toContext) { - int tagBase = tagBase_m; - tagBase_m += Pooma::contexts(); + int tagBase = ReduceOverContextsBase::tagBase_m; + ReduceOverContextsBase::tagBase_m += Pooma::contexts(); if (Pooma::context() == toContext) { @@ -235,6 +243,50 @@ } } +#elif POOMA_MPI + + ReduceOverContexts(const T &val, int toContext = 0, bool valid = true) + : toContext_m(toContext) + { + typedef Cheetah::Serialize > Serialize_t; + ReductionValue v(valid, val); + // invalid size is different (doh!), so use some default for size + // strictly speaking this is incorrect, too (see CollectOverContexts), + // but we might not have reduction ops over dynamic sized objects... 
+ int length = Serialize_t::size(ReductionValue(true, T())); + length = (length+7)&~7; // round to qword size + char *buffer = new char[length]; + char *recvbuffer = NULL; + if (Pooma::context() == toContext) + recvbuffer = new char[length*Pooma::contexts()]; + Serialize_t::pack(v, buffer); + MPI_Gather(buffer, length, MPI_CHAR, + recvbuffer, length, MPI_CHAR, + toContext, MPI_COMM_WORLD); + delete[] buffer; + if (Pooma::context() == toContext) { + for (int i=0; i *v2; + Serialize_t::unpack(v2, recvbuffer+i*length); + if (v2->valid()) { + if (!v.valid()) + v = *v2; + else + Unwrap::Op_t()(v.value(), v2->value()); + } + Serialize_t::cleanup(v2); + } + delete[] recvbuffer; + if (v.valid()) + value_m = v.value(); + } + } + +#endif + + // FIXME: with a different API we could use MPI_AllGather here... void broadcast(T &val) { RemoteProxy broadcast(value_m, toContext_m); @@ -254,7 +306,7 @@ val = value_m; } -#endif // POOMA_CHEETAH +#endif // POOMA_MESSAGING inline operator T() const { return value_m; } --- RemoteProxy.h 2003-12-30 20:45:38.000000000 +0100 +++ /tmp/RemoteProxy.h 2003-12-30 21:11:27.000000000 +0100 @@ -34,7 +34,13 @@ /** @file * @ingroup Tulip * @brief - * Undocumented. + * This is like MPI_Bcast. + * + * It moves a value from one context to all others. + * Special about this is that assigns to a RemoteProxy object + * on the owning context is performed to the underlying data, + * while on the remote contexts it is just done to the proxy + * object. */ #ifndef POOMA_CHEETAH_REMOTE_PROXY_H @@ -54,20 +60,15 @@ // Includes: //----------------------------------------------------------------------------- -#include "Pooma/Pooma.h" +#include "Tulip/Messaging.h" #include "Domain/Loc.h" #include "Tiny/Vector.h" -#include "Tulip/Messaging.h" #include "Functions/ComponentAccess.h" -#if POOMA_CHEETAH -# include "Cheetah/Cheetah.h" -#endif - // For Cheetah support we need to mark more types not delegate. 
-#if POOMA_CHEETAH +#if POOMA_MESSAGING namespace Cheetah { /** @@ -105,6 +106,22 @@ #endif +#if POOMA_CHEETAH +struct RemoteProxyBase +{ + /// If we need a remote value, then this flag lets us know when it's + /// ready. This value is static because it is used to block the parse + /// thread until the data is received. + + static bool ready_m; + + /// We only need one tag for all the remote proxies. Perhaps this could + /// be packaged with the handler for remote proxies. + + static int tag_m; +}; +#endif + /** * This class is the return type of the remote brick engine operator(). * We need an object that lets us assign to data on this context, but that @@ -122,20 +139,6 @@ * value belongs to. */ -struct RemoteProxyBase -{ - /// If we need a remote value, then this flag lets us know when it's - /// ready. This value is static because it is used to block the parse - /// thread until the data is received. - - static bool ready_m; - - /// We only need one tag for all the remote proxies. Perhaps this could - /// be packaged with the handler for remote proxies. - - static int tag_m; -}; - template class RemoteProxy { @@ -147,16 +150,15 @@ /// value and broadcast the value to the other contexts. /// Otherwise we receive the value from the owning context. +#if POOMA_CHEETAH + RemoteProxy(T &val, int owningContext = 0) { -#if POOMA_CHEETAH int tag = RemoteProxyBase::tag_m++; -#endif if (Pooma::context() == owningContext) { value_m = &val; -#if POOMA_CHEETAH int toContext; for (toContext = 0; toContext < Pooma::contexts(); ++toContext) { @@ -165,11 +167,9 @@ Pooma::indexHandler()->sendWith(Cheetah::CHEETAH(), toContext, tag, val); } } -#endif } else { -#if POOMA_CHEETAH storedValue_m = val; value_m = &storedValue_m; @@ -182,10 +182,57 @@ { Pooma::poll(); } -#endif } } +private: + // Handler function for Cheetah. 
+ + static void receive(This_t *me, T &value) + { + me->storedValue_m = value; + RemoteProxyBase::ready_m = true; + } + +public: +#elif POOMA_MPI + + RemoteProxy(T &val, int owningContext = 0) + { + typedef Cheetah::Serialize Serialize_t; + int length = Serialize_t::size(val); + // Only the owningContext can possibly know the actual length for + // types like std::vector<>. Maybe we can conditionalize this extra + // communication on a tag field in the Cheetah::Serialize type. + MPI_Bcast(&length, 1, MPI_INT, owningContext, MPI_COMM_WORLD); + char *buffer = new char[length]; + if (Pooma::context() == owningContext) + Serialize_t::pack(val, buffer); + MPI_Bcast(buffer, length, MPI_CHAR, owningContext, MPI_COMM_WORLD); + if (Pooma::context() == owningContext) { + value_m = &val; + } else { + T *nval; + Serialize_t::unpack(nval, buffer); + storedValue_m = *nval; + value_m = &storedValue_m; + Serialize_t::cleanup(nval); + } + delete[] buffer; + } + +#else + + RemoteProxy(T &val, int owningContext = 0) + { + if (Pooma::context() == owningContext) + { + value_m = &val; + } + } + +#endif + RemoteProxy(const RemoteProxy &s) { if (s.value_m != &s.storedValue_m) @@ -246,14 +293,6 @@ } private: - - // Handler function for Cheetah. - - static void receive(This_t *me, T &value) - { - me->storedValue_m = value; - RemoteProxyBase::ready_m = true; - } // Pointer to the actual value represented by this proxy. From rguenth at tat.physik.uni-tuebingen.de Tue Dec 30 20:41:07 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 30 Dec 2003 21:41:07 +0100 (CET) Subject: [PATCH] MPI SendReceive Message-ID: Hi! This is now the MPI version of SendReceive.h, including changes to RemoteEngine.h which handles (de-)serialization of engines. The latter change allows optimizing away one of the three(!) 
copies we are doing currently for communicating an engine at receive time: - receive into message buffer - deserialize into temporary brick engine - copy temporary brick engine to target view the message buffer is now directly deserialized into the target view (for non-Cheetah operation, with Cheetah this is not possible). Patch which removes a fourth(!!) copy we're doing at guard update follows. Tested as usual. Ok? Richard. 2003Dec30 Richard Guenther * src/Engine/RemoteEngine.h: add deserializer into existing engine. src/Tulip/SendReceive.h: add MPI variant. ===== RemoteEngine.h 1.9 vs 1.16 ===== --- 1.9/r2/src/Engine/RemoteEngine.h Wed Dec 10 11:19:05 2003 +++ 1.16/r2/src/Engine/RemoteEngine.h Tue Dec 30 21:26:06 2003 @@ -1239,6 +1241,7 @@ t = *a; buffer_m += change; total_m += change; + Cheetah::Serialize::cleanup(a); } char *buffer_m; @@ -1248,6 +1251,9 @@ namespace Cheetah { +// All these serializers/deserializers share a common header, +// namely domain and compressed flag. + template class Serialize > { @@ -1261,6 +1267,8 @@ int nBytes=0; nBytes += Serialize::size(a.domain()); + bool compressed = false; + nBytes += Serialize::size(compressed); nBytes += a.domain().size() * Serialize::size(T()); return nBytes; @@ -1278,6 +1286,11 @@ buffer += change; nBytes += change; + bool compressed = false; + change = Serialize::pack(compressed, buffer); + buffer += change; + nBytes += change; + EngineElemSerialize op(buffer); change = EngineBlockSerialize::apply(op, a, dom); @@ -1287,20 +1300,54 @@ return nBytes; } + // We support a special unpack to avoid an extra copy. + static inline int - unpack(Engine_t* &a, char *buffer) + unpack(Engine_t &a, char *buffer) { - // We'll unpack into a Brick rather than a BrickView, since - // we just copy from it anyway. 
+ Interval *dom; - PAssert(false); - } + int change; + int nBytes=0; - static inline void - cleanup(Engine_t* a) - { - delete a; + change = Serialize::unpack(dom, buffer); + buffer += change; + nBytes += change; + + bool *compressed; + change = Serialize::unpack(compressed, buffer); + buffer += change; + nBytes += change; + + // domains dont match probably, but at least their sizes must + for (int i=0; i::unpack(value, buffer); + + // we can't use usual array assignment here, because this would + // irritate the scheduler and lead to bogous results + Array lhs; + lhs.engine() = a; + Array rhs(*dom); + rhs.engine().setConstant(*value); + KernelEvaluator::evaluate(lhs, OpAssign(), rhs); + } else { + EngineElemDeSerialize op(buffer); + + change = EngineBlockSerialize::apply(op, a, a.domain()); + } + nBytes += change; + + Serialize::cleanup(dom); + Serialize::cleanup(compressed); + + return nBytes; } + }; template @@ -1316,6 +1363,8 @@ int nBytes=0; nBytes += Serialize::size(a.domain()); + bool compressed = false; + nBytes += Serialize::size(compressed); nBytes += a.domain().size() * Serialize::size(T()); return nBytes; @@ -1333,6 +1382,11 @@ buffer += change; nBytes += change; + bool compressed = false; + change = Serialize::pack(compressed, buffer); + buffer += change; + nBytes += change; + EngineElemSerialize op(buffer); change = EngineBlockSerialize::apply(op, a, dom); @@ -1342,6 +1396,8 @@ return nBytes; } + // Old-style unpack with extra copy. 
+ static inline int unpack(Engine_t* &a, char *buffer) { @@ -1354,6 +1410,12 @@ buffer += change; nBytes += change; + bool *compressed; + change = Serialize::unpack(compressed, buffer); + buffer += change; + nBytes += change; + PAssert(!*compressed); + a = new Engine(*dom); EngineElemDeSerialize op(buffer); @@ -1362,6 +1424,9 @@ nBytes += change; + Serialize::cleanup(dom); + Serialize::cleanup(compressed); + return nBytes; } @@ -1370,6 +1435,7 @@ { delete a; } + }; template @@ -1386,7 +1452,10 @@ nBytes += Serialize::size(a.domain()); - bool compressed = a.compressed(); + // we cannot use a.compressed() here, because we need to + // set up a big enough receive buffer and the compressed + // flag is not valid across contexts. + bool compressed = false; nBytes += Serialize::size(compressed); if (compressed) @@ -1433,6 +1502,8 @@ return nBytes; } + // Old-style unpack with extra copy. + static inline int unpack(Engine_t* &a, char *buffer) { @@ -1446,7 +1517,6 @@ nBytes += change; bool *compressed; - change = Serialize::unpack(compressed, buffer); buffer += change; nBytes += change; @@ -1469,6 +1539,9 @@ } nBytes += change; + Serialize::cleanup(dom); + Serialize::cleanup(compressed); + return nBytes; } @@ -1477,6 +1550,7 @@ { delete a; } + }; template @@ -1493,7 +1567,10 @@ nBytes += Serialize::size(a.domain()); - bool compressed = a.compressed(); + // we cannot use a.compressed() here, because we need to + // set up a big enough receive buffer and the compressed + // flag is not valid across contexts. + bool compressed = false; nBytes += Serialize::size(compressed); if (compressed) @@ -1541,8 +1618,10 @@ return nBytes; } + // We support a special unpack to avoid an extra copy. 
+ static inline int - unpack(Engine_t* &a, char *buffer) + unpack(Engine_t &a, char *buffer) { Interval *dom; @@ -1554,40 +1633,36 @@ nBytes += change; bool *compressed; - change = Serialize::unpack(compressed, buffer); buffer += change; nBytes += change; + // domains dont match probably, but at least their sizes must + for (int i=0; i::unpack(value, buffer); - Engine foo(*dom, *value); - - a = new Engine_t(foo, *dom); + // we can't use usual array assignment here, because this would + // irritate the scheduler and lead to bogous results + a.compressedReadWrite() = *value; } else { - Engine foo(*dom); - EngineElemDeSerialize op(buffer); - change = EngineBlockSerialize::apply(op, foo, *dom); - - a = new Engine_t(foo, *dom); + change = EngineBlockSerialize::apply(op, a, *dom); } nBytes += change; - return nBytes; - } + Serialize::cleanup(dom); + Serialize::cleanup(compressed); - static inline void - cleanup(Engine_t* a) - { - delete a; + return nBytes; } }; --- SendReceive.h 2003-10-21 20:47:59.000000000 +0200 +++ /tmp/SendReceive.h 2003-12-30 21:34:17.000000000 +0100 @@ -57,9 +57,11 @@ // Includes: //----------------------------------------------------------------------------- +#include "Tulip/Messaging.h" #include "Pooma/Pooma.h" #include "Evaluator/InlineEvaluator.h" -#include "Tulip/Messaging.h" +#include "Evaluator/RequestLocks.h" +#include "Engine/DataObject.h" #include "Utilities/PAssert.h" //----------------------------------------------------------------------------- @@ -268,14 +270,228 @@ { PAssert(fromContext >= 0); int tag = Pooma::receiveTag(fromContext); - Pooma::scheduler().handOff(new ReceiveIterate(view, - fromContext, tag)); + Pooma::scheduler().handOff(new ReceiveIterate + (view, fromContext, tag)); } }; -#else // not POOMA_CHEETAH +#elif POOMA_MPI + + +/** + * A SendIterate requests a read lock on a piece of data. When that read lock + * is granted, we call a cheetah matching handler to send the data to the + * appropriate context. 
We construct the SendIterate with a tag that is used + * to match the appropriate ReceiveIterate on the remote context. + */ + +template +class SendIterate + : public Pooma::Iterate_t +{ +public: + SendIterate(const View &view, int toContext, int tag) + : Pooma::Iterate_t(Pooma::scheduler()), + toContext_m(toContext), + tag_m(tag), + view_m(view) + { + PAssert(toContext >= 0); + + hintAffinity(engineFunctor(view_m, + DataObjectRequest())); + +#if POOMA_REORDER_ITERATES + // Priority interface was added to r2 version of serial async so that + // message send iterates would run before any other iterates. + priority(-1); +#endif + + DataObjectRequest writeReq(*this); + DataObjectRequest readReq(writeReq); + engineFunctor(view_m, readReq); + } + + virtual void run() + { + typedef Cheetah::Serialize Serialize_t; + + // serialize and send buffer + int length = Serialize_t::size(view_m); + buffer_m = new char[length]; + Serialize_t::pack(view_m, buffer_m); + MPI_Request *request = Smarts::SystemContext::getMPIRequest(this); + int res = MPI_Isend(buffer_m, length, MPI_CHAR, toContext_m, tag_m, + MPI_COMM_WORLD, request); + PAssert(res == MPI_SUCCESS); + + // release locks + DataObjectRequest writeReq; + DataObjectRequest readReq(writeReq); + engineFunctor(view_m, readReq); + } + + virtual ~SendIterate() + { + // cleanup temporary objects. + delete[] buffer_m; + } + +private: + + // Context we're sending the data to. + + int toContext_m; + + // A tag used to match the sent data with the right receive. + + int tag_m; + + // Communication buffer. + + char *buffer_m; + + // The data we're sending (typically a view of an array). + + View view_m; +}; + + +/** + * ReceiveIterate requests a write lock on a piece of data. When that lock + * is granted, we register the data with the cheetah matching handler which + * will fill the block when a message arrives. The write lock is released + * by the matching handler. 
+ */ + +template +class ReceiveIterate + : public Pooma::Iterate_t +{ +public: + + typedef ReceiveIterate This_t; + + ReceiveIterate(const View &view, int fromContext, int tag) + : Pooma::Iterate_t(Pooma::scheduler()), + fromContext_m(fromContext), + tag_m(tag), buffer_m(NULL), + view_m(view) + { + PAssert(fromContext >= 0); + + hintAffinity(engineFunctor(view, + DataObjectRequest())); + +#if POOMA_REORDER_ITERATES + // Priority interface was added to r2 version of serial async so that + // message receive iterates would run after any other iterates. + priority(-1); +#endif + + DataObjectRequest writeReq(*this); + engineFunctor(view, writeReq); + + Pooma::addIncomingMessage(); + + // pre-allocate incoming buffer and issue async receive + // we may hog on requests here - so maybe we need to conditionalize + // this a bit on request availability? + if (Smarts::SystemContext::haveLotsOfMPIRequests()) { + int length = Cheetah::Serialize::size(view_m); + buffer_m = new char[length]; + MPI_Request *request = Smarts::SystemContext::getMPIRequest(this); + int res = MPI_Irecv(buffer_m, length, MPI_CHAR, fromContext_m, tag_m, + MPI_COMM_WORLD, request); + PAssert(res == MPI_SUCCESS); + } + } + + virtual void run() + { + // nothing - work is done in destructor, if we had enough requests free + if (!buffer_m) { + int length = Cheetah::Serialize::size(view_m); + buffer_m = new char[length]; + MPI_Request *request = Smarts::SystemContext::getMPIRequest(this); + int res = MPI_Irecv(buffer_m, length, MPI_CHAR, fromContext_m, tag_m, + MPI_COMM_WORLD, request); + PAssert(res == MPI_SUCCESS); + } + } + + virtual ~ReceiveIterate() + { + typedef Cheetah::Serialize Serialize_t; + + // de-serialize into target view directly + Serialize_t::unpack(view_m, buffer_m); + + // cleanup temporary objects + delete[] buffer_m; + + // release locks + DataObjectRequest writeReq; + engineFunctor(view_m, writeReq); + + Pooma::gotIncomingMessage(); + } + +private: + + // Context we're sending the data 
to. + + int fromContext_m; + + // A tag used to match the sent data with the right send. + + int tag_m; + + // Communication buffer. + + char *buffer_m; + + // The place to put the data we're receiving (typically a view of the + // engine).; + + View view_m; +}; + +/** + * SendReceive contains two static functions, send(view, context) and + * receive(view, context). These functions encapsulate generating matching + * tags for the send and receive and launching the iterates to perform the + * send and receive. + */ + +struct SendReceive +{ + template + static + void send(const View &view, int toContext) + { + int tag = Pooma::sendTag(toContext); + Pooma::scheduler().handOff(new SendIterate(view, toContext, tag)); + } +}; + +template +struct Receive +{ + template + static + void receive(const View &view, int fromContext) + { + PAssert(fromContext >= 0); + int tag = Pooma::receiveTag(fromContext); + Pooma::scheduler().handOff(new ReceiveIterate + (view, fromContext, tag)); + } +}; + + +#else // not POOMA_MESSAGING /** @@ -305,7 +521,8 @@ } }; -#endif // not POOMA_CHEETAH + +#endif // not POOMA_MESSAGING ////////////////////////////////////////////////////////////////////// From rguenth at tat.physik.uni-tuebingen.de Tue Dec 30 20:47:27 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Tue, 30 Dec 2003 21:47:27 +0100 (CET) Subject: [PATCH] Optimize guard update copy Message-ID: Hi! This patch removes number four of the copies done for guard update. Basically, additionally to the three copies I mentioned in the previous mail, we're doing one extra during the RemoteView expressionApply of the data-parallel assignment we're doing for the guard domains. Ugh. Fixed by manually sending/receiving from/to the views. Doesn't work for Cheetah, so conditionalized on POOMA_MPI. Tested as usual, ok to apply? Richard. 2003Dec30 Richard Guenther * src/Engine/MultiPatchEngine.cpp: optimize remote to local and local to remote copy in guard update. 
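
The dispatch this patch relies on, a generic simpleAssign plus a more specialized overload for remote engines, can be sketched in isolation. Patch, Remote, Action and the explicit myContext parameter are hypothetical simplifications. The real code works on Array and Engine types and issues the send or receive itself instead of returning a value.

```cpp
#include <cassert>

// Hypothetical tags and types standing in for POOMA's engine tags.
struct BrickTag {};
template <class Tag> struct Remote {};

template <class Tag>
struct Patch {
    int owningContext;  // context that holds the patch data
};

// What the guard update would do on the calling context.
enum class Action { ExpressionAssign, Receive, Send, Nothing };

// Generic case: non-remote engines use the ordinary assignment path.
template <class Tag>
Action simpleAssign(const Patch<Tag>&, const Patch<Tag>&, int myContext)
{
    assert(myContext >= 0);
    return Action::ExpressionAssign;
}

// Remote engines: partial ordering selects this overload. If both patches
// live on the same context the ordinary path is kept; otherwise the
// destination owner receives and the source owner sends directly.
template <class Tag>
Action simpleAssign(const Patch<Remote<Tag> >& lhs,
                    const Patch<Remote<Tag> >& rhs, int myContext)
{
    assert(myContext >= 0);
    if (lhs.owningContext == rhs.owningContext)
        return Action::ExpressionAssign;
    if (lhs.owningContext == myContext)
        return Action::Receive;   // this context owns the destination
    if (rhs.owningContext == myContext)
        return Action::Send;      // this context owns the source
    return Action::Nothing;       // neither patch is local
}
```

Because the choice is made by overload resolution, the call site in the guard-fill loop stays the same for remote and non-remote engines.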
===== MultiPatchEngine.cpp 1.6 vs 1.7 ===== --- 1.6/r2/src/Engine/MultiPatchEngine.cpp Tue Dec 9 12:16:07 2003 +++ 1.7/r2/src/Engine/MultiPatchEngine.cpp Thu Dec 18 16:41:50 2003 @@ -34,6 +34,7 @@ #include "Engine/CompressedFraction.h" #include "Array/Array.h" #include "Tulip/ReduceOverContexts.h" +#include "Tulip/SendReceive.h" #include "Threads/PoomaCSem.h" #include "Domain/IteratorPairDomain.h" @@ -261,6 +262,40 @@ // //----------------------------------------------------------------------------- +/// Guard layer assign between non-remote engines, just use the +/// ET mechanisms + +template +static inline +void simpleAssign(const Array& lhs, + const Array& rhs, + const Interval& domain) +{ + lhs(domain) = rhs(domain); +} + +/// Guard layer assign between remote engines, use Send/Receive directly +/// to avoid one extra copy of the data. + +template +static inline +void simpleAssign(const Array >& lhs, + const Array >& rhs, + const Interval& domain) +{ + if (lhs.engine().owningContext() == rhs.engine().owningContext()) + lhs(domain) = rhs(domain); + else { + typedef typename NewEngine, Interval >::Type_t ViewEngine_t; + if (lhs.engine().engineIsLocal()) + Receive::receive(ViewEngine_t(lhs.engine().localEngine(), domain), + rhs.engine().owningContext()); + else if (rhs.engine().engineIsLocal()) + SendReceive::send(ViewEngine_t(rhs.engine().localEngine(), domain), + lhs.engine().owningContext()); + } +} + template void Engine >:: fillGuardsHandler(const WrappedInt &) const @@ -293,8 +328,12 @@ Array lhs(data()[dest]), rhs(data()[src]); // Now do assignment from the subdomains. - + // Optimized lhs(p->domain_m) = rhs(p->domain_m); +#if POOMA_MPI + simpleAssign(lhs, rhs, p->domain_m); +#else lhs(p->domain_m) = rhs(p->domain_m); +#endif ++p; } From rguenth at tat.physik.uni-tuebingen.de Wed Dec 31 16:04:45 2003 From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther) Date: Wed, 31 Dec 2003 17:04:45 +0100 (CET) Subject: [PATCH] where() strikes again... 
Message-ID:

Hi!

The nightly tester caught a regression with the WhereProxy again. This
time I broke the previously working where(true, x) while fixing where(a,
1.0) for dimensions of a greater than one...

So this time a more elaborate fix and a single point of failure for the
still unhandled case of where(const, const).

Tested on the existing where tests on serial ia32 linux, ok to apply?

Richard.


2003Dec31 Richard Guenther

    * src/Evaluator/WhereProxy.h: introduce traits class to find
    dimensionality and type of the where expression.

Index: Evaluator/WhereProxy.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Evaluator/WhereProxy.h,v
retrieving revision 1.7
diff -u -u -r1.7 WhereProxy.h
--- Evaluator/WhereProxy.h 30 Dec 2003 16:24:05 -0000 1.7
+++ Evaluator/WhereProxy.h 31 Dec 2003 15:59:29 -0000
@@ -76,6 +76,29 @@
 template
 struct WhereProxy
 {
+  template
+  struct WhereProxyTraits {
+    enum { dimensions = F1::dimensions };
+    typedef typename ForEach, OpCombine>::Type_t Element_t;
+  };
+  template
+  struct WhereProxyTraits, F1, B1> {
+    enum { dimensions = F1::dimensions };
+    typedef T Element_t;
+  };
+  template
+  struct WhereProxyTraits, Val, F1, B1> {
+    enum { dimensions = B1::dimensions };
+    typedef typename ForEach, OpCombine>::Type_t Element_t;
+  };
+  template
+  struct WhereProxyTraits, Scalar, F1, B1> {
+    // We open a can of worms, if we try to support this strange case.
+    // Just use the simpler
+    //   if (cond)
+    //     lhs = val;
+  };
+
   WhereProxy(const F& f, const B& b) : f_m(f), b_m(b) { }
 
   typedef BinaryNode
@@ -85,8 +108,8 @@
   typedef typename ExpressionTraits::Type_t ETrait_t;
   typedef typename ConvertWhereProxy::Make_t MakeFromTree_t;
   typedef typename MakeFromTree_t::Expression_t WhereMask_t;
-  typedef typename ForEach::Leaf_t,
-    EvalLeaf, OpCombine>::Type_t Element_t;
+  typedef typename WhereProxyTraits::Leaf_t,
+    typename CreateLeaf::Leaf_t, F, B>::Element_t Element_t;
 
   inline WhereMask_t
   whereMask() const

From oldham at codesourcery.com Wed Dec 31 17:21:36 2003
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Wed, 31 Dec 2003 09:21:36 -0800
Subject: [PATCH] where() strikes again...
In-Reply-To:
References:
Message-ID: <3FF305A0.4000901@codesourcery.com>

Richard Guenther wrote:
> Hi!
>
> The nightly tester caught a regression with the WhereProxy again. This
> time I broke the previously working where(true, x) while fixing where(a,
> 1.0) for dimensions of a greater than one...
>
> So this time a more elaborate fix and a single point of failure for the
> still unhandled case of where(const, const).
>
> Tested on the existing where tests on serial ia32 linux, ok to apply?

Thanks for the quick fix. Yes, please commit it.

> Richard.
>
>
> 2003Dec31 Richard Guenther
>
> * src/Evaluator/WhereProxy.h: introduce traits class to find
> dimensionality and type of the where expression.
>
> Index: Evaluator/WhereProxy.h
> ===================================================================
> RCS file: /home/pooma/Repository/r2/src/Evaluator/WhereProxy.h,v
> retrieving revision 1.7
> diff -u -u -r1.7 WhereProxy.h
> --- Evaluator/WhereProxy.h 30 Dec 2003 16:24:05 -0000 1.7
> +++ Evaluator/WhereProxy.h 31 Dec 2003 15:59:29 -0000
> @@ -76,6 +76,29 @@
>  template
>  struct WhereProxy
>  {
> +  template
> +  struct WhereProxyTraits {
> +    enum { dimensions = F1::dimensions };
> +    typedef typename ForEach, OpCombine>::Type_t Element_t;
> +  };
> +  template
> +  struct WhereProxyTraits, F1, B1> {
> +    enum { dimensions = F1::dimensions };
> +    typedef T Element_t;
> +  };
> +  template
> +  struct WhereProxyTraits, Val, F1, B1> {
> +    enum { dimensions = B1::dimensions };
> +    typedef typename ForEach, OpCombine>::Type_t Element_t;
> +  };
> +  template
> +  struct WhereProxyTraits, Scalar, F1, B1> {
> +    // We open a can of worms, if we try to support this strange case.
> +    // Just use the simpler
> +    //   if (cond)
> +    //     lhs = val;
> +  };
> +
>    WhereProxy(const F& f, const B& b) : f_m(f), b_m(b) { }
>
>    typedef BinaryNode
> @@ -85,8 +108,8 @@
>    typedef typename ExpressionTraits::Type_t ETrait_t;
>    typedef typename ConvertWhereProxy::Make_t MakeFromTree_t;
>    typedef typename MakeFromTree_t::Expression_t WhereMask_t;
> -  typedef typename ForEach::Leaf_t,
> -    EvalLeaf, OpCombine>::Type_t Element_t;
> +  typedef typename WhereProxyTraits::Leaf_t,
> +    typename CreateLeaf::Leaf_t, F, B>::Element_t Element_t;
>
>    inline WhereMask_t
>    whereMask() const

-- 
Jeffrey D. Oldham
oldham at codesourcery.com
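
For readers following the WhereProxyTraits change, the underlying technique is a traits class whose partial specializations pick the dimension and element type depending on which operand of where() is a scalar. The following is a reduced sketch of that idea, not the POOMA implementation: WhereTraits, ArrayLeaf and this Scalar are hypothetical stand-ins, and the two-parameter form assumes the leaf types carry their own dimensions and Element_t.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for illustration; these are not the POOMA classes.
template <class T>
struct Scalar {};

template <int Dim, class T>
struct ArrayLeaf {
  enum { dimensions = Dim };
  typedef T Element_t;
};

// Primary template: condition and value are both array-like.
template <class Cond, class Val>
struct WhereTraits {
  enum { dimensions = Cond::dimensions };
  typedef typename Val::Element_t Element_t;
};

// where(a, 1.0): scalar value, so the condition supplies the dimension.
template <class Cond, class T>
struct WhereTraits<Cond, Scalar<T> > {
  enum { dimensions = Cond::dimensions };
  typedef T Element_t;
};

// where(true, x): scalar condition, so the value supplies the dimension.
template <class T, class Val>
struct WhereTraits<Scalar<T>, Val> {
  enum { dimensions = Val::dimensions };
  typedef typename Val::Element_t Element_t;
};

// where(const, const): left empty on purpose. It resolves the ambiguity
// between the two specializations above, and any use of its members fails
// to compile in this one place.
template <class T1, class T2>
struct WhereTraits<Scalar<T1>, Scalar<T2> > {};

typedef WhereTraits<ArrayLeaf<2, bool>, Scalar<double> >    MaskScalar;
typedef WhereTraits<Scalar<bool>, ArrayLeaf<3, float> >     ScalarArray;
typedef WhereTraits<ArrayLeaf<1, bool>, ArrayLeaf<1, int> > ArrayArray;

static_assert(std::is_same<MaskScalar::Element_t, double>::value,
              "scalar value supplies the element type");
static_assert(std::is_same<ScalarArray::Element_t, float>::value,
              "array value supplies the element type");

// Runtime check of the selected dimensions, callable from any test driver.
inline void checkWhereTraits()
{
  assert(MaskScalar::dimensions == 2);
  assert(ScalarArray::dimensions == 3);
  assert(ArrayArray::dimensions == 1);
}
```

Without the empty fourth specialization, WhereTraits<Scalar<bool>, Scalar<double> > would match both scalar specializations equally well and be rejected as ambiguous; with it, the error appears only when a member such as Element_t is requested.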