From mark at codesourcery.com Mon Jul 2 06:07:09 2001
From: mark at codesourcery.com (Mark Mitchell)
Date: Sun, 01 Jul 2001 23:07:09 -0700
Subject: Submit timesheets!
Message-ID: <335300000.994054029@warlock.codesourcery.com>

There are a bunch of open timesheets for June. Please submit them ASAP.

Thank you,

--
Mark Mitchell mark at codesourcery.com
CodeSourcery, LLC http://www.codesourcery.com

From mark at codesourcery.com Mon Jul 2 06:18:13 2001
From: mark at codesourcery.com (Mark Mitchell)
Date: Sun, 01 Jul 2001 23:18:13 -0700
Subject: [pooma-dev] domain architecture diagram
In-Reply-To:
Message-ID: <338670000.994054693@warlock.codesourcery.com>

> I don't know if
> this issue of the empty base class is important anymore
> with current C++ compilers.

This was the only part of this discussion I was smart enough to
understand, so I decided to reply to it. :-)

In G++ 3.0, there is no empty base class penalty. The same will
be true for any IA64 C++ ABI compliant compiler. Modern versions
of the EDG front end (used in KCC, the Intel compiler, the SGI
compiler, the DEC/Compaq compiler, and elsewhere) are capable of
avoiding the penalty. However, I do not know if the vendors were
willing to turn on this feature, since it will break compatibility
with previous versions of their compilers. And, I do not know
whether VC++ is capable of avoiding the penalty.

Irrelevant compilers for computers named after fruit are mentioned
only because I know that the mere presence of this sentence will
get people's adrenalin pumping in New Mexico. :-) Seriously, I
know nothing of whether Metrowerks can do this or not.

It would be great if someone could try out:

  struct S { };
  struct T : public S { char c; };

  int main () { return sizeof (T); }

with KCC. If things are good, the program will return 1; if bad,
some greater value.
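For anyone repeating the test above, a compile-time variant may be easier to read than a process exit status. The following is only a minimal sketch, assuming a C++11 compiler; the helper name emptyBasePenalty is hypothetical and is not part of Pooma or of the test program in this message.

```cpp
#include <cstddef>

// Same shapes as in the test program above.
struct S { };
struct T : public S { char c; };

// Hypothetical helper (an assumption for illustration): the number of
// bytes T occupies beyond its single char member. Zero means the empty
// base class S costs nothing.
constexpr std::size_t emptyBasePenalty()
{
  return sizeof(T) - sizeof(char);
}
```

Returning emptyBasePenalty() from main, or checking it in a static_assert, gives 0 wherever the original program returns 1, and a positive byte count on a compiler that pads the empty base.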
--
Mark Mitchell mark at codesourcery.com
CodeSourcery, LLC http://www.codesourcery.com

From scotth at proximation.com Mon Jul 2 14:41:13 2001
From: scotth at proximation.com (Scott Haney)
Date: Mon, 2 Jul 2001 08:41:13 -0600
Subject: [pooma-dev] domain architecture diagram
In-Reply-To: <338670000.994054693@warlock.codesourcery.com>
Message-ID:

Documenting the architecture is a great first step. Thanks, Allan.

The next step is to write down the *essential* requirements for domains.
Then, use these requirements to analyze the design and remove every
architectural feature that doesn't exist to support an essential
requirement. Allan, please proceed with this analysis.

As an example, is avoiding the zero base-class penalty essential for
domains? If it is essential, we need to provide a workaround for
compilers that don't support it. This is not the case if it is simply
nice for domains or if it would be really cool to implement a zero base
class workaround. I would like to hear people's thoughts on this
question.

Scott

On Monday, July 2, 2001, at 12:18 AM, Mark Mitchell wrote:

> I don't know if
>> this issue of the empty base class is important anymore
>> with current C++ compilers.
>
> This was the only part of this discussion I was smart enough to
> understand, so I decided to reply to it. :-)
>
> In G++ 3.0, there is no empty base class penalty. The same will
> be true for any IA64 C++ ABI compliant compiler. Modern versions
> of the EDG front end (used in KCC, the Intel compiler, the SGI
> compiler, the DEC/Compaq compiler, and elsewhere) are capable of
> avoiding the penalty. However, I do not know if the vendors were
> willing to turn on this feature, since it will break compatibility
> with previous versions of their compilers. And, I do not know
> whether VC++ is capable of avoiding the penalty.
>
> Irrelevant compilers for computers named after fruit are mentioned
> only because I know that the mere presence of this sentence will
> get people's adrenalin pumping in New Mexico. :-) Seriously, I
> know nothing of whether Metrowerks can do this or not.
>
> It would be great if someone could try out:
>
> struct S { };
> struct T : public S { char c; };
>
> int main () { return sizeof (T); }
>
> with KCC. If things are good, the program will return 1; if bad,
> some greater value.
>
> --
> Mark Mitchell mark at codesourcery.com
> CodeSourcery, LLC http://www.codesourcery.com

--
Scott W. Haney
Development Manager
Proximation LLC
2960 Rodeo Park Drive West
Santa Fe, NM 87505
Voice: 505-424-3809 x101
FAX: 505-438-4161

From oldham at codesourcery.com Mon Jul 2 21:08:30 2001
From: oldham at codesourcery.com (Jeffrey Oldham)
Date: Mon, 2 Jul 2001 14:08:30 -0700
Subject: [pooma-dev] domain architecture diagram
In-Reply-To: <338670000.994054693@warlock.codesourcery.com>; from mark@codesourcery.com on Sun, Jul 01, 2001 at 11:18:13PM -0700
References: <338670000.994054693@warlock.codesourcery.com>
Message-ID: <20010702140830.A8444@codesourcery.com>

On Sun, Jul 01, 2001 at 11:18:13PM -0700, Mark Mitchell wrote:
>
> It would be great if someone could try out:
>
> struct S { };
> struct T : public S { char c; };
>
> int main () { return sizeof (T); }
>
> with KCC. If things are good, the program will return 1; if bad,
> some greater value.

The answer for KAI C++ 4.0a on Irix 6.5 is "good":

KCC -V foo.cc
KCC -V foo.cc
KAI C++ 4.0a (KCC) -- May 17 2000 -- (C) Copyright 1994-2000 Kuck & Associates, Inc.
> ./a.out
./a.out
> echo $?
echo $?
1

Thanks,
Jeffrey D. Oldham
oldham at codesourcery.com

From oldham at codesourcery.com Mon Jul 2 21:13:09 2001
From: oldham at codesourcery.com (Jeffrey Oldham)
Date: Mon, 2 Jul 2001 14:13:09 -0700
Subject: Patch: Improving StencilEngine Comments
Message-ID: <20010702141309.B8444@codesourcery.com>

2001-07-02 Jeffrey D. Oldham

  * Stencil.h: Fix typographical errors. Add stencil concept comments.
  (insetDomain): Modify initial comments.
  (Engine): Fix typographical error.
  (Engine::Engine): Modify initial comments.
  (Engine::read): Likewise.
  (Engine::domain): Likewise.
  (Engine::first): Likewise.
  (Engine::viewDomain): Likewise. Add other comments.
  (Engine::intersectDomain): Add initial comment.
  (View1): Modify initial comments.
  (View2): Likewise.
  (Stencil): Fix typographical error in initial comments.
  (DataObjectRequest): Likewise.

Tested on sequential Linux using gcc 3.0 by compiling Pooma library
Approved by Scott Haney

Thanks,
Jeffrey D. Oldham
oldham at codesourcery.com
-------------- next part --------------
Index: Stencil.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Engine/Stencil.h,v
retrieving revision 1.42
diff -c -p -r1.42 Stencil.h
*** Stencil.h 2000/09/20 16:22:07 1.42
--- Stencil.h 2001/06/29 23:03:06
***************
*** 32,38 ****
  // StencilEngine - An tag for an engine for representing a stencil
  // ArrayStencil - contains utility functions for building stencils
  // on arrays
! // View1 - Specialization for Stencil
  // Engine - Specialization for StencilEngine
  // NewEngine - Specialization for StencilEngine
  // EvaluatorEngineTraits - Specialization for StencilEngine
--- 32,38 ----
  // StencilEngine - An tag for an engine for representing a stencil
  // ArrayStencil - contains utility functions for building stencils
  // on arrays
! // View1 - Specialization for Stencil
  // Engine - Specialization for StencilEngine
  // NewEngine - Specialization for StencilEngine
  // EvaluatorEngineTraits - Specialization for StencilEngine
***************
*** 93,98 ****
--- 93,153 ----
  //-----------------------------------------------------------------------------

  //-----------------------------------------------------------------------------
+ //
+ // Stencil Concepts:
+ //
+ // A stencil is a pattern repeatedly applied to elements in an input domain to
+ // yield elements in the output domain. For example, the simplest
+ // stencil copies each element in the input domain to exactly the same
+ // element in the output domain. A second-order difference stencil can
+ // be represented by the formula
+ //
+ // out(i) = 2 in(i-1) + in(i) + in(i+1)
+ //
+ // where in(i) and out(i) indicate the ith input and output elements,
+ // respectively. This stencil illustrates that a stencil can use more
+ // than one input element, but that all input elements must be
+ // contiguous.
+ //
+ // A StencilEngine is an engine applying a stencil to an input array.
+ // When invoked, the result is an array filled with values from applying
+ // the stencil to the input array. We explain the engine's data members
+ // and assumptions. Even though a StencilEngine stores the data for its
+ // computation, actually performing the computation only when requested,
+ // we use the slang of its "output" to avoid writing "its output when the
+ // computation is invoked." Also, in the explanation below, we use
+ // one-dimensional terminology. The only supported domains and ranges
+ // are Cartesian products so the one-dimensional terminology is easily
+ // generalized.
+ //
+ // When created, engines frequently are given the desired array output
+ // range indices, e.g., -3, ..., 5. Any such range can be shifted so the
+ // leftmost element's index is zero, i.e., zero-based. For example, 0,
+ // ..., 8 with an offset of -3. To return to the "original," desired
+ // range, add the offset to each index. The `domain_m' variable records
+ // the number of output elements.
+ //
+ // Assume the engine's stencil uses input array elements with indices
+ // lowerExtent, lowerExtent+1, ..., 0, ..., upperExtent. Thus, to
+ // produce out(0) requires knowing in(lowerExtent), ..., in(upperExtent).
+ // The input domain to consisting of the values used to compute the
+ // zero-based output range is in(lowerExtent), ..., in(domain_m +
+ // upperExtent).
+ //
+ // The StencilEngine's data members are
+ // 1) function_m representing the stencil
+ // 2) expression_m which is approximately the input
+ // 3) domain_m representing the indices for the output
+ // 4) offset_m representing the 'shift' to yield zero-based output indices
+ // Note all members concern output, not input.
+ //
+ // When reading the source code below, "domain" is used for both input
+ // and output indices. The reader must decide the meaning of each
+ // occurrence.
+ //
+ //-----------------------------------------------------------------------------
+
+ //-----------------------------------------------------------------------------
  // Includes:
  //-----------------------------------------------------------------------------

*************** struct StencilEngine
*** 158,163 ****
--- 213,221 ----
  // b(range) = st(a,range);
  //
  // because that version doesn't inset.
+ //
+ // In other words, given a stencil and an input domain, return the
+ // resulting output indices.
  //---------------------------------------------------------------------------

  template
*************** Interval insetDomain(const Function &
*** 186,192 ****
  //
  // Typedefs for the tag, element types, domain and dimensions.
  // Operator() with integers to evaluate elements quickly.
! // Operator() with a doman to subset.
  // Accessor for the domain.
  //
  //-----------------------------------------------------------------------------
--- 244,250 ----
  //
  // Typedefs for the tag, element types, domain and dimensions.
  // Operator() with integers to evaluate elements quickly.
! // Operator() with a domain to subset.
  // Accessor for the domain.
  //
  //-----------------------------------------------------------------------------
*************** public:
*** 215,226 ****
    enum { zeroBased = true };

    //============================================================
!   // Construct from a Function object and an expression.
    //============================================================

    Engine(const Function &f, const Expression &e)
      : function_m(f), expression_m(e), domain_m(Pooma::NoInit())
    {
      Interval inset = insetDomain(f, e.domain());
      int d;
      for (d = 0; d < D; ++d)
--- 273,287 ----
    enum { zeroBased = true };

    //============================================================
!   // Construct from a Function object (effectively a stencil)
!   // and an expression (effectively the input array), and
!   // sometimes an output (not input) domain.
    //============================================================

    Engine(const Function &f, const Expression &e)
      : function_m(f), expression_m(e), domain_m(Pooma::NoInit())
    {
+     // inset is the indices for the stencil's output.
      Interval inset = insetDomain(f, e.domain());
      int d;
      for (d = 0; d < D; ++d)
*************** public:
*** 241,246 ****
--- 302,309 ----
      }
    }

+   // Construct an engine for composing stencils, e.g.,
+   // stencil1(stencil2(array)).
    template
    Engine(const Engine > &e,
           const INode &node)
*************** public:
*** 271,281 ****
    }

    //============================================================
!   // Element access via ints for speed.
    //============================================================

    inline Element_t read(int i) const
    {
      return function()(expression_m, i + offset_m[0]);
    }

--- 334,346 ----
    }

    //============================================================
!   // Element access via ints for speed. The arguments correspond to
!   // output elements, not input elements.
    //============================================================

    inline Element_t read(int i) const
    {
+     // Input index `i + offset_m[0]' corresponds to output index `i'.
      return function()(expression_m, i + offset_m[0]);
    }

*************** public:
*** 333,345 ****
    }

    //============================================================
!   // Return the domain.
    //============================================================

    inline Domain_t domain() const { return domain_m; }

    //============================================================
!   // Return the first index value for the specified direction
    // (always zero since this engine is zero-based).
    //============================================================

--- 398,410 ----
    }

    //============================================================
!   // Return the output domain.
    //============================================================

    inline Domain_t domain() const { return domain_m; }

    //============================================================
!   // Return the first output index value for the specified direction
    // (always zero since this engine is zero-based).
    //============================================================

*************** public:
*** 351,357 ****

    //---------------------------------------------------------------------------
    // viewDomain() gives the region of the expression needed to compute a given
!   // region of the stencil.
    //---------------------------------------------------------------------------

    inline
--- 416,423 ----

    //---------------------------------------------------------------------------
    // viewDomain() gives the region of the expression needed to compute a given
!   // region of the stencil. That is, viewDomain(outputDomain) yields
!   // the corresponding input domain.
    //---------------------------------------------------------------------------

    inline
*************** public:
*** 361,366 ****
--- 427,434 ----
      int d;
      for (d = 0; d < D; ++d)
      {
+       // The computation subtracts and adds the stencil's extent from
+       // the "original", unshifted output domain.
        ret[d] =
          Interval<1>(
            domain[d].first() + offset_m[d]
*************** public:
*** 377,382 ****
--- 445,454 ----
      return INode(inode, viewDomain(inode.domain()));
    }

+   //---------------------------------------------------------------------------
+   // intersectDomain() gives the "original", unshifted output domain.
+   //---------------------------------------------------------------------------
+
    inline
    Interval intersectDomain() const
    {
*************** private:
*** 411,417 ****

  //-----------------------------------------------------------------------------
  // View types for stencil objects. Stencils define operator() to return a
! // stencil engine object. If you wanted to store that object, you could write:
  //
  // A a;
  // Laplace laplace;
--- 483,492 ----

  //-----------------------------------------------------------------------------
  // View types for stencil objects. Stencils define operator() to return a
! // stencil engine object, which, when invoked, yields the result of
! // applying the stencil to the given array.
! //
! // If you wanted to store that object, you could write:
  //
  // A a;
  // Laplace laplace;
*************** struct View1, Array
--- 513,519 ----

  //-----------------------------------------------------------------------------
  // View2 is used to construct the return type for stencils where the
! // output domain is given as well.
  //-----------------------------------------------------------------------------

  template
*************** struct View2, ArrayIn,
*** 517,523 ****
  //
  // The return type is whatever the stencil outputs. If this is not
  // the same type as the elements of 'expr', you must specialize
! // the pooma FunctorResult class (see Pooma/FunctorResult.h).
  //
  // To apply a stencil, create an instance of the Stencil<> class.
  //
--- 592,598 ----
  //
  // The return type is whatever the stencil outputs. If this is not
  // the same type as the elements of 'expr', you must specialize
! // the Pooma FunctorResult class (see Pooma/FunctorResult.h).
  //
  // To apply a stencil, create an instance of the Stencil<> class.
  //
*************** struct LeafFunctor

What is the Pooma code policy for prefixing "std::" to
functions in src/Pooma/PETE/PoomaOps.in? For example, should "std::"
be added before "abs"?

Thanks for the comments,
Jeffrey D. Oldham
oldham at codesourcery.com

From jhh at caverns.com Thu Jul 5 18:38:24 2001
From: jhh at caverns.com (John Hall)
Date: Thu, 5 Jul 2001 12:38:24 -0600
Subject: [pooma-dev] std:: Policy for PoomaOps.in?
In-Reply-To: <200107051829.LAA01353@oz.codesourcery.com>
References: <200107051829.LAA01353@oz.codesourcery.com>
Message-ID:

Jeffrey:

I recently had occasion to add a field of small structs I called
tsEstimate, which required me to override two functions in "std"
space to get things to compile. If the PoomaOps code had not
specifically requested std:: I could have left the new functions in
an out-of-the-way namespace. Instead, I had to add them to std to get
the code used.

This is probably a rare occurrence, but one that should be
considered. Actually, it's pretty close to time for Pooma to enter the
namespace world, don't you think?

John Hall

> What is the Pooma code policy for prefixing "std::" to
>functions in src/Pooma/PETE/PoomaOps.in? For example, should "std::"
>be added before "abs"?
>
>Thanks for the comments,
>Jeffrey D. Oldham
>oldham at codesourcery.com

--

From JimC at proximation.com Thu Jul 5 20:18:39 2001
From: JimC at proximation.com (James Crotinger)
Date: Thu, 5 Jul 2001 14:18:39 -0600
Subject: [pooma-dev] std:: Policy for PoomaOps.in?
Message-ID:

> This is probably a rare occurrence, but one that should be
> considered. Actually, it's pretty close to time for Pooma to enter the
> namespace world, don't you think?
> John Hall

Not until everyone has Koenig lookup implemented correctly.

Actually, some stuff in Pooma (mostly later stuff that doesn't deal with
expression templates) is in the Pooma:: namespace. But putting template
functions in the Pooma namespace would be a pain in the rear (for users)
unless the namespace lookup stuff is done right. I think KCC 4.0 was
supposed to do this right, but that's a fairly recent development. I don't
know about GCC, MetroWerks, SGI CC, etc.

Jim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From oldham at codesourcery.com Fri Jul 6 03:38:21 2001
From: oldham at codesourcery.com (Jeffrey Oldham)
Date: Thu, 5 Jul 2001 20:38:21 -0700
Subject: RFA: Time Benchmarks
Message-ID: <20010705203821.A7769@codesourcery.com>

OK to commit this patch?

When running benchmarks, knowing the running time is frequently
useful. Currently, only the number of megaflops is computed. This
patch adds a "--report-time" command-line option which substitutes
running time in seconds for megaflops.

2001-07-05 Jeffrey D. Oldham

  * Benchmark.cmpl.cpp (Benchmark::Benchmark): Initialize
  report_time_m. Process "--report-time".
  (Benchmark::usage): Add "--report-time" explanation.
  (Benchmark::runImplementation): Revise storage of time xor Mflops.
  * Benchmark.h (Benchmark): Add report_time_m.

Tested on sequential Linux using gcc3.0 and by building and running three benchmarks
Approved by ???you???

Thanks,
Jeffrey D. Oldham
oldham at codesourcery.com
-------------- next part --------------
? LINUXgcc
? Benchmark.patch
? Benchmark.ChangeLog
? foo
? tests/LINUXgcc
Index: Benchmark.cmpl.cpp
===================================================================
RCS file: /home/pooma/Repository/r2/src/Utilities/Benchmark.cmpl.cpp,v
retrieving revision 1.44
diff -c -p -r1.44 Benchmark.cmpl.cpp
*** Benchmark.cmpl.cpp 2000/06/30 02:02:50 1.44
--- Benchmark.cmpl.cpp 2001/07/06 02:27:43
*************** Benchmark::Benchmark(int argc, char *arg
*** 76,81 ****
--- 76,82 ----
    print_m = true;
    diags_m = true;
+   report_time_m = false;

    // Default Inform object has null prefix and only prints from context 0:

*************** Benchmark::Benchmark(int argc, char *arg
*** 225,230 ****
--- 226,236 ----
        print_m = false;
        ++i;
      }
+     else if (strcmp("--report-time", argv[i]) == 0)
+     {
+       report_time_m = true;
+       ++i;
+     }
      else if (strcmp("--num-patches", argv[i]) == 0)
      {
        setNumPatches_m = true;
*************** void Benchmark::usage(const char *name)
*** 278,284 ****
      << " V1, V2, etc.\n"
      << "--no-print.........................don't print anything (useful if\n"
      << " profiling using an external tool).\n"
!     << "--n-idiags.........................suppress diagnostic output.\n"
      << "--iters N..........................run benchmark for N iterations\n"
      << " (no effect if using SGI timers).\n"
      << "--samples N........................repeat runs N time.\n"
--- 284,291 ----
      << " V1, V2, etc.\n"
      << "--no-print.........................don't print anything (useful if\n"
      << " profiling using an external tool).\n"
!     << "--no-diags.........................suppress diagnostic output.\n"
!     << "--report-time......................print time, not Mflops.\n"
      << "--iters N..........................run benchmark for N iterations\n"
      << " (no effect if using SGI timers).\n"
      << "--samples N........................repeat runs N time.\n"
*************** void Benchmark::runImplementation(Implem
*** 637,645 ****

    double timeper = total / double(iters);

!   // Compute the MOps and store it.
!
!   times[i] = impl->opCount() / timeper / 1.0e6;

    // If we're testing results and we're printing, do this now.

--- 644,655 ----

    double timeper = total / double(iters);

!   // Either store the running time or the MOps.
!
!   if (report_time_m)
!     times[i] = total;
!   else
!     times[i] = impl->opCount() / timeper / 1.0e6;

    // If we're testing results and we're printing, do this now.

Index: Benchmark.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Utilities/Benchmark.h,v
retrieving revision 1.27
diff -c -p -r1.27 Benchmark.h
*** Benchmark.h 2000/06/30 02:02:50 1.27
--- Benchmark.h 2001/07/06 02:27:43
*************** private:
*** 199,209 ****

    bool setSamples_m;

    //---------------------------------------------------------------------------
!   // If true, we are supposed to display results or show diagnostic output.

    bool print_m;
    bool diags_m;
!
    // --------------------------------------------------------------------------
    // Points to the Inform object pointer used for printing output:
--- 199,211 ----

    bool setSamples_m;

    //---------------------------------------------------------------------------
!   // If true, we are supposed to display results or show diagnostic
!   // output or print running time, not Mflops.

    bool print_m;
    bool diags_m;
!   bool report_time_m;
!
    // --------------------------------------------------------------------------
    // Points to the Inform object pointer used for printing output:

From cummings at linkline.com Fri Jul 6 18:24:45 2001
From: cummings at linkline.com (Julian C. Cummings)
Date: Fri, 6 Jul 2001 11:24:45 -0700
Subject: Chi machine at LANL
Message-ID:

Does anyone know who to contact for support issues
regarding the new Chi machine at LANL? I was able
to log on, but the Compaq cxx compiler is nowhere
to be found. There is no info on the LANL web site.
I wrote to Andy Martinez, who issued our accounts,
but got no response.

Thanks, Julian C.

Dr. Julian C. Cummings
Staff Scientist, CACR/Caltech
(626) 395-2543
cummings at cacr.caltech.edu

From cummings at linkline.com Fri Jul 6 18:24:45 2001
From: cummings at linkline.com (Julian C. Cummings)
Date: Fri, 6 Jul 2001 11:24:45 -0700
Subject: [pooma-dev] domain architecture diagram
In-Reply-To: <338670000.994054693@warlock.codesourcery.com>
Message-ID:

I meant to send a note about this a few days ago,
but anyway, the Intel VTune compiler "passes" the
empty base class test. The SGI MIPSpro compiler
on Nirvana does not. It gives a size of "2" for
the derived class rather than "1".

Julian C.

> -----Original Message-----
> From: Mark Mitchell [mailto:mark at codesourcery.com]
> Sent: Sunday, July 01, 2001 11:18 PM
> To: cummings at linkline.com; Pooma-Dev
> Subject: RE: [pooma-dev] domain architecture diagram
>
>
> > I don't know if
> > this issue of the empty base class is important anymore
> > with current C++ compilers.
>
> This was the only part of this discussion I was smart enough to
> understand, so I decided to reply to it. :-)
>
> In G++ 3.0, there is no empty base class penalty. The same will
> be true for any IA64 C++ ABI compliant compiler. Modern versions
> of the EDG front end (used in KCC, the Intel compiler, the SGI
> compiler, the DEC/Compaq compiler, and elsewhere) are capable of
> avoiding the penalty. However, I do not know if the vendors were
> willing to turn on this feature, since it will break compatibility
> with previous versions of their compilers. And, I do not know
> whether VC++ is capable of avoiding the penalty.
>
> Irrelevant compilers for computers named after fruit are mentioned
> only because I know that the mere presence of this sentence will
> get people's adrenalin pumping in New Mexico. :-) Seriously, I
> know nothing of whether Metrowerks can do this or not.
>
> It would be great if someone could try out:
>
> struct S { };
> struct T : public S { char c; };
>
> int main () { return sizeof (T); }
>
> with KCC. If things are good, the program will return 1; if bad,
> some greater value.
>
> --
> Mark Mitchell mark at codesourcery.com
> CodeSourcery, LLC http://www.codesourcery.com
>

From allan at stokes.ca Fri Jul 6 23:33:18 2001
From: allan at stokes.ca (Allan Stokes)
Date: Fri, 6 Jul 2001 16:33:18 -0700
Subject: status report
Message-ID:

Hello,

This week I completed a code review of the Domain constructors and most of
NewDomain. This constitutes roughly half of the Domain conceptual
documentation. Most of the Domain notation which impacts the Pooma user
derives from the constructor policies.

This week I lost another day to back spasms after having had two really good
weeks. Eighteen hours in bed. Not fun. A month ago when my back was
causing me problems I purchased a thick visco-elastic pad to make my
existing bed less firm and that seemed to solve the problem. It's a very
hot material to sleep on so I've been keeping my eyes open to find something
better. Last weekend I purchased a bed which has three layers of the same
visco-elastic material built into a thick top pad under a breathable cover.
I figured I would get the same benefit with less discomfort. Unfortunately
my back strain has slowly returned since I changed beds and none of my
adjustments have helped.

Despite my decrepitude of late I feel my Pooma work is going well at long
last. I'm enjoying finding my way around the Domain code. It seems
excessively complicated no matter where you dive in, but I haven't yet found
any complexity which could easily be sacrificed once you determine the root
force. I think the only way to achieve a significant simplification here is
to have less determination to make C++ do exactly what you want it to do.

This weekend I'll catch up on the day I lost. I'll post my commentary on
the domain constructors Monday and I'll have the first draft of my
conceptual documentation for all of Domain finished by the end of next week.

Allan

From wdn at lanl.gov Sat Jul 7 05:24:12 2001
From: wdn at lanl.gov (Dave Nystrom)
Date: Fri, 6 Jul 2001 23:24:12 -0600
Subject: [pooma-dev] Chi machine at LANL
In-Reply-To:
References:
Message-ID: <15174.40188.602125.73427@saltydog.lanl.gov>

Perhaps Lee Ankeny will know since he has used the new Compaq machines a
lot. I think he will be back sometime next week.

Dave Nystrom email: wdn at lanl.gov
LANL X-3 phone: 505-667-7913 fax: 505-665-3046

Julian C. Cummings writes:
> Does anyone know who to contact for support issues
> regarding the new Chi machine at LANL? I was able
> to log on, but the Compaq cxx compiler is nowhere
> to be found. There is no info on the LANL web site.
> I wrote to Andy Martinez, who issued our accounts,
> but got no response.
>
> Thanks, Julian C.
>
>
> Dr. Julian C. Cummings
> Staff Scientist, CACR/Caltech
> (626) 395-2543
> cummings at cacr.caltech.edu
>
>

From scotth at proximation.com Sat Jul 7 15:37:21 2001
From: scotth at proximation.com (Scott Haney)
Date: Sat, 7 Jul 2001 09:37:21 -0600
Subject: [pooma-dev] status report
In-Reply-To:
Message-ID:

On Friday, July 6, 2001, at 05:33 PM, Allan Stokes wrote:

> I think the only way to achieve a significant simplification here is
> to have less determination to make C++ do exactly what you want it to
> do.
>

Allan,

I think we need to re-evaluate what we *really* need out of domains and,
out of this evaluation, will come the simplification. If we decide that
we have a requirement that necessitates the use of fancy C++, so be it.
However, I do not believe that all of the domain complexity can be
justified on the basis of real requirements. In particular, I know that
domains, and a lot of the early implementation of POOMA 2.4, were an
exploration of what is required to support extreme generality. The
problem is that experience has shown that a lot of this generality is
not needed and is accounting for longer compile times, worse
performance, and code bloat.
Specifically, I think it is fair to say that between POOMA itself and Tecolote, a reasonable number of the usage patterns for domains have been enumerated. It is worth cataloging these since this will, largely, expose the real requirements. Consider the question of working around the non-zero-size base class problem. When do you need to solve this. I believe that this is an issue for small value types like Vector or Tensor. The reason is that you may put a billion of these in an array and you'd just as soon not waste N billion bytes. Do we ever plan to put a billion Loc, Interval, or Range objects in anything? No. Therefore, *independent of whether compilers provide support or not*, it doesn't matter if a base class wastes some space. This is not a requirement and we don't have to pay the price of complexity to supply this feature. Scott -- Scott W. Haney Development Manager Proximation LLC 2960 Rodeo Park Drive West Santa Fe, NM 87505 Voice: 505-424-3809 x101 FAX: 505-438-4161 From scotth at proximation.com Sat Jul 7 17:23:04 2001 From: scotth at proximation.com (Scott Haney) Date: Sat, 7 Jul 2001 11:23:04 -0600 Subject: Status Message-ID: I finally have a design for a POOMA relations package that I like and have implemented it and started to perform some testing. To bring people up to speed, the idea, originally developed by the Blanca team, is as follows. A very common pattern in theoretical physics is illustrated by the simple example. The total energy E is defined E = K + U where K = m v^2 / 2 is the kinetic energy and U = m g h is the potential energy and v = p / m where p is the momentum. E, K, U, and v are dependent variables while p and m are independent variables (g is a constant). Supporting this pattern is a good thing (tm) since it allows our users to more easily model their calculations using POOMA. However, there is an additional benefit in that this pattern gives us (POOMA developers) access to the global structure of the computation. 
This means that we know the data dependencies: for example, if p changes, we need to re-compute v and then K (but not U) to get a correct E. This, in turn, allows us to ensure correctness. Also, in principle, we could perform optimizations on the tree to improve performance. We could even cache information learned from early timesteps to improve performance as the simulation progresses. POOMA 2 has had a rudimentary implementation of this pattern since the start, primarily to support boundary conditions, which took the form F = f(F) where F is a field and f is some function of the field's values. This expression was encapsulated inside an object called an "Updater". Moreover, each Field contains a list of these updater objects. All updaters have an attribute called a "dirty" flag. The dirty flag for all of the updaters associated with F is set whenever F is written to. However, it is important to realize that the expression encapsulated by the updater is not run until someone wants to read from F AND the updater's dirty flag is set. The Blanca team cleverly realized that this concept can be extended to more complicated situations. For example, F = f(F, L1, L2) There is one important difference with the previous case: whenever the fields L1 or L2 become dirty, F must become dirty. Blanca has implemented an external version of this facility, but it has been clear to me for a long time that this is more properly handled inside of POOMA, both for ease of implementation and to gain the full benefits. I have struggled with the architecture for a while, but I now have a unified design that encompasses the old boundary condition support and the newer relation pattern. This new facility will replace the code in the current Updater directory. Since it is a broader abstraction, the new facility is called the "Relations" package. This is also the name originally used by Blanca. Relations can be packaged in functor objects or in regular or member functions.
From our simple example above, these would look like: // Functor class ComputeKineticEnergy { public: ComputeKineticEnergy(const ComputeKineticEnergy &, const Field_t &) { } void operator()(const Field_t &K, const Field_t &m, const Field_t &v) { K = m * v * v / 2; } }; // Function void computePotentialEnergy(const Field_t &U, const Field_t &m, const Field_t &h) { U = m * g * h; } // Member function struct ComputeVelocity { void doit(const Field_t &v, const Field_t &p, const Field_t &m) { v = p / m; } }; Functors are primarily useful when the calculation depends on cached data. This is the case with many of the boundary conditions, which pre-compute domain information. The constructor shown is required to allow, in principle, initialization of this auxiliary data. These relations are added to the appropriate fields using the statements Pooma::newRelationFunctor(ComputeKineticEnergy(), K, m, v); Pooma::newRelationFunctionPtr(computePotentialEnergy, U, m, h); Pooma::newRelationMemberPtr(obj, &ComputeVelocity::doit, v, p, m); Then, the entire calculation is triggered by E = K + U; If we then change p and write E.applyRelations(); v, K, and E will be automatically updated. All of this is working in simple cases. However, there are some subtle issues associated with handling arrays, stencils, and sub-fields that need to be worked out as well as the painful, but simple, work of supporting relations with up to 6 things on the RHS. Scott From laa at lanl.gov Mon Jul 9 16:02:37 2001 From: laa at lanl.gov (Lee A Ankeny) Date: Mon, 9 Jul 2001 10:02:37 -0600 Subject: [pooma-dev] Chi machine at LANL In-Reply-To: <15174.40188.602125.73427@saltydog.lanl.gov> References: <15174.40188.602125.73427@saltydog.lanl.gov> Message-ID: Try Ernie Buenafe (eyb) or Joe Kleczka (jhk). I haven't logged into chi for a while, but I suppose that cxx is installed as a module by now.
Lee At 11:24 PM -0600 7/6/01, Dave Nystrom wrote: >Perhaps Lee Ankeny will know since he has used the new Compaq machines a lot. >I think he will be back sometime next week. > >Dave Nystrom email: wdn at lanl.gov >LANL X-3 phone: 505-667-7913 fax: 505-665-3046 > >Julian C. Cummings writes: > > Does anyone know who to contact for support issues > > regarding the new Chi machine at LANL? I was able > > to log on, but the Compaq cxx compiler is nowhere > > to be found. There is no info on the LANL web site. > > I wrote to Andy Martinez, who issued our accounts, > > but got no response. > > > > Thanks, Julian C. > > > > > > Dr. Julian C. Cummings > > Staff Scientist, CACR/Caltech > > (626) 395-2543 > > cummings at cacr.caltech.edu > > > > -- ------------------------------------------------------------------- Lee Ankeny Section Leader, Group CCN-12, MS B295, Los Alamos, NM 87545 505-665-0195, FAX 505-665-5402, e-mail laa at lanl.gov From mark at codesourcery.com Mon Jul 9 15:46:28 2001 From: mark at codesourcery.com (Mark Mitchell) Date: Mon, 09 Jul 2001 08:46:28 -0700 Subject: [pooma-dev] std:: Policy for PoomaOps.in? In-Reply-To: Message-ID: <32320000.994693588@warlock.codesourcery.com> --On Thursday, July 05, 2001 02:18:39 PM -0600 James Crotinger wrote: > > >> This is probably a rare occurrence, but, one that should be >> considered. Actually, it's pretty close to time for Pooma to enter the >> namespace world, don't you think? >> John Hall > > Not until everyone has Koenig lookup implemented correctly. > GCC implements Koenig lookup sufficiently well that I don't recall any bug reports in that area. That does not mean that it is perfect, but it is probably decent.
-- Mark Mitchell mark at codesourcery.com CodeSourcery, LLC http://www.codesourcery.com From mark at codesourcery.com Mon Jul 9 15:57:36 2001 From: mark at codesourcery.com (Mark Mitchell) Date: Mon, 09 Jul 2001 08:57:36 -0700 Subject: Status Message-ID: <37090000.994694256@warlock.codesourcery.com> --On Saturday, July 07, 2001 11:23:04 AM -0600 Scott Haney wrote: > I finally have a design for a POOMA relations package that I like and > have implemented it and started to perform some testing. Excellent. > Pooma::newRelationFunctor(ComputeKineticEnergy(), K, m, v); > Pooma::newRelationFunctionPtr(computePotentialEnergy, U, m, h); > Pooma::newRelationMemberPtr(obj, &ComputeVelocity::doit, v, p, m); FYI, the STL in most places just uses one function, like this: template <class T> void newRelation(T t) { ... t() ... } which works with functors and function pointers automatically, and with the pointer-to-member case via an adaptor; I think there is something like `member_function' that returns a new functor so that you can say: newRelation (member_function(obj, &ComputeVelocity::doit), v, p, m) It's not clear to me that we should try to imitate the STL in any way in POOMA -- but its conventions are now used by lots of people, so it might be good to gradually move the interfaces in that direction. It's also worth noting that compilers are working hard on optimizing those constructs, so if POOMA looks like the STL, that might have positive side-effects as well. Doing things that way also makes for a nice concept-based view of what the arguments are; they are things that are Callable. And, of course, if you go this way you only have to write one function instead of three... -- Mark Mitchell mark at codesourcery.com CodeSourcery, LLC http://www.codesourcery.com From gdr at codesourcery.com Mon Jul 9 19:59:44 2001 From: gdr at codesourcery.com (Gabriel Dos Reis) Date: 09 Jul 2001 21:59:44 +0200 Subject: [pooma-dev] std:: Policy for PoomaOps.in?
In-Reply-To: Mark Mitchell's message of "Mon, 09 Jul 2001 08:46:28 -0700" References: <32320000.994693588@warlock.codesourcery.com> Message-ID: Mark Mitchell writes: | --On Thursday, July 05, 2001 02:18:39 PM -0600 James Crotinger | wrote: | | > | > | >> This is probably a rare occurrence, but, one that should be | >> considered. Actually, it's pretty close to time for Pooma to enter the | >> namespace world, don't you think? | >> John Hall | > | > Not until everyone has Koenig lookup implemented correctly. | > | | GCC implements Koenig lookup sufficiently well that I don't recall | any bug reports in that area. That does not mean that it is perfect, | but it is probably decent. Hi Mark, I guess you meant GCC-3.0. I don't know which compilers are used for POOMA but I used to have reports that VC++-6.0 has serious problems with Koenig lookup. I don't know the case for other compilers. -- Gaby From JimC at proximation.com Mon Jul 9 20:08:50 2001 From: JimC at proximation.com (James Crotinger) Date: Mon, 9 Jul 2001 14:08:50 -0600 Subject: [pooma-dev] std:: Policy for PoomaOps.in? Message-ID: > -----Original Message----- > > Hi Mark, > > I guess you meant GCC-3.0. > > I don't know which compilers are used for POOMA but I used to have > reports that VC++-6.0 has serious problems with Koenig lookup. I > don't know the case for other compilers. VC++6.0 isn't a target platform - it doesn't even support partial specialization. I'm not sure what the situation is with the Intel VTune compiler, though (wrt Koenig lookup). Jim From cummings at linkline.com Mon Jul 9 22:19:04 2001 From: cummings at linkline.com (Julian C. Cummings) Date: Mon, 9 Jul 2001 15:19:04 -0700 Subject: problem with patchDomain Message-ID: Hi All, This is probably most relevant for Scott and Stephen.
I almost have my new Particles SpatialLayout with NewFields working, but there is a problem that occurs with a cell-centered Field. One of the things I need to do is check whether a particle should be on the current patch by examining its position and comparing it with a bounding box around the domain of this patch. The bounding box that I create stretches one-half cell beyond the first and last field points on this patch. To get the domain of the current patch, I've been saying layout().patchDomain(lid); where layout() returns the FieldLayout and lid is the local id number of the current patch. The problem is that the patch domain I get handed is always in terms of the vertex domain (i.e., the mesh domain) because that is all that the FieldLayout knows about. This is a problem if the actual Field is cell-centered (or something else besides vertex-centered). Would it make sense to put these patchDomain() functions into the Field interface, so that the Field could check its centering and provide a properly adjusted patch domain? I guess an alternative for me is to try taking a view of the current patch of the Field and then ask that patch-view for its domain. Does that work, or do you get a zero-based domain when you view a Field patch? Julian C. Dr. Julian C. Cummings Staff Scientist, CACR/Caltech (626) 395-2543 cummings at cacr.caltech.edu From scotth at proximation.com Tue Jul 10 13:47:37 2001 From: scotth at proximation.com (Scott Haney) Date: Tue, 10 Jul 2001 07:47:37 -0600 Subject: [pooma-dev] Re: Status In-Reply-To: <37090000.994694256@warlock.codesourcery.com> Message-ID: On Monday, July 9, 2001, at 09:57 AM, Mark Mitchell wrote: > FYI, the STL in most places just uses one function, like this: > > template <class T> > void newRelation(T t) { ... t() ... 
} > > which works with functors and function pointers automatically, and > with the pointer-to-member case via an adaptor; I think there > is something like `member_function' that returns a new functor so that > you can say: > > newRelation (member_function(obj, &ComputeVelocity::doit), v, p, m) > > It's not clear to me that we should try to imitate the STL in any way > in POOMA -- but its conventions are now used by lots of people, so it > might be good to gradually move the interfaces in that direction. It's > also worth noting that compilers are working hard on optimizing those > constructs, so if POOMA looks like the STL, that might have positive > side-effects as well. Hi Mark, This is a good suggestion. The Blanca folks would rather have a single function to support their use of round-trip engineering tools. I'm modifying my code to do this. Scott From JimC at proximation.com Tue Jul 10 19:31:08 2001 From: JimC at proximation.com (James Crotinger) Date: Tue, 10 Jul 2001 13:31:08 -0600 Subject: [pooma-dev] std:: Policy for PoomaOps.in? Message-ID: I don't have anything formal. Most people have done Koenig lookup of operators for some time since there is no other way to name the operator (and still use it as an operator). What didn't work correctly with several compilers the last time I tested (back when I was looking at putting namespace support into PETE) was function lookup. For example: #include <iostream> using std::cout; using std::endl; namespace MySpace { template <class T> class A { public: A(const T &t) : t_m(t) { } T t_m; }; template <class T> T sum(const T &t1, const T &t2) { return T(t1.t_m + t2.t_m); } int sumInt(int t1, int t2) { return t1 + t2; } } int main() { MySpace::A<int> a(1); MySpace::A<int> b(2); MySpace::A<int> c = sum(a,b); cout << "A(1) + A(2) = " << c.t_m << endl; cout << "1 + 2 = " << MySpace::sumInt(1,2) << endl; return 0; } The line of note is "c = sum(a,b)".
If Koenig lookup is implemented, the function "sum" does not need to be qualified with MySpace::, unlike the call to sumInt, which does need to be qualified. This appears to work with GCC 2.95, but I seem to recall that there are problems with its implementation. Mark says 3.0 works. It does not work with VC++6.0 or, surprisingly, with Intel VTune 5.0. The latter, along with SGI's CC, may be the main showstoppers. Jim -----Original Message----- From: Julian C. Cummings [mailto:cummings at linkline.com] Sent: Monday, July 09, 2001 5:17 PM To: James Crotinger Subject: RE: [pooma-dev] std:: Policy for PoomaOps.in? Jim, Do you have any little example codes that test the Koenig lookup functionality? I'd be happy to run them through Intel VTune and see what happens. I should look at what Blitz does regarding this issue. I know that Blitz Arrays and other objects are in a blitz namespace, but of course expression templates are handled differently than in Pooma. Julian C. -----Original Message----- From: James Crotinger [mailto:JimC at proximation.com] Sent: Monday, July 09, 2001 1:09 PM To: 'Gabriel Dos Reis'; Mark Mitchell Cc: James Crotinger; 'John Hall'; pooma-dev at pooma.codesourcery.com Subject: RE: [pooma-dev] std:: Policy for PoomaOps.in? > -----Original Message----- > > Hi Mark, > > I guess you meant GCC-3.0. > > I don't know which compilers are used for POOMA but I used to have > reports that VC++-6.0 has serious problems with Koenig lookup. I > don't know the case for other compilers. VC++6.0 isn't a target platform - it doesn't even support partial specialization. I'm not sure what the situation is with the Intel VTune compiler, though (wrt Koenig lookup). Jim From gdr at codesourcery.com Tue Jul 10 20:12:17 2001 From: gdr at codesourcery.com (Gabriel Dos Reis) Date: 10 Jul 2001 22:12:17 +0200 Subject: [pooma-dev] std:: Policy for PoomaOps.in?
In-Reply-To: James Crotinger's message of "Tue, 10 Jul 2001 13:31:08 -0600" References: Message-ID: James Crotinger writes: | int main() | { | MySpace::A<int> a(1); | MySpace::A<int> b(2); | | MySpace::A<int> c = sum(a,b); | | cout << "A(1) + A(2) = " << c.t_m << endl; | | cout << "1 + 2 = " << MySpace::sumInt(1,2) << endl; | | return 0; | } | | The line of note is "c = sum(a,b)". If Koenig lookup is implemented, the | function "sum" does not need to be qualified with MySpace::, unlike the call | to sumInt, which does need to be qualified. | | This appears to work with GCC 2.95, but I seem to recall that there are | problems with its implementation. Mark says 3.0 works. Well, I guess I created some confusion with my nuance on Mark's statement. Koenig lookup was implemented in GCC-2.95, but there were some problems (which are now corrected in GCC-3.0). To have Koenig lookup work well in generic codes, one often needs to bring in scope some symbols which could not be found otherwise. The natural mechanism for that is a using-declaration. But the problem with GCC-2.95 was that using-declarations in function templates were simply ignored. That is no longer the case with GCC-3.0. Hope I cleared the confusion I created, -- Gaby From mark at codesourcery.com Tue Jul 24 05:35:52 2001 From: mark at codesourcery.com (Mark Mitchell) Date: Tue, 24 Jul 2001 05:35:52 -0000 Subject: Revised NewField Abstractions Document In-Reply-To: <20010723131119.A15511@codesourcery.com> Message-ID: <3815506829.995952952@[192.168.0.164]> --On Monday, July 23, 2001 1:11 PM -0700 Jeffrey Oldham wrote: > Attached is a revised document covering what we discussed while in > Santa Fe. Please resend this to pooma-dev, both so that everyone on that list can participate, and so that this goes in our permanent archives. You all obviously did some very good stuff in Santa Fe, and this kind of document will be very useful for guiding our progress.
-- Mark From oldham at codesourcery.com Mon Jul 23 22:51:07 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Mon, 23 Jul 2001 15:51:07 -0700 Subject: Revised NewField Abstraction Document Message-ID: <20010723155107.B16761@codesourcery.com> I partially revised the NewField abstraction document distributed at last Thursday's meeting between Proximation and Blanca. I revised the beginning of section 4, adding information about `FieldOffsetList's and nearest neighbors. Note this was simplified since the Thursday meeting (and will probably change again before implementation). The document is still in progress and subject to change at any time. When we figure out where to store it, it will be added to the Pooma CVS tree. (If you have difficulty reading the PDF file, please let me know.) Thanks, Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- A non-text attachment was scrubbed... Name: centerings.pdf Type: application/pdf Size: 140999 bytes Desc: not available URL: From sunsetmesa at earthlink.net Tue Jul 24 19:24:10 2001 From: sunsetmesa at earthlink.net (William Nystrom) Date: Tue, 24 Jul 2001 13:24:10 -0600 Subject: FW: Questions about data in Fields Message-ID: <4120017224192410839@earthlink.net> Second try to send this email. ----- Original Message ----- From: William Nystrom To: pooma-dev at codesourcery.com Cc: jcm at lanl.gov ; jxyh at lanl.gov ; sunsetmesa Sent: 7/24/2001 1:20:29 PM Subject: Questions about data in Fields Hi Guys, I talked to Jim about interfacing to some fortran linear solver code awhile back and then before I left for vacation, John and I did some work to try and write the interface for our application using Pooma 2 so we could use this fortran linear solver package. One of the things I am trying to do is to query a Pooma 2 Field and find out the size of the data that is local to a processor.
I've done this query for the domain object for a Field and for a cell centered field, it reports sizes in each dimension or coordinate that are one more than they should be. John told me that you guys had decided to allocate enough space for a vertex centered field even if the field was cell centered - as an optimization of some sort. I am worried that the data for a cell centered field may not be contiguous because of the extra padding that occurs for cell centered fields and because the domain object thinks its size in each dimension is one larger. Can you tell me if the data for a cell centered field that is local to a processor with one patch per processor is actually contiguous in memory? I can test this experimentally but I have not had a chance to do this yet. Also, can you tell me the recommended way to get the correct size of my data on a local processor for a cell centered Pooma 2 Field? Please send replies to sunsetmesa at earthlink.net as I am not able right now to read my lanl email and I am not subscribing to pooma-dev from my ISP account. Thanks, --- William Nystrom --- sunsetmesa at earthlink.net --- EarthLink: It's your Internet. From oldham at codesourcery.com Tue Jul 24 19:58:53 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Tue, 24 Jul 2001 12:58:53 -0700 Subject: newfield_revision Patch: FieldOffset Code Message-ID: <20010724125853.C29041@codesourcery.com> The following code implements the FieldOffset portion of the NewField abstraction revisions proposed by Jeffrey D. Oldham and Stephen Smith. A FieldOffset is a pair containing a cell offset and a number indicating a centering value within the cell. Combined with a field and a Loc indicating a specific field cell, this yields a field value.
This patch is applied to the newfield_revision development branch, which was created so Stephen and I can share code during our experimentation and development work. There is no guarantee the code on this branch will compile, much less work. I post this message to let you (and my boss) know that we are working toward a new and improved Pooma. 2001-07-24 Jeffrey D. Oldham * Field.h: Added inclusion of FieldOffset.h Added notation that support for FieldOffset with no Loc is needed. View2, Loc>: New specialization. Fix grammatical error in comment. * FieldOffset.h: New file defining FieldOffset. * tests/FieldOffset.cpp: New file testing FieldOffset. * tests/FieldTour3.cpp: Add explanatory comments. * tests/makefile: Add support for FieldTour3 and FieldOffset. Applied to newfield_revision branch Tested on sequential Linux using gcc 3.0 by building library and NewField tests Approved by Stephen Smith Thanks, Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- Index: Field.h =================================================================== RCS file: /home/pooma/Repository/r2/src/NewField/Field.h,v retrieving revision 1.15.2.2 diff -c -p -r1.15.2.2 Field.h *** Field.h 2001/07/17 23:22:39 1.15.2.2 --- Field.h 2001/07/24 19:48:11 *************** *** 67,72 **** --- 67,73 ---- #include "NewField/VectorFieldOperators.h" #include "NewField/FieldCreateLeaf.h" #include "NewField/FieldCentering.h" + #include "NewField/FieldOffset.h" #include "NewField/PrintField.h" *************** struct View1 and + // FIXME: FieldOffset. + //----------------------------------------------------------------------------- // AltView1 avoids an instantiation problem that arises when two // classes use each other. This class's definition should be exactly *************** struct View2, Loc > specialization for + // indexing a field with a FieldOffset and a Loc. 
+ //----------------------------------------------------------------------------- + + template + struct View2, + FieldOffset, + Loc > + { + // Convenience typedef for the thing we're taking a view of. + + typedef Field Subject_t; + + // The field's dimension (i.e., the number of indices required to select a point). + + enum { dimensions = Subject_t::dimensions }; + + // The return types. + + typedef typename Subject_t::Element_t ReadType_t; + typedef typename Subject_t::ElementRef_t Type_t; + + // The functions that do the indexing. + + inline static + Type_t make(const Subject_t &f, + const FieldOffset &fo, + const Loc &loc) + { + CTAssert(dimensions == Dim); + PAssert(f.numSubFields() > 0); + + #if POOMA_BOUNDS_CHECK + PInsist(contains(f.totalDomain(), loc + fo.cellOffset()), + "Field view bounds error."); + #endif + return f[fo.subFieldNumber()].engine()(loc + fo.cellOffset()); + } + + inline static + ReadType_t makeRead(const Subject_t &f, + const FieldOffset &fo, + const Loc &loc) + { + PAssert(f.numSubFields() > 0); + + #if POOMA_BOUNDS_CHECK + PInsist(contains(f.totalDomain(), loc + fo.cellOffset()), + "Field view bounds error."); + #endif + return f[fo.subFieldNumber()].engine().read(loc + fo.cellOffset()); + } + }; + + + //----------------------------------------------------------------------------- + // View2, Loc > specialization for + // indexing a field with a FieldOffset and a Loc. + //----------------------------------------------------------------------------- + + template + struct View2, + FieldOffset, + Loc > + { + // Convenience typedef for the thing we're taking a view of. + + typedef Field Subject_t; + + // The field's dimension (i.e., the number of indices required to select a point). + + enum { dimensions = Subject_t::dimensions }; + + // The return types. + + typedef typename Subject_t::Element_t ReadType_t; + typedef typename Subject_t::ElementRef_t Type_t; + + // The functions that do the indexing. 
+ + inline static + Type_t make(const Subject_t &f, + const FieldOffset &fo, + const Loc &loc) + { + CTAssert(dimensions == Dim); + + #if POOMA_BOUNDS_CHECK + PInsist(contains(f.totalDomain(), loc + fo.cellOffset()), + "Field view bounds error."); + #endif + return f.engine()(loc + fo.cellOffset()); + } + + inline static + ReadType_t makeRead(const Subject_t &f, + const FieldOffset &fo, + const Loc &loc) + { + #if POOMA_BOUNDS_CHECK + PInsist(contains(f.totalDomain(), loc + fo.cellOffset()), + "Field view bounds error."); + #endif + return f.engine().read(loc + fo.cellOffset()); + } + }; + + + //----------------------------------------------------------------------------- // View3 specialization for indexing a field with three // domains. //----------------------------------------------------------------------------- *************** public: *** 1249,1256 **** //--------------------------------------------------------------------------- ! // Component-forwarding functions. These work quite similar to the ones from ! // Array except we produce a Field with the same GeometryTag. inline typename ComponentView, This_t>::Type_t comp(const int &i1) const --- 1362,1369 ---- //--------------------------------------------------------------------------- ! // Component-forwarding functions. These work quite similarly to the ! // ones from Array except we produce a Field with the same GeometryTag. 
inline typename ComponentView, This_t>::Type_t comp(const int &i1) const Index: FieldOffset.h =================================================================== RCS file: FieldOffset.h diff -N FieldOffset.h *** /dev/null Tue May 5 14:32:27 1998 --- FieldOffset.h Tue Jul 24 13:48:11 2001 *************** *** 0 **** --- 1,153 ---- + // -*- C++ -*- + // ACL:license + // ---------------------------------------------------------------------- + // This software and ancillary information (herein called "SOFTWARE") + // called POOMA (Parallel Object-Oriented Methods and Applications) is + // made available under the terms described here. The SOFTWARE has been + // approved for release with associated LA-CC Number LA-CC-98-65. + // + // Unless otherwise indicated, this SOFTWARE has been authored by an + // employee or employees of the University of California, operator of the + // Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with + // the U.S. Department of Energy. The U.S. Government has rights to use, + // reproduce, and distribute this SOFTWARE. The public may copy, distribute, + // prepare derivative works and publicly display this SOFTWARE without + // charge, provided that this Notice and any statement of authorship are + // reproduced on all copies. Neither the Government nor the University + // makes any warranty, express or implied, or assumes any liability or + // responsibility for the use of this SOFTWARE. + // + // If SOFTWARE is modified to produce derivative works, such modified + // SOFTWARE should be clearly marked, so as not to confuse it with the + // version available from LANL. + // + // For more information about POOMA, send e-mail to pooma at acl.lanl.gov, + // or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. 
+ // ---------------------------------------------------------------------- + // ACL:license + + //----------------------------------------------------------------------------- + // Classes: + // FieldOffset + //----------------------------------------------------------------------------- + + #ifndef POOMA_NEWFIELD_OFFSET_H + #define POOMA_NEWFIELD_OFFSET_H + + //----------------------------------------------------------------------------- + // Overview: + // + // FieldOffset + // - specifies a relative cell offset and subfield number + //----------------------------------------------------------------------------- + + //----------------------------------------------------------------------------- + // Includes: + //----------------------------------------------------------------------------- + + #include "Domain/Loc.h" + + //----------------------------------------------------------------------------- + // Forward declarations: + //----------------------------------------------------------------------------- + + template + class FieldOffset; + + //----------------------------------------------------------------------------- + // Full Description of FieldOffset: + // + // Given a field f, a Loc loc, and a field offset (offset,num), a + // field value can be obtained. Since each value specified by the + // field's centering is stored in a separate subfield, the notation + // f[num](loc + offset) yields the value. + // + // Accessing values for fields with exactly one value per cell differs + // from accessing fields with multiple subfields. If a field has + // exactly one value per cell, use FieldOffset, which does not + // store a subfield number. If a field has multiple subfields, use + // FieldOffset, which stores a subfield number. + // + //----------------------------------------------------------------------------- + + + //----------------------------------------------------------------------------- + // FieldOffset. 
+ //----------------------------------------------------------------------------- + + template + class FieldOffset { + public: + + //--------------------------------------------------------------------------- + // User-callable constructors. These ctors are meant to be called by users. + + FieldOffset(const Loc &loc, const int subFieldNumber = 0) + : cell_offset_m(loc), subfield_number_m (subFieldNumber) + { + #if POOMA_BOUNDS_CHECK + PInsist(subfield_number_m >= 0, "Erroneous FieldOffset subfield number."); + #endif + return; + } + + //--------------------------------------------------------------------------- + // Accessors. + + inline const Loc &cellOffset() const + { + return cell_offset_m; + } + + inline int subFieldNumber() const + { + return subfield_number_m; + } + + private: + + // The cell offset. + Loc cell_offset_m; + + // The subfield number, if appropriate. + int subfield_number_m; + }; + + + template + class FieldOffset { + public: + + //--------------------------------------------------------------------------- + // User-callable constructors. These ctors are meant to be called by users. + + FieldOffset(const Loc &loc, const int subFieldNumber = 0) + : cell_offset_m(loc), subfield_number_m (-1) + { } + + //--------------------------------------------------------------------------- + // Accessors. + + inline const Loc &cellOffset() const + { + return cell_offset_m; + } + + private: + + // The cell offset. + Loc cell_offset_m; + + // The subfield number, if appropriate. 
+ int subfield_number_m; + }; + + + #endif // POOMA_NEWFIELD_OFFSET_H + + // ACL:rcsinfo + // ---------------------------------------------------------------------- + // $RCSfile: FieldCentering.h,v $ $Author: oldham $ + // $Revision: 1.1.2.1 $ $Date: 2001/07/16 20:44:59 $ + // ---------------------------------------------------------------------- + // ACL:rcsinfo Index: tests/FieldOffset.cpp =================================================================== RCS file: FieldOffset.cpp diff -N FieldOffset.cpp *** /dev/null Tue May 5 14:32:27 1998 --- FieldOffset.cpp Tue Jul 24 13:48:11 2001 *************** *** 0 **** --- 1,125 ---- + // -*- C++ -*- + // ACL:license + // ---------------------------------------------------------------------- + // This software and ancillary information (herein called "SOFTWARE") + // called POOMA (Parallel Object-Oriented Methods and Applications) is + // made available under the terms described here. The SOFTWARE has been + // approved for release with associated LA-CC Number LA-CC-98-65. + // + // Unless otherwise indicated, this SOFTWARE has been authored by an + // employee or employees of the University of California, operator of the + // Los Alamos National Laboratory under Contract No. W-7405-ENG-36 with + // the U.S. Department of Energy. The U.S. Government has rights to use, + // reproduce, and distribute this SOFTWARE. The public may copy, distribute, + // prepare derivative works and publicly display this SOFTWARE without + // charge, provided that this Notice and any statement of authorship are + // reproduced on all copies. Neither the Government nor the University + // makes any warranty, express or implied, or assumes any liability or + // responsibility for the use of this SOFTWARE. + // + // If SOFTWARE is modified to produce derivative works, such modified + // SOFTWARE should be clearly marked, so as not to confuse it with the + // version available from LANL. 
+ // + // For more information about POOMA, send e-mail to pooma at acl.lanl.gov, + // or visit the POOMA web page at http://www.acl.lanl.gov/pooma/. + // ---------------------------------------------------------------------- + // ACL:license + //----------------------------------------------------------------------------- + // Test of the new Centerings class. + //----------------------------------------------------------------------------- + + #include "Pooma/NewFields.h" + #include "Utilities/Tester.h" + + int main(int argc, char *argv[]) + { + Pooma::initialize(argc, argv); + Pooma::Tester tester(argc, argv); + + const int Dim = 2; + + Centering edges + = canonicalCentering(EdgeType, Continuous, XDim | YDim); + + Interval physicalVertexDomain(4, 4); + DomainLayout layout(physicalVertexDomain, GuardLayers(1)); + typedef Field, double, Brick> Field_t; + Field_t f(edges, layout, Vector(0.0), Vector(1.0, 2.0)); + Field_t g(3, edges, layout, Vector(0.0), Vector(1.0, 2.0)); + + // Set some data in the field. + + f[0].all() = 2.0; f[0] = -1.0; + f[1].all() = 3.0; f[1] = -2.0; + + // Test a field with subfields. 
+ + tester.check("f[0](0,0)", + f(FieldOffset(Loc(0), 0), Loc(0)), + -1.0, 1.0e-8); + tester.check("f[0](0,0)", + f(FieldOffset(Loc(2,1), 0), Loc(-2,-1)), + -1.0, 1.0e-8); + tester.check("f[0](2,1)", + f(FieldOffset(Loc(2,1), 0), Loc(0)), + -1.0, 1.0e-8); + tester.check("f[1](0,0)", + f(FieldOffset(Loc(0), 1), Loc(0)), + -2.0, 1.0e-8); + tester.check("f[1](1,2)", + f(FieldOffset(Loc(1,2), 1), Loc(0)), + -2.0, 1.0e-8); + f(FieldOffset(Loc(3,2), 0), Loc(-1,-1)) = 1.3; + f(FieldOffset(Loc(3,2), 1), Loc(-1,-1)) = 10.3; + tester.check("f[0](2,1)", + f(FieldOffset(Loc(2,1), 0), Loc(0)), + 1.3, 1.0e-08); + tester.check("f[1](2,1)", + f(FieldOffset(Loc(2,1), 1), Loc(0)), + 10.3, 1.0e-08); + tester.check("f[0].read(2,1)", + f.read(FieldOffset(Loc(2,1), 0), Loc(0)), + 1.3, 1.0e-08); + tester.check("f[1].read(2,1)", + f.read(FieldOffset(Loc(2,1), 1), Loc(0)), + 10.3, 1.0e-08); + + // Test a field with no subfields. + + Field_t h(canonicalCentering(CellType, Continuous, AllDim), + layout, Vector(0.0), Vector(1.0, 2.0)); + h(FieldOffset(Loc(0,0)), Loc(0,0)) = 1.3; + h(FieldOffset(Loc(0,0)), Loc(0,1)) = 2.3; + h(FieldOffset(Loc(0,0)), Loc(1,0)) = 2.8; + h(FieldOffset(Loc(1,0)), Loc(0,1)) = 3.3; + tester.check("h(0,0)", + h(FieldOffset(Loc(-1,-1)), Loc(1,1)), + 1.3, 1.0e-08); + tester.check("h(0,1)", + h(FieldOffset(Loc(0,1)), Loc(0,0)), + 2.3, 1.0e-08); + tester.check("h(1,0)", + h(FieldOffset(Loc(0,1)), Loc(1,-1)), + 2.8, 1.0e-08); + tester.check("h(1,1)", + h(FieldOffset(Loc(0,0)), Loc(1,1)), + 3.3, 1.0e-08); + tester.check("h.read(1,0)", + h.read(FieldOffset(Loc(0,1)), Loc(1,-1)), + 2.8, 1.0e-08); + tester.check("h.read(1,1)", + h.read(FieldOffset(Loc(0,0)), Loc(1,1)), + 3.3, 1.0e-08); + + int ret = tester.results("FieldOffset"); + Pooma::finalize(); + return ret; + } + + // ACL:rcsinfo + // ---------------------------------------------------------------------- + // $RCSfile: FieldTour3.cpp,v $ $Author: sasmith $ + // $Revision: 1.1.2.1 $ $Date: 2001/07/17 23:22:39 $ + // 
---------------------------------------------------------------------- + // ACL:rcsinfo Index: tests/FieldTour3.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/NewField/tests/Attic/FieldTour3.cpp,v retrieving revision 1.1.2.1 diff -c -p -r1.1.2.1 FieldTour3.cpp *** tests/FieldTour3.cpp 2001/07/17 23:22:39 1.1.2.1 --- tests/FieldTour3.cpp 2001/07/24 19:48:11 *************** int main(int argc, char *argv[]) *** 44,50 **** --- 44,55 ---- Interval<2> physicalVertexDomain(4, 4); DomainLayout<2> layout(physicalVertexDomain, GuardLayers<2>(1)); typedef Field, double, Brick> Field_t; + + // Create a field with edge-centered values for the x- and y-directions. Field_t f(edges, layout, Vector<2>(0.0), Vector<2>(1.0, 2.0)); + + // Create a 3-material field with edge-centered values for the x- + // and y-directions. Field_t g(3, edges, layout, Vector<2>(0.0), Vector<2>(1.0, 2.0)); // Set some data in the field. Index: tests/makefile =================================================================== RCS file: /home/pooma/Repository/r2/src/NewField/tests/makefile,v retrieving revision 1.11.2.2 diff -c -p -r1.11.2.2 makefile *** tests/makefile 2001/07/17 23:22:39 1.11.2.2 --- tests/makefile 2001/07/24 19:48:12 *************** run_tests: tests *** 57,65 **** field_tests:: $(ODIR)/BasicTest1 $(ODIR)/BasicTest2 \ $(ODIR)/FieldTour1 $(ODIR)/FieldTour2 \ $(ODIR)/WhereTest $(ODIR)/VectorTest \ $(ODIR)/ScalarCode $(ODIR)/StencilTests \ ! $(ODIR)/ExpressionTest ########################### --- 57,67 ---- field_tests:: $(ODIR)/BasicTest1 $(ODIR)/BasicTest2 \ $(ODIR)/FieldTour1 $(ODIR)/FieldTour2 \ + $(ODIR)/FieldTour3 \ $(ODIR)/WhereTest $(ODIR)/VectorTest \ $(ODIR)/ScalarCode $(ODIR)/StencilTests \ ! $(ODIR)/ExpressionTest $(ODIR)/Centerings \ ! 
$(ODIR)/FieldOffset ########################### *************** $(ODIR)/StencilTests: $(ODIR)/StencilTes *** 149,154 **** --- 151,163 ---- Centerings: $(ODIR)/Centerings $(ODIR)/Centerings: $(ODIR)/Centerings.o + $(LinkToSuite) + + .PHONY: FieldOffset + + FieldOffset: $(ODIR)/FieldOffset + + $(ODIR)/FieldOffset: $(ODIR)/FieldOffset.o $(LinkToSuite) .PHONY: FieldTour3 From oldham at codesourcery.com Tue Jul 24 20:13:31 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Tue, 24 Jul 2001 13:13:31 -0700 Subject: RFA: Time Benchmarks Message-ID: <20010724131331.A29454@codesourcery.com> This patch has been awaiting response since 05Jul. Is it OK to commit this patch? When running benchmarks, knowing the running time is frequently useful. Currently, only the number of megaflops is computed. This patch adds a "--report-time" command-line option which substitutes running time in seconds for megaflops. 2001-07-05 Jeffrey D. Oldham * Benchmark.cmpl.cpp (Benchmark::Benchmark): Initialize report_time_m. Process "--report-time". (Benchmark::usage): Add "--report-time" explanation. (Benchmark::runImplementation): Revise storage of time xor Mflops. * Benchmark.h (Benchmark): Add report_time_m. Tested on sequential Linux using gcc3.0 and by building and running three benchmarks Approved by ???you??? Thanks, Jeffrey D.
Oldham oldham at codesourcery.com -------------- next part -------------- Index: Benchmark.cmpl.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Utilities/Benchmark.cmpl.cpp,v retrieving revision 1.44 diff -c -p -r1.44 Benchmark.cmpl.cpp *** Benchmark.cmpl.cpp 2000/06/30 02:02:50 1.44 --- Benchmark.cmpl.cpp 2001/07/06 02:27:43 *************** Benchmark::Benchmark(int argc, char *arg *** 76,81 **** --- 76,82 ---- print_m = true; diags_m = true; + report_time_m = false; // Default Inform object has null prefix and only prints from context 0: *************** Benchmark::Benchmark(int argc, char *arg *** 225,230 **** --- 226,236 ---- print_m = false; ++i; } + else if (strcmp("--report-time", argv[i]) == 0) + { + report_time_m = true; + ++i; + } else if (strcmp("--num-patches", argv[i]) == 0) { setNumPatches_m = true; *************** void Benchmark::usage(const char *name) *** 278,284 **** << " V1, V2, etc.\n" << "--no-print.........................don't print anything (useful if\n" << " profiling using an external tool).\n" ! << "--n-idiags.........................suppress diagnostic output.\n" << "--iters N..........................run benchmark for N iterations\n" << " (no effect if using SGI timers).\n" << "--samples N........................repeat runs N time.\n" --- 284,291 ---- << " V1, V2, etc.\n" << "--no-print.........................don't print anything (useful if\n" << " profiling using an external tool).\n" ! << "--no-diags.........................suppress diagnostic output.\n" ! << "--report-time......................print time, not Mflops.\n" << "--iters N..........................run benchmark for N iterations\n" << " (no effect if using SGI timers).\n" << "--samples N........................repeat runs N time.\n" *************** void Benchmark::runImplementation(Implem *** 637,645 **** double timeper = total / double(iters); ! // Compute the MOps and store it. ! 
times[i] = impl->opCount() / timeper / 1.0e6; // If we're testing results and we're printing, do this now. --- 644,655 ---- double timeper = total / double(iters); ! // Either store the running time or the MOps. ! if (report_time_m) ! times[i] = total; ! else ! times[i] = impl->opCount() / timeper / 1.0e6; // If we're testing results and we're printing, do this now. Index: Benchmark.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Utilities/Benchmark.h,v retrieving revision 1.27 diff -c -p -r1.27 Benchmark.h *** Benchmark.h 2000/06/30 02:02:50 1.27 --- Benchmark.h 2001/07/06 02:27:43 *************** private: *** 199,209 **** bool setSamples_m; //--------------------------------------------------------------------------- ! // If true, we are supposed to display results or show diagnostic output. bool print_m; bool diags_m; ! // -------------------------------------------------------------------------- // Points to the Inform object pointer used for printing output: --- 199,211 ---- bool setSamples_m; //--------------------------------------------------------------------------- ! // If true, we are supposed to display results or show diagnostic ! // output or print running time, not Mflops. bool print_m; bool diags_m; ! bool report_time_m; ! // -------------------------------------------------------------------------- // Points to the Inform object pointer used for printing output: From scotth at proximation.com Wed Jul 25 13:40:05 2001 From: scotth at proximation.com (Scott Haney) Date: Wed, 25 Jul 2001 07:40:05 -0600 Subject: [pooma-dev] RFA: Time Benchmarks In-Reply-To: <20010724131331.A29454@codesourcery.com> Message-ID: Hi Jeffrey, Sorry about the delay. This is fine. Scott On Tuesday, July 24, 2001, at 02:13 PM, Jeffrey Oldham wrote: > This patch has been awaiting response since 05Jul. It is OK to commit > this patch? > > When running benchmarks, knowing the running time is frequently > useful. 
Currently, only the number of megaflops is computed. This > patch adds a "--report-time" command-line option which substitutes > running time in seconds for megaflops. > > 2001-07-05 Jeffrey D. Oldham > > * Benchmark.cmpl.cpp (Benchmark::Benchmark): Initialize > report_time_m. Process "--report-time". > (Benchmark::usage): Add "--report-time" explanation. > (Benchmark::runImplementation): Revise storage of time xor > Mflops. > * Benchmark.h (Benchmark): Add report_time_m. > > Tested on sequential Linux using gcc3.0 and by building and > running three benchmarks > Approved by ???you??? > > Thanks, > Jeffrey D. Oldham > oldham at codesourcery.com From scotth at proximation.com Wed Jul 25 13:50:53 2001 From: scotth at proximation.com (Scott Haney) Date: Wed, 25 Jul 2001 07:50:53 -0600 Subject: [pooma-dev] FW: Questions about data in Fields In-Reply-To: <4120017224192410839@earthlink.net> Message-ID: Hi Dave, The data in fields is not necessarily contiguous. We allocate based on a vertex-centered domain for all subfields. This is done so that when the field is partitioned, all of the subfields will be aligned. This fixes the nasty problems we had in R1 with slightly different sized fields giving dramatically different partitionings. This improves robustness and should improve performance. Note as well that if you are looking at a patch and there are internal guard layers, the data won't be contiguous either. Your best bet is to assign the patch data to a brick-array of the correct size. This will make the data contiguous. However, it would be helpful to see a snippet showing exactly what you're trying to do. This would also help with respect to the size question - I am surprised by the behavior you're reporting. Scott On Tuesday, July 24, 2001, at 01:24 PM, William Nystrom wrote: > Hi Guys, >
> I talked to Jim about interfacing to some fortran linear solver code > awhile back and then before I left for > vacation, John and I did some work to try and write the interface for > our application using Pooma 2 so > we could use this fortran linear solver package. One of the things I > am trying to do is to query a Pooma 2 > Field and find out the size of the data that is local to a processor. > I've done this query for the domain object > for a Field and for a cell centered field, it reports sizes in each > dimension or coordinate that are one more > than they should be. John told me that you guys had decided to > allocate enough space for a vertex centered > field even if the field was cell centered - as an optimization of some > sort. I am worried that the data for a > cell centered field may not be contiguous because of the extra padding > that occurs for cell centered fields > and because the domain object thinks its size in each dimension is one > larger. Can you tell me if the data > for a cell centered field that is local to a processor with one patch > per processor is actually contiguous in > memory? I can test this experimentally but I have not had a chance to > do this yet. Also, can you tell me the > recommended way to get the correct size of my data on a local processor > for a cell centered Pooma 2 > Field? > > Please send replies to sunsetmesa at earthlink.net as I am not able right > now to read my lanl email and I > am not subscribing to pooma-dev from my ISP account. From sunsetmesa at earthlink.net Wed Jul 25 16:00:43 2001 From: sunsetmesa at earthlink.net (William Nystrom) Date: Wed, 25 Jul 2001 10:0:43 -0600 Subject: [pooma-dev] FW: Questions about data in Fields Message-ID: <412001732516043568@earthlink.net> Hi Scott, Thanks for your reply.
I am trying to interface to a linear solver package that is written in f77. This is a package that I have used a lot in the past and am very familiar with. I talked with you 2-3 years ago about the interface to this package with the idea of trying to interface to Pooma 1. But I never really had time to pursue that work. Now I am trying to interface to this package using Pooma 2. I talked with Jim about this when John and I came down to Proximation a couple of months ago. I don't know if I can describe the interface well enough for you to follow it or, for that matter, whether it is even worthwhile at this stage. My ultimate goal is to be able to call this package with the minimum amount of data copying but right now, John and I have compromised on doing some extra copying. Fundamentally, what I need to pass to this fortran linear solver package is the address of the beginning of the chunk of data on a processor. The data must be contiguous. So, this means that the data cannot have any guard cells. Also, I would presume that it means that I can only have 1 vnode or patch per processor. We were wanting to use Pooma 2 Fields for all this work. All of our current physics code uses Pooma 2 Fields. So there was a desire to be able to use Pooma 2 Fields for this work as well rather than having to use arrays. I don't know if this is really answering any of your questions but I have to go for now. I'll give this some more thought and perhaps respond some more to your email. Thanks, Dave ----- Original Message ----- From: Scott Haney To: sunsetmesa at earthlink.net Cc: pooma-dev at pooma.codesourcery.com Sent: 7/25/2001 7:51:15 AM Subject: Re: [pooma-dev] FW: Questions about data in Fields Hi Dave, The data in fields is not necessarily contiguous. We allocate based on a vertex-centered domain for all subfields. This is done so that when the field is partitioned, all of the subfields will be aligned.
This fixes the nasty problems we had in R1 with slightly different sized fields giving dramatically different partitionings. This improves robustness and should improve performance. Note as well that if you are looking at a patch and there are internal guard layers, the data won't be contiguous either. Your best bet is to assign the patch data to a brick-array of the correct size. This will make the data contiguous. However, it would be helpful to see a snippet showing exactly what you're trying to do. This would also help with respect to the size question - I am surprised by the behavior you're reporting. Scott On Tuesday, July 24, 2001, at 01:24 PM, William Nystrom wrote: Hi Guys, I talked to Jim about interfacing to some fortran linear solver code awhile back and then before I left for vacation, John and I did some work to try and write the interface for our application using Pooma 2 so we could use this fortran linear solver package. One of the things I am trying to do is to query a Pooma 2 Field and find out the size of the data that is local to a processor. I've done this query for the domain object for a Field and for a cell centered field, it reports sizes in each dimension or coordinate that are one more than they should be. John told me that you guys had decided to allocate enough space for a vertex centered field even if the field was cell centered - as an optimization of some sort. I am worried that the data for a cell centered field may not be contiguous because of the extra padding that occurs for cell centered fields and because the domain object thinks its size in each dimension is one larger. Can you tell me if the data for a cell centered field that is local to a processor with one patch per processor is actually contiguous in memory? I can test this experimentally but I have not had a chance to do this yet. Also, can you tell me the recommended way to get the correct size of my data on a local processor for a cell centered Pooma 2 Field?
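To make the contiguity point in this thread concrete, here is a minimal sketch in plain C++ rather than POOMA code. It assumes a two-dimensional block allocated with vertex-centered extents (nx + 1) by (ny + 1), first index varying fastest, of which only the nx by ny cell-centered values are meaningful. The helper name packInterior and the storage layout are assumptions made for illustration only; they are not part of the POOMA API, and the real Brick storage and guard-layer handling may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of POOMA: copy the nx-by-ny cell-centered
// values out of a block that was allocated with vertex-centered extents
// (nx + 1) by (ny + 1), stored with the first index varying fastest.
std::vector<double> packInterior(const std::vector<double> &padded,
                                 std::size_t nx, std::size_t ny)
{
  // Allocated row length; one element longer than the meaningful data.
  const std::size_t stride = nx + 1;
  assert(padded.size() == stride * (ny + 1));

  std::vector<double> tight;
  tight.reserve(nx * ny);
  for (std::size_t j = 0; j < ny; ++j)
    for (std::size_t i = 0; i < nx; ++i)
      // Element (i, j) sits at i + stride * j, so consecutive rows of
      // meaningful data are separated by one padding element.
      tight.push_back(padded[i + stride * j]);
  return tight;
}
```

The returned vector's data() pointer addresses nx * ny contiguous doubles, which is the kind of argument an f77 routine expecting a dense array would need, at the cost of one copy in each direction.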
Please send replies to sunsetmesa at earthlink.net as I am not able right now to read my lanl email and I am not subscribing to pooma-dev from my ISP account. --- William Nystrom --- sunsetmesa at earthlink.net From oldham at codesourcery.com Wed Jul 25 16:03:48 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Wed, 25 Jul 2001 09:03:48 -0700 Subject: Patch: Benchmark Time Message-ID: <20010725090348.A3207@codesourcery.com> When running benchmarks, knowing the running time is frequently useful. Before this patch, only the number of megaflops was computed. This patch adds a "--report-time" command-line option which substitutes running time in seconds for megaflops. 2001-07-25 Jeffrey D. Oldham * Benchmark.cmpl.cpp (Benchmark::Benchmark): Initialize report_time_m. Process "--report-time". (Benchmark::usage): Add "--report-time" explanation. (Benchmark::runImplementation): Revise storage of time xor Mflops. * Benchmark.h (Benchmark): Add report_time_m. Tested on sequential Linux using gcc3.0 and by building and running three benchmarks. Approved by Scott Haney. Applied to mainline branch (and newfield_revision by accident). Thanks, Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- ? LINUXgcc ? Benchmark.patch ?
tests/LINUXgcc Index: Benchmark.cmpl.cpp =================================================================== RCS file: /home/pooma/Repository/r2/src/Utilities/Benchmark.cmpl.cpp,v retrieving revision 1.44 diff -c -p -r1.44 Benchmark.cmpl.cpp *** Benchmark.cmpl.cpp 2000/06/30 02:02:50 1.44 --- Benchmark.cmpl.cpp 2001/07/05 20:13:26 *************** Benchmark::Benchmark(int argc, char *arg *** 76,81 **** --- 76,82 ---- print_m = true; diags_m = true; + report_time_m = false; // Default Inform object has null prefix and only prints from context 0: *************** Benchmark::Benchmark(int argc, char *arg *** 225,230 **** --- 226,236 ---- print_m = false; ++i; } + else if (strcmp("--report-time", argv[i]) == 0) + { + report_time_m = true; + ++i; + } else if (strcmp("--num-patches", argv[i]) == 0) { setNumPatches_m = true; *************** void Benchmark::usage(const char *name) *** 278,284 **** << " V1, V2, etc.\n" << "--no-print.........................don't print anything (useful if\n" << " profiling using an external tool).\n" ! << "--n-idiags.........................suppress diagnostic output.\n" << "--iters N..........................run benchmark for N iterations\n" << " (no effect if using SGI timers).\n" << "--samples N........................repeat runs N time.\n" --- 284,291 ---- << " V1, V2, etc.\n" << "--no-print.........................don't print anything (useful if\n" << " profiling using an external tool).\n" ! << "--no-diags.........................suppress diagnostic output.\n" ! << "--report-time......................print time, not Mflops.\n" << "--iters N..........................run benchmark for N iterations\n" << " (no effect if using SGI timers).\n" << "--samples N........................repeat runs N time.\n" *************** void Benchmark::runImplementation(Implem *** 637,645 **** double timeper = total / double(iters); ! // Compute the MOps and store it. ! 
times[i] = impl->opCount() / timeper / 1.0e6; // If we're testing results and we're printing, do this now. --- 644,655 ---- double timeper = total / double(iters); ! // Either store the running time or the MOps. ! if (report_time_m) ! times[i] = timeper; ! else ! times[i] = impl->opCount() / timeper / 1.0e6; // If we're testing results and we're printing, do this now. Index: Benchmark.h =================================================================== RCS file: /home/pooma/Repository/r2/src/Utilities/Benchmark.h,v retrieving revision 1.27 diff -c -p -r1.27 Benchmark.h *** Benchmark.h 2000/06/30 02:02:50 1.27 --- Benchmark.h 2001/07/05 20:13:26 *************** private: *** 199,209 **** bool setSamples_m; //--------------------------------------------------------------------------- ! // If true, we are supposed to display results or show diagnostic output. bool print_m; bool diags_m; ! // -------------------------------------------------------------------------- // Points to the Inform object pointer used for printing output: --- 199,211 ---- bool setSamples_m; //--------------------------------------------------------------------------- ! // If true, we are supposed to display results or show diagnostic ! // output or print running time, not Mflops. bool print_m; bool diags_m; ! bool report_time_m; ! // -------------------------------------------------------------------------- // Points to the Inform object pointer used for printing output: From oldham at codesourcery.com Wed Jul 25 21:12:49 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Wed, 25 Jul 2001 14:12:49 -0700 Subject: Chevron Code Using New Field Abstractions Message-ID: <20010725141249.A3617@codesourcery.com> Attached is a very preliminary version of the Chevron code written using C++ pseudocode closely related to the proposed NewField revisions. It does not compile since the underlying NewField and mesh routines have not yet been implemented. The next steps are: 1. 
To ensure that the algorithm is correct. 2. To add more comments describing my assumptions about functions and classes. 3. To discuss whether the syntax is acceptable. 4. To make the code available in some portion of the Pooma CVS tree. Thanks, Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- // Oldham, Jeffrey D. // 2001Jul25 // Pooma // Chevron Kernel Written Using POOMA's Proposed Field Abstraction #include <iostream> #include <cstdlib> #include "Pooma/NewFields.h" // This program implements "Implementation of a Flux-Continuous Finite // Difference Method for Stratigraphic, Hexahedron Grids," by // S. H. Lee, H. Tchelepi, and L. J. DeChant, \emph{1999 SPE Reservoir // Simulation Symposium}, SPE (Society of Petroleum Engineers) 51901. // Preprocessor symbols: // PSEUDOCODE: Do not define this symbol. Surrounds desired code to // deal with different granularity fields. // DEBUG: If defined, print some information about internal program // values. template inline typename Field::T_t faceWeightedSum(const Field& inputField, const FieldOffsetList &lst, const Field& outputField) { typedef typename Field::T_t T_t; typedef typename FieldOffsetList::size_type size_type; CTAssert((Field::dimensions == Dim)); // HERE const size_type lstLength = lst.size(); PInsist(lstLength > 0, "faceWeightedSum() must be given a nonempty list."); T_t init = inputField(lst[0], loc); // FIXME inputField.mesh().face(arg).normal() returns a normal // vector with length equal to the face's area and direction // perpendicular to the face. for (size_type i = 1; i < lstLength ; ++i) init += outputField.mesh().face(WHICH).normal() * inputField(lst[i], loc); return init; } /** THE PROGRAM **/ int main(int argc, char *argv[]) { // Set up the Pooma library. Pooma::initialize(argc,argv); #ifdef DEBUG std::cout << "Start program." << std::endl; #endif // DEBUG /* DECLARATIONS */ // Create a simple layout. const unsigned Dim = 2; // Work in a 2D world.
const unsigned nXs = 5; // number of horizontal vertices const unsigned nYs = 4; // number of vertical vertices Interval meshDomain; meshDomain[0] = Interval<1>(nXs); meshDomain[1] = Interval<1>(nYs); DomainLayout meshLayout(meshDomain, GuardLayers(1)); // Preparation for Field creation. Vector origin(0.0); Vector spacings(1.0,1.0); typedef UniformRectilinear > Geometry_t; typedef Field Fields_t; typedef Field ConstFields_t; // TODO: Change to ConstField when ConstField is available. typedef Tensor Tensor_t; typedef Field FieldT_t; typedef Field ConstFieldT_t; // TODO: Change to ConstField when ConstField is available. typedef Field, Brick> Fieldv_t; typedef Field, Brick> ConstFieldv_t; // TODO: Change to ConstField when ConstField is available. // Cell-centered Fields. Centering cell = canonicalCentering(CellType, Continuous); ConstFieldT_t permeability (cell, meshLayout, origin, spacings); ConstFields_t pressure (cell, meshLayout, origin, spacings); Fields_t totalFlux (cell, meshLayout, origin, spacings); // Subcell-centered Field. typedef Centering::Orientation Orientation; typedef Centering::Position Position; Position position; Centering subcell(CellType, Continuous); position(0) = 0.25; position(1) = 0.25; subcell.addValue(Orientation(1), position); position(1) = 0.75; subcell.addValue(Orientation(1), position); position(0) = 0.75; subcell.addValue(Orientation(1), position); position(1) = 0.25; subcell.addValue(Orientation(1), position); Fields_t pressureGradient (subcell, meshLayout, origin, spacings); // Spoke-centered Field. Centering spoke(FaceType, Discontinuous); // QUESTION: These are supposed to be Discontinuous, right? Orientation orientation; // NOTE: This code is not dimension-independent. 
for (int zeroFace = 0; zeroFace < 2; ++zeroFace) { orientation = 1; orientation[zeroFace] = 0; position(zeroFace) = 0.0; position(1-zeroFace) = 0.25; spoke.addValue(orientation, position); position(1-zeroFace) = 0.75; spoke.addValue(orientation, position); position(zeroFace) = 1.0; position(1-zeroFace) = 0.25; spoke.addValue(orientation, position); position(1-zeroFace) = 0.75; spoke.addValue(orientation, position); } Fields_t spokeFlux (spoke, meshLayout, origin, spacings); // Face-centered. Centering disFace = canonicalCentering(FaceType, Discontinuous); /* INITIALIZATION */ #ifdef PSEUDOCODE // Initialize tensors. // Initialize the pressures. // Initialize coordinates. #endif // PSEUDOCODE /* COMPUTATION */ #ifdef PSEUDOCODE // Compute pressureGradients by simultaneously solving several // linear equations. The operands have different centerings. // FIXME pressureGradients = linearAlgebra<2>(pressure /* cell-centered */, /* Interpolate from vertex-centered to cell-centered: */ interpolate(coordinates), permeability /* cell-centered */, normals /* face-centered */); #endif // PSEUDOCODE // Compute the spoke fluxes. // We must multiply three quantities, each with a different // centering, to yield values at a fourth-centering. permeability // is cell-centered. pressureGradient is subcell-centered. The // normals are face-centered. The product is spoke-centered. spokeFlux = dot(replicate(dot(replicate(permeability, nearestNeighbors(cell, subcell)), pressureGradient), nearestNeighbors(subcell, spoke)), replicate(meshLayout.unitCoordinateNormals(), nearestNeighbors(disFace, spoke))); // Sum the spoke fluxes into a cell flux. // Q = \sum_{faces f of cell} sign_f area_f \sum_{subfaces sf of f} q_{sf} // We compute this in three steps: // 1. Add together the flux values on each face to form a // face-centered field. // 2. Multiply each value by the magnitude of the face's normal. // 3. Add together each face's value. 
totalFlux = sum(spokeFlux.mesh().normals().signedMagnitude() * sum(spokeFlux, nearestNeighbors(spoke, disFace)), findFieldOffsetList(disFace, cell)); /* TERMINATION */ std::cout << "total flux:\n" << totalFlux << std::endl; #ifdef DEBUG std::cout << "End program." << std::endl; #endif // DEBUG Pooma::finalize(); return EXIT_SUCCESS; } From oldham at codesourcery.com Wed Jul 25 21:28:37 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Wed, 25 Jul 2001 14:28:37 -0700 Subject: [pooma-dev] Chevron Code Using New Field Abstractions In-Reply-To: <20010725141249.A3617@codesourcery.com>; from oldham@codesourcery.com on Wed, Jul 25, 2001 at 02:12:49PM -0700 References: <20010725141249.A3617@codesourcery.com> Message-ID: <20010725142837.A3654@codesourcery.com> I cleaned out the old code near the top of the file. Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- // Oldham, Jeffrey D. // 2001Jul25 // Pooma // Chevron Kernel Written Using POOMA's Proposed Field Abstraction #include <iostream> #include <cstdlib> #include "Pooma/NewFields.h" // This program implements "Implementation of a Flux-Continuous Finite // Difference Method for Stratigraphic, Hexahedron Grids," by // S. H. Lee, H. Tchelepi, and L. J. DeChant, \emph{1999 SPE Reservoir // Simulation Symposium}, SPE (Society of Petroleum Engineers) 51901. // Preprocessor symbols: // PSEUDOCODE: Do not define this symbol. Surrounds desired code to // deal with different granularity fields. // DEBUG: If defined, print some information about internal program // values. /** THE PROGRAM **/ int main(int argc, char *argv[]) { // Set up the Pooma library. Pooma::initialize(argc,argv); #ifdef DEBUG std::cout << "Start program." << std::endl; #endif // DEBUG /* DECLARATIONS */ // Create a simple layout. const unsigned Dim = 2; // Work in a 2D world.
const unsigned nXs = 5; // number of horizontal vertices const unsigned nYs = 4; // number of vertical vertices Interval meshDomain; meshDomain[0] = Interval<1>(nXs); meshDomain[1] = Interval<1>(nYs); DomainLayout meshLayout(meshDomain, GuardLayers(1)); // Preparation for Field creation. Vector origin(0.0); Vector spacings(1.0,1.0); typedef UniformRectilinear > Geometry_t; typedef Field Fields_t; typedef Field ConstFields_t; // TODO: Change to ConstField when ConstField is available. typedef Tensor Tensor_t; typedef Field FieldT_t; typedef Field ConstFieldT_t; // TODO: Change to ConstField when ConstField is available. typedef Field, Brick> Fieldv_t; typedef Field, Brick> ConstFieldv_t; // TODO: Change to ConstField when ConstField is available. // Cell-centered Fields. Centering cell = canonicalCentering(CellType, Continuous); ConstFieldT_t permeability (cell, meshLayout, origin, spacings); ConstFields_t pressure (cell, meshLayout, origin, spacings); Fields_t totalFlux (cell, meshLayout, origin, spacings); // Subcell-centered Field. typedef Centering::Orientation Orientation; typedef Centering::Position Position; Position position; Centering subcell(CellType, Continuous); position(0) = 0.25; position(1) = 0.25; subcell.addValue(Orientation(1), position); position(1) = 0.75; subcell.addValue(Orientation(1), position); position(0) = 0.75; subcell.addValue(Orientation(1), position); position(1) = 0.25; subcell.addValue(Orientation(1), position); Fields_t pressureGradient (subcell, meshLayout, origin, spacings); // Spoke-centered Field. Centering spoke(FaceType, Discontinuous); // QUESTION: These are supposed to be Discontinuous, right? Orientation orientation; // NOTE: This code is not dimension-independent. 
for (int zeroFace = 0; zeroFace < 2; ++zeroFace) { orientation = 1; orientation[zeroFace] = 0; position(zeroFace) = 0.0; position(1-zeroFace) = 0.25; spoke.addValue(orientation, position); position(1-zeroFace) = 0.75; spoke.addValue(orientation, position); position(zeroFace) = 1.0; position(1-zeroFace) = 0.25; spoke.addValue(orientation, position); position(1-zeroFace) = 0.75; spoke.addValue(orientation, position); } Fields_t spokeFlux (spoke, meshLayout, origin, spacings); // Face-centered. Centering disFace = canonicalCentering(FaceType, Discontinuous); /* INITIALIZATION */ #ifdef PSEUDOCODE // Initialize tensors. // Initialize the pressures. // Initialize coordinates. #endif // PSEUDOCODE /* COMPUTATION */ #ifdef PSEUDOCODE // Compute pressureGradients by simultaneously solving several // linear equations. The operands have different centerings. // FIXME pressureGradients = linearAlgebra<2>(pressure /* cell-centered */, /* Interpolate from vertex-centered to cell-centered: */ interpolate(coordinates), permeability /* cell-centered */, normals /* face-centered */); #endif // PSEUDOCODE // Compute the spoke fluxes. // We must multiply three quantities, each with a different // centering, to yield values at a fourth-centering. permeability // is cell-centered. pressureGradient is subcell-centered. The // normals are face-centered. The product is spoke-centered. spokeFlux = dot(replicate(dot(replicate(permeability, nearestNeighbors(cell, subcell)), pressureGradient), nearestNeighbors(subcell, spoke)), replicate(meshLayout.unitCoordinateNormals(), nearestNeighbors(disFace, spoke))); // Sum the spoke fluxes into a cell flux. // Q = \sum_{faces f of cell} sign_f area_f \sum_{subfaces sf of f} q_{sf} // We compute this in three steps: // 1. Add together the flux values on each face to form a // face-centered field. // 2. Multiply each value by the magnitude of the face's normal. // 3. Add together each face's value. 
totalFlux = sum(spokeFlux.mesh().normals().signedMagnitude() * sum(spokeFlux, nearestNeighbors(spoke, disFace)), findFieldOffsetList(disFace, cell)); /* TERMINATION */ std::cout << "total flux:\n" << totalFlux << std::endl; #ifdef DEBUG std::cout << "End program." << std::endl; #endif // DEBUG Pooma::finalize(); return EXIT_SUCCESS; } From mark at codesourcery.com Wed Jul 25 21:38:50 2001 From: mark at codesourcery.com (Mark Mitchell) Date: Wed, 25 Jul 2001 14:38:50 -0700 Subject: [pooma-dev] Chevron Code Using New Field Abstractions In-Reply-To: <20010725142837.A3654@codesourcery.com> Message-ID: <119110000.996097130@warlock.codesourcery.com> --On Wednesday, July 25, 2001 02:28:37 PM -0700 Jeffrey Oldham wrote: > I cleaned out the old code near the top of the file. I looked at the code, and it looks plausible to me. The good news is that I haven't read the paper and I'm not a physicist and I can still almost understand what is going on, so that is good. -- Mark Mitchell mark at codesourcery.com CodeSourcery, LLC http://www.codesourcery.com From oldham at codesourcery.com Wed Jul 25 22:20:45 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Wed, 25 Jul 2001 15:20:45 -0700 Subject: [pooma-dev] Chevron Code Using New Field Abstractions In-Reply-To: <20010725141249.A3617@codesourcery.com>; from oldham@codesourcery.com on Wed, Jul 25, 2001 at 02:12:49PM -0700 References: <20010725141249.A3617@codesourcery.com> Message-ID: <20010725152045.A3785@codesourcery.com> On Wed, Jul 25, 2001 at 02:12:49PM -0700, Jeffrey Oldham wrote: > Attached is a very preliminary version of the Chevron code written > using C++ pseudocode closely related to the proposed NewField > revisions. It does not compile since the underlying NewField and mesh > routines have not yet been implemented. > > The next steps are: > > 1. To ensure that the algorithm is correct. > 2. To add more comments describing my assumptions about functions and > classes. 
I have added comments near the beginning of the file. > 3. To discuss whether the syntax is acceptable. > 4. To make the code available in some portion of the Pooma CVS tree. Thanks, Jeffrey D. Oldham oldham at codesourcery.com -------------- next part -------------- // Oldham, Jeffrey D. // 2001Jul25 // Pooma // Chevron Kernel Written Using POOMA's Proposed Field Abstraction #include #include #include "Pooma/NewFields.h" // This program implements "Implementation of a Flux-Continuous Finite // Difference Method for Stratigraphic, Hexahedron Grids," by // S. H. Lee, H. Tchelepi, and L. J. DeChant, \emph{1999 SPE Reservoir // Simulation Symposium}, SPE (Society of Petroleum Engineers) 51901. // Preprocessor symbols: // PSEUDOCODE: Do not define this symbol. Surrounds desired code to // deal with different granularity fields. // DEBUG: If defined, print some information about internal program // values. /** QUESTIONS **/ // o. If several different fields are created using the same mesh // object, is the mesh object shared? // o. Can meshes be queried without going through an associated field? // o. According to my understanding, the Chevron algorithm should be // imbedded inside a loop of some type that repeatedly updates the // coordinates. // o. I omitted a separate coordinates field, presumably updated each // iteration, in favor of using the mesh. Since I do not know how // the coordinates are updated, I omitted updating the mesh. // o. Is it important to flesh out the linear algebra solution? We // might learn something about field syntax, but it will also take // time for me to determine the correct operands. // o. The eight spoke-centered flux values are discontinuous, right? // o. Creating non-canonical edge and face centerings requires // dimension-dependent code. Is this acceptable?
/** UNFINISHED WORK **/ // o ConstField = a Field with values that do not change // o nearestNeighbors(inputCentering, outputCentering) // o replicate(field, std::vector) // o meshLayout.unitCoordinateNormals() // o field.mesh() // o field.mesh().normals() // o field.mesh().normals().signedMagnitude() // o sum(field, FieldOffsetList) /** EXPLANATIONS **/ // o Centering canonicalCentering(CellType, Continuous): // returns a centering object for a cell-centered field with one // value at the cell's center (in logical coordinate space) // o subcell: This centering contains four cell-centered values at // positions (0.25, 0.25), (0.25, 0.75), (0.75, 0.75), (0.75, 0.25). // Since this centering is not a canonical centering, it must be // constructed. To do so, we start with a cell-centered centering // without any values and repeatedly add values. The orientation, // ignored for cell-centered values, indicates which coordinate values // are fixed and which are not. Using a (1,...,1) indicates that // all coordinate values may be changed. // o spoke: This face-centering has two values on each face. It, too, // has to be constructed since it is not a normal centering. // o The Chevron algorithm first solves a linear program. I have // omitted this computation since it does not illustrate field // computations. // o replicate(field, std::vector): This function, // syntactic sugar for a nearest neighbors computation, copies the // field values to the positions indicated by the // std::vector. Each field value is copied to one // or more values. replicate() could be replaced by sum(), but the // latter function has an unnecessary loop since each output value // equals one input value. // o nearestNeighbors(inputCentering, outputCentering): This function // returns a std::vector of FieldOffsetList's, one for each output // value specified by the given output centering. For each output // value, the closest input values, wrt Manhattan distance, are // returned.
Eventually, these may be pre-computed or cached to // reduce running time. // o meshLayout.unitCoordinateNormals(): This returns a discontinuous // face-centered field with unit-length normals all pointing in // positive directions. // o field.mesh(): Returns the mesh object associated with the field. // o spokeFlux.mesh().normals(): Returns a face-centered field of // normal vectors perpendicular to each face. The magnitude of each // normal equals the face's area/volume. // o spokeFlux.mesh().normals().signedMagnitude(): Returns a // face-centered field of scalars, each having absolute value // equalling the face's area/volume and sign equalling whether the // face's normal is in a positive direction, e.g., the positive // x-direction vs. the negative x-direction. // o sum(field, FieldOffsetList): this parallel-data statement adds // the values indicated in the FieldOffsetList to form each output value /** THE PROGRAM **/ int main(int argc, char *argv[]) { // Set up the Pooma library. Pooma::initialize(argc,argv); #ifdef DEBUG std::cout << "Start program." << std::endl; #endif // DEBUG /* DECLARATIONS */ // Create a simple layout. const unsigned Dim = 2; // Work in a 2D world. const unsigned nXs = 5; // number of horizontal vertices const unsigned nYs = 4; // number of vertical vertices Interval meshDomain; meshDomain[0] = Interval<1>(nXs); meshDomain[1] = Interval<1>(nYs); DomainLayout meshLayout(meshDomain, GuardLayers(1)); // Preparation for Field creation. Vector origin(0.0); Vector spacings(1.0,1.0); typedef UniformRectilinear > Geometry_t; typedef Field Fields_t; typedef Field ConstFields_t; // TODO: Change to ConstField when ConstField is available. typedef Tensor Tensor_t; typedef Field FieldT_t; typedef Field ConstFieldT_t; // TODO: Change to ConstField when ConstField is available. typedef Field, Brick> Fieldv_t; typedef Field, Brick> ConstFieldv_t; // TODO: Change to ConstField when ConstField is available. // Cell-centered Fields. 
Centering cell = canonicalCentering(CellType, Continuous); ConstFieldT_t permeability (cell, meshLayout, origin, spacings); ConstFields_t pressure (cell, meshLayout, origin, spacings); Fields_t totalFlux (cell, meshLayout, origin, spacings); // Subcell-centered Field. typedef Centering::Orientation Orientation; typedef Centering::Position Position; Position position; Centering subcell(CellType, Continuous); position(0) = 0.25; position(1) = 0.25; subcell.addValue(Orientation(1), position); position(1) = 0.75; subcell.addValue(Orientation(1), position); position(0) = 0.75; subcell.addValue(Orientation(1), position); position(1) = 0.25; subcell.addValue(Orientation(1), position); Fields_t pressureGradient (subcell, meshLayout, origin, spacings); // Spoke-centered Field. Centering spoke(FaceType, Discontinuous); // QUESTION: These are supposed to be Discontinuous, right? Orientation orientation; // NOTE: This code is not dimension-independent. for (int zeroFace = 0; zeroFace < 2; ++zeroFace) { orientation = 1; orientation[zeroFace] = 0; position(zeroFace) = 0.0; position(1-zeroFace) = 0.25; spoke.addValue(orientation, position); position(1-zeroFace) = 0.75; spoke.addValue(orientation, position); position(zeroFace) = 1.0; position(1-zeroFace) = 0.25; spoke.addValue(orientation, position); position(1-zeroFace) = 0.75; spoke.addValue(orientation, position); } Fields_t spokeFlux (spoke, meshLayout, origin, spacings); // Face-centered. Centering disFace = canonicalCentering(FaceType, Discontinuous); /* INITIALIZATION */ #ifdef PSEUDOCODE // Initialize tensors. // Initialize the pressures. // Initialize coordinates. #endif // PSEUDOCODE /* COMPUTATION */ #ifdef PSEUDOCODE // Compute pressureGradients by simultaneously solving several // linear equations. The operands have different centerings. 
// FIXME pressureGradients = linearAlgebra<2>(pressure /* cell-centered */, /* Interpolate from vertex-centered to cell-centered: */ interpolate(coordinates), permeability /* cell-centered */, normals /* face-centered */); #endif // PSEUDOCODE // Compute the spoke fluxes. // We must multiply three quantities, each with a different // centering, to yield values at a fourth-centering. permeability // is cell-centered. pressureGradient is subcell-centered. The // normals are face-centered. The product is spoke-centered. spokeFlux = dot(replicate(dot(replicate(permeability, nearestNeighbors(cell, subcell)), pressureGradient), nearestNeighbors(subcell, spoke)), replicate(meshLayout.unitCoordinateNormals(), nearestNeighbors(disFace, spoke))); // Sum the spoke fluxes into a cell flux. // Q = \sum_{faces f of cell} sign_f area_f \sum_{subfaces sf of f} q_{sf} // We compute this in three steps: // 1. Add together the flux values on each face to form a // face-centered field. // 2. Multiply each value by the magnitude of the face's normal. // 3. Add together each face's value. totalFlux = sum(spokeFlux.mesh().normals().signedMagnitude() * sum(spokeFlux, nearestNeighbors(spoke, disFace)), nearestNeighbors(disFace, cell)); /* TERMINATION */ std::cout << "total flux:\n" << totalFlux << std::endl; #ifdef DEBUG std::cout << "End program." << std::endl; #endif // DEBUG Pooma::finalize(); return EXIT_SUCCESS; } From oldham at codesourcery.com Fri Jul 27 00:02:53 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Thu, 26 Jul 2001 17:02:53 -0700 Subject: KCC on Irix vs. Linux Message-ID: <20010726170253.A2563@codesourcery.com> We compare the performance of the Pooma target KCC compiler on Linux and Irix computers. Both Stephen Smith and Gabriel Dos Reis supplied data. I conclude that optimizing Pooma using KCC on Linux is likely to lead to similar speed-ups for the target configuration of KCC on Irix.
Running Times for Linux are Significantly Larger than for Irix

                 Linux (Stephen)               Linux (Gaby)                       Irix
Acoustic2d
    N          C CppTran PoomaII          C CppTran PoomaII          C CppTran PoomaII
   10       0.02    0.03    0.06       0.01    0.05    0.10       0.00    0.00    0.00
   31       0.17    0.33    0.45       0.22    0.53    0.69       0.00    0.00    0.01
  100       6.87    8.45    7.96       7.98   12.00   12.22       0.04    0.05    0.05
Doof2d
    N          C CppTran PoomaII          C CppTran PoomaII          C CppTran PoomaII
   10       0.00    0.00    0.00       0.00    0.00    0.00       0.00    0.00    0.00
   31       0.00    0.00    0.00       0.00    0.00    0.00       0.00    0.00    0.00
  100       0.00    0.01    0.00       0.01    0.00    0.01       0.00    0.00    0.00
  316       0.06    0.09    0.07       0.09    0.12    0.11       0.01    0.01    0.01
 1000       0.67    0.85    0.78       0.94    1.17    1.05       0.11    0.10    0.13
Solvers/Krylov/CGA
    N          C CppTran PoomaII          C CppTran PoomaII          C CppTran PoomaII
    1       0.00    0.00    5.07       0.00    0.00    8.04       0.00    0.00    0.37
    3       0.00    0.00    0.00       0.00    0.00    0.01       0.00    0.00    0.00
   10       0.01    0.01    0.02       0.01    0.02    0.03       0.00    0.00    0.00
   31       0.19    0.42    0.44       0.24    0.70    0.83       0.00    0.01    0.01
  100      20.34   31.12   28.81      23.16   37.75   44.54       0.16    0.19    0.23

Abstraction Ratios for Linux and Irix are Comparable

The abstraction ratio is the ratio of an implementation's running time to the C implementation's running time. If the C running time is 0.0, the ratio is omitted (shown as -). The data is too sparse to draw a firm conclusion.

                 Linux (Stephen)               Linux (Gaby)                       Irix
Acoustic2d
    N          C CppTran PoomaII          C CppTran PoomaII          C CppTran PoomaII
   10       1.00    1.50    3.00       1.00    5.00   10.00          -       -       -
   31       1.00    1.94    2.65       1.00    2.41    3.14          -       -       -
  100       1.00    1.23    1.16       1.00    1.50    1.53       1.00    1.25    1.25
Doof2d
    N          C CppTran PoomaII          C CppTran PoomaII          C CppTran PoomaII
  100          -       -       -       1.00    0.00    1.00          -       -       -
  316       1.00    1.50    1.17       1.00    1.33    1.22       1.00    1.00    1.00
 1000       1.00    1.27    1.16       1.00    1.24    1.12       1.00    0.91    1.18
Solvers/Krylov/CGA
    N          C CppTran PoomaII          C CppTran PoomaII          C CppTran PoomaII
   10       1.00    1.00    2.00       1.00    2.00    3.00          -       -       -
   31       1.00    2.21    2.32       1.00    2.92    3.46          -       -       -
  100       1.00    1.53    1.42       1.00    1.63    1.92       1.00    1.19    1.44

Thanks, Jeffrey D.
Oldham oldham at codesourcery.com From scotth at proximation.com Fri Jul 27 14:57:06 2001 From: scotth at proximation.com (Scott Haney) Date: Fri, 27 Jul 2001 08:57:06 -0600 Subject: Some answers to Chevron.cc questions Message-ID: Here are some answers to Jeffrey's questions from Chevron.cc. /** QUESTIONS **/ // o. If several different fields are created using the same mesh // object, is the mesh object shared? We used to have a mesh abstraction, but I removed this while writing fieldEngine for reasons that escape me right now. :-) This should probably be restored, which would (1) enable the sensible sharing of meshes that you describe and (2) allow for a generic implementation of fieldEngine. // o. Can meshes be queried without going through an associated field? Once the mesh abstraction is restored, yes. // o. According to my understanding, the Chevron algorithm should be // imbedded inside a loop of some type that repeatedly updates the // coordinates. I don't think this is right. I thought this was an eulerian calculation. It is, after all, just a liquid flowing through dirt. What part of the paper // o. I omitted a separate coordinates field, presumably updated each // iteration, in favor of using the mesh. Since I do not know how // the coordinates are updated, I omitted updating the mesh. OK. // o. Is it important to flesh out the linear algebra solution? We // might learn something about field syntax, but it will also take // time for me to determine the correct operands. Yes, I think we should do this. This will be an interesting use of neighbor operations in scalar code. Recall that we're simply solving for the pressure gradient values that give continuity of the pressure at the face centers and continuity of the fluxes at the spokes. // o. The eight spoke-centered flux values are discontinuous, right? Not really. Once we correctly solve for the pressure gradients, the flux should be continuous. // o. 
Creating non-canonical edge and face centerings requires // dimension-dependent code. Is this acceptable? No, it isn't acceptable. The good news is that I don't believe it requires dimension dependent code. Figuring out how to do this will teach us something about facilities we need to provide. /** UNFINISHED WORK **/ // o ConstField = a Field with values that do not change We used to have a ConstField, but we don't any more. We used to have ConstArray as well. These were removed to simplify expressions. Putting these back isn't, I think, something we want to do since they caused more problems than what they solved. // o nearestNeighbors(inputCentering, outputCentering) // o replicate(field, std::vector) Nice! But is "replicate" the right word? // o meshLayout.unitCoordinateNormals() This isn't something a layout should do. Is this just the N coordinate normals, e.g., {(0,1),(1,0)}? If so, why a field? If not, what is this exactly? // o field.mesh() We should be able to do this. // o field.mesh().normals() OK. I'm not sure we will do this as a member function, as above, or as an external function normals(field.mesh()) // o field.mesh().normals().signedMagnitude() Is this just mesh.faceAreas() * dot(mesh.normals(), mesh.positiveNormals()) ? Anyway, this isn't a question for the normals, but we can certainly do this. // o sum(field, FieldOffsetList) Sweeetttt. :-) From stephens at proximation.com Fri Jul 27 16:55:42 2001 From: stephens at proximation.com (Stephen Smith) Date: Fri, 27 Jul 2001 10:55:42 -0600 Subject: patchLocal patch Message-ID: This patch fixes the behaviour of f.patchLocal() for new field. The existing version was providing the wrong domain because of the way domains are computed with different centerings. The fixed version gives a field with the physical domain corresponding to the cells owned by the patch and a total domain that includes the guard layers. You can use patchLocal to write into the internal guards if necessary. 
This fix was required both to wrap up work on particle interaction with new field, and get scalar code sections of Blanca's code running in parallel. Reviewed by Scott Haney. Tested with KCC and --messaging on multi-processor Linux. Stephen <<27.Jun.patchLocal.patch>> -------------- next part -------------- A non-text attachment was scrubbed... Name: 27.Jun.patchLocal.patch Type: application/octet-stream Size: 29306 bytes From oldham at codesourcery.com Tue Jul 31 22:28:16 2001 From: oldham at codesourcery.com (Jeffrey Oldham) Date: Tue, 31 Jul 2001 15:28:16 -0700 Subject: [pooma-dev] Some answers to Chevron.cc questions In-Reply-To: <200107271457.HAA14824@oz.codesourcery.com>; from scotth@proximation.com on Fri, Jul 27, 2001 at 08:57:06AM -0600 References: <200107271457.HAA14824@oz.codesourcery.com> Message-ID: <20010731152816.A15083@codesourcery.com> On Fri, Jul 27, 2001 at 08:57:06AM -0600, Scott Haney wrote: > > // o. According to my understanding, the Chevron algorithm should be > // imbedded inside a loop of some type that repeatedly updates the > // coordinates. > > I don't think this is right. I thought this was an eulerian calculation. > It is, after all, just a liquid flowing through dirt. What part of the > paper Is not a loop necessary for a finite difference method? > // o. I omitted a separate coordinates field, presumably updated each > // iteration, in favor of using the mesh. Since I do not know how > // the coordinates are updated, I omitted updating the mesh. > > OK. > > // o. Is it important to flesh out the linear algebra solution? We > // might learn something about field syntax, but it will also take > // time for me to determine the correct operands. > > Yes, I think we should do this. This will be an interesting use of > neighbor operations in scalar code.
Recall that we're simply solving for > the pressure gradient values that give continuity of the pressure at the > face centers and continuity of the fluxes at the spokes. OK. I will look into this. > // o. The eight spoke-centered flux values are discontinuous, right? > > Not really. Once we correctly solve for the pressure gradients, the flux > should be continuous. I do not understand this. By "flux value" I mean q in your algorithm explanation. These values are set using only values within a cell so how can they be shared by adjacent cells? > // o. Creating non-canonical edge and face centerings requires > // dimension-dependent code. Is this acceptable? > > No, it isn't acceptable. The good news is that I don't believe it > requires dimension dependent code. Figuring out how to do this will > teach us something about facilities we need to provide. I'm deferring this until later. > /** UNFINISHED WORK **/ > > // o replicate(field, std::vector) > > Nice! But is "replicate" the right word? This is an unnecessary function. If each output field value equals exactly one input field value, this function makes the copy without the overhead of a loop. Is the name OK? > // o meshLayout.unitCoordinateNormals() > > This isn't something a layout should do. Is this just the N coordinate > normals, e.g., {(0,1),(1,0)}? If so, why a field? If not, what is this > exactly? No, these are unit-length normals perpendicular to the mesh faces but pointing in positive directions. > // o field.mesh().normals().signedMagnitude() > > Is this just > > mesh.faceAreas() * dot(mesh.normals(), mesh.positiveNormals()) ? Yes. Thanks, Jeffrey D. Oldham oldham at codesourcery.com
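The identity agreed on above, signedMagnitude = faceAreas * dot(normals, positiveNormals), and the flux sum Q = \sum_f sign_f area_f \sum_{sf} q_{sf} from Chevron.cc can be illustrated with a small standalone C++ sketch. It is only an illustration under stated assumptions: Face, signedMagnitude, and totalFlux are hypothetical names, not POOMA classes or functions, the cell is 2D with axis-aligned faces, and the normals are taken to be unit-length outward normals.

```cpp
#include <array>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one face of a 2D cell; NOT a POOMA type.
struct Face {
  std::array<double, 2> outwardUnitNormal;   // unit normal pointing out of the cell
  std::array<double, 2> positiveUnitNormal;  // same axis, positive direction
  double area;                               // face length in 2D
  std::vector<double> spokeFluxes;           // q_{sf} for the subfaces of this face
};

// Plain 2D dot product.
double dot(const std::array<double, 2>& a, const std::array<double, 2>& b) {
  return a[0] * b[0] + a[1] * b[1];
}

// area * dot(normal, positiveNormal): +area if the outward normal points in a
// positive direction, -area otherwise (assumes axis-aligned unit normals).
double signedMagnitude(const Face& f) {
  return f.area * dot(f.outwardUnitNormal, f.positiveUnitNormal);
}

// Q = sum over faces of signedMagnitude(face) * (sum of that face's spoke fluxes).
double totalFlux(const std::vector<Face>& faces) {
  double q = 0.0;
  for (const Face& f : faces) {
    assert(!f.spokeFluxes.empty());  // each face carries at least one spoke value
    const double faceFlux =
        std::accumulate(f.spokeFluxes.begin(), f.spokeFluxes.end(), 0.0);
    q += signedMagnitude(f) * faceFlux;
  }
  return q;
}
```

For a left face with outward normal (-1, 0) and area 2, signedMagnitude gives -2; for the matching right face with outward normal (1, 0) it gives +2. The inner std::accumulate plays the role of sum(spokeFlux, nearestNeighbors(spoke, disFace)) and the outer loop the role of the sum over a cell's faces.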