From radek.pecher at eng.ox.ac.uk  Fri May 21 08:07:39 2004
From: radek.pecher at eng.ox.ac.uk (Radek Pecher)
Date: Fri, 21 May 2004 09:07:39 +0100
Subject: Temporary copies do appear...??
Message-ID: <200405210907.39846.radek.pecher@eng.ox.ac.uk>

Dear POOMA developers,

My name is Radek and I am a researcher at the University of Oxford, UK.
I have started integrating POOMA into our numerical model of liquid
crystals in 3D. I feel that it is a suitable tool for this challenging
problem where we have 10 unknowns at each node of the finite-element
mesh. If things go right, we will be happy to express our thanks to
POOMA in all our publications that will follow.

The reason why I am contacting you today is to inform you about a
possible POOMA problem that I have encountered while testing my POOMA/
PETE based implementation of an automatic-differentiation class which
otherwise works perfectly (I can share the code with you if you are
interested, by the way).

Please note that I already found a couple of minor POOMA bugs, such as:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
- although Tensor.h:338 claims:
// The format is: ((t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... ))
  the truth is in fact:
// The format is: ((t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... ))
- this is contrary to TinyMatrix because of the i,j-swapping
  (compare: Tensor.h:361 and TinyMatrix.h:236)
====================================================================
- line /src/Tiny/VectorOperators.h:189
inline typename BinaryReturn< Vector, T2, TAG >::Type_t
  should correctly be:
inline typename BinaryReturn< T1, Vector, TAG >::Type_t
- this error may cause problems if T1 and T2 are different types and
  when stricter type-conversions are imposed
====================================================================
- line /src/DynamicArray/DynamicArray.h:373
: Array(s1, model)
  should correctly be:
: Array<1, T, EngineTag>(s1, model)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
but the problem that I would like to describe in the rest of this email
seems more serious than that.

Basically, simple algebraic expressions based on the tiny Vector class
do create temporary Full-engine copies of individual subexpressions,
as opposed to what POOMA claims to prevent. The following short main
code:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
#include "Pooma/Arrays.h"

int main(int argc, char* argv[])
{
  Pooma::initialize(argc, argv);

  Vector<2> v1(1, 2), v2;
  v2 = v1*v1 + v1*v1;

  Pooma::finalize();
  return 0;
}
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
was tested by modifying the file /src/Tiny/Vector.h by adding the
following line:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
PrintTypeName(this); PrintTypeName(x); std::cout << std::endl;
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
to the Vector(const X& x)-constructor on line 117 and the
VectorEngine(const X& x)-constructor on line 290. The diagnostic
function PrintTypeName is defined as:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
#include <cstdlib>   // system
#include <sstream>   // std::ostringstream
#include <typeinfo>  // typeid

// Prints the demangled name of t's type by piping it through c++filt.
template <class T>
inline void PrintTypeName(const T& t)
{
  std::ostringstream out;
  out << "c++filt " << typeid(t).name();
  system(out.str().c_str());
}
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
where the GNU tool c++filt is used to demangle the type names.

The following optimising g++ (v. 3.3.1) command has been used to build
the executable under SuSE Linux 9.0 (i586):
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
g++ -ftemplate-depth-60 -Drestrict=__restrict__ -fno-exceptions
-DNOPAssert -DNOCTAssert -O2 -fno-default-inline -funroll-loops
-fstrict-aliasing -o Main Main.cpp -I$HOME/lib/Optim/POOMA/linux/lib/
PoomaConfiguration-gcc -I$HOME/lib/Optim/POOMA/linux/src -I$HOME/lib/
Optim/POOMA/linux/lib -fno-exceptions -L$HOME/lib/Optim/POOMA/linux/
lib -lpooma-gcc -lm
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

The code execution output is listed in the following box:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
VectorEngine<2, double, Full>*
Vector<2, double, BinaryVectorOp<Vector<2, double, Full>, Vector<2, double, Full>, OpMultiply> >
Vector<2, double, Full>*
Vector<2, double, BinaryVectorOp<Vector<2, double, Full>, Vector<2, double, Full>, OpMultiply> >
VectorEngine<2, double, Full>*
Vector<2, double, BinaryVectorOp<Vector<2, double, Full>, Vector<2, double, Full>, OpMultiply> >
Vector<2, double, Full>*
Vector<2, double, BinaryVectorOp<Vector<2, double, Full>, Vector<2, double, Full>, OpMultiply> >
VectorEngine<2, double, Full>*
Vector<2, double, BinaryVectorOp<Vector<2, double, Full>, Vector<2, double, Full>, OpAdd> >
Vector<2, double, Full>*
Vector<2, double, BinaryVectorOp<Vector<2, double, Full>, Vector<2, double, Full>, OpAdd> >
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Clearly, every operation in the expression v1*v1 + v1*v1 invokes
a Full-engine copy of the BinaryVectorOp-engine subexpression result.

Is this behaviour correct or am I doing something wrong, please?
Do I need any better-optimising compiler (I have already ordered
Intel's latest ICC, the successor of KAI) or any other command-line flags?
If there is any way to prevent this waste of resources, I would very
much appreciate your kind help.

Sincerely,

Radek
__________________________________
Dr. Radek Pecher
Research Assistant
Department of Engineering Science
University of Oxford
Parks Road, Oxford, OX1 3PJ, UK
Tel: +44 (0)1865 273044
Fax: +44 (0)1865 273905
radek.pecher at eng.ox.ac.uk

From rguenth at tat.physik.uni-tuebingen.de  Fri May 21 08:31:40 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 21 May 2004 10:31:40 +0200
Subject: [pooma-dev] Temporary copies do appear...??
In-Reply-To: <200405210907.39846.radek.pecher@eng.ox.ac.uk>
References: <200405210907.39846.radek.pecher@eng.ox.ac.uk>
Message-ID: <40ADBE6C.4010301@tat.physik.uni-tuebingen.de>

Radek Pecher wrote:

> Basically, simple algebraic expressions based on the tiny Vector class
> do create temporary Full-engine copies of individual subexpressions,
> as opposed to what POOMA claims to prevent. The following short main
> code:
>
> #include "Pooma/Arrays.h"
>
> int main(int argc, char* argv[])
> {
>   Pooma::initialize(argc, argv);
>
>   Vector<2> v1(1, 2), v2;
>   v2 = v1*v1 + v1*v1;
>
>   Pooma::finalize();
>   return 0;
> }

You are right that gcc 3.3 does not optimize away the copy calls. But
compiling the above with g++-3.4 -O2 -fpeel-loops results in straight
line code. With the Intel 8.0 compiler the asm code is a bit obfuscated
and there are calls to destructors left (not inlining these seems to be
a common problem of the Intel compiler).

I don't know whether one can structurally avoid the extra constructor
calls inside the Vector code, but maybe you can have a look at it? This
is certainly a point where optimization will be useful (if not for
compilation speed).
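
On the question of avoiding the extra constructor calls structurally:
one common technique is to let the operators return lightweight
expression nodes that only hold references to their operands, and to
evaluate the whole tree once when it is assigned to storage. What
follows is only a minimal sketch of that idea under those assumptions,
not POOMA's actual Vector or VectorEngine code; Expr, BinaryNode,
MulOp, AddOp, Vec2 and count_evaluations are hypothetical names made up
for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; this is not POOMA's Vector/VectorEngine.

// CRTP base so the operators below only match our own expression types.
template <class E>
struct Expr {
  const E& self() const { return static_cast<const E&>(*this); }
};

// Node for an element-wise binary operation. It stores references only,
// so building a node copies no vector data.
template <class L, class R, class Op>
struct BinaryNode : Expr<BinaryNode<L, R, Op> > {
  const L& lhs;
  const R& rhs;
  BinaryNode(const L& l, const R& r) : lhs(l), rhs(r) {}
  double operator[](std::size_t i) const { return Op::apply(lhs[i], rhs[i]); }
};

struct MulOp { static double apply(double a, double b) { return a * b; } };
struct AddOp { static double apply(double a, double b) { return a + b; } };

// Concrete storage, playing the role of a "Full" vector of length 2.
struct Vec2 : Expr<Vec2> {
  double data[2];
  static int evaluations;  // counts expression-to-storage evaluations

  Vec2(double a = 0.0, double b = 0.0) { data[0] = a; data[1] = b; }

  // Evaluate an arbitrary expression tree straight into this object.
  template <class E>
  Vec2& operator=(const Expr<E>& e) {
    const E& expr = e.self();
    for (std::size_t i = 0; i < 2; ++i) data[i] = expr[i];
    ++evaluations;
    return *this;
  }

  double operator[](std::size_t i) const { return data[i]; }
};
int Vec2::evaluations = 0;

// The operators build nodes; they do not compute or copy any elements.
template <class L, class R>
BinaryNode<L, R, MulOp> operator*(const Expr<L>& l, const Expr<R>& r) {
  return BinaryNode<L, R, MulOp>(l.self(), r.self());
}

template <class L, class R>
BinaryNode<L, R, AddOp> operator+(const Expr<L>& l, const Expr<R>& r) {
  return BinaryNode<L, R, AddOp>(l.self(), r.self());
}

// Evaluates v1*v1 + v1*v1 into 'out' and returns how many times an
// expression was evaluated into storage while doing so.
int count_evaluations(Vec2& out) {
  Vec2::evaluations = 0;
  Vec2 v1(1.0, 2.0);
  out = v1 * v1 + v1 * v1;  // nodes live until the end of this statement
  return Vec2::evaluations;
}
```

In this sketch the expression v1*v1 + v1*v1 builds three small nodes
and the result elements are written in a single pass; whether a given
compiler then inlines everything into straight-line code is a separate
question.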
> g++ -ftemplate-depth-60 -Drestrict=__restrict__ -fno-exceptions
> -DNOPAssert -DNOCTAssert -O2 -fno-default-inline -funroll-loops
> -fstrict-aliasing -o Main Main.cpp -I$HOME/lib/Optim/POOMA/linux/lib/
> PoomaConfiguration-gcc -I$HOME/lib/Optim/POOMA/linux/src -I$HOME/lib/
> Optim/POOMA/linux/lib -fno-exceptions -L$HOME/lib/Optim/POOMA/linux/
> lib -lpooma-gcc -lm
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Also, if you are using gcc, you may consider applying the leafify patch
to your gcc distribution, available at
http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/
and making the POOMA evaluators use it (I can provide a patch to you).
That's worth about a 50% performance increase.

Hope that helps,
Richard.

From rguenth at tat.physik.uni-tuebingen.de  Mon May 24 09:12:33 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 24 May 2004 11:12:33 +0200 (CEST)
Subject: [PATCH] Re: [pooma-dev] Temporary copies do appear...??
In-Reply-To: <200405210907.39846.radek.pecher@eng.ox.ac.uk>
References: <200405210907.39846.radek.pecher@eng.ox.ac.uk>
Message-ID:

On Fri, 21 May 2004, Radek Pecher wrote:

> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> - although Tensor.h:338 claims:
> // The format is: ((t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... ))
>   the truth is in fact:
> // The format is: ((t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... ))
> - this is contrary to TinyMatrix because of the i,j-swapping
>   (compare: Tensor.h:361 and TinyMatrix.h:236)

That is indeed inconsistent(?). I don't know what to do on this one, but
it seems purely cosmetic. I'd suggest fixing the comments and not
swapping the indices in TinyMatrix.h:239. Jeffrey?
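
To make the two orderings concrete, here is a small stand-alone sketch
that prints a 2x2 table both ways. It is not POOMA's Tensor or
TinyMatrix print code; format_matrix and demo are hypothetical helpers,
and the exact bracket and spacing layout is an assumption chosen only
to resemble the comments quoted above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper, not POOMA's print(). With row_major == true the
// second index varies fastest: ((t(0,0) t(0,1)) (t(1,0) t(1,1))).
// With row_major == false the first index varies fastest instead:
// ((t(0,0) t(1,0)) (t(0,1) t(1,1))).
template <int N>
std::string format_matrix(const double (&t)[N][N], bool row_major) {
  std::ostringstream out;
  out << "(";
  for (int outer = 0; outer < N; ++outer) {
    out << (outer ? " (" : "(");
    for (int inner = 0; inner < N; ++inner) {
      const double value = row_major ? t[outer][inner] : t[inner][outer];
      out << (inner ? " " : "") << value;
    }
    out << ")";
  }
  out << ")";
  return out.str();
}

// Uses t(i,j) = 10*i + j so that each printed number reveals its indices.
std::string demo(bool row_major) {
  const double t[2][2] = {{0.0, 1.0}, {10.0, 11.0}};
  return format_matrix(t, row_major);
}
```

With these values demo(true) gives "((0 1) (10 11))" and demo(false)
gives "((0 10) (1 11))", which is the kind of i,j versus j,i difference
reported between the two print routines.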
> ====================================================================
> - line /src/Tiny/VectorOperators.h:189
> inline typename BinaryReturn< Vector, T2, TAG >::Type_t
>   should correctly be:
> inline typename BinaryReturn< T1, Vector, TAG >::Type_t
> - this error may cause problems if T1 and T2 are different types and
>   when stricter type-conversions are imposed

Indeed. And the same error in TensorOperators.h and
TinyMatrixOperators.h.

> ====================================================================
> - line /src/DynamicArray/DynamicArray.h:373
> : Array(s1, model)
>   should correctly be:
> : Array<1, T, EngineTag>(s1, model)
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

This is fixed in CVS already.

Jeffrey, ok to apply the following patch? Compiled and tested Tiny/ on
ia32-linux with gcc 3.4.

Richard.


2004May24  Richard Guenther

        * src/Tiny/VectorOperators.h: use correct return type.
        src/Tiny/TensorOperators.h: likewise.
        src/Tiny/TinyMatrixOperators.h: likewise.

Index: Tiny/TensorOperators.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Tiny/TensorOperators.h,v
retrieving revision 1.20
diff -u -u -r1.20 TensorOperators.h
--- Tiny/TensorOperators.h 7 Mar 2000 13:18:13 -0000 1.20
+++ Tiny/TensorOperators.h 24 May 2004 08:44:03 -0000
@@ -210,7 +210,7 @@
 } \
 \
 template \
-inline typename BinaryReturn< Tensor, T2, TAG >::Type_t \
+inline typename BinaryReturn< T1, Tensor, TAG >::Type_t \
 FUNC( const T1& x, const Tensor& v2 ) \
 { \
   typedef Tensor V2; \
Index: Tiny/TinyMatrixOperators.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Tiny/TinyMatrixOperators.h,v
retrieving revision 1.3
diff -u -u -r1.3 TinyMatrixOperators.h
--- Tiny/TinyMatrixOperators.h 7 Mar 2000 13:18:14 -0000 1.3
+++ Tiny/TinyMatrixOperators.h 24 May 2004 08:44:03 -0000
@@ -196,7 +196,7 @@
 } \
 \
 template \
-inline typename BinaryReturn< TinyMatrix, T2, TAG >::Type_t \
+inline typename BinaryReturn< T1, TinyMatrix, TAG >::Type_t \
 FUNC( const T1& x, const TinyMatrix& v2 ) \
 { \
   typedef TinyMatrix V2; \
Index: Tiny/VectorOperators.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Tiny/VectorOperators.h,v
retrieving revision 1.17
diff -u -u -r1.17 VectorOperators.h
--- Tiny/VectorOperators.h 5 Mar 2002 16:14:38 -0000 1.17
+++ Tiny/VectorOperators.h 24 May 2004 08:44:03 -0000
@@ -186,7 +186,7 @@
 } \
 \
 template \
-inline typename BinaryReturn< Vector, T2, TAG >::Type_t \
+inline typename BinaryReturn< T1, Vector, TAG >::Type_t \
 FUNC( const T1& x, const Vector& v2 ) \
 { \
   typedef Vector V2; \

From oldham at codesourcery.com  Mon May 24 14:58:43 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Mon, 24 May 2004 07:58:43 -0700
Subject: [PATCH] Re: [pooma-dev] Temporary copies do appear...??
In-Reply-To:
References: <200405210907.39846.radek.pecher@eng.ox.ac.uk>
Message-ID: <40B20DA3.90303@codesourcery.com>

Richard Guenther wrote:

>On Fri, 21 May 2004, Radek Pecher wrote:
>
>>- although Tensor.h:338 claims:
>>// The format is: ((t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... ))
>>  the truth is in fact:
>>// The format is: ((t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... ))
>>- this is contrary to TinyMatrix because of the i,j-swapping
>>  (compare: Tensor.h:361 and TinyMatrix.h:236)
>>
>
>That is indeed inconsistent(?). I don't know what to do on this one, but
>it seems purely cosmetic. I'd suggest fixing the comments and not swap
>the indices in TinyMatrix.h:239. Jeffrey?
>

I agree that consistency is important. Which particular ordering to
choose is not important to me. Arrays print out in the actual order
(0,0), (0,1), etc. so I think tensors should also. The commented tensor
ordering probably follows from the comments concerning the three types
of tensors and wishing to print those values.

>>====================================================================
>>- line /src/Tiny/VectorOperators.h:189
>>inline typename BinaryReturn< Vector, T2, TAG >::Type_t
>>  should correctly be:
>>inline typename BinaryReturn< T1, Vector, TAG >::Type_t
>>- this error may cause problems if T1 and T2 are different types and
>>  when stricter type-conversions are imposed
>>
>
>Indeed. And the same error in TensorOperators.h and
>TinyMatrixOperators.h.
>
>>====================================================================
>>- line /src/DynamicArray/DynamicArray.h:373
>>: Array(s1, model)
>>  should correctly be:
>>: Array<1, T, EngineTag>(s1, model)
>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>
>
>This is fixed in CVS already.
>
>
>Jeffrey, ok to apply the following patch? Compiled and tested Tiny/ on
>ia32-linux with gcc 3.4.
>

Yes, these are good error fixes. Please commit them.

>Richard.
>
>
>2004May24  Richard Guenther
>
>        * src/Tiny/VectorOperators.h: use correct return type.
>        src/Tiny/TensorOperators.h: likewise.
>        src/Tiny/TinyMatrixOperators.h: likewise.
>
>
>Index: Tiny/TensorOperators.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Tiny/TensorOperators.h,v
>retrieving revision 1.20
>diff -u -u -r1.20 TensorOperators.h
>--- Tiny/TensorOperators.h 7 Mar 2000 13:18:13 -0000 1.20
>+++ Tiny/TensorOperators.h 24 May 2004 08:44:03 -0000
>@@ -210,7 +210,7 @@
> } \
> \
> template \
>-inline typename BinaryReturn< Tensor, T2, TAG >::Type_t \
>+inline typename BinaryReturn< T1, Tensor, TAG >::Type_t \
> FUNC( const T1& x, const Tensor& v2 ) \
> { \
>   typedef Tensor V2; \
>Index: Tiny/TinyMatrixOperators.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Tiny/TinyMatrixOperators.h,v
>retrieving revision 1.3
>diff -u -u -r1.3 TinyMatrixOperators.h
>--- Tiny/TinyMatrixOperators.h 7 Mar 2000 13:18:14 -0000 1.3
>+++ Tiny/TinyMatrixOperators.h 24 May 2004 08:44:03 -0000
>@@ -196,7 +196,7 @@
> } \
> \
> template \
>-inline typename BinaryReturn< TinyMatrix, T2, TAG >::Type_t \
>+inline typename BinaryReturn< T1, TinyMatrix, TAG >::Type_t \
> FUNC( const T1& x, const TinyMatrix& v2 ) \
> { \
>   typedef TinyMatrix V2; \
>Index: Tiny/VectorOperators.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Tiny/VectorOperators.h,v
>retrieving revision 1.17
>diff -u -u -r1.17 VectorOperators.h
>--- Tiny/VectorOperators.h 5 Mar 2002 16:14:38 -0000 1.17
>+++ Tiny/VectorOperators.h 24 May 2004 08:44:03 -0000
>@@ -186,7 +186,7 @@
> } \
> \
> template \
>-inline typename BinaryReturn< Vector, T2, TAG >::Type_t \
>+inline typename BinaryReturn< T1, Vector, TAG >::Type_t \
> FUNC( const T1& x, const Vector& v2 ) \
> { \
>   typedef Vector V2; \
>

-- 
Jeffrey D. Oldham
oldham at codesourcery.com

From ron_hylton at hotmail.com  Mon May 24 14:59:26 2004
From: ron_hylton at hotmail.com (ron hylton)
Date: Mon, 24 May 2004 10:59:26 -0400
Subject: [pooma-dev] small bugs
Message-ID:

Here are some small bugs that perhaps should be fixed in CVS.

in Array.h

Array::physicalDomain() should return an object rather than a reference
because view engines return a temporary layout and the current behavior is
sometimes returning a reference to the interior of a temporary. Fix:

  inline const Domain_t physicalDomain() const

in IndexFunctionEngine.h

Engine::setDomain() should set firsts_m to be consistent with the
constructors. Fix:

  void setDomain(const Domain_t &dom)
  {
    domain_m = dom;
    for (int d = 0; d < Dim; ++d)
      firsts_m[d] = domain_m[d].first();
  }

I think IndexFunctionEngine also should have an Engine::layout() member to
be consistent with other Engines. The simplest possibility is:

  inline Layout_t layout() const { return Layout_t(domain_m); }

in ForwardingEngine.h

in struct NewEngine
  typedef Engine > Type_t;
should be
  typedef Engine > Type_t;

Ron Hylton

From bmclean at finlaylabs.net  Tue May 25 08:54:52 2004
From: bmclean at finlaylabs.net (Ben McLean (finlaylabs))
Date: Tue, 25 May 2004 16:54:52 +0800
Subject: POOMA for windows on Dev-C++, or a binary???
Message-ID: <007301c44235$f637f9e0$d9fd96d3@ravenouswolf>

Hi all,

has anyone had success compiling POOMA on Windows using Dev-C++?

I like open source and, being a student, have not purchased MS Visual Studio, for which I believe the POOMA compile goes a bit smoother. Alternatively, is there a precompiled Windows binary available that I can test out the package with? That would probably be easiest.

It is for geophysical research. I want to test a few numerical packages, especially POOMA, before settling on one, but I would rather not spend forever trying to determine how to modify POOMA to compile for Windows on Dev-C++ (even if I had the skills to do so... which I don't).

Thanks,
Ben McLean.

From rguenth at tat.physik.uni-tuebingen.de  Tue May 25 08:58:50 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 25 May 2004 10:58:50 +0200 (CEST)
Subject: [pooma-dev] POOMA for windows on Dev-C++, or a binary???
In-Reply-To: <007301c44235$f637f9e0$d9fd96d3@ravenouswolf>
References: <007301c44235$f637f9e0$d9fd96d3@ravenouswolf>
Message-ID:

On Tue, 25 May 2004, Ben McLean (finlaylabs) wrote:

> Hi all,
>
> has anyone had success compiling POOMA on Windows using Dev-C++?

What is Dev-C++? What are the errors you get? Probably either the
compiler doesn't follow the ISO C++ standard, or it does too much and you
need to use POOMA CVS.

> I like open source and, being a student, have not purchased MS Visual Studio, for which I believe the POOMA compile goes a bit smoother. Alternatively, is there a precompiled Windows binary available that I can test out the package with? That would probably be easiest.
>
> It is for geophysical research. I want to test a few numerical packages, especially POOMA, before settling on one, but I would rather not spend forever trying to determine how to modify POOMA to compile for Windows on Dev-C++ (even if I had the skills to do so... which I don't).

A precompiled binary won't help you very much for a template library. But
you can try using a recent Cygwin or MinGW to compile POOMA. Not that I
have used Windows for any numerical stuff yet...

Richard.

-- 
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de  Tue May 25 09:40:26 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 25 May 2004 11:40:26 +0200 (CEST)
Subject: [pooma-dev] small bugs
In-Reply-To:
References:
Message-ID:

On Mon, 24 May 2004, ron hylton wrote:

> Here are some small bugs that perhaps should be fixed in CVS.
>
> in Array.h
>
> Array::physicalDomain() should return an object rather than a reference
> because view engines return a temporary layout and the current behavior is
> sometimes returning a reference to the interior of a temporary. Fix:
>
>   inline const Domain_t physicalDomain() const

Indeed.

> in IndexFunctionEngine.h
>
> Engine::setDomain() should set firsts_m to be consistent with the
> constructors. Fix:
>
>   void setDomain(const Domain_t &dom)
>   {
>     domain_m = dom;
>     for (int d = 0; d < Dim; ++d)
>       firsts_m[d] = domain_m[d].first();
>   }

Hm, I think we should rather drop the firsts_m member and change
first(int i) to return domain_m[i].first() instead. firsts_m isn't used
otherwise.

> I think IndexFunctionEngine also should have an Engine::layout() member to
> be consistent with other Engines. The simplest possibility is:
>
>   inline Layout_t layout() const { return Layout_t(domain_m); }

Yes.

> in ForwardingEngine.h
>
> in struct NewEngine
>   typedef Engine > Type_t;
> should be
>   typedef Engine Components> > Type_t;

Yes.

Compiled ok, not tested (but all those look obvious). Ok?

Richard.


2004May25  Richard Guenther

        From Ron Hylton

        * src/Array/Array.h: don't possibly return reference to
        temporary in physicalDomain().
        src/Engine/IndexFunctionEngine.h: remove firsts_m member,
        add layout() accessor.
        src/Engine/ForwardingEngine.h: use NewEngine_t::dimensions
        for Type_t in NewEngine traits.

Index: Array/Array.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Array/Array.h,v
retrieving revision 1.150
diff -u -u -r1.150 Array.h
--- Array/Array.h 2 Mar 2004 18:18:45 -0000 1.150
+++ Array/Array.h 25 May 2004 09:35:50 -0000
@@ -1796,7 +1796,7 @@
 
   /// Returns the physical domain, i.e. the domain without external guards.
 
-  inline const Domain_t& physicalDomain() const
+  inline Domain_t physicalDomain() const
   {
     return engine_m.layout().innerDomain();
   }
Index: Engine/IndexFunctionEngine.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Engine/IndexFunctionEngine.h,v
retrieving revision 1.26
diff -u -u -r1.26 IndexFunctionEngine.h
--- Engine/IndexFunctionEngine.h 22 Oct 2003 19:38:07 -0000 1.26
+++ Engine/IndexFunctionEngine.h 25 May 2004 09:35:51 -0000
@@ -124,16 +124,12 @@
   explicit Engine(const Domain_t &domain, const Functor &f = Functor())
     : funct_m(f), domain_m(domain)
   {
-    for (int d = 0; d < Dim; ++d)
-      firsts_m[d] = domain[d].first();
   }
 
   template
   explicit Engine(const Layout &layout, const Functor &f = Functor())
     : funct_m(f), domain_m(layout.domain())
   {
-    for (int d = 0; d < Dim; ++d)
-      firsts_m[d] = domain_m[d].first();
   }
 
   //---------------------------------------------------------------------------
@@ -142,8 +138,6 @@
   Engine(const This_t &model)
     : funct_m(model.functor()), domain_m(model.domain())
   {
-    for (int d = 0; d < Dim; ++d)
-      firsts_m[d] = model.firsts_m[d];
   }
 
   //---------------------------------------------------------------------------
@@ -153,8 +147,6 @@
   {
     domain_m = rhs.domain();
     funct_m = rhs.functor();
-    for (int d = 0; d < Dim; ++d)
-      firsts_m[d] = rhs.firsts_m[d];
 
     return *this;
   }
@@ -240,7 +232,15 @@
   inline int first(int i) const
   {
     PAssert(i >= 0 && i < Dim);
-    return firsts_m[i];
+    return domain_m[i].first();
+  }
+
+  //---------------------------------------------------------------------------
+  /// Returns the layout, which is constructed as a DomainLayout.
+
+  Layout_t layout() const
+  {
+    return Layout_t(domain_m);
   }
 
   //---------------------------------------------------------------------------
@@ -253,7 +253,6 @@
 
   Functor funct_m;
   Domain_t domain_m;
-  int firsts_m[Dim];
 };
 
 
Index: Engine/ForwardingEngine.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Engine/ForwardingEngine.h,v
retrieving revision 1.48
diff -u -u -r1.48 ForwardingEngine.h
--- Engine/ForwardingEngine.h 22 Oct 2003 19:38:07 -0000 1.48
+++ Engine/ForwardingEngine.h 25 May 2004 09:35:51 -0000
@@ -317,7 +317,7 @@
 struct NewEngine >, Domain>
 {
   typedef typename NewEngine::Type_t NewEngine_t;
-  typedef Engine > Type_t;
+  typedef Engine > Type_t;
 };
 
 /**

From bmclean at finlaylabs.net  Tue May 25 09:45:50 2004
From: bmclean at finlaylabs.net (Ben McLean (finlaylabs))
Date: Tue, 25 May 2004 17:45:50 +0800
Subject: [pooma-dev] POOMA for windows on Dev-C++, or a binary???
References: <007301c44235$f637f9e0$d9fd96d3@ravenouswolf>
Message-ID: <008b01c4423d$162fb4c0$d9fd96d3@ravenouswolf>

Dev-C++ is a popular GPL'd C++ IDE, http://www.bloodshed.net/devcpp.html , and I have found it very functional. It would be very useful to have a setup for POOMA such as is available for Visual C++, so that an example compiled by Dev-C++ triggered (error-free) compilation of POOMA.

For an example of the type of errors I get, when compiling the CosTimes.cpp example I get many undefined reference errors, starting with:

[Linker error] undefined reference to `Pooma::initialize(int&, char**&, bool, bool, bool)'
[Linker error] undefined reference to `Pooma::BrickBase<2>::BrickBase(Interval<2> const&, bool)'
[Linker error] undefined reference to `Pooma::blockAndEvaluate()'
etc etc...

I am guessing this is a makefile problem, presently Dev-C++ is auto-generating its own..., maybe it is a simple matter to set up POOMA for compilation by Dev-C++, and open it to an even wider audience.

Thanks,
Ben.

> What is Dev-C++? What are the errors you get? Probably either the
> compiler doesn't follow the ISO C++ standard, or it does too much and you
> need to use POOMA CVS.
>
>
> A precompiled binary won't help you very much for a template library. But
> you can try using a recent Cygwin or Mingw to compile POOMA. Not that I
> have used Windows for any numerical stuff yet...
>
> Richard.
>
> --
> Richard Guenther
> WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
>

From h.belitz at fz-juelich.de  Tue May 25 10:00:03 2004
From: h.belitz at fz-juelich.de (Hendrik Belitz)
Date: Tue, 25 May 2004 12:00:03 +0200
Subject: [pooma-dev] POOMA for windows on Dev-C++, or a binary???
In-Reply-To: <008b01c4423d$162fb4c0$d9fd96d3@ravenouswolf>
References: <007301c44235$f637f9e0$d9fd96d3@ravenouswolf> <008b01c4423d$162fb4c0$d9fd96d3@ravenouswolf>
Message-ID: <200405251200.03044.h.belitz@fz-juelich.de>

Since Dev-C++ uses GCC, this should not be a problem. Why don't you just
compile POOMA with the command-line compiler you received together with
Dev-C++? Compiling complete packages with an IDE like Dev-C++ is generally
not a good idea. Using the usual configure/make/make install makes life a
lot easier...

From oldham at codesourcery.com  Tue May 25 15:03:18 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Tue, 25 May 2004 08:03:18 -0700
Subject: [pooma-dev] small bugs
In-Reply-To:
References:
Message-ID: <40B36036.4040707@codesourcery.com>

Richard Guenther wrote:

>On Mon, 24 May 2004, ron hylton wrote:
>
>>Here are some small bugs that perhaps should be fixed in CVS.
>>
>>in Array.h
>>
>>Array::physicalDomain() should return an object rather than a reference
>>because view engines return a temporary layout and the current behavior is
>>sometimes returning a reference to the interior of a temporary. Fix:
>>
>>  inline const Domain_t physicalDomain() const
>>
>
>Indeed.
>
>>in IndexFunctionEngine.h
>>
>>Engine::setDomain() should set firsts_m to be consistent with the
>>constructors. Fix:
>>
>>  void setDomain(const Domain_t &dom)
>>  {
>>    domain_m = dom;
>>    for (int d = 0; d < Dim; ++d)
>>      firsts_m[d] = domain_m[d].first();
>>  }
>>
>
>Hm, I think we should rather drop the firsts_m member and change
>first(int i) to return domain_m[i].first() instead. firsts_m isn't used
>otherwise.
>
>>I think IndexFunctionEngine also should have an Engine::layout() member to
>>be consistent with other Engines. The simplest possibility is:
>>
>>  inline Layout_t layout() const { return Layout_t(domain_m); }
>>
>
>Yes.
>
>>in ForwardingEngine.h
>>
>>in struct NewEngine
>>  typedef Engine > Type_t;
>>should be
>>  typedef Engine
>Components> > Type_t;
>>
>
>Yes.
>
>Compiled ok, not tested (but all those look obvious). Ok?
>

It's great to have these improvements, but let's compile the test cases
before committing. Doing so does not take too long and will help
isolate the source of any such problems. After that, please commit
these changes.

>Richard.
>
>
>2004May25  Richard Guenther
>
>        From Ron Hylton
>
>        * src/Array/Array.h: don't possibly return reference to
>        temporary in physicalDomain().
>        src/Engine/IndexFunctionEngine.h: remove firsts_m member,
>        add layout() accessor.
>        src/Engine/ForwardingEngine.h: use NewEngine_t::dimensions
>        for Type_t in NewEngine traits.
>
>
>Index: Array/Array.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Array/Array.h,v
>retrieving revision 1.150
>diff -u -u -r1.150 Array.h
>--- Array/Array.h 2 Mar 2004 18:18:45 -0000 1.150
>+++ Array/Array.h 25 May 2004 09:35:50 -0000
>@@ -1796,7 +1796,7 @@
>
>   /// Returns the physical domain, i.e. the domain without external guards.
>
>-  inline const Domain_t& physicalDomain() const
>+  inline Domain_t physicalDomain() const
>   {
>     return engine_m.layout().innerDomain();
>   }
>Index: Engine/IndexFunctionEngine.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Engine/IndexFunctionEngine.h,v
>retrieving revision 1.26
>diff -u -u -r1.26 IndexFunctionEngine.h
>--- Engine/IndexFunctionEngine.h 22 Oct 2003 19:38:07 -0000 1.26
>+++ Engine/IndexFunctionEngine.h 25 May 2004 09:35:51 -0000
>@@ -124,16 +124,12 @@
>   explicit Engine(const Domain_t &domain, const Functor &f = Functor())
>     : funct_m(f), domain_m(domain)
>   {
>-    for (int d = 0; d < Dim; ++d)
>-      firsts_m[d] = domain[d].first();
>   }
>
>   template
>   explicit Engine(const Layout &layout, const Functor &f = Functor())
>     : funct_m(f), domain_m(layout.domain())
>   {
>-    for (int d = 0; d < Dim; ++d)
>-      firsts_m[d] = domain_m[d].first();
>   }
>
>   //---------------------------------------------------------------------------
>@@ -142,8 +138,6 @@
>   Engine(const This_t &model)
>     : funct_m(model.functor()), domain_m(model.domain())
>   {
>-    for (int d = 0; d < Dim; ++d)
>-      firsts_m[d] = model.firsts_m[d];
>   }
>
>   //---------------------------------------------------------------------------
>@@ -153,8 +147,6 @@
>   {
>     domain_m = rhs.domain();
>     funct_m = rhs.functor();
>-    for (int d = 0; d < Dim; ++d)
>-      firsts_m[d] = rhs.firsts_m[d];
>
>     return *this;
>   }
>@@ -240,7 +232,15 @@
>   inline int first(int i) const
>   {
>     PAssert(i >= 0 && i < Dim);
>-    return firsts_m[i];
>+    return domain_m[i].first();
>+  }
>+
>+  //---------------------------------------------------------------------------
>+  /// Returns the layout, which is constructed as a DomainLayout.
>+
>+  Layout_t layout() const
>+  {
>+    return Layout_t(domain_m);
>   }
>
>   //---------------------------------------------------------------------------
>@@ -253,7 +253,6 @@
>
>   Functor funct_m;
>   Domain_t domain_m;
>-  int firsts_m[Dim];
> };
>
>
>Index: Engine/ForwardingEngine.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Engine/ForwardingEngine.h,v
>retrieving revision 1.48
>diff -u -u -r1.48 ForwardingEngine.h
>--- Engine/ForwardingEngine.h 22 Oct 2003 19:38:07 -0000 1.48
>+++ Engine/ForwardingEngine.h 25 May 2004 09:35:51 -0000
>@@ -317,7 +317,7 @@
> struct NewEngine >, Domain>
> {
>   typedef typename NewEngine::Type_t NewEngine_t;
>-  typedef Engine > Type_t;
>+  typedef Engine > Type_t;
> };
>
> /**
>

-- 
Jeffrey D. Oldham
oldham at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de  Tue May 25 18:14:19 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 25 May 2004 20:14:19 +0200
Subject: [PATCH] Fix TinyMatrix print inconsistency
In-Reply-To: <40B20DA3.90303@codesourcery.com>
References: <200405210907.39846.radek.pecher@eng.ox.ac.uk> <40B20DA3.90303@codesourcery.com>
Message-ID: <40B38CFB.5060301@tat.physik.uni-tuebingen.de>

Jeffrey D. Oldham wrote:
> Richard Guenther wrote:
>
>> On Fri, 21 May 2004, Radek Pecher wrote:
>>
>>> - although Tensor.h:338 claims:
>>> // The format is: ((t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... ))
>>>   the truth is in fact:
>>> // The format is: ((t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... ))
>>> - this is contrary to TinyMatrix because of the i,j-swapping
>>>   (compare: Tensor.h:361 and TinyMatrix.h:236)
>>>
>>
>> That is indeed inconsistent(?). I don't know what to do on this one, but
>> it seems purely cosmetic. I'd suggest fixing the comments and not swap
>> the indices in TinyMatrix.h:239. Jeffrey?
>>
>
> I agree that consistency is important. Which particular ordering to
> choose is not important to me. Arrays print out in the actual order
> (0,0), (0,1), etc. so I think tensors should also. The commented tensor
> ordering probably follows from the comments concerning the three types
> of tensors and wishing to print those values.

That would be the following patch. Tested with Tiny tests, ok?

Richard.


2004May25  Richard Guenther

        * src/Tiny/Tensor.h: fix comments describing output format.
        src/Tiny/TinyMatrix.h: fix comments describing output format,
        fix output format to match the one of tensors.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p
URL:

From oldham at codesourcery.com  Tue May 25 18:21:47 2004
From: oldham at codesourcery.com (Jeffrey D. Oldham)
Date: Tue, 25 May 2004 11:21:47 -0700
Subject: [PATCH] Fix TinyMatrix print inconsistency
In-Reply-To: <40B38CFB.5060301@tat.physik.uni-tuebingen.de>
References: <200405210907.39846.radek.pecher@eng.ox.ac.uk> <40B20DA3.90303@codesourcery.com> <40B38CFB.5060301@tat.physik.uni-tuebingen.de>
Message-ID: <40B38EBB.3020702@codesourcery.com>

Richard Guenther wrote:

> Jeffrey D. Oldham wrote:
>
>> Richard Guenther wrote:
>>
>>> On Fri, 21 May 2004, Radek Pecher wrote:
>>>
>>>> - although Tensor.h:338 claims:
>>>> // The format is: ((t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... ))
>>>>   the truth is in fact:
>>>> // The format is: ((t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... ))
>>>> - this is contrary to TinyMatrix because of the i,j-swapping
>>>>   (compare: Tensor.h:361 and TinyMatrix.h:236)
>>>>
>>>
>>> That is indeed inconsistent(?). I don't know what to do on this
>>> one, but
>>> it seems purely cosmetic. I'd suggest fixing the comments and not swap
>>> the indices in TinyMatrix.h:239. Jeffrey?
>>>
>>
>> I agree that consistency is important. Which particular ordering to
>> choose is not important to me. Arrays print out in the actual order
>> (0,0), (0,1), etc. so I think tensors should also. The commented
>> tensor ordering probably follows from the comments concerning the
>> three types of tensors and wishing to print those values.
>
>
> That would be the following patch. Tested with Tiny tests, ok?
>
> Richard.
>
> 2004May25 Richard Guenther
>
> * src/Tiny/Tensor.h: fix comments describing output format.
> src/Tiny/TinyMatrix.h: fix comments describing output format,
> fix output format to match the one of tensors.
>

This is good. It must be correct since the indices are now "i,j", not
"j,i". ;) Thanks for the work. Yes, let's commit this improvement.

>------------------------------------------------------------------------
>
>Index: Tiny/Tensor.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Tiny/Tensor.h,v
>retrieving revision 1.46
>diff -u -u -r1.46 Tensor.h
>--- Tiny/Tensor.h 21 Oct 2003 19:50:04 -0000 1.46
>+++ Tiny/Tensor.h 25 May 2004 18:08:09 -0000
>@@ -337,7 +337,7 @@
>
>
>   // Output to a stream.
>-  // The format is: ((t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... ))
>+  // The format is: ((t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... ))
>
>   template
>   void print(Out &out) const
>@@ -379,7 +379,7 @@
>
>
> /// Output to a stream.
>-/// The format is: ( ( t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... )
>+/// The format is: ( ( t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... )
>
> template
> std::ostream &operator<<(std::ostream &out, const Tensor &t)
>Index: Tiny/TinyMatrix.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Tiny/TinyMatrix.h,v
>retrieving revision 1.16
>diff -u -u -r1.16 TinyMatrix.h
>--- Tiny/TinyMatrix.h 21 Oct 2003 19:50:04 -0000 1.16
>+++ Tiny/TinyMatrix.h 25 May 2004 18:08:09 -0000
>@@ -213,7 +213,7 @@
>
>
>   // Output to a stream.
>-  // The format is: ((t(0,0) t(1,0),... ) (t(0,1) t(1,1) ... ) ... ))
>+  // The format is: ((t(0,0) t(0,1),... ) (t(1,0) t(1,1) ... ) ...
)) > > template > void print(Out &out) const >@@ -225,18 +225,18 @@ > long precision = out.precision(); > out.width(0); > out << "("; >- for (int i = 0; i < D2; i++) { >+ for (int i = 0; i < D1; i++) { > out << "("; > out.flags(incomingFormatFlags); > out.width(width); > out.precision(precision); >- out << (*this)(0, i); >- for (int j = 1; j < D1; j++) { >+ out << (*this)(i, 0); >+ for (int j = 1; j < D2; j++) { > out << " "; > out.flags(incomingFormatFlags); > out.width(width); > out.precision(precision); >- out << (*this)(j, i); >+ out << (*this)(i, j); > } > out << ")"; > } >@@ -255,7 +255,7 @@ > > > /// Output to a stream. >-/// The format is: ( ( t(0,0) t(1,0),... ) ( t(0,1) t(1,1) ... ) ... ) >+/// The format is: ( ( t(0,0) t(0,1),... ) ( t(1,0) t(1,1) ... ) ... ) > > template > std::ostream &operator<<(std::ostream &out, const TinyMatrix &t) > > -- Jeffrey D. Oldham oldham at codesourcery.com From kai at chaos.gwdg.de Thu May 27 22:50:26 2004 From: kai at chaos.gwdg.de (=?ISO-8859-1?Q?Kai_Br=F6king?=) Date: Fri, 28 May 2004 00:50:26 +0200 Subject: Fwd: Pooma 2.4.0 Problem on Tru64 Unix Message-ID: <3ED9E5DF-B030-11D8-B7FF-000A95B950D8@chaos.gwdg.de> Hello everybody, I was asked to forward my problem to this list. I would be glad, if somebody could help. I have broken it down to the following points (to my opinion, anyway). The original problem is attached below. It has to do with ar stating that that the argument list is too long while executing make: 1) it seems to be a problem of the shell (in this case bash), which complains about the argument list generated by make being too long to work with. 
The argument list is generated automatically in $(pooma_source)/config/Shared/tail.mk, specifically when the AR_CMDLINE variable is built as

  $(AR_CMDLINE)=$(AR) $(RULE_AR_OPTS) $@ $(filter %.o,$+) $(INSTANTIATION_DIR)/*.o

I have not got round to breaking that down into its parts, as RULE_AR_OPTS and INSTANTIATION_DIR are rather nastily composed of more variables that seem to be distributed all over the place.

2) Fixing the problem seems to involve:
   a) finding out exactly which files make up the library that ar is to build,
   b) getting ar to process the files it shall add to the archive one by one, adding each to the library being built, plus
   c) finding out what else has to be done to get a working library.

I will have a look at all three, but for now: here's the original problem, and I would appreciate any hints for other ways to find a workaround. Maybe some of you have stumbled across this mess, too.

Thank you for your time and effort going through my mail.

Best Regards,
Kai.

*****************************************************************************
Kai Broeking                                  e-mail: kai at chaos.gwdg.de
Max-Planck-Institut fuer Stroemungsforschung  Tel: (49) 551 5176 444
Goettingen                                    FAX: (49) 551 5176 439
*****************************************************************************

Now here's what caused all the trouble:

Begin forwarded message:

> I have tried to compile pooma 2.4.0 on (several) alphas running Tru 64
> Unix (arch DECCXX)
> The options I gave configure were:
> --prefix ~myhome/lib// --arch DECCXX
> configure ran fine and completed within a second or so.
> I made a setenv POOMASUITE DECCXX.
>
> During the execution of make I encountered the following problem:
> For some reason there is a time in front of the
> ar rc foo1.cmpl.o foo2.cmpl.o foo3.cmpl.o ...
>
> make always terminates with error 1, and the precise error message I
> find in
> lib/DECCXX/libpooma.a_1.info is:
>
> /bin/sh: /usr/bin/time: arg list too long
>
> even if I unset the TIME variable in config/Shared/variables.mk, the
> problem continues, and ar returns the above error message. I have also
> tried moving the pooma-2.4.0 directory up a bit, but this does not
> solve the problem, either.
> My question therefore would be whether there is a workaround for this
> problem.
> Thank you in anticipation of your help,
>
> Best Regards,
> Kai Broeking.
> --
> *****************************************************************************
> Kai Broeking
> e-mail: kai at chaos.gwdg.de
> Max-Planck-Institut fuer Stroemungsforschung Tel: (49) 551 5176 444
> Goettingen
> FAX: (49) 551 5176 439
> *****************************************************************************

From rguenth at tat.physik.uni-tuebingen.de Fri May 28 09:20:58 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 28 May 2004 11:20:58 +0200 (CEST)
Subject: [pooma-dev] Fwd: Pooma 2.4.0 Problem on Tru64 Unix
In-Reply-To: <3ED9E5DF-B030-11D8-B7FF-000A95B950D8@chaos.gwdg.de>
References: <3ED9E5DF-B030-11D8-B7FF-000A95B950D8@chaos.gwdg.de>
Message-ID:

On Fri, 28 May 2004, Kai Bröking wrote:

> Hello everybody,
>
> I was asked to forward my problem to this list. I would be glad, if
> somebody could help. I have broken it down to the following points (to
> my opinion, anyway). The original problem is attached below.
It has to
> do with ar stating that that the argument list is too long while
> executing make:

As a simple workaround (doesn't solve the problem, but may lessen its impact), try (whoops, patch is reversed, apply with -R)

--- pooma-mpi3/r2/config/Shared/include1.mk 2004-05-28 11:10:26.000000000 +0200
+++ pooma-bk/r2/config/Shared/include1.mk 2002-07-01 16:54:07.000000000 +0200
@@ -33,8 +33,7 @@
 ifndef NEXTDIR
 # THISDIR :=$(subst /tmp_mnt,,$(shell pwd))
-# THISDIR :=$(shell pwd)
- THISDIR := .
+ THISDIR :=$(shell pwd)
 DIR_LIST :=$(THISDIR)
 else
 DIR_LIST :=$(THISDIR)/$(NEXTDIR) $(DIR_LIST)
--- pooma-mpi3/r2/config/Shared/tail.mk 2004-05-28 11:16:42.000000000 +0200
+++ pooma-bk/r2/config/Shared/tail.mk 2002-07-01 16:54:07.000000000 +0200
@@ -57,8 +57,7 @@
 INFO_FILE = $@_$(PASS).info
 # This is prepended to compile, link, archive, preprocess, etc rules.
-#PRE_CMDLINE = cd $(PROJECT_ROOT); TMPDIR=$(TMPDIR)/$(SUITE); $(TIME)
-PRE_CMDLINE = TMPDIR=$(TMPDIR)/$(SUITE); $(TIME)
+PRE_CMDLINE = cd $(PROJECT_ROOT); TMPDIR=$(TMPDIR)/$(SUITE); $(TIME)
 # This is prepended to compile, link, archive, preprocess, etc rules.
 PDB_PRE_CMDLINE = cd $(@D); TMPDIR=$(TMPDIR)/$(SUITE); $(TIME)

From radek.pecher at eng.ox.ac.uk Fri May 28 10:34:52 2004
From: radek.pecher at eng.ox.ac.uk (Radek Pecher)
Date: Fri, 28 May 2004 11:34:52 +0100
Subject: Yes, Vector temporaries do appear in every operation...!!
Message-ID: <200405281134.52635.radek.pecher@eng.ox.ac.uk>

I installed the latest GCC (v 3.4) and ran again the test described in my previous email titled "Temporary copies do appear...??" (using the optimisation flags from /config/arch/LINUXgcc.conf).

To my disappointment, the temporary Full-engine copies of the three subexpressions in the Vector expression v1*v1 + v1*v1 do get created.
Here is an excerpt of the main code and the output from Vector.h which was modified by replacing {} of all the Vector-constructors by {PrintTypeName(this);} : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Vector<2> v1(1, 2), v2; v2 = v1*v1 + v1*v1; ==================================================================== Vector<2, double, Full>* Vector<2, double, Full>* Vector<2, double, BinaryVectorOp, Vector<2, double, Full>, OpMultiply> >* Vector<2, double, Full>* Vector<2, double, BinaryVectorOp, Vector<2, double, Full>, OpMultiply> >* Vector<2, double, Full>* Vector<2, double, BinaryVectorOp, Vector<2, double, Full>, OpAdd> >* Vector<2, double, Full>* <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< I also tried the same test for the class Array; the corresponding code and output follow: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Array<1> a1(2, ModelElement(10)), a2(2); a2 = a1*a1 + a1*a1; ==================================================================== Array<1, double, Brick>* Array<1, double, Brick>* Array<1, double, ExpressionTag >, Reference > > > >* Array<1, double, ExpressionTag >, Reference > > > >* Array<1, double, ExpressionTag >, Reference > >, BinaryNode >, Reference > > > > >* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, ExpressionTag, Array<1, double, BrickView> >, BinaryNode, Array<1, double, BrickView> > > > >* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, BrickView>* Array<1, double, ExpressionTag, Array<1, double, BrickView> >, BinaryNode, Array<1, double, BrickView> > > > >* <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Clearly, for the Array-case, there are no instantiations in between the ExpressionTag-based Array-constructors (although there are a 
number of BrickView-based calls later, these represent references, not copies; the only full-memory arrays are those with engine Brick).

Unlike the Array-case, however, the Vector-case does exhibit Full-engine instantiations between the BinaryVectorOp-based calls. That may indicate a serious flaw in the design of the tiny classes...

As for myself, I am going to write my own PETE-based tiny classes (the POOMA versions seem to me unnecessarily complicated for their actual purpose). Nevertheless, if someone knows how to prevent such odd behaviour (which clearly defies one of the main POOMA goals, i.e. to get rid of unnecessary memory copying), it would be appreciated.

From rguenth at tat.physik.uni-tuebingen.de Fri May 28 10:53:57 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 28 May 2004 12:53:57 +0200 (CEST)
Subject: [pooma-dev] Yes, Vector temporaries do appear in every operation...!!
In-Reply-To: <200405281134.52635.radek.pecher@eng.ox.ac.uk>
References: <200405281134.52635.radek.pecher@eng.ox.ac.uk>
Message-ID:

On Fri, 28 May 2004, Radek Pecher wrote:

>
> I installed the latest GCC (v 3.4) and ran again the test described in
> my previous email titled "Temporary copies do appear...??"
> (using the optimisation flags from /config/arch/LINUXgcc.conf).
>
> To my disappointment, the temporary Full-engine copies of the three
> subexpressions in the Vector expression v1*v1 + v1*v1 do get created.

Note that without your debugging stuff in the constructors, these get inlined and optimized away by the optimizer. Of course one could argue creating the copies should be avoided in the first place, but I cannot see how this can be done, as, for instance, for BinaryOp::operator() we clearly need to return a _new_ Vector as result. To avoid this one would have to expression-template the vector itself, so only primitive variable types are ever copied. But I don't think this will work or pay off. Or do you have different ideas?

Richard.
> Here is an excerpt of the main code and the output from Vector.h which > was modified by replacing {} of all the Vector-constructors by > {PrintTypeName(this);} : > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > Vector<2> v1(1, 2), v2; > v2 = v1*v1 + v1*v1; > ==================================================================== > Vector<2, double, Full>* > Vector<2, double, Full>* > Vector<2, double, BinaryVectorOp, > Vector<2, double, Full>, OpMultiply> >* > Vector<2, double, Full>* > Vector<2, double, BinaryVectorOp, > Vector<2, double, Full>, OpMultiply> >* > Vector<2, double, Full>* > Vector<2, double, BinaryVectorOp, > Vector<2, double, Full>, OpAdd> >* > Vector<2, double, Full>* > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< > > > I also tried the same test for the class Array; the corresponding code > and output follow: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > Array<1> a1(2, ModelElement(10)), a2(2); > a2 = a1*a1 + a1*a1; > ==================================================================== > Array<1, double, Brick>* > Array<1, double, Brick>* > Array<1, double, ExpressionTag Reference >, Reference double, Brick> > > > >* > Array<1, double, ExpressionTag Reference >, Reference double, Brick> > > > >* > Array<1, double, ExpressionTag BinaryNode >, > Reference > >, BinaryNode Reference >, Reference double, Brick> > > > > >* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, ExpressionTag BinaryNode, > Array<1, double, BrickView> >, BinaryNode Array<1, double, BrickView>, Array<1, double, > BrickView> > > > >* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, BrickView>* > Array<1, double, ExpressionTag BinaryNode, > Array<1, double, BrickView> >, BinaryNode 
Array<1, double, BrickView>, Array<1, double, > BrickView> > > > >* > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< > > > Clearly, for the Array-case, there are no instantiations in between > the ExpressionTag-based Array-constructors (although there are a > number of BrickView-based calls later, these represent references, > not copies; the only full-memory arrays are those with engine Brick). > > Unlike the Array-case, however, the Vector-case does exhibit > Full-engine instantiations between the BinaryVectorOp-based calls. > That may indicate a serious flaw in the design of the tiny classes... > > As to myself, I am going to write my own PETE-based tiny classes (the > POOMA versions seem to me unnecessarily too complicated for their > actual purpose). Nevertheless, if someone knows how to prevent such > odd behaviour (which clearly defies one of the main POOMA goals, i.e. > to get rid of unnecessary memory copying), it would be appreciated. > -- Richard Guenther WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/ From radek.pecher at eng.ox.ac.uk Fri May 28 11:20:18 2004 From: radek.pecher at eng.ox.ac.uk (Radek Pecher) Date: Fri, 28 May 2004 12:20:18 +0100 Subject: [pooma-dev] Yes, Vector temporaries do appear in every operation...!! In-Reply-To: References: <200405281134.52635.radek.pecher@eng.ox.ac.uk> Message-ID: <200405281220.19007.radek.pecher@eng.ox.ac.uk> | Note that without your debugging stuff in the constructors, these | get inlined and optimized away by the optimizer. Of course one | could argue creating the copies should be avoided in the first | place, but I cannot see how this can be done, as, f.i. for | BinaryOp::operator() we clearly need | to return a _new_ Vector as result. To avoid this one would have | to expression-template the vector itself, so only primitive | variable types are ever copied. But I don't think this will work | or pay off. 
I actually compiled the code with the original (unmodified) version of Vector.h first and used GDB to run it and disassemble it. Without much analysing, I noticed several looping jumps at the place of the algebraic expression, which only confirms that the optimising compiler did not produce the required code:

  v2(0) = v1(0)*v1(0) + v1(0)*v1(0);
  v2(1) = v1(1)*v1(1) + v1(1)*v1(1);

as it was supposed to. (And I also tried several other optimisation configurations, of course.)

As to the need for the return of a Vector, I suppose that Vector<2, double, BinaryVectorOp<...> > is all that is needed (with the references to its two operands). There is no need at all to take this object and make its Full-engine copy for any subsequent operations.

From rguenth at tat.physik.uni-tuebingen.de Fri May 28 11:35:15 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 28 May 2004 13:35:15 +0200 (CEST)
Subject: [pooma-dev] Yes, Vector temporaries do appear in every operation...!!
In-Reply-To: <200405281220.19007.radek.pecher@eng.ox.ac.uk>
References: <200405281134.52635.radek.pecher@eng.ox.ac.uk> <200405281220.19007.radek.pecher@eng.ox.ac.uk>
Message-ID:

On Fri, 28 May 2004, Radek Pecher wrote:

> | Note that without your debugging stuff in the constructors, these
> | get inlined and optimized away by the optimizer. Of course one
> | could argue creating the copies should be avoided in the first
> | place, but I cannot see how this can be done, as, f.i. for
> | BinaryOp::operator() we clearly need
> | to return a _new_ Vector as result. To avoid this one would have
> | to expression-template the vector itself, so only primitive
> | variable types are ever copied. But I don't think this will work
> | or pay off.
>
> I actually compiled the code with the original (unmodified) version of
> Vector.h first and used GDB to run it and disassemble it.
Without
> much analysing, I noticed several looping jumps at the place of the
> algebraic expression which only confirms that the optimising compiler
> did not produce the required code:
> v2(0) = v1(0)*v1(0) + v1(0)*v1(0);
> v2(1) = v1(1)*v1(1) + v1(1)*v1(1);
> as was supposed to. (And I also tried several other optimisation
> configurations, of course.)

I don't have these temporaries. Compiling with gcc 3.4, using options -O2 -funroll-loops -DNOPAssert -S I get:

.L171:
        fldl    -24(%ebp)
        leal    -24(%ebp), %eax
        leal    -72(%ebp), %ecx
        fldl    -16(%ebp)
        fxch    %st(1)
        movl    %eax, -88(%ebp)
        fmul    %st(0), %st
        fxch    %st(1)
        movl    %eax, -84(%ebp)
        leal    -104(%ebp), %edx
        fmul    %st(0), %st
        fxch    %st(1)
        movl    %eax, -120(%ebp)
        movl    %eax, -116(%ebp)
        leal    -56(%ebp), %eax
        cmpl    %eax, %ebx
        fstl    -72(%ebp)
        fxch    %st(1)
        fstl    -64(%ebp)
        fxch    %st(1)
        fstl    -104(%ebp)
        fadd    %st(0), %st
        fxch    %st(1)
        fstl    -96(%ebp)
        fadd    %st(0), %st
        fxch    %st(1)
        movl    %ecx, -136(%ebp)
        movl    %edx, -132(%ebp)
        fstl    -56(%ebp)
        fxch    %st(1)
        fstl    -48(%ebp)
        je      .L282
        fxch    %st(1)
        fstpl   -40(%ebp)
        fstpl   -32(%ebp)
        jmp     .L179
        .p2align 4,,7
.L282:
        fstp    %st(0)
        fstp    %st(0)
        .p2align 4,,15
.L179:

which I haven't analyzed for optimal-ness in detail, but certainly there is no loop left and no calls to constructors/destructors. There are unnecessary stores to not-removed temporaries though.

> As to the need for the return of a Vector, I suppose that
> Vector<2, double, BinaryVectorOp<...> > is all is needed (with the
> references to its two operands). There is no need at all to take this
> object and make its Full-engine copy for any subsequent operations.

Well, yes, this would be a step to expression-template the vector classes. You then need assignment operators / constructors that know how to transfer this into a regular Vector - which would be the expression template expanders. Maybe it's really simple - you might want to try ;)

Richard.
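
[Editorial sketch] The idea discussed above, a lazy BinaryVectorOp-like node that only holds references to its operands plus an assignment operator that acts as the "expander", might look roughly like the following. This is a minimal illustration under stated assumptions, not POOMA or PETE code: the names Expr, BinExpr, TinyVec, Add, Mul, storage_count and temporaries_in_expression are all hypothetical, and the element type is fixed to double for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// All names below are hypothetical; none of them are POOMA or PETE APIs.

// Counts how many TinyVec objects (i.e. real element storage) were constructed.
inline int& storage_count() { static int n = 0; return n; }

// CRTP base: marks a type as an N-element expression.
template <class E, std::size_t N>
struct Expr {
  const E& self() const { return static_cast<const E&>(*this); }
};

struct Add { static double apply(double a, double b) { return a + b; } };
struct Mul { static double apply(double a, double b) { return a * b; } };

// Lazy node: holds references to its operands and owns no element storage.
// The references are only valid until the end of the full expression, so a
// node must be consumed by an assignment, not stored in a variable.
template <class L, class R, class Op, std::size_t N>
class BinExpr : public Expr<BinExpr<L, R, Op, N>, N> {
 public:
  BinExpr(const L& l, const R& r) : l_(l), r_(r) {}
  double operator()(std::size_t i) const { return Op::apply(l_(i), r_(i)); }
 private:
  const L& l_;
  const R& r_;
};

// Concrete vector with real storage.
template <std::size_t N>
class TinyVec : public Expr<TinyVec<N>, N> {
 public:
  TinyVec() : data_{} { ++storage_count(); }
  TinyVec(std::initializer_list<double> init) : data_{} {
    assert(init.size() == N);
    std::size_t i = 0;
    for (double x : init) { if (i < N) data_[i++] = x; }
    ++storage_count();
  }
  // The "expander": evaluates any expression tree element by element.
  template <class E>
  TinyVec& operator=(const Expr<E, N>& e) {
    for (std::size_t i = 0; i < N; ++i) data_[i] = e.self()(i);
    return *this;
  }
  double operator()(std::size_t i) const { return data_[i]; }
  double& operator()(std::size_t i) { return data_[i]; }
 private:
  double data_[N];
};

// Operators build nodes instead of computing a new TinyVec.
template <class L, class R, std::size_t N>
BinExpr<L, R, Add, N> operator+(const Expr<L, N>& l, const Expr<R, N>& r) {
  return BinExpr<L, R, Add, N>(l.self(), r.self());
}
template <class L, class R, std::size_t N>
BinExpr<L, R, Mul, N> operator*(const Expr<L, N>& l, const Expr<R, N>& r) {
  return BinExpr<L, R, Mul, N>(l.self(), r.self());
}

// Evaluates v*v + v*v into out and reports how many TinyVec objects were
// constructed while doing so (out itself is created by the caller).
inline int temporaries_in_expression(const TinyVec<2>& v, TinyVec<2>& out) {
  const int before = storage_count();
  out = v * v + v * v;
  return storage_count() - before;
}
```

In this sketch, calling temporaries_in_expression(v1, v2) with v1 = (1, 2) should leave v2 = (2, 8) without constructing any intermediate TinyVec; only small reference-holding nodes are created. Whether a given compiler then reduces this to the hand-written two-line form is a separate question that still has to be checked in the generated assembly.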
--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Fri May 28 11:56:28 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Fri, 28 May 2004 13:56:28 +0200 (CEST)
Subject: [pooma-dev] Yes, Vector temporaries do appear in every operation...!!
In-Reply-To: <200405281220.19007.radek.pecher@eng.ox.ac.uk>
References: <200405281134.52635.radek.pecher@eng.ox.ac.uk> <200405281220.19007.radek.pecher@eng.ox.ac.uk>
Message-ID:

On Fri, 28 May 2004, Radek Pecher wrote:

> | Note that without your debugging stuff in the constructors, these
> | get inlined and optimized away by the optimizer. Of course one
> | could argue creating the copies should be avoided in the first
> | place, but I cannot see how this can be done, as, f.i. for
> | BinaryOp::operator() we clearly need
> | to return a _new_ Vector as result. To avoid this one would have
> | to expression-template the vector itself, so only primitive
> | variable types are ever copied. But I don't think this will work
> | or pay off.
>
>
> I actually compiled the code with the original (unmodified) version of
> Vector.h first and used GDB to run it and disassemble it. Without
> much analysing, I noticed several looping jumps at the place of the
> algebraic expression which only confirms that the optimising compiler
> did not produce the required code:
> v2(0) = v1(0)*v1(0) + v1(0)*v1(0);
> v2(1) = v1(1)*v1(1) + v1(1)*v1(1);
> as was supposed to. (And I also tried several other optimisation
> configurations, of course.)
Oh, and you usually get a better idea of what the compiler is able to do if not using main(), but just an externally visible function like

#include "Pooma/Arrays.h"

Vector<2> test(const Vector<2>& v1)
{
  Vector<2> v2;
  v2 = v1*v1 + v1*v1;
  return v2;
}

and indeed, comparing with

Vector<2> test2(const Vector<2>& v1)
{
  Vector<2> v2;
  v2(0) = v1(0)*v1(0) + v1(0)*v1(0);
  v2(1) = v1(1)*v1(1) + v1(1)*v1(1);
  return v2;
}

shows inefficient code.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Mon May 31 15:01:32 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 31 May 2004 17:01:32 +0200
Subject: Cheetah / PETE license
Message-ID: <40BB48CC.9050101@tat.physik.uni-tuebingen.de>

Hi!

Is there any progress regarding the missing Cheetah / PETE licenses? Would there be objections against removing Cheetah support from POOMA?

Thanks,
Richard.

From rguenth at tat.physik.uni-tuebingen.de Mon May 31 15:16:36 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 31 May 2004 17:16:36 +0200
Subject: [PATCH] Shorten filenames during build/link
Message-ID: <40BB4C54.6020808@tat.physik.uni-tuebingen.de>

This patch shortens filenames by omitting full path to avoid overly long command lines. Compiled and tested building some examples, benchmarks and tests. Ok?

Richard.

2004May31 Richard Guenther

  * config/Shared/include1.mk: set THISDIR to .
  config/Shared/tail.mk: don't change into PROJECT_ROOT for PRE_CMDLINE.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p
URL:

From rguenth at tat.physik.uni-tuebingen.de Mon May 31 15:20:16 2004
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 31 May 2004 17:20:16 +0200
Subject: [PATCH] Minor cleanups
Message-ID: <40BB4D30.3020109@tat.physik.uni-tuebingen.de>

Hi!
This patch changes the comp(int) methods of Array/Field so that they no longer take their argument by const reference; it also removes an unused specialization of ComponentView. Compiled and tested on ia32-linux, ok?

Richard.

2004May31 Richard Guenther

  * src/Array/Array.h: remove ComponentView specialization, do not pass i1 by const reference for comp() method.
  src/Field/Field.h: do not pass i1 by const reference for comp() method.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: p2
URL: