From yourfriend_t at yahoo.co.jp Sun Jun 1 05:39:08 2003
From: yourfriend_t at yahoo.co.jp (T)
Date: Sun, 1 Jun 2003 14:39:08 +0900 (JST)
Subject: Where can I get latest cheetah?
Message-ID: <20030601053908.2519.qmail@web506.mail.yahoo.co.jp>

Hello,

I want to know where I can get the latest cheetah.
We can get the latest pooma using CVS from anoncvs at pooma.codesourcery.com
but I couldn't find where cheetah is.
Where can I get it?

Thank you,
Tomo

From rguenth at tat.physik.uni-tuebingen.de Sun Jun 1 11:43:51 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sun, 1 Jun 2003 13:43:51 +0200 (CEST)
Subject: [pooma-dev] Where can I get latest cheetah?
In-Reply-To: <20030601053908.2519.qmail@web506.mail.yahoo.co.jp>
Message-ID:

On Sun, 1 Jun 2003, T wrote:

> Hello,
>
> I want to know where I can get the latest cheetah.
> We can get the latest pooma using CVS from anoncvs at pooma.codesourcery.com
> but I couldn't find where cheetah is.
> Where can I get it?

http://www.acl.lanl.gov/cheetah/

Richard.

From renard1 at llnl.gov Tue Jun 3 19:11:06 2003
From: renard1 at llnl.gov (Paul A. Renard)
Date: Tue, 03 Jun 2003 12:11:06 -0700
Subject: Good News. Intel's ICC 8.0 Beta looks promising, now.
Message-ID: <5.1.0.14.2.20030603114653.0423ab78@popout.llnl.gov>

An HTML attachment was scrubbed...
URL:

From rguenth at tat.physik.uni-tuebingen.de Tue Jun 3 19:36:33 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 3 Jun 2003 21:36:33 +0200 (CEST)
Subject: [pooma-dev] Good News. Intel's ICC 8.0 Beta looks promising, now.
In-Reply-To: <5.1.0.14.2.20030603114653.0423ab78@popout.llnl.gov>
Message-ID:

On Tue, 3 Jun 2003, Paul A. Renard wrote:

> Back in February, 2003, I reported that Intel's icc 7.0 compiler was producing code
> using Pooma constructs that was 2X-4X slower than KCC.
> Since then, the folks at Intel have worked hard, and for my little test
> (reproduced at the end of this message), the icc 8.0 Beta compiler
> (l_cc_b_8.0.023) is now producing code slightly faster (maybe 5-10%) than
> KCC, and certainly comparable to hand-written loops.
>
> The only optimization items for compiling were:
>     -O3 -DNOPAssert -DNOCTAssert -tpp7 -xW
> but the last two are particular to Pentium 4 vectorization, which plays a very small
> part in the tests I did, and which probably caused the "slightly faster", rather
> than "just about the same speed".
>
> So, icc 8.0 seems to be a useful choice in compilers (for Linux and Windows).

Unfortunately my tests show it's better, but still worse than with gcc.
Your test is 1d, try 3d and it starts to suck. Inlining is still the
culprit, as is CSE with, for instance, Loc<n> (where n>1) objects. With the
following gcc3.3 patch applied

http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/leafify-3.3-2

or leafify-3.4-2 for mainline, I get very good results with gcc3.3. The
only parts to change inside POOMA are the expression kernels in
src/Evaluator, where you put __attribute__((leafify)) on the kernel
functions (I can extract a patch, if you like).

Richard.

From renard1 at llnl.gov Tue Jun 3 19:57:34 2003
From: renard1 at llnl.gov (Paul A. Renard)
Date: Tue, 03 Jun 2003 12:57:34 -0700
Subject: [pooma-dev] Good News. Intel's ICC 8.0 Beta looks promising, now.
In-Reply-To:
References: <5.1.0.14.2.20030603114653.0423ab78@popout.llnl.gov>
Message-ID: <5.1.0.14.2.20030603125001.042b4c38@popout.llnl.gov>

An HTML attachment was scrubbed...
URL:

From rguenth at tat.physik.uni-tuebingen.de Tue Jun 3 20:41:26 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 3 Jun 2003 22:41:26 +0200 (CEST)
Subject: [pooma-dev] Good News. Intel's ICC 8.0 Beta looks promising, now.
In-Reply-To: <5.1.0.14.2.20030603125001.042b4c38@popout.llnl.gov>
Message-ID:

On Tue, 3 Jun 2003, Paul A.
Renard wrote:

> Richard:
>
> From your message:
> Unfortunately my tests show it's better, but still worse than with gcc.
> Your test is 1d, try 3d and it starts to suck. Inlining is still the
> culprit, as is CSE with, for instance, Loc<n> (where n>1) objects.
>
>
> Actually, my test is 2D. Do you have a 3D test you can send? Were you comparing
> icc 8.0? I'd like to try your test on my machine with KCC and icc 8.0

Yes, attached. I tested icc8.0 and gcc3.3 (patched). I'd be interested in
KCC results, too. With gcc I get

Benchmark size 262144:
  ET: 5.55688e-08
  Stencil: 6.05278e-08
  ScalarCode (int): 7.5695e-08
  ScalarCode (Loc): 1.15906e-07
Benchmark size 2097152:
  ET: 5.74374e-08
  Stencil: 6.38685e-08
  ScalarCode (int): 7.94697e-08
  ScalarCode (Loc): 1.19308e-07
Benchmark size 262144:
  ET: 7.75644e-08
  Stencil: 7.78923e-08
  ScalarCode (int): 6.76191e-08
  ScalarCode (Loc): 1.55674e-07
Benchmark size 2097152:
  ET: 6.99201e-08
  Stencil: 7.7395e-08
  ScalarCode (int): 6.24175e-08
  ScalarCode (Loc): 1.54993e-07
Total (sum) s/iteration 1.37126e-06

with icc

Benchmark size 262144:
  ET: 7.37382e-08
  Stencil: 7.42148e-08
  ScalarCode (int): 8.37249e-08
  ScalarCode (Loc): 9.26857e-08
Benchmark size 2097152:
  ET: 8.0122e-08
  Stencil: 7.84069e-08
  ScalarCode (int): 8.49171e-08
  ScalarCode (Loc): 9.70053e-08
Benchmark size 262144:
  ET: 1.14643e-07
  Stencil: 9.76029e-08
  ScalarCode (int): 6.61776e-08
  ScalarCode (Loc): 1.42822e-07
Benchmark size 2097152:
  ET: 1.13272e-07
  Stencil: 9.84888e-08
  ScalarCode (int): 5.80321e-08
  ScalarCode (Loc): 1.41148e-07
Total (sum) s/iteration 1.497e-06

While the 1d ScalarCode (Loc) numbers are better with icc, the 3d expression
template versions are awfully slow (I filed a PR already).

Richard.
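The figures above are seconds per element: elapsed wall-clock time divided by
the benchmark size. As a point of comparison, a hand-written baseline for the
same two-point average can be timed the same way. This is only a sketch in
plain C++ without POOMA; averageNeighbours, secondsPerElement and the use of
std::chrono are assumptions made for illustration and are not part of the
attached benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hand-written equivalent of the benchmark kernel:
// b(i) = 0.5 * (a(i-1) + a(i+1)) on the interior points only.
// Hypothetical helper, not a POOMA function.
void averageNeighbours(const std::vector<double>& a, std::vector<double>& b)
{
  assert(a.size() == b.size());
  for (std::size_t i = 1; i + 1 < a.size(); ++i)
    b[i] = 0.5 * (a[i - 1] + a[i + 1]);
}

// Time one sweep and return elapsed seconds divided by the element count.
double secondsPerElement(const std::vector<double>& a, std::vector<double>& b)
{
  assert(!a.empty());
  const std::chrono::steady_clock::time_point start =
      std::chrono::steady_clock::now();
  averageNeighbours(a, b);
  const std::chrono::steady_clock::time_point stop =
      std::chrono::steady_clock::now();
  const std::chrono::duration<double> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(a.size());
}
```

A single sweep over a small array is dominated by timer resolution, so a real
measurement would repeat the sweep many times and average.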
-------------- next part --------------
#include <cstdlib>  // exit

#include "Pooma/Pooma.h"
#include "Pooma/Arrays.h"
#include "Utilities/Clock.h"

// Expression-template version of the two-point average.
template <class A1, class A2>
void benchET(const A1& a, const A2& b)
{
  asm("benchET_begin:");
  Interval<1> I = a.physicalDomain();
  Loc<1> dX = Loc<1>(1);
  b(I) = 0.5 * (a.read(I-dX) + a.read(I+dX));
  asm("benchET_end:");
}

// Stencil functor reaching one cell to either side.
struct MyStencil
{
  MyStencil() {};

  template <class A1>
  inline typename A1::Element_t
  operator()(const A1& a, int i) const
  {
    return 0.5 * (a.read(i-1) + a.read(i+1));
  }

  inline int lowerExtent(int) const { return 1; }
  inline int upperExtent(int) const { return 1; }
};

template <class A1, class A2>
void benchStencil(const A1& a, const A2& b)
{
  asm("benchStencil_begin:");
  Interval<1> I = a.physicalDomain();
  b(I) = Stencil<MyStencil>()(a)(I);
  asm("benchStencil_end:");
}

// ScalarCode functor that indexes with Loc<1> offsets.
struct MyScalarCodeLoc
{
  MyScalarCodeLoc() {};

  void scalarCodeInfo(ScalarCodeInfo<1, 2>& i) const
  {
    i.extent(GuardLayers<1>(1));
    i.write(0, true);
    i.write(1, false);
    i.useGuards(0, false);
    i.useGuards(1, true);
  }

  static const Loc<1> dX;

  template <class A1, class A2>
  inline void operator()(const A1& a, const A2& b, const Loc<1>& I) const
  {
    b(I) = 0.5 * (a.read(I-dX) + a.read(I+dX));
  }
};

const Loc<1> MyScalarCodeLoc::dX = Loc<1>(1);

// ScalarCode functor that indexes with plain integers.
struct MyScalarCodeInt
{
  MyScalarCodeInt() {};

  void scalarCodeInfo(ScalarCodeInfo<1, 2>& i) const
  {
    i.extent(GuardLayers<1>(1));
    i.write(0, true);
    i.write(1, false);
    i.useGuards(0, false);
    i.useGuards(1, true);
  }

  template <class A1, class A2>
  inline void operator()(const A1& a, const A2& b, const Loc<1>& I) const
  {
    int i = I.first();
    b(i) = 0.5 * (a.read(i-1) + a.read(i+1));
  }
};

template <class A1, class A2>
void benchScalarCodeLoc(const A1& a, const A2& b)
{
  asm("benchScalarCodeLoc_begin:");
  Interval<1> I = a.physicalDomain();
  ScalarCode<MyScalarCodeLoc>()(a, b);
  asm("benchScalarCodeLoc_end:");
}

template <class A1, class A2>
void benchScalarCodeInt(const A1& a, const A2& b)
{
  asm("benchScalarCodeInt_begin:");
  Interval<1> I = a.physicalDomain();
  ScalarCode<MyScalarCodeInt>()(a, b);
  asm("benchScalarCodeInt_end:");
}

// Run all four variants on arrays of the given size and report the
// time per element.
void bench(int size)
{
  Interval<1> domain = Interval<1>(size);
  GridLayout<1> layout = GridLayout<1>(domain, Loc<1>(8), GuardLayers<1>(1), ReplicatedTag());
  // Engine tag inferred from the GridLayout / ReplicatedTag used above.
  Array<1, double, MultiPatch<GridTag, Brick> > A(layout), B(layout);
  A(A.domain()) = 1.0;
  B(domain) = 1.0;
  if (!all(B(domain) == 1.0))
    exit(1);

  double startET = Pooma::Clock::value();
  benchET(A, B);
  double endET = Pooma::Clock::value();
  if (!all(B(domain) == 1.0))
    exit(1);

  double startStencil = Pooma::Clock::value();
  benchStencil(A, B);
  double endStencil = Pooma::Clock::value();
  if (!all(B(domain) == 1.0))
    exit(1);

  double startScalarCodeInt = Pooma::Clock::value();
  benchScalarCodeInt(A, B);
  double endScalarCodeInt = Pooma::Clock::value();
  if (!all(B(domain) == 1.0))
    exit(1);

  double startScalarCodeLoc = Pooma::Clock::value();
  benchScalarCodeLoc(A, B);
  double endScalarCodeLoc = Pooma::Clock::value();
  if (!all(B(domain) == 1.0))
    exit(1);

  Inform out;
  out << "Benchmark size " << size << ":" << std::endl;
  out << " ET: " << (endET - startET)/size << std::endl;
  out << " Stencil: " << (endStencil - startStencil)/size << std::endl;
  out << " ScalarCode (int): " << (endScalarCodeInt - startScalarCodeInt)/size << std::endl;
  out << " ScalarCode (Loc): " << (endScalarCodeLoc - startScalarCodeLoc)/size << std::endl;
}

int main(int argc, char **argv)
{
  Pooma::initialize(argc, argv);
  Pooma::blockingExpressions(true);

  bench(32*32*32);
  bench(32*32*32*10);
  bench(32*32*32*100);

  Pooma::finalize();
  return 0;
}

From rguenth at tat.physik.uni-tuebingen.de Tue Jun 3 21:12:06 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 3 Jun 2003 23:12:06 +0200 (CEST)
Subject: [pooma-dev] Good News. Intel's ICC 8.0 Beta looks promising, now.
In-Reply-To: <3EDCFA73.6000405@codesourcery.com>
Message-ID:

On Tue, 3 Jun 2003, Jeffrey D. Oldham wrote:

> Richard Guenther wrote:
> > With the following gcc3.3 patch applied
> > http://www.tat.physik.uni-tuebingen.de/~rguenth/gcc/leafify-3.3-2
> > or leafify-3.4-2 for mainline, I get very good results with gcc3.3.
>
> Have you asked the gcc community to accept this leafify inlining idea?
> There is usually a lot of resistance to adding extensions to gcc since
> they tend to cause complications from non-orthogonality with other
> language features.

I just bugged the gcc list about this issue, maybe experienced users of
scientific C++ codes want to comment. Also maybe someone knows of a
precedent other than the NEC C++ compiler for a similar language extension.
The post is at http://gcc.gnu.org/ml/gcc/2003-06/msg00308.html

Thanks, Richard.

From jcrotinger at proximation.com Wed Jun 4 21:32:49 2003
From: jcrotinger at proximation.com (James Crotinger)
Date: Wed, 4 Jun 2003 15:32:49 -0600
Subject: [pooma-dev] Brick engine and pointer aliasing
Message-ID:

I don't know where the code base ended up, but it used to be that we did a
dependency analysis and if we detected the same block in use on the RHS that
would be assigned to on the LHS, we allocated a new array, assigned results
into it, and then copied the results back into the original target array. A
lot of that sort of analysis got removed in the push to get 2.3 out before
we left, though, so it may be gone, in which case certain statements that
used to be well-defined would now be undefined.

With respect to restrict, we tried this at various times and found it not to
help. With KCC, it wasn't properly propagated to the generated C code - they
were not able to carry out the analysis carefully enough to label the
ultimate temporary pointer that was used in the inner loop as a restricted
pointer. Also, it's nonstandard, so if it is put in, make it so that it can
be configured away.

Jim

-----Original Message-----
From: Richard Guenther [mailto:rguenth at tat.physik.uni-tuebingen.de]
Sent: Wednesday, May 28, 2003 7:59 AM
To: pooma-dev at pooma.codesourcery.com
Subject: [pooma-dev] Brick engine and pointer aliasing

Hi!

Currently the data members of the Brick and BrickView engines are
_not_ marked restrict, i.e. they're T *data_m.
While strictly speaking this is correct, it harms performance on vector
computers quite a lot.

For dataparallel statements in POOMA the result is undefined if
iterations depend on each other, which is equivalent to saying that the
compiler may assume restrictness of all data_m pointers here?
[note the question mark]

For non-dataparallel statements, the situation is more complicated.
While under the restrict assumption a loop like

  for (i=0; i<4; ++i)
    A(i) = A(i-1);

is the same as non-restricted(?), if we have two views to the same
Array, things get messed up, as in

  for (i=0; i<4; ++i)
    A(Interval1)(i) = A(Interval2)(i-1);

as now the iterations can be executed in parallel if BrickViews
have restricted data members.

The question now is, do we actually "support" such non-dataparallel
statements involving different views of the same Brick engine? Can
we specify such uses as undefined behavior? Can we mark Brick and
BrickView engine data_m members restrict?

Any thoughts on these issues?

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rguenth at tat.physik.uni-tuebingen.de Wed Jun 4 21:39:40 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Wed, 4 Jun 2003 23:39:40 +0200 (CEST)
Subject: [pooma-dev] Brick engine and pointer aliasing
In-Reply-To:
Message-ID:

On Wed, 4 Jun 2003, James Crotinger wrote:

> I don't know where the code base ended up, but it used to be that we did a
> dependency analysis and if we detected the same block in use on the RHS that
> would be assigned to on the LHS, we allocated a new array, assigned results
> into it, and then copied the results back into the original target array. A
> lot of that sort of analysis got removed in the push to get 2.3 out before
> we left, though, so it may be gone, in which case certain statements that
> used to be well-defined would now be undefined.
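The copy-through-a-temporary strategy described in the quoted paragraph can
be sketched in plain C++ as follows. This is a minimal illustration only: the
View struct and the overlaps and assign helpers are hypothetical names, and
the actual POOMA dependency analysis is not shown anywhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view: the window [offset, offset + length) of some storage.
struct View
{
  std::vector<double>* storage;
  std::size_t offset;
  std::size_t length;
};

// True if both views refer to the same storage and their windows overlap.
bool overlaps(const View& lhs, const View& rhs)
{
  return lhs.storage == rhs.storage
      && lhs.offset < rhs.offset + rhs.length
      && rhs.offset < lhs.offset + lhs.length;
}

// dst(i) = src(i) for all i, as if every right-hand side value had been
// read before any left-hand side element was written.
void assign(const View& dst, const View& src)
{
  assert(dst.length == src.length);
  std::vector<double>& out = *dst.storage;
  const std::vector<double>& in = *src.storage;

  if (overlaps(dst, src))
  {
    // Same block on both sides: evaluate into a temporary, then copy back.
    std::vector<double> tmp(src.length);
    for (std::size_t i = 0; i < src.length; ++i)
      tmp[i] = in[src.offset + i];
    for (std::size_t i = 0; i < dst.length; ++i)
      out[dst.offset + i] = tmp[i];
  }
  else
  {
    // No dependency between the two sides: assign directly.
    for (std::size_t i = 0; i < dst.length; ++i)
      out[dst.offset + i] = in[src.offset + i];
  }
}
```

With storage {0, 1, 2, 3, 4}, assigning the window starting at 0 to the
window starting at 1 shifts the values by one place. A naive forward loop
over the same two views would instead propagate element 0 into every
position, which is the kind of dependence the analysis was meant to catch.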
Yes, I remember reading something in some manual about this fact ;)

> With respect to restrict, we tried this at various times and found it not to
> help. With KCC, it wasn't properly propagated to the generated C code - they
> were not able to carry out the analysis carefully enough to label the
> ultimate temporary pointer that was used in the inner loop as a restricted
> pointer. Also, it's nonstandard, so if it is put in, make it so that it can
> be configured away.

"restrict" is properly #defined based on a configure check. Or do you
mean specifically disable restrict for Brick and BrickView? Anyway, putting
restrict in helps the NEC C++ compiler vectorize some loops (not all -
it's bad at inlining, like so many other compilers - and its #pragma inline
complete refuses to work for some unknown reason), so you get 2GFlops
instead of 50MFlops there ;)

Richard.

> Jim
>
>
> -----Original Message-----
> From: Richard Guenther [mailto:rguenth at tat.physik.uni-tuebingen.de]
> Sent: Wednesday, May 28, 2003 7:59 AM
> To: pooma-dev at pooma.codesourcery.com
> Subject: [pooma-dev] Brick engine and pointer aliasing
>
> Hi!
>
> Currently the data members of the Brick and BrickView engines are
> _not_ marked restrict, i.e. they're T *data_m. While strictly
> speaking this is correct, it harms performance on vector computers
> quite a lot.
>
> For dataparallel statements in POOMA the result is undefined if
> iterations depend on each other, which is equivalent to saying that the
> compiler may assume restrictness of all data_m pointers here?
> [note the question mark]
>
> For non-dataparallel statements, the situation is more complicated.
> While under the restrict assumption a loop like
>
> for (i=0; i<4; ++i)
>   A(i) = A(i-1);
>
> is the same as non-restricted(?), if we have two views to the same
> Array, things get messed up, as in
>
> for (i=0; i<4; ++i)
>   A(Interval1)(i) = A(Interval2)(i-1);
>
> as now the iterations can be executed in parallel if BrickViews
> have restricted data members.
>
> The question now is, do we actually "support" such non-dataparallel
> statements involving different views of the same Brick engine? Can
> we specify such uses as undefined behavior? Can we mark Brick and
> BrickView engine data_m members restrict?
>
> Any thoughts on these issues?
>
> Richard.
>
> --
> Richard Guenther
> WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
>

From yye00 at aub.edu.lb Mon Jun 9 12:27:32 2003
From: yye00 at aub.edu.lb (Yaqoub El Khamra)
Date: Mon, 9 Jun 2003 12:27:32 GMT
Subject: MPI Problems
Message-ID:

On Redhat 9, using LAM MPI and using the following configuration:
./configure --arch LINUXGCC --noex --static --opt --mpi --noshmem

Everything seems to work fine, I cd and make. Everything is again working
properly. To be certain, I "make tests" and this is what I get:

cdup.o(.text+0x71): undefined reference to `_kio'

and the same for ah_init, ah_next, ah_insert, lam_kexit, etc...

Did you encounter this error by any chance? Did any of the other users have a
similar problem??

Thank you
Yaqoub

From rguenth at tat.physik.uni-tuebingen.de Tue Jun 10 14:54:55 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 10 Jun 2003 16:54:55 +0200 (CEST)
Subject: [pooma-dev] MPI Problems
In-Reply-To:
Message-ID:

On Mon, 9 Jun 2003, Yaqoub El Khamra wrote:

>
> On Redhat 9, using LAM MPI and using the following configuration:
> ./configure --arch LINUXGCC --noex --static --opt --mpi --noshmem
>
> Everything seems to work fine, I cd and make. Everything is again working
> properly.
> To be certain, I "make tests" and this is what I get:
>
> cdup.o(.text+0x71): undefined reference to `_kio'
>
> and the same for ah_init, ah_next, ah_insert, lam_kexit, etc...
>
> Did you encounter this error by any chance? Did any of the other users have a
> similar problem??

Check your linking command - you probably need another LAM MPI library
linked. The configure is not exactly clever in trying to find out which.
Look in configure, line 780 and below, and either fix the generic libs for
LAM, or create a different mpi species switch for it and submit a patch to
the cheetah maintainers. Or use MPICH ;)

Hope this helps,

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Sun Jun 15 12:38:03 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Sun, 15 Jun 2003 14:38:03 +0200 (CEST)
Subject: [PATCH] Fix Engine/tests/dynamiclayout_test1 failure
Message-ID:

Hi!

The following patch fixes the failure. The problem is that the default
constructor of Engine<1, T, Remote > doesn't create a sane state, and thus
makeOwnCopy() called from ElementProperties::construct will fail with an
assertion.

Tested Layout, Engine and DynamicArray with no new failures on ppc-linux.

Ok?

Richard.

2003Jun15 Richard Guenther

	* src/Engine/RemoteDynamicEngine.h: (makeOwnCopy) verify engine
	before copying.

Index: RemoteDynamicEngine.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Engine/RemoteDynamicEngine.h,v
retrieving revision 1.19
diff -u -u -r1.19 RemoteDynamicEngine.h
--- RemoteDynamicEngine.h 18 Dec 2002 21:38:19 -0000 1.19
+++ RemoteDynamicEngine.h 15 Jun 2003 12:09:25 -0000
@@ -239,7 +239,7 @@

   inline Engine_t &makeOwnCopy()
   {
-    if (engineIsLocal())
+    if (engineIsLocal() && localEnginePtr_m != NULL)
     {
       // Ideally this would be localEnginePtr_m.makeOwnCopy();
       // but Shared<> doesn't implement ElementProperties correctly.
@@ -516,7 +516,7 @@

 template Engine<1, T, Remote >::Engine()
-  : owningContext_m(0)
+  : owningContext_m(0), localEnginePtr_m(NULL)
 {
   PAssert(owningContext_m < Pooma::contexts());

From rguenth at tat.physik.uni-tuebingen.de Mon Jun 16 09:23:38 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 16 Jun 2003 11:23:38 +0200 (CEST)
Subject: [pooma-dev] Good News. Intel's ICC 8.0 Beta looks promising, now.
In-Reply-To: <5.1.0.14.2.20030610091409.00aa4120@popout.llnl.gov>
Message-ID:

Hi Paul / others!

As you have access to KAI CC, I'd be curious to know what numbers you
get for the BlitzLoops / ABCTest benchmarks. Do we really expect the
POOMA-II numbers to match the C / C restrict numbers? I know neither ICC
nor gcc is really there at the moment.

For the BlitzLoops with my patched gcc3.3 I get
(-O2 -march=athlon -fomit-frame-pointer -funroll-loops)

rguenth at phoenix15 BlitzLoops > ./LINUXgcc33/Loop18 --sim-params 1000 3 1 --no-diags --samples 10

      N  C restrict        C  CppTran  PoomaII
   1000      637.74   599.74   441.03   416.92
  10000      446.39   412.08   364.48   345.99
 100000       94.34    83.25    82.50    81.25
1000000       91.77    74.49    74.39    76.55

which is good once we are memory bandwidth limited. With icc 8.0
(-O3 -xK -tpp6 -ip -restrict)

rguenth at phoenix15 BlitzLoops > ./LINUXICC/Loop18 --sim-params 1000 3 1 --no-diags --samples 10

      N  C restrict        C  CppTran  PoomaII
   1000      857.11   909.79   248.03   187.58
  10000      449.74   434.15   230.60   181.60
 100000       82.61    83.24    82.04    88.50
1000000       79.18    90.93    80.07    88.65

Again, this looks good for large datasets.

Just for the curious, if I mark the data_m members of both the Brick and
BrickView engines restrict, we get the same numbers for Intel icpc; gcc
also doesn't improve with this benchmark.

Richard.
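For readers following the restrict discussion, a configurable restrict
qualifier on a kernel might look like the sketch below. SKETCH_RESTRICT and
smooth are hypothetical names chosen for illustration; POOMA's own configure
check and macro are not shown in this thread and may differ. The sketch
relies on __restrict__ being accepted by GCC-compatible compilers and
__restrict by MSVC, and expands to nothing elsewhere so that the qualifier
can be configured away.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: map to the compiler's restrict extension if there is
// one, otherwise expand to nothing.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_RESTRICT __restrict__
#elif defined(_MSC_VER)
#  define SKETCH_RESTRICT __restrict
#else
#  define SKETCH_RESTRICT
#endif

// Two-point average on the interior points. The qualifiers promise the
// compiler that out and in never refer to overlapping memory, so it is free
// to reorder or vectorize the loop iterations.
void smooth(double* SKETCH_RESTRICT out,
            const double* SKETCH_RESTRICT in,
            std::size_t n)
{
  assert(out != in);  // catches only the most obvious violation of the promise
  for (std::size_t i = 1; i + 1 < n; ++i)
    out[i] = 0.5 * (in[i - 1] + in[i + 1]);
}
```

Calling smooth on two overlapping views of one buffer would break the
promise and give undefined results, which is the concern raised earlier in
the thread about different views of the same Brick engine.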
--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Mon Jun 23 12:54:31 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 23 Jun 2003 14:54:31 +0200 (CEST)
Subject: [RFC] PCH support for pooma / gcc3.4
Message-ID:

Hi!

I finally got to add preliminary support for PCH to pooma (see attached
patch). Unfortunately the build system of pooma is a complete mess, so
the pch files cannot be used for compiling the testsuite. Also adding
support for Intel icc 8.0 pch will be difficult as it handles things
totally differently. Also we may want to restructure the Pooma/ includes
quite a bit.

Comments? Who else doesn't like the build system? Who would be happy with
a (stripped down in features) autoconf transition that would affect library
users?

Richard.

--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

# This is a BitKeeper generated patch for the following project:
# Project Name: pooma/cheetah repository tracking CVS/tarball
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas: # ChangeSet 1.60 -> 1.61 # r2/configure 1.11 -> 1.12 # r2/config/Shared/compilerules.mk 1.1 -> 1.2 # r2/src/Pooma/objfile.mk 1.1 -> 1.2 # r2/bin/makeinstall 1.1 -> 1.2 # r2/config/Shared/rules.mk 1.2 -> 1.3 # r2/config/Shared/tail.mk 1.1 -> 1.2 # r2/config/arch/LINUXgcc.conf 1.2 -> 1.3 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/23 rguenth at bellatrix.tat.physik.uni-tuebingen.de 1.61 # initial PCH support and other (unfortunate) random stuff # -------------------------------------------- # diff -Nru a/r2/bin/makeinstall b/r2/bin/makeinstall --- a/r2/bin/makeinstall Mon Jun 23 14:50:59 2003 +++ b/r2/bin/makeinstall Mon Jun 23 14:50:59 2003 @@ -31,7 +31,7 @@ # # Usage: # -# makeinstall [] +# makeinstall [] # # where # @@ -56,20 +56,21 @@ ### Make sure we have the right arguments -if [ "$#" != "4" -a "$#" != "5" ]; then - echo "Usage: $0 []" +if [ "$#" != "5" -a "$#" != "6" ]; then + echo "Usage: $0 []" exit 1 fi suite=$1 libext=$4 +pchextension=$5 libbase=pooma libname=lib$libbase libfull=$libname.$libext extensions="" -if [ "$#" != "4" ]; then - extensions=$5 +if [ "$#" != "5" ]; then + extensions=$6 fi libfilename=$libname$extensions.$libext @@ -127,6 +128,10 @@ srclista=`find src/arch -type f -print | grep -v CVS` tar cf - $srclisth $srclistc $srclista | (cd $installdir ; tar xvf - ) +if test "$pchextension" != ""; then + echo "Copying pch files to $installdir ..." 
+ for i in src/Pooma/$suite/*.$pchextension; do cp $i $installdir/src/Pooma; done +fi ### Copy HTML files to html directory diff -Nru a/r2/config/Shared/compilerules.mk b/r2/config/Shared/compilerules.mk --- a/r2/config/Shared/compilerules.mk Mon Jun 23 14:50:59 2003 +++ b/r2/config/Shared/compilerules.mk Mon Jun 23 14:50:59 2003 @@ -65,6 +65,7 @@ #$(PF_OUT)/%.f: $(THISDIR)/%.g ajax; $(AjaxToPF_out) #%.o: %.f; $(FCToSuite)\; @$(ProblemEcho) +$(ODIR)/%.$(PCH_EXTENSION): $(THISDIR)/%.h; $(PCHToSuite) # ACL:rcsinfo # ---------------------------------------------------------------------- diff -Nru a/r2/config/Shared/rules.mk b/r2/config/Shared/rules.mk --- a/r2/config/Shared/rules.mk Mon Jun 23 14:50:59 2003 +++ b/r2/config/Shared/rules.mk Mon Jun 23 14:50:59 2003 @@ -54,7 +54,7 @@ ifeq ("$(INSTREPO)", "1") clean:: @echo Removing all .o files from suite $(SUITE) beneath `pwd` - rm -f $(INSTANTIATION_DIR)/*.o; \ + @rm -f $(INSTANTIATION_DIR)/*.o; @$(foreach dir,$(shell $(FIND) . -name $(SUITE)),rm -f $(dir)/*.o;) else clean:: @@ -107,7 +107,7 @@ install:: @echo Installing files to directory $(INSTALL_DIR) ... ; \ cd $(PROJECT_ROOT) ; \ - bin/makeinstall $(SUITE) $(INSTALL_DIR) $(INSTALL_ARCH) $(INSTALL_LIBEXT) $(INSTALL_EXT) + bin/makeinstall $(SUITE) $(INSTALL_DIR) $(INSTALL_ARCH) $(INSTALL_LIBEXT) "$(PCH_EXTENSION)" $(INSTALL_EXT) # Create a distribution file for this project, and place it in # the top level of the build tree. @@ -227,7 +227,7 @@ $(ToSuiteSetup) @echo Linker location: `which $(LD)` >> $(INFO_FILE); @echo LinkToSuite... See $(subst $(THISDIR)/,,$(INFO_FILE));\ - echo "$(PRE_CMDLINE) $(LD_LINK_CMDLINE)" | $(PERL) $(SHARED_ROOT)/pretty.pl ld >> $(INFO_FILE); + echo "$(PRE_CMDLINE) $(LD_LINK_CMDLINE)" >> $(INFO_FILE); @$(PRE_CMDLINE) $(LD_LINK_CMDLINE) $(SUITE_REDIRECT) @-ln -f $@ $@_$(PASS) $(infotimestamp) @@ -265,7 +265,7 @@ $(ToSuiteSetup) @echo Compiler location: `which $(CXX)` >> $(INFO_FILE) @echo CXXToSuite... 
See $(subst $(THISDIR)/,,$(INFO_FILE)) - @echo "$(PRE_CMDLINE) $(CXX_COMPILE_CMDLINE)" | $(PERL) $(SHARED_ROOT)/pretty.pl cc >> $(INFO_FILE) + @echo "$(PRE_CMDLINE) $(CXX_COMPILE_CMDLINE)" >> $(INFO_FILE) @$(PRE_CMDLINE) $(CXX_COMPILE_CMDLINE) $(SUITE_REDIRECT) $(infotimestamp) endef @@ -331,6 +331,15 @@ $(infotimestamp) endef +define PCHToSuite + $(ToSuiteSetup) + @echo Compiler location: `which $(CXX)` >> $(INFO_FILE) + @echo PCHToSuite... See $(subst $(THISDIR)/,,$(INFO_FILE)) + @echo "$(PRE_CMDLINE) $(CXX_PCH_CMDLINE)" >> $(INFO_FILE) + @$(PRE_CMDLINE) $(CXX_PCH_CMDLINE) $(SUITE_REDIRECT) + $(infotimestamp) +endef + define maketargetdir if [ ! -d "$(dir $@)" ]; then mkdir $(dir $@); fi;\ if [ ! -h "$(dir $@)Makefile" ]; then ln -s $(SHARED_ROOT)/Makefile $(dir $@)Makefile; fi diff -Nru a/r2/config/Shared/tail.mk b/r2/config/Shared/tail.mk --- a/r2/config/Shared/tail.mk Mon Jun 23 14:50:59 2003 +++ b/r2/config/Shared/tail.mk Mon Jun 23 14:50:59 2003 @@ -89,6 +89,9 @@ # Build command line for prelink step PRELINK_CMDLINE = $(PRELINK) $(RULE_PRELINK_OPTS) $(filter %.o,$+) +# PCH command line for C++ compiler +CXX_PCH_CMDLINE = $(CXX) -c $< -o $@ $(SUITE_DEFINES) $(RULE_CXXOPTS) $(RULE_INCLUDES) + # Build command line for archiver ifeq ("$(INSTREPO)", "1") AR_CMDLINE = $(AR) $(RULE_AR_OPTS) $@ $(filter %.o,$+) $(INSTANTIATION_DIR)/*.o diff -Nru a/r2/config/arch/LINUXgcc.conf b/r2/config/arch/LINUXgcc.conf --- a/r2/config/arch/LINUXgcc.conf Mon Jun 23 14:50:59 2003 +++ b/r2/config/arch/LINUXgcc.conf Mon Jun 23 14:50:59 2003 @@ -166,12 +166,14 @@ $cppverbose = "-v"; # flag for verbose compiler output $cpponeper = ""; # flag to turn on one-instantance-per-obj $cppstrict = " -ansi"; # flag for ANSI conformance checking +$cpppchextension = "h.gch"; # extension for pch files ### debug or optimized build settings for C++ applications $cppdbg_app = "-g"; -$cppopt_app = "-DNOPAssert -DNOCTAssert -O2 -fno-default-inline -funroll-loops -fstrict-aliasing"; +$cppopt_app = 
"-DNOPAssert -DNOCTAssert -O2 -march=athlon -fomit-frame-pointer -funroll-loops -ftime-report"; +#$cppopt_app = "-DNOPAssert -DNOCTAssert -O2 -fomit-frame-pointer -funroll-loops --param max-inline-slope=1000000"; ### debug or optimized build settings for C++ libraries diff -Nru a/r2/configure b/r2/configure --- a/r2/configure Mon Jun 23 14:50:59 2003 +++ b/r2/configure Mon Jun 23 14:50:59 2003 @@ -210,6 +210,7 @@ $cheetahnm = "--messaging"; $strictnm = "--strict"; $archfnsnm = "--arch-specific-functions"; +$pchnm = "--pch"; ### configure options $dbgprntnm = "-v"; # turn on verbose output from configure @@ -273,6 +274,7 @@ [$arargnm, "", "include in the archiver args."], [$linknm, "", "use for the linker application."], [$linkargnm, "", "include in the linker args."], + [$pchnm, "", "generate precompiled headers."], [$dbgprntnm, "", "turn on verbose output from configure."], [$nooverwritenm, "", "force check of whether to overwrite files."], [$overwritenm, "", "force overwrite of all files."], @@ -485,6 +487,7 @@ $cppopt_app = ""; $cppdbg_lib = ""; $cppdbg_app = ""; +$cpppchextension = ""; ### the name and arguments for the C compiler $c = ""; @@ -2021,6 +2024,12 @@ print FSUITE "INSTREPO = 1\n"; print FSUITE "INSTANTIATION_DIR = $instantiationdir\n"; print FSUITE "\n"; + } + + if (scalar @{$arghash{$pchnm}} > 1) + { + print FSUITE "GENERATE_PCH = 1\n"; + print FSUITE "PCH_EXTENSION = $cpppchextension\n"; } print FSUITE "### installation and distribution information\n"; diff -Nru a/r2/src/Pooma/objfile.mk b/r2/src/Pooma/objfile.mk --- a/r2/src/Pooma/objfile.mk Mon Jun 23 14:50:59 2003 +++ b/r2/src/Pooma/objfile.mk Mon Jun 23 14:50:59 2003 @@ -36,6 +36,13 @@ $(UNIQUE)_OBJS := \ $(ODIR)/Pooma.cmpl.o +ifneq ($(GENERATE_PCH),) +$(UNIQUE)_OBJS := $($(UNIQUE)_OBJS) \ + $(ODIR)/Arrays.$(PCH_EXTENSION) \ + $(ODIR)/Fields.$(PCH_EXTENSION) \ + $(ODIR)/Pooma.$(PCH_EXTENSION) +endif + LOCAL_OBJS += $($(UNIQUE)_OBJS) # Set rules for the ODIR directory From mark at 
codesourcery.com Mon Jun 23 14:46:12 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: 23 Jun 2003 07:46:12 -0700
Subject: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To: <1056373064.10530.ezmlm@pooma.codesourcery.com>
References: <1056373064.10530.ezmlm@pooma.codesourcery.com>
Message-ID: <1056379573.9786.8.camel@doubledemon.codesourcery.com>

Richard --

I think it would be great to overhaul the POOMA build system to use
autoconf and more traditional makefiles. There is no real harm if we
break things on a few platforms; they can get put back together.

I think it's even OK to assume GNU Make if that makes things simpler.

Yours,

--
Mark Mitchell
CodeSourcery, LLC
mark at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Mon Jun 23 14:55:09 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 23 Jun 2003 16:55:09 +0200 (CEST)
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To: <1056379573.9786.8.camel@doubledemon.codesourcery.com>
Message-ID:

On 23 Jun 2003, Mark Mitchell wrote:

> Richard --
>
> I think it would be great to overhaul the POOMA build system to use
> autoconf and more traditional makefiles. There is no real harm if we
> break things on a few platforms; they can get put back together.
>
> I think it's even OK to assume GNU Make if that makes things simpler.

Even if we break application Makefiles that include the installed pooma
suite files with their rules? I'd like to go the full way or not at all
here, if possible. And vpath builds (which require gmake) would make the
POOMASUITE stuff go away.

Richard.
--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From mark at codesourcery.com Mon Jun 23 15:23:12 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: 23 Jun 2003 08:23:12 -0700
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To:
References:
Message-ID: <1056381792.9720.32.camel@doubledemon.codesourcery.com>

On Mon, 2003-06-23 at 07:55, Richard Guenther wrote:
> On 23 Jun 2003, Mark Mitchell wrote:
>
> > Richard --
> >
> > I think it would be great to overhaul the POOMA build system to use
> > autoconf and more traditional makefiles. There is no real harm if we
> > break things on a few platforms; they can get put back together.
> >
> > I think it's even OK to assume GNU Make if that makes things simpler.
>
> Even if we break application Makefiles that include the installed pooma
> suite files with its rules?

Yes, I think it's OK if we break old application Makefiles.

We will of course need to fix the documentation when we do that.

We don't have time to do this work internally, but if it were something
that you wanted to work on, I'd be supportive. We should also see what
Jeffrey has to say on the topic.

--
Mark Mitchell
CodeSourcery, LLC
mark at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Mon Jun 23 15:42:23 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Mon, 23 Jun 2003 17:42:23 +0200 (CEST)
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To: <1056381792.9720.32.camel@doubledemon.codesourcery.com>
Message-ID:

On 23 Jun 2003, Mark Mitchell wrote:

> On Mon, 2003-06-23 at 07:55, Richard Guenther wrote:
> > On 23 Jun 2003, Mark Mitchell wrote:
> >
> > > Richard --
> > >
> > > I think it would be great to overhaul the POOMA build system to use
> > > autoconf and more traditional makefiles. There is no real harm if we
> > > break things on a few platforms; they can get put back together.
> > >
> > > I think it's even OK to assume GNU Make if that makes things simpler.
> >
> > Even if we break application Makefiles that include the installed pooma
> > suite files with its rules?
>
> Yes, I think it's OK if we break old application Makefiles.
>
> We will of course need to fix the documentation when we do that.
>
> We don't have time to do this work internally, but if it were something
> that you wanted to work on, I'd be supportive. We should also see what
> Jeffrey has to say on the topic.

Ok, the things I am able to spend time on are:
 - use autoconf/automake/libtool to get rid of the current build system
 - make build/install/dist work for the targets I have access to and
   care about (linux gcc/icc, with and without cheetah for the moment)
 - update the docs
 - I can probably give a hand if other people need to do fixes for
   other targets/compilers

In this process we're going to lose a lot of the current configure
options, and also (initially) support for tau, paws, etc. (but not
cheetah).

If more people would like to work on this, it can be done on a branch;
if not, I can try to do it locally in a bk repository.

Richard.

From mark at codesourcery.com Mon Jun 23 15:50:34 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: 23 Jun 2003 08:50:34 -0700
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To:
References:
Message-ID: <1056383434.9786.53.camel@doubledemon.codesourcery.com>

> Ok, the things I am able to spend time on are:
> - use autoconf/automake/libtool to get rid of current build system

I'd rather see us avoid automake and libtool, if we can. I don't think
we should need automake if we use GNU make, probably. I'm not sure
about libtool, but adding libtool to a build process seems to make it
complex, fragile, and slow almost all of the time...

After all, POOMA is only going to be used on a few real systems these
days -- not a whole lot of 68K HP-UX boxes running POOMA...

> In this process we're going to lose a lot of the current configure
> options, also (initially) support for tau, paws, etc. (but not cheetah).

Yes, I'm not sure how much people need those things these days, but we
can add them back with --enable-tau, etc.

> If more people like to work on this, this can be done on a branch, if not,
> I can try to do this locally in a bk repository.

A branch is probably a good idea.

--
Mark Mitchell
CodeSourcery, LLC
mark at codesourcery.com

From rguenth at tat.physik.uni-tuebingen.de Tue Jun 24 11:07:53 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Tue, 24 Jun 2003 13:07:53 +0200 (CEST)
Subject: POOMA+CHEETAH
In-Reply-To:
Message-ID:

On Tue, 24 Jun 2003, Yaqoub El Khamra wrote:

>
>
> Dear Sir
> I fixed the cheetah config and the cheetah tests compile nicely. I still have a
> few questions though: how do I compile a pooma cpp with cheetah enabled? I get a
> missing .h file, the cheetahconfiguration.h file, which is for some reason in the
> codewarrior directory. Even after I give its location to the compiler, it
> tries to get the console.h (I give its location in the inc arguments as well)
> and then it gives me undefined errors for size_t and such. Is the
> cheetahconfiguration file in the codewarrior directory the correct file? If
> yes, where should I put it and what header files should I include along with it?
> Thank you

I think all the Windows ide/ directories are way out of sync. And as I
don't use Windows, I cannot help you here - CC'ed to the pooma-dev
mailing list. I assume you are using pooma r2 as can be downloaded from
CodeSourcery, or do you use a different version, or even CVS?

Richard.
--
Richard Guenther
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

From rguenth at tat.physik.uni-tuebingen.de Thu Jun 26 18:23:43 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 26 Jun 2003 20:23:43 +0200 (CEST)
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To: <1056383434.9786.53.camel@doubledemon.codesourcery.com>
Message-ID:

On 23 Jun 2003, Mark Mitchell wrote:

> > Ok, the things I am able to spend time on are:
> > - use autoconf/automake/libtool to get rid of current build system
>
> I'd rather see us avoid automake and libtool, if we can. I don't think
> we should need automake if we use GNU make, probably. I'm not sure
> about libtool, but adding libtool to a build process seems to make it
> complex, fragile, and slow almost all of the time...
>
> After all, POOMA is only going to be used on a few real systems these
> days -- not a whole lot of 68K HP-UX boxes running POOMA...

Ok, I can see this makes sense somehow. The only problem I see is building
shared libraries - though I think usage of shared libraries in high
performance computation is very rare. Though, of course, using automake
will help doing dependencies and installation rules. So while I'm in
favour of dropping shared library support (and thus libtool), I'd be
rather not willing to miss automake.

> > In this process we're going to lose a lot of the current configure
> > options, also (initially) support for tau, paws, etc. (but not cheetah).
>
> Yes, I'm not sure how much people need those things these days, but we
> can add them back with --enable-tau, etc.

Yes. Last time I checked, most of those packages are neither maintained
anymore, nor do they compile on recent compilers/distros.

> > If more people like to work on this, this can be done on a branch, if not,
> > I can try to do this locally in a bk repository.
>
> A branch is probably a good idea.

Yes, I think so, too. Jeffrey?

Richard.
From jxyh at lanl.gov Thu Jun 26 18:40:13 2003
From: jxyh at lanl.gov (John H. Hall)
Date: Thu, 26 Jun 2003 12:40:13 -0600
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To:
Message-ID: <9F6887DA-A805-11D7-BA00-0003938E6E0A@lanl.gov>

Richard, et al.:
Shared libraries are a big deal for the old major customer of POOMA, the
Blanca Project. Link times with shared libraries were minutes compared
to hours for static libraries. While a final optimized release might
not want to rely on shared libraries for some performance aspect, the
debug cycle will probably need them. On the SGIs and Q machines almost
all the system libraries are shared libraries, so I am not really sure
what impact shared libraries have on the final code performance.
John Hall

On Thursday, June 26, 2003, at 12:23 PM, Richard Guenther wrote:

> On 23 Jun 2003, Mark Mitchell wrote:
>
>>> Ok, the things I am able to spend time on are:
>>> - use autoconf/automake/libtool to get rid of current build system
>>
>> I'd rather see us avoid automake and libtool, if we can. I don't think
>> we should need automake if we use GNU make, probably. I'm not sure
>> about libtool, but adding libtool to a build process seems to make it
>> complex, fragile, and slow almost all of the time...
>>
>> After all, POOMA is only going to be used on a few real systems these
>> days -- not a whole lot of 68K HP-UX boxes running POOMA...
>
> Ok, I can see this makes sense somehow. The only problem I see is building
> shared libraries - though I think usage of shared libraries in high
> performance computation is very rare. Though, of course, using automake
> will help doing dependencies and installation rules. So while I'm in
> favour of dropping shared library support (and thus libtool), I'd be
> rather not willing to miss automake.
>
>>> In this process we're going to lose a lot of the current configure
>>> options, also (initially) support for tau, paws, etc. (but not cheetah).
>>
>> Yes, I'm not sure how much people need those things these days, but we
>> can add them back with --enable-tau, etc.
>
> Yes. Last time I checked, most of those packages are neither maintained
> anymore, nor do they compile on recent compilers/distros.
>
>>> If more people like to work on this, this can be done on a branch,
>>> if not, I can try to do this locally in a bk repository.
>>
>> A branch is probably a good idea.
>
> Yes, I think so, too. Jeffrey?
>
> Richard.

From rguenth at tat.physik.uni-tuebingen.de Thu Jun 26 19:13:31 2003
From: rguenth at tat.physik.uni-tuebingen.de (Richard Guenther)
Date: Thu, 26 Jun 2003 21:13:31 +0200 (CEST)
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To: <9F6887DA-A805-11D7-BA00-0003938E6E0A@lanl.gov>
Message-ID:

On Thu, 26 Jun 2003, John H. Hall wrote:

> Richard, et al.:
> Shared libraries are big deal for the old major customer of POOMA, the
> Blanca Project. Link times with shared libraries were minutes compared
> to hours for static libraries. While a final optimized release might
> not want to rely on shared libraries fro some performance aspect, the
> debug cycle will probably need them. On the SGI's and Q machines almost
> all the system libraries are shared libraries, so I am not really sure
> what impact shared libraries have on the final code performance.
> John Hall

Shared libraries with Linux usually mean you have restricted heap size
and portability issues if build machine != computation host. But I take
your argument for link times.

So I vote against re-inventing libtool just to avoid using it. This makes
the proposed tools, again, autoconf, automake and libtool.

Richard.
From mark at codesourcery.com Thu Jun 26 22:29:30 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: 26 Jun 2003 15:29:30 -0700
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To:
References:
Message-ID: <1056666571.4424.13.camel@doubledemon.codesourcery.com>

> Ok, I can see this makes sense somehow. The only problem I see is building
> shared libraries - though I think usage of shared libraries in high
> performance computation is very rare. Though, of course, using automake
> will help doing dependencies and installation rules. So while I'm in
> favour of dropping shared library support (and thus libtool), I'd be
> rather not willing to miss automake.

I've never really found automake to be much of a win -- relative to GNU
make -- but then again, I've not found it to be much of a loss either.
And you are volunteering to do the work!

So, I'll back out of this debate. :-)

Thanks,

--
Mark Mitchell
CodeSourcery, LLC
mark at codesourcery.com

From mark at codesourcery.com Thu Jun 26 22:35:37 2003
From: mark at codesourcery.com (Mark Mitchell)
Date: 26 Jun 2003 15:35:37 -0700
Subject: [pooma-dev] Re: pooma-dev Digest 23 Jun 2003 12:57:44 -0000 Issue 243
In-Reply-To:
References:
Message-ID: <1056666937.4398.16.camel@doubledemon.codesourcery.com>

> So I vote against re-inventing libtool just to avoid using it. This makes
> the proposed tools, again, autoconf, automake and libtool.

OK by me.

Jeffrey should weigh in, if he wants to do so.

Thanks!

--
Mark Mitchell
CodeSourcery, LLC
mark at codesourcery.com