Remote access of distributed multipatched arrays
Arno Candel
candel at itp.phys.ethz.ch
Sun Apr 6 17:00:33 UTC 2003
Hi everybody,
I have run into a problem with local if-statements that use values from
several distributed multipatch brick arrays to calculate the values of
another such array.
Below I have included a simple test program which shows the relevant
problem. Three 3D distributed multipatched brick arrays are created and
allocated on different domains and thus on different patches. Two of the
arrays are then used to calculate the values of the third, using local
if-statements. Both a "stupid serial" and a "parallel" version of the
calculation are implemented. As far as I can see, data-parallel
expressions or stencils cannot be used here.
The "stupid serial" version runs flawlessly, but takes far too long in
parallel execution, as every node does all the work.
Unfortunately, the "parallel" version crashes when executed in parallel
(mpirun -np 2 Test -mpi), as remote access to values of distributed
arrays seems to be forbidden. Because the arrays have different domains,
they are distributed differently across the contexts, so the values
needed are not locally available everywhere. I would expect there to be
a way to allow remote access via communication in a loop over local
patches. This would still scale much better than the "stupid serial"
version.
I would greatly appreciate your suggestions and comments.
Arno Candel
simple test program:
**********************************************************************
#include "Pooma/Pooma.h"
#include "Pooma/Particles.h"
#include "Pooma/BrickArrays.h"
#include "Pooma/Arrays.h"
#include "Utilities/Clock.h"
#include "Layout/GridLayout.h"
#include <cmath>
struct FillB { double operator()(int i, int j, int k) const
{ return i*sin(i*j/2.0)/(k+1); } };
struct FillC { double operator()(int i, int j, int k) const
{ return k*cos(3.0*j*k)/(i+1); } };
int main(int argc, char* argv[])
{
Pooma::initialize(argc,argv);
Pooma::Clock clock;
double time;
Inform pout("Test");
Loc<3> ex=Loc<3>(1,0,0);
Loc<3> ez=Loc<3>(0,0,1);
// dimensions of calculational domain
int MX=3,MY=4,MZ=7;
// parallelization along z-axis
Loc<3> blocks(1,1,Pooma::contexts());
// one internal guard layer, no external guard layer
GuardLayers<3> intguards(1), extguards(0);
GridPartition<3> Part=GridPartition<3>(blocks,intguards,extguards);
// Create three distributed MultiPatch brick arrays
// with differing domains and thus different distributions among contexts
Array<3, double, MultiPatch<GridTag, Remote<Brick> > > A, B, C;
Interval<3> ADom=Interval<3>(Interval<1>(1,MX), Interval<1>(1,MY),
Interval<1>(1,MZ));
Interval<3> BDom=Interval<3>(Interval<1>(0,MX), Interval<1>(1,MY),
Interval<1>(0,MZ+1));
Interval<3> CDom=Interval<3>(Interval<1>(0,MX), Interval<1>(1,MY),
Interval<1>(1,2*MZ));
GridLayout<3> Alayout = GridLayout<3>(ADom,Part,DistributedTag());
GridLayout<3> Blayout = GridLayout<3>(BDom,Part,DistributedTag());
GridLayout<3> Clayout = GridLayout<3>(CDom,Part,DistributedTag());
A.initialize(Alayout);
B.initialize(Blayout);
C.initialize(Clayout);
A=1;
B=Array<3, double, IndexFunction<FillB> >();
C=Array<3, double, IndexFunction<FillC> >();
// "stupid serial" version
Pooma::blockAndEvaluate();
time=clock.value();
for (int i=ADom[0].first();i<=ADom[0].last();i++)
for (int j=ADom[1].first();j<=ADom[1].last();j++)
for (int k=ADom[2].first();k<=ADom[2].last();k++) {
Loc<3> x(i,j,k);
if ( (B(x+ez)<0.5) && (C(x-ex)>0.3) ) {
A(x)=B(x-ez-ex)+C(x+ez);
}
}
Pooma::blockAndEvaluate();
pout << A << "\nstupid serial version took " << clock.value()-time
<< " secs" << std::endl;
// "parallel" version, iterate only over local patches of A
Pooma::blockAndEvaluate();
time=clock.value();
for (GridLayout<3>::const_iterator it = Alayout.beginLocal(); it !=
Alayout.endLocal(); it++) {
for (int i=it->domain()[0].first();i<=it->domain()[0].last();i++)
for (int j=it->domain()[1].first();j<=it->domain()[1].last();j++)
for (int k=it->domain()[2].first();k<=it->domain()[2].last();k++) {
Loc<3> x(i,j,k);
if ( (B(x+ez)<0.5) && (C(x-ex)>0.3) ) {
A(x)=B(x-ez-ex)+C(x+ez);
// problem: local patches of A might not contain needed values
// of B and C and remote access is forbidden!
// Will crash in parallel execution!
}
}
}
Pooma::blockAndEvaluate();
pout << A << "\nparallel version took " << clock.value()-time
<< " secs" << std::endl;
Pooma::finalize();
return 0;
}
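The mismatch between the patches of A, B and C can be illustrated with a small stand-alone sketch in plain C++, with no POOMA calls. It assumes that the z-extent of each domain is cut into contiguous, nearly equal blocks, one per context, and it ignores guard layers; the split that GridPartition actually produces may differ. The helper names ownerOf and remotePlanes are made up for this illustration.

```cpp
#include <cassert>

// Hypothetical helper (not a POOMA call): index of the block that owns
// point k when the inclusive range [first, last] is cut into nblocks
// contiguous pieces, the leading pieces taking any leftover points.
// Assumption: this mimics a uniform grid partition; guard layers are ignored.
int ownerOf(int k, int first, int last, int nblocks)
{
  assert(first <= k && k <= last && nblocks > 0);
  int n = last - first + 1;
  int base = n / nblocks, extra = n % nblocks;
  int off = k - first;
  int big = extra * (base + 1);        // points held by the larger blocks
  if (off < big) return off / (base + 1);
  return extra + (off - big) / base;
}

// Counts the z-planes of A on which at least one of the four reads in
//   if (B(x+ez) < 0.5 && C(x-ex) > 0.3) A(x) = B(x-ez-ex) + C(x+ez);
// touches a block owned by a different context than A(x) itself.
// z-ranges follow the test program: A 1..MZ, B 0..MZ+1, C 1..2*MZ.
int remotePlanes(int MZ, int contexts)
{
  int count = 0;
  for (int k = 1; k <= MZ; ++k) {
    int a = ownerOf(k, 1, MZ, contexts);
    bool local = ownerOf(k + 1, 0, MZ + 1, contexts) == a   // B(x+ez)
              && ownerOf(k - 1, 0, MZ + 1, contexts) == a   // B(x-ez-ex)
              && ownerOf(k,     1, 2 * MZ, contexts) == a   // C(x-ex)
              && ownerOf(k + 1, 1, 2 * MZ, contexts) == a;  // C(x+ez)
    if (!local) ++count;
  }
  return count;
}
```

Under these assumptions, remotePlanes(7, 2) reports that 4 of the 7 z-planes of A need data owned by the other context, while remotePlanes(7, 1) reports none. An internal guard layer of width one would cover only the reads that land directly next to a block boundary, so it does not account for the larger offset between the blocks of A and C.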