Remote access of distributed multipatched arrays

Arno Candel candel at itp.phys.ethz.ch
Sun Apr 6 17:00:33 UTC 2003


Hi everybody,

I have run into a problem with local if-statements that read values from 
several distributed multipatched brick arrays in order to calculate the 
values of another such array.

Below I have included a simple test program that demonstrates the 
problem. Three 3D distributed multipatched brick arrays are created on 
different domains, so their patches are distributed differently. Two of 
the arrays are then used to calculate the values of the third, using 
local if-statements. Both a "stupid serial" and a "parallel" version of 
the calculation are implemented. In my view, data-parallel expressions 
or stencils could not be used here.

The "stupid serial" version runs flawlessly, but takes far too long in 
parallel execution, as every node does all the work.

Unfortunately, the "parallel" version crashes when executed in parallel 
(mpirun -np 2 Test -mpi), as remote access to values of distributed 
arrays seems to be forbidden. Because the arrays have different domains, 
they are distributed differently across the contexts, and simultaneous 
local access is not possible everywhere. I would expect that there is a 
way to allow remote access via communication in a loop over local 
patches. This would still scale much better than the "stupid serial" 
version.
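
To illustrate why the three arrays end up with mismatched ownership, here 
is a minimal sketch in plain C++ (no POOMA). It assumes an even split of 
the z-range in which the leading blocks absorb the remainder; the actual 
rule used by GridPartition may differ, and guard layers are ignored. The 
names ownerOf and countMismatches are hypothetical helpers, not part of 
POOMA.

```cpp
#include <cassert>

// Hypothetical helper: index of the block (context) that owns k when the
// inclusive range [first, last] is split into `parts` contiguous blocks.
// ASSUMPTION: even split, leading blocks take the remainder. This is not
// necessarily the rule POOMA's GridPartition applies.
int ownerOf(int k, int first, int last, int parts) {
    assert(parts > 0 && first <= k && k <= last);
    int n = last - first + 1;
    int base = n / parts;           // size of the smaller blocks
    int extra = n % parts;          // number of blocks with one extra element
    int offset = k - first;
    int big = (base + 1) * extra;   // elements covered by the larger blocks
    if (offset < big) return offset / (base + 1);
    return extra + (offset - big) / base;
}

// Counts the k in A's z-range for which element k of A and element
// k + shift of another array live on different contexts.
int countMismatches(int aFirst, int aLast,
                    int oFirst, int oLast, int shift, int parts) {
    int mismatches = 0;
    for (int k = aFirst; k <= aLast; ++k) {
        if (ownerOf(k, aFirst, aLast, parts) !=
            ownerOf(k + shift, oFirst, oLast, parts)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

With MZ=7 and two contexts, this split puts k=5 of A on context 1, while 
C(k+1)=C(6) sits on context 0, since the z-range of C is 1..14. Under the 
stated assumption, countMismatches(1, 7, 1, 14, 1, 2) counts two such 
planes for the C(x+ez) access.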

I would greatly appreciate your suggestions and comments.
Arno Candel




simple test program:
**********************************************************************

#include <cmath>      // sin, cos
#include <iostream>   // std::endl
#include "Pooma/Pooma.h"
#include "Pooma/Particles.h"
#include "Pooma/BrickArrays.h"
#include "Pooma/Arrays.h"
#include "Utilities/Clock.h"
#include "Layout/GridLayout.h"

struct FillB {
    double operator()(int i, int j, int k) const { return i*sin(i*j/2.0)/(k+1); }
};
struct FillC {
    double operator()(int i, int j, int k) const { return k*cos(3.0*j*k)/(i+1); }
};

int main(int argc, char* argv[])
{
    Pooma::initialize(argc,argv);
    Pooma::Clock clock;
    double time;
    Inform pout("Test");

    Loc<3> ex=Loc<3>(1,0,0);
    Loc<3> ez=Loc<3>(0,0,1);
    
    int MX=3,MY=4,MZ=7;                        // dimensions of calculational domain
    Loc<3> blocks(1,1,Pooma::contexts());      // parallelization along z-axis
    GuardLayers<3> intguards(1), extguards(0); // one internal guard layer, no external guard layer
    GridPartition<3> Part=GridPartition<3>(blocks,intguards,extguards);

    // Create three distributed MultiPatch brick arrays
    // with differing domains and thus different distributions among contexts
    Array<3, double, MultiPatch<GridTag, Remote<Brick> > > A, B, C;
    Interval<3> ADom=Interval<3>(Interval<1>(1,MX), Interval<1>(1,MY), Interval<1>(1,MZ));
    Interval<3> BDom=Interval<3>(Interval<1>(0,MX), Interval<1>(1,MY), Interval<1>(0,MZ+1));
    Interval<3> CDom=Interval<3>(Interval<1>(0,MX), Interval<1>(1,MY), Interval<1>(1,2*MZ));
    GridLayout<3> Alayout = GridLayout<3>(ADom,Part,DistributedTag());
    GridLayout<3> Blayout = GridLayout<3>(BDom,Part,DistributedTag());
    GridLayout<3> Clayout = GridLayout<3>(CDom,Part,DistributedTag());
    A.initialize(Alayout);
    B.initialize(Blayout);
    C.initialize(Clayout);

    A=1;
    B=Array<3, double, IndexFunction<FillB> >();
    C=Array<3, double, IndexFunction<FillC> >();


    // "stupid serial" version
    Pooma::blockAndEvaluate();
    time=clock.value();
    for (int i=ADom[0].first();i<=ADom[0].last();i++)
    for (int j=ADom[1].first();j<=ADom[1].last();j++)
    for (int k=ADom[2].first();k<=ADom[2].last();k++) {
        Loc<3> x(i,j,k);
        if ( (B(x+ez)<0.5) && (C(x-ex)>0.3) ) {
            A(x)=B(x-ez-ex)+C(x+ez);
        }
    }
    Pooma::blockAndEvaluate();
    pout << A << "\nstupid serial version took " << clock.value()-time << " secs" << std::endl;


    // "parallel" version, iterate only over local patches of A
    Pooma::blockAndEvaluate();
    time=clock.value();
    for (GridLayout<3>::const_iterator it = Alayout.beginLocal(); it != Alayout.endLocal(); it++) {
        for (int i=it->domain()[0].first();i<=it->domain()[0].last();i++)
        for (int j=it->domain()[1].first();j<=it->domain()[1].last();j++)
        for (int k=it->domain()[2].first();k<=it->domain()[2].last();k++) {
            Loc<3> x(i,j,k);
            if ( (B(x+ez)<0.5) && (C(x-ex)>0.3) ) {
                A(x)=B(x-ez-ex)+C(x+ez);
                // problem: local patches of A might not contain needed
                // values of B and C and remote access is forbidden!
                // Will crash in parallel execution!
            }
        }
    }
    Pooma::blockAndEvaluate();
    pout << A << "\nparallel version took " << clock.value()-time << " secs" << std::endl;


    Pooma::finalize();
    return 0;
}
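
The "parallel" loop above reads B at the offsets +ez and -ez-ex, and C at 
the offsets -ex and +ez. As a rough sketch of what a communication step 
would have to cover, the following plain C++ (no POOMA) computes, for one 
local patch of A, the bounding box of indices of another array that the 
loop touches. Index3, Box and requiredRegion are hypothetical names 
introduced only for this illustration; how the values inside that box 
would actually be fetched from a remote context is left open.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Index3 = std::array<int, 3>;

// Inclusive index box, analogous in spirit to a 3D interval.
struct Box {
    Index3 lo;
    Index3 hi;
};

// Hypothetical helper: bounding box of all indices x + offset, for x in
// `patch` and offset in `offsets`. Precondition: offsets is not empty.
Box requiredRegion(const Box& patch, const std::vector<Index3>& offsets) {
    assert(!offsets.empty());
    Index3 minOff = offsets.front();
    Index3 maxOff = offsets.front();
    for (const Index3& off : offsets) {
        for (int d = 0; d < 3; ++d) {
            minOff[d] = std::min(minOff[d], off[d]);
            maxOff[d] = std::max(maxOff[d], off[d]);
        }
    }
    Box region;
    for (int d = 0; d < 3; ++d) {
        region.lo[d] = patch.lo[d] + minOff[d];  // furthest reach downwards
        region.hi[d] = patch.hi[d] + maxOff[d];  // furthest reach upwards
    }
    return region;
}
```

For example, for a patch of A covering i=1..3, j=1..4, k=5..7 (one 
possible split with MZ=7 on two contexts), the offsets used for B give 
the box i=0..3, j=1..4, k=4..8, and the offsets used for C give 
i=0..3, j=1..4, k=5..8.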




More information about the pooma-dev mailing list