[PATCH] Fix reductions for MPI operation
Jeffrey D. Oldham
oldham at codesourcery.com
Mon Aug 23 23:59:39 UTC 2004
Richard Guenther wrote:
>
> This patch fixes (works around) a previously discovered problem
> (remember the WaitingIterate). I'm sure there is a real problem
> to fix (at least for MPI - I'm not sure about Cheetah), and this
> is the least intrusive way of fixing it until the right idea for
> a cross-context csem-like mechanism pops up.
>
> Without this patch, random lockups during reductions may occur.
>
> Ok?
>
> Richard.
>
>
> 2004Aug21 Richard Guenther <richard.guenther at uni-tuebingen.de>
>
> * src/Engine/RemoteEngine.h: For MPI avoid doing blocking
> operation during reductions while iterates are still pending.
Yes, this is fine.
>------------------------------------------------------------------------
>
>Index: src/Engine/RemoteEngine.h
>===================================================================
>RCS file: /home/pooma/Repository/r2/src/Engine/RemoteEngine.h,v
>retrieving revision 1.42
>diff -u -u -r1.42 RemoteEngine.h
>--- src/Engine/RemoteEngine.h 19 Jan 2004 22:04:33 -0000 1.42
>+++ src/Engine/RemoteEngine.h 21 Aug 2004 20:10:06 -0000
>@@ -2065,6 +2065,11 @@
> Pooma::scheduler().endGeneration();
>
> csem.wait();
>+#if POOMA_MPI
>+ // The above single thread waiting has the same problem as with
>+ // the MultiPatch variant. So fix it.
>+ Pooma::blockAndEvaluate();
>+#endif
>
> RemoteProxy<T> globalRet(ret, computationContext);
> ret = globalRet;
>@@ -2186,6 +2191,27 @@
>
> Pooma::scheduler().endGeneration();
> csem.wait();
>+#if POOMA_MPI
>+ // We need to wait for Reductions on _all_ contexts to complete
>+ // here, as we may otherwise fail to issue an igc update send iterate that a
>+ // remote context waits for. Consider the 2-patch setup
>+ // a,b | g| | g|
>+ // with the expressions
>+ // a(I) = b(I+1);
>+ // bool res = all(a(I) == 0);
>+ // here we issue the following iterates:
>+ // 0: guard receive from 1 (write request b)
>+ // 1: guard send to 0 (read request b)
>+ // 0/1: expression iterate (read request b, write request a)
>+ // 0/1: reduction (read request a)
>+ // 0/1: blocking MPI_XXX
>+ // here the guard send from 1 to 0 can be skipped starting the
>+ // blocking MPI operation prematurely while context 0 needs to
>+ // wait for this send to complete in order to execute the expression.
>+ //
>+ // The easiest way (and the only available) is to blockAndEvaluate().
>+ Pooma::blockAndEvaluate();
>+#endif
>
> if (n > 0)
> {
>
>
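
For readers who want the ordering problem from the second hunk's comment in isolation, here is a minimal sketch. It is a toy model only, not POOMA or MPI code: ToyContext, drain() and guardSentBeforeBlocking() are hypothetical names made up for illustration, and drain() merely stands in for the role Pooma::blockAndEvaluate() plays in the patch. It models context 1, which still has its guard send queued when it reaches the blocking operation.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Hypothetical stand-in for one context's queue of not-yet-run iterates.
// This is NOT the POOMA scheduler, just a toy model of it.
struct ToyContext {
  std::deque<std::function<void()>> pending;

  // Run everything queued so far. This plays the role that
  // Pooma::blockAndEvaluate() plays in the patch.
  void drain() {
    while (!pending.empty()) {
      std::function<void()> iterate = pending.front();
      pending.pop_front();
      iterate();
    }
  }
};

// Models context 1 up to the point where it enters the blocking collective.
// Returns true if the guard send that context 0 depends on was issued first.
bool guardSentBeforeBlocking(bool drainFirst) {
  bool guardSent = false;
  ToyContext context1;

  // The guard send to context 0 is queued but has not run yet.
  context1.pending.push_back([&guardSent] { guardSent = true; });

  if (drainFirst) {
    context1.drain();
  }

  // Context 1 would now sit in the blocking call; in this toy model,
  // iterates still in the queue are never run from here on.
  return guardSent;
}
```

In this model guardSentBeforeBlocking(false) returns false, which corresponds to context 0 waiting forever for its guard data, while guardSentBeforeBlocking(true) returns true. The real scheduler is of course more involved; the sketch only shows why draining pending iterates before the blocking call avoids the lockup.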
--
Jeffrey D. Oldham
oldham at codesourcery.com