[RFC] Specialize (internal) guard cell handling, more data-flow analysis

Wed Jul 7 17:22:59 UTC 2004

I'm at the point of thinking about how to improve MPI scalarization even
further.  And again the obvious problem is that we're doing far too much
unnecessary communication.

The problem is that we "lower" the representation of guard cells and
their necessary updates too early (in the intersectors) and create
ordinary iterates out of them.  A better approach would be to only
compute the necessary guards at intersection time (as I introduced in
the previous performance improvement patches), _not_ update them, and
store this information in the iterates instead.  We can then, before
finally issuing the iterates, do data-flow analysis on the necessary
guard cells and insert optimal update iterates at optimal places (in
principle).
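
To make the idea a bit more concrete, here is a rough sketch of the
deferred scheme, assuming a straight-line sequence of iterates.  All
names (Iterate, readsGuards, invalidatesGuards, scheduleWithGuardUpdates)
are made up for illustration and are not existing POOMA types.  It is
only a single forward pass that inserts an update lazily, right before
the first iterate that reads a stale guard; finding truly optimal
placement would need more than this.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-patch iterate (not a POOMA type).
// Instead of updating guards eagerly, it only records what it needs.
struct Iterate {
    std::string name;
    std::set<int> readsGuards;        // guard ids that must be current
    std::set<int> invalidatesGuards;  // guard ids made stale by its writes
};

// Forward pass over the pending iterates: track which guards are stale
// and emit "update(g)" only in front of an iterate that really reads a
// stale guard g.  Guard ids are assumed to lie in [0, numGuards).
std::vector<std::string>
scheduleWithGuardUpdates(const std::vector<Iterate>& iterates,
                         std::size_t numGuards)
{
    std::vector<bool> stale(numGuards, true);  // nothing is current yet
    std::vector<std::string> schedule;

    for (const Iterate& it : iterates) {
        for (int g : it.readsGuards) {
            if (stale[g]) {
                schedule.push_back("update(" + std::to_string(g) + ")");
                stale[g] = false;
            }
        }
        schedule.push_back(it.name);
        for (int g : it.invalidatesGuards) {
            stale[g] = true;
        }
    }
    return schedule;
}
```

With this, two consecutive iterates reading the same guard trigger only
one update, and a guard that is invalidated but never read again is
never updated at all.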

Since iterates are per-patch, as part of getting the above done I'd
suggest moving the dirty flag and its handling from the MultiPatchEngine
down to the BrickEngine, together with more accurate information about
how up to date the guards are (using, for instance, a GuardLayer<> to
count the up-to-date cells).
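
As a sketch of what such per-patch bookkeeping could look like: the
types below (GuardCounts, PatchGuardState) are hypothetical and only
mimic the idea of a per-face guard-layer count; they are not the actual
POOMA classes, and the interface is just an assumption about what the
BrickEngine would need.

```cpp
#include <algorithm>
#include <array>

// Hypothetical per-face layer counts (lower and upper face per dimension).
template <int Dim>
struct GuardCounts {
    std::array<int, Dim> lower{};
    std::array<int, Dim> upper{};
};

// Hypothetical per-patch replacement for a single global dirty flag:
// it counts how many guard layers on each face are currently up to date.
template <int Dim>
class PatchGuardState {
public:
    explicit PatchGuardState(int width) : width_(width) {}

    // Data the guards mirror was written elsewhere: nothing is valid.
    void invalidate() { valid_ = GuardCounts<Dim>{}; }

    // Record that `layers` guard layers on one face were refreshed.
    void markUpdated(int dim, bool upperFace, int layers) {
        int& count = upperFace ? valid_.upper[dim] : valid_.lower[dim];
        count = std::max(count, std::min(layers, width_));
    }

    // True if any face has fewer valid layers than the request needs.
    bool needsUpdate(const GuardCounts<Dim>& required) const {
        for (int d = 0; d < Dim; ++d) {
            if (valid_.lower[d] < required.lower[d] ||
                valid_.upper[d] < required.upper[d]) {
                return true;
            }
        }
        return false;
    }

private:
    int width_;                 // allocated guard width per face
    GuardCounts<Dim> valid_{};  // up-to-date layers per face
};
```

An expression that only needs one layer on the lower x face could then
skip communication for every other face, and skip it entirely if that
layer is still valid.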

Were there any previous plans for improving this situation within POOMA?
Have you never experienced performance problems with the communication?
How did SMARTS improve the situation with the guard updates (did it at
all)?

Thanks for any input (before I start hacking this up)!

Richard.