Explanation of blockAndEvaluate()
Jeffrey Oldham
oldham at codesourcery.com
Tue Dec 4 20:43:13 UTC 2001
Mark requested that Stephen Smith's explanation be posted to the
pooma-dev mailing list so it is archived for posterity.
Jeffrey's complaint:
> When I run the attached Pooma program (from examples/Manual/Doof2d/)
> for one-processor, it works fine, returning 55.0221 for 4 averagings
> and an array size of 20. When I run it with Pooma configured with
> --messaging and use the MM Shared Memory Library, it returns 0. Just
> before the blockAndEvaluate() call, the "b" array has the proper value
> but afterwards it has changed to zero. Why? Why is it ever dangerous
> to call blockAndEvaluate()? How do I explain when to call
> blockAndEvaluate()?
The program is attached.
Stephen Smith's (stephens at proximation.com) reply:
> This code is missing a blockAndEvaluate(); it should look
> like:
>
> a = b = 0;
> Pooma::blockAndEvaluate();
> b(n/2,n/2) = 1000.0;
>
> Currently the default is that all code is dangerous, which may
> not be a good thing. To ensure correctness you either need
> to run with --poomaBlockingExpressions or add blockAndEvaluate()
> in all the necessary places.
>
> Here's the basic issue:
>
> 1: a = b;
> 2: c = a;
> 3: e = c;
> 4: c(5) = 7;
> 5: d = c + e;
> 6: cout << d(5) << d(3) << endl;
>
> For this code to work correctly, the data-parallel expressions
> writing to c must be done before statement 4 is run and the
> data-parallel expression writing to d must be done before the
> line that prints values from d. Using blockingExpressions()
> ensures correctness by inserting blockAndEvaluate() after EVERY
> data-parallel statement:
>
> 1: a = b;
> blockAndEvaluate();
> 2: c = a;
> blockAndEvaluate();
> 3: e = c;
> blockAndEvaluate();
> 4: c(5) = 7;
> 5: d = c + e;
> blockAndEvaluate();
> 6: cout << d(5) << d(3) << endl;
>
> This may not be very efficient when the arrays are decomposed
> into patches, because all the patches in statement 1 must execute
> before any from statement 2. It would be a lot more cache efficient
> to perform (a = b; c = a; e = c;) on one patch, then move to the next
> patch.
>
> In the past, my recommendation to users was to add blockAndEvaluate
> immediately before any serial code:
>
> 1: a = b;
> 2: c = a;
> 3: e = c;
> blockAndEvaluate();
> 4: c(5) = 7;
> 5: d = c + e;
> blockAndEvaluate();
> 6: cout << d(5) << d(3) << endl;
>
> This approach is guaranteed to ensure correctness. There was no
> way for us to implement this automatically. We know inside POOMA
> every time a data-parallel expression occurs, but we don't know what
> the next statement is going to be. There's no simple way to check for
> serial access without slowing the code down incredibly. All the inner
> loops which get run by SMARTS also access elements through operator(),
> so we would have to put an if test for every element access that would
> say "Are we running inside the evaluator, or back in the user's code?"
>
> So the use of blockAndEvaluate is an optimization. Perhaps it would be
> better to make --blockingExpressions the default and if users want more
> efficient code they can add the necessary blockAndEvaluates and run
> --withoutBlockingExpressions. Note that if they really understand
> the parallelism issues, they could get trickier:
>
> 1: a = b;
> 2: c = a;
> blockAndEvaluate();
> 3: e = c;
> 4: c(5) = 7;
> 5: d = c + e;
> blockAndEvaluate();
> 6: cout << d(5) << d(3) << endl;
>
> is also correct because we've guaranteed that c has been computed. Note
> that blockAndEvaluate() causes EVERY expression to finally be computed.
> We had at one point thought about a more specific syntax:
>
> blockOnEvaluation(c);
> c(5) = 7;
>
> This syntax would ensure that all the expressions relating to a given
> array are finished. (That would allow the main branch of the code to
> continue while some computations are still going.)
>
> This idea is a ways off from even being prototyped, though.
Thanks,
Jeffrey D. Oldham
oldham at codesourcery.com
-------------- next part --------------
#include <iostream>		// has std::cout, ...
#include <stdlib.h>		// has EXIT_SUCCESS
#include "Pooma/Arrays.h"	// has Pooma's Array

// Doof2d: Pooma Arrays, element-wise implementation

int main(int argc, char *argv[])
{
  // Prepare the Pooma library for execution.
  Pooma::initialize(argc,argv);

  // Ask the user for the number of averagings.
  long nuAveragings, nuIterations;
  std::cout << "Please enter the number of averagings: ";
  std::cin >> nuAveragings;
  nuIterations = (nuAveragings+1)/2;  // Each iteration performs two averagings.

  // Ask the user for the number n of elements along one dimension of
  // the grid.
  long n;
  std::cout << "Please enter the array size: ";
  std::cin >> n;

  // Specify the arrays' domains [0,n) x [0,n).
  Interval<1> N(0, n-1);
  Interval<2> vertDomain(N, N);

  // Create the arrays.
  // The template parameters indicate 2 dimensions, a 'double' element
  // type, and ordinary 'Brick' storage.
  Array<2, double, Brick> a(vertDomain);
  Array<2, double, Brick> b(vertDomain);

  // Set up the initial conditions.
  // All grid values should be zero except for the central value.
  a = b = 0.0;
  b(n/2,n/2) = 1000.0;

  // In the average, weight element with this value.
  const double weight = 1.0/9.0;

  // Perform the simulation.
  for (int k = 0; k < nuIterations; ++k) {
    // Read from b.  Write to a.
    for (int j = 1; j < n-1; j++)
      for (int i = 1; i < n-1; i++)
        a(i,j) = weight *
          (b(i+1,j+1) + b(i+1,j  ) + b(i+1,j-1) +
           b(i  ,j+1) + b(i  ,j  ) + b(i  ,j-1) +
           b(i-1,j+1) + b(i-1,j  ) + b(i-1,j-1));

    // Read from a.  Write to b.
    for (int j = 1; j < n-1; j++)
      for (int i = 1; i < n-1; i++)
        b(i,j) = weight *
          (a(i+1,j+1) + a(i+1,j  ) + a(i+1,j-1) +
           a(i  ,j+1) + a(i  ,j  ) + a(i  ,j-1) +
           a(i-1,j+1) + a(i-1,j  ) + a(i-1,j-1));
  }

  // Print out the final central value.
  std::cout << "before: " << (nuAveragings % 2 ? a(n/2,n/2) : b(n/2,n/2)) << std::endl; // TMP
  Pooma::blockAndEvaluate();	// Ensure all computation has finished.
  std::cout << (nuAveragings % 2 ? a(n/2,n/2) : b(n/2,n/2)) << std::endl;

  // The arrays are automatically deallocated.

  // Tell the Pooma library execution has finished.
  Pooma::finalize();
  return EXIT_SUCCESS;
}