[vsipl++] Example of parallel processing
Stefan Seefeld
stefan at codesourcery.com
Wed Jul 14 15:10:17 UTC 2010
Hi Bill,
On 07/14/2010 10:51 AM, Cassanova, Bill wrote:
> Your response does help although I need to clarify a few things a bit
> further if I may.
>
> (1) You stated that foreach_vector is not recommended.
I need to correct what I said a little: The foreach_vector function is
indeed exposed as part of the vsip:: namespace (and documented as such
in the Users Guide).
We need to correct that, since in fact the vsip:: namespace is reserved
for functions and types provided by the VSIPL++ specification.
> My presumption
> is that the recommended way is to code an explicit loop as was done in
> the 8.2 section of the users-guide.pdf file using local subviews
> instead.
Not quite. The new API we are working on, and which I alluded to in my
mail yesterday, provides a means for implicit parallelism. Instead of
writing an explicit loop, you would write things like:
  Matrix<> input = ...;
  Matrix<> output = ...;
  Iterator<> i;
  output.row(i) = function(input.row(i));
This should be intuitively clear: The intent is, similar to
foreach_vector, to apply "function" to each row of the input.
Using such a compact notation gives much more latitude to the library to
implement this efficiently (for example by evaluating different rows in
parallel).
Such "Parallel Iterator" expressions are also much more powerful since
there are many more things that can be expressed than with a "foreach"
construct.
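
For comparison, here is roughly what the expression above means, written as an explicit loop. This is a minimal sketch in plain standard C++, not VSIPL++ code: "Row", "Mat", "scale_row" and "apply_rows" are hypothetical names made up for illustration, and the loop below is sequential. The point is only that each row is computed independently of the others, which is what would allow a library to evaluate rows in parallel.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a dense matrix: a vector of rows.
// These are NOT VSIPL++ types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// Hypothetical per-row operation: scale every element by 2.
Row scale_row(Row const &in)
{
  Row out(in.size());
  for (std::size_t k = 0; k != in.size(); ++k)
    out[k] = 2.0f * in[k];
  return out;
}

// Apply 'f' to each row of 'input' and collect the results.
// No iteration depends on any other, so the rows could be
// processed in any order (or concurrently) without changing
// the output.
template <typename F>
Mat apply_rows(Mat const &input, F f)
{
  Mat output(input.size());
  for (std::size_t i = 0; i != input.size(); ++i)
    output[i] = f(input[i]);
  return output;
}
```

Calling apply_rows(input, scale_row) plays the role of "output.row(i) = function(input.row(i))" in this sketch.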
But, to be clear: this is still in development. It will be included in
future Sourcery VSIPL++ releases.
> (2) You stated that each process runs the exact same code. Let's take a
> distributed scenario where, for example I have 4 individual machines
> named A,B,C, and D. Each of these machines has 4 processors. So...I
> have 16 processors on which "work" can be done. However, only Node A
> has a particular input file on disk that contains the data that must be
> first read in before being processed.
>
> (A) Is my understanding correct that the actual binary program must
> physically exist on each and every machine or does VSIPL++/MPI
> "take care of" sending the necessary instruction codes from the
> master Node to the slave nodes...In this case Node A, to each of the
> machines.
The binary must exist on all nodes. The program then is started using
the standard "mpirun" command, which, suitably configured, will spawn
the 16 processes on the 4 machines.
> (B) Does the input file have to exist on all 4 nodes or is it
> possible to read the data in on Node A, load the data into an
> appropriate data structure and then let VSIPL++ "distribute" the
> processing using either foreach_vector or using the explicitly
> coded loop.
File I/O is typically done by only one process (as you have done with
the conditional "if (rank == 0)"), so the input file only needs to exist
on the node where that process runs.
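
To give a feel for how the rows of a matrix might then be divided among your 16 processes, here is a minimal sketch in plain standard C++ (no VSIPL++ or MPI calls). It assumes a simple contiguous block split; "RowRange" and "local_rows" are hypothetical names, and the mapping the library actually uses depends on how the views are declared, so treat this as arithmetic only.

```cpp
#include <cstddef>

// Hypothetical helper type: a contiguous range of rows,
// [first, first + count).
struct RowRange
{
  std::size_t first;
  std::size_t count;
};

// Rows owned by process 'rank' when 'rows' rows are split into
// contiguous blocks over 'nprocs' processes. If the rows do not
// divide evenly, the first (rows % nprocs) ranks get one extra row.
// Assumes nprocs > 0 and rank < nprocs.
RowRange local_rows(std::size_t rows, std::size_t nprocs, std::size_t rank)
{
  std::size_t base  = rows / nprocs;
  std::size_t extra = rows % nprocs;
  std::size_t count = base + (rank < extra ? 1 : 0);
  std::size_t first = rank * base + (rank < extra ? rank : extra);
  return RowRange{first, count};
}
```

With 100 rows and 16 processes, for instance, ranks 0 to 3 would each own 7 rows and ranks 4 to 15 would each own 6.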
> (3) Is there a complete example somewhere of running a parallel
> program? I found directory ../sourceryvsipl++-2.2/src/tests/parallel
> and there are good examples of coding parallel programs. At this point,
> however, I have found scant documentation on actually running these
> programs in a parallel mode. I had to figure out on my own that a
> program must be invoked using mpirun and had I no prior knowledge of
> mpirun I probably wouldn't have even tried it...Just a
> documentation/example suggestion for those of us who are just starting
> to learn parallel programming. I haven't yet delved into the CUDA parts
> of vsipl++ either so an example runner script would be quite handy and
> probably cut down on some questions.
We do have plans to improve our documentation on parallelism with
Sourcery VSIPL++ (including how to program as well as deploy it). Thank
you for reminding us how useful this is. Such feedback definitely helps
us assign the appropriate priorities to such documentation tasks.
> VSIPL++ shows great promise and at this point at least it seems a bit
> easier than trying to code an MPI program from scratch.
I'm glad to hear that.
Thanks for your interest,
Stefan
--
Stefan Seefeld
CodeSourcery
stefan at codesourcery.com
(650) 331-3385 x718