Problems running in parallel
Dave Nystrom
wdn at lanl.gov
Tue Jun 26 04:21:49 UTC 2001
John and I are trying to get our stuff running in parallel and we are
having runtime problems. We are using KCC-4.0e with the 4.0f prelinker on
RedHat Linux 7.1. We are using Cheetah 1.0 sitting on top of mm-1.1.1,
i.e. we passed --shmem --nompi to the Cheetah configure script. We
successfully ran the tests for mm and Cheetah. We built Pooma 2 using the
--messaging option to select the use of Cheetah. We built the tests in the
Array and NewField directories and ran them. For these tests, we were doing
debug builds with KCC, without exceptions and with Cheetah enabled. We ran about
half of the Array tests and they all reported passing. We ran all of the
tests for the NewField case and all passed except one. The one that failed
was FieldTour1, which is the only one that seemed to have any Cheetah-specific
code in it. The command line that we were using was:
FieldTour1 -shmem -np 4
Also, the following command line failed:
FieldTour1 -shmem -np 1
How should we proceed on this? Are there examples in the Pooma 2 source tree
which demonstrate how to use the NewField Fields in parallel that we can look
at? Are there any Pooma 2 tests that exercise running in parallel that pass?
John and I would like to get our stuff running in parallel before I leave for
vacation. Right now, we only have 3 more days in which our schedules
overlap.
Also, it turns out that John used the FieldTour1.cpp example as a guide
for building our Fields for the parallel case. When we run our test problem,
we are getting some pretty strange behavior: the results are not even
reproducible from one run to the next, and only about half of the field gets
the correct answer. Our test problems run correctly for the serial case,
i.e. no Cheetah and CompressibleBrick.
--
Dave Nystrom email: wdn at lanl.gov
LANL X-3 phone: 505-667-7913 fax: 505-665-3046