parallel particle bctest3 crash

Arno Candel candel at itp.phys.ethz.ch
Sat Oct 9 16:54:39 UTC 2004


Hi,

I am using the r2 CVS version on 32-bit Intel Linux (g++ 3.4.1) and on 
64-bit Opteron Linux (g++ 3.3.4), together with Cheetah 1.1.4 and CVS, and 
LAM 7.0.4.

I encounter crashes of r2/src/Particles/tests/bctest3 when running with 
more than 10 contexts:



candel@gate12:~/r2/src/Particles/tests/GPCEO$ mpirun -np 10 ./bctest3 -mpi
PASSED ... KillBC with expression

candel@gate12:~/r2/src/Particles/tests/GPCEO$ mpirun -np 11 ./bctest3 -mpi
MPI_Recv: process in local group is dead (rank 8, comm 2)
Rank (8, MPI_COMM_WORLD): Call stack within LAM:
Rank (8, MPI_COMM_WORLD):  - MPI_Recv()
Rank (8, MPI_COMM_WORLD):  - main()
MPI_Recv: invalid rank: Invalid argument (rank 9, comm 4)
Rank (9, MPI_COMM_WORLD): Call stack within LAM:
Rank (9, MPI_COMM_WORLD):  - MPI_Recv()
Rank (9, MPI_COMM_WORLD):  - main()
CHEETAH-ON-MPI ERROR: Unrecognized get response tag -32766!
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 26355 failed on node n0 (127.0.0.1) due to signal 8.
-----------------------------------------------------------------------------
MPI_Recv: invalid rank: Invalid argument (rank 7, comm 4)
Rank (7, MPI_COMM_WORLD): Call stack within LAM:
Rank (7, MPI_COMM_WORLD):  - MPI_Recv()
Rank (7, MPI_COMM_WORLD):  - main()
CHEETAH-ON-MPI ERROR: Unrecognized get response tag -32766!


The problem seems to stem from the sync() call; perhaps one of the 
PatchFunctions has a problem.
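
One observation that may help narrow this down: the tag value -32766 in the 
Cheetah error is exactly what 32770 becomes when it is wrapped into a signed 
16-bit integer. Whether Cheetah or LAM actually stores or compares tags in 
16 bits is an assumption I have not verified, and the helper name 
wrap_to_int16 below is purely illustrative, not part of either library. The 
arithmetic itself can be checked with a small sketch:

```cpp
// Hypothetical helper (not part of Cheetah, LAM, or POOMA): keep only the
// low 16 bits of a tag and interpret them as a two's-complement signed
// 16-bit value. Written with plain integer arithmetic so the result does
// not depend on implementation-defined narrowing conversions.
int wrap_to_int16(int tag) {
    int low = tag & 0xFFFF;                        // low 16 bits, 0..65535
    return (low >= 0x8000) ? low - 0x10000 : low;  // map 32768..65535 to negatives
}
```

With this helper, wrap_to_int16(32770) gives -32766, while values up to 32767 
pass through unchanged. If tags really do grow with the number of contexts, 
this kind of wraparound would be one possible explanation for a failure that 
only appears above a certain process count.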

Thanks for your help,
Arno Candel


