[pooma-dev] [PATCH] parallel particle bctest3 crash

Arno Candel candel at itp.phys.ethz.ch
Sun Oct 10 14:06:01 UTC 2004


Okay, this patch fixes particle swapping in UniformLayout, and in 
particular the bctest3 particle test.

A floating point exception was triggered as soon as fewer particles 
were created than patches existed: sizePerPatch was 0, so computing npid 
divided by zero.
There was also an assertion failure when a particle was being sent to a 
patch that doesn't even exist (npid >= patchesGlobal()).
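
To make the two failure modes concrete, here is a minimal stand-alone sketch of the patched index computation. It is not the POOMA code itself: the function name targetPatch and its parameters are hypothetical, and it assumes sizePerPatch is the integer quotient of the total particle count by the number of patches.

```cpp
#include <cassert>

// Hypothetical stand-alone version of the patched loop body.
// globalIndex plays the role of (i+offset) and patchesGlobal the role of
// patchesGlobal() in UniformLayout.h.
// Assumption: sizePerPatch == totalParticles / patchesGlobal.
int targetPatch(int globalIndex, int sizePerPatch, int patchesGlobal)
{
  // Guard against sizePerPatch == 0 (fewer particles than patches),
  // which would otherwise be an integer division by zero.
  int npid = globalIndex / (sizePerPatch > 0 ? sizePerPatch : 1);

  // Leftover particles: with more than sizePerPatch leftovers the quotient
  // can exceed patchesGlobal, so test with >= rather than ==.
  if (npid >= patchesGlobal)
    npid = globalIndex - sizePerPatch * patchesGlobal;

  // Make sure the result names an existing patch.
  assert(npid >= 0 && npid < patchesGlobal);
  return npid;
}
```

With 10 particles on 11 patches, sizePerPatch is 0 and each particle simply maps to the patch with its own index. With 11 particles on 4 patches, sizePerPatch is 2, so index 10 yields a quotient of 5; the old == test would have let that value through to the assertion, while the >= test remaps it to patch 2.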


Arno.



Index: UniformLayout.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Particles/UniformLayout.h,v
retrieving revision 1.23
diff -u -r1.23 UniformLayout.h
--- UniformLayout.h     14 Jul 2004 15:44:59 -0000      1.23
+++ UniformLayout.h     10 Oct 2004 13:51:36 -0000
@@ -311,11 +311,11 @@
 
     for (int i = 0; i < size; ++i)
     {
-      int npid = (i+offset) / sizePerPatch;
+      int npid = (i+offset) / (sizePerPatch>0?sizePerPatch:1);
 
       // check for a leftover particle
 
-      if (npid == patchesGlobal())
+      if (npid >= patchesGlobal())
        npid = (i+offset) - (sizePerPatch*patchesGlobal());
 
       // Make sure this is kosher


Richard Guenther wrote:

> Arno Candel wrote:
>
>>
>>> This is no wonder:
>>>
>>>   if (Pooma::context() == 0)
>>>     P.create(10,0,false);
>>>   P.sync(P.a1);
>>>
>>> i.e. we create 10 particles - distributing over 11 contexts isn't 
>>> going to work.  We don't handle contexts with zero particles 
>>> gracefully.  I think there are similar problems with #patches < 
>>> #contexts.  But I'd qualify these cases as user error.
>>>
>>> Hope this helps,
>>> Richard.
>>
>>
>> Unfortunately, that's not the problem. It still doesn't work after 
>> creating 11 or more particles.
>
>
> Hm ok, that was just a guess.
>
>> I've been using position-dependent distribution of particles for 
>> months now in my own code, often with many contexts not containing 
>> any particles. The problem only arises when the number of contexts is 
>> getting too high (mostly 8-10 being the limit). However, I have no 
>> idea what triggers the error...
>
>
> I don't have time at the moment to dig into this myself, maybe you can 
> try to debug this yourself - usually doing printf-debugging is the 
> only way to debug these parallel failures though (unless you have 
> expensive tools around, which I don't have).
>
> Richard.



