[pooma-dev] [PATCH] parallel particle bctest3 crash
Arno Candel
candel at itp.phys.ethz.ch
Sun Oct 10 14:06:01 UTC 2004
Okay, this patch fixes the UniformLayout particle swapping, and in
particular the bctest3 particle test.
A floating point exception was triggered as soon as fewer particles
were created than patches existed (sizePerPatch = 0, so the computation
of npid divided by zero).
There was also an assertion error when a particle was being sent to a
patch that doesn't even exist (npid >= patchesGlobal()).
Arno.
Index: UniformLayout.h
===================================================================
RCS file: /home/pooma/Repository/r2/src/Particles/UniformLayout.h,v
retrieving revision 1.23
diff -u -r1.23 UniformLayout.h
--- UniformLayout.h 14 Jul 2004 15:44:59 -0000 1.23
+++ UniformLayout.h 10 Oct 2004 13:51:36 -0000
@@ -311,11 +311,11 @@
   for (int i = 0; i < size; ++i)
     {
-      int npid = (i+offset) / sizePerPatch;
+      int npid = (i+offset) / (sizePerPatch>0?sizePerPatch:1);
       // check for a leftover particle
-      if (npid == patchesGlobal())
+      if (npid >= patchesGlobal())
         npid = (i+offset) - (sizePerPatch*patchesGlobal());
       // Make sure this is kosher
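
For illustration, the patched index computation can be pulled out into a
small stand-alone function. This is only a sketch, not the code in
UniformLayout.h: the function name targetPatch and its parameters are
hypothetical, globalIndex stands in for (i+offset), patches stands in
for patchesGlobal(), and sizePerPatch is assumed to be the integer
quotient of the total particle count and the number of patches.

```cpp
#include <cassert>

// Hypothetical stand-alone version of the patched logic (not POOMA API).
// globalIndex  ~ (i+offset) in the patch
// sizePerPatch ~ assumed to be totalParticles / patches (integer division)
// patches      ~ patchesGlobal() in the patch
int targetPatch(int globalIndex, int sizePerPatch, int patches)
{
  assert(patches > 0 && globalIndex >= 0 && sizePerPatch >= 0);

  // With fewer particles than patches, sizePerPatch is 0; dividing by 1
  // instead avoids the integer division by zero.
  int npid = globalIndex / (sizePerPatch > 0 ? sizePerPatch : 1);

  // Leftover particles: any id at or past the last patch is folded back
  // onto patches 0, 1, 2, ... (the old '==' test caught only the first one).
  if (npid >= patches)
    npid = globalIndex - sizePerPatch * patches;

  return npid;
}
```

With 15 particles on 8 patches (sizePerPatch = 1), particle 14 gets the
preliminary id 14, which the old "==" test let through unchanged; with
">=" it is folded back to patch 6.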
Richard Guenther wrote:
> Arno Candel wrote:
>
>>
>>> This is no wonder:
>>>
>>> if (Pooma::context() == 0)
>>> P.create(10,0,false);
>>> P.sync(P.a1);
>>>
>>> i.e. we create 10 particles - distributing over 11 contexts isn't
>>> going to work. We don't handle contexts with zero particles
>>> gracefully. I think there are similar problems with #patches <
>>> #contexts. But I'd qualify these cases as user error.
>>>
>>> Hope this helps,
>>> Richard.
>>
>>
>> Unfortunately, that's not the problem. It still doesn't work after
>> creating 11 or more particles.
>
>
> Hm ok, that was just a guess.
>
>> I've been using position-dependent distribution of particles for
>> months now in my own code, often with many contexts not containing
>> any particles. The problem only arises when the number of contexts is
>> getting too high (mostly 8-10 being the limit). However, I have no
>> idea what triggers the error...
>
>
> I don't have time at the moment to dig into this myself, maybe you can
> try to debug this yourself - usually doing printf-debugging is the
> only way to debug these parallel failures though (unless you have
> expensive tools around, which I don't have).
>
> Richard.
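
A minimal sketch of the kind of printf-debugging mentioned above,
assuming each message is tagged with the context number so that
interleaved output from several contexts can be told apart. The helpers
debugLine and debugPrint are hypothetical and not part of POOMA; in a
POOMA program the context argument would presumably be the value of
Pooma::context().

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not POOMA API): build one tagged debug line.
std::string debugLine(int context, const char* what, int value)
{
  assert(what != nullptr);
  char buf[128];
  std::snprintf(buf, sizeof buf, "[context %d] %s = %d", context, what, value);
  return std::string(buf);
}

// Hypothetical helper: print the tagged line to stderr.
void debugPrint(int context, const char* what, int value)
{
  std::fprintf(stderr, "%s\n", debugLine(context, what, value).c_str());
  // stderr is normally unbuffered; the explicit flush is just extra
  // insurance that the line is written before a crash that follows.
  std::fflush(stderr);
}
```

A call such as debugPrint(ctx, "npid", npid) inside the loop touched by
the patch would show which context computes an out-of-range patch id.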