[vsipl++] SIMD threshold with loop fusion

Jules Bergmann jules at codesourcery.com
Wed May 16 03:43:25 UTC 2007


>> [1] This looks good.  However, do you think faux-SIMD should have the
>> same "API" as the real SIMD functions below?
>>
>> For example, AltiVec vgt returns 0xFFFFFFFF or 0x00000000 for each
>> position.  That can be used as a mask.  (What does SSE do?)
> SSE does the same thing, according to a website that has a 
> cross-reference for altivec and sse instructions.
>>
>> Since faux SIMD returns 1 or 0, it can't be used as mask.  A generic
>> routine that uses vgt may not work with faux-simd if it expects
>> vgt/vlt to return a value valid for a mask.
> Why not? I use normal bit operations on the return values. If I and '1' 
> with another value, I get the value, right?

It depends on whether the 'and' is bitwise or logical.

I.e. if you do something like

	mask = simd::vgt(a, b);
	result = simd::band(mask, a);

For AltiVec and SSE, this does the right thing because mask[i] is 
0xffffffff when a[i] > b[i].

For faux-simd, mask[0] is 0x00000001 when a[0] > b[0].  A bitwise 'and' 
with that mask will pull just the lowest-order bit out of a[0], not the 
entire value.
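
To make the difference concrete, here is a minimal scalar sketch.  The 
names gt_bool, gt_mask, and band are hypothetical stand-ins written for 
this illustration; they are not the actual simd:: functions, and each 
"vector" is assumed to be a single 32-bit unsigned value.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only, not the VSIPL++ simd API.

// Comparison that returns 1 or 0, like the faux-SIMD vgt discussed above.
inline std::uint32_t gt_bool(std::uint32_t a, std::uint32_t b)
{
  return a > b ? 1u : 0u;
}

// Comparison that returns all ones or all zeros, like the per-element
// result of AltiVec/SSE compares.
inline std::uint32_t gt_mask(std::uint32_t a, std::uint32_t b)
{
  return a > b ? 0xFFFFFFFFu : 0x00000000u;
}

// Bitwise 'and', playing the role of band in the snippet above.
inline std::uint32_t band(std::uint32_t x, std::uint32_t y)
{
  return x & y;
}
```

With a = 6 and b = 3, band(gt_mask(a, b), a) yields 6, while 
band(gt_bool(a, b), a) yields 0, because 6 is 0b110 and its lowest-order 
bit is clear.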


-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


