[vsipl++] [patch] Minor CFAR changes

Jules Bergmann jules at codesourcery.com
Thu Jun 8 22:49:21 UTC 2006


Mark Mitchell wrote:
> Don McCoy wrote:
>> Jules Bergmann wrote:
>>> Attached graphs show original (cfar-orig) and new (cfar) performance,
>>> for GCC 3.4 and GCC 4.1 on Pastec.  The changes for the slice version
>>> have a larger impact.  Using 4.1 is a win!
> 
> What's Pastec?

Pastec is another name for the GTRI cluster, aka durip (some acronym or 
such).

> 
> It's nice to know GCC 4.1 is good for something!  

Good job!

> But, from what you
> said this morning, don't those results still fall short, relative to the
> C code?

Yes, that's right.  I'm producing results for those cases now.  However, 
it looks like 4.1 boosted our "slice" version, while at the same time 
pessimizing the plain C "vector" version.

For a particular dataset size (dataset #3 at 200 gates):

    GCC / variation       MFLOPS
    3.4 VSIPL++ slice        136
    3.4 VSIPL++ vector        60
    3.4 C vector             141
    3.4 C+SIMD vector        470

    4.1 VSIPL++ slice        226
    4.1 VSIPL++ vector       100
    4.1 C vector             128
    4.1 C+SIMD vector        830

(I need to repackage/rerun the VSIPL++ + SIMD approach.)

Question on SIMD:  For the C+SIMD version, I used the intrinsics from 
xmmintrin.h (__m128, _mm_add_ps(), etc).  This works with both 3.4 and 
4.1.  For the VSIPL++ SIMD version, I used the GCC vector extensions 
(typedef float v4sf __attribute__ ((vector_size(16))), '+' operator). 
The typedefs work with 3.4 and 4.1, but the operators (+, *, etc) only 
work with 4.1.  Is there any difference in code generated from these two 
approaches?  In particular, would it be worthwhile at all to recode the 
C+SIMD version to use the vector extensions?
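
To make the comparison concrete, here is a minimal sketch of the same 
packed-float add written both ways.  It is not the CFAR kernel: the 
function names add_intrinsics() and add_vector_ext() are made up for 
illustration, the unaligned loads and the memcpy() load/store are 
assumptions chosen to sidestep alignment and aliasing questions, and the 
length is assumed to be a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps(), ... */

/* GCC vector extension type: four packed floats in 16 bytes. */
typedef float v4sf __attribute__ ((vector_size(16)));

/* Style 1: explicit SSE intrinsics from xmmintrin.h.
 * Hypothetical helper; n is assumed to be a multiple of 4. */
void add_intrinsics(const float *a, const float *b, float *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
}

/* Style 2: GCC vector extensions with the plain '+' operator.
 * Hypothetical helper; n is assumed to be a multiple of 4. */
void add_vector_ext(const float *a, const float *b, float *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        v4sf va, vb, vr;
        memcpy(&va, a + i, sizeof va);     /* load without alignment assumptions */
        memcpy(&vb, b + i, sizeof vb);
        vr = va + vb;                      /* element-wise add on the vector type */
        memcpy(out + i, &vr, sizeof vr);
    }
}
```

Compiling both functions with something like "gcc -O2 -msse -S" and 
comparing the assembly for the two loops is one way to check whether the 
compiler treats the two styles differently.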

				thanks,
				-- Jules


-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


