[vsipl++] [patch] CBE split-complex vmmul

Jules Bergmann jules at codesourcery.com
Thu Oct 2 21:08:13 UTC 2008


> * What is the use case for the 'verbose' flag ? How does it differ from 
> profiling ? Couldn't it be integrated into the profiler, such that users 
> would see task initialization / finalization in the profile logs ?

The use case is determining what kernels are being used/loaded/reused 
during an execution.  For example:

cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:      svpp_kernels.so      alf_vcopy_s_spu reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse
cached_alf_task_create:  spu_fftw_kernels.so       fftwf_spu_fftw reuse

This information can in turn drive optimization.  As a library 
developer, if you see a kernel being used repeatedly but not being 
reused, it may be a case where you can standardize the block sizes 
across invocations.  As a user, if you see a low-performance kernel 
(for example vmul) causing a high-performance kernel (for example fft) 
to reload, you can look for fusion opportunities or for a way to 
disable the low-performance kernel.
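
To make that concrete, here is a rough sketch of how the verbose output 
could be summarized per kernel.  It is not part of the patch or of the 
library.  The names KernelStats, summarize_task_log, and 
loaded_repeatedly are hypothetical, and it assumes the line format is 
exactly what the sample above shows (tag, library, kernel, optional 
"reuse").

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper type: per-kernel counts gathered from the
// verbose task-creation output.
struct KernelStats
{
  int loads = 0;   // lines without a trailing "reuse" marker
  int reuses = 0;  // lines ending in "reuse"
};

// Assumes each relevant line has the form shown in the sample output:
//   cached_alf_task_create: <library> <kernel> [reuse]
// Lines that do not start with that tag are ignored.
std::map<std::string, KernelStats>
summarize_task_log(std::string const& log)
{
  std::map<std::string, KernelStats> stats;
  std::istringstream lines(log);
  std::string line;
  while (std::getline(lines, line))
  {
    std::istringstream fields(line);
    std::string tag, library, kernel, marker;
    if (!(fields >> tag >> library >> kernel)) continue;
    if (tag != "cached_alf_task_create:") continue;
    fields >> marker;  // stays empty when there is no fourth field
    KernelStats& s = stats[kernel];
    if (marker == "reuse") ++s.reuses;
    else ++s.loads;
  }
  return stats;
}

// A kernel that shows up as a fresh load more than once is a
// candidate for the optimizations discussed above.
bool loaded_repeatedly(KernelStats const& s)
{
  return s.loads > 1;
}
```

Feeding it the captured output and checking loaded_repeatedly for each 
entry would point at the kernels worth a closer look.  In the sample 
above, each kernel is loaded once and then reused, so nothing would be 
flagged.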

Yes, it could go into the profiler, but that needs some design work 
first.

The main reason I included it in this patch is that split vmmul 
required changes to task_manager.hpp.  I'll separate those patches.

				-- Jules

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705


