[vsipl++] [patch] Built-in function profiling

Don McCoy don at codesourcery.com
Tue Jul 11 20:17:17 UTC 2006


Jules Bergmann wrote:
> Don, this looks good.  I have several comments below.  Can you please 
> address #1 and #3-#5 before checking in?  We can address #2 later.
> 
Sounds good.  Almost there.

>> @@ -171,6 +179,7 @@
>>  Profiler::dump(char* filename, char /*mode*/)
>>  {
>>    std::ofstream    file;
>> +  const char delim[] = " : ";
> 
> While the spaces improve the human readability, they are more difficult 
> to post-process in a languages like perl and python (although the extra 
> difficulty is pretty minor).  Let's leave the space in, but in general 
> we should "error" on the side of easier post-processing rather than 
> readability since the profiling output will have a lot of raw 
> information that will be difficult to digest without some reduction.
> 
I'll leave it as is, but you can see I was thinking the same thing by 
making it easy to change in the future.  It might even be nice to allow 
it to be overridden by the user.  This will help people who would find 
it easier or preferable to use X over whatever default we settle on.

>>      for (iterator cur = accum_.begin(); cur != accum_.end(); ++cur)
>>      {
>> -      file << (*cur).first << ":"
>> -       << TP::ticks((*cur).second.total) << ":"
>> -       << (*cur).second.count << std::endl;
>> +      float mflops = (*cur).second.count * (*cur).second.value /
>> +        (1e6 * TP::seconds((*cur).second.total));
>> +      file << (*cur).first +           << delim << 
>> TP::ticks((*cur).second.total)
>> +           << delim << (*cur).second.count
>> +           << delim << (*cur).second.value
>> +           << delim << mflops
>> +           << std::endl;
>> +      // clear log
>>        cur->second.total = TP::zero();
>>        cur->second.count = 0;
> 
> [1] I think this should also clear 'value' too.
> 
I ran into some strange behavior (incorrect results, missing ops value) 
when I did this initially.  Looking at the method used for 'data_' (not 
shown, but it is a vector<> rather than a map<>) so I changed it to:

	accum_.clear();  (after the for loop)

Is there a reason to avoid this?  It seems to be the right thing, but 
there may have been a reason not to that is not obvious at first glance.


> 
>>  
>> +namespace conv
>> +{
>> +template <typename T>
>> +struct Op_count
>> +{
>> +  static length_type
>> +  apply(length_type kernel_len, length_type output_len)
>> +  {
>> +    return static_cast<length_type>(kernel_len * output_len *
>> +      (Ops_info<T>::mul + Ops_info<T>::add));
>> +  }
>> +};
> 
> This could also have been a template function:
> 
>     template <typename T>
>     inline length_type
>     op_count(...)
>     {
>        return ...
>     }
> 
I reverted it to be a template function.  Not sure what I did wrong the 
first time I attempted that, but it was giving me an error compiling it 
that I couldn't figure out at the time.


>> @@ -152,7 +172,11 @@
>>      Matrix<T, Block2>       out)
>>      VSIP_NOTHROW
>>    {
>> -    timer_.start();
>> +    length_type const M = this->kernel_size()[0].size();
>> +    length_type const P = this->output_size()[0].size();
>> +    int ops = impl::conv::Op_count<T>::apply(M, P);
> 
> [3a] For a matrix convolution, the ops count will also depend on the 
> kernel_size()[1].size() and output_size()[1].size().  (The ops 
> computation in the original impl_performance was also broken for matrices).
> 
Corrected.  I should have caught that -- I'm glad you did.


Working on [4] and [5].  Thanks for the feedback.

Regards,


-- 
Don McCoy
don (at) CodeSourcery
(888) 776-0262 / (650) 331-3385, x712



More information about the vsipl++ mailing list