[vsipl++] [PATCH] Implement Fir<>, native C++ version

Mon Oct 10 14:07:03 UTC 2005

Nathan, Looks good.  A couple of comments below. -- Jules

Nathan (Jasper) Myers wrote:
> The following patch has been committed.
> 
> It implements vsip::Fir<> using native C++ code, and a comprehensive
> test of all its modes.  
> 
> Note that a few bits of the test are commented out; it uses 
> vsip::Convolution<> to generate the reference output, and that has 
> a little bug I haven't got to tracking down yet.
> 
> Nathan Myers
> ncm

> +    order_(kernel.size() * (1 + (symV != vsip::nonsym)) - 
> +      (symV == vsip::sym_even_len_odd) - 1),

Clever.  This is portable, right?  (i.e. do comparisons portably 
evaluate to 0 or 1?)

> +    decimation_(decimation),
> +    skip_(0),
> +    state_saved_(0),
> +    state_(this->order_, T(0)),
> +    kernel_(this->order_ + 1)
> +  {
> +    assert(input_size > 0);
> +    assert(this->order_ + 1 > 1);  // counter unsigned wraparound

What's going on here?  The comment is a bit hard to decipher at first. 
Perhaps "Use modulo arithmetic to counter the effect of unsigned 
wraparound", although that doesn't fit on a single line.

Does this catch symV == odd && kernel->size() == 0:

	symV	kernel->size()	order_
	nonsym	0		-1
	nonsym	1		 0
	even	0		-1
	even	1		 1
	odd	0		-2 <<<
	odd	1		 0

How about

	assert(kernel->size(0) > 0 && this->order_ >= 1);

> +      // Compute rough estimate of flop-count.
> +      unsigned cxmul = impl::Is_complex<T>::value ? 4 : 1; // *(4*,2+), +(2+)

A good approx is that each filter tap is a multiply-add.  For complex 
this is 6 ops, for real this is 2 ops.

> +      return (this->timer_.count() * cxmul * 2 *  // 1* 1+
> +        ((this->order + 1) * this->input_size_ / this->decimation_)) /
> +          (1e6 * this->timer_.total());
> +    }
> +    else if (!strcmp(what, "time"))
> +      return this->timer_.total();
> +    return 0.f;
> +  }
> +

> +  assert(vsip::alltrue(result == reference));

Using '==' for floating point values may cause the test to fail.  For 
example, if FIR and convolution perform the same operations in different 
order, they may have different rounding/accumulation errors even though 
they produce effectively the same answer.  You should try using 
'view_equal' (which uses almost_equal() underneath) or perhaps 
'error_db' instead.