[vsipl++] Matlab IO

Fri Jul 21 13:52:08 UTC 2006

Assem Salama wrote:
 > Everyone,
 >  This is new Matlab IO iterator interface.

Assem,

I have a few comments below (Don't be discouraged by the number!)
This is looking good.

				-- Jules

 > Index: src/vsip_csl/GNUmakefile.inc.in
 > ===================================================================
 > --- src/vsip_csl/GNUmakefile.inc.in	(revision 144405)
 > +++ src/vsip_csl/GNUmakefile.inc.in	(working copy)
 > @@ -22,8 +22,10 @@
 >  endif
 >  src_vsip_csl_cxx_objects := $(patsubst $(srcdir)/%.cpp, %.$(OBJEXT),\
 >                                $(src_vsip_csl_cxx_sources))
 > -cxx_sources += $(src_vsip_csl_cxx_sources)
 >
 > +cxx_sources += $(src_vsip_csl_cxx_sources) matlab_file.cpp

 > +cxx_objects += $(src_vsip_csl_cxx_objects) matlab_file.$(OBJEXT)
 > +

[1] We use nested makefiles (i.e. the top level makefile includes all
the other makefiles in the source tree, that's why they have .inc
suffixes.

This means we have to be careful with our variable names so that
two different GNUmakefile.inc files don't inadvertantly trample
on each other.  To do this, we have a convention that variables
local to a particular GNUmakefile.inc have a prefix related to
the directory path.  For src/vsip_csl/GNUmakefile.inc, the prefix
is src_vsip_csl_ .

Instead of creating a new variable 'cxx_objects' here, you should
add the objects to the existing 'src_vsip_csl_cxx_objects' variable.

 >  libs += lib/libvsip_csl.a
 >
 >  ########################################################################
 > @@ -35,7 +37,7 @@
 >  clean::
 >  	rm -f lib/libvsip_csl.a
 >
 > -lib/libvsip_csl.a: $(src_vsip_csl_cxx_objects)
 > +lib/libvsip_csl.a: $(cxx_objects)

[1b] and here you should revert to using src_vsip_csl_cxx_objects.

 >  	$(AR) rc $@ $^ || rm -f $@
 >
 >  # Install the extensions library and its header files.
 > Index: src/vsip_csl/matlab_file.hpp
 > ===================================================================

 > +#include "vsip_csl/matlab_bin_formatter.hpp"

[2] Use angle-brackets for this include: <...>

 > +class Matlab_file
 > +{
 > +  public:
 > +    // Constructors
 > +    Matlab_file(std::string fname);
 > +
 > +  // classes
 > +  public:
 > +    class iterator
 > +    {
 > +      public:
 > +        iterator() {}

[3] Is the default constructor necessary?  My guess is probably not.

However, if it is necessary, it should initialize the iterator into
a "good" state.  For example, it might set mf_ to NULL.

Going down this path, if mf_ could be NULL, then all uses of mf_
should check its validity (and probably through an exception if it is
NULL).

If we decide that the default constructor isn't necessary, we can
replace this NULL check on use with a NULL check on construction.
Better yet, we can store mf_ as a reference (Matlab_file&) instead of
a pointer.  Then it is guarenteed to be valid (you can't have a NULL
reference).

 > +        iterator(bool end_iterator,Matlab_file *mf) :

[4] Why not put the constructor parameters in the same order
as the member order?  I.e. (mf, end_iterator).

 > +	  mf_(mf),
 > +	  end_iterator_(end_iterator) {}
 > +
 > +	void read_header() { mf_->is_ >> mf_->view_header_; }
 > +	std::istream *get_stream() { return &(mf_->is_); }

[5] We should be able to remove this function, see below.

 > +      public:
 > +	// copy constructors
 > +	iterator(iterator const &obj) :
 > +          mf_(obj.mf_), end_iterator_(obj.end_iterator_) {}
 > +
 > +        // = operator
 > +	iterator&
 > +	operator=(iterator &src)
 > +	{
 > +	  this->mf_           = src.mf_;
 > +	  this->end_iterator_ = src.end_iterator_;
 > +	  return *this;
 > +	}

[6] Can you put the copy constructor and assignment operator at the
top of the class together with the other constructors?  In general,
we try to lay out classes as:

  - compile-time constants (typedefs, static variables, etc)
  - constructors, destructors, assignment operators
  - accessor member functions
  - private member data

Being consistent makes it easier to find things.

 > +
 > +      private:
 > +        Matlab_file *mf_;

[7] (see above) it would be "safer" to store Matlab_file as a
reference.  Then we don't have to check if mf_ is NULL on construction
or on use.

 > +        bool end_iterator_;
 > +
 > +
 > +    };
 > +
 > +    friend class iterator;
 > +
 > +  public:
 > +    // iterator functions
 > +    iterator begin() { return begin_iterator_; };
 > +    iterator end() { return end_iterator_; };
 > +
 > +    // read a view from a matlab file after reading the header
 > +    template <typename T,
 > +	      typename Block0,
 > +	      template <typename,typename> class View>
 > +    void read_view(View<T,Block0> view, iterator  &iter);
 > +
 > +  private:
 > +    Matlab_bin_hdr                    matlab_header_;
 > +    std::ifstream                     is_;
 > +    iterator                          begin_iterator_;
 > +    iterator                          end_iterator_;
 > +
 > +
 > +  // these variables are used for the iterator
 > +  private:
 > +    Matlab_view_header view_header_;
 > +    bool read_data_;
 > +    bool end_of_file_;
 > +    uint32_t length_;
 > +
 > +  // make a private copy constructor and assignment
 > +  private:
 > +    Matlab_file(Matlab_file const & /*obj*/) {}
 > +    Matlab_file& operator=(Matlab_file & /*src*/)
 > +      {
 > +        VSIP_IMPL_THROW(std::runtime_error(
 > +          "Trying to use Matlab_file assignment operator"));
 > +        return *this;
 > +      }

[8] It is better to leave the function bodies for these out.  I.e.

	Matlab_file(Matlab_file const&);

would be preferrable.  This way, you can't inadvertantly make a copy
inside of a member function.

However, an even better approach (hint hint :), we have a special class
you can derive from to indicate that copy and assignment are
dissallowed:

	#include <vsip/impl/noncopyable.hpp>

	class Matlab_file : Non_copyable
	{
	   ...

	   no need to make copy constructor and assignment private,
	   Non_copyable does it for you
	};

 > +
 > +};
 > +
 > +
 > 
+/*****************************************************************************
 > + * Definitions
 > + 
*****************************************************************************/
 > +
 > +// read a view from a matlab file after reading the header
 > +template <typename T,
 > +	  typename Block0,
 > +	  template <typename,typename> class View>
 > +void Matlab_file::read_view(View<T,Block0> view, iterator  &iter)
 > +{

[9] it would be good to make the assumption that 'iter' is an iterator
of 'this' explicit with an assertion:

	assert(iter.mf == this);

 > +  typedef typename vsip::impl::Scalar_of<T>::type scalar_type;
 > +  vsip::dimension_type v_dim = vsip::impl::Dim_of_view<View>::dim;
 > +  Matlab_view_header *header = *iter;
 > +  std::istream *is = iter.get_stream();

[10] since 'iter' belongs to 'this', 'iter.get_stream()' is equivalent
to 'this->is_'.

 > +
 > +  // is this complex?

[11] This comment describes what is being done below in a more direct
fashion, which is good.  However, since it is phrased as a question,
it might be confusiong to the reader.  In general, they might think
that you were unsure of the following code and put a question in as a
reminder to come back later.

Can you repharse this?  "Check that view and file are both complex or 
scalar."

 > +  if(vsip::impl::Is_complex<T>::value && !header->is_complex)
 > +    VSIP_IMPL_THROW(std::runtime_error(
 > +      "Trying to read complex view into a real view"));
 > +

[12] Likewise for these comments:

 > +  // is the class ok?

 > +  // do dimensions agree?

 > +    // should we swap this field?

 > Index: tests/matlab_iterator_test.cpp

[13] Since this file is in the "tests/" subdirectory, it is not
necessary to put the word "test" in the filename.
"matlab_iterator.cpp" is preferrable.

 > ===================================================================
 > --- tests/matlab_iterator_test.cpp	(revision 0)
 > +++ tests/matlab_iterator_test.cpp	(revision 0)
 > +
 > +#include "test.hpp"

[14] Make sure to do a 'svn update' before committing this.
"test.hpp" has moved into vsip_csl so you need to replace this with:

	#include <vsip_csl/test.hpp>

 > +
 > +using namespace std;
 > +using namespace vsip;
 > +using namespace vsip_csl;
 > +
 > +#define MAX(a,b) ( (a>b)? a:b)

[15] Assem, you're an ole fashion C hacker! :)

Instead of defining your own MAX macro, you should get in the habit
of using C++'s std::max:

Include algorithms:

	#include <algorithms>

Use it:

	int x = std::max(a, b);

Or if you have done a 'using namespace std;':

	int x = max(a, b);

 > +template <typename T>
 > +void read_view_test(Matlab_file::iterator iterator, Matlab_file &mf)
 > +{
 > +  Matlab_view_header *header = *iterator;
 > +  if(header->num_dims == 2 && (header->dims[0] == 1 | 
header->dims[1] == 1))
 > +  {
 > +    // vector
 > +    Vector<T> a(MAX(header->dims[0],header->dims[1]));

[16] Why is it necessary to use MAX here?

I assume that matlab distinguishes between row vectors and column
vectors by storing them as either (N x 1) matrices (i.e. a column vector),
or (1 x N) matrices (i.e. a row vector)?

Since we don't distinguish between those cases in VSIPL++, I'm
hesitant to export that interface and force the user to do a max.

Instead, when we read the header for a vector, let's do two things:

  1) Put the size into dims[0] and put 1 into dims[1].

     This will make it easier for the majority of cases when it doesn't
     matter.

  2) Add a flag 'row_vector' that indicates when the vector was a row
     vector, that is, when it was necessary to swap dims[0] and dims[1].

     This will make it possible to determine what the vector was in
     cases where it does matter.

This suggests we need a similar capability to write column and row
vectors.  If that is the case, let's not worry about it now, but
instead file an enhancement issue.

 > +    mf.read_view(a,iterator);
 > +    view_test(a);
 > +  } else if(header->num_dims == 2)
 > +  {
 > +    // matrix
 > +    Matrix<T> a(header->dims[0],header->dims[1]);
 > +    mf.read_view(a,iterator);
 > +    view_test(a);
 > +  } else if(header->num_dims == 3)
 > +  {
 > +    // tensor
 > +    Tensor<T> a(header->dims[0],header->dims[1],header->dims[2]);
 > +    mf.read_view(a,iterator);
 > +    view_test(a);
 > +  }
 > +}
 > +
 > +int main()
 > +{
 > +  Matlab_file mf("temp.mat");

[17] Where does "temp.mat" come from?  I assume it has been written
already.

That's fine, we just need to handle it properly.  As above, this test
won't work when building the library outside of the source tree.

Can you check how data files are handled by the fft tester?

-- 
Jules Bergmann
CodeSourcery
jules at codesourcery.com
(650) 331-3385 x705