Monday, June 01, 2009

Note on Boost's Multidimensional Array

(Jun 1, 2009)
(Last update: Mar 28, 2010)

The Boost C++ library provides an elegant way to handle multi-dimensional arrays.  We can access an array element with the usual compact syntax (A[i][j][k], for example).  Boundary checking is performed by assertion, which is disabled when the NDEBUG macro is defined.  Because the check is an assertion, accessing an element outside the range of any dimension does not throw an exception; the program is terminated right away.  This makes sense because a program normally has no reason to access an invalid entry and should be rewritten if it does.  Plus, C++ offers no standard way to obtain a stack trace from an exception (there may be a better, standard solution in the future, however).  Thus, halting the program is not a bad idea under typical C++ circumstances.

Moreover, we almost always prevent out-of-bound access explicitly in our own code, so there is little use in checking boundaries again, especially when speed is important.  Therefore, in a production release, we can remove the boundary checking and increase program speed by disabling assertions.  Note: Alternatively, we can disable this boundary checking by defining BOOST_DISABLE_ASSERTS before including the Boost header.

Besides the example on the Boost website, I want to add another example and more explanation that may be helpful for later reference (and for people who stumble upon this page).  In the example at the end of this note, I show that we need to pass boost::multi_array by reference explicitly.  Otherwise, a modification made in the called function will not affect the caller's array (to see the difference, try removing & in the function signature of change).  Also, no exception is thrown or caught at all.  The program will be terminated if assertion is enabled.

Another important side note about Boost's multidimensional array is that we cannot reallocate it by assigning (with =) a new array instance to it.  Assigning one array to another works only if both input and output arrays are of the same size.

For example, because the array size is known only at runtime, some may choose to declare a Boost multidimensional array without any initial size.

boost::multi_array<double, 3> arrayTest;

Then, we know at runtime that the size is (size_x, size_y, size_z), so we reallocate the array variable by reassignment (using = ) as follows

arrayTest = boost::multi_array<double, 3>(boost::extents[size_x][size_y][size_z]);

This will not cause an error outright in release mode, where assertions are disabled, but the program is very likely to crash later on.  The correct way to dynamically reallocate the space of a multidimensional array is to use the 'resize' function, as shown below.

arrayTest.resize(boost::extents[size_x][size_y][size_z]);

Notice that it is boost::extents (with s), not just boost::extent (without s).  boost::extent also exists, but it is a different thing.  Please don't confuse the two.

A correct way to copy contents from one array to another

As discussed earlier, reassigning Boost's multi-dimensional array may cause an error unless both input and output arrays are equal in size.  Thus, a proper way for reassignment is:

// Assume that the input array is of size: size_x, size_y, and size_z.
// 1. resize an output array
outArray.resize( boost::extents[size_x][size_y][size_z] );

// 2. now, we can copy contents of an input array by using =
outArray = inArray;

Side note: assigning initial contents of an array when it is declared.

Assigning contents to an array when it is declared is initialization, not reassignment.  Hence, such an operation executes correctly as long as the dimensionality matches.

// This content assignment is correct, provided that the dimensionality of the input array is three in this case.
boost::multi_array<double, 3> outArray = inArray;

The above operation implicitly initializes the size of outArray to be the same as that of inArray.  It is important to note that such implicit size initialization happens at the declaration only.  If we do not assign any content to the array at declaration, the length of each dimension is zero.  Consequently, the content assignment in the following example is an incorrect use.

boost::multi_array<double, 3> outArray;    // since we say nothing else here, the length of the array in each dimension is zero.
outArray = inArray;    // The assignment is not done at the declaration time.  Therefore, this is reassignment.
                       // If the size of inArray does not match that of outArray, an error will occur.

How can we know the size of Boost's multi dimensional array

The size of the array has two parts: the dimensionality and the array length of each dimension.

  • To obtain dimensionality
    int nNumDims = A.dimensionality;

  • To obtain the array length of the ith dimension
    int nLength = A.shape()[i];

    The following example obtains and prints out dimensionality and lengths of an array

    int nNumDims = A.dimensionality;

    cout << "Dimensionality = " << nNumDims << endl;

    // Report size list
    for (int i = 0; i != nNumDims; ++i) {
         cout << "Size( " << i << " ) = " << A.shape()[i] << endl;
    }

Pitfall: Be careful when Boost's multi-dimensional array is a class member

Suppose class MyArray has a three-dimensional array as a member.  Copying the contents of one MyArray instance to another implies copying Boost's multi-dimensional array as well.
So, make sure that the sizes of the input and output arrays match before copying.  Alternatively, copy the contents of the MyArray instance at declaration.  See the section 'A correct way to copy contents from one array to another' and its side note for details.

Example:
MyArray outArray;
MyArray inArray;

// Do something with inArray
// ...

// Now, try to save the contents of inArray to outArray
outArray = inArray;  // Beware! There is reassignment of the array inside. 
                     // We have to make sure that all Boost's multi dimensional arrays inside inArray and outArray match in size.

As discussed in the side note above, however, the following code has no problem.
MyArray inArray;

// Do something with inArray
// ...

// Now, try to save the contents of inArray to outArray
MyArray outArray = inArray;  // Correct!  There is no array reassignment, but array initialization at declaration.


=============================================

The following code is an example of how assertion, rather than exception throwing, comes into play with Boost's multi-dimensional array.

#include <exception>
#include <iostream>
#include "boost/multi_array.hpp"

using namespace std;

typedef boost::multi_array<double, 3> array3_double;

void change(array3_double& A);
void causeException(array3_double& A);    // In fact, it's not an exception.  Program will be aborted immediately if assertion is enabled.

int main() {
    // Notice that it is extents, not just extent.
    array3_double A(boost::extents[4][3][2]);

    int values = 0;
    for(int i = 0; i != 4; ++i)
        for(int j = 0; j != 3; ++j)
            for(int k = 0; k != 2; ++k)
                A[i][j][k] = values++;

    change(A);

    // Delete this block if you don't want a program to halt.
    // Note: there is no exception thrown.  Program will be just terminated.
    {
        try {
            causeException(A);
        } catch(const std::exception& ex) {
            cout << "Exception caught:" << endl;
            cout << ex.what() << endl;
        }
    }

    cout << A[3][2][1] << endl; // check this if you pass by value
    return 0;
}

// The parameter is passed by reference.  This is important because a change to A in this function
//   will not affect the original copy if we pass by value (without using & in the parameter).
// If we pass by value, the whole original array will be copied to a local copy in
//   this function.

void change(array3_double& A) {
    int nNumDims = A.dimensionality;
    cout << "Dimensionality = " << nNumDims << endl;

    // Report size list
    for (int i = 0; i != nNumDims; ++i) {
        cout << "Size( " << i << " ) = " << A.shape()[i] << endl;
    }

    // Report total size
    cout << "Total size = " << A.num_elements() << endl;

    A[3][2][1] = 99;

    return;
}

void causeException(array3_double& A) {
    int k = 7;

    A[5][5][k] = 9999;

    return;
}


=============================================

Pinyo Taeprasartsit

(This document can be viewed at my blog and Google docs)
(Feel free to leave a comment or question on my blog.  Don't worry if it is an old post.  I get a notification of your comment in my mailbox and tend to answer questions from readers.
The Google docs version is better for printing.)

6 comments:

Anonymous said...

Thank you for this nice and easy-to-follow illustration.

Anonymous said...

Really great! Thank you!

crazyboy_18 said...

really helpful! thank you.

Anonymous said...

Finally someone explaining clearly how the sizes of the dimensions can be found. This should go to the boost docs!

Forceflow said...

Thanks for this - spent two hours trying to figure out how to pass a multi array by reference, then fill it - resize() was the perfect answer.

David Doria said...

Thank you - you should see about getting simple examples like these added to the Boost documentation!