12.07.06

Assignment vs Initialization

Posted in cplusplus at 11:57 pm by Fernando Cacciola

Take a look at the following C++ code snippet:

SomeContainer* c_ptr = ..whatever..
bool found = false ;
for ( some_iterator it = c_ptr->begin()
    ; it != c_ptr->end() && !found
    ; ++ it
    )
{
  while ( some_nested_loop )
  {
    if ( some_condition )
    {
      found = true ;
      c_ptr = c_ptr->OTHER_RELATED_CONTAINER() ;
    }
  }
}

There is a very subtle problem here…the loop condition has undefined behaviour.

The problem is that, as you well know, the loop condition is evaluated before the first iteration and again after each pass over the loop body, and the body changes c_ptr. After the assignment, c_ptr->end() returns an iterator into the new container, which is then compared against an iterator previously obtained from the old one. As it turns out, comparing iterators from different containers is undefined behaviour, and, for instance, the checked iterators feature of Dinkumware’s STL (which ships with VC8) asserts on it.

But the problem is subtle because the loop is supposed to be aborted right after the assignment, via the boolean flag used to control the loop (a break statement wouldn’t suffice because the assignment occurs in a nested loop). So the actual “mistake” here is quite silly and subtle (it touches on a dark corner of the C++ standard that few people know about): the boolean flag “found” should be tested first, so that && short-circuits and the incompatible iterators are never compared.
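As a minimal sketch of that reordering, here is a concrete version of the loop. The Container struct, its rows and related members, and the search function are hypothetical stand-ins for SomeContainer and OTHER_RELATED_CONTAINER(); the only point is the order of the operands in the loop condition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SomeContainer: some rows of data plus a
// link to a related container.
struct Container {
    std::vector<std::vector<int>> rows;
    Container* related = nullptr;
};

// Scans the rows for 'wanted'. On a hit, c_ptr is reassigned to the
// related container, exactly as in the snippet above. Returns whatever
// c_ptr points at when the loop ends.
Container* search(Container* c_ptr, int wanted) {
    assert(c_ptr != nullptr);
    bool found = false;
    for (auto it = c_ptr->rows.begin();
         !found && it != c_ptr->rows.end();  // flag first: && short-circuits
         ++it)
    {
        // Nested loop, so a plain 'break' here would not leave the outer loop.
        for (std::size_t j = 0; !found && j < it->size(); ++j) {
            if ((*it)[j] == wanted) {
                found = true;
                c_ptr = c_ptr->related;  // 'it' still belongs to the old container
            }
        }
    }
    return c_ptr;
}
```

Once found is true, the right-hand side of && is never evaluated, so the iterator from the old container is never compared against end() of the new one.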

So, here’s a fundamental question: can such a subtle mistake ever be avoided? Well, some mistakes can.

The basic approach is to learn which programming constructs are prone to error and avoid them as much as possible.

One example is using data to control flow, that is, aborting a loop via a boolean flag. There was an interesting discussion about it recently on the ACCU general mailing list.

Another example is assignment, and I mean real assignment, not initialization (that is, when you change the value of a variable). Purely functional languages simply lack assignment, and if you are a functional programming fan, you might even conclude that assignment is unnecessary, even evil.

I believe there’s some truth in that, after all, assignment changes state and that’s always a source of error… we just saw how a little subtle mistake resulted in a bug because of that.

Of course, in an imperative language like C++ you just can’t avoid assignment altogether, and there is no reason to try. It would be like trying to avoid all mutating operations on objects.

Yet many times we do try to avoid mutating operations on objects. In fact, in C++ we even use const to force us to stay on the safe side, because changing state, necessary as it is, is error prone.

So, should we avoid assignment? I’d say yes, as much as we can.

Can we write the above code without it then?

How about this?

SomeContainer* c_ptr = ..whatever..
SomeContainer* other_c_ptr = NULL ; // Conceptually uninitialized really
for ( some_iterator it = c_ptr->begin()
    ; other_c_ptr == NULL && it != c_ptr->end()
    ; ++ it
    )
{
  if ( some_condition )
  {
    other_c_ptr = c_ptr->OTHER_RELATED_CONTAINER() ;
  }
}

This comes close: the object determining the search range is held in a variable that never changes (is never assigned), which is great: there is no more risk of comparing incompatible iterators.

The result, OTOH, is held in a variable that, conceptually, is also just initialized, not really assigned, but with a subtlety: its initialization is conditional.

But yes…there’s no such thing as conditional initialization, of course.

So, can we do better? Let’s see:

SomeContainer* const find ( SomeContainer* const src )
{
  for ( some_iterator it = src->begin()
      ; it != src->end()
      ; ++ it
      )
  {
    if ( some_condition )
      return src->OTHER_RELATED_CONTAINER() ;
  }
  return NULL ;
}

SomeContainer* const source = ..whatever..
SomeContainer* const other_c_ptr = find(source) ;

Voila! No assignment.

Did you notice the const qualifiers in the last example? They are not strictly needed (the code would compile anyway), but they enforce the intention at compile time, which is always an excellent idea.

Wrap up: assignment should be used only when it is strictly necessary to change the value of a variable, that is, when we can’t use a different variable to hold the new value.
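For readers who want something they can compile, here is a rough, self-contained sketch of the same idea. The Container struct, its items and related members, and the name find_related are assumptions made up for illustration; some_condition is replaced by a simple equality test.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for SomeContainer.
struct Container {
    std::vector<int> items;
    Container* related = nullptr;
};

// Returns the related container if 'wanted' is among src's items,
// or nullptr otherwise. 'src' is a const pointer: it cannot be
// reassigned inside the function, so begin() and end() always refer
// to the same container.
Container* find_related(Container* const src, int wanted) {
    assert(src != nullptr);
    for (auto it = src->items.begin(); it != src->items.end(); ++it) {
        if (*it == wanted)
            return src->related;  // early return replaces the boolean flag
    }
    return nullptr;
}
```

A caller would then write something like Container* const result = find_related(source, 42); so the result is initialized exactly once, and any later attempt to assign to result or to source is rejected by the compiler.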
