May 1998/Questions & Answers

Columns

Questions & Answers: Use Caution with Temporary Objects

Pete Becker

All things in life are fleeting, say the philosophers. In C++, it helps to know exactly how fleeting, however.

To ask Pete a question about C or C++, send e-mail to petebecker@acm.org, use subject line: Questions and Answers, or write to Pete Becker, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.

In response to my February column about the approval of the C++ Draft Information Standard, I've received several requests for information on how to get copies of the DIS. Unfortunately, the answer isn't very satisfying. ISO and ANSI sell the standard, and you have to go through them to get it. However, it hasn't yet been formally approved, so they're not publishing it. Look for an official version later this summer.
In the meantime, there are electronic copies of the November 1997 working paper available online. That's the document that was made available for the U.S. public review, and you can find it at several web sites. It's currently available for download from http://www.setech.com/c++access.html#Access, or, as I mentioned last month, www.cygnus.com/misc/wp/dec96pub. Keep in mind that what you can get at these sites is not the document that was approved as the DIS. It's the version before the DIS, and some changes were made to turn it into the DIS. It's the best that you can get at the moment, though, if you're not participating in the standards effort.

Q

I've just read your answer to Rob Richardson about streaming debug messages. The point is to call ostream::flush() for every streaming job. For such purposes I use temporary stream objects in the following manner (actually, it doesn't need to be a template function, but my compiler sometimes has problems resolving calls where streams are passed via ostream reference):
#ifdef NO_RESOLVING_PROBLEMS
std::ostream
protected_ostream(std::ostream &os)
{
  return std::ostream(os.rdbuf());
}
#else
template <class oStr>
oStr protected_ostream(oStr &os)
{
  return oStr(os.rdbuf());
}
#endif

// some code
catch(std::exception const &e)
{
  protected_ostream(cout) <<
    e.what() << endl;
  protected_ostream(errfile) <<
    e.what() << endl;
}
Of course we can simply write:
{
  ostream(cout.rdbuf()) <<
    "ala ma kota" << endl;
  ostream(errfile.rdbuf()) <<
    "ala ma kota" << endl;
}
but I think this is less readable than the function call. I wonder what the Standard says about temporary object lifetime. My technique works for all compilers I know (I've never lost an error message), but maybe this is not portable? — Krzysztof Berezowski

A

The code isn't portable, but not for the reason you suspect. There's no problem with the lifetime of the temporary — it hangs around until the end of the statement. The problem is that you can't actually write to it. I'll give you the details in a moment, but first I want to talk a little bit about conditional code.

A problem that we all run into is code that suddenly stops working when we change compilers. Our first inclination — and it's a good one — is to figure out what works for the new compiler and conditionalize the code. That way, we have code that works with each of the compilers we're using. In your example, because some compilers have problems with references to ostreams, you added a template version for those compilers.
There's a further step, however, and it's one that often gets left out: checking whether the new version works with all of the relevant compilers. If it does, you can forget about the original version — you don't need it. In your example, just use the template, with a comment explaining that the simpler and more obvious solution doesn't work for some compilers. Don't take on the burden of maintaining multiple versions of a function if you don't have to. Use conditional code only when the compilers you work with require different coding approaches. If you have a solution that works for all platforms, just use it.
Now, back to the technical issues that your solution presents. First, as you've noticed, the call to protected_ostream produces a temporary object of type ostream. Back in the olden days it wasn't clear how long that temporary would last before its destructor was invoked. Some compilers invoked the destructor as soon as possible, and some invoked the destructor at the end of the block in which the temporary was created. That led to uncertainties about the timing of the destructor call, and folks were justifiably nervous about code that created temporary objects. That's all been changed, though.
According to the language definition, in most cases [1] the destructor of a temporary object will be invoked at the end of the full expression [2] in which the object was created. In your example, this means at the end of the insertion expressions in your code. So the temporary will hang around until the insertions have been completed, and it will be destroyed at the end of the insertion expression. A simpler version of this trick that I used to use is:
std::ofstream ("error.log",   std::ios::app)
    << "Error at line " << lineno
    << '\n';
My code relies on the same assumption as yours: the temporary ostream object will be destroyed, closing the file, after the error message and the newline have been inserted into the stream.
I don't use that trick any more (and I urge you to view it with some skepticism) because it is no longer valid, except in somewhat narrow situations. The problem is that an object of a class type that is returned by a function is not an lvalue, so there are some significant constraints on what you can do with it. In particular, you cannot pass it to a function that takes a non-const reference. For example:
void f( std::ostream& );
void g( const std::ostream& );
void test()
{
  f( protected_stream(std::cout) ); // invalid
  g( protected_stream(std::cout) ); // valid
}
The call to f is not valid, because it attempts to pass a temporary object to a function by reference. The call to g is okay, because it gets the ostream through a const reference. The idea behind this is that making changes to a temporary object really doesn't make sense, because those changes will be lost when the temporary goes away.

The rule is actually a bit more confusing, because you are allowed to call non-const member functions on temporary objects. This means that the code examples we've been talking about here are actually valid, because the inserters for the built-in types are members of ostream. So inserting a char *, as we've been doing here, works. The inserter for a user-defined type, though, can't be a member of ostream, so you won't be able to insert your own types into a stream with this technique. Thus:
protected_stream(std::cout) << "this is legal"; // valid
class C {};
std::ostream& operator << ( std::ostream&, const C& );
C c;
protected_stream(std::cout) << c; // invalid
I don't know of any commercially available compilers that actually enforce this restriction now, so you aren't likely to notice the change. Be aware, however, that when you upgrade to a newer version of your compiler, it will refuse to compile code like this.

Q

I am a little puzzled by your response to a query re the conditional operator. The code presented will not compile under ISO C rules because conditional operators return rvalues. So, for example, your
(b==6) ? a : c = 3 ;
at the top of page 81 (CUJ, January 1998) does not compile because, depending on the value of b, the conditional evaluates to either the value of a or the value of c, and 3 cannot be assigned to either of these. — Francis Glassborow

A

Whoops, you're right: in C the result of the conditional operator is always an rvalue, so you can't assign to it. I don't know how I got it into my head that it was an lvalue. In C++ it's an lvalue, so you can assign to it. This is a subtle difference between C and C++. The C++ rule is not incompatible with C, in the sense that valid C code continues to be valid in C++, but it's incompatible in the other direction. That is, it's valid in C++ and it looks like ordinary C code, but it's not valid in C. That makes it something to avoid. If you or the company you work for write a fair amount of code in both C and C++, you'll find this usage rather confusing. All of which underscores the statement I made in my column that an ordinary if statement is often a better approach.

Q
I am interested in your discussion of iostreams in the September issue of C/C++ Users Journal. I also have a question about formatted input using iostream. Usually, I use the scanf (sscanf, fscanf) function to handle formatted input in C. Supposing I have this kind of input file,
  1  14040 VI2   KNOT   0.0000  3000.0000  -2
I open the file and place each line in a buffer, and then, using sscanf, I extract the data into predefined variables. I do this because sometimes the input can vary: it is possible, for example, that the third data item may be missing. To handle this case, I check the return value of the sscanf function and determine if a data item is missing. Then I rescan the buffer, but now with a different format, of course. Things can be more complicated when any of the data items might be missing, instead of just the third item. I can skip the data I don't want with this format, sscanf(buff,"%*d%s%s%s%f%f%d,...). This function is quite easy to use yet very handy. When I rewrite my program in C++, I can still use this function by including <stdio.h>, but I thought iostream could handle it better using the input extractor. Could you give a design example of a class to handle formatted input utilizing iostream? — Harimurti Widyasena

A

C++'s iostreams can do everything that the printf and scanf families of functions in C can do, and they're more flexible. The price you pay for that flexibility and the type safety of iostreams is that code that uses them is more verbose. You've described two problems that can be easily solved with scanf. I'll reduce them to simpler examples and show how to accomplish the same thing with iostreams.

Your first example deals with files containing lines with different data formats. To make a much simpler example, let's say that we have a file that contains some lines with a single int value and some lines with two int values. Your technique would handle such a file like this:
void read()
{
  int r1, r2;
  int v1;
  char buff[256];
  int res;

  if( fgets( buff, sizeof(buff), stdin ) == NULL)
    exit(EXIT_FAILURE);

  res = sscanf( buff, "%d %d", &r1, &r2 );

  if( res != 2 )
  {
    res = sscanf( buff, "%d", &v1 );
    if( res != 1 )
      exit(EXIT_FAILURE);
  }
}
That is, the code first reads a line from stdin into the buffer buff, then attempts to extract two integer values from that input line into the variables r1 and r2. If that fails, it then tries to extract a single integer value from that input line into the variable v1. I've called the function exit when the code couldn't make sense of the input, either because it wasn't there or because it wasn't formatted in either of the two ways that this code recognizes. In a real application, of course, you'd need a more sophisticated error handling mechanism.
That's a good approach. Whenever there's a possibility that the input data won't be in the format that you're expecting, read the input into a buffer and then try to extract values from the text in the buffer. If you don't do this and the first call to scanf fails, you have no easy way to recover. If you're dealing interactively with users, you can simply tell them that their input was not valid, but that can get to be rather tedious. It's usually better to try to make sense of the input, experimenting with different formats that they might have used and seeing what happens. When you're reading from a file, you can't ask to have the data retransmitted in a different format you have to take whatever you've been given. You could, of course, mark your position in the input file before you try to read the data. Then, if you don't recognize the data, seek back to the starting position and start again. That approach works for files, but not for input from the console. You're better off adopting a uniform approach, reading into a buffer and working from there.
You can do the same thing in C++, but, as I said, it's a bit more verbose:
void read()
{
  int r1, r2;
  int v1;
  std::string buff;

  std::getline( std::cin, buff );
  if( !std::cin )
    exit(EXIT_FAILURE);

  std::istringstream input(buff);
  input >> r1 >> r2;

  if( !input )
  {
    input.clear();
    input.seekg(0);
    input >> v1;
  }
  if( !input )
    exit(EXIT_FAILURE);
}
Here, the code reads a line into a string. If that fails, the function calls exit. If it succeeds, the code constructs an input stream that reads from the string, then tries to read two integers from the string. If that fails, it resets the stream's error state, seeks back to the beginning, and tries to read a single integer from the string. If that too fails, it calls exit.
There's one major design difference between what the C version of this code does and what the C++ version does: the C++ code does not use a buffer with a hardcoded size. It uses a string, which will grow as necessary to accommodate the input. That's an advantage if there's a possibility of an input line that's too long for the buffer — the C version will fail if this occurs. This benefit comes at a price, though: the C++ version is significantly slower than the C version. If you've got a controlled input source so that you can be sure that you won't need to deal with a line that's longer than you expected, the C version's speed advantage can be compelling.
The second problem that you asked about is skipping fields that you're not interested in. You can do this easily in scanf by putting a '*' after the '%' in a format specifier. That tells scanf to read the characters for that type, but not to store the value anywhere. For example:
void read()
{
  int val;
  scanf( "%*d %d\n", &val );
}
This code will read an integer value from standard in, ignore it, read a second integer value from standard in, and store that value in val.

There's no simple way to do this in C++ other than the brute force technique of reading into a dummy variable:
void read()
{
  int dummy;
  int val;
  std::cin >> dummy >> val;
}
If you have several fields of the same type that you want to skip, you can read into the same dummy variable several times. Beyond that, there's not much you can do to make this code look more elegant.
Of course, the ultimate question here is, why are you rewriting working code? That can be a useful exercise to better understand C++, and if that's why you're doing it, go ahead. Also, it's sometimes useful to rewrite C code in C++ when you're enhancing the code, because C++ makes it possible to do some things more easily than you can do them in C. In this case that doesn't seem to apply. If this code is part of a production application and it meets the application's needs, leave it alone. If you're adding new code to the application and writing that new code in C++, don't feel compelled to replace existing code with C++. You can compile your old code as C and call it from C++. You may even be able to recompile your old code as C++ with only a few minor syntactic tweaks, and call it from your new code. Unless there's a good reason to rewrite it, though, don't change it. If it ain't broke, don't fix it.

Notes

[1] The major exception here involves binding a reference to a temporary. In that case the temporary is supposed to hang around for as long as the reference does. I've always found this rather confusing, and haven't relied on it in any of my code. Some compilers don't implement this.
[2] There's a formal definition of "full expression," but it probably fits your intuition: for simple statements, the full expression ends at the closing semicolon. If your code uses a conditional operator that creates a temporary, some compilers used to destroy the temporary when the conditional operator completed. If that occurred before the end of the full expression that contained the conditional operator, you could be in trouble. Imagine what would have happened with code like this:
int f( Foo *fp )
{
  int val;
  val = ((fp == 0) ? Foo() : *fp).getValue();
  return val;
}
In this case, when f is called with a null pointer, the conditional expression generates a temporary object of type Foo. If this temporary is destroyed at the end of the conditional expression, the member function getValue would be called on an object that had already been destroyed. This was a problem that people ran into with some compilers, and it's one of the motivations behind the requirement that the temporary live until the end of the full expression. o

Pete Becker is Technical Project Leader for Dinkumware, Ltd. He spent eight years in the C++ group at Borland International, both as a developer and manager. He is a member of the ANSI/ISO C++ standardization commmittee. He can be reached by email at petebecker@acm.org.