April 1997/C++ Theory and Practice

C++ Theory and Practice

Dan Saks

Placement new

Build your own objects, using ordinary materials found around the home, with placement new. But be careful.

Copyright © 1997 by Dan Saks

This is the third in a series of articles on new and delete. I assume that you know by now that a new-expression creates objects in two distinct steps. First, it allocates storage for an object by calling an allocation function (a function named either operator new or operator new[]). Then it initializes the object by calling a constructor or, if the object is an array, by calling a constructor for each array element.

A delete-expression also does its thing in two steps. First, it destroys the value in an object by calling a destructor or, if the object is an array, by calling a destructor for each array element. It then deallocates the object's storage by calling a deallocation function (a function named either operator delete or operator delete[]). (For more details, see the first article in this series, "C++ Theory and Practice: new and delete," CUJ, January 1997.)

C++ specifies the underlying machinery for each step so that programmers can fine tune the behavior of new- and delete-expressions by replacing parts of that machinery. Not only can you replace the global allocation and deallocation functions, you can also define allocation and deallocation functions as members of individual classes. (See the second article in this series, "C++ Theory and Practice: Class-specific new and delete," CUJ, March 1997.)

Class-specific allocation and deallocation functions give you the ability to establish different allocation strategies for each different class type or array thereof. If that's not enough flexibility for you, C++ even lets you establish multiple allocation strategies, at either the global or class scopes. This added flexibility comes to you courtesy of a feature called placement.

Placement Syntax

A typical new-expression such as
p = new T;
allocates memory for a T object (for some non-array type T) by the allocation function call:
p = operator new(sizeof(T));
A typical array-new-expression such as
p = new T[n];
allocates memory for an array of n T objects by an allocation function call of the following form (an implementation may request some additional bytes for array bookkeeping):
p = operator new[](n * sizeof(T));
The allocation functions must be declared as
void *operator new(size_t n)
    throw (bad_alloc);
void *operator new[](size_t n)
    throw (bad_alloc);
They could be global or, if T is a class type, they could be members of T.
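As a minimal sketch of the mapping from a new-expression to an allocation function call, the following class declares its own operator new that records the size it receives. The names widget, last_request, and size_requested_by_new are hypothetical, and the exception specifications shown above are omitted on the assumption of a current compiler, where they are no longer accepted.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class used only for illustration.
struct widget {
    // Class-specific allocation function: records the requested size,
    // then forwards to the global allocation function.
    static void *operator new(std::size_t n) {
        last_request = n;
        return ::operator new(n);
    }

    // Matching class-specific deallocation function.
    static void operator delete(void *p) noexcept {
        ::operator delete(p);
    }

    static std::size_t last_request;
    double payload[4];
};

std::size_t widget::last_request = 0;

// Returns the size that `new widget` passed to widget::operator new.
std::size_t size_requested_by_new() {
    widget *p = new widget;   // calls widget::operator new(sizeof(widget))
    delete p;                 // calls widget::operator delete(p)
    return widget::last_request;
}
```

Calling size_requested_by_new() should yield sizeof(widget), which is the first argument the new-expression supplies.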

Both size_t and bad_alloc are types defined in namespace std. As in my previous article, I shall refer to size_t and bad_alloc without the std:: prefix. You should assume that the using-directive
using namespace std;
is in effect from here on.

As with any other function, you can overload an operator new (at either global or class scope) by simply declaring additional functions named operator new with different parameter types. For example, in addition to the operator new declared above, you might declare
void *
operator new(size_t n, alloc_info i)
    throw (bad_alloc);
void *
operator new[](size_t n,
               alloc_info i)
    throw (bad_alloc);
where alloc_info is some user-defined type for conveying additional allocation information to the allocation function. That's simple enough, but how can you get a new-expression to use one of these allocation functions instead of the usual ones?

The problem is finding some way to pass the additional argument to operator new or operator new[]. You can't add a parenthesized argument list after the type name in the new-expression because the compiler will interpret that list as arguments to a constructor, not as arguments to an operator new. For example,
p = new T (x);
passes x as an argument to a T constructor. Consequently, you must squeeze the additional argument(s) to the allocation function into some other spot in the new-expression. The chosen spot is immediately after the keyword new.

For example, the new-expression in
alloc_info ai;
...
p = new (ai) T;
specifies ai as an additional argument to operator new. As always, the new-expression uses sizeof(T) as the first argument to operator new. Thus, the new-expression results in a call to
operator new(sizeof(T), ai)
To create an array of T using the alternate operator new[], you use a new-expression such as
p = new (ai) T[n];
which results in a call to
operator new[](n * sizeof(T), ai)
Compilers look in the usual places (class and global scope) and apply the usual rules for argument matching in overload resolution to find an allocation function that will accept this assembled argument list. It's a compile-time error if no such function exists.
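The placement syntax can be sketched end to end as follows. The contents of alloc_info (a pointer to a tally), the point type, and count_placement_requests are assumptions made for illustration. The sketch also declares a matching placement operator delete, which a new-expression calls only if the constructor throws, and it forwards to the global operator new so that an ordinary delete-expression can release the storage.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type carrying extra allocation information.
struct alloc_info {
    int *request_count;   // where to tally requests (illustration only)
};

// Placement allocation function: selected by `new (ai) T`.
void *operator new(std::size_t n, alloc_info i) {
    ++*i.request_count;
    return ::operator new(n);      // throws std::bad_alloc on failure
}

// Matching placement deallocation function; a new-expression calls it
// only if T's constructor throws.
void operator delete(void *p, alloc_info) noexcept {
    ::operator delete(p);
}

struct point {
    int x;
    int y;
    point() : x(1), y(2) { }
};

// Creates and destroys one point through the placement form and
// returns how many times the placement allocation function ran.
int count_placement_requests() {
    int count = 0;
    alloc_info ai = { &count };
    point *p = new (ai) point;     // operator new(sizeof(point), ai)
    assert(p->x == 1 && p->y == 2);
    delete p;                      // storage came from ::operator new
    return count;
}
```

Because the storage ultimately comes from the global operator new, delete p is appropriate here. An allocation function that draws storage from somewhere else would need a different cleanup path.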

Since C++ uses the usual argument matching rules, the parameter type to an allocation function need not match the argument type exactly. For example, if class D is publicly derived from B, the new-expression in
D d;
...
p = new (d) T;
can invoke
void *operator new(size_t n, B &b)
    throw(bad_alloc);
because a B & can bind to a D. If there is also a
void *operator new(size_t n, D &d)
    throw(bad_alloc);
this function is a better match for argument d, and the new-expression will use this one instead.
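A short sketch can confirm which overload wins. The member chosen and the two helper functions are hypothetical; each allocation function simply records that it ran and then forwards to the global operator new.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct B { int chosen; B() : chosen(0) { } };  // records the overload used
struct D : B { };                              // publicly derived from B

void *operator new(std::size_t n, B &b) {
    b.chosen = 1;                 // the B & overload was selected
    return ::operator new(n);
}

void *operator new(std::size_t n, D &d) {
    d.chosen = 2;                 // the D & overload was selected
    return ::operator new(n);
}

// Matching placement deallocation functions.
void operator delete(void *p, B &) noexcept { ::operator delete(p); }
void operator delete(void *p, D &) noexcept { ::operator delete(p); }

// A D argument binds exactly to D &, so that overload is preferred.
int overload_for_derived_argument() {
    D d;
    int *p = new (d) int(0);
    delete p;
    return d.chosen;
}

// A B argument can use only the B & overload.
int overload_for_base_argument() {
    B b;
    int *p = new (b) int(0);
    delete p;
    return b.chosen;
}
```

If the D & overload were removed, the first function would fall back to the B & overload through the derived-to-base conversion.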

Stroustrup [1] explained that his primary motivation for adding this syntax to new-expressions was to pass information on where to place the created object. Hence, he dubbed the argument list after the keyword new the placement syntax. A new-expression that includes the placement syntax is called new with placement or just placement new.

Using Placement new

The Standard library provides an operator new defined as
void *operator new(size_t, void *p) throw()
    {
    return p;
    }
which a program can use to construct an object at a particular address. The library also provides
void *operator new[](size_t, void *p) throw()
    {
    return p;
    }
which a program can use to construct an array at a particular address.

For example, a program for an embedded system that employs memory-mapped i/o might construct i/o port objects at specified hardware locations using placement new as follows:
class port
    {
    ...
    };

port *const console =
    reinterpret_cast<port *>(0xFF70);

...

new (console) port;
This new-expression calls
operator new(sizeof(port), console)
which ignores the first argument (the size) and simply returns console as the location of the storage for constructing a port object.

The program can then perform operations on the port almost as if it were any other dynamically-created object. For example, putting a character to the console might entail calling
console->put('x');
If class port has a destructor, the program can shut down the console sometime later by explicitly calling that destructor using
console->~port();
You should be careful to avoid writing
delete console;
because console does not point to storage that came from the free store. That delete-expression could wreak all sorts of havoc on the program.
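The construct, use, and destroy sequence can be tried on a desktop system by substituting ordinary static storage for the hardware address. In this sketch the buffer fake_device, the members of port, and the function write_through_simulated_port are all assumptions; a real port class would read and write device registers. The alignas specifier is a later addition to the language than this article.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical port class; a real one would touch device registers.
class port {
public:
    port() : last(0), open(true) { }
    ~port() { open = false; }
    void put(char c) { last = c; }
    char last_written() const { return last; }
    bool is_open() const { return open; }
private:
    char last;
    bool open;
};

// Stand-in for the memory-mapped location (an assumption for this
// sketch): static storage with the size and alignment of a port.
alignas(port) static unsigned char fake_device[sizeof(port)];

// Constructs a port in the stand-in storage, writes one character,
// shuts the port down, and returns the character the port saw.
char write_through_simulated_port(char c) {
    port *const console = new (fake_device) port;  // construct in place
    assert(console->is_open());
    console->put(c);
    const char seen = console->last_written();
    console->~port();    // explicit destructor call; never `delete console`
    return seen;
}
```

No storage is allocated or released anywhere in this sequence, which is why a delete-expression has no place in it.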

If you prefer that the console appear to the application as an object rather than as a pointer, you can declare the console as a reference:
port &console =
    *reinterpret_cast<port *>(0xFF70);
and use &console (rather than just console) as the placement argument:
new (&console) port;
Then, member function calls applied to the console would look like:
console.put('x');
and the destructor call would look like:
console.~port();
Another use for placement new is to place an object anywhere within a particular region of memory, rather than at just one specified location. For example, in a multitasking system, you might want to place some objects in a region shared by several tasks, and place other objects in a region accessible only to a single task. This you might do by declaring
enum privilege { exclusive, shared };
void *operator new(size_t n, privilege p);
Then you can place objects in shared memory by using new-expressions of the form:
p = new (shared) T;
Both Murray [2] and Stroustrup [1] use the term arena to refer to a memory allocation region. Murray provides a fairly complete example of a simple template class for managing arenas.
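A very small arena scheme along these lines might look like the following sketch. It is not Murray's template; the two fixed-size static buffers, the bump-allocation policy, and the names arena, used, in_arena, message, and demo_regions are all assumptions made for illustration. Objects created this way are destroyed explicitly, because their storage did not come from the free store.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

enum privilege { exclusive, shared };

namespace {
// Hypothetical fixed-size arenas standing in for the two memory regions.
const std::size_t arena_size = 1024;
alignas(std::max_align_t) unsigned char arena[2][arena_size];
std::size_t used[2] = { 0, 0 };
}

// Bump allocator: carves the next suitably aligned block out of the
// arena selected by p, and throws bad_alloc when the arena is full.
void *operator new(std::size_t n, privilege p) {
    const std::size_t a = alignof(std::max_align_t);
    const std::size_t start = (used[p] + a - 1) / a * a;  // round up
    if (n > arena_size - start)
        throw std::bad_alloc();
    used[p] = start + n;
    return arena[p] + start;
}

// Matching placement deallocation function, called only if a
// constructor throws. This simple arena never reclaims storage.
void operator delete(void *, privilege) noexcept { }

// Reports whether ptr lies inside the arena for privilege p.
bool in_arena(const void *ptr, privilege p) {
    const unsigned char *c = static_cast<const unsigned char *>(ptr);
    return c >= arena[p] && c < arena[p] + arena_size;
}

struct message { int id; };

bool demo_regions() {
    message *s = new (shared) message{1};
    message *e = new (exclusive) message{2};
    const bool ok = in_arena(s, shared) && in_arena(e, exclusive)
        && !in_arena(s, exclusive) && s->id == 1 && e->id == 2;
    s->~message();   // destroy explicitly; do not use delete
    e->~message();
    return ok;
}
```

A production arena would also need a way to reclaim or reset its storage, which this sketch leaves out.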

Calling a Constructor

As I demonstrated earlier, C++ lets you apply a destructor directly to an object using a call such as x.~T() or p->~T(), where x is a T object or a reference to a T and p is a pointer to T. On the other hand, C++ does not let you apply a constructor using the corresponding notation x.T() or p->T(). However, you can achieve the effect of a constructor call by using placement new.

For example,
new (&x) T;
constructs a T object in object x by applying T's default constructor to x as if it were an uninitialized T object. This raises an interesting question: How can you declare x so this placement new will behave itself?

x must be a region of memory big enough to hold a T. You might try declaring x as
char x[sizeof(T)];
and it might even work. However, if you try this on an architecture where a T object must be aligned on an address that's a multiple of, say, four or eight bytes, the placement new expression might fail. In particular, if character array x doesn't satisfy the alignment requirement for T, the placement new expression can induce a hardware addressing fault when it attempts to construct a T object in array x.

One way to ensure that x is properly aligned is to declare it as
T x;
but then x is not just a region of memory anymore; it's a completely constructed T object (initialized by T's default constructor). Applying
new (&x) T;
constructs a T object right on top of an existing T, which might cause resource leaks. If, prior to executing the placement new, x possesses some allocated resources, the constructor call from the placement new expression will allocate resources for a new T object without releasing the old resources. Not good.

You can avoid potential resource leaks by destroying the value of x before building a new object on top of it. That is, you should call x.~T() prior to the placement new expression.
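Both points can be sketched with a class that counts its live objects as a stand-in for owning resources. The class tracked and the two functions are hypothetical. The first function obtains properly aligned raw storage with alignas, which is a later addition to the language than this article; the second destroys an existing object before building a new one on top of it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical class that counts live objects, standing in for a
// class that owns resources.
struct tracked {
    static int live;
    int value;
    explicit tracked(int v = 0) : value(v) { ++live; }
    ~tracked() { --live; }
};

int tracked::live = 0;

// Constructs a tracked object in raw storage that has both the size
// and the alignment of a tracked object.
int construct_in_aligned_buffer() {
    alignas(tracked) unsigned char x[sizeof(tracked)];
    tracked *p = new (x) tracked(7);
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(tracked) == 0);
    const int v = p->value;
    p->~tracked();           // raw storage: destroy explicitly
    return v;
}

// Destroys the value of x before constructing a new object in its
// place, so nothing owned by the old value is leaked.
int rebuild_in_place() {
    tracked x(1);
    x.~tracked();            // release what the old value owns
    new (&x) tracked(2);     // construct the new value in place
    return x.value;
}                            // x is destroyed once more at scope exit
```

After either function returns, tracked::live should be back to zero, which indicates that every construction was paired with exactly one destruction.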

As is often the case with C++, powerful features such as placement new can be an agent for evil as well as good. For example, you can use explicit destructors and placement new to alter the value of a logically const object, as follows.

Normally, a program cannot alter the value of a logically const object. For instance, given
string const name("Ben");
where string is the standard library string class, compilers should reject any attempts to alter the value of name, such as
name = "Jeremy";
But a const object is const only until its destructor commences. So you can nullify the logical const-ness of name by destroying it and then constructing the new value in its place, as in
name.~string();
new (&name) string("Jeremy");
A complete program that does this nasty deed appears in Listing 1. The placement argument &name has type string const *. The new-expression (immediately above and in Listing 1) requires an operator new whose second parameter is either void const * or string const *. It cannot use the Standard library function
void *operator new(size_t, void *)
    throw();
because there's no standard conversion from string const * to void *. Therefore, the program supplies its own operator new whose second parameter is void const *. There is a standard conversion from string const * to void const *.

As the caption for Listing 1 says, this technique is pretty perverse. In fact, the current draft standard states that it produces undefined behavior. I'm not sure, but this might be worse than casting-away const because of the way it disguises what it's doing. I normally wouldn't even show you this except that I saw something like this in a C++ textbook whose name I won't mention. The book did not include what I would consider an appropriate "Don't do this at home" admonition, so I'm giving it to you now.

For the most part, using placement new to construct an object at a specified address is a relatively low-level style of programming. It's not something you want to do very often. As always, you must make thoughtful judgments about what's robust and maintainable.

Here's an example of a coding practice I find much more acceptable. Suppose some part of your program deletes a pointer p pointing to a T object, and then turns around almost immediately and allocates another T object, as in
delete p;
...
p = new T;
If your performance measurements indicate that the overhead of deallocating and reallocating these objects is too costly at this point in the program, you might be able to reduce that overhead by rewriting the code as
p->~T();
...
new (p) T;
As long as the distance between these two statements is short, and there's a comment to explain this little hack, I think this would be okay.
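The following sketch shows the rewritten sequence in a form that can be checked. The class session and the function reuse_storage are hypothetical; the sketch also assigns the result of the placement new back to p, which is a stylistic choice rather than something the text above requires.

```cpp
#include <cassert>
#include <new>

// Hypothetical class whose constructor numbers each object it builds.
struct session {
    static int constructed;
    int id;
    session() : id(++constructed) { }
    ~session() { }
};

int session::constructed = 0;

// Replaces the value of *p without deallocating and reallocating.
bool reuse_storage() {
    session *p = new session;
    const void *where = p;
    const int first_id = p->id;

    p->~session();          // destroy the old value; storage stays allocated
    p = new (p) session;    // construct a fresh value at the same address

    const bool ok = (p == where) && (p->id == first_id + 1);
    delete p;               // one ordinary delete releases the storage
    return ok;
}
```

The storage is allocated once and released once, while two distinct session values occupy it in turn.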

new with nothrow

In days of old, allocation functions and new-expressions used to indicate allocation failures by returning a null pointer. These days, the "usual" allocation functions indicate failure by throwing an exception of type bad_alloc. Some programmers with existing C++ applications may prefer to avoid integrating C++ exception-handling into their code. Even those who choose to use exceptions may elect to avoid them in certain parts of their application.

One way to avoid throwing exceptions when creating objects dynamically is to allocate memory using malloc instead of operator new. malloc is a Standard C function which is incapable of throwing an exception. Thus, rather than risk throwing an exception using
p = new T (v);
you can avoid exceptions by writing
p = static_cast<T *>(malloc(sizeof(T)));
new (p) T (v);
Although this placement new expression calls an operator new, it calls the Standard library's
void *operator new(size_t, void *)
    throw();
which never throws exceptions. (This discussion presumes that T's constructor does not throw any exceptions either.)

You can collapse the previous two statements into one, and avoid the cast in the process (always a good thing), by writing
p = new (malloc(sizeof(T))) T (v);
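A slightly more defensive version of this idea is sketched below. It checks the result of malloc before constructing, on the assumption that constructing an object at a null address is not something to rely on, and it releases the object with an explicit destructor call followed by free, since the storage did not come from operator new. The names reading, make_reading, destroy_reading, and round_trip are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical class whose constructor cannot throw.
struct reading {
    int value;
    explicit reading(int v) noexcept : value(v) { }
    ~reading() { }
};

// Returns a null pointer on allocation failure instead of throwing.
reading *make_reading(int v) noexcept {
    void *raw = std::malloc(sizeof(reading));
    if (raw == nullptr)
        return nullptr;              // never construct at a null address
    return new (raw) reading(v);     // the library placement form never throws
}

// Storage from malloc must go back through free, not delete.
void destroy_reading(reading *p) noexcept {
    if (p == nullptr)
        return;
    p->~reading();
    std::free(p);
}

// Creates a reading, reads it back, and releases it; returns -1 if
// the allocation failed.
int round_trip(int v) {
    reading *p = make_reading(v);
    if (p == nullptr)
        return -1;
    const int got = p->value;
    destroy_reading(p);
    return got;
}
```

Keeping creation and destruction in a pair of functions like this makes it harder to release the object with a delete-expression by mistake.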
The C++ library offers a slightly cleaner way to avoid throwing exceptions from allocation failures. It provides yet another overloading for each of the allocation functions:
void *
operator new(size_t, const nothrow_t &)
    throw();
void *
operator new[](size_t, const nothrow_t &)
    throw();
These are the nothrow allocation functions.

nothrow_t is a type defined in namespace std as an empty struct:
struct nothrow_t { };
Its only purpose is to provide a type that gives the nothrow allocation functions a distinct parameter type.

The nothrow allocation functions behave exactly like the "usual" allocation functions:
void *operator new(size_t)
    throw(bad_alloc);
void *operator new[](size_t)
    throw(bad_alloc);
except when an allocation failure occurs. Whereas the "usual" allocation functions throw a bad_alloc exception, the nothrow allocation functions return a null pointer.

For example, the new-expression:
p = new (nothrow_t()) T;
creates a T object in storage allocated by nothrow operator new via the call:
operator new(sizeof(T), nothrow_t())
Notice that the placement argument is nothrow_t() rather than just nothrow_t. The argument to the nothrow operator new must be a nothrow_t object. The expression nothrow_t() yields a temporary nothrow_t object initialized using a compiler-generated default constructor.

Those added parentheses make nothrow new-expressions a little harder to read and write, so the library includes a definition for a common nothrow_t object:
extern const nothrow_t nothrow;
Using this object, you can simplify the previous nothrow new-expression by writing
p = new (nothrow) T;
You can freely intermix throw and nothrow new-expressions in your programs. In either case, you discard the objects using
delete p;
for individual objects or
delete [] p;
for arrays.
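As a brief sketch, code that uses nothrow new-expressions tests each result against a null pointer before using it. The type sample and both function names are hypothetical, and the names from the library are written with the std:: prefix so that the block stands on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class used only for illustration.
struct sample {
    int value;
    sample() : value(5) { }
};

// Creates one object without risking bad_alloc; returns -1 on failure.
int read_sample_without_exceptions() {
    sample *p = new (std::nothrow) sample;
    if (p == nullptr)
        return -1;                   // allocation failed, nothing to delete
    const int v = p->value;
    delete p;                        // same delete as for a throwing new
    return v;
}

// Creates and fills an array without risking bad_alloc; returns false
// if the allocation failed.
bool fill_buffer_without_exceptions(std::size_t n) {
    char *buf = new (std::nothrow) char[n];
    if (buf == nullptr)
        return false;
    for (std::size_t i = 0; i < n; ++i)
        buf[i] = 'x';
    const bool ok = (n == 0) || (buf[n - 1] == 'x');
    delete [] buf;                   // array form, as usual
    return ok;
}
```

The null test replaces the try block that the throwing forms would call for; the cleanup code is the same in both styles.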

References

[1] Bjarne Stroustrup. The Design and Evolution of C++ (Addison-Wesley, 1994).

[2] Robert B. Murray. C++ Strategies and Tactics (Addison-Wesley, 1993).

Dan Saks is the president of Saks & Associates, which offers training and consulting in C++ and C. He is active in C++ standards, having served nearly seven years as secretary of the ANSI and ISO C++ standards committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906 USA, by phone at +1-937-324-3601, or electronically at dsaks@wittenberg.edu.