Adding Exceptions & RTTI to the Windows CE Compiler: Part II

Dr. Dobb's Journal September 2002

Tweaking the TCU library

By Dani Carles

Dani is a senior software engineer for Silicon & Software Systems (http://www.s3group.com/). He can be contacted at dani.carles@s3group.com.

In the first installment of this two-part article, I described why Silicon & Software Systems (where I work) had to reengineer an RTOS-based application so it could run under Windows CE. The problem, you recall, is that the Windows CE 3.0 compiler doesn't support C++ exception handling and Run Time Type Identification (RTTI). The freely available TCU library (see "Resource Center," page 5), which emulates exception handling and RTTI, came to the rescue — to a point. For a number of reasons, we ended up modifying the library to solve our problem. This month, I'll present those workarounds.

The TCU Library

The TCU library defines two types of stacks that are built dynamically at run time:

The destruction stack is a list of pointers to tcu__Xsc objects that need to be destroyed during stack unwinding.
The try block stack is a list of pointers to try block descriptors. These descriptors are pushed onto and popped from this list as the code works its way through the call stack. In the working case I presented last month, the try block in main() would be the only member in the stack. If foo() had a try block of its own, a descriptor for it would be added to the top of the stack, and so on.

When an exception is thrown, the execution flow is implemented in terms of the C Runtime Library setjmp()/longjmp() functions. Every try block descriptor contains a jmp_buf array used by setjmp()/longjmp() to save/restore the program environment. try block descriptors also contain a pointer where the current top of the destruction stack is saved, a pointer to the previous top of the try stack, and a pointer to the currently thrown exception, if any.
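The control flow just described can be approximated in a few lines. The following is only a minimal sketch and not the TCU source: the names TryBlock, Context, throwMessage(), and runGuarded() are hypothetical, the "exception" is a plain string, and the current exception is kept in the context rather than in the descriptor. No object with a nontrivial destructor lives in the frames that longjmp() skips, because standard C++ leaves that case undefined (which is exactly why TCU has to maintain a destruction stack of its own).

```cpp
#include <cassert>
#include <csetjmp>
#include <cstring>

// Hypothetical stand-in for a try block descriptor.
struct TryBlock {
    std::jmp_buf env;    // program environment saved by setjmp()
    TryBlock*    prev;   // previous top of the try block stack
};

// Hypothetical exception-handling context (TCU keeps one per thread).
struct Context {
    TryBlock*   tryTop;   // top of the try block stack
    const char* current;  // currently thrown "exception", if any
};
static Context g_ctx = { nullptr, nullptr };

// "Throw": record the exception and jump back to the nearest try block.
// Calling this with no try block on the stack is not handled here.
[[noreturn]] void throwMessage(const char* what)
{
    g_ctx.current = what;
    std::longjmp(g_ctx.tryTop->env, 1);
}

// Runs body() inside a "try block"; returns what was caught, or nullptr.
const char* runGuarded(void (*body)())
{
    TryBlock tb;
    tb.prev = g_ctx.tryTop;      // link to the enclosing try block
    g_ctx.tryTop = &tb;          // push this descriptor
    g_ctx.current = nullptr;

    if (setjmp(tb.env) == 0) {
        body();                  // the "try" body
    }
    // Reached on normal exit and again after longjmp().
    g_ctx.tryTop = tb.prev;      // pop this descriptor
    const char* caught = g_ctx.current;
    g_ctx.current = nullptr;     // the exception counts as handled
    return caught;
}

void foo()   { throwMessage("A derived exception"); }
void quiet() {}

// A nested try block that catches foo()'s exception and throws a new one
// to the enclosing try block.
void nested()
{
    const char* inner = runGuarded(foo);
    if (inner != nullptr) {
        throwMessage("A different derived exception");
    }
}
```

Calling runGuarded(foo) returns the thrown message, runGuarded(quiet) returns a null pointer, and runGuarded(nested) shows the descriptor stack at work: the inner descriptor is popped before the second throw, so the jump lands in the outer try block.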

Code helps in understanding how a try-catch clause is implemented by TCU. Listing Four is the implementation of the try-catch block in Listing Three (presented last month). A local try block descriptor is instantiated on the stack along with a local pointer to the current exception. The catch clauses are implemented as if-else branches of a setjmp(). The conditions of those branches make use of RTTI to determine what exception handler to use. Once an exception handler is entered, a local reference to the current exception is made available. Then the pointer to the current exception is cleared to signify that the exception was successfully caught.

Every execution thread stores a pointer to its exception-handling context in Thread Local Storage (TLS) memory. This context contains pointers to the top of the current destruction stack, top of the try block stack, and current exception object.

Every time an object derived from tcu__Xsc is constructed, its address is pushed onto the current destruction stack. When it is destroyed, the address is popped out of it. When a try block is encountered, its descriptor address is pushed onto the try block stack, the address of the current destruction stack is saved into the descriptor, and the current destruction stack is cleared from the context. This is because every try block only cleans up what was constructed within its own scope. When the try block is left either via normal exit or exception throwing, objects in its destruction stack are destroyed, the try block descriptor address is popped off the stack, and the saved destruction stack address is restored. If further unwinding takes place, these objects are the responsibility of the next try block up the call stack.
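To make the bookkeeping concrete, here is a small sketch of a destruction stack and of a try block that saves and restores it. It is an illustration under assumptions, not TCU code: Unwindable stands in for tcu__Xsc, TryScope for the try block descriptor, and a single static Context replaces the per-thread context. The actual unwinding on a throw is left out.

```cpp
#include <cassert>

struct Unwindable;

// Hypothetical context; TCU stores one of these per thread.
struct Context {
    Unwindable* destructionTop = nullptr;  // top of the current destruction stack
};
static Context g_ctx;

// Stand-in for tcu__Xsc: registers itself on construction and
// unregisters itself on destruction.
struct Unwindable {
    Unwindable* prev;  // previous object on the destruction stack

    Unwindable() : prev(g_ctx.destructionTop) { g_ctx.destructionTop = this; }
    ~Unwindable() { g_ctx.destructionTop = prev; }

    Unwindable(const Unwindable&) = delete;
    Unwindable& operator=(const Unwindable&) = delete;
};

// Stand-in for the try block descriptor: it saves the enclosing
// destruction stack and starts an empty one for its own scope.
struct TryScope {
    Unwindable* saved;

    TryScope() : saved(g_ctx.destructionTop) { g_ctx.destructionTop = nullptr; }
    ~TryScope() { g_ctx.destructionTop = saved; }
};

// Debug aid: number of objects on the current destruction stack.
int stackDepth()
{
    int n = 0;
    for (Unwindable* p = g_ctx.destructionTop; p != nullptr; p = p->prev) {
        ++n;
    }
    return n;
}
```

With two Unwindable objects alive, stackDepth() is 2. Entering a TryScope drops it to 0, because the try block only sees what is constructed inside it, and leaving the scope brings the saved stack back.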

The method that implements the actual throwing of an exception gets the top of the destruction stack from the context, and triggers the destruction of the objects it contains. When the destruction is complete, it jumps back to the nearest try block (which it gets from the context) via longjmp().

Throwing from a catch Block

One problem we ran into when using the TCU library out of the box involved throwing inside a catch block. Listing Five is the problem reduced to its simplest form. When you run the program, the final assertion fails. The condition in the assertion checks that as many UnwindableObj objects were destroyed as constructed. A quick inspection of this counter reveals a value of 1, suggesting there is one object whose destructor is never called. We found that the culprit was the first exception object created on the heap: the exception object thrown from bar(). So what's the problem? A search within the TCU code reveals that the current exception is only deleted in two places: inside tcu_xTerminate() and in the destructor of the try descriptor. Recall from Listing Four that the try descriptor is only a local variable. Because you are exiting the catch clause by throwing a different exception (using longjmp()), the try descriptor destructor is never called, so the current exception is never cleaned up. If you want to allow exceptions to be thrown from inside catch blocks, what's needed is some way of forcing that destructor to be called.

To make matters worse, this is only the tip of the iceberg. The fundamental problem with TCU is that it does not have enough context information to know whether it is inside a catch block. Listing Six presents a more serious consequence of the same problem. Suffice it to say that the TCU_X_RETHROW statement in main() does not behave as expected and the exception is not rethrown. The flow of execution continues as if no exception had been thrown. What is actually happening is that the try block descriptor in bar() is wrongly clearing the current exception from the context.

Copy Constructors and Object Destruction Order

Another problem with the TCU library concerns unwinding the stack in the presence of implicit copy constructor calls. Remember that each object derived from tcu__Xsc points to the previous object on the destruction stack. It is this pointer to the previous object that TCU uses to reset the top of the stack when destroying an object. This works fine if the object being destroyed is indeed on the top of the stack, and this is the assumption TCU is making throughout. However, there is at least one case when this is not so. Listing Seven shows this situation. In the example, function printUnwObjList() is only a debug aid that displays the contents of the destruction stack.

In a standard-conforming C++ implementation, the order of constructor and destructor calls should be:

ret.DerivedObj()

d.DerivedObj(ret)

ret.~DerivedObj()

d.~DerivedObj()

where, for brevity, only the most derived constructor/destructor calls are shown. TCU relies on destructors being called in strict reverse order to that of construction. In other words, the object being removed from the stack must always be the top one. When ret is removed in this example, the object on the top is actually d, given the construction/destruction order just shown. TCU sets the top of the destruction stack in the current context to be ret's previous object, in this case 0 or the end of the list. The result is that d's destructor is never called.
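The failure can be reproduced without TCU at all by simulating the stack operations by hand. In this sketch, Node, push(), and popAssumingTop() are hypothetical names. The push and pop calls are issued explicitly in the order listed above, so the outcome does not depend on whether a given compiler elides the copy.

```cpp
#include <cassert>

// Hypothetical stand-in for an object on the destruction stack.
struct Node {
    Node* prev = nullptr;  // previous object on the stack
};

static Node* g_top = nullptr;  // top of the destruction stack

// What a constructor does: push the new object.
void push(Node& n)
{
    n.prev = g_top;
    g_top = &n;
}

// What a destructor does when it assumes the object being destroyed
// is the top of the stack.
void popAssumingTop(Node& n)
{
    g_top = n.prev;
}

// Debug aid: number of objects currently on the stack.
int stackLength()
{
    int count = 0;
    for (Node* p = g_top; p != nullptr; p = p->prev) {
        ++count;
    }
    return count;
}
```

Pushing ret and then d gives a stack of length 2. Calling popAssumingTop(ret) while d is still on top sets the top to ret's previous object, a null pointer, so the stack is suddenly empty and d can no longer be reached during unwinding.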

The Modified Library

The library with our modifications that overcome the mentioned problems is available electronically. Here, I describe the changes we made and how they are implemented.

Adding catch blocks. To support exception throwing from a catch block, you first need to be able to tell whether you are in a catch block. Once you know, you can modify the unwinding code to force a call of the try block descriptor destructor — a call missing in the original TCU implementation. You can also make sure that the current exception object is only deleted when it is not rethrown and that the current exception pointer is only cleared from the context when no new exception is thrown. To this effect, I added a new class I call a "catch block descriptor," which exhibits this behavior: Its constructor registers itself with the current context, so that it is possible to tell from the context that the execution has entered a catch block. It then stores a pointer to the try block descriptor associated with this catch block (this pointer is needed during cleanup). Finally, it stores a pointer to the current context exception — this is the exception caught by the catch block (the try block destructor needs access to this pointer).

Its destructor unregisters itself from the current context, thereby indicating you are leaving the catch block. Next, it deletes the caught exception object only if it is not rethrown. Then it clears the current exception pointer from the context only if no new exception is thrown from this catch block.

Finally, the new class has a cleanup() function that makes an explicit destructor call of this object and makes an explicit destructor call of the associated try block descriptor.

The try block descriptor class needs to be modified so that:

A pointer to an associated catch block descriptor is added to the class members.
In the constructor, if the current execution is inside a catch block, save the pointer to the catch block descriptor and unregister that pointer from the context. While inside this try block, you are effectively no longer in a catch block and must update the context accordingly.
In the destructor, if a catch block descriptor was saved, register it back with the context. Get the current exception from the catch block descriptor and register it with the context.
In the unwind() method, if you are inside a catch block, make a call to its cleanup() method. This forces both the catch block and try block destructors to be called before longjmp() is called.

A pointer to a catch block descriptor is also added to the context. Again, this lets you tell whether the execution path is inside a catch block (but not inside a nested try block).

Finally, the TCU_CATCH_XX macros are modified to include the instantiation of a local catch block descriptor variable on the stack. This object has the same scope as the catch block itself, so its constructor is called when the block is entered and its destructor is called when the block is left. If the block is left via longjmp(), the destructor is still explicitly called by the cleanup() method.
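The constructor and destructor rules for the catch block descriptor can be sketched as follows. This is a simplified model built on assumptions, not the code from the modified library: CatchBlock, Context, and the rethrown flag are hypothetical names, a new throw is modeled by pointing the context at a different exception object, and cleanup() together with the longjmp() path is omitted.

```cpp
#include <cassert>

// Minimal exception base that counts destructor calls for the demo.
struct Exception {
    static int destroyed;
    virtual ~Exception() { ++destroyed; }
};
int Exception::destroyed = 0;

struct TryBlock;    // opaque here; only a pointer to it is stored
struct CatchBlock;

// Hypothetical exception-handling context.
struct Context {
    CatchBlock* catchBlock = nullptr;  // non-null while inside a catch block
    Exception*  current    = nullptr;  // current exception, if any
};
static Context g_ctx;

struct CatchBlock {
    TryBlock*  tryBlock;          // associated try block descriptor
    Exception* caught;            // exception caught by this catch block
    bool       rethrown = false;  // assumed to be set by a rethrow

    CatchBlock(TryBlock* tb, Exception* e) : tryBlock(tb), caught(e)
    {
        g_ctx.catchBlock = this;  // execution has entered a catch block
    }

    ~CatchBlock()
    {
        g_ctx.catchBlock = nullptr;  // leaving the catch block
        const bool newException = (g_ctx.current != caught);
        if (!rethrown) {
            delete caught;           // the caught exception is finished with
        }
        if (!rethrown && !newException) {
            g_ctx.current = nullptr; // nothing is propagating any more
        }
    }
};
```

Three cases follow from this. On a normal exit, the caught exception is deleted and the context is cleared. If a different exception is thrown, the caught one is deleted but the context keeps pointing at the new one. On a rethrow, nothing is deleted and the context is left untouched.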

Searching the object list. The second problem we encountered concerned the object destruction order. The workaround I implemented does not incur excessive overhead when the construction and destruction sequences are symmetrical. This is the usual case. For less common asymmetrical situations (such as with copy constructors), an extra list search occurs. This solution introduces an extra check to see whether the object being destroyed is the same as the top of the destruction stack. If it is, things proceed as before; otherwise, the destruction list is searched for a match and, if the pointer is found, removed from the list. We did not see any noticeable performance decrease in our application after this change was introduced. That said, our application wasn't hard real time, so functionality was more important than performance. That might differ in your case.
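A sketch of that check, using a hand-rolled singly linked stack, might look like this. Node, push(), and removeFromStack() are hypothetical names chosen for illustration and are not taken from the modified library.

```cpp
#include <cassert>

// Hypothetical stand-in for an object on the destruction stack.
struct Node {
    Node* prev = nullptr;  // previous object on the stack
};

static Node* g_top = nullptr;  // top of the destruction stack

void push(Node& n)
{
    n.prev = g_top;
    g_top = &n;
}

// Removes n from the stack wherever it is.
void removeFromStack(Node& n)
{
    if (g_top == &n) {         // usual, symmetrical case: constant time
        g_top = n.prev;
        return;
    }
    // Asymmetrical case: linear search for the node that points at n.
    for (Node* cur = g_top; cur != nullptr; cur = cur->prev) {
        if (cur->prev == &n) {
            cur->prev = n.prev;  // unlink n and keep the rest intact
            return;
        }
    }
    // n was not on the stack; nothing to do.
}
```

Replaying the sequence from Listing Seven (push ret, push d, remove ret, remove d) now leaves d on the stack after ret is removed, and the stack is empty only once d itself has gone.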

RTTI and DLLs. Again, RTTI does not work when objects cross DLL borders in the TCU library. Here's why: Our inspection of the original TCU code reveals the template class in Example 1. This template class is used to get a unique ID for class T. This unique ID is, in fact, &tcu__RttiTypeIdImplementation<T>::s_id and references to this address are made throughout the code. The problem with RTTI and DLLs is that every DLL has a different instantiation of the aforementioned template class and, therefore, the address of its only static data member will be different. The workaround I propose here was never integrated in our embedded product because the DLL issue was not a problem for us. However, we did test it in a Windows NT environment and it should work equally well for Windows CE.

The template class in Example 1 can be rewritten as in Example 2. All the direct references to s_id's address in the code are replaced with calls to GetUniqueId(). Now, if you write a specialization of GetUniqueId() for classes whose objects may cross DLL borders, the problem goes away — well, as long as the returned const void * is a unique number associated to that particular class. One way to do this in Win32 is by using the Win32 API RegisterWindowMessageA() function, which takes a string as an argument and returns an integer that is guaranteed to be unique throughout your system, even across different processes. Adding Example 3 to the API-specific part of TCU should do the trick.
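Since Examples 1 through 3 are not reproduced here, the following sketch shows one way the idea could look. Everything in it is an assumption for illustration: TypeId is a stand-in for the TCU template, idFromName() and Shape are hypothetical, and the string "tcu.rtti.Shape" is an arbitrary choice. On Win32, the ID comes from RegisterWindowMessageA(), which returns 0 on failure (not handled here). On other platforms, a process-local table is substituted purely so the sketch can be compiled and run; it does not provide uniqueness across modules.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#else
#include <map>
#include <string>
#endif

// Hypothetical stand-in for the TCU type-ID template: by default the
// unique ID of T is the address of a static data member.
template <class T>
struct TypeId {
    static char s_id;
    static const void* GetUniqueId() { return &s_id; }
};
template <class T> char TypeId<T>::s_id = 0;

// Maps a class name to an ID that does not depend on which module asks.
inline const void* idFromName(const char* name)
{
#ifdef _WIN32
    const UINT id = ::RegisterWindowMessageA(name);
#else
    // Fallback for illustration only: unique within this process.
    static std::map<std::string, unsigned> registry;
    auto it = registry.find(name);
    if (it == registry.end()) {
        const unsigned next = 0xC000u + static_cast<unsigned>(registry.size());
        it = registry.emplace(name, next).first;
    }
    const unsigned id = it->second;
#endif
    return reinterpret_cast<const void*>(static_cast<std::uintptr_t>(id));
}

// Hypothetical class whose objects may cross DLL borders.
struct Shape {};

// Specialization of GetUniqueId() for Shape: name based, not address based.
template <>
inline const void* TypeId<Shape>::GetUniqueId()
{
    static const void* const id = idFromName("tcu.rtti.Shape");
    return id;
}
```

Classes that stay inside one module keep the cheap address-based ID, while Shape gets the same ID from every module that registers the same string.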

Major Limitations

Of course, even the modified TCU library has limitations, including:

The library is intrusive. You must change your code to use the TCU-specific syntax.
Exception-sensitive classes must either directly or indirectly inherit from tcu__Xsc.
It can turn single inheritance into multiple inheritance and may turn nonvirtual multiple inheritance into virtual multiple inheritance.
Exception classes must be directly or indirectly derived from exception base class tcu_Xc; for instance, exceptions of type int or std::string cannot be thrown.
All exception classes must provide a copy constructor.
Try, throw, and catch constructs must be replaced with library-specific macros.
Exceptions inside constructors do not behave according to the C++ Standard. According to the Standard, an object is not considered fully constructed until its constructor has completed. Therefore, exceptions thrown from inside a constructor do not result in the object's destructor being called. The TCU library incurs a destructor call in this case. Both TCU and a C++ Standard conformant implementation call the destructors for the base class and any data members.
Exception specifications are not supported.
The first operation in the constructor body of any class derived from tcu__Xsc should be TCU_X_RESET. This is true even for constructors with an empty body. This statement may or may not be omitted for classes that have no members derived from tcu__Xsc.
When implicit copy constructor calls are involved, there is a performance penalty to be paid in the form of a list search. This is linear time so it can be noticeable for large lists.
If objects of a particular class have to cross DLL borders, an extra template specialization is required as part of the class definition for RTTI to work as expected.
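For reference, the Standard behavior mentioned in the limitation about exceptions inside constructors can be checked with native exceptions on any conforming compiler. The sketch below uses hypothetical classes Member and Owner and counts destructor calls. It shows only what the Standard requires, not what TCU does.

```cpp
#include <cassert>
#include <stdexcept>

static int g_memberDtors = 0;  // destructor calls for the data member
static int g_ownerDtors  = 0;  // destructor calls for the enclosing object

struct Member {
    ~Member() { ++g_memberDtors; }
};

struct Owner {
    Member m;  // fully constructed before Owner's constructor body runs

    Owner() { throw std::runtime_error("constructor failed"); }
    ~Owner() { ++g_ownerDtors; }
};

// Tries to construct an Owner; returns true if the exception was caught.
bool constructAndCatch()
{
    try {
        Owner o;
        (void)o;
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

After constructAndCatch() returns, the member's destructor has run once and Owner's destructor has not run at all. Code ported to TCU has to tolerate the extra destructor call on the partially constructed object.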

Conclusion

The TCU library is no match for native exception handling and RTTI mechanisms implemented at compiler level. However, it offers a workable alternative for Windows CE embedded systems that are not bound by hard real-time deadlines. The usefulness of the TCU library is twofold:

If you are writing code from scratch, it gives you two missing but important C++ language features: RTTI and exception handling.
When porting code that relies on those features, this may be your only (cheap) option.

Obviously, if a more robust solution is desired, investing in a different compiler should be considered. There is at least one company that offers custom ports of their EDG C++ front end with a C back end tailored to your specific platform (à la cfront). This would not only give you RTTI and exception handling, but almost full C++ Standard compatibility including the most obscure features you can ever dream of. Almost, that is, because as I write this, nobody in the world has been able to implement the controversial export keyword.

Acknowledgments

Thanks to Dominic Herity and Oscar del Pozo for their helpful suggestions in improving the clarity of this article. I would also like to thank Derek Dwyer, Brian McNamara, Pavol Droba, and Clare McCormack at Silicon & Software Systems for their help throughout a difficult project.

DDJ


Listing Four

{
    tcu__XTryBlock __tcuXtb__; 
    tcu_Xc *__tcuXcPtr__; 
    __tcuXcPtr__ = 0; 
    
    if (!_setjmp (__tcuXtb__.jmpBuf())) 
    {
        {
            DerivedObj d;
            foo();
        }
    } else if ((__tcuXcPtr__= 
           tcu_rttiDynamicCast<ExcDerived>(__tcuXtb__.valuePtr())) != 0)
    { 
        ExcDerived &e = *reinterpret_cast<ExcDerived *>(__tcuXcPtr__); 
        __tcuXtb__.valuePtr() = 0;
        {
            cout << "ExcDerived caught: " << e.what() << 
                                        " (0x" << &e << ")" << endl;        
        }
    } else if ((__tcuXcPtr__= 
           tcu_rttiDynamicCast<ExcBase>(__tcuXtb__.valuePtr())) != 0)
    { 
        ExcBase &e = *reinterpret_cast<ExcBase *>(__tcuXcPtr__); 
        __tcuXtb__.valuePtr() = 0;
        {
            cout << "ExcBase caught: " << e.what() << 
                                     " (0x" << &e << ")" << endl;
        }
    } else 
    { 
        __tcuXtb__.valuePtr() = 0;
        {
            cout << "Unknown exception" << endl;
        } 
    }
}


Listing Five

#include <iostream>
#include <cassert>
#include "tcutest.h"

using namespace std;

// File scope functions
namespace {
void bar()
{
    cout << "In bar() ..." << endl;
    UnwindableObj bar1,bar2;        
    TCU_X_THROW(ExcDerived("A derived exception"));
}
void foo()
{
    cout << "In foo() ..." << endl;
    UnwindableObj foo1,foo2;    
    TCU_X_TRY
    {            
        bar();                 
    }
    TCU_X_CATCH_TYPE(ExcDerived) 
    {       
        UnwindableObj foo3,foo4;
        cout << "ExcDerived caught in foo " << endl;        
        TCU_X_THROW(ExcDerived("A different derived exception"));
    } TCU_X_END_TRY
}
}
// Static member definition
unsigned int UnwindableObj::mv_nCount = 0;
// main
int main()
{
    cout << "Throwing inside a catch block using the "
            "original TCU library" << endl << endl;
    // EH test
    TCU_X_TRY
    {
        DerivedObj d;
        foo();      
    }
    TCU_X_CATCH(ExcDerived,e)
    {
      cout << "ExcDerived caught: " << e.what() << 
                                  " (0x" << &e << ")" << endl;        
    }
    TCU_X_CATCH(ExcBase,e)
    {
      cout << "ExcBase caught: " << e.what() << " (0x" << &e << ")" << endl;
    }
    TCU_X_CATCH_ALL
    {
      cout << "Unknown exception" << endl;
    } TCU_X_END_TRY     
    assert(UnwindableObj::getCtorCount() == 0);
    cout << "Press Enter to exit" << endl;
    cin.get();
    return 0;
}


Listing Six

#include <iostream>
#include <cassert>
#include "tcutest.h"
using namespace std;
// File scope functions
namespace {
void bar()
{
    TCU_X_TRY
    {                      
        cout << "In bar() ..." << endl;
    }
    TCU_X_CATCH_TYPE(ExcDerived) 
    {               
        cout << "ExcDerived caught in bar " << endl;                        
    } TCU_X_END_TRY
}
void foo()
{
    cout << "In foo() ..." << endl;         
    TCU_X_THROW(ExcDerived("A derived exception"));
}
}
// Static member definition
unsigned int UnwindableObj::mv_nCount = 0;
// main
int main()
{    
    bool bRethrownCaught = false;
    TCU_X_TRY
    {
        TCU_X_TRY
        {           
            foo();      
        }
        TCU_X_CATCH(ExcDerived,e)
        {
            cout << "ExcDerived caught: " << e.what() << 
                                        " (0x" << &e << ")" << endl;    
            bar();
            TCU_X_RETHROW;
        } TCU_X_END_TRY 
    }
    TCU_X_CATCH(ExcDerived,e)
    {
        cout << "Rethrown ExcDerived caught: " << e.what() << 
                                             " (0x" << &e << ")" << endl;   
        bRethrownCaught = true;
    } TCU_X_END_TRY
    assert(bRethrownCaught);
    cout << "Press Enter to exit" << endl;
    cin.get();
    return 0;
}


Listing Seven

#include <iostream>
#include <cassert>
#include "tcutest.h"

using namespace std;

// File scope functions
namespace {

DerivedObj bar()
{
    cout << "In bar() ..." << endl;
    DerivedObj ret;  
    printUnwObjList();
    return ret;
}
void foo()
{
    cout << "In foo() ..." << endl;
    printUnwObjList();
    DerivedObj d = bar();
    printUnwObjList();
    TCU_X_THROW(ExcDerived("A derived exception"));
}
}
// Static member definition
unsigned int UnwindableObj::mv_nCount = 0;
// main
int main()
{
    cout << "Showing destruction order problem using the original "
            "TCU library" << endl << endl;
    // EH test
    TCU_X_TRY
    {
        foo();      
    }
    TCU_X_CATCH(ExcDerived,e)
    {
      cout << "ExcDerived caught: " << e.what() << " (0x" << &e << ")" << endl;
    }
    TCU_X_CATCH(ExcBase,e)
    {
      cout << "ExcBase caught: " << e.what() << " (0x" << &e << ")" << endl;
    }
    TCU_X_CATCH_ALL
    {
        cout << "Unknown exception" << endl;
    } TCU_X_END_TRY     
    assert(UnwindableObj::getCtorCount() == 0);
    cout << "Press Enter to exit" << endl;
    cin.get();
    return 0;
}
