May 2002/Refactoring Global Objects in Multithreaded Applications

Multithreading

Refactoring Global Objects in Multithreaded Applications

Patrick Tennberg

Although you may get fired for introducing any new global variables, it's too much work to rewrite old code to remove them. So make them thread-safe and stop worrying.

Introduction

Say that you have a single-threaded application that uses Singletons or other shared global data/state. One fine day, the application is refactored and made multithreaded, where various parts of the code — including parts that try to use the shared global state — may suddenly execute concurrently. Either you expend the effort to determine and write the additional serialization code needed to protect all of the shared objects, or the various threads will try to access the global shared data without serialization and very bad things will happen. This article demonstrates an intermediate alternative.

In this article, I will present an example of refactoring using the proxy pattern [1] to guarantee that each thread gets its own instance of a global object and at the same time allow client code to homogeneously reference the object. The refactoring named “Access Object through Proxy” will effectively allow us to get rid of a long parameter list and the chain of dependencies that often plague multithreaded applications.

Refactoring from Single-Threaded to Multithreaded Code

Refactoring is the art of gradually improving the architecture of an existing application without changing the underlying functionality. The origin of refactoring goes back to the Smalltalk community. More recently it has become a mainstream discipline with the book Refactoring: Improving the Design of Existing Code [2]. Refactoring an application is a difficult task and dangerous — dangerous because if you make a mistake, your working application suddenly doesn’t work and you have introduced subtle bugs. The keyword for successful refactoring is therefore “gradually,” and each class should contain the code necessary to test itself.

I have often seen examples where single-threaded applications have been turned into multithreaded applications by moving the code from the process’ primary thread entry point (e.g., main) into a thread entry point of execution. A textbook example of a multithreaded application is a server that listens for clients in the main thread and then spawns a worker thread for each connecting client. If the worker thread code relies on Singleton objects — or, worse, global data — for keeping state and other volatile data, then a new instance needs to be created for each thread. The creation of global objects is done in the entry point of the worker thread, and the data is then passed as a parameter to all parts of the system that need it.

This approach may cause several architectural problems. For example, if you introduce a new global object, you need to add code to pass it around to every piece of code that uses it causing a chain of dependencies. You might also encounter long parameter lists to constructors and methods that over time become hard to understand and maintain. There are two standard methods for refactoring [2] that allow us to deal with long parameter lists: “Replace Parameter with Method” and “Introduce Parameter Object.” Both of these methods can help you to get rid of a long parameter list, but they do not solve the dependency problem related to introducing new data.

Thread Local Storage

TLS (Thread Local Storage) [3] allows you to associate a data instance with a specific thread (for example, giving each thread its own copy of what would otherwise be a shared global Singleton). Under Windows, TLS is implemented by internally associating an array of slots with each thread (under Windows NT 4 there are 64 slots, and under Windows 2000, there are over 1,000). Windows exposes a set of four functions to cover allocation and freeing, as well as placing and retrieving, data from an allocated slot. When you allocate a slot, you tell the system that this particular slot is in use for all threads created within the process. Every time you create a new thread, you associate the slot with a data instance in the thread’s entry point. The code used from the thread can then refer to its own copy of the data by retrieving it from the previously allocated slot. (For information on other platforms, see the sidebar, “TLS on Other Platforms.”)

Before you get your hands dirty in the implementation details, it’s important to note that sometimes you do want a shared global state anyway, even with multiple threads, and you then need to provide serialization (e.g., using a mutex or another synchronization mechanism). The refactoring proposed in this article is not a silver bullet for migrating all single-threaded applications. Instead, you need to analyze the data/state from case to case and apply serialization where it’s appropriate and to use this method for global data/state that needs to be truly global.

Implementation

The implementation goals were to provide a lightweight encapsulation of TLS and an intuitive way to access the data. The proxy class (Listing 1) encapsulates TLS by allocating a slot in the constructor and freeing it in the destructor. The method setData is used to associate a data instance with the slot, while the pointer to member and indirection operators allow you to retrieve it.

The test program (Listing 2) shows the usage of the TLS proxy class. The primary thread spawns two worker threads that use the helper function PrintResult to display a counter. Each thread creates its own instance of the Counter class and associates it with the proxy (and a TLS slot) by calling the method setData. PrintResult accesses the counter through the proxy and, as a result, will display different values depending on which thread calls it.

In a real application, the proxy class can be created in one compilation unit and then can be accessed from other units by declaring the proxy class as extern in a header file.

Conclusion

Singleton is a useful pattern that can be hard to enforce in a multithreaded environment. Using the refactoring methods presented in this article, you can package Singleton objects to appear unique to the code that uses it, while really being unique only within each executing thread. This refactoring method can also be used to quickly port a single-threaded application relying heavily on global data/state to a multithreaded environment, without altering the architecture. When the application is up and running, you can gradually refactor the application and improve the architecture.

Another usage of the TLS proxy class is in support routines where you need temporary storage. With the TLS proxy class, you can wrap the storage and access it through the proxy saving the overhead of allocating data for each call and, at the same time, allowing thread-safe access to the functionality.

I thank my friend and colleague, Lennart Börjeson, and my friend and mentor, Carlo Pescio, for their help with this article.

References

[1] Erich Gamma, et al. Design Patterns Elements of Reusable Object-Oriented Software (Addison Wesley, 1995).

[2] Martin Fowler, et al. Refactoring: Improving the Design of Existing Code (Addison Wesley, 1999).

[3] Jeffrey Richter. Programming Applications for Microsoft Windows, 4th Edition (Addison Wesley, 1999).

[4] David R. Butenhof. Programming with POSIX Threads (Addison Wesley, 1997).

Further Reading

Bruce Montrose. “A Thread Local Storage Class for Win32,” C/C++ Users Journal, November 1997.

Patrik Tennberg. “Creating Active Data Types via Multithreading,” C/C++ Users Journal, January 1998.

Patrick Tennberg received his B.S in computer science from the University of Umea, Sweden. He has been in the industry for 15 years and is currently a partner with Cinnober Financial Technology. He can be contacted at pte@canit.se.