December 2000 Java Solutions/Java Named Semaphores

Java

Java Named Semaphores

Mark Nadelson

Java doesn't directly support synchronization of processes across a heterogeneous network. But thanks to Java's cross-platform nature, it is surprisingly easy to accomplish.

Introduction

I recently started developing applications in Java, after many years of C++ development. To my satisfaction I have found the transition smooth and have enjoyed the many "built-in" features of Java. These features are not readily available in C++ without the addition of libraries or DLLs. One of my first projects was to develop a multi-tiered network application that would run in a heterogeneous environment consisting of both Windows NT and Unix operating systems. To complicate matters, applications at level B could proceed only when applications at level A were complete, and applications at both levels shared a common resource.

Java makes the heterogeneous part of this requirement easy. Since Java can run on both Windows and Unix platforms, only one version of the application must be developed. This is unlike my C++ days, in which I would be required to write a separate version for Windows NT and Unix.

The next hurdle to overcome was that of sharing resources and synchronizing processing over a network. Back in my C++ days sharing resources and synchronizing processing was accomplished using operating system specific named semaphores. Unfortunately, because Java is platform independent and named semaphores are operating system specific, Java does not provide this facility from within the JDK. This situation provided me with the challenge of developing my own named semaphore technique using existing Java objects. The result proved to be surprisingly simple and powerful. With very little code (especially compared to accomplishing the same task using C++) I was able to develop cross-platform Java named semaphores.

Overview of Semaphores

A semaphore is a variable that can be updated atomically and that provides one or more cooperative processes with mutually exclusive access to a critical section of code. A semaphore is very similar to a mutex, which also provides mutually exclusive access to critical sections of code, but within a single program. When protecting critical sections of code, such as code that accesses a common memory block, a mutex is said to be locked. This prevents other threads from accessing the same memory block at the same time. When the thread is done using the memory block it unlocks the mutex, allowing the next thread to lock it and access the memory block. A semaphore extends the functionality of the mutex by allowing multiple locks before achieving a total lockout. The semaphore accomplishes this by keeping a count of the total number of locks currently in effect. Using a semaphore as a counting device enables you to control the number of resources a program uses while avoiding the excessive CPU load that results from polling.

A semaphore is typically used as follows: each time a process acquires a new resource it invokes the semaphore's lock method, decrementing the semaphore's internal count. When the semaphore's count reaches zero, the semaphore is completely locked, preventing other resources from being used. They cannot be used until an unlock is performed on the semaphore, increasing its value above zero.

When a semaphore is used within a single process it is known as an unnamed semaphore. An unnamed semaphore has scope only within the process that creates it. In contrast, a named semaphore has scope beyond the process that creates it. The named semaphore can be accessed by all other processes running on a workstation. Processes access the semaphore by its unique name, which is assigned to it upon its creation. Any other process that wants to use the semaphore can retrieve it from the operating system by referring to its given name.

Both Windows NT and Unix provide methods for using a named semaphore via C++. Of course the method of creating and using the semaphore is different for each operating system. Under Windows NT you create a named semaphore using function CreateSemaphore, you access it via OpenSemaphore, you lock it using WaitForSingleObject, and you unlock it using ReleaseSemaphore. On Unix you create and access a named semaphore using sem_open, you lock it using sem_wait, and you unlock it using sem_post.

I could have implemented named semaphores in Java by employing the JNI (Java Native Interface) routines. With JNI I could have created a C++ DLL that manipulated the named semaphore. However, this DLL would be platform specific and my application would lose the inherent cross-platform capabilities that Java provides. In addition to the loss of heterogeneous processing, such a named semaphore would have scope only within the workstation on which it was created. Neither operating system provides named semaphores that have scope throughout a network.

Instead of JNI I used the basic constructs provided by Java. Semaphores created in native Java are compatible and easily deployed on multiple operating systems. The remainder of this article explains the constructs Java provides for mutually exclusive access to critical areas of code and expands on those basic constructs to create Java unnamed semaphores. Building upon the unnamed semaphores, I create a named semaphore system.

Java Synchronization and Mutexes

Java provides a couple of methods for protecting critical areas of code and memory blocks within multithreaded programs. One way to make object methods thread safe is to apply the synchronized keyword. When a method is synchronized, Java guarantees that only one thread has access to the method at any given time. A good place for synchronized methods is within an object such as a linked list, that manipulates dynamic memory. The deleteFirst method of such an object removes the head node from the list. If two threads attempt to delete the first node from the list at the same time, a memory corruption might occur in an unsynchronized version of this method. By using the following code segment, each thread removes the beginning node from the list without the occurrence of memory trampling.
public synchronized void deleteFirst()
{
   Object oldFirstNode = (Object)getFirst();
   currentFirstNode = (Object)getFirst().getNext();
   oldFirstNode = null;
}
This method of synchronizing threads is useful when the code or memory block being protected is encapsulated within an object method.

Java also provides protection for shared resources that are used within different threads within the same program and are not encapsulated within a single method. This protection is accomplished via Java's version of the mutex; in Java, every object implicitly implements a mutex by inheriting the wait, notify, and notifyAll methods of Object (the base class of all Java objects).

To protect a critical piece of code via the Java mutex, the programmer places a call to wait within the source code, immediately before the code to be protected; and a call to either notify or notifyAll immediately after the critical section.

When a thread executes wait, one of two things happens. If no other thread has executed wait, wait does not block, and the thread passes through, gaining access to the critical code. On the other hand, if another thread has already execute wait, processing on the current thread is suspended — wait blocks until another thread calls notify or notifyAll. (If wait is called without arguments it blocks forever. wait can also take timeout parameters, which include the number of seconds and nanoseconds to wait before timing out.)

When a thread has finished executing a critical section of code, it calls either notify or notifyAll to allow other threads to access the critical section. The difference between notify and notifyAll is that notify will inform just one thread that it can stop waiting, whereas notifyAll informs all waiting threads. The use of notify is problematic if multiple threads are running, because it notifies only one other thread. The thread of choice is arbitrary and the one chosen is not guaranteed to be waiting at the time notify is called. This means that the waiting threads remain blocked in their wait state since they have not been notified. To make sure all threads receive the notify signal, you can use notifyAll. Then if multiple threads are waiting, one is arbitrarily chosen to discontinue waiting. The conclusion to be drawn from this is to always use notifyAll to avoid potential deadlock problems.

Problems with the Java Mutex

A problem with using the wait/notify construct is that you can't have multiple mutexes within a single object. Since every Java object extends from Object, each Java object contains one and only one mutex. If an application has multiple critical sections that could be synchronized separately, it would be nice to have corresponding mutexes to protect each one. Using only one mutex for different critical sections means that each thread has to wait on any other thread in any other critical section before access is allowed. Such protection is overkill since it will cause threads to block on critical sections that are not being accessed by any other threads. This in turn slows down overall performance of the application and is a serious detriment to the parallel run-time nature of a multithreaded application.

To enable use of multiple mutexes in a single object, I have developed an explicit Mutex class, which is implemented in terms of the implicit Java mutex. This Mutex class contains two methods. The lock method executes wait; the unlock method executes notifyAll. Each method is also synchronized to ensure that only one thread will lock and unlock a critical section of code at any one time. The Mutex class is shown in Listing 1.

Moving on to Semaphores

Now that I have a Mutex object I can create multiple instances of it and use it to protect various critical sections of code separately within a single object. Unfortunately, Mutex has only two states, locked and unlocked. It still isn't powerful enough to effectively control the use of resources within a program. For instance, suppose you have a multithreaded banking application that spawns a separate thread to compute the balance for each account in the bank. This works well if there are only a few accounts, but will quickly overload CPU and memory thresholds if there are hundreds of accounts. You need a mechanism to control the number of threads spawned. One solution is to create a few threads at a time and wait for them all to complete before spawning more. This method is inefficient, since some accounts may be smaller than others and complete more quickly. Because some threads will finish before others, less than the maximum number of threads will usually be running. Ideally, a new thread would be spawned each time a thread completed, thereby maintaining the maximum number of threads.

Controlling the number of resources used within a program can be easily and efficiently accomplished through the use of unnamed counting semaphores. Instead of having only two states, like Mutex, the semaphore can have multiple states. Mutex's states include locked and unlocked. The unlocked state can be equated to a value of one and the locked state can be equated to a value of zero. The semaphore can be given a maximum value greater than one. Each time lock is executed on the semaphore, its value is decremented by one. This lock operation does not cause processing to be suspended until the semaphore's value reaches zero. Likewise, executing unlock on the semaphore increments the semaphore's value by one up to its maximum value.

I created a counting semaphore class by extending Mutex. The Semaphore class is shown in Listing 2.

Semaphore's constructor takes an initial value and a maximum value. Think carefully before setting the semaphore's initial value to zero. If the first action taken on the semaphore is lock, a deadlock situation occurs, since the semaphore is already in a locked state.

The lock method decrements the semaphore's value by one. If the semaphore's value reaches zero, then lock makes a call to the lock method of the semaphore's Mutex subobject.

The unlock method increments the semaphore's value by 1. If the value after incrementing is greater than the maximum specified for the semaphore, it is set back to the maximum value. In effect, the value never gets incremented beyond its maximum value.

The final method provided by Semaphore is a getValue function for the purpose of returning the current value of the semaphore.

Using a counting semaphore to control the number of threads spawned is straightforward. I construct a semaphore with both the initial and maximum value set to the number of threads that can be spawned. I then loop through all bank accounts, spawning a thread for each one, passing it a reference to the semaphore. For each thread spawned, the loop code calls the semaphore's lock method. When the semaphore is truly locked this method blocks and the loop is suspended. Before a thread exits, it calls the unlock method on the semaphore. This enables the account loop to continue and spawn one additional thread. Using this technique I am guaranteed to have the same number of threads running at any given time (except of course when the last few accounts are being processed).

Java Named Semaphores

Mutexes and unnamed semaphores are useful when the goal is to protect or control critical resources within a single program. Named semaphores are needed when it is necessary to protect resources shared by multiple processes. Such a resource might be a shared file or a process server. Because Java applications can run on multiple platforms, Java named semaphores are able to synchronize processing across a network.

Suppose I have a program that acts as a shared memory server. This server both accumulates information from and provides it to other processes. Each time a process sends data, the shared memory server performs some computations, and makes the data available to those processes that need it. It is important to ensure that the processes needing the data do not access it before it is available in the shared memory server. This synchronization operation can be enforced via named semaphores.

The process writing to the shared memory server locks the server's semaphore, thus preventing any other processes from accessing it. When the semaphore is unlocked, the retrieval processes can lock the semaphore and access the server. Since the semaphore is locked, new data cannot be written until the current data is retrieved. The named semaphore being used in this example has only two states, locked and unlocked, and can therefore be thought of as a named mutex. It provides each process with mutually exclusive access to the shared memory server.

The semaphore must be shared among all processes, and these processes need a way to reference the semaphore without using an address, since they do not share the same address space. (They might not even be on the same machine.) Instead, the processes reference the semaphore by name. The application that creates, manipulates, and manages all named semaphores in the network is known as the semaphore server. When a process creates a named semaphore it makes a call to the semaphore server with a request to create a semaphore with a given name, and initial and maximum value. The semaphore server creates an unnamed semaphore and stores it along with its name within an object. The semaphore server then stores this object in a list of current named semaphores. If the semaphore already exists, the semaphore server returns an error. When a client makes a request to lock or unlock a named semaphore, the semaphore server searches its list of named semaphores, and if found, locks or unlocks the associated unnamed semaphore. After all processes are finished with a named semaphore it should be removed from the semaphore server. If the semaphore is not removed, the semaphore server returns a failure message the next time a request is made to create it.

Since the semaphore server is accessed over the network, a communication protocol must be used. Java's RMI (Remote Method Invocation) facility makes access to the semaphore server relatively seamless. The semaphore server's methods appear to be regular procedure calls, with the underlying TCP/IP communication protocol hidden from the user.

The first step in creating the semaphore server it to define the RMI interface. The RMI interface contains all the methods that the semaphore server makes available to connecting clients. These methods include create, lock, unlock, remove, and getValue. Each method takes one argument, the name of the semaphore. The NamedSemaphore interface definition is shown in Listing 3.

The next step is to implement the NamedSemaphore interface. This implementation (NamedSemaphoreImpl) extends Java's UnicastRemoteObject (the object that implements RMI protocols) and implements NamedSemaphore. The NamedSemaphoreImpl class contains one data member. This data member is a Java Hashtable, which uses the semaphore name as its key and the underlying unnamed semaphore as its value. Each method of NamedSemaphoreImpl requires the semaphore name as a parameter, and each method checks to see if this name exists within the hash table. For example, when create is called, it performs a lookup on the hash table to determine if the name already exists. If the name exists, create throws an exception, which bubbles up to the RMI client. create is the only method that throws an exception if the name already exists. Since all other methods manipulate a previously created named semaphore, these methods throw an exception if the name does not exist. The implementation details for the NamedSemaphore server are shown in Listing 4.

The next step is to compile both the interface and the implementation. RMI communication stubs must be generated from the compiled NamedSemaphoreImpl class. RMI stub files are automatically generated class files that implement the underlying communication protocol for RMI. To generate the stub files from NamedSemaphoreImpl, execute the following command:
rmic NamedSemaphoreImpl
If your interface is within a package, rmic must be called with the full package name. Since the stub files are compiled together with the semaphore server, they must exist within your semaphore server package.

With the named semaphore implementation defined and RMI stub files created, the next step is to define the semaphore server process itself, NamedSemaphoreServer. The only thing NamedSemaphoreServer does is create an instance of NamedSemaphoreImpl and bind the RMI lookup name, "semaphoreServer", to the instance of NamedSemaphoreImpl. Before executing NamedSemaphoreServer make sure you have rmiregistry running in the background. NamedSemaphoreServer should also be run in the background.

A Producer/Consumer System Using Named Semaphores

A producer/consumer system makes a good example of how named semaphores can be useful. In such a system a producer sends data to a central processor, and the consumer extracts the data from that processor. Before retrieving data, the consumer must make sure that the producer has sent its data to the processor. Likewise, the producer should wait to send additional data to the central processor until the consumer has processed the current data. Clearly these two processes must be synchronized. This can be accomplished with named semaphores.

Using named semaphores, the producer/consumer system unfolds as follows. Both the producer and consumer process must be spawned separately. Since there is no way to guarantee which process starts first, both must create and initialize the named semaphores. In order to synchronize processing, two named semaphores must be created. One semaphore is created with the name "consumer" and the other is created with the name "producer". Once the first process is spawned, the semaphores are both created. The second process spawned fails in its attempt to create the semaphores since they already exist.

Both semaphores are created with a maximum value of one. The producer semaphore is initialized with a value of one (unlocked) and the consumer semaphore has an initial value of zero (locked). Because the producer's semaphore is unlocked, it can write data to the central processor, guaranteeing that the consumer has data to read. Initially locking the consumer semaphore ensures that the consumer does not try to access the central processor before data is available.

After the consumer starts, it attempts to lock the consumer semaphore. The semaphore remains locked until the producer writes data to the central processor and unlocks "consumer". If the consumer is able to obtain the "consumer" lock, it retrieves data from the central processor and unlocks the "producer" semaphore. The consumer then loops, locking the "consumer" semaphore before reading more data from the central processor. The consumer is started with the following command [line broken to fit column]:
java Sem.Consumer <Central Processor Host Name>
   <Semaphore Server URL>
When the producer starts up, the "producer" semaphore is unlocked. This allows the producer to write data to the central processor before the consumer has a chance to access it. After the producer writes data it unlocks the "consumer" semaphore and loops back, attempting to lock the "producer" semaphore and write more data to the central processor. The producer is started with the following command [line broken to fit column]:
java Sem.Producer <Central Processor Host Name>
   <Semaphore Server URL> <Amount of Data to be Sent>
Figure 1 illustrates the producer/consumer system.

In order to manipulate the named semaphores, the producer and consumer clients must be able to access the semaphore server. The semaphore server is accessed by specifying the host and the name of the RMI semaphore server. The name of the semaphore server is the one used when binding the RMI lookup name. In this case the name is "semaphoreServer". The name of the host on which the semaphore server is run is specified using a URL. The URL takes on the following format: rmi://www.yourserver.com/RMIServerName. In this example both the producer and consumer are running on the same machine as the semaphore server. Therefore the URL used is simply rmi://localhost/semaphoreServer or just semaphoreServer. To gain access to the semaphore server, the URL is passed to the Naming.lookup RMI method. This method returns a reference to NamedSemaphore:
NamedSemaphore sem =
   (NamedSemaphore)Naming.lookup("semaphoreServer");
The methods of NamedSemaphore are called as if they are local method calls, thereby hiding the fact that the semaphores being manipulated are remote.

Conclusion

Although they are not a part of the language, Java provides all the basic building blocks needed for creating unnamed and named semaphores. The advantage of Java semaphores over semaphores implemented in other languages is that Java semaphores are platform independent while maintaining implementation consistency. By taking advantage of RMI, Java named semaphores can be used not only by processes within a single workstation (like traditional named semaphores) but they can also be used by processes distributed over a network.

Mark Nadelson has been programming for the last eight years in a variety of industries including telecommunications, internet software development, and financial computing. He has co-authored two books titled Making Unix and Windows NT Talk and C++ Objects for Making Unix and WinNT Talk. His other interests include his wife Kim and daughter Hannah.