Dynamically Reconfigurable Servers

Dr. Dobb's Journal January 1999

Python's extensibility makes it easy

By Ron Klatchko

Ron is manager of the Advanced Technology Group at the Center for Knowledge Management, University of California at San Francisco. He can be contacted at ron@ckm.ucsf.edu.

With the advent of the Web, it is more important than ever to have systems that run continuously. It is always business hours somewhere in the world; there is no good time to bring down servers for maintenance. Over the life of the server, it may be necessary to import new data and code. Using the techniques I present here, it is relatively easy to implement servers that can import new modules and data at run time. This dynamically reconfigurable server -- available electronically (see "Resource Center," page 5) -- is implemented in Python, a portable, interpreted, extensible object-oriented programming language. Although the server I implement here runs on UNIX, Python runs on many platforms, including most flavors of UNIX, Windows, Macintosh, and OS/2. It is freely copyable and can be used without fee in commercial products. For more information (and source code) on Python, see http://www.python.org/.

Here at the UCSF Library and Center for Knowledge Management (CKM), I needed to implement a server that could verify that a user was associated with the campus. The initial requirements specified only students, staff, and faculty, but the system would eventually include postdoctoral researchers and employees from the clinical enterprise. I had to have the server up and running before getting either the schema or the data for the new groups. Since the server had to be available 24 hours a day, it needed to add new code modules and data sets dynamically, without being shut down.

Dynamic Servers: Theory

Dynamic servers are partitioned into a core engine and extension modules. The extension modules need to implement a common API so that the core engine can communicate with them. To keep the system flexible, the API should be kept as generic as possible. I have found that the most productive design is based on classes with a high-level interface. In my design, the server verifies users by comparing their provided credentials against entries from the different data sets. The core engine chooses several candidate user objects and keeps only those whose credentials match the user-supplied credentials. Each user type may require different types and numbers of credentials, and the credentials might be stored in varying formats (for example, student PINs that are one-way encrypted with UNIX's crypt()). Each extension module implements a class with a single entry point; this method takes an array of credentials and returns a Boolean indicating whether the credentials match the user. By delegating the entire verification process to the class, there is no need to require common fields or storage requirements for the different types.
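As a concrete illustration, here is a minimal sketch of what one extension module might look like. The class name, field names, and storage scheme are all hypothetical; a real module would keep credentials in whatever format its data feed uses:

```python
class StudentVerifier:
    """Hypothetical extension module for one user type.

    The core engine only ever calls verify(); how the records are
    stored and compared stays private to this module.
    """

    def __init__(self, records):
        # records maps a student ID to that student's stored PIN
        self.records = records

    def verify(self, credentials):
        """Take a sequence of credentials -- here (student_id, pin) --
        and return whether they match a known user."""
        student_id, pin = credentials
        stored = self.records.get(student_id)
        return stored is not None and stored == pin
```

Because the comparison is encapsulated in verify(), another module could hash the supplied PIN before comparing, or require three credentials instead of two, without the core engine changing at all.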

Another important capability is being able to load a new class at run time. Python does this via the import statement. Unlike the Java statement of the same name or the C/C++ #include directive, Python's version is a run-time command -- it instructs the interpreter to load the named module. Although the import statement does not directly support specifying the module via a string, you can use the exec statement to overcome this. Example 1 defines a Python function that takes the name of the module you want to import, imports it, and returns a reference to the module. Because Python treats all references identically, you can use the return value just like the name of a module you imported normally. Finally, you need to be able to access data or call code in your newly imported module. Depending on how you designed your API, you will need to proceed in one of two ways. If you have chosen constant names for your API, Python's use of references and late binding lets you access data, functions, and classes in your module directly (see Example 2). Although this is convenient, I find it confusing to have different classes with the same name.
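A helper along the lines the article describes can be sketched as follows (the function name is my own, and for simplicity it handles only top-level module names). Giving exec its own namespace dictionary keeps the import from touching anything in the caller's scope:

```python
def import_by_name(name):
    """Import the module named by the string `name` and return a
    reference to it, in the spirit of the article's Example 1."""
    namespace = {}
    exec("import %s" % name, namespace)
    return namespace[name]

# The returned reference behaves like a normally imported module:
string_mod = import_by_name("string")
```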

Another option is to relate the module name to your API names. Python lets code access module-level references through the __dict__ variable. Dictionaries are Python's implementation of associative arrays (also known as "hashes"). Once you have looked up a class reference, you still need to instantiate an object via the () operator (see Example 3). The syntax for dynamic references can be confusing; to understand it, work left to right and consider what each operator returns. From the example, mod.__dict__ accesses the namespace dictionary for the module. The [] operator does a lookup on the dictionary and returns a reference. Since the reference is for a class, you instantiate an object with the () operator.
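Put together, the lookup-and-instantiate step might look like this sketch. The module and class names are hypothetical, and the module is built inline with types.ModuleType only so the example is self-contained; in the server, the module would come from a dynamic import:

```python
import types

# Stand-in for a dynamically imported extension module whose class
# name is derived from the module name.
checker_mod = types.ModuleType("checker")
exec(
    "class Checker:\n"
    "    def verify(self, credentials):\n"
    "        return credentials == ['ok']\n",
    checker_mod.__dict__,
)

# mod.__dict__ is the namespace dictionary, [] looks up the class
# reference by name, and () instantiates an object from it.
cls = checker_mod.__dict__["Checker"]
obj = cls()
```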

The Shelve Module

The shelve module lets you skip all the dirty work. Shelve is a basic implementation of a persistent object store. It lets you store Python data structures (including user-defined classes) on disk. When you retrieve an object, shelve loads the module that implements the object's class.

You can use this power to greatly simplify writing a dynamic server. To add a new class to your server, just put the implementation module in the server's search path and add an object of the new class to shelve. When your server accesses the object, the new module will be loaded into your server. As long as the class supports your standard API, the server can immediately use the new object.
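A minimal sketch of that workflow follows; the file path and key names are arbitrary. Because shelve stores objects via pickling, retrieving an instance of a user-defined class makes Python import that class's module automatically:

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "verifiers")

# An update process adds a new data set under a well-known key...
db = shelve.open(path)
db["students"] = {"s123": "7711"}   # any picklable object will do
db.close()

# ...and the server picks it up on its next access.
db = shelve.open(path)
student_records = db["students"]
db.close()
```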

Concurrent Access to Data

Another issue you must handle in keeping your server continuously running is updating your data on the fly. One standard way to do this is using reader/writer locks. These locks allow many processes to read a file simultaneously, but only a single one to write it. Python's fcntl module provides access to reader/writer locks via file locking.
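A sketch of how the locking calls fit together, here using fcntl.flock() (fcntl.lockf() offers similar record-level locks); the locks are advisory, and closing the file releases its lock:

```python
import fcntl

def open_for_reading(path):
    """Open path under a shared (reader) lock. Any number of
    readers may hold LOCK_SH at the same time."""
    f = open(path, "rb")
    fcntl.flock(f.fileno(), fcntl.LOCK_SH)
    return f

def open_for_writing(path):
    """Open path under an exclusive (writer) lock. LOCK_EX blocks
    until all readers and writers have released their locks."""
    f = open(path, "r+b")
    fcntl.flock(f.fileno(), fcntl.LOCK_EX)
    return f
```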

Any time a write lock is held on your database, your server is unavailable; therefore, I wouldn't advise holding a write lock for an extended period of time. Unfortunately, it's sometimes not possible to release the write lock in a timely fashion. Here at the UCSF Library and CKM, our data feeds come from other parts of the campus. Rather than getting changes, I receive complete data snapshots. Replacing an old snapshot with the newly received one was easier than analyzing the differences and updating the current data. To do this, you must ensure the entire data file can be switched over atomically, so that the server never attempts to access a partially written database.

File Consistency via UNIX Inodes

The standard UNIX filesystem uses inodes as an extra layer of indirection between a file's name and its contents. A filename references an inode, while an inode keeps a count of how many references there are to it and points to the file's contents; see Figure 1. Both filesystem entries and file-opening processes effectively increment the reference count (see Figure 2). The UNIX kernel looks up the inode when the open() call is made; from then on, the kernel has no further need of the filename. This use of reference counting helps maintain filesystem coherency. On UNIX, it is safe to delete a file while a process has it opened; the filesystem will not recycle the blocks the file is on until the process closes the file.

Understanding whether a UNIX system call interacts with a file's inode or with its content is key in implementing this process. If a file already exists, open() will preserve the file's inode and modify its contents. If you already have a new file available, you can use rename() to change the inode a filename refers to. rename takes a current filename and a new filename. If the new filename already exists, it decrements the reference count for its inode (and recycles the contents if it reaches 0); it then has the new filename point to the inode the current filename references, and removes the directory entry for the current filename. Finally, unlink() removes the directory entry for the specified filename and decrements the underlying inode's reference count. This should apply to most flavors of UNIX, although I have only verified it against Solaris 2.6 with the ufs filesystem and Red Hat Linux 5.0 (2.0.32) with the ext2 filesystem.

Let's examine what happens when you rename a file that a process has open. In Figure 3(a), filename1 references inode1; a process has opened filename1, so the process also holds a reference to inode1. Since both filename1 and a file descriptor reference the inode, its reference count is 2. After calling rename with filename2 as the current name and filename1 as the new name, you have the situation pictured in Figure 3(b). Filename1 now references inode2, while the process continues to reference inode1; as long as the process keeps the file descriptor open, it has access to the original file contents. Once the process closes the file descriptor, the file contents referenced by inode1 are recycled. The next time the process opens filename1, it gets the new contents.

Using a temporary filename before renaming keeps the current data set available even if problems arise. If a failure occurs when creating the new file, the original contents are still available under the correct name. If, for some unforeseen reason, the server process needs to be restarted while the update process is running, a full set of data is available. It also allows for the concurrent update of multiple data files. Because the data feeds came from different departments, I found it easier to store students and employees in separate databases. Although I start the update processes simultaneously, each one proceeds at its own pace. Instead of requiring the update processes to coordinate with each other, the server simply reloads all the data files whenever any one of them is ready.
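The whole update pattern reduces to a few lines. This sketch (names mine) writes the snapshot under a temporary name in the same directory and then rename()s it into place, which POSIX guarantees is atomic within a filesystem:

```python
import os

def replace_atomically(path, data):
    """Install new file contents without ever exposing a partial file.
    Processes that already hold the old file open keep reading the
    old inode; new opens of `path` see the fresh contents."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes hit the disk first
    os.rename(tmp, path)
```

If writing the temporary file fails partway through, the original file under `path` is untouched, which is exactly the safety property described above.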

Simple IPC Using Signals

Once the external process has updated the data file, the main process must be notified so it can use the new data. UNIX provides simple interprocess communication with signals. A process registers a function to be called upon signal arrival either with sigaction() or signal(). Other processes can send a signal with kill(). Since signals arrive asynchronously, you need to make sure that it is safe to change data files. Due to its cross-platform implementation, Python only implements common signal functionality; it does not allow you to temporarily block signals. There are a few strategies to deal with this; one of the simplest is to set a flag upon signal arrival and check for the flag in a place you know can safely reload the data files.

Conclusion

At this writing, my server has been running for six months without being shut down. In this time, it has gone through numerous data reloads and the addition of a new module -- neither of which has required any downtime. Applying the techniques I have presented here can allow your server to do the same.

DDJ


Copyright © 1999, Dr. Dobb's Journal