December 2001/Guidelines for Wrapping Sockets in Classes

Internet and Network Programming

Guidelines for Wrapping Sockets in Classes

James Pee

Programming sockets has always tested one's attention to detail. Wrap stuff in a class and voilà — Cake City.

Introduction

In client/server programming, the communication layer over which the client and server pass data is a key element. Though there are several ways to implement this communication layer, at the core this layer is almost always implemented via sockets. Sockets provide a mechanism for network communication, and some implementation of an interface exists on most operating systems, most of which are very similar. There are different kinds of sockets with different properties. Some allow data to pass back and forth, while others only allow one message to be sent at a time; some are reliable and guarantee transmission, and others are not. Sockets also operate on different domains. For instance, Unix processes may communicate with each other in the AF_UNIX domain. However, the most interesting domain deals with communication across a network: the AF_INET domain.

This article assumes that you are familiar with socket programming and instead focuses on how to successfully use sockets in a real world environment. This includes wrapping calls to the underlying socket API, encapsulating the concept of a socket on a class level, and implementing and using a platform-independent Socket class. If you are not familiar with the basics of socket programming, I suggest W. Richard Stevens’ Unix Network Programming Volume 1 (Prentice Hall, 1990), which covers the socket API in great detail and is a perfect socket reference for any platform.

Reasons for Wrapping the Socket API

Socket programming is fairly straightforward at the most basic level and really only requires a handful of function calls. Once a socket is created with the socket function and connected with the connect function to a server, data can immediately be sent or received on the socket. For example, you can connect to a server listening on a known port and send data to it easily with a call to write and get data from it with a call to read, much like file-based reading or writing. Listing 1 shows a simple socket application that connects to the mail transport agent running on port 25 and sends mail via SMTP. I used a TCP socket, which is a full duplex stream that allows data to be passed back and forth, on the Internet domain by creating the socket with the AF_INET and SOCK_STREAM flags and the default protocol. The underlying protocols on which TCP is implemented guarantee that the data will be transmitted in a reliable manner. Then, after filling out the sockaddr_in structure with the host address, port number, and address family, the socket is ready to be connected. After connection, the remainder of the program looks exactly as it would if it were taking input and output from a file.

A server application is nearly as easy to write as a client application and needs only a few additional function calls. After creating a socket, it needs to be bound to a port with bind so that clients can connect to the server on an advertised port. After binding to a port, the server application needs to listen for incoming connections with listen. The application then calls accept to wait for clients to connect. Whenever a client connects, accept will return a new socket descriptor that can be used for communicating with that client. The original socket descriptor can continue to be used to accept further incoming connections. Listing 2 shows a simple server that echoes data back to the client. You can simply connect to the port with telnet and see the results. There are only slight changes between the client example and this server example. The port changed from 25 to 2112 because the well-known mail process already uses port 25.

Then, instead of initiating a connection with the connect call, the server binds to the specified address and port. In the example, the server binds to the localhost address 127.0.0.1 and the port 2112. In practice, you need not specify the actual address to which the server will bind. The value INADDR_ANY serves as a wildcard and allows you to avoid specifying a particular address. Binding is not necessarily a server-only process; however, this is usually not done on the client side as the operating system will automatically bind to a freely available port. Of course, the server side makes this call so that clients can connect to a known port.

Though these examples show the simplicity of using the socket API directly, there are a number of things that make using sockets in this way needlessly complex and error prone. These potential pitfalls range from trivial to serious, and wrapping these API calls certainly minimizes the impact of socket programming errors in your code. In any nontrivial application, it is always preferable to wrap calls to the socket API to ensure that they are used properly. This is done to ensure that all the error conditions and return codes are checked whenever an API call is made. Wrapping these functions can make it easier to deal with the API and the return codes. Or you might like to provide a function that creates a socket and binds it to a port, thereby avoiding another potential trouble spot. If you are developing a library to be used by others, you may choose to wrap the internals so that the underlying calls are hidden from other developers. This is particularly important in a development environment where not everyone is a socket-programming expert, or when dealing in a cross-platform environment where there may be subtle differences in the socket API.
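As a minimal sketch of the kind of wrapper described above, the following function creates a TCP socket, binds it to a port on the INADDR_ANY wildcard address, and starts listening, checking every return code along the way. The name make_listener, the default backlog of 5, and the choice to report failures with exceptions are assumptions made for illustration; they are not taken from the listings.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: create a TCP socket, bind it to `port` on every local
// address (INADDR_ANY), and start listening. Any failure is reported with an
// exception so that no return code can be silently ignored.
int make_listener(std::uint16_t port, int backlog = 5) {
    assert(backlog > 0);

    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        throw std::runtime_error(std::string("socket: ") + std::strerror(errno));

    // Let a restarted server rebind to the same port without waiting.
    int yes = 1;
    if (::setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) < 0) {
        int saved = errno;
        ::close(fd);
        throw std::runtime_error(std::string("setsockopt: ") + std::strerror(saved));
    }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  // wildcard address
    addr.sin_port = htons(port);               // port 0 lets the OS choose

    if (::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0 ||
        ::listen(fd, backlog) < 0) {
        int saved = errno;  // close() may overwrite errno
        ::close(fd);
        throw std::runtime_error(std::string("bind/listen: ") + std::strerror(saved));
    }
    return fd;  // ready to be passed to accept()
}
```

A server would call make_listener(2112) once at start-up and then loop on accept with the returned descriptor. Because the descriptor is closed on every failure path, a caller cannot leak it by forgetting a cleanup step.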

One key thing to keep in mind when working with sockets is that a single read or write on a socket is not guaranteed to transfer the full amount of data requested. That is, a call to read (or recv) may return prior to reading the requested number of bytes, and a call to write (or send) may return prior to writing the requested number of bytes. These are not necessarily error conditions and may simply require additional calls to read or write (or recv or send). In practice, that is why calls to the socket API are almost always wrapped. There are a number of ways to wrap the calls, but most importantly, you should check the return codes for errors and iteratively make the calls to read or write until the requested number of bytes has been processed. There are quite a few resources available that show how these functions can be wrapped.
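One common way to wrap these calls is sketched below. The function names write_all and read_all and their return conventions are assumptions chosen for this example. Both functions loop until the requested number of bytes has been processed, and both retry when a call is interrupted by a signal (EINTR).

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

// Write exactly `len` bytes, retrying on short writes and on EINTR.
// Returns true on success, false on error (errno is left set).
bool write_all(int fd, const void* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    const char* p = static_cast<const char*>(buf);
    while (len > 0) {
        ssize_t n = ::write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return false;
        }
        p += n;  // skip past the bytes that were accepted
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Read exactly `len` bytes unless the peer closes the connection first.
// Returns the number of bytes read (less than `len` only at end of stream),
// or -1 on error.
ssize_t read_all(int fd, void* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    char* p = static_cast<char*>(buf);
    std::size_t got = 0;
    while (got < len) {
        ssize_t n = ::read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;
        }
        if (n == 0) break;  // peer closed the connection
        got += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(got);
}
```

A return value from read_all that is smaller than the requested length tells the caller that the peer closed the connection partway through a message, which is usually worth treating differently from a hard error.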

Signal handling may prove to be another potential trouble spot, and as such, it is important to develop robust signal handling when working with sockets. Unhandled signals in a socket application may obviously cause the application to crash, but mishandled signals can be cause for even greater concern. For example, if you connect to the server example above and disconnect from it, the server receives a SIGPIPE signal when it next writes to the closed connection and, by default, terminates. In the forked server model, there are also SIGCHLD signals that need to be handled. Not handling that signal, and so never reaping the exited children, results in lingering zombie processes. Zombie processes no longer hold socket descriptors, but each one occupies a slot in the process table, and enough of them can cripple your server by preventing new children from being created when clients connect. To deal with these situations, you need to implement a signal handler to handle any potential signals and deal with them appropriately.
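The sketch below shows one way to set up the two signals discussed above on a POSIX system. The function names are hypothetical, and ignoring SIGPIPE outright is an assumption about what the server wants; a real application might prefer to log the event or handle it per connection.

```cpp
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>

// SIGCHLD handler: reap every child that has exited so none lingers as a
// zombie. Only the async-signal-safe call waitpid is made here.
void reap_children(int) {
    int saved = errno;  // waitpid may overwrite errno in the interrupted code
    while (::waitpid(-1, nullptr, WNOHANG) > 0) {
        // keep reaping until no exited children remain
    }
    errno = saved;
}

// Hypothetical setup routine, called once at server start-up.
// Returns true if both dispositions were installed.
bool install_server_signal_handlers() {
    struct sigaction sa {};
    sigemptyset(&sa.sa_mask);

    // Ignore SIGPIPE: writing to a closed connection then fails with EPIPE
    // instead of terminating the process.
    sa.sa_handler = SIG_IGN;
    if (::sigaction(SIGPIPE, &sa, nullptr) < 0) return false;

    // Reap children as they exit. SA_RESTART asks the kernel to restart
    // interrupted calls such as accept rather than failing them with EINTR.
    sa.sa_handler = reap_children;
    sa.sa_flags = SA_RESTART;
    return ::sigaction(SIGCHLD, &sa, nullptr) == 0;
}
```

With SIGPIPE ignored, the error surfaces as a failed write with errno set to EPIPE, so it is caught by the same return-code checks that guard every other socket call.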

In the server example, the server can only service one client at a time. In most instances, you would want to create a multi-process or multithreaded server to handle multiple simultaneous connections instead of the single-threaded server. To do this, the program should be altered to allow child processes or threads to handle communication with the client after the connection has been made. The parent process can then be reduced to the task of accepting connections. You should certainly pay close attention to how the handoff from the parent to the children takes place. In a threaded environment, you most likely would have the threads wait on a condition variable and awaken to handle incoming clients in a mutually exclusive manner so that no two threads would be servicing the same client. In a forked server, you can simply fork a new process to handle each connection and have those children die when the connection is complete. In the second case, you bear the overhead of process creation at the time the connection is made. Instead, you could prefork these processes and have the server behave much like the threaded server described above.
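For the threaded model, the handoff described above can be sketched as a small queue guarded by a mutex and a condition variable. This example uses the standard C++ threading library rather than a platform thread API, and the class name ConnectionQueue is an assumption made for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical hand-off point between the accepting thread and the workers.
class ConnectionQueue {
public:
    // Called by the accepting thread after accept returns a descriptor.
    void push(int client_fd) {
        assert(client_fd >= 0);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(client_fd);
        }
        ready_.notify_one();  // wake one waiting worker
    }

    // Called by worker threads; blocks until a connection is available.
    // Each descriptor is handed to exactly one caller.
    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate guards against spurious wake-ups.
        ready_.wait(lock, [this] { return !pending_.empty(); });
        int fd = pending_.front();
        pending_.pop();
        return fd;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> pending_;
};
```

The parent thread loops on accept and calls push with each new descriptor, while every worker loops on pop, services the client, and closes the descriptor. Because a descriptor is removed from the queue while the mutex is held, no two workers can end up servicing the same client.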

Implementing a Socket Class

In C++, creating a socket class to further abstract the API is even more beneficial. Though creation and maintenance of a socket may stay primarily the same, with functions wrapping API calls, an elegantly defined class can be both useful and easy to use. If the design of this class follows that of an already established method of reading and writing, like C++ iostreams, then reading and writing to a socket becomes second nature. Since sockets behave much like files, using iostreams as a model is a natural fit. By adding functions that handle all the native types to this class, you can reduce reading from or writing to a socket to a call to either the operator >> or operator << methods. By making a few minor additions to the socket class, it can be extended to work with user-defined classes in an open-closed manner, meaning that new types can be supported without modifying the socket class itself. To do this, there must be a base class from which all other classes that will be sent over the socket will be derived. I have chosen to call this class Streamable since the socket class behaves like a stream. Streamable only defines two pure virtual members, Marshall and Unmarshall. Marshall prepares an object for transport across the socket, while the Unmarshall member reassembles the object after transport. Classes that need to be sent across a socket will now inherit from this class and implement Marshall and Unmarshall. Implementing these members is fairly straightforward if the socket class already has members to deal with the basic data types. Streamable is defined in Streamable.h (Listing 3).

At first glance, you may be inclined to introduce another base class, from which the socket class inherits, so that any type of communication can be done in a similar fashion. However, this would limit the flexibility of the communication and would require that the underlying communication mechanism be compatible with how you want to implement your socket communication. For example, you might implement writing strings by first writing an integer indicating the length of the string. If you chose to implement a file stream in the same fashion as the socket communication, you would then have a dependency on this class to read and write to a file, which is probably something to avoid.

Of course, you could provide differing implementations in the derived classes, but in my experience, I have not found this to be useful.

With the addition of Streamable, the Socket class must implement two additional functions to support the reading and writing of Streamable types. These two functions are the overloads of operator<< and operator>> provided by Socket that take a Streamable argument. Socket is defined in Socket.h (Listing 4). Now, by implementing Marshall and Unmarshall in the derived classes, an object can be transported across the socket in the same way as a native type. This can simply be done by invoking the operator<< and operator>> functions already defined in the socket class for each data member, keeping in mind that Unmarshall must read the members in the same order in which Marshall writes them.
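The following sketch illustrates the arrangement under stated assumptions. It does not reproduce Listings 3 and 4: the signatures of Marshall and Unmarshall are guesses, BufferSocket is an in-memory stand-in for the real Socket class so that the example runs without a network, and Note is a hypothetical user-defined class. Strings are written as a length followed by the bytes, one of several possible encodings.

```cpp
#include <arpa/inet.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

class BufferSocket;  // in-memory stand-in for a real socket class

// Assumed shape of the interface; the actual signatures may differ.
class Streamable {
public:
    virtual ~Streamable() = default;
    virtual void Marshall(BufferSocket& s) const = 0;  // object -> stream
    virtual void Unmarshall(BufferSocket& s) = 0;      // stream -> object
};

class BufferSocket {
public:
    // Native types: integers travel in network byte order.
    BufferSocket& operator<<(std::uint32_t v) {
        std::uint32_t net = htonl(v);
        bytes_.append(reinterpret_cast<const char*>(&net), sizeof net);
        return *this;
    }
    BufferSocket& operator>>(std::uint32_t& v) {
        assert(bytes_.size() - pos_ >= sizeof v);
        std::uint32_t net = 0;
        std::memcpy(&net, bytes_.data() + pos_, sizeof net);
        pos_ += sizeof net;
        v = ntohl(net);
        return *this;
    }
    // Strings: a length prefix followed by the characters.
    BufferSocket& operator<<(const std::string& s) {
        *this << static_cast<std::uint32_t>(s.size());
        bytes_ += s;
        return *this;
    }
    BufferSocket& operator>>(std::string& s) {
        std::uint32_t n = 0;
        *this >> n;
        assert(bytes_.size() - pos_ >= n);
        s.assign(bytes_, pos_, n);
        pos_ += n;
        return *this;
    }
    // The two additions that make user-defined types streamable.
    BufferSocket& operator<<(const Streamable& obj) {
        obj.Marshall(*this);
        return *this;
    }
    BufferSocket& operator>>(Streamable& obj) {
        obj.Unmarshall(*this);
        return *this;
    }

private:
    std::string bytes_;    // everything "sent" so far
    std::size_t pos_ = 0;  // read position
};

// Hypothetical user-defined type. Unmarshall reads the members in the same
// order in which Marshall writes them.
struct Note : Streamable {
    std::uint32_t id = 0;
    std::string text;
    void Marshall(BufferSocket& s) const override { s << id << text; }
    void Unmarshall(BufferSocket& s) override { s >> id >> text; }
};
```

With these pieces in place, sending a Note is written as sock << note and receiving one as sock >> note, exactly as for an integer or a string.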

Something important to keep in mind when dealing with integer-based data that is passed between a client and a server is that numerical representation can differ from machine to machine. Underlying calls must first transform the numerical data to network byte ordering and then decipher them on the recipient’s side. Implementing operator>> and operator<< with calls to the htonl and ntohl family of functions will eliminate problems that could arise with different byte ordering on the client and server.
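To make the conversion concrete, here is a small sketch that turns a 32-bit integer into the four bytes that would travel over the wire and back again. The function names encode_u32 and decode_u32 are assumptions for this example; htonl and ntohl are the standard conversion functions.

```cpp
#include <arpa/inet.h>

#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert a host-order value into the four bytes that go on the wire.
std::array<unsigned char, 4> encode_u32(std::uint32_t host_value) {
    std::uint32_t net = htonl(host_value);  // host -> network byte order
    std::array<unsigned char, 4> wire{};
    std::memcpy(wire.data(), &net, wire.size());
    return wire;
}

// Convert four bytes received from the wire back into a host-order value.
std::uint32_t decode_u32(const std::array<unsigned char, 4>& wire) {
    std::uint32_t net = 0;
    std::memcpy(&net, wire.data(), wire.size());
    return ntohl(net);  // network -> host byte order
}
```

Network byte order is big-endian, so the value 0x12345678 is always sent as the bytes 0x12, 0x34, 0x56, 0x78, whatever the byte order of the sending machine. On a big-endian host both functions are effectively no-ops, which is why code that omits them can appear to work until it meets a machine with a different byte order.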

There is a final problem when dealing with client/server programming: the difficulty in dealing with pointers. In reality, the real problem with pointers only arises when they point to derived classes. In all other cases, you can simply dereference the pointer and send the data across the socket. To deal with this problem, you have to implement some type of class factory. The class factory will perform the task of creating a concrete object whenever a pointer to a derived class is being received from a socket. You need to consider this at the beginning of your design process as adding in a factory could potentially break your existing code. There are a number of ways to implement a factory, but at the core a factory is a class whose responsibility is to create instances of classes. One way to do this involves creating a base class from which all classes created by the class factory will inherit. Each derived class will then be assigned a class ID, usually obtained by calling a virtual function defined in the base class. The class factory can then create objects based on that ID and return some reference to that object, most likely as a pointer to the base class. By adding something like this to the socket implementation, you will allow the proper object to be created on the recipient’s side prior to unmarshalling by first sending across the class ID and creating the object via the class factory. At this point, Unmarshall can be invoked on an object of the correct derived class.
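One possible shape for such a factory is sketched below. Every name in it (Transportable, Ping, Pong, ClassFactory, and the ID constants) is hypothetical, and the registration scheme is only one of the many ways a factory can be built.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical class IDs, agreed on by both client and server.
constexpr std::uint32_t kPingId = 1;
constexpr std::uint32_t kPongId = 2;

// Base class for everything the factory can create.
class Transportable {
public:
    virtual ~Transportable() = default;
    virtual std::uint32_t ClassId() const = 0;  // identifies the concrete type
};

class Ping : public Transportable {
public:
    std::uint32_t ClassId() const override { return kPingId; }
};

class Pong : public Transportable {
public:
    std::uint32_t ClassId() const override { return kPongId; }
};

class ClassFactory {
public:
    using Creator = std::function<std::unique_ptr<Transportable>()>;

    // Associate a class ID with a function that builds that class.
    void Register(std::uint32_t id, Creator creator) {
        assert(static_cast<bool>(creator));
        creators_[id] = std::move(creator);
    }

    // Build the object for `id`, or return null if the ID is unknown.
    std::unique_ptr<Transportable> Create(std::uint32_t id) const {
        auto it = creators_.find(id);
        if (it == creators_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::uint32_t, Creator> creators_;
};
```

On the sending side, the value returned by ClassId is written to the socket ahead of the marshalled object. The receiver reads the ID, asks the factory for a matching object, and then unmarshals into it. Returning null for an unknown ID gives the receiver a chance to reject a message from a peer built against a different set of classes.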

Using the Socket Class

In Listing 5, I show the example echo server as above except using Socket. This example clearly shows socket communication in more of a C++ style and hides the details of the socket API. Aside from the first few socket-specific calls, there is no difference syntactically between the socket functions and similar functions in iostreams. Of course, this is the most basic example of using the Socket class: streaming native C++ types.

Listing 6 is a simple example of a server that treats user-defined classes in the same manner, which really shows the power of the Socket class. By simply adding the inheritance to the Streamable class and implementing the Marshall and Unmarshall functions, the Message class can be streamed across Socket. This clearly cleans up the socket programming interface tremendously and allows any developer to use sockets without having to deal with the difficulties innate to that type of development. Not only does this interface hide those details, but it can be used to hide details of other complex issues related to socket programming as well. Stream-based encryption can be added to the underlying members in the Socket class to perform encryption and decryption at one central point, for example.

This socket implementation allows you to develop clients and servers without being a socket expert. Any developer who knows how to use iostreams can also use sockets and can be a socket programmer without having to know about all the pitfalls and trouble spots associated with socket programming. Having said that, there are several potential trouble spots associated with socket programming, and this class by no means addresses all of them. Because of this, it is important to localize those trouble spots and hide the underlying mechanism as much as possible, especially on a large project. By taking the socket API and implementing a socket class, the problems associated with socket programming can be worked out on a small scale and the class deployed once it is in working order. Problems that arise or changes necessary due to advancements in technology, like the transition from IPv4 to IPv6, can then be dealt with by making changes to a single class, rather than having to modify socket code sprinkled throughout your code base. Though this class minimizes the learning curve when dealing with sockets, the underlying implementation must be robust. A fragile implementation results in unreliable communication, which diminishes the usefulness of any client/server application built on it.

James Pee is a software engineer at G Systems in Plano, Texas. He can be reached at james@jamesandwaysquared.dhs.org.