Dr. Dobb's Journal April 1998
Communication protocols have one basic task -- to guarantee that messages sent from network entities reach their destinations without getting lost or corrupted. This is easier said than done, however. The challenge for communication system designers is to be ready to recover from message loss in unreliable communication channels.
One well-known solution to the problem of flow-and-error control is the sliding-window protocol, which is based on the sender having a message buffer. The messages are continuously numbered as 0,1,...,N-1,0,1,...,N-1,0... that is, modulo N. The sequence numbers of messages the sender has sent with no acknowledgment from the receiver are kept in a cyclic list called a "window," and the corresponding messages are temporarily stored in the buffer (of size N). Whenever a new message is sent, the upper window edge is incremented (modulo N) by one and the corresponding message is stored in the buffer. When an acknowledgment of a successful message transfer is received, the lower window edge is incremented (modulo N) by one. The sender may send new messages as long as there are empty slots in the window (and in the buffer).
Figure 1 illustrates the windowing mechanism with N=8. The sender is currently waiting for the acknowledgment of the sent messages numbered 2, 3, 4, and 5. The next message to be sent has 6 as its sequence number (modulo 8).
The advantage of the windowing mechanism becomes evident when a message is lost (or corrupted in such a way that the receiver cannot recognize it). Suppose the sender obtains the message Nak(5) about the loss of message number 5 in Figure 1. This means the receiver has not received the message with sequence number 5; however, the messages numbered 2, 3, and 4 have reached their destinations in proper form. In this case, the lower window edge is shifted up to 5, and just the message numbered 5 is resent from the buffer. The sender may fill the window and the buffer by also sending the messages numbered 6, 7, 0, 1, 2, 3, and 4 (modulo 8), but after that, it has to stop and wait for the acknowledgment of the resent message 5.
To handle possible losses of the (negative) acknowledgments from the receiver, the sender has a timer that fires if no acknowledgment for a resent message is obtained within a certain period of time. The timer firing simply starts a similar resending process as a negative Nak acknowledgment.
Because of the complexity of communication systems, protocol engineering has evolved into a discipline of its own. This development has led to several formal specification and verification methods, conformance testing strategies, and protocol engineering languages and tools.
Kannel is an object-oriented programming language designed for protocol engineering. Kannel provides facilities for describing the main aspects of communication protocols and implementing the corresponding software. Distribution, architectural layering, message transfer, and reactivity can be expressed in Kannel with linguistic elements such as processes, channels, ports, transfer syntaxes, and statecharts.
Kannel was developed by the Nodes research group (of which we are members) at the Department of Computer Science, University of Helsinki. Kannel is freely available at http://www.cs.helsinki.fi/ research/kannel/.
In addition to these protocol-like features, Kannel includes the conventional machinery of general programming languages. Most notably, Kannel's object-oriented type system provides mechanisms such as interfaces, subtyping, code inheritance, polymorphism, and dynamic binding. From this point of view, Kannel can be compared to object-oriented languages such as C++, Eiffel, and Java.
Kannel is partly visual, in that many protocol aspects can be written in a graphical notation. This applies most notably to modular layers and statecharts, two central protocol components with a standard graphical representation. These also have a purely textual syntax, as does the entire language.
In our Kannel implementation, we have split the sliding-window protocol into three layers (see Figure 2), each with a well-defined task:
The birds-eye view of the system is created with the graphical editor ked. The actual contents of some processes have been hidden in a cloud to keep the picture uncluttered.
The three layers are called "branch processes" since they only modularize the protocol structure by hiding the layer contents, a useful feature when you wish to have several type-compatible layers that are substitutable with each other (however, our example does not exploit this feature).
Kannel processes can only communicate through named channels that attach to ports within the process. Channels are abstract data paths that convey messages from one process to another. There are two kinds of channels -- local channels connect processes that reside within the same address space, while separate channels connect distributed processes. Communication over a separate channel is nontrivial and must be realized either by a lower layer or by a driver object that represents the physical layer of a protocol (such as a network adapter). Figure 2 depicts four channels: UserCon and ConDat specify the messages exchanged between UserLayer, LinkLayer, and TransferLayer; whereas the separate Transfer channel specifies the messages exchanged within the TransferLayer. The TimerCh channel attaches timers to the TransferLayer components. Table 1 lists some of the messages.
The actual protocol work is done by leaf processes, which encapsulate attributes, methods, and a state automaton. The automaton controls how the process reacts to incoming messages. For example, the send_lost message that aborts a connection (see Table 1) can only be applied when a connection is already established. A state automaton provides a clean way to specify this constraint.
Kannel employs statecharts to describe process behavior. Statecharts (developed by David Harel) are a structured variation of state automata that allow you to cleanly separate concurrent activities, such as sending and receiving, from each other. In addition, states can contain substates that share the transitions of their parents, making descriptions more compact. For more information on statecharts, see "Extended State Diagrams and Reactive Systems," by Doron Drusinsky (DDJ, October 1994).
Figure 3 illustrates the statechart of ConnectionManager instance cm1. A request for opening a connection is initiated with the conn_req message issued by the User process. When cm1 receives this message, one of its concurrent substates, send_conn, reacts by changing from state disconnected to state opening, and by sending a SendFrameReq(T_CIND, NO_DAT) message (see Table 1) to the DTransfer process below. To keep statecharts compact, Figure 3 only shows the messages that trigger the transitions (see Listing One for the textual equivalent of Figure 3). Concurrent substates (also called AND states), such as "Connection" in Figure 3, have a shaded background.
The receiving end must first confirm that it accepts a connection. Eventually, the packet sent arrives as a conn_ind message to the rec_conn submachine within the second ConnectionManager process cm2. This triggers the sending of a SendFrameReq(T_CCNF, NO_DAT) message that eventually arrives at cm1 as a conn_conf, and finally establishes the connection. Now cm1 can begin sending data to cm2. Note, however, that since the protocol is fully symmetric, the send_conn of cm2 remains in state disconnected and cannot be used to transfer data to cm1 unless a similar transaction is performed in the reverse direction.
The DTransfer process is responsible for providing a reliable data path to the remote system. The sliding-window mechanism it employs can recover from lost and duplicate packets while still being reasonably efficient in its use of the network bandwidth. Figure 4 shows the statechart for the DTransfer process. It consists of an AND state Transmission that contains two concurrent substates, Sender and Receiver, making simultaneous sending and receiving possible. Both states use auxiliary variables listed in Table 2.
The sender transmits numbered packets (received via send_frame_req from ConnectionManager) to the receiver. The packets sent remain within send_window until they are acknowledged. If the window ever becomes full, Sender stops sending packets and instead stores them in send_buffer to wait for subsequent transmission. The numbering ensures that packets are received in correct order.
Whenever a packet is sent, Sender starts a timer that is used to control the retransmission of lost packets. If the timer expires (message t_exp), the packet is resent and the timer is restarted. This process is retried at most RETX_MAX times, after which Sender gives up and terminates the connection (see Listing Two).
However, under normal conditions the receiver will get the packet and reply with an ack(n) message, which removes packets numbered 0,...,n from send_window and refills the window from send_buffer if it contains any data (see Listing Three).
The receiving end will detect when a packet is duplicated or lost and send information about it in a nak(n) message. This causes Sender to discard all packets up to n from the window and retransmit all remaining packets in the window. Interestingly, nak(n) thus combines a positive acknowledgment for packets 0,...,n-1 with a negative acknowledgment for packet n (see Listing Four).
The receiving side of the protocol depicted by Receiver in Figure 4 is simpler, as it does not have to store any packets. The basic operation is performed in state data where iframe messages are checked for a matching sequence number. If a match is found, the packet is acknowledged. Otherwise, the process sends a nak message and enters state nak_sent, where all incoming packets are discarded until the one with the expected sequence number is found.
To cope with sender failures, nak messages also use a timer that eventually causes the connection to break if the sender keeps quiet.
The Transfer channel is separate (see the dashed line in Figure 2), forcing the two protocol stacks {u1,cm1,d1} and {u2,cm2,d2} into separate executables.
To keep things simple, we use the Socket driver from the Kannel library to interconnect the stacks. Socket delivers packets using the UDP protocol. The socket object is configured with information obtained from the command line when starting one side of the protocol stack (see Listing Five). After initialization, the syntax for using the channel is uniform; the driver is abstracted away, making it simple to change the driver to accommodate changes in the hardware.
The Kannel environment includes a tool called the Kannel Animator (Kana) for the symbolic execution of protocols. Because Kana is interpreter-based, it is highly interactive. In addition to controlling the execution and allowing easy manipulation of run-time data structures, the tool accepts normal Kannel code. Kana animates both the internal activity of the protocol processes and message passing between them. The animation trace produced by the tool can be saved for later examination. The user can specify breakpoints to pause execution, then inspect and modify the attributes. Process priorities and message buffers can be changed to illustrate lost or duplicated messages.
Kana animates the protocol with message sequence charts (see Figure 5). The processes, with their names and types, are displayed as boxes in the upper part of the MSC window. The boxes can be collapsed or opened to, say, display the subparts of a branch process. The vertical lines act as endpoints for the horizontal arrows that visualize the messages sent between the processes. Kana highlights the currently active process and the message that was last sent. The messages that have not yet been received are shown as dashed arrows.
As an example, consider the sliding-window protocol in a situation where a packet (IFrame) is lost. Figure 5 is a Kana session where the groups of three processes on the left and on the right represent the protocol stacks of the sender and the receiver, respectively. The simulation has stopped due to a breakpoint. The u1 process has issued requests to send data to the receiver.
The requests are shown as SendReq messages to the lower layer (cm1), which has redirected them to the d1 process as SendFrameReq messages. The d1 process in turn has sent them across the network inside three IFrame messages.
The receiver has acknowledged the first IFrame normally, and delivered it to the upper layer as a DataInd message. You can see that the second frame has been lost, because the last frame has already been received. The user has manually deleted the second IFrame from the receiver's (d2) buffer to simulate the message loss. The next message that arrives is out of sequence, and the receiver issues a negative acknowledgment (nak) asking the sender to retransmit all frames onward from the one the receiver is waiting for. When the sender gets the nak message, it retransmits the second and third IFrame. They, too, will eventually be acknowledged and removed from the sender's send window.
Kana can also animate the graphical specifications created with the ked tool.
Figure 5 illustrates the statechart of the d2 process (the receiver's lowest layer). The process has received a frame that is out of sync and that has fired the transition leading to state nak_sent. The process has subsequently sent a negative acknowledgment and is awaiting the correct frame. When that frame arrives, the process will return to its normal state data. If, however, the timer of the receiver should fire, the nak message would be retransmitted and, eventually, the connection would be terminated.
We continue to actively develop Kannel. Most notably, we're adding a mechanism called "refinement" for extending protocol layers in a type-safe manner. Furthermore, a flexible interface for supporting the different transfer representations of messages is in the works.
DDJ
and state connection is init send_conn;
state rec_conn is
...
end rec_conn;
state send_conn is
...
state disconnected arcs
conn_req -> {
peer ! SendFrameReq.create(T_CIND, NO_DAT);
go opening
}
disc_req -> { err("not connected") }
send_req -> { err("not connected") }
end disconnected
end send_conn
end connection
t_exp -> { if t_exp.num <> REC_TIMER then
f : IFrame;
if send_count >= RETX_MAX then
up ! new RecLost;
send_buffer.clear;
...
send_window.clear;
current:= 0;
send_count:= 0
else
sw_iter.initiate(send_window);
loop sw_iter.exhausted.until;
f:= sw_iter.get_current;
if f.frame_num = t_exp.num then
peer ! f;
tim ! TStart.create(t_exp.num, TIME);
send_count:= send_count+1;
break
end;
sw_iter.next
end loop
end if
end if
}
ack -> { f,g : IFrame;
loop send_window.remove_head(f).while;
tim ! TStop.create(f.frame_num);
send_count:= 0;
if send_buffer.remove_head(g) then
g.frame_num:= current;
peer ! g;
tim ! TStart.create(g.frame_num, TIME);
send_window.put_tail(g);
inc(current)
end;
(f.frame_num = ack.frame_num).until
end loop
}
nak -> { f : IFrame;
loop send_window.first(f).while;
(f.frame_num = nak.frame_num).until;
send_window.remove_head(f);
tim ! TStop.create(f.frame_num)
end loop;
sw_iter.initiate(send_window);
loop sw_iter.exhausted.until;
f:= sw_iter.get_current;
peer ! f;
tim ! TStart.create(f.frame_num, TIME);
sw_iter.next
end loop
}
main(dvec[string]) is send_buffer:= IFrame_queue.create(SEND_BUFFER_SIZE);
send_window:= IFrame_queue.create(SEND_WINDOW_SIZE);
sw_iter:= IFrame_queue_iter.create(send_window);
current:= 0; -- Number for next iframe
expected:= 0;
send_count:= 0; -- Retransmission counters
rec_count:= 0;
if arg.size < 4 then
stdout.print_str("Usage: " + arg.get(0) + " s-port host t-port\n");
SYS.shutdown
end;
peer.configure(SOCK_SPORT, arg.get(1)); -- source port
peer.configure(SOCK_THOST, arg.get(2)); -- target host
peer.configure(SOCK_TPORT, arg.get(3)); -- target port
peer.configure(SOCK_TCP, "false"); -- use UDP
peer.open
end main