Quality of Service & Voice-Over-IP

Dr. Dobb's Journal May 2001

Testing IP packets

By Vilho Räisänen

Vilho is a senior specialist at Nokia Networks, Espoo, Finland. He can be contacted at vilho.raisanen@nokia.com.

Quality of service (QoS), measured by limited end-to-end delay and packet loss, is clearly important for optimal TCP operation. However, QoS is even more important for interactive real-time communication such as voice-over-IP (VoIP) or video conferencing over the Internet. That said, it is equally clear that today's Internet does not guarantee QoS.

While QoS enhancing techniques, such as Differentiated Services and Integrated Services (see http://www.ietf.org/), have been designed to tackle this problem, their deployment is still in the future. In the meantime, there are tools you can use to measure packet transport and QoS. To that end, I present in this article one such measurement system I built (see "Resource Center," page 5). First, however, I'll examine some of the issues involved when measuring packet transport on the Internet.

Measurement Basics

For illustration purposes I'll use a media stream of VoIP (packet voice) as a prototype for a QoS-critical application. The most important requirements for transport of VoIP are:

For VoIP, the allowable values for both requirements depend on various factors, such as the audio coding used. Here, I'll base my discussion on the publicly available QoS standards of the Internet Engineering Task Force (IETF). (Other IP telephony standardization bodies, such as the ITU-T, TIA, and ETSI, also provide QoS documents, but on a members-only basis.)

For starters, the requirements for detecting QoS problems are:

Each of these has been addressed to varying degrees in "RTP: A Transport Protocol for Real-Time Applications," by H. Schulzrinne et al. (IETF RFC 1889), the de facto VoIP and packet-video transport protocol, which provides timestamping and sequence numbering.

There are two basic approaches to measuring transport QoS: passive measurement (monitoring) of production streams, and active measurement with a stream of test packets. The two methods differ: Active measurements, for instance, require extra bandwidth, whereas passive ones do not. A more detailed comparison involves issues such as the specialized hardware that passive monitoring of high-capacity links requires, and the encryption and privacy concerns that may affect it (see "Measurement and Analysis of IP Network Usage and Behavior," by R. Caceres et al., IEEE Communications, May 2000). Moreover, reliable delay values cannot always be obtained, because of clock synchronization issues and unknown QoS effects of the operating system and TCP/IP stack implementation.

Active measurements can be implemented in a simple way, providing a controlled and portable test environment. A higher degree of control results from using tested hardware and operating systems. Portability means that a measurement can (in principle) be made at any location with IP connectivity. This approach has been the subject of standardization work in the IPPM working group of the IETF (http://www.ietf.org/). Two approaches to active measurement being standardized within the IETF are the transmission of test packets at Poisson-distributed time intervals and the emulation of a periodic media stream (such as that of VoIP).
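The two kinds of test stream differ only in how the transmission instants are chosen. The following Python sketch is illustrative and is not taken from the tool described in this article; the function names and the 20-ms interval are assumptions. It relies on the fact that a Poisson process has exponentially distributed gaps between events.

```python
import random

def periodic_send_times(count, interval_s):
    """Send offsets (in seconds) for a periodic stream, as in media emulation."""
    return [i * interval_s for i in range(count)]

def poisson_send_times(count, mean_interval_s, rng=None):
    """Send offsets whose gaps are exponentially distributed (a Poisson process)."""
    rng = rng or random.Random()
    times, t = [], 0.0
    for _ in range(count):
        # expovariate() takes the rate, that is, 1 / mean gap.
        t += rng.expovariate(1.0 / mean_interval_s)
        times.append(t)
    return times

if __name__ == "__main__":
    print(periodic_send_times(3, 0.020))
    print(poisson_send_times(3, 0.020, random.Random(1)))
```

Passing a seeded random.Random instance makes a Poisson schedule repeatable, which can help when two runs are to be compared.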

The requirements for QoS problem detection can be met with active measurements by transmitting/receiving test traffic at or above IP level, and by using suitable timestamping and sequence numbering of the test packets. Why should you use specific test IP packets instead of ICMP ping packets? Because the use of the same protocol families as a real application is a guarantee against possible negative effects due to protocol discrimination in IP network elements.

As Figure 1 illustrates, an active measurement is performed between two hosts. Different possibilities for making the measurement include one-way measurement (A-B), paired one-way measurement (A-B+B-A), and round-trip measurement (A-B-A). One of the reasons for making measurements in different directions is the possible asymmetry of IP transport delays. One-way measurements require that host clocks be synchronized to millisecond accuracy. Different arrangements have been described by V. Paxson et al. in "Framework For IP Performance Metrics" (IETF RFC 2330, draft-ietf-ippm-npmps-04.txt) and by V. Räisänen and G. Grotefeld in "Network Performance Measurement For Periodic Streams" (draft-ietf-ippm-npmps-02.txt). Here, I'll describe a Linux implementation of an active measurement system that is suitable for both media emulation and Poisson-distributed test packets.

The original reason I implemented my own tools rather than using publicly available tools such as MGEN (http://manimac.itd.nrl.navy.mil/MGEN/) was to be able to experiment with both one-way and round-trip measurements. Moreover, it seemed important to be able to implement advanced features such as the setting of Differentiated Services TOS bits/IPv6 traffic class. (Setting of TOS bits seems to be supported in the latest version of MGEN, but there's no mention of IPv6.) The same concerns were raised with commercially available tools a few years ago.
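As a rough illustration of what setting the marking involves, the Python sketch below places a 6-bit Differentiated Services code point (DSCP) in the upper bits of the IPv4 TOS byte or the IPv6 traffic class. It is a sketch under stated assumptions, not code from the tool described here: the helper names are hypothetical, and the socket option names are the ones Python exposes on Linux.

```python
import socket

def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP into the upper six bits of the TOS/traffic-class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2

def mark_socket(sock, dscp):
    """Apply the marking to an already created IPv4 or IPv6 socket."""
    tos = dscp_to_tos(dscp)
    if sock.family == socket.AF_INET6:
        # IPv6 carries the marking in the traffic class field.
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_TCLASS, tos)
    else:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)

if __name__ == "__main__":
    # DSCP 46 (Expedited Forwarding) corresponds to a byte value of 184.
    print(dscp_to_tos(46))
```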

The Implementation

The design principles for my QoS tool implementation are:

The first point is important because of the pressure for extending IP address space from both emerging markets and the predicted ubiquity of wireless Internet access devices (see http://www.3gpp.org/). IPv6 support for security and mobility is also important. At least FreeBSD 4.1/KAME and the latest Linux distributions support IPv6 (for Linux, IPv6 support is experimental and kernel recompilation is needed). In Linux, a socket interface that is transparent to the protocol version is provided by the getaddrinfo() function (see "Basic Socket Interface Extensions to IPv6," by R. Gilligan et al., IETF RFC 2553). From the point of view of a VoIP application, a media emulation measurement yields realistic results when UDP sockets are used. This is because a VoIP application, such as a terminal program, makes use of a UDP port multiplexing interface. In addition, the use of a transport-level socket interface makes programming easier. For Poisson-type measurements, either UDP sockets or raw IP sockets can be used; see Figure 2. Running a program that makes use of raw sockets typically requires SUID root privileges.
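The idea of a protocol-version-transparent interface can be sketched with Python's socket.getaddrinfo(), which wraps the same system call. This is a minimal sketch, not the implementation described in this article; resolve_udp() and open_udp() are hypothetical helper names, and port 5004 is an arbitrary choice.

```python
import socket

def resolve_udp(host, port):
    """Return (family, sockaddr) candidates for a UDP endpoint, IPv4 or IPv6."""
    infos = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM)
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]

def open_udp(host, port):
    """Create a UDP socket of whichever address family resolves and can be opened."""
    last_error = None
    for family, sockaddr in resolve_udp(host, port):
        try:
            return socket.socket(family, socket.SOCK_DGRAM), sockaddr
        except OSError as exc:
            last_error = exc
    raise last_error or OSError("no usable address")

if __name__ == "__main__":
    for family, sockaddr in resolve_udp("127.0.0.1", 5004):
        print(family, sockaddr)
```

Because the address family comes from the resolver rather than from the program, the same measurement code can serve an IPv4 or an IPv6 peer.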

The requirement of synchronization between host clocks in delay measurements can be circumvented while maintaining the possibility of making one-way delay measurements at the same time. By way of an example, here's how to do this with the A-B-A test arrangement of Figure 1: Set timestamp TS1 and sequence number S1 immediately prior to transmitting the test packet from host A. TS2 and S2 are set upon reception of the test packet at host B. TS3 is set immediately prior to the transmission of the test packet from B to A, and TS4 is set when the measurement packet is received back at host A. If an identification of the received packet is stored in the order of arrival at host A, packet losses and reorderings can be detected in both directions.
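The bookkeeping at host A can be sketched as follows. The sketch assumes that S1 numbers the packets 0, 1, 2, ... at host A and that S2 is a counter host B increments for every test packet it receives; the text above does not spell out how S2 is assigned, so this is one possible reading, and the function names are hypothetical.

```python
def count_reordered(seq_in_arrival_order):
    """Count packets that arrive after a packet with a higher sequence number."""
    highest = -1
    late = 0
    for seq in seq_in_arrival_order:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late

def analyze_at_host_a(sent_count, arrivals):
    """arrivals: (s1, s2) pairs in the order the packets returned to host A."""
    s1_order = [s1 for s1, _ in arrivals]
    s2_order = [s2 for _, s2 in arrivals]
    # Sorting by S2 recovers the order in which host B saw the S1 numbers.
    s1_as_seen_by_b = [s1 for s1, _ in sorted(arrivals, key=lambda p: p[1])]
    return {
        "missing": sorted(set(range(sent_count)) - set(s1_order)),
        "reordered_a_to_b": count_reordered(s1_as_seen_by_b),
        "reordered_b_to_a": count_reordered(s2_order),
    }

if __name__ == "__main__":
    # Packet 1 never returns; packets 2 and 3 swap on the way to B;
    # the last two packets swap on the way back to A.
    print(analyze_at_host_a(5, [(0, 0), (3, 1), (4, 3), (2, 2)]))
    # {'missing': [1], 'reordered_a_to_b': 1, 'reordered_b_to_a': 1}
```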

Figure 3 shows a general measurement packet format. TS1-TS4 are timestamps and S1 and S2 are sequence numbers. Optional padding may be used to make the total packet size correspond to realistic VoIP codec output (most relevant to media stream emulation).
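One way to lay out such a packet is sketched below in Python. The field widths (32-bit sequence numbers, 64-bit timestamps), the field order, and network byte order are assumptions made for illustration; Figure 3 may differ. The default size of 172 bytes matches the measurement reported later in this article.

```python
import struct

# Assumed layout: S1, S2 as 32-bit unsigned; TS1..TS4 as 64-bit unsigned.
HEADER = struct.Struct("!II4Q")

def pack_packet(s1, s2, timestamps, total_size=172):
    """Build a test packet, zero-padded to total_size bytes."""
    if len(timestamps) != 4:
        raise ValueError("expected TS1..TS4")
    header = HEADER.pack(s1, s2, *timestamps)
    if total_size < len(header):
        raise ValueError("total_size is smaller than the header")
    return header + bytes(total_size - len(header))

def unpack_packet(data):
    """Return (s1, s2, [ts1, ts2, ts3, ts4]); any padding is ignored."""
    s1, s2, *timestamps = HEADER.unpack_from(data)
    return s1, s2, timestamps

if __name__ == "__main__":
    packet = pack_packet(7, 0, [1_000_000, 0, 0, 0])
    print(len(packet), unpack_packet(packet))
```

Fields not yet known to the sender (S2 and TS2 through TS4 at host A) are simply left at zero and filled in later.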

With the scheme just described, round-trip delay can now be computed as d=a(TS4-TS3+TS2-TS1), where a is a conversion factor that results in milliseconds. If host clocks of A and B are synchronized to millisecond accuracy, one-way delays d_AB=a(TS2-TS1) and d_BA=a(TS4-TS3) can also be computed. Host clock synchronization can be achieved by using a Network Time Protocol (NTP) client, if an NTP server (preferably low stratum) is available nearby. (Network congestion may affect the accuracy of NTP operation.) A more accurate way is based on using a time synchronization Global Positioning System (GPS) receiver.
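A small worked example shows why the round-trip figure survives unsynchronized clocks while the one-way figures do not. The numbers are made up for illustration: timestamps are in microseconds (so a = 1/1000), the true delays are 30 ms from A to B and 40 ms from B to A, host B holds the packet for 5 ms, and B's clock runs 250 ms ahead of A's.

```python
def delays_ms(ts1, ts2, ts3, ts4, ticks_per_ms=1000):
    """Apply the formulas above; dividing by ticks_per_ms plays the role of a."""
    return {
        "round_trip": (ts4 - ts3 + ts2 - ts1) / ticks_per_ms,
        "a_to_b": (ts2 - ts1) / ticks_per_ms,  # valid only with synchronized clocks
        "b_to_a": (ts4 - ts3) / ticks_per_ms,  # valid only with synchronized clocks
    }

if __name__ == "__main__":
    ts1 = 1_000_000                    # A's clock, at transmission
    ts2 = ts1 + 30_000 + 250_000       # B's clock: 30 ms later, plus the offset
    ts3 = ts2 + 5_000                  # B's clock: after a 5-ms hold
    ts4 = ts1 + 30_000 + 5_000 + 40_000  # A's clock, at reception
    print(delays_ms(ts1, ts2, ts3, ts4))
    # {'round_trip': 70.0, 'a_to_b': 280.0, 'b_to_a': -210.0}
```

The 250-ms offset appears with opposite signs in the two one-way terms and cancels in their sum, leaving the 70-ms network round trip with the hold time at B excluded.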

Millisecond accuracy is adequate for timestamps and is easy to achieve with, for example, the gettimeofday() function. Process scheduling in the operating system must be taken into account in transmitting packets; not all packet transmission intervals may be possible (at least without recompiling the kernel).

Packet losses in either direction cause loss of measurement data. This must be taken into account in the analysis phase. One approach is to record the IDs of received packets in host B and to transmit a list of the received packets to host A after the actual measurement is over. In some cases, when media stream emulation measurement is made, it may be beneficial to attempt to maintain realistic stream metrics on the IP level. If a packet is lost en route from A to B, stream metrics restoration may be attempted by transmitting a fake packet from B to A in place of the missing one.
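With the list of packet IDs that host B reports after the measurement, losses can be assigned to a direction using plain set differences. A minimal Python sketch, with hypothetical names, and without the fake-packet substitution mentioned above:

```python
def attribute_losses(sent_ids, received_at_b, returned_to_a):
    """Split lost packet IDs by the direction in which they disappeared."""
    sent, at_b, at_a = set(sent_ids), set(received_at_b), set(returned_to_a)
    return {
        "lost_a_to_b": sorted(sent - at_b),  # never reached host B
        "lost_b_to_a": sorted(at_b - at_a),  # reached B but not A
    }

if __name__ == "__main__":
    print(attribute_losses(range(6), [0, 1, 3, 4, 5], [0, 1, 4, 5]))
    # {'lost_a_to_b': [2], 'lost_b_to_a': [3]}
```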

In the A-B-A measurement example, host B need not return each measurement packet immediately; packets can be stored and transmitted at a later time. In this way, less bandwidth is needed for the measurement.

A practical aspect of measurement program implementation is that multiple threads can be used to make program structure conceptually simple; see Figure 4. As usual, simplicity is bought at the price of performance. Even with a relatively low-power PC, host A can typically execute separate threads for transmitting packets and receiving them, as well as analyzing results on the fly and displaying statistics. Similarly, at the remote end (host B), one thread can be used for receiving packets and another for sending them. In Linux, the pthreads library can be used for implementing threads.

Transmitting packets at proper intervals can be implemented in various ways, ranging from the usleep() function call to timers and signals. In Linux environments, a simple delay function based on usleep() often provides a sufficiently smooth transmission of packets due to the aforementioned scheduling process.
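A sleep-based pacing loop can be sketched in Python, where time.sleep() is the counterpart of usleep(). Sleeping until an absolute deadline for each packet, rather than for a fixed interval after each send, is a design choice of this sketch (it keeps scheduling delays from accumulating) and is not necessarily what the tool described here does. The clock and sleep parameters are hypothetical hooks that make the loop testable.

```python
import time

def paced_send(send, count, interval_s, clock=time.monotonic, sleep=time.sleep):
    """Call send(i) for i = 0..count-1, aiming at start + i * interval_s."""
    start = clock()
    for i in range(count):
        remaining = start + i * interval_s - clock()
        if remaining > 0:
            sleep(remaining)
        # If the deadline has already passed, send at once and catch up.
        send(i)

if __name__ == "__main__":
    t0 = time.monotonic()
    paced_send(lambda i: print(i, round(time.monotonic() - t0, 3)), 5, 0.020)
```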

Analysis

To obtain more accurate results, the delay and packet loss data in a measurement may be stored for future analysis. From this data, characteristics such as delay distributions — quantiles as well as loss patterns — and correlations can be extracted.
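Two of these characteristics, delay quantiles and loss patterns, can be extracted with a few lines of Python. This is a minimal sketch with hypothetical function names; the nearest-rank quantile definition is one of several in common use, and the sample data is made up.

```python
import math

def quantile(sorted_values, q):
    """Nearest-rank quantile of a sorted, non-empty list, for 0 < q <= 1."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]

def loss_bursts(received_flags):
    """Lengths of runs of consecutive losses; a False entry marks a lost packet."""
    bursts, run = [], 0
    for ok in received_flags:
        if ok:
            if run:
                bursts.append(run)
            run = 0
        else:
            run += 1
    if run:
        bursts.append(run)
    return bursts

if __name__ == "__main__":
    delays_ms = sorted([20, 22, 21, 95, 23, 24, 25, 26, 27, 300])
    print(quantile(delays_ms, 0.5), quantile(delays_ms, 1.0))
    print(loss_bursts([True, True, False, False, True, False, True, True, False]))
```

A list of burst lengths distinguishes isolated losses from bursty ones even when the average loss percentage is the same.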

For Poisson-type test streams, you can run a long enough test to obtain a sufficient amount of data at all time scales. For media emulation measurements, I recommend you perform a series of measurements in such a way that the duration of the interval between successive measurements is Poisson distributed. In this way, multiple time scales in network behavior are sampled.

Figure 5 shows a delay trace (that is, a time series of delays experienced by test packets) for a media stream emulation-type measurement (Figure 4). The parameters used in the measurement are as follows: The measurement packet size is 172 bytes transmitted into the socket (200 bytes at IP level), the interpacket interval is 20 ms, and the number of packets is 90,000. The packet stream corresponds to the metrics of a media stream of the G.711 codec or PCM stream (8-bit sampling at 8 kHz, single sample frame per IP packet). The measurement was unusually long (30 minutes) and was performed to clearly display changes in network transport QoS at time scales of minutes. The packet loss percentage, averaged over the whole measurement, was <0.3 percent. However, Figure 5 shows that losses appear in a bursty fashion.
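The stream parameters can be cross-checked with a little arithmetic. The Python sketch below assumes that the 172 bytes written to the socket stand for a 160-byte G.711 frame plus a 12-byte RTP header, and that the IP-level figure adds an 8-byte UDP header and a 20-byte IPv4 header without options; the article gives only the totals, so this breakdown is an assumption.

```python
SAMPLE_RATE_HZ = 8000   # 8-kHz sampling
BYTES_PER_SAMPLE = 1    # 8-bit samples
INTERVAL_MS = 20        # interpacket interval
RTP_HEADER = 12         # assumed part of the 172 bytes
UDP_HEADER = 8
IPV4_HEADER = 20        # no IP options
PACKETS = 90_000

payload_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * INTERVAL_MS // 1000
socket_bytes = payload_bytes + RTP_HEADER
ip_bytes = socket_bytes + UDP_HEADER + IPV4_HEADER
duration_s = PACKETS * INTERVAL_MS / 1000
ip_rate_kbps = ip_bytes * 8 / INTERVAL_MS  # bits per millisecond = kbit/s

if __name__ == "__main__":
    print(payload_bytes, socket_bytes, ip_bytes)  # 160 172 200
    print(duration_s / 60, "minutes")             # 30.0 minutes
    print(ip_rate_kbps, "kbit/s at IP level")     # 80.0 kbit/s at IP level
```

Under these assumptions the test stream occupies 80 kbit/s in each direction at the IP level, which is the extra bandwidth this particular active measurement costs.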

Approximating one-way delay as one-half of round-trip delay, IP transport delay stays at a tolerable level (<100 ms) for most of the time. On the other hand, at times one-way delay has high peaks and there are bursts of packet losses. Based on the measurement results, VoIP call quality would be okay most of the time, but with some bad luck the call could be placed at a time during which transport QoS is poor. The results, despite duration of 30 minutes, pertain to a specific time of day — measurement between the same endpoints a few hours later yields different results.

Conclusion

Active measurements, based on the transmission of IP test packets, provide a portable way of estimating network transport quality between two points in an IP network. A few subspecies of the method were discussed, including one-way and round-trip measurements, and random versus media stream emulation measurements. Factors such as IP protocol version independence and clock synchronization were listed.

A careful, controlled test requires attention to many details that cannot be given here. Interested readers are referred to the documents of the IETF IPPM working group for issues such as the definition of packet loss events and the reporting of measurement results. Moreover, advanced tests may involve measurements at multiple protocol levels.

DDJ