The Media Gateway Control Protocol

Dr. Dobb's Journal May 2000

A simpler and more reliable voice over the Internet

By Linden deCarmo

Linden is a senior software engineer at NetSpeak where he works on advanced telephony software for IP networks. He is the author of The Core Java Media Framework (Prentice Hall 1999) and can be contacted at lindend@mindspring.com.

Nothing is more terrifying than a telephone that crashes as you're dialing 911. Luckily, conventional analog phones are simple, stable devices that users can trust. Digital Internet phones, on the other hand, are feature rich, but more susceptible to instabilities. In this article, I'll examine MGCP, the lightweight telephony protocol that promises reduced complexity and increased reliability for digital Internet phones.

In my article "Internet Telephony Protocols" (DDJ, July 1999), I examined voice-over-IP (VoIP) protocols that reduce expenses and ease service creation. In the process, I pointed out that gateways are responsible for converting packet-based audio formats into protocols understandable by the analog phone system. Unfortunately, the two dominant signaling protocols in VoIP -- H.323 and Session Initiation Protocol (or SIP) -- are industrial-strength solutions better suited for servers than for consumer (residential) gateways.

Consumer gateways perform basic activities such as ringing the phone, detecting key presses, and transmitting audio/video streams. However, both H.323 and SIP must handle network-related issues, such as service creation and user authentication. Because these extra functions are irrelevant for gateways, consumer-centric companies have been gravitating toward simplified Device Control Protocols (DCP), rather than all-encompassing signaling protocols such as H.323.

The Internet Protocol Device Control (IPDC) was a first-generation DCP. Its goal was to create dumb endpoints (or gateways). Endpoint simplicity was obtained by separating services from the gateway (or endpoint) and placing network intelligence on a server. For its part, the Simple Gateway Control Protocol (SGCP) was created at approximately the same time as IPDC. Like IPDC, SGCP revolves around intelligent servers and simple endpoints. Unlike IPDC, however, SGCP is a text-based protocol, which eases development and debugging.

Because IPDC and SGCP were redundant, the two camps merged the protocols to form the Media Gateway Control Protocol (MGCP). This merger combined the best features of each protocol into a single entity. Like SGCP, for instance, MGCP is text based. It also leverages IPDC's superior event handling and extensibility features. After the merger was completed, MGCP was submitted to the Internet Engineering Task Force (IETF) as a draft.

MGCP designers believe its simplicity improves reliability, interoperability, and security. For instance, its simplicity means devices should contain less code and therefore fewer bugs. Furthermore, the reduction in messages should facilitate interoperability between MGCP vendors. Finally, MGCP devices are untrusted network elements (all critical information is stored on trusted servers). This means that network administrators need only secure a limited number of servers and not millions of devices on consumer premises.

The Ultimate Merger

While MGCP was evolving, a parallel effort was underway at the International Telecommunication Union (ITU). After analyzing the bloated feature set in H.323 terminals, the ITU developed H.GCP -- a protocol that contains the minimal features necessary to create a gateway.

The ITU and IETF realized that competing protocols were likely to stymie the growth of DCP in the consumer market. Therefore, the two groups pooled their efforts and created the MEGACO protocol. Although MEGACO is still being refined, it contains all of the functionality of MGCP, plus superior controls over analog phone lines and the ability to transport multiple commands in a single packet; see Table 1.

Mastering the Device

All DCPs are designed to facilitate communication between endpoints and endpoint controllers (or servers). This communication involves enabling connections between endpoints, detecting and reporting events, and generating signals.

Endpoints are logical representations of devices that produce and/or consume multimedia data. They can be terminators or gateways. Terminating endpoints directly accept or produce VoIP streams (for example, a native IP telephone).

However, virtually all endpoints are gateways because the existing telephony infrastructure is incompatible with packet-based audio (see Figure 1). The two most popular types of gateways are residential and trunking. Residential gateways convert audio packets to the RJ-11 format expected by analog phones used by consumers. Trunking gateways convert to Public Switched Telephone Network (PSTN) formats such as Integrated Services Digital Network (ISDN) and Signaling System 7 (SS7). These gateways let residential users reach numbers other than VoIP phone numbers or access PSTN features such as 800-number services.

DCP endpoints don't know how to locate other endpoints. Consequently, an external entity, called a "call agent" (CA), is needed to complete calls (see Figure 2). Since the CA locates other endpoints, the device's only responsibility is to execute the low-level commands sent to it by the CA. Once the CA establishes a connection between the endpoints, the endpoints communicate directly via the Real-time Transport Protocol (RTP).

Sometimes a CA must communicate with another CA in order to complete a call; see Figure 3. Currently, SIP is favored for these purposes, but other protocols may be used for inter-CA communication. Fortunately, most DCPs are neutral in these inter-call agent protocol tussles. They mention that a CA must be able to communicate with other CAs but avoid specifying what protocol should be used.

The CA's primary purpose is to facilitate connections between endpoints. A connection represents the flow of multimedia data between at least two endpoints. DCPs support two types of connections: point-to-point and multipoint. Point-to-point connections represent the streaming between two endpoints. In contrast, multipoint connections occur when three or more endpoints are exchanging data.

Every connection has special attributes that enable it to be identified. The most important of these attributes are the connection and call identifiers. Connection identifiers uniquely identify a connection. In contrast, call identifiers (Call-Ids) label the flow of media between two -- and only two -- endpoints. Call-Ids are particularly important in multipoint connections because they are the only mechanism to uniquely identify media flow between two endpoints in the conference (see Figure 4).

Events Are the Lifeblood of a Connection

Events are activities that are detectable by an endpoint. In fact, everything an endpoint does revolves around detecting events, reporting events to a call agent, and executing commands from the call agent based on events. Common events include taking a phone off the hook and pressing a numeric key on the phone's keypad.

A call agent may request that an endpoint notify it should an event (or list of events) be detected. Since endpoints are unaware of the services being provided by the CA, they have no idea if a key press should generate an event (for example, should 911 be an event or should 1-900-USE-MGCP?). Consequently, DCPs let CAs define when one or more key presses become events via digit maps. Typically, these digit maps use a flexible, pattern-matching syntax so that CAs have explicit control over key-press detection.
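To make the idea of a pattern-matching digit map concrete, here is a minimal Python sketch. It assumes a simplified syntax that is only loosely modeled on real digit maps: alternatives separated by |, x for any digit, bracketed ranges such as [2-9], and a trailing . meaning "repeat the previous element." Timers and partial matches are deliberately left out, and the function names and the sample map are hypothetical.

```python
import re

def digit_map_to_regex(digit_map):
    """Translate a simplified digit map into a compiled regular expression."""
    body = digit_map.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    alternatives = []
    for alt in body.split("|"):
        parts = []
        i = 0
        while i < len(alt):
            ch = alt[i]
            if ch in "xX":
                parts.append(r"\d")           # x stands for any single digit
            elif ch == ".":
                parts.append("*")             # repeat the previous element
            elif ch == "[":
                end = alt.index("]", i)       # copy a range such as [2-9] as-is
                parts.append(alt[i:end + 1])
                i = end
            else:
                parts.append(re.escape(ch))   # literal key, including * and #
            i += 1
        alternatives.append("".join(parts))
    return re.compile("|".join("(?:%s)" % a for a in alternatives))

def is_complete_match(digit_map, dialed):
    """True when the dialed keys exactly satisfy one alternative of the map."""
    return digit_map_to_regex(digit_map).fullmatch(dialed) is not None

# Hypothetical map: emergency, operator, seven-digit local number, star code.
SAMPLE_MAP = "(911|0|[2-9]xxxxxx|*xx)"
```

With this sketch, the endpoint would report an event only once is_complete_match(SAMPLE_MAP, dialed) becomes true, so the call agent, not the phone, decides which key sequences matter.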

Once an event that the CA cares about is detected, the endpoint reports it to the CA. After the CA is notified of the event, it may issue instructions to the endpoint. For example, the CA may request that the endpoint play a message if you press the voicemail button on your telephone.

Smoke Signals

A second duty of an endpoint is signal processing (or the detection and generation of signals). Signals are audio tones that communicate the state of the call at a specific point in time. For instance, when you pick up a handset, a dial-tone signal is generated. Similarly, a fax signal is sent to alert receiving fax machines that a fax is present.

Events are intimately connected to signals. Typically, when an endpoint detects an event, the call agent instructs it to generate a signal to update the call state. For example, if the endpoint notices that the phone has been taken off the hook, it informs the call agent. The CA then instructs it to generate a dial-tone signal or another signal. The beauty of a DCP is that this event-to-signal mapping is performed in the call agent. Thus, when you wish to modify the signals generated by a given event, you need only upgrade the CA. All endpoints attached to the CA automatically inherit the new behavior.
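The event-to-signal mapping can be pictured as a lookup table that lives only in the call agent. The following Python sketch is illustrative: the event and signal names are made up for readability and are not the identifiers any DCP actually defines.

```python
# Hypothetical event and signal names, chosen for illustration only.
EVENT_TO_SIGNAL = {
    "off-hook": "dial-tone",
    "on-hook": None,            # no signal is requested when the user hangs up
    "incoming-call": "ring",
}

def signal_for_event(event, table=EVENT_TO_SIGNAL):
    """Return the signal the call agent would request for an event, or None."""
    return table.get(event)

# "Upgrading the CA" amounts to swapping the table; endpoints are untouched.
UPGRADED = {**EVENT_TO_SIGNAL, "off-hook": "stutter-dial-tone"}
```

Because the table is consulted on the server side, every endpoint served by that call agent would pick up the new off-hook behavior the moment the table changes.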

How Does MGCP Measure Up?

Because MGCP is the most widely deployed of the emerging DCPs, I'll examine its feature set to see if DCPs really simplify device programming. After reading the specification, you notice how few commands there are to implement. There are only three categories of MGCP commands -- connection, event, and configuration.

MGCP offers three commands to manipulate connections: CreateConnection, ModifyConnection, and DeleteConnection. CreateConnection is sent from the CA to the endpoint to initialize a connection. This initialization data is described by a limited form of the IETF's Session Description Protocol (SDP). Although an endpoint must be able to parse any legitimate SDP message, MGCP's flavor of SDP is strictly limited to describing audio attributes and data transfer settings.

Information that may appear in MGCP-compatible SDP includes communication options, media description, and encoding attributes (see Figure 5). Communication options are described by the c= parameter and indicate if the destination endpoint is using IP and the destination endpoint's IP address.

SDP's media description m= parameter enumerates the type of media that will be sent over the connection. This includes the format of audio data that will be transmitted via RTP. In addition, the SDP attribute parameter a= enables finer granularity of control over audio settings. For instance, you can use this feature to control the size of the audio packets in order to minimize audio latency (or vulnerability to audio break-ups).
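As a rough illustration of how little SDP an endpoint has to interpret, the Python sketch below pulls the c=, m=, and a= lines out of a short session description. The sample description, the address, the port, and the function name are assumptions made for this example; a production parser would need to handle many more line types and error cases.

```python
def parse_sdp(text):
    """Collect the c=, m=, and a= lines of a small SDP description."""
    session = {"connection": None, "media": [], "attributes": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue                              # skip blank or malformed lines
        kind, value = line[0], line[2:]
        if kind == "c":                           # network type, address type, address
            net_type, addr_type, address = value.split()
            session["connection"] = {
                "net_type": net_type, "addr_type": addr_type, "address": address,
            }
        elif kind == "m":                         # media, port, transport, formats
            media, port, transport, *formats = value.split()
            session["media"].append({
                "media": media, "port": int(port),
                "transport": transport, "formats": formats,
            })
        elif kind == "a":                         # attribute, optionally name:value
            name, _, attr_value = value.partition(":")
            session["attributes"][name] = attr_value
    return session

# Hypothetical description: IPv4 audio over RTP with 20-ms packets.
SAMPLE_SDP = """v=0
c=IN IP4 192.0.2.10
m=audio 3456 RTP/AVP 0
a=ptime:20
"""
```

Here the ptime attribute stands in for the packet-size control mentioned above: a smaller value means smaller audio packets and lower latency at the cost of more packets per second.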

ModifyConnection is used to change parameters for the connection. For example, the call agent may use this command to change the audio compression algorithm used between endpoints or the UDP port where audio is being sent.

When the CA or an endpoint wishes to remove a connection, it sends a DeleteConnection command. Endpoints do not have the intelligence to create connections, but they can destroy them. To illustrate, if an endpoint were going to undergo a software upgrade, it would issue a DeleteConnection command on all outstanding connections before the upgrade process could start (see Examples 1 and 2).
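Since MGCP is text based, a connection command is little more than a header line followed by parameter lines. The Python sketch below shows one way to render such commands. The verb codes (CRCX for CreateConnection, DLCX for DeleteConnection) and the single-letter parameter names follow the published MGCP specification as I understand it; the endpoint name, transaction numbers, identifiers, and version string are invented for illustration, and format_command is a hypothetical helper.

```python
def format_command(verb, transaction_id, endpoint, params, version="MGCP 1.0"):
    """Render one MGCP-style command: a header line plus 'Name: value' lines."""
    lines = ["%s %s %s %s" % (verb, transaction_id, endpoint, version)]
    lines += ["%s: %s" % (name, value) for name, value in params]
    return "\n".join(lines) + "\n"

# The call agent asks an endpoint to create a receive-only connection.
CREATE = format_command(
    "CRCX", 1204, "aaln/1@rgw.example.net",
    [("C", "A3C47F21456789F0"),      # call identifier (made up)
     ("L", "p:20, a:PCMU"),          # local options: packet time and codec
     ("M", "recvonly")],             # connection mode
)

# Either side may later tear the connection down.
DELETE = format_command(
    "DLCX", 1210, "aaln/1@rgw.example.net",
    [("C", "A3C47F21456789F0"),
     ("I", "FDE234C8")],             # connection identifier (made up)
)
```

Printing CREATE or DELETE shows why debugging a text-based protocol is comparatively painless: the whole command is readable in a packet trace.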

MGCP supports two event-oriented commands: NotificationRequest and Notify. NotificationRequest enumerates the events that the CA wishes the endpoint to monitor. When one of these events is detected, the endpoint issues the Notify command to inform the call agent (or NotifiedEntity) of the event's status.
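On the receiving side, a call agent has to pick the observed events out of a Notify command. The sketch below assumes the same header-plus-parameters layout, with NTFY as the Notify verb, X: carrying the request identifier, and O: listing the observed events, as in the published MGCP specification. The sample message, its identifiers, and the helper names are illustrative assumptions.

```python
def parse_command(message):
    """Split an MGCP-style command into its header fields and parameters."""
    lines = [ln for ln in message.splitlines() if ln.strip()]
    verb, transaction_id, endpoint, *_version = lines[0].split()
    params = {}
    for ln in lines[1:]:
        name, _, value = ln.partition(":")        # split at the first colon only
        params[name.strip()] = value.strip()
    return {"verb": verb, "transaction_id": transaction_id,
            "endpoint": endpoint, "params": params}

def observed_events(message):
    """Return the events listed in a Notify command, or [] for other verbs."""
    command = parse_command(message)
    if command["verb"] != "NTFY":
        return []
    observed = command["params"].get("O", "")
    return [event.strip() for event in observed.split(",") if event.strip()]

# Hypothetical Notify: the endpoint reports an off-hook transition (hd).
SAMPLE_NOTIFY = """NTFY 2002 aaln/1@rgw.example.net MGCP 1.0
X: 0123456789AC
O: hd
"""
```

A call agent built along these lines would feed the result of observed_events into whatever service logic decides the next command to send.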

Package It Up!

MGCP offers several features that ease the manipulation of events and signals. For instance, an endpoint may theoretically define thousands of events, making it virtually impossible to monitor each one individually. Therefore, MGCP groups a related set of events and signals into a package. The MGCP standard document defines a basic set of packages such as Media, Handset, RTP, and Scripting (see Table 2).

One of the most intriguing packages is the script package, which lets the CA download and run Java bytecodes, Tcl, or Perl scripts on the endpoint. While this ensures that endpoints can run the latest programs, it potentially lets hackers execute rogue applications on your phone. Fortunately, an endpoint will only accept such scripts from authorized CAs. Thus, to crash your phone, a hacker would have to perform the Herculean task of gaining access to a private network, impersonating a call agent, and breaking the CA-to-endpoint security protocol.

Besides grouping, packages reduce development time, increase performance, and facilitate the addition of new features. MGCP packages, like Java packages, have namespaces. Consequently, you can request that an endpoint detect all events in a package with one simple statement. Without such a feature, thousands of packets might have to be exchanged between the endpoint and CA to cover each event.

Packages also facilitate the smooth incorporation of new features. The Internet Assigned Numbers Authority (IANA) supervises the adoption of new packages by ensuring that the event names within a package are logical and that plain text explanations are available.

Another event manipulation tool MGCP offers is pattern matching via the * and $ wildcards. The * wildcard matches characters in the package name field, whereas the $ wildcard matches characters in the event or signal field (see Example 3). For instance, to detect all events in the media package originating from a government domain, you would use the syntax media*/$@$.gov.
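One way to picture this pattern matching is as a translation to regular expressions. The Python sketch below is a deliberate simplification: it treats both * and $ as "match any run of characters" and does not enforce which field each wildcard may appear in, so it should be read as an illustration of the idea rather than the specification's exact wildcard rules. The event names and the function names are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a pattern containing * or $ wildcards into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch in "*$":
            parts.append(".*")            # simplification: both match anything
        else:
            parts.append(re.escape(ch))   # everything else is a literal
    return re.compile("".join(parts))

def matches(pattern, event_name):
    """True when the whole event name fits the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(event_name) is not None

GOV_MEDIA = "media*/$@$.gov"
```

Under these assumptions, matches(GOV_MEDIA, "media/ft@phone1.agency.gov") succeeds, while an event from a different package or from a non-.gov endpoint does not.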

Behind the Scenes

Although the event detection and signal creation features have all the sizzle, they would be useless without MGCP's configuration options. These configuration APIs permit CAs to monitor the health of an endpoint and track the status of a connection.

The AuditEndpoint command lets a CA peer inside an endpoint and retrieve important status information (see Example 4). For instance, AuditEndpoint reports the digit map the endpoint is using, the signals it will generate, the events it is monitoring, and the CAs that will be notified when the event(s) are detected.

The AuditConnection command lets the CA interrogate the status of connections within an endpoint. Some connection attributes it reports include Call-Id, SDP settings, and NotifiedEntity for events. AuditConnection may be used to verify that the endpoint correctly implements a ModifyConnection request or to periodically ensure that the endpoint and CA are synchronized.

Conclusion

DCPs are based on the premise that a simpler protocol results in more stable endpoints. This simplicity is achieved by limiting a device's responsibility to event detection, signaling, and event reporting and placing virtually all intelligence in a call agent. The CA tells the endpoint which events should be detected and when to apply signals.

MGCP is currently the most popular and widely deployed of the DCPs. Its simplicity lets you focus on endpoint-centric features such as connections, events, signaling, and configuration. Furthermore, the flexibility of packages ensures that the protocol can incorporate new features as the industry evolves. Although the early trials appear to validate MGCP's promises, it will have to undergo widespread deployment before its stability claims can conclusively be proven.

For More Information

http://search.ietf.org/internet-drafts/draft-taylor-ipdc-00.txt.

DDJ