IP Security Protocols

Dr. Dobb's Journal December 1999

VPNs and e-commerce demand greater IP security

By Eva Bozoki

Eva is chief scientist and director of research at Fortress Technologies. She can be contacted at eva@fortresstech.com.

Internet communication is based on the TCP/IP protocol. However, IPv4 is not secure enough to protect the integrity and privacy of data being transferred over the Internet. This is of particular concern with the rapid emergence of Internet-based virtual private networks (VPNs) and e-commerce.

The need to address IPv4's lack of security has prompted the release of a number of standards, protocols, and applications. In general, these secure IP protocols must provide:

Authentication. The identity of the communication partners must be mutually validated, and the origin of each IP packet must be ascertained. Dial-up protocols usually provide limited authentication of the peer entity, such as the clear-text Password Authentication Protocol (PAP) or the encrypted Challenge Handshake Authentication Protocol (CHAP). Stronger forms of peer authentication -- such as proving the knowledge of a shared secret without revealing it, using cryptographically secure hash functions such as MD5 and SHA1, or using digital certificates -- may be part of other protocols. (Digital certificates, however, require the preexistence of a Certificate Authority or a system of CAs.)
Privacy. Confidentiality of transmitted information must be assured by encryption. Encryption involves not only the encryption algorithms themselves, but key management, too. Consequently, both must be part of the protocol.
Data integrity. It must be ascertained that messages were not tampered with in transit.
Nonrepudiation. It must be assured that senders of messages cannot deny authorship. This is essential for e-commerce in particular. Digital signatures combined with time-stamps can provide this feature.
Access control. Network resources must be protected by granting access selectively to authorized users only. More often than not, access control is provided by a separate device, such as a firewall.

Beyond the security functions that IP protocols deliver, there are practical considerations regarding speed, scalability, and convenience. VPN devices are packet encryptors that have to process data packets at high speed, typically at T1-T3 speed (1.544-44.736 Mbits/sec.), without any detrimental effect on network performance. Consider, for example, that T3 speed corresponds to an average of 4500 packets per second, which leaves only about 0.2 milliseconds for all processing of each packet. Therefore, the protocol must have minimal overhead and employ fast algorithms, implement all critical security functions without human intervention, and automatically recognize which packets need to be encrypted. These same features will also yield convenience in using the protocol: They assure that a VPN device based on the protocol is transparent, easy to set up, and simple to use.
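The per-packet time budget follows directly from the packet rate. The short calculation below is only a sanity check on the figures quoted above; the variable names are illustrative.

```python
# Per-packet processing budget at T3 line rate, using the figures quoted above.
T3_BITS_PER_SEC = 44.736e6   # T3 line rate
PACKETS_PER_SEC = 4500       # average packet rate quoted for T3

# Time available to process one packet, in milliseconds.
budget_ms = 1000.0 / PACKETS_PER_SEC

# Average packet size implied by the line rate and the packet rate.
avg_packet_bytes = T3_BITS_PER_SEC / PACKETS_PER_SEC / 8

print(f"time budget per packet: {budget_ms:.3f} ms")
print(f"implied average packet size: {avg_packet_bytes:.0f} bytes")
```

The budget comes out to roughly a fifth of a millisecond per packet, with an implied average packet size of a little over 1200 bytes.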

Scalability is another concern. For large enterprises with hundreds or thousands of hardware- or software-based VPN devices, setup must be reasonable. The setup effort cannot grow significantly with the number of devices in the system. This need is satisfied if the handshaking is automatic and does not need human intervention.

General-Purpose IP Protocols

True general-purpose IP protocols are implemented in the network layer. They can be used regardless of how the connection between the communicating nodes is established -- whether the connection is permanent or dial-up. Working in the network layer -- the lowest, purely software layer in the ISO/OSI architecture -- these protocols modify the IP packets. They can be used on any IP network -- on the Internet or on any private IP backbone. The most important of such protocols are IPSEC, SPS, and SKIP. All three essentially transform a valid IPv4 packet into a new valid IPv4 packet, adding security functions in the process.

The Internet Protocol Security (IPSEC) is a public protocol developed under the sponsorship of the Internet Engineering Task Force (IETF). Its basic features have been defined in a series of Requests For Comments (RFCs), but improvements and additions are still being proposed. Interoperability between the products of different manufacturers, should they all use IPSEC, is a cornerstone of IETF's philosophy. IPSEC will be an integral part of IPv6, the next version of the IP protocol. Because implementation of IPv6 is still in the future, the Internet community has decided to implement IPv6 security ahead of time and use it with the existing IPv4.

IPSEC is a true network layer protocol and provides authentication, privacy, and data integrity. Nonrepudiation is not an integral part of the protocol, but it is supported. The three basic elements of IPSEC are:

The Authentication Header (AH), which contains the so-called Authentication Data, a cryptographically secure one-way function (typically MD5 or SHA1) of the input and the secret authentication key. This authenticates the sender and protects the integrity of the transmitted data since it is computationally infeasible to reverse the computation and find the input data. It also contains a Security Parameters Index (SPI) identifying the authentication algorithm and the authentication key.
Privacy provided through the Encapsulating Security Payload (ESP). Besides the encrypted payload, ESP contains the Initialization Vector used in the encryption and a second SPI that identifies the encryption algorithm and encryption key. The SPIs and the Security Association (SA) -- the security data the SPIs refer to -- must be in place before communication between two parties can begin. This means manually setting up the SA and agreeing upon the SPIs. Figure 1 illustrates the basic structure and content of an IPSEC packet.
The Internet Key Exchange (IKE) protocol, formerly known as ISAKMP/Oakley, encompasses both key determination and key distribution. It also provides SA management, authenticates each peer, and negotiates security policy. Key management can also be manual. In such cases, the keys are generated at a central site and distributed to users who manually enter them into the database.
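The Authentication Data idea described above (a one-way function of the packet and a secret key) can be sketched with Python's standard hmac module. This is a minimal sketch, not an IPSEC implementation: the HMAC-SHA1 construction, the key, and the packet bytes are all assumptions chosen for illustration, and the way real AH handles header fields that change in transit is not modeled.

```python
import hashlib
import hmac

def authentication_data(key: bytes, packet: bytes) -> bytes:
    """Keyed one-way hash over the packet bytes (HMAC-SHA1 is an assumption here)."""
    return hmac.new(key, packet, hashlib.sha1).digest()

def verify(key: bytes, packet: bytes, received: bytes) -> bool:
    """Recompute the value on the receiving side and compare in constant time."""
    return hmac.compare_digest(authentication_data(key, packet), received)

# Hypothetical key and packet contents, for illustration only.
auth_key = b"hypothetical-shared-authentication-key"
packet = b"IP header fields + payload"
icv = authentication_data(auth_key, packet)

print(verify(auth_key, packet, icv))                 # genuine packet
print(verify(auth_key, packet + b"tampered", icv))   # modified in transit
print(verify(b"some-other-key", packet, icv))        # sender without the key
```

Only a holder of the shared key can produce a value that verifies, which is why the same field serves both sender authentication and data integrity.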

As Figure 2 shows, IPSEC can be implemented in two ways. In the transport mode, the original data is encrypted, then the ESP header and the Authentication header are placed between the original IP header and the encrypted data. In tunnel mode, the whole original IP packet is encrypted, then the ESP and Authentication headers and a new IP header are prepended. The first mode is used mainly when IPSEC is implemented in a host, the second mode when IPSEC is implemented in a network gateway that has multiple hosts, yielding the additional benefit of hiding the internal host addresses.
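The difference between the two modes is easiest to see as a packet layout. The sketch below models a packet as a plain list of parts; the function names, the sample addresses, and the relative order of the AH and ESP headers are assumptions made for illustration, and no real encryption takes place.

```python
def transport_mode(ip_header, data):
    # Only the original data is encrypted; the original IP header stays in front.
    return [ip_header, "AH", "ESP", ("encrypted", [data])]

def tunnel_mode(ip_header, data, gateway_ip_header):
    # The whole original packet is encrypted and a new IP header is prepended.
    return [gateway_ip_header, "AH", "ESP", ("encrypted", [ip_header, data])]

# Hypothetical packet parts.
original_header = "IP 10.0.0.5 -> 10.0.1.9"
original_data = "TCP segment"

print(transport_mode(original_header, original_data))
print(tunnel_mode(original_header, original_data, "IP gateway-A -> gateway-B"))
```

In the tunnel-mode result the internal host addresses appear only inside the encrypted part, which is the address-hiding benefit mentioned above.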

IPSEC is a comprehensive and thorough endeavor that is being publicly designed, discussed, tested, and improved upon. However, there are often nonstandard needs that can best be solved by proprietary protocols. One such proprietary network layer protocol is Secure Packet Shield (SPS) from Fortress Technologies (the company I work for). This patented protocol provides all three security functions: authentication, privacy, and data integrity. Figure 3 shows the basic structure and content of an SPS packet. The encrypted part of the packet contains the compressed original payload, the original IP header, and a checksum that is calculated from the SPS tail. Another checksum is then calculated from the entire encrypted portion of the packet. This corresponds to the Authentication Data in IPSEC. A new IP header is prepended. The new IP packet is different from the original one; it has a payload size that doesn't match the original, a different protocol identifier that reveals nothing about the original service, and a different fragmentation index. Due to compression, even packets of the same length will yield different final packet lengths. The Message Authentication Code (MAC) address in the MAC header is also replaced with the MAC address of the VPN device, thus hiding the protected clients. The unencrypted checksum is used for data integrity. Unlike IPSEC, the integrity of the encryption can also be verified using the checksum in the SPS tail. In an encrypted area, even a checksum is very strong because it is an indistinguishable pattern in a block of ciphertext. However, cryptographically secure hash functions can be used in place of the checksums. The SPS packet handling is equivalent to the IPSEC tunnel mode.

SPS uses the LZRW algorithm for compression, and the 128-bit key IDEA algorithm for encryption. However, SPS supports other 64-bit block symmetric encryptions such as the 56-bit DES, Triple-DES, or 128-bit Guava.

In SPS, key generation and key negotiation are an integral part of the protocol. SPS uses the Diffie-Hellman key exchange protocol to negotiate the common secret key used by the symmetric encryption algorithm and to automatically build the database (indicating whether to encrypt packets between two communicating partners and the key to be used). This database is the equivalent of IPSEC's SA. There is no need for SPI, given that all partners use the same authentication, compression, and encryption algorithms. This makes it simple to deploy SPS in large networks: SPS scales well. Two sets of private and public keys are used to compute two secret common keys between any two communication partners: One is static, used to encrypt the public key exchanges; and one is dynamic, used to encrypt communication and changed periodically and automatically (say, every 24 hours). The encrypted Diffie-Hellman protocol is immune from "man-in-the-middle" attacks.
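The core of a Diffie-Hellman exchange fits in a few lines. The sketch below uses toy parameters and hypothetical function names; it shows only how two partners arrive at the same secret from exchanged public values, not the SPS-specific step of encrypting the exchange with the static key.

```python
import secrets

# Toy parameters for illustration only: 2**127 - 1 is prime, but it is far
# too small for real use, and the base G = 3 is an arbitrary choice.
P = 2**127 - 1
G = 3

def keypair():
    """Return (private, public) where public = G**private mod P."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def shared_secret(own_private, peer_public):
    """Combine our private value with the partner's public value."""
    return pow(peer_public, own_private, P)

a_private, a_public = keypair()
b_private, b_public = keypair()

# Each side sees only the other's public value, yet both derive the same secret.
print(shared_secret(a_private, b_public) == shared_secret(b_private, a_public))
```

Regenerating the key pairs and repeating the exchange yields a fresh common key, which is the mechanism behind a periodically changed dynamic key.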

SPS takes advantage of the maximum transmission unit of 1500 bytes for Ethernet to increase the throughput of data and reduce packet collisions. This, combined with small overhead (there is no algorithm negotiation, and key negotiation occurs only twice in the valid lifetime of the dynamic key, which is 24 hours by default) and the fact that all processing is done in the network layer, makes SPS a very fast protocol.

Additional features of SPS include smart encryption (based on a Finite State Machine model, the protocol automatically recognizes when encryption is necessary), dual authentication (both the sending VPN device and the origin of each packet are authenticated through the use of the encrypted tail checksum), smart bridge (packets from one client to another on the protected side of a VPN device are forwarded internally, without the packet appearing on the "world side"), and unique company signature (UCS). The public keys are sent in an SPS packet. The payload of these key packets also contains a UCS string. Communication between nodes is restricted to those having the same UCS.

The Simple Key Management for Internet Protocol (SKIP) is the Sun Microsystems open standard network layer protocol, invented by Ashar Aziz and Whitfield Diffie. In addition to the three security functions, it provides authorization through an Access Control List (ACL). Data authentication is provided via the use of keyed MD5. For privacy, SKIP supports the 40-bit RC2 and RC4, 56-bit DES, Triple-DES, and the 128-bit SAFER encryption algorithms.

There are two keys used in SKIP: the randomly generated packet key, Kp, which is used for encryption and authentication of the packets; and the secret master key, Km (common to any two communicating partners), which is used to encrypt the packet key. The master key is regenerated periodically. This is similar to the use of the static and dynamic common keys in the SPS protocol. However, here, the encrypted packet key is transmitted in each packet header, while the master key is computed by each communicating partner from an authenticated Diffie-Hellman public key. Unlike in SPS, the public keys are not exchanged -- in the absence of an authenticated public-key-distribution infrastructure, they are distributed using a directory service, X.509 certificates, or manually.
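The two-level key arrangement can be sketched as follows. This is an illustration of the idea only: the standard library has no block cipher, so a SHA-256-derived XOR keystream stands in for the real encryption algorithm, and the field names, key sizes, nonce, and the way the master key is derived are all assumptions rather than part of SKIP.

```python
import hashlib
import secrets

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHA-256-derived keystream (not a SKIP cipher)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def send(master_key: bytes, payload: bytes) -> dict:
    packet_key = secrets.token_bytes(16)   # fresh random packet key
    nonce = secrets.token_bytes(8)
    return {
        "nonce": nonce,
        # The packet key travels in the header, encrypted under the master key.
        "wrapped_packet_key": keystream_xor(master_key, nonce, packet_key),
        # The payload itself is encrypted under the packet key.
        "body": keystream_xor(packet_key, nonce, payload),
    }

def receive(master_key: bytes, packet: dict) -> bytes:
    packet_key = keystream_xor(master_key, packet["nonce"], packet["wrapped_packet_key"])
    return keystream_xor(packet_key, packet["nonce"], packet["body"])

# Hypothetical master key; in the scheme described above both partners would
# compute it from Diffie-Hellman values instead.
master = hashlib.sha256(b"hypothetical Diffie-Hellman shared secret").digest()
packet = send(master, b"original IP packet")
print(receive(master, packet))
```

Because the wrapped packet key rides in every packet, the receiver needs no per-session state beyond the master key it shares with the sender.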

Figure 4 shows the basic structure and content of a SKIP packet. The protocol first looks up the destination IP address in the ACL and determines authorization, whether encryption and/or authentication is needed, and what algorithms to use. Then the SKIP header is built and prepended. The header contains information on the authentication and encryption algorithms to be used and the encrypted packet key. Next, the original packet is encrypted and a new IP header is prepended before the SKIP header.

Several interoperable SKIP implementations (by Sun, OpenRoute, Check Point, and VPNet) on the same platforms (Solaris, Windows 95/NT) are commercially available.

Table 1 compares the essential features of the IPSEC, SPS, and SKIP IP protocols. They differ in the selection of algorithms, speed, overhead, ease of large-scale integration, and interoperability.

Dial-up Protocols

Dial-up protocols let telecommuters, remote employees, and the like connect to the corporate Local Area Network (LAN) using the existing routing infrastructure. Implemented in the link layer, the dial-up protocols use tunneling -- encapsulating datagrams (packets or frames) inside a protocol that is understood at the entry and exit points of the exchange. The encapsulating protocol is the IP protocol; the encapsulated protocols can be other network layer protocols such as IP, IPX, and NetBEUI.

At present, link-layer tunneling protocols are insufficient to be VPN solutions on their own. They offer no protection against eavesdropping and tampering at the network layer where intruders work and, in general, they do not provide the data encryption, authentication, or integrity functions that are critical to maintaining VPN privacy.

The first of these protocols is the Point-to-Point Protocol (PPP), an IETF standard specified in RFC 1661. PPP (see Figure 5) provides multiplexing of multiprotocol datagrams simultaneously over the same link (full duplex simultaneous bidirectional operation) and encapsulates IP, IPX, and NetBEUI packets into IP datagrams to traverse the Internet.

The remote user initiates communication with the corporate LAN by establishing a PPP connection to the ISP over a telephone line. Both the ISP and the corporate LAN are cabled to the Internet (the cable connection ends at the POP server). The PPP Client encapsulates the native (IP, IPX, or NetBEUI) packet, which then traverses the Internet in a valid IP packet. The PPP server at the corporate site strips the PPP framing and delivers the native packet to its destination. There are four steps to establishing a PPP session:

1. Link Control Protocol (LCP) is used to establish, maintain, and close the actual links.

2. User authentication occurs using the method selected in step 1. The level of security that can be negotiated ranges from clear text password authentication (PAP) to encrypted authentication (CHAP) to callback security.

3. Various Network Control Protocols (NCP) are used to configure the protocols used by the remote client. Dynamic IP addressing takes place in this step. NCPs are selected during step 1.

4. PPP-encapsulated datagrams are transmitted between the two endpoints. PPP encapsulation prepends a PPP and a new IP header. At the exit point of the tunnel, a PPP server strips the prepended headers and forwards the original protocol to the addressee. The PPP header contains an encapsulated protocol identifier and a frame checksum that is used to verify data integrity upon receiving the PPP frame.
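The encapsulation in step 4 can be sketched as a pair of functions that wrap and unwrap a native packet. This is a simplified illustration, not real PPP framing: the frame layout is an assumption of the sketch, zlib's CRC-32 stands in for the PPP frame checksum, and the new IP header, flag bytes, and byte stuffing are omitted. The two protocol numbers are the standard PPP values for IP and IPX.

```python
import struct
import zlib

PROTO_IP = 0x0021    # PPP protocol number for IP
PROTO_IPX = 0x002B   # PPP protocol number for IPX

def encapsulate(protocol: int, payload: bytes) -> bytes:
    """Prefix the protocol identifier and append a checksum over both."""
    body = struct.pack("!H", protocol) + payload
    return body + struct.pack("!I", zlib.crc32(body))

def decapsulate(frame: bytes):
    """Verify the checksum, then return (protocol, original payload)."""
    body = frame[:-4]
    (checksum,) = struct.unpack("!I", frame[-4:])
    if zlib.crc32(body) != checksum:
        raise ValueError("frame checksum mismatch")
    (protocol,) = struct.unpack("!H", body[:2])
    return protocol, body[2:]

frame = encapsulate(PROTO_IPX, b"native IPX packet")
print(decapsulate(frame))
```

The protocol identifier is what lets one link carry several network protocols at once: the receiving end reads it to decide which protocol stack gets the unwrapped payload.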

No security, other than limited user authentication, was designed into PPP. However, since any valid IP protocol (such as IPSEC, SPS, or SKIP) can be tunneled by it, security can be incorporated by running any of the secure IP protocols described earlier over PPP. In such cases, the native packets are secure IPSEC or SPS packets, which are tunneled and delivered by PPP.

The Point-to-Point Tunneling Protocol (PPTP), an extension to PPP initiated by Microsoft and developed by the PPTP Forum (a group of commercial companies), allows PPP frames to be channeled through an IP network. It encapsulates PPP packets into IP datagrams, which are then sent over the Internet or other TCP/IP-based networks.

The remote client uses PPP to establish the initial connection and communication between the client and the ISP. Then, PPTP takes over and a second dial-up call is made over the existing PPP connection. The PPTP protocol (Figure 6) specifies a series of control messages sent between the PPTP-enabled client and the PPTP server. The control messages establish, maintain, and end the PPTP tunnel. The PPTP header for the control messages contains a message-type identifier (call-control or call-maintenance message) and a second control ID that specifies the message inside. PPTP relies on PPP-provided peer authentication. After the PPTP tunnel is established, PPTP performs data tunneling: Data is sent over the second connection in the form of IP datagrams that contain encapsulated PPP frames. The IP datagrams are created using a modified version of the Generic Routing Encapsulation (GRE) protocol. The encapsulated PPP frames can be compressed and/or encrypted. For encryption keys, PPTP uses shared secrets (for example, user passwords). Security was not originally designed into the PPTP protocol, but was implemented as an afterthought.

Layer 2 Forwarding (L2F) is a transmission protocol that permits link-layer tunneling of higher-layer protocols; see Figure 7. Developed by Cisco, it is described in RFC 2341. L2F, like PPTP, encapsulates the PPP protocol and uses the IP protocol as carrier to traverse a TCP/IP-based network. The L2F header contains a packet key that is part of the authentication process and (optionally) a checksum that is used for testing data integrity in transit.

Remote users initiate the PPP connection. After the ISP authenticates the user, it initiates the L2F tunnel to the L2F server at the corporate LAN. That is, L2F functions only in compulsory ISP-created tunnels and, unlike PPTP and L2TP, it has no defined client. Once the L2F server authenticates the remote user, a tunnel and a virtual dial-up connection are established between the remote user and the L2F server. Remote users can access the corporate LAN as if they dialed into it directly, even though the physical dial-up is via the ISP. At this point, link-layer frames can pass over the L2F tunnel in both directions. In addition to PPP-supported peer authentications, L2F can also support RADIUS authentication.

Layer 2 Tunneling Protocol (L2TP) intends to combine the best features of L2F and PPTP. PPTP and L2F are vendor-specific proprietary protocols, so interoperability is limited to products from supporting vendors. L2TP is a multivendor effort, so interoperability is less problematic.

L2TP (Figure 8) can encapsulate PPP frames into IP and into X.25, frame relay, and ATM protocols as well. Thus, the carrier protocol is no longer restricted to IP.

After a PPP connection is established and the L2TP tunnel is set up, its end points authenticate each other and an L2TP session is created. Optionally, the corporate side can also authenticate the remote user. Tunnel and session management is performed via control messages containing Attribute-Value Pairs (AVPs). Control messages use a reliable control channel within L2TP to guarantee delivery. A tunnel consists of a control connection carrying the control messages and any number of L2TP sessions carrying encapsulated PPP datagrams.

L2TP over IP networks uses UDP and a series of L2TP messages for tunnel maintenance. L2TP also uses UDP to send L2TP-encapsulated PPP frames as the tunneled data. The payloads of encapsulated PPP frames can be encrypted and/or compressed.

The L2TP protocol specification states that IPSEC protocols must be used over an IP network for L2TP to operate securely. However, L2TP may be run on a link layer that does not have a security mechanism (such as IPSEC). In such cases, it becomes necessary for L2TP to provide its own mechanism for packet-level security.

Table 2 summarizes the basic differences between PPTP and L2TP, which are similar protocols competing for general acceptance. L2TP can be used over any frame-oriented PPP. L2TP supports multiple tunnels, thereby providing a tool for creating different tunnels for different Quality of Service (QoS) connections. L2TP can compress the header, thereby reducing the per-packet overhead. L2TP can also provide tunnel authentication.

Application-Oriented Protocols

Providing application-to-application security, application-oriented protocols are usually implemented between the application and transport layers. They operate between two applications, and each application has to be modified to accommodate the protocol. They do not provide a general VPN solution, nor do they secure a whole network.

The Secure Socket Layer (SSL) protocol was originally developed by Netscape to provide application-to-application security over any -- not just TCP/IP -- connection-oriented network. It provides authentication, privacy, and data integrity. SSL, a session-layer protocol, consists of two separate protocols layered on top of each other:

The handshake protocol is responsible for setting up the connection between two parties, for server-client authentication, and for negotiating the cipher suite and crypto keys.
The record protocol provides optional compression, message integrity, and encryption.

SSL uses RSA, DSS, or X.509 certificates for authentication; RC4, DES, Triple-DES, or IDEA for encryption; and SHA or MD5 for message integrity. SSL requires extensive CPU resources. The overhead can reduce throughput by as much as 10 percent (it also reduces the number of possible simultaneous connections).
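The cipher suites that a handshake can choose from are visible in Python's standard ssl module, which implements TLS, the standardized successor to SSL. The snippet below is only a way to inspect what a client would offer; the suites listed depend on the local Python and OpenSSL build and will not match the algorithm list above.

```python
import ssl

# Client-side context with the library's default settings.
context = ssl.create_default_context()

# Each entry is a dict describing one cipher suite the client is willing to use.
suites = context.get_ciphers()

for suite in suites[:5]:
    print(suite["protocol"], suite["name"])
```

Each suite name bundles the choices the handshake has to settle: how keys are agreed, which cipher encrypts the records, and which hash protects message integrity.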

SSLeay is a free implementation of Netscape's SSL. An initiative of Eric Young and Timothy Hudson at RSA Australia, SSLeay is an extension of SSL with a math library for RSA, DSA, DH, and X.509 certificates and simple CA.

Transport Layer Security (TLS) is an IETF effort to standardize SSL. TLS 1.0 was specified in RFC 2246 (based on SSL 3.0). The differences between TLS 1.0 and SSL 3.0 are not dramatic, but are significant enough that the protocols do not interoperate.

SSL is being applied in a number of applications, such as the terminal emulation program for TCP/IP networks (Telnet), the Internet File Transfer Protocol (FTP), and the Network News Transfer Protocol (NNTP). However, the most common application is in the HyperText Transfer Protocol (HTTP), developed by the HTTP Working Group in IETF. Both HTTP browsers and back-end applications must be SSL enabled, requiring that application-specific front-ends be written.

The Secure Shell (SSH) protocol, originally developed by Tatu Ylonen (the founder of SSH Communications Security), is a transport-layer protocol for remote administration over the Internet. It lets users securely log into another computer over a network, execute commands in a remote machine, or move files from one machine to another. SSH is being standardized by the SECSH Working Group of IETF. It is being specified in a number of Internet drafts.

In SSH, the payload is optionally compressed, then encrypted for privacy. A MAC header is used for data integrity. Key distribution is not part of the SSH protocol. However, keys are exchanged immediately after the SSH client sets up the transport-layer connection. During key exchange, the algorithms are agreed upon. SSH uses zlib for compression, Triple-DES, IDEA, or Blowfish for encryption, and MD5 or SHA for message authentication.
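The compress-then-protect sequence can be sketched with the standard zlib and hmac modules. This is a rough illustration rather than the SSH packet format: the function names and key are hypothetical, HMAC-SHA1 is an assumed choice of message authentication code, and the encryption step is only marked with a comment because the standard library provides no cipher.

```python
import hashlib
import hmac
import zlib

def protect(mac_key: bytes, payload: bytes):
    """Compress the payload and compute a MAC over the compressed bytes."""
    compressed = zlib.compress(payload)
    tag = hmac.new(mac_key, compressed, hashlib.sha1).digest()
    # A real implementation would encrypt `compressed` at this point.
    return compressed, tag

def unprotect(mac_key: bytes, compressed: bytes, tag: bytes) -> bytes:
    """Check the MAC first, then decompress."""
    expected = hmac.new(mac_key, compressed, hashlib.sha1).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("message authentication failed")
    return zlib.decompress(compressed)

mac_key = b"hypothetical-integrity-key"
compressed, tag = protect(mac_key, b"ls -l /var/log " * 20)
print(len(compressed), unprotect(mac_key, compressed, tag)[:15])
```

Compression has to come before encryption, as in the order described above, because well-encrypted data has no redundancy left to compress.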

Conclusion

In general, general-purpose protocols are better suited for LAN-to-LAN applications, while dial-up protocols have been designed for single remote-user-to-LAN use. Application-oriented protocols are used between two applications, mostly between web clients and servers.

General-purpose protocols operate in the network layer, thus securing all information (including higher layer headers) from eavesdropping and tampering in the lowest, still software, layer. Dial-up protocols are implemented in the link layer, thus information is visible to a network layer observer. Application-oriented protocols encrypt the original message, but leave all header information (prepended in the lower layers) in the clear.

Both general-purpose and dial-up protocols provide a general VPN solution to secure an entire network over insecure channels, while application-oriented protocols do not.

None of the protocols are interoperable, although IETF is taking steps to standardize them. IPSEC, L2TP, and TLS are the emerging IETF standards, but clearly not the only important protocols.

Finally, other important factors in VPN implementation include network performance and QoS considerations such as latency, packet loss, and delivery guarantees. VPNs may link disparate bandwidth domains (LAN, WAN, remote users), and QoS requirements on a mix of domains lead to issues of traffic prioritization, bandwidth management, and network-resource allocation/reservation.

DDJ