Where Do IPsec and IPv6 Security Fit Into the Picture?

Introduction

This security portal has introduced you to many security concepts, algorithms, and protocols. Where does IPsec (IP Security) fit in? Isn't IPsec supposed to provide an all-encompassing security solution? On the home page of this security portal, I named several features of secure network protocols: privacy, authentication, data integrity, and non-repudiation. IPsec uses slightly different terminology for what is, in the end, basically the same functionality: confidentiality, data integrity, access control (including minimal firewall functionality), and data source authentication. IPsec specifically provides this functionality to IP datagrams. Obviously, IPsec can make use of the various cryptographic algorithms, key distribution methods, etc., that you've learned about. But what about the secure applications? Since all of our Internet applications run over IP (using a transport layer such as TCP or UDP), if IPsec is implemented, does IPsec render the secure applications superfluous?

The questions just raised are also relevant to IPv6, which incorporates the IPsec architecture. In fact, because IPsec was incorporated into IPv6, everything in this chapter is relevant to IPv6. Though the header formats that will be shown are the more familiar IPv4 formats, the principles are the same for IPv6 (and I'll include one IPv6 format to demonstrate this). To answer the questions posed, we will first take a look at how IPsec provides its security services.

IPsec Architecture and Protocols

Most of the security services that IPsec provides are delivered by two traffic security protocols, the Authentication Header (AH) and the Encapsulating Security Payload (ESP), together with cryptographic key management and key distribution protocols. Which protocols are used in a particular situation, which features of each protocol are used, and with which cryptographic algorithms, are determined by the particular requirements of the users and/or network administrators.

However, to facilitate interoperability in the global Internet, RFC 4305 specifies a set of default cryptographic algorithms for use with AH and ESP, and RFC 4307 specifies a set of cryptographic algorithms that must be implemented for IKEv2 (Internet Key Exchange version 2). IKE is the component of IPsec that is used to perform mutual authentication and to establish and maintain security associations (SAs). An SA is a shared state between the source and sink of an IP datagram. There is a state for each direction (the source for one direction is the sink for the other). Each SA defines, among other things, the specific security services provided to the datagram, which cryptographic algorithms will be employed to provide the services, and the keys that will be used as input to the cryptographic algorithms. Though keys could be established manually between two end systems, this becomes far too burdensome, if not totally impractical, with more than a few end systems. IKE is a protocol used by the end systems to establish SAs dynamically, as needed. We'll return to IKE later in this chapter.

IPsec might be implemented in a host (an end system), such as in Host A in Figure 43, or in an intermediary system, such as a firewall, router, or other device, which serves as a security gateway, as depicted in the right-hand side of Figure 43.

Figure 43: Server Sends its Certificate

IPsec services can be provided to hosts using two types of SAs: transport mode and tunnel mode. With transport mode, the SA provides end-to-end security services, typically between two hosts, such as between Host A and Host C in Figure 43. But a transport mode SA can also operate between a host and a security gateway, such as between Host A and the right-hand Gateway in Figure 43. With a transport mode SA, protection is provided to IP's payload, i.e., the next layer protocols, such as TCP and the application layer.

With a tunnel mode SA, the entire IP datagram is secured by encapsulating the secured IP datagram inside another IP header. A tunnel mode SA would be used between two gateways, such as between the two gateways in Figure 43, to form a secure tunnel across an insecure public network. This is commonly done in a traditional VPN. However, we might also have a tunnel between a single host and a gateway or between two hosts. This animation shows the difference between the transport and tunnel modes in an IPsec-based VPN. Take a look at the instructions, and then follow how the encapsulation changes as the datagram traverses the network, for each mode. By selecting different pairs of hosts, you can also see the difference between a host that does not have IPsec and one that does. If the host does not have IPsec, the security features can only begin or end at its IPsec gateway, in which case the tunnel mode would be used.

I must warn you, however: the VPN animation is overly flexible, in that it allows configurations that would not be permitted by the IPsec RFC, which states that:

	To clarify, the use of transport mode by an intermediate system
	(e.g., a security gateway) is permitted only when applied to packets
	whose source address (for outbound packets) or destination address
	(for inbound packets) is an address belonging to the intermediate 
	system itself.

An example of this might be a management station managing a router via a transport mode SA.

The RFC also states:

   Two hosts MAY establish a tunnel mode SA between themselves.
   Aside from the two exceptions below, whenever either end of a
   security association is a security gateway, the SA MUST be tunnel
   mode.  Thus, an SA between two security gateways is typically a
   tunnel mode SA, as is an SA between a host and a security gateway.
   The two exceptions are as follows.

    o Where traffic is destined for a security gateway, e.g., Simple
      Network Management Protocol (SNMP) commands, the security gateway
      is acting as a host and transport mode is allowed.  In this case,
      the SA terminates at a host (management) function within a
      security gateway and thus merits different treatment.

    o As noted above, security gateways MAY support a transport mode SA
      to provide security for IP traffic between two intermediate
      systems along a path, e.g., between a host and a security gateway
      or between two security gateways.

Therefore, you can create scenarios in the VPN animation that are not according to spec. This does not mean that such implementations don't exist. It certainly is feasible to have a gateway that intercepts the datagrams transparently and secures the payload without changing the address information, so that only the intended recipient can decipher the payload upon receipt. But this is not how IPsec defines the architecture.

A host implementation of IPsec must support both tunnel and transport mode, while a security gateway must support tunnel mode and may support transport mode.

Authentication Header

I started this section by mentioning two traffic security protocols in IPsec: AH and ESP. Each of these protocols inserts an extension to the original IP header. Typically one or the other is used, although it is possible, if uncommon, to use both. We'll first take a look at the AH header and what it does, but before we do that, we must review how it is possible to "extend" the standard IP header. Figure 44 shows the structure of the IP header.

Figure 44: IP Header

The proto field, singled out in red, is where the IP header indicates what kind of data (i.e., the protocol) is in the payload. The value would be 6 for TCP, 17 for UDP, and 1 for ICMP. A value of 50 indicates that what follows is the ESP header extension and a value of 51 indicates that what follows is an AH extension. The complete list of protocol values is maintained by IANA, but only 50 and 51 are of interest to us here. So, it would make sense to rename the proto field the "Next Header" field, because in effect, the field indicates the next header that is encapsulated by IP. (Remember, for example, if the payload is a TCP segment, there would be a TCP header, followed by the application data, etc.) In fact, in IPv6, the header field that serves the same function as proto is called the "Next Header" field.
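To make the role of the proto field concrete, here is a minimal Python sketch that reads it from a raw IPv4 header. The helper names (`ipv4_protocol`, `build_sample_header`) and the sample addresses are my own illustrations, and the sample header leaves its checksum at zero, so treat it as a teaching aid rather than a packet parser.

```python
import struct

# Protocol numbers mentioned in the text (the full list is maintained by IANA).
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP", 50: "ESP", 51: "AH"}


def ipv4_protocol(header: bytes) -> int:
    """Return the protocol field, which sits at byte offset 9 of an IPv4 header."""
    if len(header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    return header[9]


def build_sample_header(protocol: int) -> bytes:
    """Build a minimal 20-byte IPv4 header for illustration (checksum left at zero)."""
    version_ihl = (4 << 4) | 5  # version 4, header length of five 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,              # version + header length
        0,                        # TOS
        20,                       # total length (header only in this sample)
        0,                        # identification
        0,                        # flags + fragment offset
        64,                       # TTL
        protocol,                 # the "proto" field
        0,                        # header checksum (not computed here)
        bytes([192, 0, 2, 1]),    # source address (documentation range)
        bytes([198, 51, 100, 7]), # destination address (documentation range)
    )


header = build_sample_header(51)
print(PROTOCOL_NAMES.get(ipv4_protocol(header), "other"))  # prints "AH"
```

A receiver that sees 50 or 51 here knows that an IPsec header, not a transport header, comes next.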

Now we can look at a datagram with an AH extension. AH is used to authenticate, but not encrypt, the IP data. Therefore, AH will provide authentication, data integrity, and replay protection. AH can be used either for transport mode or tunnel mode. With transport mode, as we saw above, the SA provides end-to-end security services, typically between two hosts, and protection is provided to IP's payload. Using AH for transport mode, the regular IP header in an IP datagram would be followed by the AH, which is then followed by the original payload (e.g., a TCP segment), as we see in Figure 45.

Figure 45: IP Header with AH Using Transport Mode

Let's look at the fields in the added header:

The next header field serves the same purpose as the protocol field in the regular IP header - it indicates which next level protocol is encapsulated in this header. The value should be equal to the protocol value in the original IP datagram, before applying any security extensions. So, if the original IP datagram was carrying a TCP segment, the value of next header would be 6, just as it was in the original IP datagram.

The payload length is certainly named badly, because I would think that this means the length of the payload that follows the AH, but it is defined as "the length of AH in 32-bit words (4-byte units), minus '2'." So this field really should be called "AH length", or better "AH length minus 2." Where does the "minus 2" come from? Now that is weird. In the original AH definition, this field was defined as the length of the Authentication Data field, which is what is now called the Integrity Check Value, while the Sequence Number was not yet defined as a field. Therefore, the payload length was simply the length of the essential information being carried in the AH, that length being variable. Since the first two 32-bit words in the AH are fixed in length, they didn't need to include them in the count.

In the next version of AH, the Sequence Number field was defined, adding one more 32-bit word before we get to the Authentication Data (now called the Integrity Check Value - are you confused?). If you are still with me on this, you might ask: So then, why didn't they maintain the definition of payload length, since the added field is also fixed in size? Or, why didn't they say it is the "AH length minus 3"? The answer to this is explained in more detail than even I feel like continuing with (since I imagine all this seems very silly to you and you are wondering why I'm bothering), but in short, it was to maintain consistency with the definition of extension headers in IPv6. Now, if you are wondering why I bothered at all trying to explain why the payload length is defined as "AH length minus 2", it is because I think there is some value in understanding how we sometimes inherit complexities for historical reasons, even though there is a simpler way to do it.
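The "AH length minus 2" arithmetic is easier to see with numbers. The following Python sketch works it through for the IPv4 case, assuming the fixed part of the AH is 12 bytes (Next Header, Payload Length, and Reserved in the first word, then the SPI, then the Sequence Number). The function names are my own.

```python
FIXED_AH_BYTES = 12  # first word (Next Header, Payload Length, Reserved) + SPI + Sequence Number


def ah_payload_length(icv_bytes: int) -> int:
    """Value of the Payload Length field: AH size in 32-bit words, minus 2."""
    if icv_bytes % 4 != 0:
        raise ValueError("the ICV must be an integral multiple of 32 bits")
    total_words = (FIXED_AH_BYTES + icv_bytes) // 4
    return total_words - 2


def ah_total_bytes(payload_length: int) -> int:
    """Recover the full AH size in bytes from the Payload Length field."""
    return (payload_length + 2) * 4


# A 96-bit (12-byte) ICV gives 3 fixed words + 3 ICV words = 6 words, minus 2.
print(ah_payload_length(12))  # prints 4
print(ah_total_bytes(4))      # prints 24
```

So a receiver that reads a Payload Length of 4 knows that the whole AH occupies 24 bytes, and that the encapsulated protocol begins right after it.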

The reserved field does not serve any useful purpose at this point, but must be set to "zero". Though the RFC states that it is reserved for future use, I think that is a creative way of saying that since the next field is a 32-bit field (or, originally, when the next field was the Authentication Data, a multiple of 32 bits), we'd rather start the next field with a new 32-bit word, and leave the rest of a partially used 32-bit word unused. And, we can give the impression that we plan ahead, by leaving blank space. It might be interesting (not too interesting, though) to look at other cases of "reserved" fields to see how they've been used. (For example, the TCP header has a "reserved" field.)

The Security Parameters Index (SPI) field is a 32-bit field used by the receiver to identify the SA to which an incoming datagram is bound, considering that a receiver might have multiple SAs, or conversations, that are currently active. Since each SA uses a hash algorithm (MD5, SHA-1, etc.), some kind of secret data, and a bunch of other parameters, the SPI can be viewed as an index into a table of these settings, to associate a datagram with its parameters.
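The "index into a table" view of the SPI can be sketched in a few lines of Python. The class and field names below are hypothetical, and the table is keyed by the SPI alone to keep the idea visible. A real implementation stores more state per SA and may take other selectors into account during lookup.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SecurityAssociation:
    """A few of the per-SA settings mentioned in the text (illustrative only)."""
    spi: int
    integrity_algorithm: str
    key: bytes
    last_sequence_number: int = 0


class SATable:
    """A toy SA table indexed by SPI."""

    def __init__(self) -> None:
        self._by_spi: Dict[int, SecurityAssociation] = {}

    def add(self, sa: SecurityAssociation) -> None:
        self._by_spi[sa.spi] = sa

    def lookup(self, spi: int) -> Optional[SecurityAssociation]:
        """Return the SA bound to this SPI, or None if there is no such SA."""
        return self._by_spi.get(spi)


table = SATable()
table.add(SecurityAssociation(spi=0x1001, integrity_algorithm="HMAC-SHA1-96", key=b"key-a"))
table.add(SecurityAssociation(spi=0x1002, integrity_algorithm="HMAC-MD5-96", key=b"key-b"))

sa = table.lookup(0x1002)
print(sa.integrity_algorithm if sa else "no SA: discard the datagram")
```

The receiver reads the SPI from the incoming AH, finds the matching entry, and only then knows which algorithm and key to use for checking the datagram.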

In the course of that rather useless discussion about "minus 2", we did mention the Sequence Number field. This field is a per-SA monotonically increasing datagram sequence number. This number is used for anti-replay protection, but even if the receiver does not elect to use the anti-replay service, the field must be present. Because the field is included in the calculation of the Integrity Check Value, tampering with its value would be detected. Offering an Extended Sequence Number (ESN) option to allow 64-bit sequence numbers is recommended, to support high-speed implementations. This would be negotiated for the SA. With the ESN option, there is no change to Figure 45, because only the low-order 32 bits of the Sequence Number are transmitted in the AH header of each datagram, maintaining the field length and thus keeping the overhead under control. The high-order 32 bits are maintained as part of the Sequence Number counter by both transmitter and receiver and are included in the computation of the Integrity Check Value, but are not transmitted.
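As a rough illustration of the ESN idea, this Python sketch splits a 64-bit per-SA counter into the high-order half, which both ends keep locally, and the low-order half, which is the only part placed in the Sequence Number field. The function names are mine, and the sketch deliberately leaves out how a receiver infers the high-order bits for an arriving datagram.

```python
from typing import Tuple


def split_esn(counter: int) -> Tuple[int, int]:
    """Split a 64-bit sequence counter into (high 32 bits, low 32 bits)."""
    if not 0 <= counter < 2**64:
        raise ValueError("the extended sequence number is a 64-bit value")
    return counter >> 32, counter & 0xFFFFFFFF


def join_esn(high: int, low: int) -> int:
    """Rebuild the 64-bit counter from its two halves."""
    return (high << 32) | low


counter = 0x0000000100000005          # the sender's counter has wrapped 32 bits once
high, low = split_esn(counter)
print(hex(low))    # prints 0x5: the only part carried in the AH
print(hex(high))   # prints 0x1: kept locally, but still fed into the ICV computation
```

Because the high-order half never appears on the wire, the header stays the same size with or without the ESN option.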

The last field, and basically the raison d'etre of the AH, is the Integrity Check Value (ICV), which is a hash that is calculated over the entire datagram, excluding a few header fields, indicated in Figure 45 by the cross-hatch pattern. These omitted fields are those that change in transit (called "mutable"), such as the TTL and the checksum, and the fields that are affected by fragmentation - the flags and the fragment offset. The TOS (Type of Service) field might also be used for purposes, such as Differentiated Services Code Point (DSCP) and Explicit Congestion Notification (ECN), that would make the field "mutable", but we will not get into those here. For the calculation, the ICV itself is set to zero. The recipient recomputes the same hash, and if the result does not match the ICV, the datagram is discarded: either it was damaged (or tampered with) in transit, or the secret key is incorrect. This field is variable length - its length depends upon the algorithm used, but it must be an integral multiple of 32 bits.
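The ICV calculation described above can be sketched in Python as follows. This is a simplified illustration, not a conforming AH implementation: it assumes a 20-byte IPv4 header without options, uses HMAC-SHA1 truncated to 96 bits as the integrity algorithm, and ignores the ESN option. The key, addresses, and function names are all made up for the example.

```python
import hashlib
import hmac
import struct

ICV_LEN = 12  # 96 bits, as with HMAC-SHA1-96


def zero_mutable_fields(ip_header: bytes) -> bytes:
    """Return a copy of a 20-byte IPv4 header with the mutable fields set to zero."""
    h = bytearray(ip_header)
    h[1] = 0                 # TOS (DSCP/ECN)
    h[6:8] = b"\x00\x00"     # flags and fragment offset
    h[8] = 0                 # TTL
    h[10:12] = b"\x00\x00"   # header checksum
    return bytes(h)


def compute_icv(key: bytes, ip_header: bytes, ah_fixed: bytes, payload: bytes) -> bytes:
    """Hash the datagram with the mutable fields and the ICV itself set to zero."""
    data = zero_mutable_fields(ip_header) + ah_fixed + bytes(ICV_LEN) + payload
    return hmac.new(key, data, hashlib.sha1).digest()[:ICV_LEN]


def verify_icv(key: bytes, ip_header: bytes, ah_fixed: bytes,
               payload: bytes, received_icv: bytes) -> bool:
    """Recompute the ICV as the recipient would and compare in constant time."""
    return hmac.compare_digest(compute_icv(key, ip_header, ah_fixed, payload), received_icv)


# Hypothetical sample datagram: IPv4 header (proto = 51), AH fixed part, TCP-like payload.
key = b"shared-secret"
ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 48, 1, 0, 64, 51, 0,
                        bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]))
ah_fixed = struct.pack("!BBHII", 6, 4, 0, 0x1001, 1)  # next hdr, payload len, reserved, SPI, seq
payload = b"data"
icv = compute_icv(key, ip_header, ah_fixed, payload)

routed = bytearray(ip_header)
routed[8] = 63  # a router decremented the TTL, which is a mutable field
print(verify_icv(key, bytes(routed), ah_fixed, payload, icv))  # prints True
print(verify_icv(key, ip_header, ah_fixed, b"dat4", icv))      # prints False
```

The first check passes because the TTL is excluded from the hash, while the second fails because the payload is covered.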

Figure 46, below, is identical to Figure 45 except that AH is encapsulated in an IPv6 datagram. For details on the placement of IPv6 extension headers, I'll refer you to the AH RFC. Some of the extension headers would be placed before the AH, some after, and some in either position.

Figure 46: IPv6 Header with AH Using Transport Mode

With the tunnel mode, the entire IP datagram - including all the fields in the header - is encapsulated and secured. This permits the encapsulating header to have different IP addresses from those in the original header, thus forming a "tunnel". The concept of a tunnel and changing IP addresses is demonstrated nicely in the VPN applet. As with the transport mode, an ICV is generated to authenticate the sender and to detect any modification in transit. The encapsulating header, i.e., the outer header, is secured by the ICV just as it is with the transport mode - the same fields are omitted from calculation of the ICV. But, because the entire IP datagram is encapsulated, the original header values are part of the payload for the ICV calculation.

Figure 47 shows an IP datagram encapsulated with AH using the tunnel mode. Notice that there is no specific field that indicates that this is a tunnel mode - rather than transport mode - datagram. However, because the Next Header field in the AH indicates IP (rather than a transport layer protocol) - 4 for IPv4 and 41 for IPv6, it is evident that we are in tunnel mode. If the Next Header is IP, then it means that an entire IP datagram is encapsulated. I noted above that a host — as opposed to a gateway — is required to support both transport and tunnel modes. However, when creating a host-to-host connection, it definitely seems superfluous to use the tunnel mode.
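Since the mode is implied rather than signaled, a receiver has to infer it. A minimal Python sketch of that inference, using only the two Next Header values named above (the function name is my own):

```python
# Next Header values that mean "an entire IP datagram is encapsulated".
IP_IN_IP = {4: "IPv4", 41: "IPv6"}


def mode_from_next_header(next_header: int) -> str:
    """Infer the SA mode from the Next Header field of the AH."""
    return "tunnel" if next_header in IP_IN_IP else "transport"


print(mode_from_next_header(4))    # prints "tunnel"
print(mode_from_next_header(6))    # prints "transport" (TCP follows the AH)
```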

Figure 47: IP Header with AH Using Tunnel Mode

When a tunnel mode datagram reaches its destination, the ICV is calculated, and, as with the transport mode, if it does not agree with the received ICV, the datagram is discarded. If the ICV is correct, the encapsulating IP and AH headers are stripped off, and the original IP datagram remains intact, to be sent along its merry way according to the usual routing process. It might be destined for a local host, or routed elsewhere, but is no longer secured by IPsec.

One problem that comes up for many users is that AH is not compatible with the use of NAT (Network Address Translation) (you can also try this interactive demo). NAT is commonly used these days so that IP addresses from the private address space can be used on local networks. If you have a home router, then the IP address on your computer is almost certainly from the private space. That means that the NAT (which is a function that your home router performs) would translate your IP address to a public address. However, AH considers the IP address in the original datagram to be immutable. Any modification of the IP addresses will cause the ICV to fail.

One solution to this problem is for the NAT device to be the tunnel endpoint. A better solution to this problem is to migrate to IPv6. Though some people think they need NAT for security purposes, that was really a concept used to convince people that using NAT to save network addresses is really a good thing. By using IPv6, there is no shortage of addresses, and with IPsec, the security concerns can be addressed as they should be. If privacy about internal addresses is desired, then an IPsec tunnel is an appropriate solution. However, perhaps the answer is to use ESP and not AH. We'll discuss that in the next section, which is about ESP.

Encapsulating Security Payload

One of the criticisms of IPsec is its complexity. The complexity is partly due to the spectrum of choices offered to the user, and the way that they are offered. In the previous section, we took a look at AH, which offers authentication, data integrity, and replay protection - which we will simply call "integrity", but not confidentiality (i.e., privacy). We also saw that an SA can be in either tunnel or transport mode. ESP provides all of the security features that AH does, plus encryption. And, with ESP, a user can opt for encryption with no authentication. In fact, a user can opt to use ESP with AH - ESP for encryption and AH for authentication, though that is certainly not a typical setup.

It is reasonable to ask why we need both AH and ESP - why not just allow ESP to have an option without encryption? In fact, it does, and that is what is called ESP-Null. But that only came about when the problem of not being able to traverse a NAT became an issue with AH. The reason that originally there were two different protocols for IPsec has to do with export restrictions of the cryptography technologies. Security expert Merike Kaeo wrote the following in a message posted to the NANOG (North American Network Operators Group) mailing list on May 26, 2009 (I put a link to the page, because the discussion that follows - and the participants are all networking experts - is quite interesting if you are interested in IPsec):

Yep.....integrity was specifically decoupled due to export  
restrictions on cryptography technologies used for encryption - the  
restrictions do not apply for just authentication/integrity  
cryptography.  Hence AH and ESP.   ESP-Null came about when folks  
realized AH could not traverse NATs.

Now that we understand some of the historical background for the multiplicity of IPsec protocols, and that ESP, with the ESP-Null option, can satisfy a range of needs, let's look at the ESP header. We'll start with transport mode, shown in Figure 48.

Figure 48: IP Header with ESP Using Transport Mode

First of all, notice that the protocol field in the IP header is set to 50, which I said, above, is the value to indicate that the next header is ESP. Following the standard IP header, we've got the SPI and the Sequence Number fields, which are the same as they are in AH. As in AH, there is support for the Extended Sequence Number option. As in AH, the data (such as TCP or UDP, including the header, according to the Next Header field) follows. Though I did not show it in the figure, if the algorithm used to encrypt the payload requires cryptographic synchronization data, e.g., an Initialization Vector (IV), then this data is carried explicitly in the data field, but the transmission of an explicit IV is invisible to ESP.

The Padding field is a field that we did not see in AH. This field is necessary because some encryption algorithms require the plaintext to be a multiple of some number of bytes, e.g., the block size of a block cipher, in which case the Padding field is used to fill the plaintext (consisting of the payload data, the Padding, the Pad Length, and the Next Header fields) to the size required by the algorithm. Padding may also be required, irrespective of the requirements of the encryption algorithm, to ensure that the resulting ciphertext terminates on a 4-byte boundary. Specifically, the Pad Length and Next Header fields must be right aligned within a 4-byte word, as we see in Figure 48, so that the ICV field (if integrity is in use for that SA) is aligned on a 4-byte boundary. It could turn out that no padding is needed at all, so the Pad Length could be anywhere between zero and 255 bytes.
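The padding rule can be worked through with a short Python sketch. It computes the smallest Pad Length that satisfies both the cipher's block size and the 4-byte alignment, then builds the trailer. The function names are my own, and filling the padding with the bytes 1, 2, 3, ... follows what I understand to be the default in the ESP RFC when the encryption algorithm does not dictate its own padding contents.

```python
from math import gcd


def esp_pad_length(payload_len: int, block_size: int = 1) -> int:
    """Smallest padding so that payload + padding + Pad Length + Next Header is aligned."""
    align = block_size * 4 // gcd(block_size, 4)  # must satisfy the cipher and the 4-byte rule
    return (-(payload_len + 2)) % align           # the 2 covers Pad Length and Next Header


def build_esp_trailer(payload_len: int, next_header: int, block_size: int = 1) -> bytes:
    """Return Padding + Pad Length + Next Header for a payload of the given size."""
    pad_len = esp_pad_length(payload_len, block_size)
    padding = bytes(range(1, pad_len + 1))        # 1, 2, 3, ... (assumed default contents)
    return padding + bytes([pad_len, next_header])


# A 20-byte TCP header as payload, with a cipher that uses 16-byte blocks:
print(esp_pad_length(20, block_size=16))          # prints 10, since 20 + 10 + 2 = 32
# A 14-byte payload with no block requirement already ends on a 4-byte boundary:
print(esp_pad_length(14))                         # prints 0, since 14 + 0 + 2 = 16
```

In the second case the Padding field is empty, yet the Pad Length byte (holding zero) and the Next Header byte are still present.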

The Next Header field, unlike AH, is in a "trailer" rather than the ESP header, but nevertheless represents the same value - i.e., the number assigned to the next level protocol, such as 6 for TCP.

The ICV field, defined as it is for AH, is optional. It is present only if the integrity service is selected for the SA. Notice, however, that, unlike with AH, ESP's integrity service does not secure any fields in the IP header itself. This is why ESP would be the choice of IPsec protocol, rather than AH, when NAT is employed, even if only integrity, and not confidentiality, is desired. By selecting the NULL encryption algorithm, ESP will provide the integrity service only. As mentioned earlier, an SA could also have confidentiality and not integrity - perhaps integrity is not necessary for this SA (though the RFC explains some concerns about doing this), or perhaps the more extensive (as far as the header is concerned) integrity of AH is preferred for this SA. However, the RFC does not permit ESP without either the integrity or confidentiality service. At least one of these services must be selected - they can't both use a NULL algorithm.
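The difference in coverage is what matters for NAT, and a toy comparison in Python makes it visible. The two functions below are drastic simplifications that I made up for this purpose: the "AH-style" value covers the IP addresses along with the protected data, while the "ESP-style" value covers only the protected data. Real ESP through NAT raises further issues that this sketch does not touch.

```python
import hashlib
import hmac

KEY = b"shared-secret"  # hypothetical key for the example


def ah_style_icv(src: bytes, dst: bytes, protected: bytes) -> bytes:
    """Integrity value that also covers the (immutable) IP addresses."""
    return hmac.new(KEY, src + dst + protected, hashlib.sha1).digest()[:12]


def esp_style_icv(protected: bytes) -> bytes:
    """Integrity value that covers nothing in the IP header."""
    return hmac.new(KEY, protected, hashlib.sha1).digest()[:12]


private_src = bytes([192, 168, 1, 10])   # address on the local network
public_src = bytes([203, 0, 113, 5])     # address after translation by the NAT
dst = bytes([198, 51, 100, 7])
protected = b"spi+seq+payload"

# The sender computes both values before the datagram reaches the NAT.
ah_sent = ah_style_icv(private_src, dst, protected)
esp_sent = esp_style_icv(protected)

# The receiver recomputes them using the translated source address.
print(ah_style_icv(public_src, dst, protected) == ah_sent)  # prints False
print(esp_style_icv(protected) == esp_sent)                 # prints True
```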

Now we'll take a look at a datagram secured with ESP for the tunnel mode, shown in Figure 49.

Figure 49: IP Header with ESP Using Tunnel Mode

As we saw with AH in tunnel mode, it is the (ESP) Next Header field that tips us off that we are in tunnel mode. Since that value would be either 4 (for IPv4) or 41 (for IPv6), we know that an entire IP datagram is encapsulated in the ESP. As mentioned before, ESP could provide either integrity or confidentiality - or both, but not neither. Using ESP with both would be the typical VPN implementation.

Internet Key Exchange

I stated above that IKE is the component of IPsec that is used to perform mutual authentication and to establish and maintain security associations (SAs). The current version is IKEv2, which is the default automated key management protocol defined for use with IPsec. IKE actually provides more functionality than just key distribution. I explained above that to provide the security services that IPsec is defined to provide, a shared state between the source and sink of an IP datagram, an SA, must be maintained. Among the things that this state defines are the specific security services provided to the datagram, which cryptographic algorithms will be used to provide the services, and the keys used as input to the cryptographic algorithms. Since establishing this shared state manually does not scale well, a protocol is needed to define this relationship dynamically, and that is the purpose of IKE.

I'll quote from the RFC:

   IKE performs mutual authentication between two parties and
   establishes an IKE security association (SA) that includes shared
   secret information that can be used to efficiently establish SAs for
   Encapsulating Security Payload (ESP) and/or Authentication Header (AH)
   and a set of cryptographic algorithms to be used by the SAs to
   protect the traffic that they carry.

We know that if Alice and Bob want to communicate using IPsec, there are actually two SAs defined, one for each direction. SAs for either AH or ESP are called CHILD_SAs. IKE itself also sets up an SA, called the IKE_SA. The IKE message flow always has a request followed by a response. The first request/response pair negotiates security parameters for the IKE_SA, sends nonces, and sends Diffie-Hellman values. Nonces are used as inputs to cryptographic functions.
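To give a feel for what the nonces and Diffie-Hellman values in the first exchange accomplish, here is a toy Python sketch. Everything about it is an assumption made for illustration: the modulus is a small Mersenne prime rather than a standardized group, the generator is arbitrary, and HMAC-SHA256 keyed with the two nonces stands in for the negotiated pseudorandom function. It should not be mistaken for the IKE message format or for secure parameters.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1   # toy prime modulus, far too small for real use
G = 3            # toy generator


def dh_keypair():
    """Pick a private exponent in [2, P-2] and derive the public value to send."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def dh_shared(private: int, peer_public: int) -> bytes:
    """Combine our private exponent with the peer's public value."""
    return pow(peer_public, private, P).to_bytes(16, "big")


def seed_from(nonce_i: bytes, nonce_r: bytes, shared: bytes) -> bytes:
    """Mix both nonces and the shared secret into keying material (stand-in PRF)."""
    return hmac.new(nonce_i + nonce_r, shared, hashlib.sha256).digest()


# Request: Alice sends her nonce and public value. Response: Bob sends his.
alice_priv, alice_pub = dh_keypair()
bob_priv, bob_pub = dh_keypair()
nonce_i, nonce_r = secrets.token_bytes(16), secrets.token_bytes(16)

alice_seed = seed_from(nonce_i, nonce_r, dh_shared(alice_priv, bob_pub))
bob_seed = seed_from(nonce_i, nonce_r, dh_shared(bob_priv, alice_pub))
print(alice_seed == bob_seed)  # prints True
```

Both sides arrive at the same secret without ever sending it, and the fresh nonces ensure that each negotiation yields different keying material.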

The second request/response exchange will authenticate the previous messages, transmit identities, prove knowledge of the secrets corresponding to those identities, and establish the first CHILD_SA. Subsequent request/response exchanges might either set up another CHILD_SA or be INFORMATIONAL. An INFORMATIONAL exchange might delete an SA, report an error condition, or do some other "housekeeping".

IKE normally listens and sends on UDP port 500. Since UDP is a datagram (unreliable) protocol, IKE includes in its definition recovery from transmission errors, including packet loss, packet replay, and packet forgery.

So Why Do We Need Secure Applications?

OK - now we get to the big question: Prior to this chapter, we looked at a variety of secure applications and at Transport Layer Security (TLS). If IPsec provides security - and all our network traffic is encapsulated in IP datagrams, why do we need secure applications, such as SSH, or PGP to secure mail? Why do we need TLS? Why not just use IPsec? At first glance, since IPsec can also provide host-to-host security, don't secure applications, or a secure Transport layer, become superfluous?

Since IPsec appears to provide a comprehensive security solution, I'll point out several reasons why secure applications and TLS still have important value. Relying solely on IPsec to provide security makes it difficult to customize the security policies to specific applications. It might be desirable to implement different levels of security and different policies for different applications. One size does not fit all. The current IPsec policy system is not application-aware, so that it is not feasible to configure application-specific IPsec policies. The other side of the coin is that IPsec is transparent to applications - the end-user need not be involved, which can be considered a benefit. Clearly, it all depends on your priorities.

Another limitation of IPsec, which is a network layer security service, is that it cannot authenticate two end-users to each other. It can only authenticate IP addresses, or hosts. IKE is only used for host-oriented authentication, but user-oriented authentication is usually required by secured applications. For example, Alice's digital signature on an email message proves that Alice actually generated the message, and that it was received unaltered. Non-repudiation is part and parcel of this level of authenticity.

A perceived drawback of IPsec, in contrast to secure applications or TLS, is that it requires modification of the operating system. However, this drawback is mitigated by the fact that most modern operating systems implement (or will implement) IPsec, especially if IPv6, which requires IPsec, becomes ubiquitous (a questionable premise, though I tend to believe it will eventually happen).

On the other hand, no matter how secure the upper layer protocols are, the network layer is vulnerable to attack, such as IP address spoofing and fragmentation attacks, which can sabotage end-to-end communications. An application layer (or transport layer) security protocol cannot protect IP header information. Of course, it would have been better to design a system with security in mind from the beginning, but we can't start the development of the Internet over again. Until, if ever, a comprehensive unified security solution is developed, users will find themselves employing a combination of the various security tools that were presented in this portal.

Since VPNs are probably the most commonly found application of IPsec, it is natural that we continue to the next chapter.
