Appendix D. Design Alternatives
This document has picked a certain set of design choices in order to try to work out a bunch of the details and to stimulate discussion. But, as has been discussed on the mailing list, there are other choices that make sense. This appendix tries to enumerate some alternatives.D.1. Context Granularity
Over the years, various suggestions have been made whether the shim should, even if it operates at the IP layer, be aware of ULP connections and sessions and, as a result, be able to make separate shim contexts for separate ULP connections and sessions. A few different options have been discussed: o Each ULP connection maps to its own shim context. o The shim is unaware of the ULP notion of connections and just operates on a host-to-host (IP address) granularity. o Hybrids in which the shim is aware of some ULPs (such as TCP) and handles other ULPs on a host-to-host basis. Having shim state for every ULP connection potentially means higher overhead since the state-setup overhead might become significant; there is utility in being able to amortize this over multiple connections. But being completely unaware of the ULP connections might hamper ULPs that want different communication to use different locator pairs, for instance, for quality or cost reasons. The protocol has a shim that operates with host-level granularity (strictly speaking, with ULID-pair granularity) to be able to amortize the context establishment over multiple ULP connections. This is combined with the ability for Shim6-aware ULPs to request context forking so that different ULP traffic can use different locator pairs.D.2. Demultiplexing of Data Packets in Shim6 Communications
Once a ULID-pair context is established between two hosts, packets may carry locators that differ from the ULIDs presented to the ULPs using the established context. One of the main functions of the Shim6 layer is to perform the mapping between the locators used to forward packets through the network and the ULIDs presented to the ULP. In order to perform that translation for incoming packets, the
Shim6 layer needs to first identify which of the incoming packets need to be translated and then perform the mapping between locators and ULIDs using the associated context. Such operation is called "demultiplexing". It should be noted that, because any address can be used both as a locator and as a ULID, additional information, other than the addresses carried in packets, needs to be taken into account for this operation. For example, if a host has addresses A1 and A2 and starts communicating with a peer with addresses B1 and B2, then some communication (connections) might use the pair <A1, B1> as ULID and others might use, for example, <A2, B2>. Initially there are no failures, so these address pairs are used as locators, i.e., in the IP address fields in the packets on the wire. But when there is a failure, the Shim6 layer on A might decide to send packets that used <A1, B1> as ULIDs using <A2, B2> as the locators. In this case, B needs to be able to rewrite the IP address field for some packets and not others, but the packets all have the same locator pair. In order to accomplish the demultiplexing operation successfully, data packets carry a Context Tag that allows the receiver of the packet to determine the shim context to be used to perform the operation. Two mechanisms for carrying the Context Tag information have been considered in depth during the shim protocol design: those carrying the Context Tag in the Flow Label field of the IPv6 header and those using a new Extension header to carry the Context Tag. In this appendix, we will describe the pros and cons of each mechanism and justify the selected option.D.2.1. Flow Label
A possible approach is to carry the Context Tag in the Flow Label field of the IPv6 header. This means that when a Shim6 context is established, a Flow Label value is associated with this context (and perhaps a separate Flow Label for each direction). The simplest way to do this is to have the triple <Flow Label, Source Locator, Destination Locator> identify the context at the receiver. The problem with this approach is that, because the locator sets are dynamic, it is not possible at any given moment to be sure that two contexts for which the same Context Tag is allocated will have disjoint locator sets during the lifetime of the contexts. Suppose that Node A has addresses IPA1, IPA2, IPA3, and IPA4 and that Host B has addresses IPB1 and IPB2.
Suppose that two different contexts are established between Host A and Host B. Context #1 is using IPA1 and IPB1 as ULIDs. The locator set associated to IPA1 is IPA1 and IPA2, while the locator set associated to IPB1 is just IPB1. Context #2 uses IPA3 and IPB2 as ULIDs. The locator set associated to IPA3 is IPA3 and IPA4, while the locator set associated to IPB2 is just IPB2. Because the locator sets of Context #1 and Context #2 are disjoint, hosts could think that the same Context Tag value can be assigned to both of them. The problem arrives when, later on, IPA3 is added as a valid locator for IPA1 in Context #2 and IPB2 is added as a valid locator for IPB1 in Context #1. In this case, the triple <Flow Label, Source Locator, Destination Locator> would not identify a unique context anymore, and correct demultiplexing is no longer possible. A possible approach to overcome this limitation is to simply not repeat the Flow Label values for any communication established in a host. This basically means that each time a new communication that is using different ULIDs is established, a new Flow Label value is assigned to it. By these means, each communication that is using different ULIDs can be differentiated because each has a different Flow Label value. The problem with such an approach is that it requires the receiver of the communication to allocate the Flow Label value used for incoming packets, in order to assign them uniquely. For this, a shim negotiation of the Flow Label value to use in the communication is needed before exchanging data packets. This poses problems with non- Shim6-capable hosts, since they would not be able to negotiate an acceptable value for the Flow Label. This limitation can be lifted by marking the packets that belong to shim sessions from those that do not. These markings would require at least a bit in the IPv6 header that is not currently available, so more creative options would be required, for instance, using new Next Header values to indicate that the packet belongs to a Shim6-enabled communication and that the Flow Label carries context information as proposed in [8]. However, even if new Next Header values are used in this way, such an approach is incompatible with the deferred-establishment capability of the shim protocol, which is a preferred function since it suppresses delay due to shim context establishment prior to the initiation of communication. Such capability also allows nodes to
define at which stage of the communication they decide, based on their own policies, that a given communication requires protection by the shim. In order to cope with the identified limitations, an alternative approach that does not constrain the Flow Label values that are used by communications using ULIDs equal to the locators (i.e., no shim translation) is to only require that different Flow Label values are assigned to different shim contexts. In such an approach, communications start with unmodified Flow Label usage (could be zero or as suggested in [12]). The packets sent after a failure when a different locator pair is used would use a completely different Flow Label, and this Flow Label could be allocated by the receiver as part of the shim context establishment. Since it is allocated during the context establishment, the receiver of the "failed over" packets can pick a Flow Label of its choosing (that is unique in the sense that no other context is using it as a Context Tag), without any performance impact, respecting that, for each locator pair, the Flow Label value used for a given locator pair doesn't change due to the operation of the multihoming shim. In this approach, the constraint is that Flow Label values being used as context identifiers cannot be used by other communications that use non-disjoint locator sets. This means that once a given Flow Label value has been assigned to a shim context that has a certain locator sets associated, the same value cannot be used for other communications that use an address pair that is contained in the locator sets of the context. This is a constraint in the potential Flow Label allocation strategies. A possible workaround to this constraint is to mark shim packets that require translation, in order to differentiate them from regular IPv6 packets, using the artificial Next Header values described above. In this case, the Flow Label values constrained are only those of the packets that are being translated by the shim. This last approach would be the preferred approach if the Context Tag is to be carried in the Flow Label field. This is the case not only because it imposes the minimum constraints to the Flow Label allocation strategies, limiting the restrictions only to those packets that need to be translated by the shim, but also because context-loss detection mechanisms greatly benefit from the fact that shim data packets are identified as such, allowing the receiving end to identify if a shim context associated to a received packet is supposed to exist, as will be discussed in the context-loss detection appendix below.
D.2.2. Extension Header
Another approach, which is the one selected for this protocol, is to carry the Context Tag in a new Extension header. These Context Tags are allocated by the receiving end during the Shim6 protocol initial negotiation, implying that each context will have two Context Tags, one for each direction. Data packets will be demultiplexed using the Context Tag carried in the Extension header. This seems a clean approach since it does not overload existing fields. However, it introduces additional overhead in the packet due to the additional header. The additional overhead introduced is 8 octets. However, it should be noted that the Context Tag is only required when a locator other than the one used as ULID is contained in the packet. Packets where both the Source and Destination Address fields contain the ULIDs do not require a Context Tag, since no rewriting is necessary at the receiver. This approach would reduce the overhead because the additional header is only required after a failure. On the other hand, this approach would cause changes in the available MTU for some packets, since packets that include the Extension header will have an MTU that is 8 octets shorter. However, path changes through the network can result in a different MTU in any case; thus, having a locator change, which implies a path change, affect the MTU doesn't introduce any new issues.D.3. Context-Loss Detection
In this appendix, we will present different approaches considered to detect context loss and potential context-recovery strategies. The scenario being considered is the following: Node A and Node B are communicating using IPA1 and IPB1. Sometime later, a shim context is established between them, with IPA1 and IPB1 as ULIDs and with IPA1,...,IPAn and IPB1,...,IPBm as locator sets, respectively. It may happen that, later on, one of the hosts (e.g., Host A) loses the shim context. The reason for this can be that Host A has a more aggressive garbage collection policy than Host B or that an error occurred in the shim layer at Host A and resulted in the loss of the context state. The mechanisms considered in this appendix are aimed at dealing with this problem. There are essentially two tasks that need to be performed in order to cope with this problem: first, the context loss must be detected and, second, the context needs to be recovered/ re-established. Mechanisms for detecting context loss.
These mechanisms basically consist in each end of the context that periodically sends a packet containing context-specific information to the other end. Upon reception of such packets, the receiver verifies that the required context exists. In the case that the context does not exist, it sends a packet notifying the sender of the problem. An obvious alternative for this would be to create a specific context keepalive exchange, which consists in periodically sending packets with this purpose. This option was considered and discarded because it seemed an overkill to define a new packet exchange to deal with this issue. Another alternative is to piggyback the context-loss detection function in other existent packet exchanges. In particular, both shim control and data packets can be used for this. Shim control packets can be trivially used for this because they carry context-specific information. This way, when a node receives one such packet, it will verify if the context exists. However, shim control frequency may not be adequate for context-loss detection since control packet exchanges can be very limited for a session in certain scenarios. Data packets, on the other hand, are expected to be exchanged with a higher frequency but do not necessarily carry context-specific information. In particular, packets flowing before a locator change (i.e., a packet carrying the ULIDs in the address fields) do not need context information since they do not need any shim processing. Packets that carry locators that differ from the ULIDs carry context information. However, we need to make a distinction here between the different approaches considered to carry the Context Tag -- in particular, between those approaches where packets are explicitly marked as shim packets and those approaches where packets are not marked as such. For instance, in the case where the Context Tag is carried in the Flow Label and packets are not marked as shim packets (i.e., no new Next Header values are defined for shim), a receiver that has lost the associated context is not able to detect that the packet is associated with a missing context. The result is that the packet will be passed unchanged to the upper-layer protocol, which in turn will probably silently discard it due to a checksum error. The resulting behavior is that the context loss is undetected. This is one additional reason to discard an approach that carries the Context Tag in the Flow Label field and does not explicitly mark the shim packets as such. On the other hand, approaches that mark shim data packets (like those that use the Extension header or the Flow Label
with new Next Header values) allow the receiver to detect if the context associated to the received packet is missing. In this case, data packets also perform the function of a context-loss detection exchange. However, it must be noted that only those packets that carry a locator that differs from the ULID are marked. This basically means that context loss will be detected after an outage has occurred, i.e., alternative locators are being used. Summarizing, the proposed context-loss detection mechanisms use shim control packets and Shim6 Payload Extension headers to detect context loss. Shim control packets detect context loss during the whole lifetime of the context, but the expected frequency in some cases is very low. On the other hand, Shim6 Payload Extension headers have a higher expected frequency in general, but they only detect context loss after an outage. This behavior implies that it will be common that context loss is detected after a failure, i.e., once it is actually needed. Because of that, a mechanism for recovering from context loss is required if this approach is used. Overall, the mechanism for detecting lost context would work as follows: the end that still has the context available sends a message referring to the context. Upon the reception of such message, the end that has lost the context identifies the situation and notifies the other end of the context-loss event by sending a packet containing the lost context information extracted from the received packet. One option is to simply send an error message containing the received packets (or at least as much of the received packet that the MTU allows to fit). One of the goals of this notification is to allow the other end that still retains context state to re-establish the lost context. The mechanism to re-establish the lost context consists in performing the 4-way initial handshake. This is a time- consuming exchange and, at this point, time may be critical since we are re-establishing a context that is currently needed (because context-loss detection may occur after a failure). So another option, which is the one used in this protocol, is to replace the error message with a modified R1 message so that the time required to perform the context-establishment exchange can be reduced. Upon the reception of this modified R1 message, the end that still has the context state can finish the context-establishment exchange and restore the lost context.D.4. Securing Locator Sets
The adoption of a protocol like SHIM, which allows the binding of a given ULID with a set of locators, opens the door for different types of redirection attacks as described in [15]. The goal, in terms of
security, for the design of the shim protocol is to not introduce any new vulnerability into the Internet architecture. It is a non-goal to provide additional protection other than what is currently available in the single-homed IPv6 Internet. Multiple security mechanisms were considered to protect the shim protocol. In this appendix we will present some of them. The simplest option to protect the shim protocol is to use cookies, i.e., a randomly generated bit string that is negotiated during the context-establishment phase and then is included in subsequent signaling messages. By these means, it would be possible to verify that the party that was involved in the initial handshake is the same party that is introducing new locators. Moreover, before using a new locator, an exchange is performed through the new locator, verifying that the party located at the new locator knows the cookie, i.e., that it is the same party that performed the initial handshake. While this security mechanism does indeed provide a fair amount of protection, it leaves the door open for so-called time-shifted attacks. In these attacks, an attacker on the path discovers the cookie by sniffing any signaling message. After that, the attacker can leave the path and still perform a redirection attack since, as he is in possession of the cookie, he can introduce a new locator into the locator set and can also successfully perform the reachability exchange if he is able to receive packets at the new locator. The difference with the current single-homed IPv6 situation is that in the current situation the attacker needs to be on-path during the whole lifetime of the attack, while in this new situation (where only cookie protection is provided), an attacker that was once on the path can perform attacks after he has left the on-path location. Moreover, because the cookie is included in signaling messages, the attacker can discover the cookie by sniffing any of them, making the protocol vulnerable during the whole lifetime of the shim context. A possible approach to increase security is to use a shared secret, i.e., a bit string that is negotiated during the initial handshake but that is used as a key to protect following messages. With this technique, the attacker must be present on the path and sniffing packets during the initial handshake, since this is the only moment when the shared secret is exchanged. Though it imposes that the attacker must be on path at a very specific moment (the establishment phase), and though it improves security, this approach is still vulnerable to time-shifted attacks. It should be noted that, depending on protocol details, an attacker may be able to force the re-creation of the initial handshake (for instance, by blocking
messages and making the parties think that the context has been lost); thus, the resulting situation may not differ that much from the cookie-based approach. Another option that was discussed during the design of this protocol was the possibility of using IPsec for protecting the shim protocol. Now, the problem under consideration in this scenario is how to securely bind an address that is being used as ULID with a locator set that can be used to exchange packets. The mechanism provided by IPsec to securely bind the address that is used with cryptographic keys is the usage of digital certificates. This implies that an IPsec-based solution would require a common and mutually trusted third party to generate digital certificates that bind the key and the ULID. Considering that the scope of application of the shim protocol is global, this would imply a global public key infrastructure (PKI). The major issues with this approach are the deployment difficulties associated with a global PKI. The other possibility would be to use some form of opportunistic IPSec, like Better-Than-Nothing-Security (BTNS) [22]. However, this would still present some issues. In particular, this approach requires a leap- of-faith in order to bind a given address to the public key that is being used, which would actually prevent the most critical security feature that a Shim6 security solution needs to achieve from being provided: proving identifier ownership. On top of that, using IPsec would require to turn on per-packet AH/ESP just for multihoming to occur. In general, SHIM6 was expected to work between pairs of hosts that have no prior arrangement, security association, or common, trusted third party. It was also seen as undesirable to have to turn on per- packet AH/ESP just for the multihoming to occur. However, Shim6 should work and have an additional level of security where two hosts choose to use IPsec. Another design alternative would have employed some form of opportunistic or Better-Than-Nothing Security (BTNS) IPsec to perform these tasks with IPsec instead. Essentially, HIP in opportunistic mode is very similar to SHIM6, except that HIP uses IPsec, employs per-packet ESP, and introduces another set of identifiers. Finally, two different technologies were selected to protect the shim protocol: HBA [3] and CGA [2]. These two techniques provide a similar level of protection but also provide different functionality with different computational costs. The HBA mechanism relies on the capability of generating all the addresses of a multihomed host as an unalterable set of intrinsically bound IPv6 addresses, known as an HBA set. In this approach,
addresses incorporate a cryptographic one-way hash of the prefix set available into the interface identifier part. The result is that the binding between all the available addresses is encoded within the addresses themselves, providing hijacking protection. Any peer using the shim protocol node can efficiently verify that the alternative addresses proposed for continuing the communication are bound to the initial address through a simple hash calculation. A limitation of the HBA technique is that, once generated, the address set is fixed and cannot be changed without also changing all the addresses of the HBA set. In other words, the HBA technique does not support dynamic addition of address to a previously generated HBA set. An advantage of this approach is that it requires only hash operations to verify a locator set, imposing very low computational cost to the protocol. In a CGA-based approach, the address used as ULID is a CGA that contains a hash of a public key in its interface identifier. The result is a secure binding between the ULID and the associated key pair. This allows each peer to use the corresponding private key to sign the shim messages that convey locator set information. The trust chain in this case is the following: the ULID used for the communication is securely bound to the key pair because it contains the hash of the public key, and the locator set is bound to the public key through the signature. The CGA approach then supports dynamic addition of new locators in the locator set, since in order to do that the node only needs to sign the new locator with the private key associated with the CGA used as ULID. A limitation of this approach is that it imposes systematic usage of public key cryptography with its associate computational cost. Either of these two mechanisms, HBA and CGA, provides time-shifted attack protection, since the ULID is securely bound to a locator set that can only be defined by the owner of the ULID. So the design decision adopted was that both mechanisms, HBA and CGA, are supported. This way, when only stable address sets are required, the nodes can benefit from the low computational cost offered by HBA, while when dynamic locator sets are required, this can be achieved through CGAs with an additional cost. Moreover, because HBAs are defined as a CGA extension, the addresses available in a node can simultaneously be CGAs and HBAs, allowing the usage of the HBA and CGA functionality when needed, without requiring a change in the addresses used.D.5. ULID-Pair Context-Establishment Exchange
Two options were considered for the ULID-pair context-establishment exchange: a 2-way handshake and a 4-way handshake.
A key goal for the design of this exchange was protection against DoS attacks. The attack under consideration was basically a situation where an attacker launches a great amount of ULID-pair establishment- request packets, exhausting the victim's resources similarly to TCP SYN flooding attacks. A 4-way handshake exchange protects against these attacks because the receiver does not create any state associated to a given context until the reception of the second packet, which contains prior- contact proof in the form of a token. At this point, the receiver can verify that at least the address used by the initiator is valid to some extent, since the initiator is able to receive packets at this address. In the worst case, the responder can track down the attacker using this address. The drawback of this approach is that it imposes a 4-packet exchange for any context establishment. This would be a great deal if the shim context needed to be established up front, before the communication can proceed. However, thanks to the deferred context-establishment capability of the shim protocol, this limitation has a reduced impact in the performance of the protocol. (However, it may have a greater impact in the situation of context recovery, as discussed earlier. However, in this case, it is possible to perform optimizations to reduce the number of packets as described above.) The other option considered was a 2-way handshake with the possibility to fall back to a 4-way handshake in case of attack. In this approach, the ULID-pair establishment exchange normally consists of a 2-packet exchange and does not verify that the initiator has performed a prior contact before creating context state. In case a DoS attack is detected, the responder falls back to a 4-way handshake similar to the one described previously, in order to prevent the detected attack from proceeding. The main difficulty with this attack is how to detect that a responder is currently under attack. It should be noted that, because this is a 2-way exchange, it is not possible to use the number of half-open sessions (as in TCP) to detect an ongoing attack; different heuristics need to be considered. The design decision taken was that, considering the current impact of DoS attacks and the low impact of the 4-way exchange in the shim protocol (thanks to the deferred context-establishment capability), a 4-way exchange would be adopted for the base protocol.D.6. Updating Locator Sets
There are two possible approaches to the addition and removal of locators: atomic and differential approaches. The atomic approach essentially sends the complete locator set each time a variation in the locator set occurs. The differential approach sends the
differences between the existing locator set and the new one. The atomic approach imposes additional overhead since all of the locator set has to be exchanged each time, while the differential approach requires re-synchronization of both ends through changes (i.e., requires that both ends have the same idea about what the current locator set is). Because of the difficulties imposed by the synchronization requirement, the atomic approach was selected.D.7. State Cleanup
There are essentially two approaches for discarding an existing state about locators, keys, and identifiers of a correspondent node: a coordinated approach and an unilateral approach. In the unilateral approach, each node discards information about the other node without coordination with the other node, based on some local timers and heuristics. No packet exchange is required for this. In this case, it would be possible that one of the nodes has discarded the state while the other node still hasn't. In this case, a No Context Error message may be required to inform the other node about the situation; possibly a recovery mechanism is also needed. A coordinated approach would use an explicit CLOSE mechanism, akin to the one specified in HIP [20]. If an explicit CLOSE handshake and associated timer is used, then there would no longer be a need for the No Context Error message due to a peer having garbage collected at its end of the context. However, there is still potentially a need to have a No Context Error message in the case of a complete state loss of the peer (also known as a crash followed by a reboot). Only if we assume that the reboot takes at least the time of the CLOSE timer, or that it is okay to not provide complete service until CLOSE-timer minutes after the crash, can we completely do away with the No Context Error message. In addition, another aspect that is relevant for this design choice is the context confusion issue. In particular, using a unilateral approach to discard context state clearly opens up the possibility of context confusion, where one of the ends unilaterally discards the context state, while the other does not. In this case, the end that has discarded the state can re-use the Context Tag value used for the discarded state for another context, creating potential context confusion. In order to illustrate the cases where problems would arise, consider the following scenario: o Hosts A and B establish context 1 using CTA and CTB as Context Tags.
o Later on, A discards context 1 and the Context Tag value CTA becomes available for reuse. o However, B still keeps context 1. This would create context confusion in the following two cases: o A new context 2 is established between A and B with a different ULID pair (or Forked Instance Identifier), and A uses CTA as the Context Tag. If the locator sets used for both contexts are not disjoint, we have context confusion. o A new context is established between A and C, and A uses CTA as the Context Tag value for this new context. Later on, B sends Payload Extension header and/or control messages containing CTA, which could be interpreted by A as belonging to context 2 (if no proper care is taken). Again we have context confusion. One could think that using a coordinated approach would eliminate such context confusion, making the protocol much simpler. However, this is not the case, because even in the case of a coordinated approach using a CLOSE/CLOSE ACK exchange, there is still the possibility of a host rebooting without having the time to perform the CLOSE exchange. So, it is true that the coordinated approach eliminates the possibility of context confusion due to premature garbage collection, but it does not prevent the same situations due to a crash and reboot of one of the involved hosts. The result is that, even if we went for a coordinated approach, we would still need to deal with context confusion and provide the means to detect and recover from these situations.
Authors' Addresses
Erik Nordmark Sun Microsystems 17 Network Circle Menlo Park, CA 94025 USA Phone: +1 650 786 2921 EMail: erik.nordmark@sun.com Marcelo Bagnulo Universidad Carlos III de Madrid Av. Universidad 30 Leganes, Madrid 28911 SPAIN Phone: +34 91 6248814 EMail: marcelo@it.uc3m.es URI: http://www.it.uc3m.es