When the IKE Initiator uses TCP encapsulation, it will initiate a TCP connection to the Responder using the Responder's preconfigured TCP port. The first bytes sent on the TCP stream MUST be the stream prefix value (Section 4). After this prefix, encapsulated IKE messages will negotiate the IKE SA and initial Child SA [RFC 7296]. After this point, both encapsulated IKE (Figure 1) and ESP (Figure 2) messages will be sent over the TCP connection. The TCP Responder MUST wait for the entire stream prefix to be received on the stream before trying to parse out any IKE or ESP messages. The stream prefix is sent only once, and only by the TCP Originator.
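The Responder-side handling described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the class and method names are hypothetical, the prefix value is the 6-octet string "IKETCP" defined in Section 4, and the framing (a 2-octet Length that counts itself, with four zero octets as the Non-ESP Marker in front of IKE messages) is assumed from the message formats shown in Figures 1 and 2.

```python
import struct

STREAM_PREFIX = b"IKETCP"              # 6-octet stream prefix (Section 4)
NON_ESP_MARKER = b"\x00\x00\x00\x00"   # four zero octets precede each IKE message


class StreamParser:
    """Illustrative Responder-side parser for a single TCP connection."""

    def __init__(self):
        self.buf = bytearray()
        self.prefix_seen = False

    def feed(self, data: bytes):
        """Buffer received bytes and return every complete message found."""
        self.buf += data
        if not self.prefix_seen:
            if len(self.buf) < len(STREAM_PREFIX):
                return []  # wait for the entire prefix before parsing anything
            if bytes(self.buf[:len(STREAM_PREFIX)]) != STREAM_PREFIX:
                raise ValueError("connection does not begin with the stream prefix")
            del self.buf[:len(STREAM_PREFIX)]
            self.prefix_seen = True

        messages = []
        while len(self.buf) >= 2:
            (length,) = struct.unpack_from("!H", self.buf, 0)
            if length < 2:
                raise ValueError("Length field is smaller than its own size")
            if len(self.buf) < length:
                break  # partial message: keep buffering
            body = bytes(self.buf[2:length])
            del self.buf[:length]
            if body[:4] == NON_ESP_MARKER:
                messages.append(("ike", body[4:]))
            else:
                messages.append(("esp", body))
        return messages
```

Because the parser returns nothing until all six prefix octets have arrived, a prefix that is split across TCP segments is handled the same way as one delivered in a single read.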
In order to close an IKE session, either the Initiator or Responder SHOULD gracefully tear down IKE SAs with DELETE payloads. Once the SA has been deleted, the TCP Originator SHOULD close the TCP connection if it does not intend to use the connection for another IKE session to the TCP Responder. If the TCP connection is no longer associated with any active IKE SA, the TCP Responder MAY close the connection to clean up IKE resources if the TCP Originator didn't close it within some reasonable period of time (e.g., a few seconds).
An unexpected FIN or a TCP Reset on the TCP connection may indicate a loss of connectivity, an attack, or some other error. If a DELETE payload has not been sent, both sides SHOULD maintain the state for their SAs for the standard lifetime or timeout period. The TCP Originator is responsible for re-establishing the TCP connection if it is torn down for any unexpected reason. Since new TCP connections may use different IP addresses and/or ports due to NAT mappings or local address or port allocations changing, the TCP Responder MUST allow packets for existing SAs to be received from new source IP addresses and ports. Note that the IPv6 Flow-ID header MUST remain constant when a new TCP connection is created to avoid ECMP load balancing.
A peer MUST discard a partially received message due to a broken connection.
Whenever the TCP Originator opens a new TCP connection to be used for an existing IKE SA, it MUST send the stream prefix first, before any IKE or ESP messages. This follows the same behavior as the initial TCP connection.
Multiple IKE SAs MUST NOT share a single TCP connection, unless one is a rekey of an existing IKE SA, in which case there will temporarily be two IKE SAs on the same TCP connection.
If a TCP connection is being used to continue an existing IKE/ESP session, the TCP Responder can recognize the session using either the IKE SPI from an encapsulated IKE message or the ESP SPI from an encapsulated ESP packet. If the session had been fully established previously, it is suggested that the TCP Originator send an UPDATE_SA_ADDRESSES message if MOBIKE is supported and an empty informational message if it is not.
The TCP Responder MUST NOT accept any messages for the existing IKE session on a new incoming connection, unless that connection begins with the stream prefix. If either the TCP Originator or TCP Responder detects corruption on a connection that was started with a valid stream prefix, it SHOULD close the TCP connection. The connection can be corrupted if there are too many subsequent messages that cannot be parsed as valid IKE messages or ESP messages with known SPIs, or if the authentication check for an IKE message or ESP message with a known SPI fails. Implementations SHOULD NOT tear down a connection if only a few consecutive ESP packets have unknown SPIs since the SPI databases may be momentarily out of sync. If there is instead a syntax issue within an IKE message, an implementation MUST send the INVALID_SYNTAX notify payload and tear down the IKE SA as usual, rather than tearing down the TCP connection directly.
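One possible way to turn the corruption heuristics above into code is a small per-connection counter. The sketch below is only an illustration under stated assumptions: the names `Outcome` and `CorruptionMonitor` are hypothetical, the threshold of 8 is an arbitrary stand-in for "too many", and treating a single authentication failure on a known SPI as grounds for closing is one reading of the text rather than a requirement. IKE syntax errors are deliberately not modeled, since they are handled at the IKE layer with INVALID_SYNTAX.

```python
from enum import Enum


class Outcome(Enum):
    VALID = "valid"              # parsed, known SPI, authentication passed
    UNKNOWN_SPI = "unknown_spi"  # ESP packet whose SPI is not in the SPI database
    UNPARSEABLE = "unparseable"  # neither a valid IKE nor a valid ESP message
    AUTH_FAILED = "auth_failed"  # known SPI, but the authentication check failed


class CorruptionMonitor:
    """Decides when a TCP connection should be closed as corrupted."""

    def __init__(self, max_consecutive_bad: int = 8):
        # Hypothetical threshold: the text only says "too many".
        self.max_consecutive_bad = max_consecutive_bad
        self.consecutive_bad = 0

    def record(self, outcome: Outcome) -> bool:
        """Record one message outcome; return True if the connection should close."""
        if outcome is Outcome.VALID:
            self.consecutive_bad = 0
            return False
        if outcome is Outcome.AUTH_FAILED:
            return True  # assumption: one failure on a known SPI counts as corruption
        # A few unknown SPIs or unparseable messages in a row are tolerated,
        # since the SPI databases may be momentarily out of sync.
        self.consecutive_bad += 1
        return self.consecutive_bad > self.max_consecutive_bad
```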
A TCP Originator SHOULD only open one TCP connection per IKE SA, over which it sends all of the corresponding IKE and ESP messages. This helps ensure that any firewall or NAT mappings allocated for the TCP connection apply to all of the traffic associated with the IKE SA equally.
As with TCP Originators, a TCP Responder SHOULD send packets for an IKE SA and its Child SAs over only one TCP connection at any given time. It SHOULD choose the TCP connection on which it last received a valid and decryptable IKE or ESP message. In order to be considered valid for choosing a TCP connection, an IKE message must be successfully decrypted and authenticated, not be a retransmission of a previously received message, and be within the expected window for IKE message IDs. Similarly, an ESP message must be successfully decrypted and authenticated, and must not be a replay of a previous message.
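The selection rule above reduces to "remember the connection that delivered the most recent valid message". A minimal sketch, assuming the decryption, replay, and window checks have already been done elsewhere and are passed in as flags; `ReceivedMessage`, `SendPathSelector`, and `conn_id` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReceivedMessage:
    conn_id: int            # hypothetical identifier of the TCP connection
    authenticated: bool     # decrypted and authenticated successfully
    is_replay: bool         # retransmitted IKE message or replayed ESP packet
    in_window: bool = True  # IKE only: message ID within the expected window


class SendPathSelector:
    """Tracks which TCP connection a Responder sends on for one IKE SA."""

    def __init__(self):
        self.current: Optional[int] = None

    def on_receive(self, msg: ReceivedMessage) -> Optional[int]:
        """Update the send path if the message qualifies; return the current path."""
        if msg.authenticated and not msg.is_replay and msg.in_window:
            self.current = msg.conn_id
        return self.current
```

Messages that fail any check leave the current choice untouched, so a forged or replayed packet arriving on a new connection cannot redirect outbound traffic.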
Since a connection may be broken and a new connection re-established by the TCP Originator without the TCP Responder being aware, a TCP Responder SHOULD accept receiving IKE and ESP messages on both old and new connections until the old connection is closed by the TCP Originator. A TCP Responder MAY close a TCP connection that it perceives as idle and extraneous (one previously used for IKE and ESP messages that has been replaced by a new connection).
Section 2.1 of RFC 7296 describes how IKEv2 deals with the unreliability of the UDP protocol. In brief, the exchange Initiator is responsible for retransmissions and must retransmit request messages until a response message is received. If no reply is received after several retransmissions, the SA is deleted. The Responder never initiates retransmission, but it must send a response message again in case it receives a retransmitted request.
When IKEv2 uses a reliable transport protocol, like TCP, the retransmission rules are as follows:
- The exchange Initiator SHOULD NOT retransmit request messages (*); if no response is received within some reasonable period of time, the IKE SA is deleted.
- If a new TCP connection for the IKE SA is established while the exchange Initiator is waiting for a response, the Initiator MUST retransmit its request over this connection and continue to wait for a response.
- The exchange Responder does not change its behavior, but acts as described in Section 2.1 of RFC 7296.
(*) This is an optimization; implementations may continue to use the retransmission logic from Section 2.1 of RFC 7296 for simplicity.
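The Initiator-side rules above can be expressed as a small decision function. This is a sketch under assumptions, not the specified algorithm: the function and enum names are hypothetical, the 30-second default stands in for "some reasonable period of time", and the text does not say whether a new connection restarts the wait timer, so the sketch simply checks for a new connection before checking the timeout.

```python
from enum import Enum


class Action(Enum):
    DONE = "done"              # response received, exchange complete
    WAIT = "wait"              # keep waiting, do not retransmit
    RETRANSMIT = "retransmit"  # resend the request over the new TCP connection
    DELETE_SA = "delete_sa"    # give up and delete the IKE SA


def initiator_next_action(response_received: bool,
                          new_connection_established: bool,
                          waited_seconds: float,
                          response_timeout: float = 30.0) -> Action:
    """Pick the exchange Initiator's next step when IKEv2 runs over TCP."""
    if response_received:
        return Action.DONE
    if new_connection_established:
        # The request may have been lost with the old connection.
        return Action.RETRANSMIT
    if waited_seconds >= response_timeout:
        return Action.DELETE_SA
    return Action.WAIT
```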
IKEv2 provides a DoS attack protection mechanism through Cookies, which is described in Section 2.6 of RFC 7296. [RFC 8019] extends this mechanism for protection against DDoS attacks by means of Client Puzzles. Both mechanisms allow the Responder to avoid keeping state until the Initiator proves its IP address is legitimate (and after solving a puzzle if required).
The connection-oriented nature of TCP transport brings additional considerations for using these mechanisms. In general, Cookies provide less value in the case of TCP encapsulation; by the time a Responder receives the IKE_SA_INIT request, the TCP session has already been established and the Initiator's IP address has been verified. Moreover, a TCP/IP stack creates state once a TCP SYN packet is received (unless SYN Cookies described in [RFC 4987] are employed), which contradicts the statelessness of IKEv2 Cookies. In particular, with TCP, an attacker is able to mount a SYN flooding DoS attack that an IKEv2 Responder cannot prevent using stateless IKEv2 Cookies. Thus, when using TCP encapsulation, it makes little sense to send Cookie requests without Puzzles unless the Responder is concerned with a possibility of TCP sequence number attacks (see [RFC 6528] and [RFC 9293] for details). Puzzles, on the other hand, still remain useful (and their use requires using Cookies).
The following considerations are applicable for using Cookie and Puzzle mechanisms in the case of TCP encapsulation:
- The exchange Responder SHOULD NOT send an IKEv2 Cookie request without an accompanying Puzzle; implementations might choose to have exceptions to this for cases like mitigating TCP sequence number attacks.
- If the Responder chooses to send a Cookie request (possibly along with a Puzzle request), then the TCP connection that the IKE_SA_INIT request message was received over SHOULD be closed if, after the Responder sends its reply, no repeated request is received within some short period of time; this keeps the Responder stateless (see Section 6.3.1). Note that the Responder MUST NOT include the Initiator's TCP port in the Cookie calculation (*) since the Cookie can be returned over a new TCP connection with a different port.
- The exchange Initiator acts as described in Section 2.6 of RFC 7296 and Section 7 of RFC 8019, i.e., using TCP encapsulation doesn't change the Initiator's behavior.
(*) Examples of Cookie calculation methods are given in Section 2.6 of RFC 7296 and in Section 7.1.1.3 of RFC 8019, and they don't include transport protocol ports. However, these examples are given for illustrative purposes since the Cookie generation algorithm is a local matter and some implementations might include port numbers that won't work with TCP encapsulation. Note also that these examples include the Initiator's IP address in Cookie calculation. In general, this address may change between two initial requests (with and without Cookies). This may happen due to NATs, which have more freedom to change source IP addresses for new TCP connections than for UDP. In such cases, cookie verification might fail.
There is a trade-off in choosing the period of time after which the TCP connection is closed. If it is too short, then the proper Initiator that repeats its request would need to re-establish the TCP connection, introducing additional delay. On the other hand, if it is too long, then the Responder's resources would be wasted in case the Initiator never comes back. This document doesn't mandate the duration of time because it doesn't affect interoperability, but it is believed that 5-10 seconds is a good compromise. Also, note that if the Responder requests that the Initiator solve a puzzle, then the Responder can estimate how long it would take the Initiator to find a solution and adjust the time interval accordingly.
Section 2.21.1 of RFC 7296 describes how error notifications are handled in the IKE_SA_INIT exchange. In particular, it is advised that the Initiator should not act immediately after receiving an error notification; instead, it should wait some time for a valid response since the IKE_SA_INIT messages are completely unauthenticated. This advice does not apply equally in the case of TCP encapsulation. If the Initiator receives a response message over TCP, then either this message is genuine and was sent by the peer or the TCP session was hijacked and the message is forged. In the latter case, no genuine messages from the Responder will be received.
Thus, in the case of TCP encapsulation, an Initiator SHOULD NOT wait for additional messages in case it receives an error notification from the Responder in the IKE_SA_INIT exchange.
In the IKE_SA_INIT exchange, if the Responder returns an error notification that implies a recovery action from the Initiator (such as INVALID_KE_PAYLOAD or INVALID_MAJOR_VERSION, see Section 2.21.1 of RFC 7296), then the Responder SHOULD NOT close the TCP connection immediately, since the Initiator is expected to repeat the request with corrected parameters. See also Section 6.3.
When negotiating over UDP, IKE_SA_INIT packets include NAT_DETECTION_SOURCE_IP and NAT_DETECTION_DESTINATION_IP payloads to determine if UDP encapsulation of IPsec packets should be used. These payloads contain SHA-1 digests of the SPIs, IP addresses, and ports as defined in [RFC 7296]. IKE_SA_INIT packets sent on a TCP connection SHOULD include these payloads with the same content as when sending over UDP and SHOULD use the applicable TCP ports when creating and checking the SHA-1 digests.
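The digest computation can be sketched as below. It assumes the layout from RFC 7296: SHA-1 over the two 8-octet SPIs in header order, then the IP address, then the 2-octet port in network byte order, with the TCP port substituted for the UDP port. The function names and the example addresses and ports are illustrative assumptions.

```python
import hashlib
import ipaddress
import struct


def nat_detection_hash(spi_i: bytes, spi_r: bytes, ip: str, tcp_port: int) -> bytes:
    """SHA-1 over SPIi | SPIr | IP address | port, using the TCP port."""
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKE SPIs are 8 octets each")
    ip_bytes = ipaddress.ip_address(ip).packed  # 4 octets for IPv4, 16 for IPv6
    return hashlib.sha1(spi_i + spi_r + ip_bytes + struct.pack("!H", tcp_port)).digest()


def nat_detected(received_digest: bytes, spi_i: bytes, spi_r: bytes,
                 observed_ip: str, observed_port: int) -> bool:
    """A mismatch between the received digest and the one computed from the
    address and port actually observed on the TCP connection implies a NAT."""
    return received_digest != nat_detection_hash(spi_i, spi_r, observed_ip, observed_port)
```

For example, a digest the peer computed over its private address and local port will not match the one the receiver computes from the translated address it sees on the connection, so `nat_detected` returns True.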
If a NAT is detected due to the SHA-1 digests not matching the expected values, no change should be made for encapsulation of subsequent IKE or ESP packets since TCP encapsulation inherently supports NAT traversal. However, for transport mode IPsec SAs, implementations need to handle TCP and UDP packet checksum fixup during decapsulation, as defined for UDP encapsulation in [RFC 3948].
Implementations MAY use the information that a NAT is present to influence keepalive timer values.
Encapsulating IKE and IPsec inside of a TCP connection can impact the strategy that implementations use to maintain middlebox port mappings.
In general, TCP port mappings are maintained by NATs longer than UDP port mappings, so IPsec ESP NAT-keepalive packets [RFC 3948] SHOULD NOT be sent when using TCP encapsulation. Any implementation using TCP encapsulation MUST silently drop incoming NAT-keepalive packets and not treat them as errors. NAT-keepalive packets over a TCP-encapsulated IPsec connection will be sent as a 1-octet-long payload with the value 0xFF, preceded by the 2-octet Length specifying a length of 3 (since it includes the length of the Length field).
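The on-the-wire form of such a keepalive is only three octets, which makes the drop rule easy to sketch. The helper names below are hypothetical; the frame layout is taken directly from the description above.

```python
import struct

# 2-octet Length (value 3, counting itself) followed by the single octet 0xFF.
NAT_KEEPALIVE_FRAME = struct.pack("!H", 3) + b"\xff"


def is_nat_keepalive(frame: bytes) -> bool:
    """Return True if a complete length-prefixed frame is a NAT-keepalive."""
    return frame == NAT_KEEPALIVE_FRAME


def drop_keepalives(frames):
    """Silently discard NAT-keepalives and pass every other frame through."""
    return [f for f in frames if not is_nat_keepalive(f)]
```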
Peer liveness should be checked using IKE informational packets [RFC 7296].
Note that, depending on the configuration of TCP and TLS on the connection, TCP keep-alives [RFC 1122] and TLS keep-alives [RFC 6520] MAY be used. These MUST NOT be used as indications of IKE peer liveness, for which purpose the standard IKEv2 mechanism of exchanging (usually empty) INFORMATIONAL messages is used (see Section 1.4 of RFC 7296).
Using TCP encapsulation affects some aspects of IPsec SA processing.
- Section 8.1 of RFC 4301 requires all tunnel mode IPsec SAs to be able to copy the Don't Fragment (DF) bit from the inner IPv4 header to the outer (tunnel) one. With TCP encapsulation, this is generally not possible because the TCP/IP stack manages the DF bit in the outer IPv4 header, and usually the stack ensures that the DF bit is set for TCP packets to avoid IP fragmentation. Note that this behavior is compliant with generic tunneling considerations since the outer TCP header acts as a link-layer protocol and its fragmentation and reassembly have no correlation with the inner payload.
- The other feature that is less applicable with TCP encapsulation is the ability to split traffic of different QoS classes into different IPsec SAs created by a single IKE SA. In this case, the Differentiated Services Code Point (DSCP) field is usually copied from the inner IP header to the outer (tunnel) one, ensuring that IPsec traffic of each SA receives the corresponding level of service. With TCP encapsulation, all IPsec SAs created by a single IKE SA will share a single TCP connection; thus, they will receive the same level of service (see Section 9.3). If this functionality is needed, implementations should create several IKE SAs, each over a separate TCP connection, and assign a corresponding DSCP value to each of them.
TCP encapsulation of IPsec packets may have implications on performance of the encapsulated traffic. Performance considerations are discussed in Section 9.