4. Protocol Operation
4.1. L2TP Over Specific Packet-Switched Networks (PSNs)
L2TP may operate over a variety of PSNs. Two modes are described for operation over IP: L2TP directly over IP (see Section 4.1.1) and L2TP over UDP (see Section 4.1.2). L2TPv3 implementations MUST support L2TP over IP and SHOULD support L2TP over UDP for better NAT and firewall traversal, and for easier migration from L2TPv2.

L2TP over other PSNs may be defined, but the specifics are outside the scope of this document. Examples of L2TPv2 over other PSNs include [RFC3070] and [RFC3355].

The following fields are defined for use in all L2TP Session Header encapsulations.

Session ID

   A 32-bit field containing a non-zero identifier for a session. L2TP sessions are named by identifiers that have local significance only. That is, the same logical session will be given different Session IDs by each end of the control connection for the life of the session. When the L2TP control connection is used for session establishment, Session IDs are selected and exchanged as Local Session ID AVPs during the creation of a session. The Session ID alone provides the necessary context for all further packet processing, including the presence, size, and value of the Cookie, the type of L2-Specific Sublayer, and the type of payload being tunneled.
Cookie

   The optional Cookie field contains a variable-length value (maximum 64 bits) used to check the association of a received data message with the session identified by the Session ID. The Cookie MUST be set to the configured or signaled random value for this session. The Cookie provides an additional level of guarantee that a data message has been directed to the proper session by the Session ID. A well-chosen Cookie may prevent inadvertent misdirection of stray packets with recently reused Session IDs, Session IDs subject to packet corruption, etc. The Cookie may also provide protection against some specific malicious packet insertion attacks, as described in Section 8.2.

   When the L2TP control connection is used for session establishment, random Cookie values are selected and exchanged as Assigned Cookie AVPs during session creation.

4.1.1. L2TPv3 over IP

L2TPv3 over IP (both IP versions) utilizes the IANA-assigned IP protocol ID 115.

4.1.1.1. L2TPv3 Session Header Over IP
Unlike L2TP over UDP, the L2TPv3 session header over IP is free of any restrictions imposed by coexistence with L2TPv2 and L2F. As such, the header format has been designed to optimize packet processing. The following session header format is utilized when operating L2TPv3 over IP:

Figure 4.1.1.1: L2TPv3 Session Header Over IP

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Session ID                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Cookie (optional, maximum 64 bits)...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Session ID and Cookie fields are as defined in Section 4.1. The Session ID of zero is reserved for use by L2TP control messages (see Section 4.1.1.2).
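As an informal illustration of how a receiver might process this header, the following Python sketch reads the 32-bit Session ID, treats zero as a control message, and otherwise compares the Cookie against the value stored for that session. The function name demux_over_ip and the cookies table (Session ID mapped to the expected Cookie bytes, empty when no Cookie is in use) are hypothetical and not part of this specification.

```python
import struct
from typing import Dict, Optional, Tuple


def demux_over_ip(
    payload: bytes, cookies: Dict[int, bytes]
) -> Optional[Tuple[str, int, bytes]]:
    """Classify an L2TPv3-over-IP payload (the bytes after the IP header).

    `cookies` is a hypothetical session table: Session ID -> expected Cookie
    (b"" when the session uses no Cookie). Returns None when the packet
    should be dropped.
    """
    if len(payload) < 4:
        return None  # too short to hold a Session ID
    (session_id,) = struct.unpack_from("!I", payload, 0)
    if session_id == 0:
        # Session ID zero is reserved for control messages.
        return ("control", 0, payload[4:])
    expected = cookies.get(session_id)
    if expected is None:
        return None  # unknown Session ID
    end = 4 + len(expected)
    # The Cookie size comes from the session context, not from the packet.
    if payload[4:end] != expected:
        return None  # Cookie mismatch (or truncated packet)
    return ("data", session_id, payload[end:])
```

The Cookie length is taken from the per-session context because, as noted in Section 4.1, the Session ID alone determines the presence and size of the Cookie.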
4.1.1.2. L2TP Control and Data Traffic over IP
Unlike L2TP over UDP, which uses the T bit to distinguish between L2TP control and data packets, L2TP over IP uses the reserved Session ID of zero (0) when sending control messages. It is presumed that checking for the zero Session ID is more efficient -- both in header size for data packets and in processing speed for distinguishing between control and data messages -- than checking a single bit.

The entire control message header over IP, including the zero session ID, appears as follows:

Figure 4.1.1.2: L2TPv3 Control Message Header Over IP

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      (32 bits of zeros)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T|L|x|x|S|x|x|x|x|x|x|x|  Ver  |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Control Connection ID                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Ns              |               Nr              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Named fields are as defined in Section 3.2.1. Note that the Length field is still calculated from the beginning of the control message header, beginning with the T bit. It does NOT include the "(32 bits of zeros)" depicted above.

When operating directly over IP, L2TP packets lose the ability to take advantage of the UDP checksum as a simple packet integrity check, which is of particular concern for L2TP control messages. Control Message Authentication (see Section 4.3), even with an empty password field, provides for a sufficient packet integrity check and SHOULD always be enabled.

4.1.2. L2TP over UDP
L2TPv3 over UDP must consider other L2 tunneling protocols that may be operating in the same environment, including L2TPv2 [RFC2661] and L2F [RFC2341]. While there are efficiencies gained by running L2TP directly over IP, there are possible side effects as well. For instance, L2TP over IP is not as NAT-friendly as L2TP over UDP.
4.1.2.1. L2TP Session Header Over UDP
The following session header format is utilized when operating L2TPv3 over UDP:

Figure 4.1.2.1: L2TPv3 Session Header over UDP

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T|x|x|x|x|x|x|x|x|x|x|x|  Ver  |          Reserved             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Session ID                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Cookie (optional, maximum 64 bits)...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The T bit MUST be set to 0, indicating that this is a data message.

The x bits and Reserved field are reserved for future extensions. All reserved values MUST be set to 0 on outgoing messages and ignored on incoming messages.

The Ver field MUST be set to 3, indicating an L2TPv3 message.

Note that the initial bits 1, 4, 6, and 7 have meaning in L2TPv2 [RFC2661], and are deprecated and marked as reserved in L2TPv3. Thus, for UDP mode on a system that supports both versions of L2TP, it is important that the Ver field be inspected first to determine the Version of the header before acting upon any of these bits.

The Session ID and Cookie fields are as defined in Section 4.1.

4.1.2.2. UDP Port Selection
The method for UDP Port Selection defined in this section is identical to that defined for L2TPv2 [RFC2661]. When negotiating a control connection over UDP, control messages MUST be sent as UDP datagrams using the registered UDP port 1701 [RFC1700]. The initiator of an L2TP control connection picks an available source UDP port (which may or may not be 1701) and sends to the desired destination address at port 1701. The recipient picks a free port on its own system (which may or may not be 1701) and sends its reply to the initiator's UDP port and address, setting its own source port to the free port it found.
Any subsequent traffic associated with this control connection (either control traffic or data traffic from a session established through this control connection) must use these same UDP ports.

It has been suggested that having the recipient choose an arbitrary source port (as opposed to using the destination port in the packet initiating the control connection, i.e., 1701) may make it more difficult for L2TP to traverse some NAT devices. Implementations should consider the potential implication of this capability before choosing an arbitrary source port. A NAT device that can pass TFTP traffic with variant UDP ports should be able to pass L2TP UDP traffic since both protocols employ similar policies with regard to UDP port selection.

4.1.2.3. UDP Checksum
The tunneled frames that L2TP carries often have their own checksums or integrity checks, rendering the UDP checksum redundant for much of the L2TP data message contents. Thus, UDP checksums MAY be disabled in order to reduce the associated packet processing burden at the L2TP endpoints.

The L2TP header itself does not have its own checksum or integrity check. However, use of the L2TP Session ID and Cookie pair guards against accepting an L2TP data message if corruption of the Session ID or associated Cookie has occurred. When the L2-Specific Sublayer is present in the L2TP header, there is no built-in integrity check for the information contained therein if UDP checksums or some other integrity check is not employed. IPsec (see Section 4.1.3) may be used for strong integrity protection of the entire contents of L2TP data messages.

UDP checksums MUST be enabled for L2TP control messages.

4.1.3. L2TP and IPsec
The L2TP data channel does not provide cryptographic security of any kind. If the L2TP data channel operates over a public or untrusted IP network where privacy of the L2TP data is of concern or sophisticated attacks against L2TP are expected to occur, IPsec [RFC2401] MUST be made available to secure the L2TP traffic.

Either L2TP over UDP or L2TP over IP may be secured with IPsec. [RFC3193] defines the recommended method for securing L2TPv2. When running over UDP, L2TPv3 has the same characteristics with respect to IPsec as L2TPv2, and implementations MUST follow the same recommendation. When operating over IP directly, [RFC3193] still applies, though references to UDP source and destination ports (in particular, those in Section 4, "IPsec Filtering details when protecting L2TP") may be ignored. Instead, the selectors used to identify L2TPv3 traffic are simply the source and destination IP addresses for the tunnel endpoints together with the L2TPv3 IP protocol type, 115.

In addition to IP transport security, IPsec defines a mode of operation that allows tunneling of IP packets. The packet-level encryption and authentication provided by IPsec tunnel mode and that provided by L2TP secured with IPsec provide an equivalent level of security for these requirements.

IPsec also defines access control features that are required of a compliant IPsec implementation. These features allow filtering of packets based upon network and transport layer characteristics such as IP address, ports, etc. In the L2TP tunneling model, analogous filtering may be performed at the network layer above L2TP. These network layer access control features may be handled at an LCCE via vendor-specific authorization features, or at the network layer itself by using IPsec transport mode end-to-end between the communicating hosts. The requirements for access control mechanisms are not a part of the L2TP specification, and as such, are outside the scope of this document.

Protecting the L2TP packet stream with IPsec does, in turn, also protect the data within the tunneled session packets while transported from one LCCE to the other. Such protection must not be considered a substitution for end-to-end security between communicating hosts or applications.

4.1.4. IP Fragmentation Issues
Fragmentation and reassembly in network equipment generally require significantly greater resources than sending or receiving a packet as a single unit. As such, fragmentation and reassembly should be avoided whenever possible. Ideal solutions for avoiding fragmentation include proper configuration and management of MTU sizes among the Remote System, the LCCE, and the IP network, as well as adaptive measures that operate with the originating host (e.g., [RFC1191], [RFC1981]) to reduce the packet sizes at the source.

An LCCE MAY fragment a packet before encapsulating it in L2TP. For example, if an IPv4 packet arrives at an LCCE from a Remote System that, after encapsulation with its associated framing, L2TP, and IP, does not fit in the available path MTU towards its LCCE peer, the local LCCE may perform IPv4 fragmentation on the packet before tunnel encapsulation. This creates two (or more) L2TP packets, each carrying an IPv4 fragment with its associated framing. This ultimately has the effect of placing the burden of fragmentation on the LCCE, while reassembly occurs on the IPv4 destination host.

If an IPv6 packet arrives at an LCCE from a Remote System that, after encapsulation with associated framing, L2TP, and IP, does not fit in the available path MTU towards its L2TP peer, the Generic Packet Tunneling specification [RFC2473], Section 7.1 SHOULD be followed. In this case, the LCCE should either send an ICMP Packet Too Big message to the data source, or fragment the resultant L2TP/IP packet (for reassembly by the L2TP peer).

If the amount of traffic requiring fragmentation and reassembly is rather light, or there are sufficiently optimized mechanisms at the tunnel endpoints, fragmentation of the L2TP/IP packet may be sufficient for accommodating mismatched MTUs that cannot be managed by more efficient means. This method effectively emulates a larger MTU between tunnel endpoints and should work for any type of L2-encapsulated packet. Note that IPv6 does not support "in-flight" fragmentation of data packets. Thus, unlike IPv4, the MTU of the path towards an L2TP peer must be known in advance (or the last resort IPv6 minimum MTU of 1280 bytes utilized) so that IPv6 fragmentation may occur at the LCCE.

In summary, attempting to control the source MTU by communicating with the originating host, forcing that an MTU be sufficiently large on the path between LCCE peers to tunnel a frame from any other interface without fragmentation, fragmenting IP packets before encapsulation with L2TP/IP, or fragmenting the resultant L2TP/IP packet between the tunnel endpoints, are all valid methods for managing MTU mismatches. Some are clearly better than others depending on the given deployment. For example, a passive monitoring application using L2TP would certainly not wish to have ICMP messages sent to a traffic source. Further, if the links connecting a set of LCCEs have a very large MTU (e.g., SDH/SONET) and it is known that all links being tunneled by L2TP have smaller MTUs (e.g., 1500 bytes), then any IP fragmentation and reassembly enabled on the participating LCCEs would never be utilized. An implementation MUST implement at least one of the methods described in this section for managing mismatched MTUs, based on careful consideration of how the final product will be deployed.

L2TP-specific fragmentation and reassembly methods, which may or may not depend on the characteristics of the type of link being tunneled (e.g., judicious packing of ATM cells), may be defined as well, but these methods are outside the scope of this document.
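To make the MTU arithmetic behind these methods concrete, the following Python sketch estimates the largest tunneled frame (including its own L2 framing) that fits in a given path MTU without fragmentation. The function name max_tunneled_frame and its parameters are hypothetical. The header sizes are assumptions: an IPv4 header without options (20 octets) or an IPv6 header without extension headers (40 octets), an 8-octet UDP header, the session header layouts shown in Sections 4.1.1.1 and 4.1.2.1, and a 4-octet Default L2-Specific Sublayer.

```python
def max_tunneled_frame(
    path_mtu: int,
    *,
    ipv6: bool = False,
    over_udp: bool = False,
    cookie_len: int = 0,
    l2_sublayer: bool = False,
) -> int:
    """Largest tunneled frame, in octets, that avoids fragmentation.

    Assumes no IPv4 options and no IPv6 extension headers.
    """
    ip_header = 40 if ipv6 else 20
    udp_header = 8 if over_udp else 0
    # Over UDP the session header carries a 4-octet flags/Ver/Reserved word
    # ahead of the 4-octet Session ID; over IP only the Session ID is present.
    session_header = 8 if over_udp else 4
    sublayer = 4 if l2_sublayer else 0
    return path_mtu - ip_header - udp_header - session_header - cookie_len - sublayer
```

Under these assumptions, a 1500-octet path MTU with L2TPv3 directly over IPv4 and no Cookie leaves 1476 octets for the tunneled frame, while L2TPv3 over UDP with a 64-bit Cookie and the Default L2-Specific Sublayer leaves 1452 octets.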
4.2. Reliable Delivery of Control Messages
L2TP provides a lower level reliable delivery service for all control messages. The Nr and Ns fields of the control message header (see Section 3.2.1) belong to this delivery mechanism. The upper level functions of L2TP are not concerned with retransmission or ordering of control messages. The reliable control messaging mechanism is a sliding window mechanism that provides control message retransmission and congestion control. Each peer maintains separate sequence number state for each control connection.

The message sequence number, Ns, begins at 0. Each subsequent message is sent with the next increment of the sequence number. The sequence number is thus a free-running counter represented modulo 65536. The sequence number in the header of a received message is considered less than or equal to the last received number if its value lies in the range of the last received number and the preceding 32767 values, inclusive. For example, if the last received sequence number was 15, then messages with sequence numbers 0 through 15, as well as 32784 through 65535, would be considered less than or equal. Such a message would be considered a duplicate of a message already received and would not be processed further. However, in order to ensure that all messages are acknowledged properly (particularly in the case of a lost ACK message), receipt of duplicate messages MUST be acknowledged by the reliable delivery mechanism. This acknowledgment may either be piggybacked on a message in queue or sent explicitly via an ACK message.

All control messages take up one slot in the control message sequence number space, except the ACK message. Thus, Ns is not incremented after an ACK message is sent.

The last received message number, Nr, is used to acknowledge messages received by an L2TP peer. It contains the sequence number of the message the peer expects to receive next (e.g., the last Ns of a non-ACK message received plus 1, modulo 65536).
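The modular comparison described above can be sketched as follows. This is an illustrative Python fragment, not a normative algorithm; the function name seq_leq and the bits parameter are hypothetical.

```python
def seq_leq(received: int, last: int, bits: int = 16) -> bool:
    """Return True if `received` is "less than or equal to" `last`.

    A value qualifies when it is `last` itself or one of the preceding
    2**(bits-1) - 1 values in the modular sequence space.
    """
    modulus = 1 << bits          # 65536 for the 16-bit Ns/Nr fields
    half = modulus >> 1          # 32768
    # Distance walking backwards from `last` to `received`, modulo 2**bits.
    return (last - received) % modulus < half


def next_nr(last_ns_received: int) -> int:
    """Nr to advertise after receiving a non-ACK message with this Ns."""
    return (last_ns_received + 1) % 65536
```

With a last received number of 15, this reproduces the example in the text: values 0 through 15 and 32784 through 65535 compare as less than or equal, while 16 and 32783 do not.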
While the Nr in a received ACK message is used to flush messages from the local retransmit queue (see below), the Nr of the next message sent is not updated by the Ns of the ACK message. Nr SHOULD be sanity-checked before flushing the retransmit queue. For instance, if the Nr received in a control message is greater than the last Ns sent plus 1 modulo 65536, the control message is clearly invalid.

The reliable delivery mechanism at a receiving peer is responsible for making sure that control messages are delivered in order and without duplication to the upper level. Messages arriving out-of-order may be queued for in-order delivery when the missing messages are received. Alternatively, they may be discarded, thus requiring a retransmission by the peer. When dropping out-of-order control packets, Nr MAY be updated before the packet is discarded.

Each control connection maintains a queue of control messages to be transmitted to its peer. The message at the front of the queue is sent with a given Ns value and is held until a control message arrives from the peer in which the Nr field indicates receipt of this message. After a period of time (a recommended default is 1 second but SHOULD be configurable) passes without acknowledgment, the message is retransmitted. The retransmitted message contains the same Ns value, but the Nr value MUST be updated with the sequence number of the next expected message.

Each subsequent retransmission of a message MUST employ an exponential backoff interval. Thus, if the first retransmission occurred after 1 second, the next retransmission should occur after 2 seconds have elapsed, then 4 seconds, etc. An implementation MAY place a cap upon the maximum interval between retransmissions. This cap SHOULD be no less than 8 seconds per retransmission. If no peer response is detected after several retransmissions (a recommended default is 10, but MUST be configurable), the control connection and all associated sessions MUST be cleared. As it is the first message to establish a control connection, the SCCRQ MAY employ a different retransmission maximum than other control messages in order to help facilitate failover to alternate LCCEs in a timely fashion.

When a control connection is being shut down for reasons other than loss of connectivity, the state and reliable delivery mechanisms MUST be maintained and operated for the full retransmission interval after the final StopCCN message has been sent (e.g., 1 + 2 + 4 + 8 + 8... seconds), or until the StopCCN message itself has been acknowledged.

A sliding window mechanism is used for control message transmission and retransmission.
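The retransmission timing described in this section (exponential backoff with an optional cap) can be illustrated with a short Python sketch. The function name retransmit_intervals is hypothetical, and the defaults are simply the recommended values quoted in the text (1 second initial interval, 8 second cap, 10 retransmissions); an implementation may choose others within the stated requirements.

```python
from typing import List


def retransmit_intervals(
    initial: float = 1.0, cap: float = 8.0, max_retransmits: int = 10
) -> List[float]:
    """Seconds to wait before each successive retransmission."""
    intervals = []
    interval = initial
    for _ in range(max_retransmits):
        # Double the wait each time, but never exceed the cap.
        intervals.append(min(interval, cap))
        interval *= 2
    return intervals
```

With the defaults, the schedule is 1, 2, 4, 8, 8, 8, 8, 8, 8, 8 seconds, for a total of 63 seconds before the control connection is cleared.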
Consider two peers, A and B. Suppose A specifies a Receive Window Size AVP with a value of N in the SCCRQ or SCCRP message. B is now allowed to have a maximum of N outstanding (i.e., unacknowledged) control messages. Once N messages have been sent, B must wait for an acknowledgment from A that advances the window before sending new control messages. An implementation may advertise a non-zero receive window as small or as large as it wishes, depending on its own ability to process incoming messages before sending an acknowledgment. Each peer MUST limit the number of unacknowledged messages it will send before receiving an acknowledgment by this Receive Window Size. The actual internal unacknowledged message send-queue depth may be further limited by local resource allocation or by dynamic slow-start and congestion-avoidance mechanisms.

When retransmitting control messages, a slow start and congestion avoidance window adjustment procedure SHOULD be utilized. A recommended procedure is described in Appendix A. A peer MAY drop messages, but MUST NOT actively delay acknowledgment of messages as a technique for flow control of control messages. Appendix B contains examples of control message transmission, acknowledgment, and retransmission.

4.3. Control Message Authentication
L2TP incorporates an optional authentication and integrity check for all control messages. This mechanism consists of a computed one-way hash over the header and body of the L2TP control message, a pre-configured shared secret, and a local and remote nonce (random value) exchanged via the Control Message Authentication Nonce AVP. This per-message authentication and integrity check is designed to perform a mutual authentication between L2TP nodes, perform integrity checking of all control messages, and guard against control message spoofing and replay attacks that would otherwise be trivial to mount.

At least one shared secret (password) MUST exist between communicating L2TP nodes to enable Control Message Authentication. See Section 5.4.3 for details on calculation of the Message Digest and construction of the Control Message Authentication Nonce and Message Digest AVPs.

L2TPv3 Control Message Authentication is similar to L2TPv2 [RFC2661] Tunnel Authentication in its use of a shared secret and one-way hash calculation. The principal difference is that, instead of computing the hash over selected contents of a received control message (e.g., the Challenge AVP and Message Type) as in L2TPv2, the entire message is used in the hash in L2TPv3. In addition, instead of including the hash digest in just the SCCRP and SCCCN messages, it is now included in all L2TP messages.

The Control Message Authentication mechanism is optional, and may be disabled if both peers agree. For example, if IPsec is already being used for security and integrity checking between the LCCEs, the function of the L2TP mechanism becomes redundant and may be disabled.

Presence of the Control Message Authentication Nonce AVP in an SCCRQ or SCCRP message serves as an indication to a peer that Control Message Authentication is enabled. If an SCCRQ or SCCRP contains a Control Message Authentication Nonce AVP, the receiver of the message MUST respond with a Message Digest AVP in all subsequent messages sent. Control Message Authentication is always bidirectional; either both sides participate in authentication, or neither does.

If Control Message Authentication is disabled, the Message Digest AVP still MAY be sent as an integrity check of the message. The integrity check is calculated as in Section 5.4.3, with an empty zero-length shared secret, local nonce, and remote nonce. If an invalid Message Digest is received, it should be assumed that the message has been corrupted in transit and the message dropped accordingly.

Implementations MAY rate-limit control messages, particularly SCCRQ messages, upon receipt for performance reasons or for protection against denial of service attacks.

4.4. Keepalive (Hello)
L2TP employs a keepalive mechanism to detect loss of connectivity between a pair of LCCEs. This is accomplished by injecting Hello control messages (see Section 6.5) after a period of time has elapsed since the last data message or control message was received on an L2TP session or control connection, respectively. As with any other control message, if the Hello message is not reliably delivered, the sending LCCE declares that the control connection is down and resets its state for the control connection. This behavior ensures that a connectivity failure between the LCCEs is detected independently by each end of a control connection.

Since the control channel is operated in-band with data traffic over the PSN, this single mechanism can be used to infer basic data connectivity between a pair of LCCEs for all sessions associated with the control connection.

Periodic keepalive for the control connection MUST be implemented by sending a Hello if a period of time (a recommended default is 60 seconds, but MUST be configurable) has passed without receiving any message (data or control) from the peer. An LCCE sending Hello messages across multiple control connections between the same LCCE endpoints MUST employ a jittered timer mechanism to prevent grouping of Hello messages.

4.5. Forwarding Session Data Frames
Once session establishment is complete, circuit frames are received at an LCCE, encapsulated in L2TP (with appropriate attention to framing, as described in documents for the particular pseudowire type), and forwarded over the appropriate session. For every outgoing data message, the sender places the identifier specified in the Local Session ID AVP (received from the peer during session establishment) in the Session ID field of the L2TP data header. In this manner, session frames are multiplexed and demultiplexed between a given pair of LCCEs. Multiple control connections may exist between a given pair of LCCEs, and multiple sessions may be associated with a given control connection.

The peer LCCE receiving the L2TP data packet identifies the session with which the packet is associated by the Session ID in the data packet's header. The LCCE then checks the Cookie field in the data packet against the Cookie value received in the Assigned Cookie AVP during session establishment. It is important for implementers to note that the Cookie field check occurs after looking up the session context by the Session ID, and as such, consists merely of a value match of the Cookie field and that stored in the retrieved context. There is no need to perform a lookup across the Session ID and Cookie as a single value. Any received data packets that contain invalid Session IDs or associated Cookie values MUST be dropped. Finally, the LCCE either forwards the network packet within the tunneled frame (e.g., as an LNS) or switches the frame to a circuit (e.g., as an LAC).

4.6. Default L2-Specific Sublayer
This document defines a Default L2-Specific Sublayer format (see Section 3.2.2) that a pseudowire may use for features such as sequencing support, L2 interworking, OAM, or other per-data-packet operations. The Default L2-Specific Sublayer SHOULD be used by a given PW type to support these features if it is adequate, and its presence is requested by a peer during session negotiation. Alternative sublayers MAY be defined (e.g., an encapsulation with a larger Sequence Number field or timing information) and identified for use via the L2-Specific Sublayer Type AVP.

Figure 4.6: Default L2-Specific Sublayer Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |x|S|x|x|x|x|x|x|              Sequence Number                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The S (Sequence) bit is set to 1 when the Sequence Number contains a valid number for this sequenced frame. If the S bit is set to zero, the Sequence Number contents are undefined and MUST be ignored by the receiver.
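As a non-normative illustration of the layout in Figure 4.6, the Python sketch below packs and unpacks the 32-bit sublayer word. The names S_BIT, pack_default_sublayer, and unpack_default_sublayer are hypothetical. The S bit is the second most significant bit of the word, and the Sequence Number occupies the low-order 24 bits.

```python
import struct
from typing import Optional

S_BIT = 0x40000000      # bit 1 of the 32-bit word (bit 0 is the MSB)
SEQ_MASK = 0x00FFFFFF   # low-order 24 bits carry the Sequence Number


def pack_default_sublayer(seq: Optional[int] = None) -> bytes:
    """Build the sublayer; `seq=None` means the frame is not sequenced."""
    if seq is None:
        return struct.pack("!I", 0)  # S bit clear, reserved bits zero
    return struct.pack("!I", S_BIT | (seq & SEQ_MASK))


def unpack_default_sublayer(data: bytes) -> Optional[int]:
    """Return the Sequence Number, or None when the S bit is not set."""
    (word,) = struct.unpack_from("!I", data, 0)
    if word & S_BIT:
        return word & SEQ_MASK
    return None  # contents undefined; ignored by the receiver
```

Returning None rather than 0 for an unsequenced frame keeps the two cases distinct, since zero is itself a valid sequence number.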
The Sequence Number field contains a free-running counter of 2^24 sequence numbers. If the number in this field is valid, the S bit MUST be set to 1. The Sequence Number begins at zero, which is a valid sequence number. (In this way, implementations inserting sequence numbers do not have to "skip" zero when incrementing.) The sequence number in the header of a received message is considered less than or equal to the last received number if its value lies in the range of the last received number and the preceding (2^23-1) values, inclusive.

4.6.1. Sequencing Data Packets
The Sequence Number field may be used to detect lost, duplicate, or out-of-order packets within a given session.

When L2 frames are carried over an L2TP-over-IP or L2TP-over-UDP/IP data channel, this part of the link has the characteristic of being able to reorder, duplicate, or silently drop packets. Reordering may break some non-IP protocols or L2 control traffic being carried by the link. Silent dropping or duplication of packets may break protocols that assume per-packet indications of error, such as TCP header compression. While a common mechanism for packet sequence detection is provided, the sequence dependency characteristics of individual protocols are outside the scope of this document.

If any protocol being transported over L2TP data channels cannot tolerate misordering of data packets, packet duplication, or silent packet loss, sequencing may be enabled on some or all packets by using the S bit and Sequence Number field defined in the Default L2-Specific Sublayer (see Section 4.6). For a given L2TP session, each LCCE is responsible for communicating to its peer the level of sequencing support that it requires of data packets that it receives. Mechanisms to advertise this information during session negotiation are provided (see Data Sequencing AVP in Section 5.4.4).

When determining whether a packet is in or out of sequence, an implementation SHOULD utilize a method that is resilient to temporary dropouts in connectivity coupled with high per-session packet rates. The recommended method is outlined in Appendix C.

4.7. L2TPv2/v3 Interoperability and Migration
L2TPv2 and L2TPv3 environments should be able to coexist while a migration to L2TPv3 is made. Migration issues are discussed for each media type in this section. Most issues apply only to implementations that require both L2TPv2 and L2TPv3 operation. However, even L2TPv3-only implementations must at least be mindful of these issues in order to interoperate with implementations that support both versions.

4.7.1. L2TPv3 over IP
L2TPv3 implementations running strictly over IP with no desire to interoperate with L2TPv2 implementations may safely disregard most migration issues from L2TPv2. All control messages and data messages are sent as described in this document, without normative reference to RFC 2661.

If one wishes to tunnel PPP over L2TPv3, and fall back to L2TPv2 only if L2TPv3 is not available, then L2TPv3 over UDP with automatic fallback (see Section 4.7.3) MUST be used. There is no deterministic method for automatic fallback from L2TPv3 over IP to either L2TPv2 or L2TPv3 over UDP. One could infer whether L2TPv3 over IP is supported by sending an SCCRQ and waiting for a response, but this could be problematic during periods of packet loss between L2TP nodes.

4.7.2. L2TPv3 over UDP
The format of the L2TPv3 over UDP header is defined in Section 4.1.2.1.

When operating over UDP, L2TPv3 uses the same port (1701) as L2TPv2 and shares the first two octets of header format with L2TPv2. The Ver field is used to distinguish L2TPv2 packets from L2TPv3 packets. If an implementation is capable of operating in L2TPv2 or L2TPv3 modes, it is possible to automatically detect whether a peer can support L2TPv2 or L2TPv3 and operate accordingly. The details of this fallback capability are defined in the following section.

4.7.3. Automatic L2TPv2 Fallback
When running over UDP, an implementation may detect whether a peer is L2TPv3-capable by sending a special SCCRQ that is properly formatted for both L2TPv2 and L2TPv3. This is accomplished by sending an SCCRQ with its Ver field set to 2 (for L2TPv2), and ensuring that any L2TPv3-specific AVPs (i.e., AVPs present within this document and not defined within RFC 2661) in the message are sent with each M bit set to 0, and that all L2TPv2 AVPs are present as they would be for L2TPv2. This is done so that L2TPv3 AVPs will be ignored by an L2TPv2-only implementation. Note that, in both L2TPv2 and L2TPv3, the value contained in the space of the control message header utilized by the 32-bit Control Connection ID in L2TPv3, and the 16-bit Tunnel ID and 16-bit Session ID in L2TPv2, are always 0 for an SCCRQ. This effectively hides the fact that there is a pair of 16-bit fields in L2TPv2, and a single 32-bit field in L2TPv3.

If the peer implementation is L2TPv3-capable, a control message with the Ver field set to 3 and an L2TPv3 header and message format will be sent in response to the SCCRQ. Operation may then continue as L2TPv3. If a message is received with the Ver field set to 2, it must be assumed that the peer implementation is L2TPv2-only, thus enabling fallback to L2TPv2 mode to safely occur.

Note Well: The L2TPv2/v3 auto-detection mode requires that all L2TPv3 implementations over UDP be liberal in accepting an SCCRQ control message with the Ver field set to 2 or 3 and the presence of L2TPv2-specific AVPs. An L2TPv3-only implementation MUST ignore all L2TPv2 AVPs (e.g., those defined in RFC 2661 and not in this document) within an SCCRQ with the Ver field set to 2 (even if the M bit is set on the L2TPv2-specific AVPs).

5. Control Message Attribute Value Pairs
To maximize extensibility while permitting interoperability, a uniform method for encoding message types is used throughout L2TP. This encoding will be termed AVP (Attribute Value Pair) for the remainder of this document.

5.1. AVP Format
Each AVP is encoded as follows:

Figure 5.1: AVP Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |M|H| rsvd  |      Length       |           Vendor ID           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Attribute Type        |        Attribute Value ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       (until Length is reached)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The first six bits comprise a bit mask that describes the general attributes of the AVP. Two bits are defined in this document; the remaining bits are reserved for future extensions. Reserved bits MUST be set to 0 when sent and ignored upon receipt.
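As a rough illustration of the layout in Figure 5.1, the following Python sketch packs and unpacks a single AVP. The function names pack_avp and unpack_avp are hypothetical and chosen only for this example; the sketch assumes the Length field counts the 6 header octets plus the Attribute Value, as described in the field definitions for this section, and it is not a complete or normative implementation.

```python
import struct

AVP_HEADER_LEN = 6  # flags+Length (2 octets) + Vendor ID (2) + Attribute Type (2)
M_BIT = 0x8000      # bit 0 of the first 16-bit word
H_BIT = 0x4000      # bit 1 of the first 16-bit word
LENGTH_MASK = 0x03FF  # low 10 bits; the 4 reserved bits are masked away


def pack_avp(attr_type, value=b"", vendor_id=0, mandatory=False, hidden=False):
    """Encode one AVP (hypothetical helper, not part of any real library)."""
    length = AVP_HEADER_LEN + len(value)
    if length > LENGTH_MASK:
        raise ValueError("AVP does not fit in the 10-bit Length field")
    first = length | (M_BIT if mandatory else 0) | (H_BIT if hidden else 0)
    # Reserved bits are left as 0 when sending.
    return struct.pack("!HHH", first, vendor_id, attr_type) + value


def unpack_avp(buf):
    """Decode the first AVP in buf; return (fields, remaining octets)."""
    if len(buf) < AVP_HEADER_LEN:
        raise ValueError("truncated AVP header")
    first, vendor_id, attr_type = struct.unpack_from("!HHH", buf)
    length = first & LENGTH_MASK  # reserved bits are ignored on receipt
    if length < AVP_HEADER_LEN or length > len(buf):
        raise ValueError("invalid AVP Length")
    fields = {
        "mandatory": bool(first & M_BIT),
        "hidden": bool(first & H_BIT),
        "length": length,
        "vendor_id": vendor_id,
        "attr_type": attr_type,
        "value": buf[AVP_HEADER_LEN:length],
    }
    return fields, buf[length:]
```

Returning the remaining octets from unpack_avp lets a caller walk all AVPs in a control message by calling it in a loop until the buffer is empty.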
Mandatory (M) bit: Controls the behavior required of an implementation that receives an unrecognized AVP. The M bit of a given AVP MUST only be inspected and acted upon if the AVP is unrecognized (see Section 5.2).

Hidden (H) bit: Identifies the hiding of data in the Attribute Value field of an AVP. This capability can be used to avoid the passing of sensitive data, such as user passwords, as cleartext in an AVP. Section 5.3 describes the procedure for performing AVP hiding.

Length: Contains the number of octets (including the Overall Length and bit mask fields) contained in this AVP. The Length may be calculated as 6 + the length of the Attribute Value field in octets. The field itself is 10 bits, permitting a maximum of 1023 octets of data in a single AVP. The minimum Length of an AVP is 6. If the Length is 6, then the Attribute Value field is absent.

Vendor ID: The IANA-assigned "SMI Network Management Private Enterprise Codes" [RFC1700] value. The value 0, corresponding to IETF-adopted attribute values, is used for all AVPs defined within this document. Any vendor wishing to implement its own L2TP extensions can use its own Vendor ID along with private Attribute values, guaranteeing that they will not collide with any other vendor's extensions or future IETF extensions. Note that there are 16 bits allocated for the Vendor ID, thus limiting this feature to the first 65,535 enterprises.

Attribute Type: A 2-octet value with a unique interpretation across all AVPs defined under a given Vendor ID.

Attribute Value: This is the actual value as indicated by the Vendor ID and Attribute Type. It follows immediately after the Attribute Type field and runs for the remaining octets indicated in the Length (i.e., Length minus 6 octets of header). This field is absent if the Length is 6.

In the event that the 16-bit Vendor ID space is exhausted, vendor-specific AVPs with a 32-bit Vendor ID MUST be encapsulated in the following manner:
Figure 5.2: Extended Vendor ID AVP Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |M|H| rsvd  |      Length       |               0               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              58               |       32-bit Vendor ID      ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |        Attribute Type         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Attribute Value ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       (until Length is reached)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

This AVP encodes a vendor-specific AVP with a 32-bit Vendor ID space within the Attribute Value field. Multiple AVPs of this type may exist in any message. The 16-bit Vendor ID MUST be 0, indicating that this is an IETF-defined AVP, and the Attribute Type MUST be 58, indicating that what follows is a vendor-specific AVP with a 32-bit Vendor ID code. This AVP MAY be hidden (the H bit MAY be 0 or 1). The M bit for this AVP MUST be set to 0. The Length of the AVP is 12 plus the length of the Attribute Value.

5.2. Mandatory AVPs and Setting the M Bit
If the M bit is set on an AVP that is unrecognized by its recipient, the session or control connection associated with the control message containing the AVP MUST be shut down. If the control message containing the unrecognized AVP is associated with a session (e.g., an ICRQ, ICRP, ICCN, SLI, etc.), then the session MUST be issued a CDN with a Result Code of 2 and Error Code of 8 (as defined in Section 5.4.2) and shut down. If the control message containing the unrecognized AVP is associated with establishment or maintenance of a Control Connection (e.g., SCCRQ, SCCRP, SCCCN, Hello), then the associated Control Connection MUST be issued a StopCCN with Result Code of 2 and Error Code of 8 (as defined in Section 5.4.2) and shut down. If the M bit is not set on an unrecognized AVP, the AVP MUST be ignored when received, processing the control message as if the AVP were not present.

Receipt of an unrecognized AVP that has the M bit set is catastrophic to the session or control connection with which it is associated. Thus, the M bit should only be set for AVPs that are deemed crucial to proper operation of the session or control connection by the sender. AVPs that are considered crucial by the sender may vary by application and configured options. In no case shall a receiver of an AVP "validate" whether the M bit is set on a recognized AVP. If the AVP is recognized (as all AVPs defined in this document MUST be for a compliant L2TPv3 implementation), then by definition, the M bit is of no consequence.

The sender of an AVP is free to set its M bit to 1 or 0 based on whether the configured application strictly requires the value contained in the AVP to be recognized or not. For example, "Automatic L2TPv2 Fallback" in Section 4.7.3 requires the M bit on all new L2TPv3 AVPs to be set to 0 if fallback to L2TPv2 is supported and desired, and to 1 if not.

The M bit is useful as extra assurance for support of critical AVP extensions. However, more explicit methods may be available to determine support for a given feature rather than using the M bit alone. For example, if a new AVP is defined in a message for which there is always a message reply (i.e., an ICRQ, ICRP, SCCRQ, or SCCRP message), rather than simply sending an AVP in the message with the M bit set, availability of the extension may be identified by sending an AVP in the request message and expecting a corresponding AVP in a reply message. This more explicit method, when possible, is preferred.

The M bit also plays a role in determining whether or not a malformed or out-of-range value within an AVP should be ignored or should result in termination of a session or control connection (see Section 7.1 for more details).

5.3. Hiding of AVP Attribute Values
The H bit in the header of each AVP provides a mechanism to indicate to the receiving peer whether the contents of the AVP are hidden or present in cleartext. This feature can be used to hide sensitive control message data such as user passwords, IDs, or other vital information. The H bit MUST only be set if (1) a shared secret exists between the LCCEs and (2) Control Message Authentication is enabled (see Section 4.3). If the H bit is set in any AVP(s) in a given control message, at least one Random Vector AVP must also be present in the message and MUST precede the first AVP having an H bit of 1.
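The ordering rule above (a Random Vector AVP must precede the first hidden AVP) could be checked along the following lines. This is only a sketch: the function name is hypothetical, each AVP is represented as a simple (vendor_id, attr_type, hidden) tuple, and the Attribute Type value 36 for the Random Vector AVP is an assumption taken from outside this section, so it should be confirmed against the AVP summary before use.

```python
# Assumption: Attribute Type of the Random Vector AVP (not stated in this section).
RANDOM_VECTOR_TYPE = 36


def hidden_avps_well_ordered(avps):
    """Return True if every hidden AVP is preceded by a Random Vector AVP.

    avps is an iterable of (vendor_id, attr_type, hidden) tuples given in
    the order in which the AVPs appear in the control message.
    """
    seen_random_vector = False
    for vendor_id, attr_type, hidden in avps:
        if vendor_id == 0 and attr_type == RANDOM_VECTOR_TYPE:
            seen_random_vector = True
        elif hidden and not seen_random_vector:
            return False
    return True
```

The check does not cover the other two preconditions (a shared secret and Control Message Authentication), which depend on local configuration rather than on message contents.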
The shared secret between LCCEs is used to derive a unique shared key for hiding and unhiding calculations. The derived shared key is obtained via an HMAC-MD5 keyed hash [RFC2104], with the key consisting of the shared secret, and with the data being hashed consisting of a single octet containing the value 1.

   shared_key = HMAC_MD5 (shared_secret, 1)

Hiding an AVP value is done in several steps. The first step is to take the length and value fields of the original (cleartext) AVP and encode them into the Hidden AVP Subformat, which appears as follows:

Figure 5.3: Hidden AVP Subformat

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Length of Original Value    |   Original Attribute Value ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         ...                       |             Padding ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Length of Original Attribute Value: This is the length, in octets, of the Original Attribute Value to be obscured. It is needed to recover the original length of the Attribute Value, which is otherwise lost when the additional Padding is added.

Original Attribute Value: Attribute Value that is to be obscured.

Padding: Random additional octets used to obscure the length of the Attribute Value that is being hidden.

To mask the size of the data being hidden, the resulting subformat MAY be padded as shown above. Padding does NOT alter the value placed in the Length of Original Attribute Value field, but does alter the length of the resultant AVP that is being created. For example, if an Attribute Value to be hidden is 4 octets in length, the unhidden AVP length would be 10 octets (6 + Attribute Value length). After hiding, the length of the AVP would become 6 + Attribute Value length + size of the Length of Original Attribute Value field + Padding. Thus, if Padding is 12 octets, the AVP length would be 6 + 4 + 2 + 12 = 24 octets.
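The key derivation and the length arithmetic in the example above can be sketched in Python as follows. The function names are hypothetical and used only for illustration; the sketch assumes the shared secret is available as a byte string and that the single data octet with value 1 is the byte 0x01.

```python
import hashlib
import hmac
import os
import struct


def derive_shared_key(shared_secret: bytes) -> bytes:
    """shared_key = HMAC_MD5(shared_secret, 1), with 1 taken as one octet."""
    return hmac.new(shared_secret, b"\x01", hashlib.md5).digest()


def build_hidden_subformat(value: bytes, pad_len: int = 0) -> bytes:
    """2-octet original length, the original value, then random padding."""
    return struct.pack("!H", len(value)) + value + os.urandom(pad_len)


# Worked example from the text: a 4-octet value with 12 octets of padding.
subformat = build_hidden_subformat(b"\xde\xad\xbe\xef", pad_len=12)
hidden_avp_length = 6 + len(subformat)  # 6 + 4 + 2 + 12 = 24
```

The subformat built here is still cleartext; it becomes the input to the MD5-based XOR step that the remainder of this section describes.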
Next, an MD5 [RFC1321] hash is performed (in network byte order) on the concatenation of the following:

   + the 2-octet Attribute number of the AVP
   + the shared key
   + an arbitrary length random vector

The value of the random vector used in this hash is passed in the value field of a Random Vector AVP. This Random Vector AVP must be placed in the message by the sender before any hidden AVPs. The same random vector may be used for more than one hidden AVP in the same message, but not for hiding two or more instances of an AVP with the same Attribute Type unless the Attribute Values in the two AVPs are also identical. When a different random vector is used for the hiding of subsequent AVPs, a new Random Vector AVP MUST be placed in the control message before the first AVP to which it applies.

The MD5 hash value is then XORed with the first 16-octet (or less) segment of the Hidden AVP Subformat and placed in the Attribute Value field of the Hidden AVP. If the Hidden AVP Subformat is less than 16 octets, the Subformat is transformed as if the Attribute Value field had been padded to 16 octets before the XOR. Only the actual octets present in the Subformat are modified, and the length of the AVP is not altered.

If the Subformat is longer than 16 octets, a second one-way MD5 hash is calculated over a stream of octets consisting of the shared key followed by the result of the first XOR. That hash is XORed with the second 16-octet (or less) segment of the Subformat and placed in the corresponding octets of the Value field of the Hidden AVP.

If necessary, this operation is repeated, with the shared key used along with each XOR result to generate the next hash to XOR the next segment of the value with.

The hiding method was adapted from [RFC2865], which was taken from the "Mixing in the Plaintext" section in the book "Network Security" by Kaufman, Perlman and Speciner [KPS].
A detailed explanation of the method follows: Call the shared key S, the Random Vector RV, and the Attribute Type A. Break the value field into 16-octet chunks p_1, p_2, etc., with the last one padded at the end with random data to a 16-octet boundary. Call the ciphertext blocks c_1, c_2, etc. We will also define intermediate values b_1, b_2, etc.
   b_1 = MD5 (A + S + RV)      c_1 = p_1 xor b_1
   b_2 = MD5 (S + c_1)         c_2 = p_2 xor b_2
               .                       .
               .                       .
               .                       .
   b_i = MD5 (S + c_(i-1))     c_i = p_i xor b_i

The String will contain c_1 + c_2 +...+ c_i, where "+" denotes concatenation.

On receipt, the random vector is taken from the last Random Vector AVP encountered in the message prior to the AVP to be unhidden. The above process is then reversed to yield the original value.
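The block computation above, and its reversal on receipt, might look like the following Python sketch. The names hide and unhide are hypothetical; the sketch assumes the input to hide is an already-built Hidden AVP Subformat (2-octet original length, value, optional padding) and that A is encoded as a 2-octet big-endian integer. It illustrates the chaining only and omits all surrounding AVP and message handling.

```python
import hashlib


def _md5(data: bytes) -> bytes:
    return hashlib.md5(data).digest()


def hide(subformat: bytes, attr_type: int, shared_key: bytes, rv: bytes) -> bytes:
    """Compute c_1 + c_2 + ... + c_i for the given subformat octets."""
    out = bytearray()
    b = _md5(attr_type.to_bytes(2, "big") + shared_key + rv)  # b_1
    for i in range(0, len(subformat), 16):
        p = subformat[i:i + 16]
        # zip() stops at the shorter input, so a short final chunk is
        # XORed octet by octet and the overall length is unchanged.
        c = bytes(x ^ y for x, y in zip(p, b))
        out += c
        b = _md5(shared_key + c)  # next b_i uses the previous ciphertext
    return bytes(out)


def unhide(hidden: bytes, attr_type: int, shared_key: bytes, rv: bytes) -> bytes:
    """Reverse hide() and strip the length prefix and any padding."""
    out = bytearray()
    b = _md5(attr_type.to_bytes(2, "big") + shared_key + rv)
    for i in range(0, len(hidden), 16):
        c = hidden[i:i + 16]
        out += bytes(x ^ y for x, y in zip(c, b))
        b = _md5(shared_key + c)
    if len(out) < 2:
        raise ValueError("hidden value too short for a length prefix")
    original_len = int.from_bytes(out[:2], "big")
    if original_len > len(out) - 2:
        raise ValueError("original length exceeds hidden value size")
    return bytes(out[2:2 + original_len])
```

Because each b_i depends on the previous ciphertext block rather than the previous plaintext, the receiver can recompute every b_i directly from the received octets, which is why unhide mirrors hide so closely.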