3.5. Pad TLV
The value part of the Pad TLV contains a variable number (>= 1) of octets. The first octet takes values from the following table; all the other octets (if any) are ignored. The receiver SHOULD verify that the TLV is received in its entirety, but otherwise ignores the contents of this TLV, apart from the first octet. Value Meaning ----- ------- 0 Reserved 1 Drop Pad TLV from reply 2 Copy Pad TLV to reply 3-250 Unassigned 251-254 Reserved for Experimental Use 255 Reserved The Pad TLV can be added to an echo request to create a message of a specific length in cases where messages of various sizes are needed for troubleshooting. The first octet allows for controlling the inclusion of this additional padding in the respective echo reply.3.6. Vendor Enterprise Number
"Private Enterprise Numbers" [IANA-ENT] are maintained by IANA. The Length of this TLV is always 4; the value is the Structure of Management Information (SMI) Private Enterprise Code, in network octet order, of the vendor with a Vendor Private extension to any of the fields in the fixed part of the message, in which case this TLV MUST be present. If none of the fields in the fixed part of the message have Vendor Private extensions, inclusion of this TLV is OPTIONAL. Vendor Private ranges for Message Types, Reply Modes, and Return Codes have been defined. When any of these are used, the Vendor Enterprise Number TLV MUST be included in the message.
3.7. Interface and Label Stack
The Interface and Label Stack TLV MAY be included in a reply message to report the interface on which the request message was received and the label stack that was on the packet when it was received. Only one such object may appear. The purpose of the object is to allow the upstream router to obtain the exact interface and label stack information as it appears at the replying LSR. The Length is K + 4*N octets; N is the number of labels in the label stack. Values for K are found in the description of Address Type below. The Value field of this TLV has the following format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Address Type | Must Be Zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IP Address (4 or 16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Interface (4 or 16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . . . . . Label Stack . . . . . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Address Type The Address Type indicates if the interface is numbered or unnumbered. It also determines the length of the IP Address and Interface fields. The resulting total for the initial part of the TLV is listed in the table below as "K Octets". The Address Type is set to one of the following values: Type # Address Type K Octets ------ ------------ -------- 0 Reserved 4 1 IPv4 Numbered 12 2 IPv4 Unnumbered 12 3 IPv6 Numbered 36 4 IPv6 Unnumbered 24 5-250 Unassigned 251-254 Reserved for Experimental Use 255 Reserved
IP Address and Interface IPv4 addresses and interface indices are encoded in 4 octets; IPv6 addresses are encoded in 16 octets. If the interface upon which the echo request message was received is numbered, then the Address Type MUST be set to IPv4 or IPv6, the IP Address MUST be set to either the LSR's Router ID or the interface address, and the Interface MUST be set to the interface address. If the interface is unnumbered, the Address Type MUST be either IPv4 Unnumbered or IPv6 Unnumbered, the IP Address MUST be the LSR's Router ID, and the Interface MUST be set to the index assigned to the interface. Label Stack The label stack of the received echo request message. If any TTL values have been changed by this router, they SHOULD be restored.3.8. Errored TLVs
The following TLV is a TLV that MAY be included in an echo reply to inform the sender of an echo request of mandatory TLVs either not supported by an implementation or parsed and found to be in error. The Value field contains the TLVs that were not understood, encoded as sub-TLVs. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type = 9 | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Value | . . . . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3.9. Reply TOS Octet TLV
This TLV MAY be used by the originator of the echo request to request that an echo reply be sent with the IP header Type of Service (TOS) octet set to the value specified in the TLV. This TLV has a length of 4 with the following Value field. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reply-TOS Byte| Must Be Zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+4. Theory of Operation
An MPLS echo request is used to test a particular LSP. The LSP to be tested is identified by the "FEC Stack"; for example, if the LSP was set up via LDP, and a label is mapped to an egress IP address of 198.51.100.1, the FEC Stack contains a single element, namely, an LDP IPv4 prefix sub-TLV with value 198.51.100.1/32. If the LSP being tested is an RSVP LSP, the FEC Stack consists of a single element that captures the RSVP Session and Sender Template that uniquely identifies the LSP. FEC Stacks can be more complex. For example, one may wish to test a VPN IPv4 prefix of 203.0.113.0/24 that is tunneled over an LDP LSP with egress 192.0.2.1. The FEC Stack would then contain two sub-TLVs, the bottom being a VPN IPv4 prefix, and the top being an LDP IPv4 prefix. If the underlying (LDP) tunnel were not known, or was considered irrelevant, the FEC Stack could be a single element with just the VPN IPv4 sub-TLV. When an MPLS echo request is received, the receiver is expected to verify that the control plane and data plane are both healthy (for the FEC Stack being pinged), and that the two planes are in sync. The procedures for this are in Section 4.4.4.1. Dealing with Equal-Cost Multipath (ECMP)
LSPs need not be simple point-to-point tunnels. Frequently, a single LSP may originate at several ingresses and terminate at several egresses; this is very common with LDP LSPs. LSPs for a given FEC may also have multiple "next hops" at transit LSRs. At an ingress, there may also be several different LSPs to choose from to get to the desired endpoint. Finally, LSPs may have backup paths, detour paths, and other alternative paths to take should the primary LSP go down.
Regarding the last two points stated above: it is assumed that the LSR sourcing MPLS echo requests can force the echo request into any desired LSP, so choosing among multiple LSPs at the ingress is not an issue. The problem of probing the various flavors of backup paths that will typically not be used for forwarding data unless the primary LSP is down will not be addressed here. Since the actual LSP and path that a given packet may take may not be known a priori, it is useful if MPLS echo requests can exercise all possible paths. This, although desirable, may not be practical because the algorithms that a given LSR uses to distribute packets over alternative paths may be proprietary. To achieve some degree of coverage of alternate paths, there is a certain latitude in choosing the destination IP address and source UDP port for an MPLS echo request. This is clearly not sufficient; in the case of traceroute, more latitude is offered by means of the Multipath Information of the Downstream Detailed Mapping TLV. This is used as follows. An ingress LSR periodically sends an LSP traceroute message to determine whether there are multipaths for a given LSP. If so, each hop will provide some information as to how each of its downstream paths can be exercised. The ingress can then send MPLS echo requests that exercise these paths. If several transit LSRs have ECMP, the ingress may attempt to compose these to exercise all possible paths. However, full coverage may not be possible.4.2. Testing LSPs That Are Used to Carry MPLS Payloads
To detect certain LSP breakages, it may be necessary to encapsulate an MPLS echo request packet with at least one additional label when testing LSPs that are used to carry MPLS payloads (such as LSPs used to carry L2VPN and L3VPN traffic. For example, when testing LDP or RSVP-TE LSPs, just sending an MPLS echo request packet may not detect instances where the router immediately upstream of the destination of the LSP ping may forward the MPLS echo request successfully over an interface not configured to carry MPLS payloads because of the use of penultimate hop popping. Since the receiving router has no means to ascertain whether the IP packet was sent unlabeled or implicitly labeled, the addition of labels shimmed above the MPLS echo request (using the Nil FEC) will prevent a router from forwarding such a packet out to unlabeled interfaces.
4.3. Sending an MPLS Echo Request
An MPLS echo request is a UDP packet. The IP header is set as follows: the source IP address is a routable address of the sender; the destination IP address is a (randomly chosen) IPv4 address from the range 127/8 or an IPv6 address from the range 0:0:0:0:0:FFFF:7F00:0/104. The IP TTL is set to 1. The source UDP port is chosen by the sender; the destination UDP port is set to 3503 (assigned by IANA for MPLS echo requests). The Router Alert IP Option of value 0x0 [RFC2113] for IPv4 or value 69 [RFC7506] for IPv6 MUST be set in the IP header. An MPLS echo request is sent with a label stack corresponding to the FEC Stack being tested. Note that further labels could be applied if, for example, the normal route to the topmost FEC in the stack is via a Traffic Engineered Tunnel [RFC3209]. If all of the FECs in the stack correspond to Implicit Null labels, the MPLS echo request is considered unlabeled even if further labels will be applied in sending the packet. If the echo request is labeled, one MAY (depending on what is being pinged) set the TTL of the innermost label to 1, to prevent the ping request going farther than it should. Examples of where this SHOULD be done include pinging a VPN IPv4 or IPv6 prefix, an L2 VPN endpoint, or a pseudowire. Preventing the ping request from going too far can also be accomplished by inserting a Router Alert label above this label; however, this may lead to the undesired side effect that MPLS echo requests take a different data path than actual data. For more information on how these mechanisms can be used for pseudowire connectivity verification, see [RFC5085][RFC5885]. In "ping" mode (end-to-end connectivity check), the TTL in the outermost label is set to 255. In "traceroute" mode (fault isolation mode), the TTL is set successively to 1, 2, and so on. The sender chooses a Sender's Handle and a Sequence Number. When sending subsequent MPLS echo requests, the sender SHOULD increment the Sequence Number by 1. However, a sender MAY choose to send a group of echo requests with the same Sequence Number to improve the chance of arrival of at least one packet with that Sequence Number. The TimeStamp Sent is set to the time of day in NTP format that the echo request is sent. The TimeStamp Received is set to zero. An MPLS echo request MUST have a FEC Stack TLV. Also, the Reply Mode must be set to the desired Reply Mode; the Return Code and Subcode are set to zero. In the "traceroute" mode, the echo request SHOULD include a Downstream Detailed Mapping TLV.
4.4. Receiving an MPLS Echo Request
Sending an MPLS echo request to the control plane is triggered by one of the following packet processing exceptions: Router Alert option, IP TTL expiration, MPLS TTL expiration, MPLS Router Alert label, or the destination address in the 127/8 address range. The control plane further identifies it by UDP destination port 3503. For reporting purposes, the bottom of the stack is considered to be a stack-depth of 1. This is to establish an absolute reference for the case where the actual stack may have more labels than there are FECs in the Target FEC Stack. Furthermore, in all the Return Codes listed in this document, a stack-depth of 0 means "no value specified". This allows compatibility with existing implementations that do not use the Return Subcode field. An LSR X that receives an MPLS echo request then processes it as follows. 1. General packet sanity is verified. If the packet is not well- formed, LSR X SHOULD send an MPLS echo reply with the Return Code set to "Malformed echo request received" and the Subcode set to zero. If there are any TLVs not marked as "Ignore" (i.e., if the TLV type is less than 32768, see Section 3) that LSR X does not understand, LSR X SHOULD send an MPLS "TLV not understood" (as appropriate), and set the Subcode to zero. In the latter case, the misunderstood TLVs (only) are included as sub-TLVs in an Errored TLVs TLV in the reply. The header field's Sender's Handle, Sequence Number, and Timestamp Sent are not examined but are included in the MPLS echo reply message. The algorithm uses the following variables and identifiers: Interface-I: the interface on which the MPLS echo request was received. Stack-R: the label stack on the packet as it was received. Stack-D: the label stack carried in the "Label stack sub-TLV" in the Downstream Detailed Mapping TLV (not always present). Label-L: the label from the actual stack currently being examined. Requires no initialization.
Label-stack-depth: the depth of the label being verified. Initialized to the number of labels in the received label stack S. FEC-stack-depth: depth of the FEC in the Target FEC Stack that should be used to verify the current actual label. Requires no initialization. Best-return-code: contains the Return Code for the echo reply packet as currently best known. As the algorithm progresses, this code may change depending on the results of further checks that it performs. Best-rtn-subcode: similar to Best-return-code, but for the echo reply Subcode. FEC-status: result value returned by the FEC Checking algorithm described in Section 4.4.1. /* Save receive context information */ 2. If the echo request is good, LSR X stores the interface over which the echo was received in Interface-I, and the label stack with which it came in Stack-R. /* The rest of the algorithm iterates over the labels in Stack-R, verifies validity of label values, reports associated label switching operations (for traceroute), verifies correspondence between the Stack-R and the Target FEC Stack description in the body of the echo request, and reports any errors. */ /* The algorithm iterates as follows. */ 3. Label Validation: If Label-stack-depth is 0 { /* The LSR needs to report that it is a tail end for the LSP */ Set FEC-stack-depth to 1, set Label-L to 3 (Implicit Null). Set Best-return-code to 3 ("Replying router is an egress for the FEC at stack-depth"), set Best-rtn-subcode to the value of FEC-stack-depth (1), and go to step 5 (Egress Processing). } /* This step assumes there is always an entry for well-known label values */
Set Label-L to the value extracted from Stack-R at depth Label-stack-depth. Look up Label-L in the Incoming Label Map (ILM) to determine if the label has been allocated and an operation is associated with it. If there is no entry for Label-L { /* Indicates a temporary or permanent label synchronization problem, and the LSR needs to report an error */ Set Best-return-code to 11 ("No label entry at stack-depth") and Best-rtn-subcode to Label-stack-depth. Go to step 7 (Send Reply Packet). } Else { Retrieve the associated label operation from the corresponding Next Hop Label Forwarding Entry (NHLFE), and proceed to step 4 (Label Operation Check). } 4. Label Operation Check If the label operation is "Pop and Continue Processing" { /* Includes Explicit Null and Router Alert label cases */ Iterate to the next label by decrementing Label-stack-depth, and loop back to step 3 (Label Validation). } If the label operation is "Swap or Pop and Switch based on Popped Label" { Set Best-return-code to 8 ("Label switched at stack-depth") and Best-rtn-subcode to Label-stack-depth to report transit switching. If a Downstream Detailed Mapping TLV is present in the received echo request { If the IP address in the TLV is 127.0.0.1 or 0::1 {
Set Best-return-code to 6 ("Upstream Interface Index Unknown"). An Interface and Label Stack TLV SHOULD be included in the reply and filled with Interface-I and Stack-R. } Else { Verify that the IP address, interface address, and label stack in the Downstream Detailed Mapping TLV match Interface-I and Stack-R. If there is a mismatch, set Best-return-code to 5, "Downstream Mapping Mismatch". An Interface and Label Stack TLV SHOULD be included in the reply and filled in based on Interface-I and Stack-R. Go to step 7 (Send Reply Packet). } } For each available downstream ECMP path { Retrieve output interface from the NHLFE entry. /* Note: this Return Code is set even if Label-stack-depth is one */ If the output interface is not MPLS enabled { Set Best-return-code to Return Code 9, "Label switched but no MPLS forwarding at stack-depth" and set Best-rtn-subcode to Label-stack-depth and go to step 7 (Send Reply Packet). } If a Downstream Detailed Mapping TLV is present { A Downstream Detailed Mapping TLV SHOULD be included in the echo reply (see Section 3.4) filled in with information about the current ECMP path. } }
If no Downstream Detailed Mapping TLV is present, or the Downstream IP Address is set to the ALLROUTERS multicast address, go to step 7 (Send Reply Packet). If the "Validate FEC Stack" flag is not set and the LSR is not configured to perform FEC checking by default, go to step 7 (Send Reply Packet). /* Validate the Target FEC Stack in the received echo request. First determine FEC-stack-depth from the Downstream Detailed Mapping TLV. This is done by walking through Stack-D (the Downstream labels) from the bottom, decrementing the number of labels for each non-Implicit Null label, while incrementing FEC-stack-depth for each label. If the Downstream Detailed Mapping TLV contains one or more Implicit Null labels, FEC-stack-depth may be greater than Label-stack-depth. To be consistent with the above stack-depths, the bottom is considered to be entry 1. */ Set FEC-stack-depth to 0. Set i to Label-stack-depth. While (i > 0) do { ++FEC-stack-depth. if Stack-D [ FEC-stack-depth ] != 3 (Implicit Null) --i. } If the number of FECs in the FEC stack is greater than or equal to FEC-stack-depth { Perform the FEC Checking procedure (see Section 4.4.1). If FEC-status is 2, set Best-return-code to 10 ("Mapping for this FEC is not the given label at stack-depth"). If the Return Code is 1, set Best-return-code to FEC-return-code and Best-rtn-subcode to FEC-stack-depth. } Go to step 7 (Send Reply Packet). }
5. Egress Processing: /* These steps are performed by the LSR that identified itself as the tail-end LSR for an LSP. */ If the received echo request contains no Downstream Detailed Mapping TLV, or the Downstream IP Address is set to 127.0.0.1 or 0::1, go to step 6 (Egress FEC Validation). Verify that the IP address, interface address, and label stack in the Downstream Detailed Mapping TLV match Interface-I and Stack-R. If not, set Best-return-code to 5, "Downstream Mapping Mismatch". A Received Interface and Label Stack TLV SHOULD be created for the echo response packet. Go to step 7 (Send Reply Packet). 6. Egress FEC Validation: /* This is a loop for all entries in the Target FEC Stack starting with FEC-stack-depth. */ Perform FEC checking by following the algorithm described in Section 4.4.1 for Label-L and the FEC at FEC-stack-depth. Set Best-return-code to FEC-code and Best-rtn-subcode to the value in FEC-stack-depth. If FEC-status (the result of the check) is 1, go to step 7 (Send Reply Packet). /* Iterate to the next FEC entry */ ++FEC-stack-depth. If FEC-stack-depth > the number of FECs in the FEC-stack, go to step 7 (Send Reply Packet). If FEC-status is 0 { ++Label-stack-depth. If Label-stack-depth > the number of labels in Stack-R, go to step 7 (Send Reply Packet). Label-L = extracted label from Stack-R at depth Label-stack-depth. Loop back to step 6 (Egress FEC Validation). }
7. Send Reply Packet: Send an MPLS echo reply with a Return Code of Best-return-code and a Return Subcode of Best-rtn-subcode. Include any TLVs created during the above process. The procedures for sending the echo reply are found in Section 4.5.4.4.1. FEC Validation
/* This section describes validation of a FEC entry within the Target FEC Stack and accepts a FEC, Label-L, and Interface-I. If the outermost FEC of the Target FEC stack is the Nil FEC, then the node MUST skip the Target FEC validation completely. This is to support FEC hiding, in which the outer hidden FEC can be the Nil FEC. Else, the algorithm performs the following steps. */ 1. Two return values, FEC-status and FEC-return-code, are initialized to 0. 2. If the FEC is the Nil FEC { If Label-L is either Explicit_Null or Router_Alert, return. Else { Set FEC-return-code to 10 ("Mapping for this FEC is not the given label at stack-depth"). Set FEC-status to 1 Return. } } 3. Check the FEC label mapping that describes how traffic received on the LSP is further switched or which application it is associated with. If no mapping exists, set FEC-return-code to Return 4, "Replying router has no mapping for the FEC at stack- depth". Set FEC-status to 1. Return. 4. If the label mapping for FEC is Implicit Null, set FEC-status to 2 and proceed to step 5. Otherwise, if the label mapping for FEC is Label-L, proceed to step 5. Otherwise, set FEC-return-code to 10 ("Mapping for this FEC is not the given label at stack- depth"), set FEC-status to 1, and return.
5. This is a protocol check. Check what protocol would be used to advertise the FEC. If it can be determined that no protocol associated with Interface-I would have advertised a FEC of that FEC-Type, set FEC-return-code to 12 ("Protocol not associated with interface at FEC stack-depth"). Set FEC-status to 1. 6. Return.4.5. Sending an MPLS Echo Reply
An MPLS echo reply is a UDP packet. It MUST ONLY be sent in response to an MPLS echo request. The source IP address is a routable address of the replier; the source port is the well-known UDP port for LSP ping. The destination IP address and UDP port are copied from the source IP address and UDP port of the echo request. The IP TTL is set to 255. If the Reply Mode in the echo request is "Reply via an IPv4 UDP packet with Router Alert", then the IP header MUST contain the Router Alert IP Option of value 0x0 [RFC2113] for IPv4 or 69 [RFC7506] for IPv6. If the reply is sent over an LSP, the topmost label MUST in this case be the Router Alert label (1) (see [RFC3032]). The format of the echo reply is the same as the echo request. The Sender's Handle, the Sequence Number, and TimeStamp Sent are copied from the echo request; the TimeStamp Received is set to the time of day that the echo request is received (note that this information is most useful if the time-of-day clocks on the requester and the replier are synchronized). The FEC Stack TLV from the echo request MAY be copied to the reply. The replier MUST fill in the Return Code and Subcode, as determined in the previous section. If the echo request contains a Pad TLV, the replier MUST interpret the first octet for instructions regarding how to reply. If the replying router is the destination of the FEC, then Downstream Detailed Mapping TLVs SHOULD NOT be included in the echo reply. If the echo request contains a Downstream Detailed Mapping TLV, and the replying router is not the destination of the FEC, the replier SHOULD compute its downstream routers and corresponding labels for the incoming label and add Downstream Detailed Mapping TLVs for each one to the echo reply it sends back. A replying node should follow the procedures defined in Section 4.5.1 if there is a FEC stack change due to tunneled LSP. If the FEC stack change is due to stitched LSP, it should follow the procedures defined in Section 4.5.2.
If the Downstream Detailed Mapping TLV contains Multipath Information requiring more processing than the receiving router is willing to perform, the responding router MAY choose to respond with only a subset of multipaths contained in the echo request Downstream Detailed Mapping. (Note: The originator of the echo request MAY send another echo request with the Multipath Information that was not included in the reply.) Except in the case of Reply Mode 4, "Reply via application-level control channel", echo replies are always sent in the context of the IP/MPLS network.4.5.1. Addition of a New Tunnel
A transit node knows when the FEC being traced is going to enter a tunnel at that node. Thus, it knows about the new outer FEC. All transit nodes that are the origination point of a new tunnel SHOULD add the FEC stack change sub-TLV (Section 3.4.1.3) to the Downstream Detailed Mapping TLV in the echo reply. The transit node SHOULD add one FEC stack change sub-TLV of operation type PUSH, per new tunnel being originated at the transit node. A transit node that sends a Downstream FEC stack change sub-TLV in the echo reply SHOULD fill the address of the remote peer, which is the peer of the current LSP being traced. If the transit node does not know the address of the remote peer, it MUST set the address type to Unspecified. The Label Stack sub-TLV MUST contain one additional label per FEC being PUSHed. The label MUST be encoded as defined in Section 3.4.1.2. The label value MUST be the value used to switch the data traffic. If the tunnel is a transparent pipe to the node, i.e., the data-plane trace will not expire in the middle of the new tunnel, then a FEC stack change sub-TLV SHOULD NOT be added, and the Label Stack sub-TLV SHOULD NOT contain a label corresponding to the hidden tunnel. If the transit node wishes to hide the nature of the tunnel from the ingress of the echo request, then it MAY not want to send details about the new tunnel FEC to the ingress. In such a case, the transit node SHOULD use the Nil FEC. The echo reply would then contain a FEC stack change sub-TLV with operation type PUSH and a Nil FEC. The value of the label in the Nil FEC MUST be set to zero. The remote peer address type MUST be set to Unspecified. The transit node SHOULD add one FEC stack change sub-TLV of operation type PUSH, per new tunnel being originated at the transit node. The Label Stack sub-TLV MUST contain one additional label per FEC being PUSHed. The label value MUST be the value used to switch the data traffic.
4.5.2. Transition between Tunnels
A transit node stitching two LSPs SHOULD include two FEC stack change sub-TLVs. One with a pop operation for the old FEC (ingress) and one with the PUSH operation for the new FEC (egress). The replying node SHOULD set the Return Code to "Label switched with FEC change" to indicate change in the FEC being traced. If the replying node wishes to perform FEC hiding, it SHOULD respond back with two FEC stack change sub-TLVs, one pop followed by one PUSH. The pop operation MAY either exclude the FEC TLV (by setting the FEC TLV length to 0) or set the FEC TLV to contain the LDP FEC. The PUSH operation SHOULD have the FEC TLV containing the Nil FEC. The Return Code SHOULD be set to "Label switched with FEC change". If the replying node wishes to perform FEC hiding, it MAY choose to not send any FEC stack change sub-TLVs in the echo reply if the number of labels does not change for the downstream node and the FEC type also does not change (Nil FEC). In such case, the replying node MUST NOT set the Return Code to "Label switched with FEC change".4.6. Receiving an MPLS Echo Reply
An LSR X should only receive an MPLS echo reply in response to an MPLS echo request that it sent. Thus, on receipt of an MPLS echo reply, X should parse the packet to ensure that it is well-formed, then attempt to match up the echo reply with an echo request that it had previously sent, using the destination UDP port and the Sender's Handle. If no match is found, then X jettisons the echo reply; otherwise, it checks the Sequence Number to see if it matches. If the echo reply contains Downstream Detailed Mappings, and X wishes to traceroute further, it SHOULD copy the Downstream Detailed Mapping(s) into its next echo request(s) (with TTL incremented by one). If one or more FEC stack change sub-TLVs are received in the MPLS echo reply, the ingress node SHOULD process them and perform some validation. The FEC stack changes are associated with a downstream neighbor and along a particular path of the LSP. Consequently, the ingress will need to maintain a FEC stack per path being traced (in case of multipath). All changes to the FEC stack resulting from the processing of a FEC stack change sub-TLV(s) should be applied only for the path along a given downstream neighbor. The following algorithm should be followed for processing FEC stack change sub-TLVs.
push_seen = FALSE fec_stack_depth = current-depth-of-fec-stack-being-traced saved_fec_stack = current_fec_stack while (sub-tlv = get_next_sub_tlv(downstream_detailed_map_tlv)) if (sub-tlv == NULL) break if (sub-tlv.type == FEC-Stack-Change) { if (sub-tlv.operation == POP) { if (push_seen) { Drop the echo reply current_fec_stack = saved_fec_stack return } if (fec_stack_depth == 0) { Drop the echo reply current_fec_stack = saved_fec_stack return } Pop FEC from FEC stack being traced fec_stack_depth--; } if (sub-tlv.operation == PUSH) { push_seen = 1 Push FEC on FEC stack being traced fec_stack_depth++; } } } if (fec_stack_depth == 0) { Drop the echo reply current_fec_stack = saved_fec_stack return } The next MPLS echo request along the same path should use the modified FEC stack obtained after processing the FEC stack change sub-TLVs. A non-Nil FEC guarantees that the next echo request along the same path will have the Downstream Detailed Mapping TLV validated for IP address, interface address, and label stack mismatches.
If the top of the FEC stack is a Nil FEC and the MPLS echo reply does not contain any FEC stack change sub-TLVs, then it does not necessarily mean that the LSP has not started traversing a different tunnel. It could be that the LSP associated with the Nil FEC terminated at a transit node, and at the same time, a new LSP started at the same transit node. The Nil FEC would now be associated with the new LSP (and the ingress has no way of knowing this). Thus, it is not possible to build an accurate hierarchical LSP topology if a traceroute contains Nil FECs. A reply from a downstream node with Return Code 3, may not necessarily be for the FEC being traced. It could be for one of the new FECs that was added. On receipt of an IS_EGRESS reply, the LSP ingress should check if the depth of Target FEC sent to the node that just responded was the same as the depth of the FEC that was being traced. If it was not, then it should pop an entry from the Target FEC stack and resend the request with the same TTL (as previously sent). The process of popping a FEC is to be repeated until either the LSP ingress receives a non-IS_EGRESS reply or until all the additional FECs added to the FEC stack have already been popped. Using an IS_EGRESS reply, an ingress can build a map of the hierarchical LSP structure traversed by a given FEC. When the MPLS echo reply Return Code is "Label switched with FEC change", the ingress node SHOULD manipulate the FEC stack as per the FEC stack change sub-TLVs contained in the Downstream Detailed Mapping TLV. A transit node can use this Return Code for stitched LSPs and for hierarchical LSPs. In case of ECMP or P2MP, there could be multiple paths and Downstream Detailed Mapping TLVs with different Return Codes (see Section 3.1, Note 2). The ingress node should build the topology based on the Return Code per ECMP path/P2MP branch.4.7. Issue with VPN IPv4 and IPv6 Prefixes
Typically, an LSP ping for a VPN IPv4 prefix or VPN IPv6 prefix is sent with a label stack of depth greater than 1, with the innermost label having a TTL of 1. This is to terminate the ping at the egress PE, before it gets sent to the customer device. However, under certain circumstances, the label stack can shrink to a single label before the ping hits the egress PE; this will result in the ping terminating prematurely. One such scenario is a multi-AS Carrier's Carrier VPN. To get around this problem, one approach is for the LSR that receives such a ping to realize that the ping terminated prematurely and to send back Return Code 13. In that case, the initiating LSR can retry
the ping after incrementing the TTL on the VPN label. In this fashion, the ingress LSR will sequentially try TTL values until it finds one that allows the VPN ping to reach the egress PE.4.8. Non-compliant Routers
If the egress for the FEC Stack being pinged does not support LSP ping, then no reply will be sent, resulting in possible "false negatives". When in "traceroute" mode, if a transit LSR does not support LSP ping, then no reply will be forthcoming from that LSR for some TTL, say, n. The LSR originating the echo request SHOULD try sending the echo request with TTL=n+1, n+2, ..., n+k to probe LSRs further down the path. In such a case, the echo request for TTL > n SHOULD be sent with the Downstream Detailed Mapping TLV "Downstream IP Address" field set to the ALLROUTERs multicast address until a reply is received with a Downstream Detailed Mapping TLV. The label Stack TLV MAY be omitted from the Downstream Detailed Mapping TLV. Furthermore, the "Validate FEC Stack" flag SHOULD NOT be set until an echo reply packet with a Downstream Detailed Mapping TLV is received.5. Security Considerations
Overall, the security needs for LSP ping are similar to those of ICMP ping. There are at least three approaches to attacking LSRs using the mechanisms defined here. One is a Denial-of-Service (DoS) attack, by sending MPLS echo requests/replies to LSRs and thereby increasing their workload. The second is obfuscating the state of the MPLS data-plane liveness by spoofing, hijacking, replaying, or otherwise tampering with MPLS echo requests and replies. The third is an unauthorized source using an LSP ping to obtain information about the network. To avoid potential DoS attacks, it is RECOMMENDED that implementations regulate the LSP ping traffic going to the control plane. A rate limiter SHOULD be applied to the well-known UDP port defined in Section 6.1. Unsophisticated replay and spoofing attacks involving faking or replaying MPLS echo reply messages are unlikely to be effective. These replies would have to match the Sender's Handle and Sequence Number of an outstanding MPLS echo request message. A non-matching replay would be discarded as the sequence has moved on, thus a spoof has only a small window of opportunity. However, to provide a stronger defense, an implementation MAY also validate the TimeStamp Sent by requiring an exact match on this field.
To protect against unauthorized sources using MPLS echo request messages to obtain network information, it is RECOMMENDED that implementations provide a means of checking the source addresses of MPLS echo request messages against an access list before accepting the message. It is not clear how to prevent hijacking (non-delivery) of echo requests or replies; however, if these messages are indeed hijacked, LSP ping will report that the data plane is not working as it should. It does not seem vital (at this point) to secure the data carried in MPLS echo requests and replies, although knowledge of the state of the MPLS data plane may be considered confidential by some. Implementations SHOULD, however, provide a means of filtering the addresses to which echo reply messages may be sent. The value part of the Pad TLV contains a variable number of octets. With the exception of the first octet, these contents, if any, are ignored on receipt, and can therefore serve as a clandestine channel. When MPLS LSP ping is used within an administrative domain, a deployment can increase security by using border filtering of incoming LSP ping packets as well as outgoing LSP ping packets. Although this document makes special use of 127/8 addresses, these are used only in conjunction with the UDP port 3503. Furthermore, these packets are only processed by routers. All other hosts MUST treat all packets with a destination address in the range 127/8 in accordance to RFC 1122. Any packet received by a router with a destination address in the range 127/8 without a destination UDP port of 3503 MUST be treated in accordance to RFC 1812. In particular, the default behavior is to treat packets destined to a 127/8 address as "martians". If a network operator wants to prevent tracing inside a tunnel, one can use the Pipe Model [RFC3443], i.e., hide the outer MPLS tunnel by not propagating the MPLS TTL into the outer tunnel (at the start of the outer tunnel). By doing this, LSP traceroute packets will not expire in the outer tunnel, and the outer tunnel will not get traced. If one doesn't wish to expose the details of the new outer LSP, then the Nil FEC can be used to hide those details. Using the Nil FEC ensures that the trace progresses without false negatives and all transit nodes (of the new outer tunnel) perform some minimal validations on the received MPLS echo requests.