RFC 8029

Detecting Multiprotocol Label Switched (MPLS) Data-Plane Failures

Pages: 78
Proposed Standard
Obsoletes: 4379 6424 6829 7537
Updates: 1122
Updated by: 8611 9041 9570

Part 3 of 4 – Pages 41 to 60

RFC8029 - Page 41

3.5.  Pad TLV

   The value part of the Pad TLV contains a variable number (>= 1) of
   octets.  The first octet takes values from the following table; all
   the other octets (if any) are ignored.  The receiver SHOULD verify
   that the TLV is received in its entirety, but otherwise ignores the
   contents of this TLV, apart from the first octet.

      Value        Meaning
      -----        -------
          0        Reserved
          1        Drop Pad TLV from reply
          2        Copy Pad TLV to reply
      3-250        Unassigned
    251-254        Reserved for Experimental Use
        255        Reserved

   The Pad TLV can be added to an echo request to create a message of a
   specific length in cases where messages of various sizes are needed
   for troubleshooting.  The first octet allows for controlling the
   inclusion of this additional padding in the respective echo reply.
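
   As a minimal sketch (not part of the specification), the Value part
   of a Pad TLV could be built and interpreted as shown below.  The
   helper names and the use of zero octets as filler are assumptions
   made for illustration; only the first-octet values are taken from
   the table above.

```python
# Illustrative only: helper names and the zero filler are assumptions.
PAD_DROP = 1   # "Drop Pad TLV from reply"
PAD_COPY = 2   # "Copy Pad TLV to reply"


def build_pad_value(action: int, length: int) -> bytes:
    """Return a Pad TLV Value of `length` octets (length >= 1)."""
    if length < 1:
        raise ValueError("Pad TLV Value needs at least one octet")
    if action not in (PAD_DROP, PAD_COPY):
        raise ValueError("only values 1 and 2 are assigned")
    # Octets after the first are ignored by the receiver; zeros are used here.
    return bytes([action]) + bytes(length - 1)


def copy_pad_to_reply(value: bytes) -> bool:
    """True if the first octet asks for the Pad TLV to be copied."""
    if not value:
        raise ValueError("empty Pad TLV Value")
    return value[0] == PAD_COPY
```

   A sender that wants a larger echo request whose padding is echoed
   back would call build_pad_value(PAD_COPY, n) with the number of
   Value octets it needs.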

3.6.  Vendor Enterprise Number

   "Private Enterprise Numbers" [IANA-ENT] are maintained by IANA.  The
   Length of this TLV is always 4; the value is the Structure of
   Management Information (SMI) Private Enterprise Code, in network
   octet order, of the vendor with a Vendor Private extension to any of
   the fields in the fixed part of the message, in which case this TLV
   MUST be present.  If none of the fields in the fixed part of the
   message have Vendor Private extensions, inclusion of this TLV is
   OPTIONAL.  Vendor Private ranges for Message Types, Reply Modes, and
   Return Codes have been defined.  When any of these are used, the
   Vendor Enterprise Number TLV MUST be included in the message.
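
   The 4-octet Value can be sketched as follows.  The function names
   are hypothetical, and 32473 (the enterprise number set aside for
   documentation examples) stands in for a real vendor's number.

```python
import struct

# Placeholder enterprise number reserved for documentation examples.
EXAMPLE_PEN = 32473


def build_vendor_pen_value(pen: int) -> bytes:
    """Encode a Private Enterprise Number as 4 octets in network order."""
    return struct.pack("!I", pen)


def parse_vendor_pen_value(value: bytes) -> int:
    """Decode the Value; the Length of this TLV is always 4."""
    if len(value) != 4:
        raise ValueError("Vendor Enterprise Number Value must be 4 octets")
    return struct.unpack("!I", value)[0]
```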

3.7.  Interface and Label Stack

   The Interface and Label Stack TLV MAY be included in a reply message
   to report the interface on which the request message was received and
   the label stack that was on the packet when it was received.  Only
   one such object may appear.  The purpose of the object is to allow
   the upstream router to obtain the exact interface and label stack
   information as it appears at the replying LSR.

   The Length is K + 4*N octets; N is the number of labels in the label
   stack.  Values for K are found in the description of Address Type
   below.  The Value field of this TLV has the following format:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Address Type  |             Must Be Zero                      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                   IP Address (4 or 16 octets)                 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                   Interface (4 or 16 octets)                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                                                               .
      .                                                               .
      .                          Label Stack                          .
      .                                                               .
      .                                                               .
      .                                                               .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Address Type

      The Address Type indicates if the interface is numbered or
      unnumbered.  It also determines the length of the IP Address and
      Interface fields.  The resulting total for the initial part of the
      TLV is listed in the table below as "K Octets".  The Address Type
      is set to one of the following values:

         Type #        Address Type           K Octets
         ------        ------------           --------
              0        Reserved                      4
              1        IPv4 Numbered                12
              2        IPv4 Unnumbered              12
              3        IPv6 Numbered                36
              4        IPv6 Unnumbered              24
          5-250        Unassigned
        251-254        Reserved for Experimental Use
            255        Reserved

   IP Address and Interface

      IPv4 addresses and interface indices are encoded in 4 octets; IPv6
      addresses are encoded in 16 octets.

      If the interface upon which the echo request message was received
      is numbered, then the Address Type MUST be set to IPv4 Numbered or
      IPv6 Numbered, the IP Address MUST be set to either the LSR's
      Router ID or the interface address, and the Interface MUST be set
      to the interface address.

      If the interface is unnumbered, the Address Type MUST be either
      IPv4 Unnumbered or IPv6 Unnumbered, the IP Address MUST be the
      LSR's Router ID, and the Interface MUST be set to the index
      assigned to the interface.

   Label Stack

      The label stack of the received echo request message.  If any TTL
      values have been changed by this router, they SHOULD be restored.
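
   The layout above, and the Length rule K + 4*N, can be illustrated
   with the sketch below.  It is not a reference encoder: the function
   names are hypothetical, the 4-octet label stack entry layout (20-bit
   label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL) is
   assumed from [RFC3032], and no check is made that the address family
   matches the Address Type.

```python
import ipaddress
import struct

# Address Type values from the table above.
IPV4_NUMBERED, IPV4_UNNUMBERED, IPV6_NUMBERED, IPV6_UNNUMBERED = 1, 2, 3, 4


def label_entry(label: int, tc: int, bottom: bool, ttl: int) -> bytes:
    """Pack one 4-octet label stack entry (label, TC, S bit, TTL)."""
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def build_value(addr_type: int, ip: str, interface, labels: list) -> bytes:
    """Build the Value: Address Type, 3 zero octets, IP Address,
    Interface, then the label stack.

    `interface` is an address string for numbered types and an integer
    interface index for unnumbered types.
    """
    head = struct.pack("!B3x", addr_type)
    ip_bytes = ipaddress.ip_address(ip).packed
    if addr_type in (IPV4_NUMBERED, IPV6_NUMBERED):
        if_bytes = ipaddress.ip_address(interface).packed
    elif addr_type in (IPV4_UNNUMBERED, IPV6_UNNUMBERED):
        if_bytes = struct.pack("!I", interface)
    else:
        raise ValueError("unsupported Address Type")
    return head + ip_bytes + if_bytes + b"".join(labels)
```

   With two labels, an IPv4 Numbered Value built this way is
   12 + 4*2 = 20 octets long, matching the K Octets column.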

3.8.  Errored TLVs

   The Errored TLVs TLV MAY be included in an echo reply to inform the
   sender of an echo request of mandatory TLVs that are either not
   supported by an implementation or were parsed and found to be in
   error.

   The Value field contains the TLVs that were not understood, encoded
   as sub-TLVs.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |             Type = 9          |            Length             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                             Value                             |
      .                                                               .
      .                                                               .
      .                                                               .
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
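
   A hedged sketch of how a replier might assemble this TLV follows.
   The function names are hypothetical.  The sketch assumes the general
   TLV encoding of Section 3 (2-octet Type, 2-octet Length counting the
   Value only, Value zero-padded to a 4-octet boundary) and the rule
   from Section 4.4 that TLV types below 32768 are mandatory.

```python
import struct

ERRORED_TLVS_TYPE = 9


def is_mandatory(tlv_type: int) -> bool:
    """TLV types below 32768 must be understood by the receiver."""
    return tlv_type < 32768


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Type (2 octets), Length (2 octets), Value padded to 4 octets."""
    pad = (-len(value)) % 4
    return struct.pack("!HH", tlv_type, len(value)) + value + bytes(pad)


def build_errored_tlvs(unknown: list) -> bytes:
    """Wrap mandatory TLVs that were not understood as sub-TLVs.

    `unknown` is a list of (type, value) pairs taken from the request;
    optional TLVs (type >= 32768) are left out.
    """
    body = b"".join(encode_tlv(t, v) for t, v in unknown if is_mandatory(t))
    return encode_tlv(ERRORED_TLVS_TYPE, body)
```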

3.9.  Reply TOS Octet TLV

   This TLV MAY be used by the originator of the echo request to request
   that an echo reply be sent with the IP header Type of Service (TOS)
   octet set to the value specified in the TLV.  This TLV has a length
   of 4 with the following Value field.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Reply-TOS Byte|                 Must Be Zero                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
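
   For illustration only, the 4-octet Value could be packed and parsed
   as below; the function names are assumptions, not part of the
   specification.

```python
import struct


def build_reply_tos_value(tos: int) -> bytes:
    """One Reply-TOS octet followed by three Must-Be-Zero octets."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS must fit in one octet")
    return struct.pack("!B3x", tos)


def parse_reply_tos_value(value: bytes) -> int:
    """Return the requested TOS octet from a 4-octet Value."""
    if len(value) != 4:
        raise ValueError("Reply TOS Value must be 4 octets")
    return value[0]
```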

4.  Theory of Operation

   An MPLS echo request is used to test a particular LSP.  The LSP to be
   tested is identified by the "FEC Stack"; for example, if the LSP was
   set up via LDP, and a label is mapped to an egress IP address of
   198.51.100.1, the FEC Stack contains a single element, namely, an LDP
   IPv4 prefix sub-TLV with value 198.51.100.1/32.  If the LSP being
   tested is an RSVP LSP, the FEC Stack consists of a single element
   that captures the RSVP Session and Sender Template that uniquely
   identifies the LSP.

   FEC Stacks can be more complex.  For example, one may wish to test a
   VPN IPv4 prefix of 203.0.113.0/24 that is tunneled over an LDP LSP
   with egress 192.0.2.1.  The FEC Stack would then contain two
   sub-TLVs, the bottom being a VPN IPv4 prefix, and the top being an
   LDP IPv4 prefix.  If the underlying (LDP) tunnel were not known, or
   was considered irrelevant, the FEC Stack could be a single element
   with just the VPN IPv4 sub-TLV.

   When an MPLS echo request is received, the receiver is expected to
   verify that the control plane and data plane are both healthy (for
   the FEC Stack being pinged), and that the two planes are in sync.
   The procedures for this are in Section 4.4.

4.1.  Dealing with Equal-Cost Multipath (ECMP)

   LSPs need not be simple point-to-point tunnels.  Frequently, a single
   LSP may originate at several ingresses and terminate at several
   egresses; this is very common with LDP LSPs.  LSPs for a given FEC
   may also have multiple "next hops" at transit LSRs.  At an ingress,
   there may also be several different LSPs to choose from to get to the
   desired endpoint.  Finally, LSPs may have backup paths, detour paths,
   and other alternative paths to take should the primary LSP go down.

   Regarding the last two points stated above: it is assumed that the
   LSR sourcing MPLS echo requests can force the echo request into any
   desired LSP, so choosing among multiple LSPs at the ingress is not an
   issue.  The problem of probing the various flavors of backup paths,
   which will typically not be used for forwarding data unless the
   primary LSP is down, is not addressed here.

   Since the actual LSP and path that a given packet may take may not be
   known a priori, it is useful if MPLS echo requests can exercise all
   possible paths.  This, although desirable, may not be practical
   because the algorithms that a given LSR uses to distribute packets
   over alternative paths may be proprietary.

   To achieve some degree of coverage of alternate paths, there is a
   certain latitude in choosing the destination IP address and source
   UDP port for an MPLS echo request.  This is clearly not sufficient;
   in the case of traceroute, more latitude is offered by means of the
   Multipath Information of the Downstream Detailed Mapping TLV.  This
   is used as follows.  An ingress LSR periodically sends an LSP
   traceroute message to determine whether there are multipaths for a
   given LSP.  If so, each hop will provide some information as to how
   each of its downstream paths can be exercised.  The ingress can then
   send MPLS echo requests that exercise these paths.  If several
   transit LSRs have ECMP, the ingress may attempt to compose these to
   exercise all possible paths.  However, full coverage may not be
   possible.

4.2.  Testing LSPs That Are Used to Carry MPLS Payloads

   To detect certain LSP breakages, it may be necessary to encapsulate
   an MPLS echo request packet with at least one additional label when
   testing LSPs that are used to carry MPLS payloads (such as LSPs used
   to carry L2VPN and L3VPN traffic).  For example, when testing LDP or
   RSVP-TE LSPs, just sending an MPLS echo request packet may not detect
   instances where the router immediately upstream of the destination of
   the LSP ping may forward the MPLS echo request successfully over an
   interface not configured to carry MPLS payloads because of the use of
   penultimate hop popping.  Since the receiving router has no means to
   ascertain whether the IP packet was sent unlabeled or implicitly
   labeled, the addition of labels shimmed above the MPLS echo request
   (using the Nil FEC) will prevent a router from forwarding such a
   packet out to unlabeled interfaces.

4.3.  Sending an MPLS Echo Request

   An MPLS echo request is a UDP packet.  The IP header is set as
   follows: the source IP address is a routable address of the sender;
   the destination IP address is a (randomly chosen) IPv4 address from
   the range 127/8 or an IPv6 address from the range
   0:0:0:0:0:FFFF:7F00:0/104.  The IP TTL is set to 1.  The source UDP
   port is chosen by the sender; the destination UDP port is set to 3503
   (assigned by IANA for MPLS echo requests).  The Router Alert IP
   Option of value 0x0 [RFC2113] for IPv4 or value 69 [RFC7506] for IPv6
   MUST be set in the IP header.
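
   The address selection described above can be sketched as follows.
   The function name is hypothetical, and skipping the first and last
   address of each range is a choice made by this sketch rather than a
   requirement of the text.

```python
import ipaddress
import random
from typing import Optional

LSP_PING_PORT = 3503   # destination UDP port of an echo request
REQUEST_IP_TTL = 1

V4_RANGE = ipaddress.IPv4Network("127.0.0.0/8")
V6_RANGE = ipaddress.IPv6Network("::ffff:7f00:0/104")


def pick_destination(ipv6: bool = False, rng: Optional[random.Random] = None):
    """Pick a random destination address from the 127/8 range or its
    IPv4-mapped IPv6 equivalent."""
    rng = rng or random.Random()
    net = V6_RANGE if ipv6 else V4_RANGE
    # A random host offset keeps the result inside the chosen range.
    offset = rng.randrange(1, net.num_addresses - 1)
    return net.network_address + offset
```

   Varying the destination address in this way (together with the
   source UDP port) is one means of exercising different ECMP paths, as
   discussed in Section 4.1.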

   An MPLS echo request is sent with a label stack corresponding to the
   FEC Stack being tested.  Note that further labels could be applied
   if, for example, the normal route to the topmost FEC in the stack is
   via a Traffic Engineered Tunnel [RFC3209].  If all of the FECs in the
   stack correspond to Implicit Null labels, the MPLS echo request is
   considered unlabeled even if further labels will be applied in
   sending the packet.

   If the echo request is labeled, one MAY (depending on what is being
   pinged) set the TTL of the innermost label to 1, to prevent the ping
   request going farther than it should.  Examples of where this SHOULD
   be done include pinging a VPN IPv4 or IPv6 prefix, an L2 VPN
   endpoint, or a pseudowire.  Preventing the ping request from going
   too far can also be accomplished by inserting a Router Alert label
   above this label; however, this may lead to the undesired side effect
   that MPLS echo requests take a different data path than actual data.
   For more information on how these mechanisms can be used for
   pseudowire connectivity verification, see [RFC5085][RFC5885].

   In "ping" mode (end-to-end connectivity check), the TTL in the
   outermost label is set to 255.  In "traceroute" mode (fault isolation
   mode), the TTL is set successively to 1, 2, and so on.

   The sender chooses a Sender's Handle and a Sequence Number.  When
   sending subsequent MPLS echo requests, the sender SHOULD increment
   the Sequence Number by 1.  However, a sender MAY choose to send a
   group of echo requests with the same Sequence Number to improve the
   chance of arrival of at least one packet with that Sequence Number.

   The TimeStamp Sent is set to the time of day in NTP format that the
   echo request is sent.  The TimeStamp Received is set to zero.
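
   As an illustration, a time of day expressed in seconds since the
   Unix epoch can be converted to the 64-bit NTP format (32-bit seconds
   since 1 January 1900 followed by a 32-bit fraction) as shown below.
   The function and constant names are assumptions of this sketch.

```python
import math
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01

# In an echo request, the TimeStamp Received field is all zeros.
TIMESTAMP_RECEIVED_IN_REQUEST = bytes(8)


def to_ntp(unix_time: float) -> bytes:
    """Encode a Unix time as 32-bit seconds plus 32-bit fraction."""
    whole = math.floor(unix_time)
    # Clamp so that rounding can never overflow the 32-bit fraction.
    fraction = min(int((unix_time - whole) * 2**32), 0xFFFFFFFF)
    seconds = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    return struct.pack("!II", seconds, fraction)
```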

   An MPLS echo request MUST have a FEC Stack TLV.  Also, the Reply Mode
   must be set to the desired Reply Mode; the Return Code and Subcode
   are set to zero.  In the "traceroute" mode, the echo request SHOULD
   include a Downstream Detailed Mapping TLV.

4.4.  Receiving an MPLS Echo Request

   Sending an MPLS echo request to the control plane is triggered by one
   of the following packet processing exceptions: Router Alert option,
   IP TTL expiration, MPLS TTL expiration, MPLS Router Alert label, or
   the destination address in the 127/8 address range.  The control
   plane further identifies it by UDP destination port 3503.

   For reporting purposes, the bottom of the stack is considered to be a
   stack-depth of 1.  This is to establish an absolute reference for the
   case where the actual stack may have more labels than there are FECs
   in the Target FEC Stack.

   Furthermore, in all the Return Codes listed in this document, a
   stack-depth of 0 means "no value specified".  This allows
   compatibility with existing implementations that do not use the
   Return Subcode field.

   An LSR X that receives an MPLS echo request then processes it as
   follows.

   1.  General packet sanity is verified.  If the packet is not well-
       formed, LSR X SHOULD send an MPLS echo reply with the Return Code
       set to "Malformed echo request received" and the Subcode set to
       zero.  If there are any TLVs not marked as "Ignore" (i.e., if the
       TLV type is less than 32768, see Section 3) that LSR X does not
       understand, LSR X SHOULD send an MPLS echo reply with the Return
       Code set to 2 ("One or more of the TLVs was not understood") and
       the Subcode set to zero.  In the latter case, the misunderstood
       TLVs (only) are included as sub-TLVs in an Errored TLVs TLV in
       the reply.  The header fields Sender's Handle, Sequence Number,
       and TimeStamp Sent are not examined but are included in the MPLS
       echo reply message.

   The algorithm uses the following variables and identifiers:

   Interface-I:        the interface on which the MPLS echo request was
                       received.

   Stack-R:            the label stack on the packet as it was received.

   Stack-D:            the label stack carried in the "Label stack
                       sub-TLV" in the Downstream Detailed Mapping TLV
                       (not always present).

   Label-L:            the label from the actual stack currently being
                       examined.  Requires no initialization.

   Label-stack-depth:  the depth of the label being verified.
                       Initialized to the number of labels in the
                       received label stack, Stack-R.

   FEC-stack-depth:    depth of the FEC in the Target FEC Stack that
                       should be used to verify the current actual
                       label.  Requires no initialization.

   Best-return-code:   contains the Return Code for the echo reply
                       packet as currently best known.  As the algorithm
                       progresses, this code may change depending on the
                       results of further checks that it performs.

   Best-rtn-subcode:   similar to Best-return-code, but for the echo
                       reply Subcode.

   FEC-status:         result value returned by the FEC Checking
                       algorithm described in Section 4.4.1.

   /* Save receive context information */

   2.  If the echo request is good, LSR X stores the interface over
       which the echo was received in Interface-I, and the label stack
       with which it came in Stack-R.

   /* The rest of the algorithm iterates over the labels in Stack-R,
   verifies validity of label values, reports associated label switching
   operations (for traceroute), verifies correspondence between the
   Stack-R and the Target FEC Stack description in the body of the echo
   request, and reports any errors. */

   /* The algorithm iterates as follows. */

   3.  Label Validation:

      If Label-stack-depth is 0 {

      /* The LSR needs to report that it is a tail end for the LSP */

         Set FEC-stack-depth to 1, set Label-L to 3 (Implicit Null).
         Set Best-return-code to 3 ("Replying router is an egress for
         the FEC at stack-depth"), set Best-rtn-subcode to the value of
         FEC-stack-depth (1), and go to step 5 (Egress Processing).

      }

      /* This step assumes there is always an entry for well-known label
      values */

      Set Label-L to the value extracted from Stack-R at depth
      Label-stack-depth.  Look up Label-L in the Incoming Label Map
      (ILM) to determine if the label has been allocated and an
      operation is associated with it.

      If there is no entry for Label-L {

      /* Indicates a temporary or permanent label synchronization
      problem, and the LSR needs to report an error */

         Set Best-return-code to 11 ("No label entry at stack-depth")
         and Best-rtn-subcode to Label-stack-depth.  Go to step 7 (Send
         Reply Packet).

      }

      Else {

         Retrieve the associated label operation from the corresponding
         Next Hop Label Forwarding Entry (NHLFE), and proceed to step 4
         (Label Operation Check).

      }

   4.  Label Operation Check

      If the label operation is "Pop and Continue Processing" {

      /* Includes Explicit Null and Router Alert label cases */

         Iterate to the next label by decrementing Label-stack-depth,
         and loop back to step 3 (Label Validation).

      }

      If the label operation is "Swap or Pop and Switch based on Popped
      Label" {

         Set Best-return-code to 8 ("Label switched at stack-depth") and
         Best-rtn-subcode to Label-stack-depth to report transit
         switching.

         If a Downstream Detailed Mapping TLV is present in the received
         echo request {

            If the IP address in the TLV is 127.0.0.1 or 0::1 {

               Set Best-return-code to 6 ("Upstream Interface Index
               Unknown").  An Interface and Label Stack TLV SHOULD be
               included in the reply and filled with Interface-I and
               Stack-R.

            }

            Else {

               Verify that the IP address, interface address, and label
               stack in the Downstream Detailed Mapping TLV match
               Interface-I and Stack-R.  If there is a mismatch, set
               Best-return-code to 5, "Downstream Mapping Mismatch".  An
               Interface and Label Stack TLV SHOULD be included in the
               reply and filled in based on Interface-I and Stack-R.  Go
               to step 7 (Send Reply Packet).

            }

         }

         For each available downstream ECMP path {

            Retrieve output interface from the NHLFE entry.

            /* Note: this Return Code is set even if Label-stack-depth
            is one */

            If the output interface is not MPLS enabled {

               Set Best-return-code to Return Code 9, "Label switched
               but no MPLS forwarding at stack-depth" and set
               Best-rtn-subcode to Label-stack-depth and go to step 7
               (Send Reply Packet).

            }

            If a Downstream Detailed Mapping TLV is present {

               A Downstream Detailed Mapping TLV SHOULD be included in
               the echo reply (see Section 3.4) filled in with
               information about the current ECMP path.

            }

         }

         If no Downstream Detailed Mapping TLV is present, or the
         Downstream IP Address is set to the ALLROUTERS multicast
         address, go to step 7 (Send Reply Packet).

         If the "Validate FEC Stack" flag is not set and the LSR is not
         configured to perform FEC checking by default, go to step 7
         (Send Reply Packet).

         /* Validate the Target FEC Stack in the received echo request.

         First determine FEC-stack-depth from the Downstream Detailed
         Mapping TLV.  This is done by walking through Stack-D (the
         Downstream labels) from the bottom, decrementing the number of
         labels for each non-Implicit Null label, while incrementing
         FEC-stack-depth for each label.  If the Downstream Detailed
         Mapping TLV contains one or more Implicit Null labels,
         FEC-stack-depth may be greater than Label-stack-depth.  To be
         consistent with the above stack-depths, the bottom is
         considered to be entry 1.
         */

         Set FEC-stack-depth to 0.  Set i to Label-stack-depth.

         While (i > 0) do {

             ++FEC-stack-depth.
             if Stack-D [ FEC-stack-depth ] != 3 (Implicit Null)
             --i.
         }

         If the number of FECs in the FEC stack is greater than or equal
         to FEC-stack-depth {

            Perform the FEC Checking procedure (see Section 4.4.1).

            If FEC-status is 2, set Best-return-code to 10 ("Mapping for
            this FEC is not the given label at stack-depth").

            If FEC-status is 1, set Best-return-code to
            FEC-return-code and Best-rtn-subcode to FEC-stack-depth.
         }

         Go to step 7 (Send Reply Packet).
      }

   5.  Egress Processing:

      /* These steps are performed by the LSR that identified itself as
      the tail-end LSR for an LSP. */

      If the received echo request contains no Downstream Detailed
      Mapping TLV, or the Downstream IP Address is set to 127.0.0.1 or
      0::1, go to step 6 (Egress FEC Validation).

      Verify that the IP address, interface address, and label stack in
      the Downstream Detailed Mapping TLV match Interface-I and Stack-R.
      If not, set Best-return-code to 5, "Downstream Mapping Mismatch".
      A Received Interface and Label Stack TLV SHOULD be created for the
      echo response packet.  Go to step 7 (Send Reply Packet).

   6.  Egress FEC Validation:

      /* This is a loop for all entries in the Target FEC Stack starting
      with FEC-stack-depth. */

      Perform FEC checking by following the algorithm described in
      Section 4.4.1 for Label-L and the FEC at FEC-stack-depth.

      Set Best-return-code to FEC-return-code and Best-rtn-subcode to
      the value in FEC-stack-depth.


      If FEC-status (the result of the check) is 1,
      go to step 7 (Send Reply Packet).

      /* Iterate to the next FEC entry */


      ++FEC-stack-depth.
      If FEC-stack-depth > the number of FECs in the FEC-stack,
      go to step 7 (Send Reply Packet).

      If FEC-status is 0 {

         ++Label-stack-depth.
         If Label-stack-depth > the number of labels in Stack-R,
         go to step 7 (Send Reply Packet).

         Label-L = extracted label from Stack-R at depth
         Label-stack-depth.
         Loop back to step 6 (Egress FEC Validation).
      }

   7.  Send Reply Packet:

      Send an MPLS echo reply with a Return Code of Best-return-code and
      a Return Subcode of Best-rtn-subcode.  Include any TLVs created
      during the above process.  The procedures for sending the echo
      reply are found in Section 4.5.
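
   The FEC-stack-depth computation in step 4 above (the "While (i > 0)"
   loop) can be sketched as follows.  The function name and the list
   representation of Stack-D, with the bottom label first so that list
   index 0 is entry 1, are assumptions of this sketch.  The bounds
   check is an addition; the pseudocode does not say what to do if
   Stack-D runs out of labels.

```python
IMPLICIT_NULL = 3


def fec_stack_depth(stack_d_bottom_up: list, label_stack_depth: int) -> int:
    """Walk Stack-D from the bottom and return FEC-stack-depth.

    Each label advances FEC-stack-depth; only labels other than
    Implicit Null consume one of the labels actually received.
    """
    depth = 0
    i = label_stack_depth
    while i > 0:
        depth += 1
        if depth > len(stack_d_bottom_up):
            raise ValueError("Stack-D is shorter than the received stack")
        if stack_d_bottom_up[depth - 1] != IMPLICIT_NULL:
            i -= 1
    return depth
```

   For example, with Stack-D holding an Implicit Null at the bottom and
   label 100 above it, and a Label-stack-depth of 1, the result is 2,
   which shows how FEC-stack-depth can exceed Label-stack-depth.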

4.4.1.  FEC Validation

   /* This section describes validation of a FEC entry within the Target
   FEC Stack and accepts a FEC, Label-L, and Interface-I.

   If the outermost FEC of the Target FEC stack is the Nil FEC, then the
   node MUST skip the Target FEC validation completely.  This is to
   support FEC hiding, in which the outer hidden FEC can be the Nil FEC.
   Else, the algorithm performs the following steps. */

   1.  Two return values, FEC-status and FEC-return-code, are
       initialized to 0.

   2.  If the FEC is the Nil FEC {

          If Label-L is either Explicit_Null or Router_Alert, return.

          Else {

             Set FEC-return-code to 10 ("Mapping for this FEC is not the
             given label at stack-depth").
             Set FEC-status to 1
             Return.
          }

       }

   3.  Check the FEC label mapping that describes how traffic received
       on the LSP is further switched or which application it is
       associated with.  If no mapping exists, set FEC-return-code to
       Return Code 4, "Replying router has no mapping for the FEC at
       stack-depth".  Set FEC-status to 1.  Return.

   4.  If the label mapping for FEC is Implicit Null, set FEC-status to
       2 and proceed to step 5.  Otherwise, if the label mapping for FEC
       is Label-L, proceed to step 5.  Otherwise, set FEC-return-code to
       10 ("Mapping for this FEC is not the given label at stack-
       depth"), set FEC-status to 1, and return.

   5.  This is a protocol check.  Check what protocol would be used to
       advertise the FEC.  If it can be determined that no protocol
       associated with Interface-I would have advertised a FEC of that
       FEC-Type, set FEC-return-code to 12 ("Protocol not associated
       with interface at FEC stack-depth").  Set FEC-status to 1.

   6.  Return.
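
   Steps 1 through 6 can be sketched as a single function, shown
   below.  This is an illustration under stated assumptions, not a
   reference implementation: the Fec class, the "nil" kind string, and
   the lookup_label and protocol_ok callbacks are hypothetical
   stand-ins for an implementation's own FEC representation, label
   mapping table, and protocol check.  The sketch also assumes that
   step 5 sets FEC-status to 1 only when the protocol check fails, and
   it leaves the outermost Nil FEC shortcut to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Well-known label values (assumed from RFC 3032).
EXPLICIT_NULL_V4, ROUTER_ALERT, EXPLICIT_NULL_V6, IMPLICIT_NULL = 0, 1, 2, 3


@dataclass
class Fec:
    kind: str        # hypothetical tag, e.g. "nil" or "ldp-ipv4"
    key: str = ""    # prefix or session identifier


def check_fec(fec: Fec, label_l: int, interface_i: str,
              lookup_label: Callable[[Fec], Optional[int]],
              protocol_ok: Callable[[Fec, str], bool]) -> Tuple[int, int]:
    """Return (FEC-status, FEC-return-code)."""
    status, code = 0, 0                                   # step 1
    if fec.kind == "nil":                                 # step 2
        if label_l in (EXPLICIT_NULL_V4, EXPLICIT_NULL_V6, ROUTER_ALERT):
            return status, code
        return 1, 10
    mapped = lookup_label(fec)                            # step 3
    if mapped is None:
        return 1, 4
    if mapped == IMPLICIT_NULL:                           # step 4
        status = 2
    elif mapped != label_l:
        return 1, 10
    if not protocol_ok(fec, interface_i):                 # step 5
        status, code = 1, 12
    return status, code                                   # step 6
```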

4.5.  Sending an MPLS Echo Reply

   An MPLS echo reply is a UDP packet.  It MUST ONLY be sent in response
   to an MPLS echo request.  The source IP address is a routable address
   of the replier; the source port is the well-known UDP port for LSP
   ping.  The destination IP address and UDP port are copied from the
   source IP address and UDP port of the echo request.  The IP TTL is
   set to 255.  If the Reply Mode in the echo request is "Reply via an
   IPv4 UDP packet with Router Alert", then the IP header MUST contain
   the Router Alert IP Option of value 0x0 [RFC2113] for IPv4 or 69
   [RFC7506] for IPv6.  If the reply is sent over an LSP, the topmost
   label MUST in this case be the Router Alert label (1) (see
   [RFC3032]).

   The format of the echo reply is the same as the echo request.  The
   Sender's Handle, the Sequence Number, and TimeStamp Sent are copied
   from the echo request; the TimeStamp Received is set to the time of
   day that the echo request is received (note that this information is
   most useful if the time-of-day clocks on the requester and the
   replier are synchronized).  The FEC Stack TLV from the echo request
   MAY be copied to the reply.

   The replier MUST fill in the Return Code and Subcode, as determined
   in the previous section.
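
   The field handling described so far in this section can be
   summarized in a short sketch.  The EchoHeader class models only the
   header fields mentioned here, and the names and the dictionary used
   for addressing are assumptions made for illustration.

```python
from dataclasses import dataclass, replace

LSP_PING_PORT = 3503   # well-known UDP port for LSP ping
REPLY_IP_TTL = 255


@dataclass(frozen=True)
class EchoHeader:
    """Subset of the fixed header fields discussed in this section."""
    senders_handle: int
    sequence_number: int
    timestamp_sent: int          # 64-bit NTP value, copied unchanged
    timestamp_received: int = 0  # zero in an echo request
    return_code: int = 0
    return_subcode: int = 0


def build_reply(req, req_src_ip, req_src_port, replier_ip,
                now_ntp, code, subcode):
    """Return (header, addressing) for the echo reply to `req`."""
    # Handle, Sequence Number, and TimeStamp Sent carry over as-is.
    header = replace(req, timestamp_received=now_ntp,
                     return_code=code, return_subcode=subcode)
    addressing = {
        "src_ip": replier_ip, "src_port": LSP_PING_PORT,
        "dst_ip": req_src_ip, "dst_port": req_src_port,
        "ip_ttl": REPLY_IP_TTL,
    }
    return header, addressing
```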

   If the echo request contains a Pad TLV, the replier MUST interpret
   the first octet for instructions regarding how to reply.

   If the replying router is the destination of the FEC, then Downstream
   Detailed Mapping TLVs SHOULD NOT be included in the echo reply.

   If the echo request contains a Downstream Detailed Mapping TLV, and
   the replying router is not the destination of the FEC, the replier
   SHOULD compute its downstream routers and corresponding labels for
   the incoming label and add Downstream Detailed Mapping TLVs for each
   one to the echo reply it sends back.  A replying node should follow
   the procedures defined in Section 4.5.1 if there is a FEC stack
   change due to tunneled LSP.  If the FEC stack change is due to
   stitched LSP, it should follow the procedures defined in
   Section 4.5.2.

   If the Downstream Detailed Mapping TLV contains Multipath Information
   requiring more processing than the receiving router is willing to
   perform, the responding router MAY choose to respond with only a
   subset of multipaths contained in the echo request Downstream
   Detailed Mapping.  (Note: The originator of the echo request MAY send
   another echo request with the Multipath Information that was not
   included in the reply.)

   Except in the case of Reply Mode 4, "Reply via application-level
   control channel", echo replies are always sent in the context of the
   IP/MPLS network.

4.5.1.  Addition of a New Tunnel

   A transit node knows when the FEC being traced is going to enter a
   tunnel at that node.  Thus, it knows about the new outer FEC.  All
   transit nodes that are the origination point of a new tunnel SHOULD
   add the FEC stack change sub-TLV (Section 3.4.1.3) to the Downstream
   Detailed Mapping TLV in the echo reply.  The transit node SHOULD add
   one FEC stack change sub-TLV of operation type PUSH, per new tunnel
   being originated at the transit node.

   A transit node that sends a Downstream FEC stack change sub-TLV in
   the echo reply SHOULD fill the address of the remote peer, which is
   the peer of the current LSP being traced.  If the transit node does
   not know the address of the remote peer, it MUST set the address type
   to Unspecified.

   The Label Stack sub-TLV MUST contain one additional label per FEC
   being PUSHed.  The label MUST be encoded as defined in
   Section 3.4.1.2.  The label value MUST be the value used to switch
   the data traffic.  If the tunnel is a transparent pipe to the node,
   i.e., the data-plane trace will not expire in the middle of the new
   tunnel, then a FEC stack change sub-TLV SHOULD NOT be added, and the
   Label Stack sub-TLV SHOULD NOT contain a label corresponding to the
   hidden tunnel.

   If the transit node wishes to hide the nature of the tunnel from the
   ingress of the echo request, then it might not want to send details
   about the new tunnel FEC to the ingress.  In such a case, the transit
   node SHOULD use the Nil FEC.  The echo reply would then contain a FEC
   stack change sub-TLV with operation type PUSH and a Nil FEC.  The
   value of the label in the Nil FEC MUST be set to zero.  The remote
   peer address type MUST be set to Unspecified.  The transit node
   SHOULD add one FEC stack change sub-TLV of operation type PUSH, per
   new tunnel being originated at the transit node.  The Label Stack
   sub-TLV MUST contain one additional label per FEC being PUSHed.  The
   label value MUST be the value used to switch the data traffic.

4.5.2.  Transition between Tunnels

   A transit node stitching two LSPs SHOULD include two FEC stack change
   sub-TLVs: one with a POP operation for the old FEC (ingress) and one
   with a PUSH operation for the new FEC (egress).  The replying node
   SHOULD set the Return Code to "Label switched with FEC change" to
   indicate a change in the FEC being traced.

   If the replying node wishes to perform FEC hiding, it SHOULD respond
   with two FEC stack change sub-TLVs, one POP followed by one PUSH.
   The POP operation MAY either exclude the FEC TLV (by setting the FEC
   TLV length to 0) or set the FEC TLV to contain the LDP FEC.  The PUSH
   operation SHOULD have the FEC TLV containing the Nil FEC.  The Return
   Code SHOULD be set to "Label switched with FEC change".

   If the replying node wishes to perform FEC hiding, it MAY choose not
   to send any FEC stack change sub-TLVs in the echo reply if the
   number of labels does not change for the downstream node and the FEC
   type also does not change (Nil FEC).  In such a case, the replying
   node MUST NOT set the Return Code to "Label switched with FEC change".
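
   As a rough illustration of the three cases in this section, the
   following sketch returns whether the "Label switched with FEC
   change" Return Code is signaled, together with the ordered sub-TLV
   operations.  The stitching_reply function, its parameters, and the
   tuple representation are hypothetical; a FEC of None stands for a
   FEC TLV of length 0.

```python
from typing import List, Optional, Tuple


def stitching_reply(old_fec: str, new_fec: str, hide: bool = False,
                    suppress_when_hidden: bool = False
                    ) -> Tuple[bool, List[Tuple[str, Optional[str]]]]:
    """Return (signal_fec_change, sub_tlvs) for a stitching node.

    sub_tlvs is an ordered list of (operation, fec) pairs.
    """
    if not hide:
        # Plain stitching: POP the old FEC, then PUSH the new one.
        return True, [("POP", old_fec), ("PUSH", new_fec)]
    if suppress_when_hidden:
        # Permitted only when the label count and the (Nil) FEC type
        # seen downstream do not change; the caller must check that.
        return False, []
    # FEC hiding: POP without a FEC TLV, then PUSH a Nil FEC.
    return True, [("POP", None), ("PUSH", "Nil")]
```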

4.6.  Receiving an MPLS Echo Reply

   An LSR X should only receive an MPLS echo reply in response to an
   MPLS echo request that it sent.  Thus, on receipt of an MPLS echo
   reply, X should parse the packet to ensure that it is well-formed,
   then attempt to match up the echo reply with an echo request that it
   had previously sent, using the destination UDP port and the Sender's
   Handle.  If no match is found, then X jettisons the echo reply;
   otherwise, it checks the Sequence Number to see if it matches.
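
   The matching step can be sketched as a table lookup.  The
   classify_reply function and the shape of the outstanding-request
   table are assumptions for illustration; an implementation is free
   to organize this state differently.

```python
from typing import Dict, Set, Tuple


def classify_reply(outstanding: Dict[Tuple[int, int], Set[int]],
                   udp_port: int, senders_handle: int,
                   sequence_number: int) -> str:
    """Match an echo reply against previously sent echo requests.

    outstanding maps (UDP port, Sender's Handle) to the set of Sequence
    Numbers still awaiting a reply.
    """
    seqs = outstanding.get((udp_port, senders_handle))
    if seqs is None:
        return "drop"                  # no matching request was sent
    if sequence_number in seqs:
        seqs.discard(sequence_number)  # this request is now answered
        return "match"
    return "sequence-mismatch"
```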

   If the echo reply contains Downstream Detailed Mappings, and X wishes
   to traceroute further, it SHOULD copy the Downstream Detailed
   Mapping(s) into its next echo request(s) (with TTL incremented by
   one).

   If one or more FEC stack change sub-TLVs are received in the MPLS
   echo reply, the ingress node SHOULD process them and perform some
   validation.

   The FEC stack changes are associated with a downstream neighbor and
   along a particular path of the LSP.  Consequently, the ingress will
   need to maintain a FEC stack per path being traced (in case of
   multipath).  All changes to the FEC stack resulting from the
   processing of FEC stack change sub-TLVs should be applied only to
   the path through the given downstream neighbor.  The following
   algorithm should be followed for processing FEC stack change
   sub-TLVs.

       push_seen = FALSE
       fec_stack_depth = current-depth-of-fec-stack-being-traced
       saved_fec_stack = current_fec_stack

       while (sub-tlv = get_next_sub_tlv(downstream_detailed_map_tlv)) {

           if (sub-tlv == NULL) break

           if (sub-tlv.type == FEC-Stack-Change) {

               if (sub-tlv.operation == POP) {
                   /* A POP after a PUSH is not valid */
                   if (push_seen) {
                       Drop the echo reply
                       current_fec_stack = saved_fec_stack
                       return
                   }

                   /* Nothing is left to pop */
                   if (fec_stack_depth == 0) {
                       Drop the echo reply
                       current_fec_stack = saved_fec_stack
                       return
                   }

                   Pop FEC from FEC stack being traced
                   fec_stack_depth--;
               }

               if (sub-tlv.operation == PUSH) {
                   push_seen = TRUE
                   Push FEC on FEC stack being traced
                   fec_stack_depth++;
               }
           }
       }

       /* The resulting FEC stack must not be empty */
       if (fec_stack_depth == 0) {
           Drop the echo reply
           current_fec_stack = saved_fec_stack
           return
       }
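
   For readers who prefer an executable form, the algorithm above might
   be rendered as follows.  This is a sketch, not a normative
   restatement: the apply_fec_stack_changes function and the
   (type, operation, fec) tuples are hypothetical, and the function
   works on a copy so that a dropped reply leaves the saved FEC stack
   untouched.

```python
from typing import List, Optional, Tuple


def apply_fec_stack_changes(fec_stack: List[str],
                            sub_tlvs: List[Tuple[str, str, str]]
                            ) -> Optional[List[str]]:
    """Return the new FEC stack, or None if the reply must be dropped.

    fec_stack has the top of the stack as its last element.
    """
    working = list(fec_stack)    # the caller's stack is never modified
    push_seen = False
    for tlv_type, operation, fec in sub_tlvs:
        if tlv_type != "FEC-Stack-Change":
            continue             # other sub-TLVs are not handled here
        if operation == "POP":
            if push_seen or not working:
                return None      # POP after PUSH, or nothing to pop
            working.pop()
        elif operation == "PUSH":
            push_seen = True
            working.append(fec)
    if not working:
        return None              # resulting FEC stack must not be empty
    return working
```

   For example, a stitching point reporting a POP followed by a PUSH
   turns the stack ["LDP"] into ["BGP"], while a PUSH followed by a POP
   causes the reply to be dropped.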

   The next MPLS echo request along the same path should use the
   modified FEC stack obtained after processing the FEC stack change
   sub-TLVs.  A non-Nil FEC guarantees that the next echo request along
   the same path will have the Downstream Detailed Mapping TLV validated
   for IP address, interface address, and label stack mismatches.

   If the top of the FEC stack is a Nil FEC and the MPLS echo reply does
   not contain any FEC stack change sub-TLVs, then it does not
   necessarily mean that the LSP has not started traversing a different
   tunnel.  It could be that the LSP associated with the Nil FEC
   terminated at a transit node, and at the same time, a new LSP started
   at the same transit node.  The Nil FEC would now be associated with
   the new LSP (and the ingress has no way of knowing this).  Thus, it
   is not possible to build an accurate hierarchical LSP topology if a
   traceroute contains Nil FECs.

   A reply from a downstream node with Return Code 3 (replying router
   is an egress for the FEC at stack-depth <RSC>; called IS_EGRESS
   below) may not necessarily be for the FEC being traced.  It could be
   for one of the new FECs that was added.  On receipt of an IS_EGRESS
   reply, the LSP ingress should check if the depth of the Target FEC
   Stack sent to the node that just responded was the same as the depth
   of the FEC that was being traced.  If it was not, then it should pop
   an entry from the Target FEC stack and resend the request with the
   same TTL (as previously sent).  The process of popping a FEC is to be
   repeated until either the LSP ingress receives a non-IS_EGRESS reply
   or until all the additional FECs added to the FEC stack have already
   been popped.  Using an IS_EGRESS reply, an ingress can build a map of
   the hierarchical LSP structure traversed by a given FEC.
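
   The pop-and-resend procedure can be sketched as a loop.  The
   handle_is_egress function and the send_probe callback (which sends
   an echo request with the given Target FEC Stack and TTL and returns
   a symbolic Return Code) are hypothetical names introduced for this
   example.

```python
from typing import Callable, List, Tuple


def handle_is_egress(target_stack: List[str], original_depth: int,
                     send_probe: Callable[[List[str], int], str],
                     ttl: int) -> Tuple[List[str], List[str], str]:
    """Process an IS_EGRESS reply received for target_stack at ttl.

    Returns (remaining stack, FECs that ended at this hop, last code).
    """
    stack = list(target_stack)
    ended = []                       # tunnels found to end at this hop
    while len(stack) > original_depth:
        ended.append(stack.pop())    # the reply was for an added FEC
        code = send_probe(stack, ttl)    # resend with the same TTL
        if code != "IS_EGRESS":
            return stack, ended, code
    # No added FECs remain: the egress reply is for the traced FEC.
    return stack, ended, "IS_EGRESS"
```

   The list of ended FECs is what lets the ingress record where each
   nested tunnel terminates.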

   When the MPLS echo reply Return Code is "Label switched with FEC
   change", the ingress node SHOULD manipulate the FEC stack as per the
   FEC stack change sub-TLVs contained in the Downstream Detailed
   Mapping TLV.  A transit node can use this Return Code for stitched
   LSPs and for hierarchical LSPs.  In case of ECMP or P2MP, there could
   be multiple paths and Downstream Detailed Mapping TLVs with different
   Return Codes (see Section 3.1, Note 2).  The ingress node should
   build the topology based on the Return Code per ECMP path/P2MP
   branch.

4.7.  Issue with VPN IPv4 and IPv6 Prefixes

   Typically, an LSP ping for a VPN IPv4 prefix or VPN IPv6 prefix is
   sent with a label stack of depth greater than 1, with the innermost
   label having a TTL of 1.  This is to terminate the ping at the egress
   PE, before it gets sent to the customer device.  However, under
   certain circumstances, the label stack can shrink to a single label
   before the ping hits the egress PE; this will result in the ping
   terminating prematurely.  One such scenario is a multi-AS Carrier's
   Carrier VPN.

   To get around this problem, one approach is for the LSR that receives
   such a ping to realize that the ping terminated prematurely and to
   send back Return Code 13 (premature termination of ping due to label
   stack shrinking to a single label).  In that case, the initiating LSR
   can retry the ping after incrementing the TTL on the VPN label.  In
   this fashion, the ingress LSR will sequentially try TTL values until
   it finds one that allows the VPN ping to reach the egress PE.
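
   A minimal sketch of this retry loop follows.  The ping_vpn_prefix
   function, the send_ping callback (which sends one ping with the
   given TTL on the VPN label and returns the numeric Return Code), and
   the max_ttl bound are assumptions for illustration.

```python
from typing import Callable, Optional, Tuple

PREMATURE_TERMINATION = 13   # Return Code named in the text above


def ping_vpn_prefix(send_ping: Callable[[int], int],
                    max_ttl: int = 255) -> Tuple[Optional[int], int]:
    """Try TTL 1, 2, ... on the VPN label until the ping is not cut short.

    Returns (ttl that worked, Return Code), or (None, 13) if every TTL
    up to max_ttl ended prematurely.
    """
    code = PREMATURE_TERMINATION
    for ttl in range(1, max_ttl + 1):
        code = send_ping(ttl)
        if code != PREMATURE_TERMINATION:
            return ttl, code
    return None, code
```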

4.8.  Non-compliant Routers

   If the egress for the FEC Stack being pinged does not support LSP
   ping, then no reply will be sent, resulting in possible "false
   negatives".  When in "traceroute" mode, if a transit LSR does not
   support LSP ping, then no reply will be forthcoming from that LSR for
   some TTL, say, n.  The LSR originating the echo request SHOULD try
   sending the echo request with TTL=n+1, n+2, ..., n+k to probe LSRs
   further down the path.  In such a case, the echo request for TTL > n
   SHOULD be sent with the Downstream Detailed Mapping TLV "Downstream
   IP Address" field set to the ALLROUTERS multicast address until a
   reply is received with a Downstream Detailed Mapping TLV.  The Label
   Stack sub-TLV MAY be omitted from the Downstream Detailed Mapping
   TLV.  Furthermore, the "Validate FEC Stack" flag SHOULD NOT be set
   until an echo reply packet with a Downstream Detailed Mapping TLV is
   received.
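
   The probing behavior past a silent hop might look like the sketch
   below.  The probe_past_silent_hop function and the send_probe
   callback are hypothetical; the callback is assumed to return the
   downstream address from the reply's Downstream Detailed Mapping TLV,
   or None on timeout.  Only IPv4 is shown, using the all-routers
   multicast address 224.0.0.2.

```python
from typing import Callable, Optional, Tuple

ALL_ROUTERS_V4 = "224.0.0.2"   # IPv4 all-routers multicast address


def probe_past_silent_hop(
        send_probe: Callable[[int, str, bool], Optional[str]],
        n: int, k: int) -> Optional[Tuple[int, str]]:
    """The hop at TTL n did not answer; try TTL n+1 .. n+k.

    send_probe(ttl, downstream_ip, validate_fec_stack) is called with
    the "Validate FEC Stack" flag cleared until a reply arrives.
    """
    for ttl in range(n + 1, n + k + 1):
        ddmap_ip = send_probe(ttl, ALL_ROUTERS_V4, False)
        if ddmap_ip is not None:
            return ttl, ddmap_ip   # normal tracing can resume here
    return None                    # no LSR answered within k hops
```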

5.  Security Considerations

   Overall, the security needs for LSP ping are similar to those of ICMP
   ping.

   There are at least three approaches to attacking LSRs using the
   mechanisms defined here.  One is a Denial-of-Service (DoS) attack, by
   sending MPLS echo requests/replies to LSRs and thereby increasing
   their workload.  The second is obfuscating the state of the MPLS
   data-plane liveness by spoofing, hijacking, replaying, or otherwise
   tampering with MPLS echo requests and replies.  The third is an
   unauthorized source using an LSP ping to obtain information about the
   network.

   To avoid potential DoS attacks, it is RECOMMENDED that
   implementations regulate the LSP ping traffic going to the control
   plane.  A rate limiter SHOULD be applied to the well-known UDP port
   defined in Section 6.1.
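
   This document does not mandate a particular rate-limiting scheme.
   One common choice is a token bucket, sketched below; the TokenBucket
   class, its parameters, and the explicit time argument are
   assumptions made for illustration.

```python
class TokenBucket:
    """Admit at most `rate` packets per second, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int, start: float = 0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)   # the bucket starts full
        self.last = start            # time of the previous decision

    def allow(self, now: float) -> bool:
        """Decide whether a packet arriving at time `now` is admitted."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(float(self.burst),
                          self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                 # over the limit: drop the packet
```

   In such a design, the limiter would sit in front of the control
   plane and be consulted for each packet received on the LSP ping UDP
   port.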

   Unsophisticated replay and spoofing attacks involving faking or
   replaying MPLS echo reply messages are unlikely to be effective.
   These replies would have to match the Sender's Handle and Sequence
   Number of an outstanding MPLS echo request message.  A non-matching
   replay would be discarded as the sequence has moved on; thus, a spoof
   has only a small window of opportunity.  However, to provide a
   stronger defense, an implementation MAY also validate the TimeStamp
   Sent by requiring an exact match on this field.

   To protect against unauthorized sources using MPLS echo request
   messages to obtain network information, it is RECOMMENDED that
   implementations provide a means of checking the source addresses of
   MPLS echo request messages against an access list before accepting
   the message.
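
   A source-address check of this kind can be sketched with Python's
   standard ipaddress module.  The source_permitted function and the
   example prefixes are hypothetical; how the access list is
   provisioned is left to the implementation.

```python
import ipaddress
from typing import Iterable, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def source_permitted(source: str, access_list: Iterable[Network]) -> bool:
    """Return True if the echo request source matches any permitted prefix."""
    address = ipaddress.ip_address(source)
    return any(address in network for network in access_list)


# Example access list using documentation prefixes.
ACCESS_LIST = [ipaddress.ip_network("192.0.2.0/24"),
               ipaddress.ip_network("198.51.100.0/25")]
```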

   It is not clear how to prevent hijacking (non-delivery) of echo
   requests or replies; however, if these messages are indeed hijacked,
   LSP ping will report that the data plane is not working as it should.

   It does not seem vital (at this point) to secure the data carried in
   MPLS echo requests and replies, although knowledge of the state of
   the MPLS data plane may be considered confidential by some.
   Implementations SHOULD, however, provide a means of filtering the
   addresses to which echo reply messages may be sent.

   The value part of the Pad TLV contains a variable number of octets.
   With the exception of the first octet, these contents, if any, are
   ignored on receipt, and can therefore serve as a clandestine channel.

   When MPLS LSP ping is used within an administrative domain, a
   deployment can increase security by using border filtering of
   incoming LSP ping packets as well as outgoing LSP ping packets.

   Although this document makes special use of 127/8 addresses, these
   are used only in conjunction with the UDP port 3503.  Furthermore,
   these packets are only processed by routers.  All other hosts MUST
   treat all packets with a destination address in the range 127/8 in
   accordance with RFC 1122.  Any packet received by a router with a
   destination address in the range 127/8 without a destination UDP port
   of 3503 MUST be treated in accordance with RFC 1812.  In particular,
   the default behavior is to treat packets destined to a 127/8 address
   as "martians".
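
   The handling rule for 127/8 destinations can be summarized in a
   short sketch.  The classify function and its string results are
   hypothetical, and only IPv4 destinations are considered.

```python
import ipaddress
from typing import Optional

LSP_PING_PORT = 3503
LOOPBACK_V4 = ipaddress.ip_network("127.0.0.0/8")


def classify(destination: str, udp_dst_port: Optional[int],
             is_router: bool) -> str:
    """Classify a received packet by destination address and UDP port.

    udp_dst_port is None for packets that are not UDP.
    """
    if ipaddress.ip_address(destination) not in LOOPBACK_V4:
        return "normal"      # not a 127/8 destination
    if is_router and udp_dst_port == LSP_PING_PORT:
        return "lsp-ping"    # hand the packet to LSP ping processing
    return "martian"         # default treatment for 127/8 packets
```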

   If a network operator wants to prevent tracing inside a tunnel, one
   can use the Pipe Model [RFC3443], i.e., hide the outer MPLS tunnel by
   not propagating the MPLS TTL into the outer tunnel (at the start of
   the outer tunnel).  By doing this, LSP traceroute packets will not
   expire in the outer tunnel, and the outer tunnel will not get traced.

   If one doesn't wish to expose the details of the new outer LSP, then
   the Nil FEC can be used to hide those details.  Using the Nil FEC
   ensures that the trace progresses without false negatives and all
   transit nodes (of the new outer tunnel) perform some minimal
   validations on the received MPLS echo requests.
