RFC 4687

Operations and Management (OAM) Requirements for Point-to-Multipoint MPLS Networks

Pages: 14
Informational

Network Working Group                                        S. Yasukawa
Request for Comments: 4687                               NTT Corporation
Category: Informational                                        A. Farrel
                                                      Old Dog Consulting
                                                                 D. King
                                                      Aria Networks Ltd.
                                                               T. Nadeau
                                                     Cisco Systems, Inc.
                                                          September 2006


             Operations and Management (OAM) Requirements
                 for Point-to-Multipoint MPLS Networks

Status of This Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   Multi-Protocol Label Switching (MPLS) has been extended to encompass
   point-to-multipoint (P2MP) Label Switched Paths (LSPs).  As with
   point-to-point MPLS LSPs, the requirement to detect, handle, and
   diagnose control and data plane defects is critical.

   For operators deploying services based on P2MP MPLS LSPs, the
   detection and specification of how to handle those defects are
   important because such defects not only may affect the fundamentals
   of an MPLS network, but also may impact service level specification
   commitments for customers of their network.

   This document describes requirements for data plane operations and
   management for P2MP MPLS LSPs.  These requirements apply to all forms
   of P2MP MPLS LSPs, and include P2MP Traffic Engineered (TE) LSPs and
   multicast LSPs.

RFC4687 - Page 2

Table of Contents

   1. Introduction ....................................................3
   2. Terminology .....................................................3
      2.1. Conventions Used in This Document ..........................3
      2.2. Terminology ................................................3
      2.3. Acronyms ...................................................3
   3. Motivations .....................................................4
   4. General Requirements ............................................4
      4.1. Detection of Label Switch Path Defects .....................5
      4.2. Diagnosis of a Broken Label Switch Path ....................6
      4.3. Path Characterization ......................................6
      4.4. Service Level Agreement Measurement ........................7
      4.5. Frequency of OAM Execution .................................8
      4.6. Alarm Suppression, Aggregation, and Layer Coordination .....8
      4.7. Support for OAM Interworking for Fault Notification ........8
      4.8. Error Detection and Recovery ...............................9
      4.9. Standard Management Interfaces .............................9
      4.10. Detection of Denial of Service Attacks ...................10
      4.11. Per-LSP Accounting Requirements ..........................10
   5. Security Considerations ........................................10
   6. References .....................................................11
      6.1. Normative References ......................................11
      6.2. Informative References ....................................11
   7. Acknowledgements ...............................................12

RFC4687 - Page 3

1.  Introduction

   This document describes requirements for data plane operations and
   management (OAM) for point-to-multipoint (P2MP) Multi-Protocol Label
   Switching (MPLS).  This document specifies OAM requirements for P2MP
   MPLS, as well as for applications of P2MP MPLS.

   These requirements apply to all forms of P2MP MPLS LSPs, and include
   P2MP Traffic Engineered (TE) LSPs [RFC4461] and [P2MP-RSVP], as well
   as multicast LDP LSPs [MCAST-LDP].

   Note that the requirements for OAM for P2MP MPLS build heavily on the
   requirements for OAM for point-to-point MPLS.  These latter
   requirements are described in [RFC4377] and are not repeated in this
   document.

   For a generic framework for OAM in MPLS networks, refer to [RFC4378].

2.  Terminology

2.1.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The requirements in this document apply to OAM mechanism and protocol
   development, as opposed to the usual application of RFC 2119
   requirements to an actual protocol, as this document does not specify
   a protocol.

2.2.  Terminology

   Definitions of key terms for MPLS OAM are found in [RFC4377] and the
   reader is assumed to be familiar with those definitions, which are
   not repeated here.

   [RFC4461] includes some important definitions and terms for use
   within the context of P2MP MPLS.  The reader should be familiar with
   at least the terminology section of that document.

2.3.  Acronyms

   The following acronyms are used in this document.

   CE:   Customer Edge
   DoS:  Denial of service
   ECMP: Equal Cost Multipath

RFC4687 - Page 4

   LDP:  Label Distribution Protocol
   LSP:  Label Switched Path
   LSR:  Label Switching Router
   OAM:  Operations and Management
   RSVP: Resource reSerVation Protocol
   P2MP: Point-to-Multipoint
   SP:   Service Provider
   TE:   Traffic Engineering

3.  Motivations

   OAM for MPLS networks has been established as a fundamental
   requirement both through operational experience and through its
   documentation in numerous Internet Drafts.  Many such documents (for
   example, [RFC4379], [RFC3812], [RFC3813], [RFC3814], and [RFC3815])
   developed specific solutions to individual issues or problems.
   Coordination of the full OAM requirements for MPLS was achieved by
   [RFC4377] in recognition of the fact that the previous piecemeal
   approach could lead to inconsistent and inefficient applicability of
   OAM techniques across the MPLS architecture, and might require
   significant modifications to operational procedures and systems in
   order to provide consistent and useful OAM functionality.

   This document builds on these realizations and extends the statements
   of MPLS OAM requirements to cover the new area of P2MP MPLS.  That
   is, this document captures the requirements for P2MP MPLS OAM in
   advance of the development of specific solutions.

   Nevertheless, at the time of writing, some effort had already been
   expended to extend existing MPLS OAM solutions to cover P2MP MPLS
   (for example, [P2MP-LSP-PING]).  While this approach of extending
   existing solutions may be reasonable, in order to ensure a consistent
   OAM framework it is necessary to articulate the full set of
   requirements in a single document.  This will facilitate a uniform
   set of MPLS OAM solutions spanning multiple MPLS deployments and
   concurrent applications.

4.  General Requirements

   The general requirements described in this section are similar to
   those described for point-to-point MPLS in [RFC4377].  The
   subsections below do not repeat material from [RFC4377], but simply
   give references to that document.

   However, where the requirements for P2MP MPLS OAM differ from or are
   more extensive than those expressed in [RFC4377], additional text is
   supplied.

RFC4687 - Page 5

   In general, it should be noted that P2MP LSPs introduce a scalability
   issue with respect to OAM that is not present in point-to-point MPLS.
   That is, an individual P2MP LSP will have more than one egress and
   the path to those egresses will very probably not be linear (for
   example, it may have a tree structure).  Since the number of egresses
   for a single P2MP LSP is unknown and not bounded by any small number,
   it follows that all mechanisms defined for OAM support MUST scale
   well with the number of egresses and the complexity of the path of
   the LSP.  Mechanisms that are able to deal with individual egresses
   will scale no worse than similar mechanisms for point-to-point LSPs,
   but it is desirable to develop mechanisms that are able to leverage
   the fact that multiple egresses are associated with a single LSP, and
   so achieve better scaling.

4.1.  Detection of Label Switch Path Defects

   The ability to detect defects in a P2MP LSP SHOULD not require
   manual, hop-by-hop troubleshooting of each LSR used to switch traffic
   for that LSP, and SHOULD rely on proactive OAM procedures (such as
   continuous path connectivity and Service Level Agreement (SLA)
   measurement mechanisms).  Any solutions SHOULD either extend or work
   in close conjunction with existing solutions developed for point-to-
   point MPLS, such as those specified in [RFC4379] where this
   requirement is not contradicted by the other requirements in this
   section.  This will leverage existing software and hardware
   deployments.

   Note that P2MP LSPs may introduce additional scaling concerns for LSP
   probing by tools such as [RFC4379].  As the number of leaves of a
   P2MP LSP increases it potentially becomes more expensive to inspect
   the LSP to detect defects.  Any tool developed for this purpose MUST
   be cognitive of this issue and MUST include techniques to reduce the
   scaling impact of an increase in the number of leaves.  Nevertheless,
   it should also be noted that the introduction of additional leaves
   may mean that the use of techniques such as [RFC4379] are less
   appropriate for defect detection with P2MP LSPs, while the technique
   may still remain useful for defect diagnosis as described in the next
   section.

   Due to the above scaling concerns, LSRs or other network resources
   MUST NOT be overwhelmed by the operation of normal proactive OAM
   procedures, and measures taken to protect LSRs and network resources
   against being overwhelmed MUST NOT degrade the operational value or
   responsiveness of proactive OAM procedures.  Note that reactive OAM
   may violate these limits (i.e., cause visible traffic degradation) if
   it is necessary or useful to try to fix whatever has gone wrong.

RFC4687 - Page 6

   By "overwhelmed" we mean that it MUST NOT be possible for an LSR to
   be so busy handling proactive OAM that it is unable to continue to
   process control or data plane traffic at its advertised rate.
   Similarly, a network resource (such as a data link) MUST NOT be
   carrying so much proactive OAM traffic that it is unable to carry the
   advertised data rate.  At the same time, it is important to configure
   proactive OAM, if it is in use, not to raise alarms caused by the
   failure to receive an OAM message if the component responsible for
   processing the messages is unable to process because other components
   are consuming too many system resources -- such alarms might turn out
   to be false.

   In practice, of course, the requirements in the previous paragraph
   may be met by careful specification of the anticipated data
   throughput of LSRs or data links.  However, it should be recalled
   that proactive OAM procedures may be scaled linearly with the number
   of LSPs, and the number of LSPs is not necessarily a function of the
   available bandwidth in an LSR or on a data link.

4.2.  Diagnosis of a Broken Label Switch Path

   The ability to diagnose a broken P2MP LSP and to isolate the failed
   component (i.e., link or node) in the path is REQUIRED.  These
   functions include a path connectivity test that can test all branches
   and leaves of a P2MP LSP for reachability, as well as a path tracing
   function.  Note that this requirement is distinct from the
   requirement to detect errors or failures described in the previous
   section.  In practice, Detection and Diagnosis/Isolation MAY be
   performed by separate or the same mechanisms according to the way in
   which the other requirements are met.

   It MUST be possible for the operator (or an automated process) to
   stipulate a timeout after which the failure to see a response shall
   be flagged as an error.

   Any mechanism developed to perform these functions is subject to the
   scalability concerns expressed in section 4.

4.3.  Path Characterization

   The path characterization function [RFC4377] is the ability to reveal
   details of LSR forwarding operations for P2MP LSPs.  These details
   can then be compared later during subsequent testing relevant to OAM
   functionality.  Therefore, LSRs supporting P2MP LSPs MUST provide
   mechanisms that allow operators to interrogate and characterize P2MP
   paths.

RFC4687 - Page 7

   Since P2MP paths are more complex than the paths of point-to-point
   LSPs, the scaling concerns expressed in section 4 apply.

   Note that path characterization SHOULD lead to the operator being
   able to determine the full tree for a P2MP LSP.  That is, it is not
   sufficient to know the list of LSRs in the tree, but it is important
   to know their relative order and where the LSP branches.

   Since, in some cases, the control plane state and data paths may
   branch at different points from the control plane and data plane
   topologies (for example, Figure 1), it is not sufficient to present
   the order of LSRs, but it is important that the branching points on
   that tree are clearly identified.

                                       E
                                      /
                         A---B---C===D
                                      \
                                       F

      Figure 1.  An example P2MP tree where the data path and control
      plane state branch at C, but the topology branches at D.

   A diagnostic tool that meets the path characterization requirements
   SHOULD collect information that is easy to process to determine the
   P2MP tree for a P2MP LSP, rather than provide information that must
   be post-processed with some complexity.

4.4.  Service Level Agreement Measurement

   Mechanisms are required to measure the diverse aspects of Service
   Level Agreements for services that utilize P2MP LSPs.  The aspects
   are listed in [RFC4377].

   Service Level Agreements are often measured in terms of the quality
   and rate of data delivery.  In the context of P2MP MPLS, data is
   delivered to multiple egress nodes.  The mechanisms MUST, therefore,
   be capable of measuring the aspects of Service Level Agreements as
   they apply to each of the egress points to a P2MP LSP.  At the same
   time, in order to diagnose issues with meeting Service Level
   Agreements, mechanisms SHOULD be provided to measure the aspects of
   the agreements at key points within the network such as at branch
   nodes on the P2MP tree.

RFC4687 - Page 8

4.5.  Frequency of OAM Execution

   As stipulated in [RFC4377], the operator MUST have the flexibility to
   configure OAM parameters to meet their specific operational
   requirements.  This requirement is potentially more important in P2MP
   deployments where the effects of the execution of OAM functions can
   be potentially much greater than in a non-P2MP configuration.  For
   example, a mechanism that causes each egress of a P2MP LSP to respond
   could result in a large burst of responses to a single OAM request.

   Therefore, solutions produced SHOULD NOT impose any fixed limitations
   on the frequency of the execution of any OAM functions.

4.6.  Alarm Suppression, Aggregation, and Layer Coordination

   As described in [RFC4377], network elements MUST provide alarm
   suppression and aggregation mechanisms to prevent the generation of
   superfluous alarms within or across network layers.  The same time
   constraint issues identified in [RFC4377] also apply to P2MP LSPs.

   A P2MP LSP also brings the possibility of a single fault causing a
   larger number of alarms than for a point-to-point LSP.  This can
   happen because there are a larger number of downstream LSRs (for
   example, a larger number of egresses).  The resultant multiplier in
   the number of alarms could cause swamping of the alarm management
   systems to which the alarms are reported, and serves as a multiplier
   to the number of potentially duplicate alarms raised by the network.

   Alarm aggregation or limitation techniques MUST be applied within any
   solution, or be available within an implementation, so that this
   scaling issue can be reduced.  Note that this requirement introduces
   a second dimension to the concept of alarm aggregation.  Where
   previously it applied to the correlation and suppression of alarms
   generated by different network layers, it now also applies to similar
   techniques applied to alarms generated by multiple downstream LSRs.

4.7.  Support for OAM Interworking for Fault Notification

   [RFC4377] specifies that an LSR supporting the interworking of one or
   more networking technologies over MPLS MUST be able to translate an
   MPLS defect into the native technology's error condition.  This also
   applies to any LSR supporting P2MP LSPs.  However, careful attention
   to the requirements for alarm suppression stipulated therein and in
   section 4.6 SHOULD be observed.

   Note that the time constraints for fault notification and alarm
   propagation affect the solutions that might be applied to the
   scalability problem inherent in certain OAM techniques applied to

RFC4687 - Page 9

   P2MP LSPs.  For example, a solution to the issue of a large number of
   egresses all responding to some form of probe request at the same
   time might be to make the probes less frequent -- but this might
   affect the ability to detect and/or report faults.

   Where fault notification to the egress is required, there is the
   possibility that a single fault will give rise to multiple
   notifications, one to each egress node of the P2MP that is downstream
   of the fault.  Any mechanisms MUST manage this scaling issue while
   still continuing to deliver fault notifications in a timely manner.

   Where fault notification to the ingress is required, the mechanisms
   MUST ensure that the notification identifies the egress nodes of the
   P2MP LSP that are impacted (that is, those downstream of the fault)
   and does not falsely imply that all egress nodes are impacted.

4.8.  Error Detection and Recovery

   Recovery from a fault by a network element can be facilitated by MPLS
   OAM procedures.  As described in [RFC4377], these procedures will
   detect a broad range of defects, and SHOULD be operable where MPLS
   P2MP LSPs span multiple routing areas or multiple Service Provider
   domains.

   The same requirements as those expressed in [RFC4377] with respect to
   automatic repair and operator intervention ahead of customer
   detection of faults apply to P2MP LSPs.

   It should be observed that faults in P2MP LSPs MAY be recovered
   through techniques described in [P2MP-RSVP].

4.9.  Standard Management Interfaces

   The widespread deployment of MPLS requires common information
   modeling of management and control of OAM functionality.  This is
   reflected in the integration of standard MPLS-related MIBs [RFC3812],
   [RFC3813], [RFC3814], [RFC3815] for fault, statistics, and
   configuration management.  These standard interfaces provide
   operators with common programmatic interface access to operations and
   management functions and their status.

   The standard MPLS-related MIB modules [RFC3812], [RFC3813],
   [RFC3814], and [RFC3815] SHOULD be extended wherever possible, to
   support P2MP LSPs, the associated OAM functions on these LSPs, and
   the applications that utilize P2MP LSPs.  Extending them will
   facilitate the reuse of existing management software both in LSRs and
   in management systems.  In cases where the existing MIB modules
   cannot be extended, then new MIB modules MUST be created.

RFC4687 - Page 10

4.10.  Detection of Denial of Service Attacks

   The ability to detect denial of service (DoS) attacks against the
   data or control planes that signal P2MP LSPs MUST be part of any
   security management related to MPLS OAM tools or techniques.

4.11.  Per-LSP Accounting Requirements

   In an MPLS network where P2MP LSPs are in use, Service Providers can
   measure traffic from an LSR to the egress of the network using some
   MPLS-related MIB modules (see section 4.9), for example.  Other
   interfaces MAY exist as well and enable the creation of traffic
   matrices so that it is possible to know how much traffic is traveling
   from where to where within the network.

   Analysis of traffic flows to produce a traffic matrix is more
   complicated where P2MP LSPs are deployed because there is no simple
   pairing relationship between an ingress and a single egress.
   Fundamental to understanding traffic flows within a network that
   supports P2MP LSPs will be the knowledge of where the traffic is
   branched for each LSP within the network, that is, where within the
   network the branch nodes for the LSPs are located and what their
   relationship is to links and other LSRs.  Traffic flow and accounting
   tools MUST take this fact into account.

5.  Security Considerations

   This document introduces no new security issues compared with
   [RFC4377].  It is worth highlighting, however, that any tool designed
   to satisfy the requirements described in this document MUST include
   provisions to prevent its unauthorized use.  Likewise, these tools
   MUST provide a means by which an operator can prevent denial of
   service attacks if those tools are used in such an attack.  LSP mis-
   merging is described in [RFC4377] where it is pointed out that it has
   security implications beyond simply being a network defect.  It needs
   to be stressed that it is in the nature of P2MP traffic flows that
   any erroneous delivery (such as caused by LSP mis-merging) is likely
   to have more far-reaching consequences since the traffic will be
   mis-delivered to multiple receivers.

   As with the OAM functions described in [RFC4377], the performance of
   diagnostic functions and path characterization may involve the
   extraction of a significant amount of information about network
   construction.  The network operator MAY consider this information
   private and wish to take steps to secure it, but further, the volume
   of this information may be considered as a threat to the integrity of

RFC4687 - Page 11

   the network if it is extracted in bulk.  This issue may be greater in
   P2MP MPLS because of the potential for a large number of receivers on
   a single LSP and the consequent extensive path of the LSP.

6.  References

6.1.  Normative References

   [RFC2119]        Bradner, S., "Key words for use in RFCs to Indicate
                    Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4377]        Nadeau, T., Morrow, M., Swallow, G., Allan, D., and
                    S. Matsushima, "Operations and Management (OAM)
                    Requirements for Multi-Protocol Label Switched
                    (MPLS) Networks", RFC 4377, February 2006.

6.2.  Informative References

   [MCAST-LDP]      Minei, I., Ed., Kompella, K., Wijnands, I., Ed., and
                    B. Thomas, "Label Distribution Protocol Extensions
                    for Point-to-Multipoint and Multipoint-to-Multipoint
                    Label Switched Paths", Work in Progress, June 2006.

   [P2MP-LSP-PING]  Yasukawa, S., Farrel, A., Ali, Z., and B. Fenner,
                    "Detecting Data Plane Failures in Point-to-
                    Multipoint MPLS Traffic Engineering - Extensions to
                    LSP Ping", Work in Progress, April 2006.

   [P2MP-RSVP]      Aggarwal, R., Papadimitriou, D., and S. Yasukawa,
                    "Extensions to RSVP-TE for Point to Multipoint TE
                    LSPs", Work in Progress, July 2006.

   [RFC3812]        Srinivasan, C., Viswanathan, A. and T.  Nadeau,
                    "MPLS Traffic Engineering Management Information
                    Base Using SMIv2", RFC3812, June 2004.

   [RFC3813]        Srinivasan, C., Viswanathan, A. and T.  Nadeau,
                    "MPLS Label Switch Router Management Information
                    Base Using SMIv2", RFC3813, June 2004.

   [RFC3814]        Nadeau, T., Srinivasan, C., and A.  Viswanathan,
                    "Multiprotocol Label Switching (MPLS) FEC-To-NHLFE
                    (FTN) Management Information Base", RFC3814, June
                    2004.

RFC4687 - Page 12

   [RFC3815]        Cucchiara, J., Sjostrand, H., and Luciani, J.,
                    "Definitions of Managed Objects for the
                    Multiprotocol Label Switching (MPLS), Label
                    Distribution Protocol (LDP)", RFC 3815, June 2004.

   [RFC4378]        Allan, D. and T. Nadeau, "A Framework for Multi-
                    Protocol Label Switching (MPLS) Operations and
                    Management (OAM)", RFC 4378, February 2006.

   [RFC4379]        Kompella, K. and G. Swallow, "Detecting Multi-
                    Protocol Label Switched (MPLS) Data Plane Failures",
                    RFC 4379, February 2006.

   [RFC4461]        Yasukawa, S., Ed., "Signaling Requirements for
                    Point-to-Multipoint Traffic-Engineered MPLS Label
                    Switched Paths (LSPs)", RFC 4461, April 2006.

7.  Acknowledgements

   The authors wish to acknowledge and thank the following individuals
   for their valuable comments on this document:  Rahul Aggarwal, Neil
   Harrison, Ben Niven-Jenkins, and Dimitri Papadimitriou.

RFC4687 - Page 13

Authors' Addresses

   Seisho Yasukawa
   NTT Corporation
   (R&D Strategy Department)
   3-1, Otemachi 2-Chome Chiyodaku,
   Tokyo 100-8116 Japan

   Phone: +81 3 5205 5341
   EMail: s.yasukawa@hco.ntt.co.jp


   Adrian Farrel
   Old Dog Consulting

   Phone: +44 (0) 1978 860944
   EMail: adrian@olddog.co.uk


   Daniel King
   Aria Networks Ltd.

   Phone: +44 (0)1249 665923
   EMail: daniel.king@aria-networks.com


   Thomas D. Nadeau
   Cisco Systems, Inc.
   1414 Massachusetts Ave.
   Boxborough, MA 01719

   EMail: tnadeau@cisco.com

RFC4687 - Page 14

Full Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).