Internet Engineering Task Force (IETF)                     E. Rosen, Ed.
Request for Comments: 6513                           Cisco Systems, Inc.
Category: Standards Track                               R. Aggarwal, Ed.
ISSN: 2070-1721                                         Juniper Networks
                                                           February 2012

Multicast in MPLS/BGP IP VPNs

Abstract
In order for IP multicast traffic within a BGP/MPLS IP VPN (Virtual Private Network) to travel from one VPN site to another, special protocols and procedures must be implemented by the VPN Service Provider. These protocols and procedures are specified in this document.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6513.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

1. Introduction
2. Overview
   2.1. Optimality vs. Scalability
      2.1.1. Multicast Distribution Trees
      2.1.2. Ingress Replication through Unicast Tunnels
   2.2. Multicast Routing Adjacencies
   2.3. MVPN Definition
   2.4. Auto-Discovery
   2.5. PE-PE Multicast Routing Information
   2.6. PE-PE Multicast Data Transmission
   2.7. Inter-AS MVPNs
   2.8. Optionally Eliminating Shared Tree State
3. Concepts and Framework
   3.1. PE-CE Multicast Routing
   3.2. P-Multicast Service Interfaces (PMSIs)
      3.2.1. Inclusive and Selective PMSIs
      3.2.2. P-Tunnels Instantiating PMSIs
   3.3. Use of PMSIs for Carrying Multicast Data
   3.4. PE-PE Transmission of C-Multicast Routing
      3.4.1. PIM Peering
         3.4.1.1. Full per-MVPN PIM Peering across an MI-PMSI
         3.4.1.2. Lightweight PIM Peering across an MI-PMSI
         3.4.1.3. Unicasting of PIM C-Join/Prune Messages
      3.4.2. Using BGP to Carry C-Multicast Routing
4. BGP-Based Auto-Discovery of MVPN Membership
5. PE-PE Transmission of C-Multicast Routing
   5.1. Selecting the Upstream Multicast Hop (UMH)
      5.1.1. Eligible Routes for UMH Selection
      5.1.2. Information Carried by Eligible UMH Routes
      5.1.3. Selecting the Upstream PE
      5.1.4. Selecting the Upstream Multicast Hop
   5.2. Details of Per-MVPN Full PIM Peering over MI-PMSI
      5.2.1. PIM C-Instance Control Packets
      5.2.2. PIM C-Instance Reverse Path Forwarding (RPF) Determination
   5.3. Use of BGP for Carrying C-Multicast Routing
      5.3.1. Sending BGP Updates
      5.3.2. Explicit Tracking
      5.3.3. Withdrawing BGP Updates
      5.3.4. BSR
6. PMSI Instantiation
   6.1. Use of the Intra-AS I-PMSI A-D Route
      6.1.1. Sending Intra-AS I-PMSI A-D Routes
      6.1.2. Receiving Intra-AS I-PMSI A-D Routes
   6.2. When C-flows Are Specifically Bound to P-Tunnels
   6.3. Aggregating Multiple MVPNs on a Single P-Tunnel
      6.3.1. Aggregate Tree Leaf Discovery
      6.3.2. Aggregation Methodology
      6.3.3. Demultiplexing C-Multicast Traffic
   6.4. Considerations for Specific Tunnel Technologies
      6.4.1. RSVP-TE P2MP LSPs
      6.4.2. PIM Trees
      6.4.3. mLDP P2MP LSPs
      6.4.4. mLDP MP2MP LSPs
      6.4.5. Ingress Replication
7. Binding Specific C-Flows to Specific P-Tunnels
   7.1. General Considerations
      7.1.1. At the PE Transmitting the C-Flow on the P-Tunnel
      7.1.2. At the PE Receiving the C-flow from the P-Tunnel
   7.2. Optimizing Multicast Distribution via S-PMSIs
   7.3. Announcing the Presence of Unsolicited Flooded Data
   7.4. Protocols for Binding C-Flows to P-Tunnels
      7.4.1. Using BGP S-PMSI A-D Routes
         7.4.1.1. Advertising C-Flow Binding to P-Tunnel
         7.4.1.2. Explicit Tracking
      7.4.2. UDP-Based Protocol
         7.4.2.1. Advertising C-Flow Binding to P-Tunnel
         7.4.2.2. Packet Formats and Constants
      7.4.3. Aggregation
8. Inter-AS Procedures
   8.1. Non-Segmented Inter-AS P-Tunnels
      8.1.1. Inter-AS MVPN Auto-Discovery
      8.1.2. Inter-AS MVPN Routing Information Exchange
      8.1.3. Inter-AS P-Tunnels
         8.1.3.1. PIM-Based Inter-AS P-Multicast Trees
         8.1.3.2. The PIM MVPN Join Attribute
            8.1.3.2.1. Definition
            8.1.3.2.2. Usage
   8.2. Segmented Inter-AS P-Tunnels
9. Preventing Duplication of Multicast Data Packets
   9.1. Methods for Ensuring Non-Duplication
      9.1.1. Discarding Packets from Wrong PE
      9.1.2. Single Forwarder Selection
      9.1.3. Native PIM Methods
   9.2. Multihomed C-S or C-RP
   9.3. Switching from the C-RP Tree to the C-S Tree
      9.3.1. How Duplicates Can Occur
      9.3.2. Solution Using Source Active A-D Routes
10. Eliminating PE-PE Distribution of (C-*,C-G) State
   10.1. Co-Locating C-RPs on a PE
      10.1.1. Initial Configuration
      10.1.2. Anycast RP Based on Propagating Active Sources
         10.1.2.1. Receiver(s) within a Site
         10.1.2.2. Source within a Site
         10.1.2.3. Receiver Switching from Shared to Source Tree
   10.2. Using MSDP between a PE and a Local C-RP
11. Support for PIM-BIDIR C-Groups
   11.1. The VPN Backbone Becomes the RPL
      11.1.1. Control Plane
      11.1.2. Data Plane
   11.2. Partitioned Sets of PEs
      11.2.1. Partitions
      11.2.2. Using PE Distinguisher Labels
      11.2.3. Partial Mesh of MP2MP P-Tunnels
12. Encapsulations
   12.1. Encapsulations for Single PMSI per P-Tunnel
      12.1.1. Encapsulation in GRE
      12.1.2. Encapsulation in IP
      12.1.3. Encapsulation in MPLS
   12.2. Encapsulations for Multiple PMSIs per P-Tunnel
      12.2.1. Encapsulation in GRE
      12.2.2. Encapsulation in IP
   12.3. Encapsulations Identifying a Distinguished PE
      12.3.1. For MP2MP LSP P-Tunnels
      12.3.2. For Support of PIM-BIDIR C-Groups
   12.4. General Considerations for IP and GRE Encapsulations
      12.4.1. MTU (Maximum Transmission Unit)
      12.4.2. TTL (Time to Live)
      12.4.3. Avoiding Conflict with Internet Multicast
   12.5. Differentiated Services
13. Security Considerations
14. IANA Considerations
15. Acknowledgments
16. References
   16.1. Normative References
   16.2. Informative References

1. Introduction
[RFC4364] specifies the set of procedures that a Service Provider (SP) must implement in order to provide a particular kind of VPN service ("BGP/MPLS IP VPN") for its customers. The service described therein allows IP unicast packets to travel from one customer site to another, but it does not provide a way for IP multicast traffic to travel from one customer site to another.

This document extends the service defined in [RFC4364] so that it also includes the capability of handling IP multicast traffic. This requires a number of different protocols to work together. The document provides a framework describing how the various protocols fit together, and it also provides a detailed specification of some of the protocols. The detailed specification of some of the other protocols is found in preexisting documents or in companion documents.

A BGP/MPLS IP VPN service that supports multicast is known as a "Multicast VPN" or "MVPN".

Both this document and its companion document [MVPN-BGP] discuss the use of various BGP messages and procedures to provide MVPN support. While every effort has been made to ensure that the two documents are consistent with each other, it is possible that discrepancies have crept in. In the event of any conflict or other discrepancy with respect to the use of BGP in support of MVPN service, [MVPN-BGP] is to be considered to be the authoritative document.

Throughout this document, we will use the term "VPN-IP route" to mean a route that is either in the VPN-IPv4 address family [RFC4364] or in the VPN-IPv6 address family [RFC4659].

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Overview

2.1. Optimality vs. Scalability
In a "BGP/MPLS IP VPN" [RFC4364], unicast routing of VPN packets is achieved without the need to keep any per-VPN state in the core of the SP's network (the "P routers"). Routing information from a particular VPN is maintained only by the Provider Edge routers (the "PE routers", or "PEs") that attach directly to sites of that VPN. Customer data travels through the P routers in tunnels from one PE to another (usually MPLS Label Switched Paths, LSPs), so to support the VPN service the P routers only need to have routes to the PE routers. The PE-to-PE routing is optimal, but the amount of associated state in the P routers depends only on the number of PEs, not on the number of VPNs.

However, in order to provide optimal multicast routing for a particular multicast flow, the P routers through which that flow travels have to hold state that is specific to that flow. A multicast flow is identified by the (source, group) tuple where the source is the IP address of the sender and the group is the IP multicast group address of the destination. Scalability would be poor if the amount of state in the P routers were proportional to the number of multicast flows in the VPNs. Therefore, when supporting multicast service for a BGP/MPLS IP VPN, the optimality of the multicast routing must be traded off against the scalability of the P routers. We explain this below in more detail.

If a particular VPN is transmitting "native" multicast traffic over the backbone, we refer to it as an "MVPN". By "native" multicast traffic, we mean packets that a Customer Edge router (a "CE router" or "CE") sends to a PE, such that the IP destination address of the packets is a multicast group address, or the packets are multicast control packets addressed to the PE router itself, or the packets are IP multicast data packets encapsulated in MPLS.

We say that the backbone multicast routing for a particular multicast group in a particular VPN is "optimal" if and only if all of the following conditions hold:

- When a PE router receives a multicast data packet of that group from a CE router, it transmits the packet in such a way that the packet is received by every other PE router that is on the path to a receiver of that group;

- The packet is not received by any other PEs;

- While in the backbone, no more than one copy of the packet ever traverses any link;

- While in the backbone, if bandwidth usage is to be optimized, the packet traverses minimum cost trees rather than shortest path trees.

Optimal routing for a particular multicast group requires that the backbone maintain one or more source trees that are specific to that flow. Each such tree requires that state be maintained in all the P routers that are in the tree.
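
As an informal illustration only, the first three optimality conditions listed above can be expressed as a check over a record of how one packet crossed the backbone. The sketch below is not part of any protocol defined here; the `Delivery` record, its field names, and the PE and link names are hypothetical, and the fourth condition (minimum cost versus shortest path trees) is deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    """How one multicast data packet crossed the backbone (hypothetical model)."""
    pes_needing_packet: frozenset  # other PEs on the path to a receiver of the group
    pes_reached: frozenset         # PEs that actually received the packet
    copies_per_link: dict          # backbone link name -> copies that crossed it


def is_optimal(d: Delivery) -> bool:
    """Check conditions 1-3 only; tree cost (condition 4) is not modelled."""
    every_needed_pe_reached = d.pes_needing_packet <= d.pes_reached
    no_unneeded_pe_reached = d.pes_reached <= d.pes_needing_packet
    one_copy_per_link = all(n <= 1 for n in d.copies_per_link.values())
    return every_needed_pe_reached and no_unneeded_pe_reached and one_copy_per_link


needed = frozenset({"PE2", "PE3"})

# A tree built only for this flow: exactly the needed PEs, one copy per link.
per_flow_tree = Delivery(needed, needed, {"PE1-P1": 1, "P1-PE2": 1, "P1-PE3": 1})

# A tree shared with other traffic: PE4 receives data it has no use for.
shared_tree = Delivery(needed, needed | {"PE4"},
                       {"PE1-P1": 1, "P1-PE2": 1, "P1-PE3": 1, "P1-PE4": 1})

# Replication at the ingress PE: two copies cross the first link.
replicated = Delivery(needed, needed, {"PE1-P1": 2, "P1-PE2": 1, "P1-PE3": 1})

print(is_optimal(per_flow_tree), is_optimal(shared_tree), is_optimal(replicated))
```

Only the per-flow tree passes the check, which is the case that requires flow-specific state in every P router on the tree.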

Potentially, such optimal routing would require an unbounded amount of state in the P routers, since the SP has no control of the number of multicast groups in the VPNs that it supports. The SP also doesn't have any control over the number of transmitters in each group, nor over the distribution of the receivers.

The procedures defined in this document allow an SP to provide multicast VPN service, without requiring the amount of state maintained by the P routers to be proportional to the number of multicast data flows in the VPNs. The amount of state is traded off against the optimality of the multicast routing. Enough flexibility is provided so that a given SP can make its own trade-offs between scalability and optimality. An SP can even allow some multicast groups in some VPNs to receive optimal routing, while others do not. Of course, the cost of this flexibility is an increase in the number of options provided by the protocols.

The basic technique for providing scalability is to aggregate a number of customer multicast flows onto a single multicast distribution tree through the P routers. A number of aggregation methods are supported.

The procedures defined in this document also accommodate the SP that does not want to build multicast distribution trees in its backbone at all; the ingress PE can replicate each multicast data packet and then unicast each replica through a tunnel to each egress PE that needs to receive the data.

2.1.1. Multicast Distribution Trees

This document supports the use of a single multicast distribution tree in the backbone to carry all the multicast traffic from a specified set of one or more MVPNs. Such a tree is referred to as an "Inclusive Tree". An Inclusive Tree that carries the traffic of more than one MVPN is an "Aggregate Inclusive Tree". An Inclusive Tree contains, as its members, all the PEs that attach to any of the MVPNs using the tree.

With this option, even if each tree supports only one MVPN, the upper bound on the amount of state maintained by the P routers is proportional to the number of VPNs supported rather than to the number of multicast flows in those VPNs. If the trees are unidirectional, it would be more accurate to say that the state is proportional to the product of the number of VPNs and the average number of PEs per VPN. The amount of state maintained by the P routers can be further reduced by aggregating more MVPNs onto a single tree. If each such tree supports a set of MVPNs (call it an "MVPN aggregation set"), the state maintained by the P routers is proportional to the product of the number of MVPN aggregation sets and the average number of PEs per MVPN. Thus, the state does not grow linearly with the number of MVPNs.

However, as data from many multicast groups is aggregated together onto a single Inclusive Tree, it is likely that some PEs will receive multicast data for which they have no need, i.e., some degree of optimality has been sacrificed.

This document also provides procedures that enable a single multicast distribution tree in the backbone to be used to carry traffic belonging only to a specified set of one or more multicast groups, from one or more MVPNs. Such a tree is referred to as a "Selective Tree" and more specifically as an "Aggregate Selective Tree" when the multicast groups belong to different MVPNs. By default, traffic from most multicast groups could be carried by an Inclusive Tree, while traffic from, e.g., high bandwidth groups could be carried in one of the Selective Trees. When setting up the Selective Trees, one should include only those PEs that need to receive multicast data from one or more of the groups assigned to the tree. This provides more optimal routing than can be obtained by using only Inclusive Trees, though it requires additional state in the P routers.

2.1.2. Ingress Replication through Unicast Tunnels

This document also provides procedures for carrying MVPN data traffic through unicast tunnels from the ingress PE to each of the egress PEs. The ingress PE replicates the multicast data packet received from a CE and sends it to each of the egress PEs using the unicast tunnels. This requires no multicast routing state in the P routers at all, but it puts the entire replication load on the ingress PE router and makes no attempt to optimize the multicast routing.

2.2. Multicast Routing Adjacencies

In BGP/MPLS IP VPNs [RFC4364], each CE (Customer Edge) router is a unicast routing adjacency of a PE router, but CE routers at different sites do not become unicast routing adjacencies of each other. This important characteristic is retained for multicast routing -- a CE router becomes a multicast routing adjacency of a PE router, but CE routers at different sites do not become multicast routing adjacencies of each other.

We will use the term "C-tree" to refer to a multicast distribution tree whose nodes include CE routers. (See Section 3.1 for further explication of this terminology.)

The multicast routing protocol on the PE-CE link is presumed to be PIM (Protocol Independent Multicast) [PIM-SM]. Both the ASM (Any-Source Multicast) and the SSM (Source-Specific Multicast) service models are supported. Thus, both shared C-trees and source-specific C-trees are supported. Shared C-trees may be unidirectional or bidirectional; in the latter case, the multicast routing protocol is presumed to be the BIDIR-PIM [BIDIR-PIM] "variant" of PIM-SM. A CE router exchanges "ordinary" PIM control messages with the PE router to which it is attached.

Support for PIM-DM (Dense Mode) is outside the scope of this document.

The PEs attaching to a particular MVPN then have to exchange the multicast routing information with each other. Two basic methods for doing this are defined: (1) PE-PE PIM and (2) BGP. In the former case, the PEs need to be multicast routing adjacencies of each other. In the latter case, they do not. For example, each PE may be a BGP adjacency of a route reflector (RR) and not of any other PEs.

In order to support the "Carrier's Carrier" model of [RFC4364], mLDP (Label Distribution Protocol Extensions for Multipoint Label Switched Paths) [MLDP] may also be supported on the PE-CE interface. The use of mLDP on the PE-CE interface is described in [MVPN-BGP].

The use of BGP on the PE-CE interface is not within the scope of this document.

2.3. MVPN Definition

An MVPN is defined by two sets of sites: the Sender Sites set and the Receiver Sites set, with the following properties:

- Hosts within the Sender Sites set could originate multicast traffic for receivers in the Receiver Sites set.

- Receivers not in the Receiver Sites set should not be able to receive this traffic.

- Hosts within the Receiver Sites set could receive multicast traffic originated by any host in the Sender Sites set.

- Hosts within the Receiver Sites set should not be able to receive multicast traffic originated by any host that is not in the Sender Sites set.
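
Read together, the four properties above amount to a membership test on the two sets. The following is a minimal sketch of that reading, not a mechanism specified by this document; the `Mvpn` class, the `may_deliver` method, and the site names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Mvpn:
    """An MVPN modelled only as its two site sets (illustrative names)."""
    name: str
    sender_sites: set = field(default_factory=set)
    receiver_sites: set = field(default_factory=set)

    def may_deliver(self, source_site: str, receiver_site: str) -> bool:
        # Traffic is allowed only from a Sender Site to a Receiver Site.
        return (source_site in self.sender_sites
                and receiver_site in self.receiver_sites)


# One site originates traffic; two other sites only receive it.
video = Mvpn("video", sender_sites={"hq"}, receiver_sites={"branch-a", "branch-b"})

print(video.may_deliver("hq", "branch-a"))        # sender to receiver
print(video.may_deliver("branch-a", "branch-b"))  # source is not a Sender Site
print(video.may_deliver("hq", "partner"))         # destination is not a Receiver Site
```

A site that appears in both sets passes the test in either role, which corresponds to a site whose hosts can both originate and receive multicast traffic.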

A site could be both in the Sender Sites set and Receiver Sites set, which implies that hosts within such a site could both originate and receive multicast traffic. An extreme case is when the Sender Sites set is the same as the Receiver Sites set, in which case all sites could originate and receive multicast traffic from each other.

Sites within a given MVPN may be either within the same organization or in different organizations, which implies that an MVPN can be either an Intranet or an Extranet.

A given site may be in more than one MVPN, which implies that MVPNs may overlap.

Not all sites of a given MVPN have to be connected to the same service provider, which implies that an MVPN can span multiple service providers.

Another way to look at an MVPN is to say that an MVPN is defined by a set of administrative policies. Such policies determine both the Sender Sites set and Receiver Sites set. Such policies are established by MVPN customers, but implemented/realized by MVPN Service Providers using the existing BGP/MPLS VPN mechanisms, such as Route Targets (RTs), with extensions, as necessary.

2.4. Auto-Discovery

In order for the PE routers attaching to a given MVPN to exchange MVPN control information with each other, each one needs to discover all the other PEs that attach to the same MVPN. (Strictly speaking, a PE in the Receiver Sites set need only discover the other PEs in the Sender Sites set, and a PE in the Sender Sites set need only discover the other PEs in the Receiver Sites set.) This is referred to as "MVPN Auto-Discovery".

This document discusses two ways of providing MVPN auto-discovery:

- BGP can be used for discovering and maintaining MVPN membership. The PE routers advertise their MVPN membership to other PE routers using BGP. A PE is considered to be a "member" of a particular MVPN if it contains a VRF (Virtual Routing and Forwarding table, see [RFC4364]) that is configured to contain the multicast routing information of that MVPN. This auto-discovery option does not make any assumptions about the methods used for transmitting MVPN multicast data packets through the backbone.

- If it is known that the PE-PE multicast control packets (i.e., PIM packets) of a particular MVPN are to be transmitted through a non-aggregated Inclusive Tree supporting the ASM service model (e.g., through a tree that is created by non-SSM PIM-SM or by BIDIR-PIM), and if the PEs attaching to that MVPN are configured with the group address corresponding to that tree, then the PEs can auto-discover each other simply by joining the tree and then multicasting PIM Hellos over the tree.

2.5. PE-PE Multicast Routing Information

The BGP/MPLS IP VPN [RFC4364] specification requires a PE to maintain, at most, one BGP peering with every other PE in the network. This peering is used to exchange VPN routing information. The use of route reflectors further reduces the number of BGP adjacencies maintained by a PE to exchange VPN routing information with other PEs. This document describes various options for exchanging MVPN control information between PE routers based on the use of PIM or BGP. These options have different overheads with respect to the number of routing adjacencies that a PE router needs to maintain to exchange MVPN control information with other PE routers. Some of these options allow the retention of the unicast BGP/MPLS VPN model letting a PE maintain, at most, one BGP routing adjacency with other PE routers to exchange MVPN control information. BGP also provides reliable transport and uses incremental updates. Another option is the use of the currently existing "soft state" PIM standard [PIM-SM] that uses periodic complete updates.

2.6. PE-PE Multicast Data Transmission

Like [RFC4364], this document decouples the procedures for exchanging routing information from the procedures for transmitting data traffic. Hence, a variety of transport technologies may be used in the backbone. For Inclusive Trees, these transport technologies include: unicast PE-PE tunnels, using encapsulation in MPLS, IP, or GRE (Generic Routing Encapsulation); multicast distribution trees created by PIM (either unidirectional in the SSM or ASM service models or bidirectional) using IP/GRE encapsulation; point-to-multipoint LSPs created by RSVP - Traffic Engineering (RSVP-TE) or mLDP; and multipoint-to-multipoint LSPs created by mLDP.

In order to aggregate traffic from multiple MVPNs onto a single multicast distribution tree, it is necessary to have a mechanism to enable the egresses of the tree to demultiplex the multicast traffic received over the tree and to associate each received packet with a particular MVPN. This document specifies a mechanism whereby upstream label assignment [MPLS-UPSTREAM-LABEL] is used by the root of the tree to assign a label to each flow. This label is used by the receivers to perform the demultiplexing. This document also describes procedures based on BGP that are used by the root of an Aggregate Tree to advertise the Inclusive and/or Selective binding and the demultiplexing information to the leaves of the tree.

This document also describes the data plane encapsulations for supporting the various SP multicast transport options.

The specification for aggregating traffic of multiple MVPNs onto a single multipoint-to-multipoint LSP or onto a single bidirectional multicast distribution tree is outside the scope of this document.

The specifications for using, as Selective Trees, multicast distribution trees that support the ASM service model are outside the scope of this document. The specification for using multipoint-to-multipoint LSPs as Selective Trees is outside the scope of this document.

This document assumes that when SP multicast trees are used, traffic for a particular multicast group is transmitted by a particular PE on only one SP multicast tree. The use of multiple SP multicast trees for transmitting traffic belonging to a particular multicast group is outside the scope of this document.

2.7. Inter-AS MVPNs

[RFC4364] describes different options for supporting BGP/MPLS IP unicast VPNs whose provider backbones contain more than one Autonomous System (AS). These are known as "inter-AS VPNs". In an inter-AS VPN, the ASes may belong to the same provider or to different providers.

This document describes how inter-AS MVPNs can be supported for each of the unicast BGP/MPLS VPN inter-AS options. This document also specifies a model where inter-AS MVPN service can be offered without requiring a single SP multicast tree to span multiple ASes. In this model, an inter-AS multicast tree consists of a number of "segments", one per AS, that are stitched together at AS boundary points. These are known as "segmented inter-AS trees". Each segment of a segmented inter-AS tree may use a different multicast transport technology.

It is also possible to support inter-AS MVPNs with non-segmented source trees that extend across AS boundaries.
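
As a rough picture of the segmented model described above, a segmented inter-AS tree can be thought of as one segment per AS, each recording its own transport technology and the AS whose segment feeds it. The sketch below is purely illustrative and assumes a loop-free set of segments; the `Segment` type, the `segments_toward` helper, and the private-use AS numbers are hypothetical and do not correspond to any procedure in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    asn: int                     # AS that this segment stays inside
    technology: str              # transport used within that AS
    upstream_asn: Optional[int]  # AS whose segment feeds this one; None at the root


def segments_toward(leaf_asn: int, segments: dict) -> list:
    """Return the stitched segments from the root AS down to leaf_asn."""
    path = []
    asn = leaf_asn
    while asn is not None:  # assumes the upstream links contain no loop
        seg = segments[asn]
        path.append(seg)
        asn = seg.upstream_asn
    path.reverse()
    return path


# Keyed by AS number, so there is exactly one segment per AS.
tree = {
    65001: Segment(65001, "RSVP-TE P2MP LSP", None),
    65002: Segment(65002, "PIM tree", 65001),
    65003: Segment(65003, "mLDP P2MP LSP", 65001),
}

for seg in segments_toward(65003, tree):
    print(seg.asn, seg.technology)
```

Traffic reaching the leaf AS in this example crosses two segments that use different technologies, joined at the boundary between the two ASes.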
2.8. Optionally Eliminating Shared Tree State
This document also discusses some options and protocol extensions that can be used to eliminate the need for the PE routers to distribute to each other the (*,G) and (*,G,rpt) states that occur when the VPNs are creating unidirectional C-trees to support the ASM service model.