
RFC 8365

A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)

Pages: 33
Proposed Standard

Top   ToC   RFC8365 - Page 1
Internet Engineering Task Force (IETF)                   A. Sajassi, Ed.
Request for Comments: 8365                                         Cisco
Category: Standards Track                                  J. Drake, Ed.
ISSN: 2070-1721                                                  Juniper
                                                                N. Bitar
                                                                   Nokia
                                                              R. Shekhar
                                                                 Juniper
                                                               J. Uttaro
                                                                    AT&T
                                                           W. Henderickx
                                                                   Nokia
                                                              March 2018


  A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)

Abstract

   This document specifies how Ethernet VPN (EVPN) can be used as a
   Network Virtualization Overlay (NVO) solution and explores the
   various tunnel encapsulation options over IP and their impact on the
   EVPN control plane and procedures.  In particular, the following
   encapsulation options are analyzed: Virtual Extensible LAN (VXLAN),
   Network Virtualization using Generic Routing Encapsulation (NVGRE),
   and MPLS over GRE.  This specification is also applicable to Generic
   Network Virtualization Encapsulation (GENEVE); however, some
   incremental work is required, which will be covered in a separate
   document.  This document also specifies new multihoming procedures
   for split-horizon filtering and mass withdrawal.  It also specifies
   EVPN route constructions for VXLAN/NVGRE encapsulations and
   Autonomous System Border Router (ASBR) procedures for multihoming of
   Network Virtualization Edge (NVE) devices.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   https://www.rfc-editor.org/info/rfc8365.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Requirements Notation and Conventions
   3. Terminology
   4. EVPN Features
   5. Encapsulation Options for EVPN Overlays
      5.1. VXLAN/NVGRE Encapsulation
         5.1.1. Virtual Identifiers Scope
         5.1.2. Virtual Identifiers to EVI Mapping
         5.1.3. Constructing EVPN BGP Routes
      5.2. MPLS over GRE
   6. EVPN with Multiple Data-Plane Encapsulations
   7. Single-Homing NVEs - NVE Residing in Hypervisor
      7.1. Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE
      7.2. Impact on EVPN Procedures for VXLAN/NVGRE Encapsulations
   8. Multihoming NVEs - NVE Residing in ToR Switch
      8.1. EVPN Multihoming Features
         8.1.1. Multihomed ES Auto-Discovery
         8.1.2. Fast Convergence and Mass Withdrawal
         8.1.3. Split-Horizon
         8.1.4. Aliasing and Backup Path
         8.1.5. DF Election
      8.2. Impact on EVPN BGP Routes and Attributes
      8.3. Impact on EVPN Procedures
         8.3.1. Split Horizon
         8.3.2. Aliasing and Backup Path
         8.3.3. Unknown Unicast Traffic Designation
   9. Support for Multicast
   10. Data-Center Interconnections (DCIs)
      10.1. DCI Using GWs
      10.2. DCI Using ASBRs
         10.2.1. ASBR Functionality with Single-Homing NVEs
         10.2.2. ASBR Functionality with Multihoming NVEs
   11. Security Considerations
   12. IANA Considerations
   13. References
      13.1. Normative References
      13.2. Informative References
   Acknowledgements
   Contributors
   Authors' Addresses

1. Introduction

   This document specifies how Ethernet VPN (EVPN) [RFC7432] can be used
   as a Network Virtualization Overlay (NVO) solution and explores the
   various tunnel encapsulation options over IP and their impact on the
   EVPN control plane and procedures.  In particular, the following
   encapsulation options are analyzed: Virtual Extensible LAN (VXLAN)
   [RFC7348], Network Virtualization using Generic Routing Encapsulation
   (NVGRE) [RFC7637], and MPLS over Generic Routing Encapsulation (GRE)
   [RFC4023].  This specification is also applicable to Generic Network
   Virtualization Encapsulation (GENEVE) [GENEVE]; however, some
   incremental work is required, which will be covered in a separate
   document [EVPN-GENEVE].  This document also specifies new multihoming
   procedures for split-horizon filtering and mass withdrawal.  It also
   specifies EVPN route constructions for VXLAN/NVGRE encapsulations and
   Autonomous System Border Router (ASBR) procedures for multihoming of
   Network Virtualization Edge (NVE) devices.

   In the context of this document, an NVO is a solution to address the
   requirements of a multi-tenant data center, especially one with
   virtualized hosts, e.g., Virtual Machines (VMs) or virtual workloads.
   The key requirements of such a solution, as described in [RFC7364],
   are the following:

   -  Isolation of network traffic per tenant

   -  Support for a large number of tenants (tens or hundreds of
      thousands)

   -  Extension of Layer 2 (L2) connectivity among different VMs
      belonging to a given tenant segment (subnet) across different
      Points of Delivery (PoDs) within a data center or between
      different data centers

   -  Allowing a given VM to move between different physical points of
      attachment within a given L2 segment

   The underlay network for NVO solutions is assumed to provide IP
   connectivity between NVO endpoints.

   This document describes how EVPN can be used as an NVO solution and
   explores applicability of EVPN functions and procedures.  In
   particular, it describes the various tunnel encapsulation options for
   EVPN over IP and their impact on the EVPN control plane as well as
   procedures for two main scenarios:

   (a)  single-homing NVEs - when an NVE resides in the hypervisor, and

   (b)  multihoming NVEs - when an NVE resides in a Top-of-Rack (ToR)
        device.

   The possible encapsulation options for EVPN overlays that are
   analyzed in this document are:

   -  VXLAN and NVGRE

   -  MPLS over GRE

   Before getting into the description of the different encapsulation
   options for EVPN over IP, it is important to highlight the EVPN
   solution's main features, how those features are currently supported,
   and any impact that the encapsulation has on those features.

2. Requirements Notation and Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Terminology

   Most of the terminology used in this document comes from [RFC7432]
   and [RFC7365].

   VXLAN:  Virtual Extensible LAN

   GRE:  Generic Routing Encapsulation

   NVGRE:  Network Virtualization using Generic Routing Encapsulation

   GENEVE:  Generic Network Virtualization Encapsulation

   PoD:  Point of Delivery

   NV:  Network Virtualization

   NVO:  Network Virtualization Overlay

   NVE:  Network Virtualization Edge

   VNI:  VXLAN Network Identifier

   VSID:  Virtual Subnet Identifier (for NVGRE)

   I-SID:  Service Instance Identifier

   EVPN:  Ethernet VPN

   EVI:  EVPN Instance.  An EVPN instance spanning the Provider Edge
      (PE) devices participating in that EVPN

   MAC-VRF:  A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE

   IP-VRF:  A Virtual Routing and Forwarding table for Internet Protocol
      (IP) addresses on a PE

   ES:  Ethernet Segment.  When a customer site (device or network) is
      connected to one or more PEs via a set of Ethernet links, then
      that set of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI):  A unique non-zero identifier that
      identifies an Ethernet segment is called an 'Ethernet Segment
      Identifier'.

   Ethernet Tag:  An Ethernet tag identifies a particular broadcast
      domain, e.g., a VLAN.  An EVPN instance consists of one or more
      broadcast domains.

   PE:  Provider Edge

   Single-Active Redundancy Mode:  When only a single PE, among all the
      PEs attached to an ES, is allowed to forward traffic to/from that
      ES for a given VLAN, then the Ethernet segment is defined to be
      operating in Single-Active redundancy mode.

   All-Active Redundancy Mode:  When all PEs attached to an Ethernet
      segment are allowed to forward known unicast traffic to/from that
      ES for a given VLAN, then the ES is defined to be operating in
      All-Active redundancy mode.

   PIM-SM:  Protocol Independent Multicast - Sparse-Mode

   PIM-SSM:  Protocol Independent Multicast - Source-Specific Multicast

   BIDIR-PIM:  Bidirectional PIM

4. EVPN Features

   EVPN [RFC7432] was originally designed to support the requirements
   detailed in [RFC7209] and therefore has the following attributes
   which directly address control-plane scaling and ease of deployment
   issues.

   1.   Control-plane information is distributed with BGP and broadcast
        and multicast traffic is sent using a shared multicast tree or
        with ingress replication.

   2.   Control-plane learning is used for MAC (and IP) addresses
        instead of data-plane learning.  The latter requires the
        flooding of unknown unicast and Address Resolution Protocol
        (ARP) frames; whereas, the former does not require any flooding.

   3.   Route Reflector (RR) is used to reduce a full mesh of BGP
        sessions among PE devices to a single BGP session between a PE
        and the RR.  Furthermore, RR hierarchy can be leveraged to scale
        the number of BGP routes on the RR.

   4.   Auto-discovery via BGP is used to discover PE devices
        participating in a given VPN, PE devices participating in a
        given redundancy group, tunnel encapsulation types, multicast
        tunnel types, multicast members, etc.

   5.   All-Active multihoming is used.  This allows a given Customer
        Edge (CE) device to have multiple links to multiple PEs, and
        traffic to/from that CE fully utilizes all of these links.

   6.   When a link between a CE and a PE fails, the PEs for that EVI
        are notified of the failure via the withdrawal of a single EVPN
        route.  This allows those PEs to remove the withdrawing PE as a
        next hop for every MAC address associated with the failed link.
        This is termed "mass withdrawal".

   7.   BGP route filtering and constrained route distribution are
        leveraged to ensure that the control-plane traffic for a given
        EVI is only distributed to the PEs in that EVI.

   8.   When an IEEE 802.1Q [IEEE.802.1Q] interface is used between a CE
        and a PE, each of the VLAN IDs (VIDs) on that interface can be
        mapped onto a bridge table (for up to 4094 such bridge tables).
        All these bridge tables may be mapped onto a single MAC-VRF (in
        case of VLAN-aware bundle service).

   9.   VM Mobility mechanisms ensure that all PEs in a given EVI know
        the ES with which a given VM, as identified by its MAC and IP
        addresses, is currently associated.

   10.  RTs are used to allow the operator (or customer) to define a
        spectrum of logical network topologies including mesh, hub and
        spoke, and extranets (e.g., a VPN whose sites are owned by
        different enterprises), without the need for proprietary
        software or the aid of other virtual or physical devices.

   Because the design goal for NVO is millions of instances per common
   physical infrastructure, the scaling properties of the control plane
   for NVO are extremely important.  EVPN and the extensions described
   herein are designed with this level of scalability in mind.
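
   The "mass withdrawal" attribute (item 6 above) can be illustrated
   with a small sketch.  It is not taken from this document: the class
   and method names are hypothetical, and the model is deliberately
   simplified so that next hops are tracked per Ethernet segment rather
   than per MAC address.

```python
from collections import defaultdict


class RemoteMacTable:
    """Toy model (hypothetical) of remote MAC state on one PE."""

    def __init__(self):
        self.mac_to_esi = {}                   # MAC -> Ethernet segment ID
        self.esi_next_hops = defaultdict(set)  # ESI -> PEs reachable via that ES

    def learn(self, mac, esi, pe):
        # Record that `mac` sits behind `esi` and that `pe` serves that ES.
        self.mac_to_esi[mac] = esi
        self.esi_next_hops[esi].add(pe)

    def withdraw_es_route(self, esi, pe):
        # One withdrawal for the ES removes `pe` for every MAC behind it.
        self.esi_next_hops[esi].discard(pe)

    def next_hops(self, mac):
        return sorted(self.esi_next_hops[self.mac_to_esi[mac]])


table = RemoteMacTable()
for mac in ("00:00:5e:00:53:01", "00:00:5e:00:53:02"):
    table.learn(mac, "es-1", "PE1")
    table.learn(mac, "es-1", "PE2")
table.withdraw_es_route("es-1", "PE1")
```

   After the single withdrawal, both MAC addresses resolve only to PE2,
   without any per-MAC route having been withdrawn.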

5. Encapsulation Options for EVPN Overlays

5.1. VXLAN/NVGRE Encapsulation

   Both VXLAN and NVGRE are examples of technologies that provide a data
   plane encapsulation which is used to transport a packet over the
   common physical IP infrastructure between Network Virtualization
   Edges (NVEs) - e.g., VXLAN Tunnel End Points (VTEPs) in a VXLAN
   network.  Both of these technologies include the identifier of the
   specific NVO instance, VNI in VXLAN and VSID in NVGRE, in each
   packet.  In the remainder of this document we use VNI as the
   representation for the NVO instance with the understanding that VSID
   can equally be used if the encapsulation is NVGRE unless it is stated
   otherwise.

   Note that a PE is equivalent to an NVE/VTEP.

   VXLAN encapsulation is based on UDP, with an 8-byte header following
   the UDP header.  VXLAN provides a 24-bit VNI, which typically
   provides a one-to-one mapping to the tenant VID, as described in
   [RFC7348].  In this scenario, the ingress VTEP does not include an
   inner VLAN tag on the encapsulated frame, and the egress VTEP
   discards the frames with an inner VLAN tag.  This mode of operation
   in [RFC7348] maps to VLAN-Based Service in [RFC7432], where a tenant
   VID gets mapped to an EVI.

   VXLAN also provides an option of including an inner VLAN tag in the
   encapsulated frame, if explicitly configured at the VTEP.  This mode
   of operation can map to VLAN Bundle Service in [RFC7432] because all
   the tenant's tagged frames map to a single bridge table / MAC-VRF,
   and the inner VLAN tag is not used for lookup by the disposition PE
   when performing VXLAN decapsulation as described in Section 6 of
   [RFC7348].
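
   As an illustration of the 8-byte VXLAN header and its 24-bit VNI
   mentioned above, the following sketch packs and parses the header.
   The field layout (flags octet with the "I" bit, reserved bits, VNI,
   reserved octet) is taken from [RFC7348], not from this document, and
   the function names are hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag per RFC 7348


def vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header carrying `vni`."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    _, word2 = struct.unpack("!II", header[:8])
    return word2 >> 8
```

   For example, vxlan_header(5000) yields the octets 08 00 00 00 00 13
   88 00, and parse_vni() recovers 5000 from them.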

   [RFC7637] encapsulation is based on GRE encapsulation, and it
   mandates the inclusion of the optional GRE Key field, which carries
   the VSID.  There is a one-to-one mapping between the VSID and the
   tenant VID, as described in [RFC7637].  The inclusion of an inner
   VLAN tag is prohibited.  This mode of operation in [RFC7637] maps to
   VLAN Based Service in [RFC7432].

   As described in the next section, there is no change to the encoding
   of EVPN routes to support VXLAN or NVGRE encapsulation, except for
   the use of the BGP Encapsulation Extended Community to indicate the
   encapsulation type (e.g., VXLAN or NVGRE).  However, there is
   potential impact to the EVPN procedures depending on where the NVE is
   located (i.e., in hypervisor or ToR) and whether multihoming
   capabilities are required.

5.1.1. Virtual Identifiers Scope

   Although VNIs are defined as 24-bit globally unique values, there are
   scenarios in which it is desirable to use a locally significant value
   for the VNI, especially in the context of a data-center interconnect.

5.1.1.1. Data-Center Interconnect with Gateway

   In the case where NVEs in different data centers need to be
   interconnected, and the NVEs need to use VNIs as globally unique
   identifiers within a data center, then a Gateway (GW) needs to be
   employed at the edge of the data-center network (DCN).  This is
   because the Gateway will provide the functionality of translating the
   VNI when crossing network boundaries, which may align with operator
   span-of-control boundaries.

   As an example, consider the network of Figure 1.  Assume there are
   three network operators: one for each of the DC1, DC2, and WAN
   networks.  The Gateways at the edge of the data centers are
   responsible for translating the VNIs between the values used in each
   of the DCNs and the values used in the WAN.

                             +--------------+
                             |              |
           +---------+       |     WAN      |       +---------+
   +----+  |        +---+  +----+        +----+  +---+        |  +----+
   |NVE1|--|        |   |  |WAN |        |WAN |  |   |        |--|NVE3|
   +----+  |IP      |GW |--|Edge|        |Edge|--|GW | IP     |  +----+
   +----+  |Fabric  +---+  +----+        +----+  +---+ Fabric |  +----+
   |NVE2|--|         |       |              |       |         |--|NVE4|
   +----+  +---------+       +--------------+       +---------+  +----+

   |<------ DC 1 ------>                          <------ DC2  ------>|

              Figure 1: Data-Center Interconnect with Gateway
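
   The translation role of the Gateways in Figure 1 can be sketched as a
   pair of lookup tables.  This is only an illustration under assumed
   names and VNI values; this document does not define how a Gateway
   stores or provisions its mappings.

```python
class GatewayVniTranslator:
    """Hypothetical one-to-one VNI translation table for one Gateway."""

    def __init__(self, dc_to_wan):
        self.dc_to_wan = dict(dc_to_wan)
        self.wan_to_dc = {wan: dc for dc, wan in self.dc_to_wan.items()}
        if len(self.wan_to_dc) != len(self.dc_to_wan):
            raise ValueError("VNI mapping must be one-to-one")

    def to_wan(self, dc_vni):
        # DCN-side VNI -> VNI used in the WAN.
        return self.dc_to_wan[dc_vni]

    def to_dc(self, wan_vni):
        # WAN-side VNI -> VNI used inside this DCN.
        return self.wan_to_dc[wan_vni]


# Illustrative values: the same tenant segment is VNI 100 in DC1,
# VNI 9001 in the WAN, and VNI 200 in DC2.
gw_dc1 = GatewayVniTranslator({100: 9001})
gw_dc2 = GatewayVniTranslator({200: 9001})
vni_in_dc2 = gw_dc2.to_dc(gw_dc1.to_wan(100))
```

   Each operator can renumber its own VNI space without coordinating
   with the others, as long as the Gateway tables are kept in step.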

5.1.1.2. Data-Center Interconnect without Gateway

   In the case where NVEs in different data centers need to be
   interconnected, and the NVEs need to use locally assigned VNIs (e.g.,
   similar to MPLS labels), there may be no need to employ Gateways at
   the edge of the DCN.  More specifically, the VNI value that is used
   by the transmitting NVE is allocated by the NVE that is receiving the
   traffic (in other words, this is similar to a "downstream-assigned"
   MPLS label).  This allows the VNI space to be decoupled between
   different DCNs without the need for a dedicated Gateway at the edge
   of the data centers.  This topic is covered in Section 10.2.

                           +--------------+
                           |              |
           +---------+     |     WAN      |    +---------+
   +----+  |         |   +----+        +----+  |         |  +----+
   |NVE1|--|         |   |ASBR|        |ASBR|  |         |--|NVE3|
   +----+  |IP Fabric|---|    |        |    |--|IP Fabric|  +----+
   +----+  |         |   +----+        +----+  |         |  +----+
   |NVE2|--|         |     |              |    |         |--|NVE4|
   +----+  +---------+     +--------------+    +---------+  +----+

   |<------ DC 1 ----->                        <---- DC2  ------>|

              Figure 2: Data-Center Interconnect with ASBR
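
   The "downstream-assigned" behavior described in Section 5.1.1.2 can
   be sketched as follows.  The classes, the tuple used to stand in for
   an EVPN route, and the starting VNI values are all assumptions made
   for illustration only.

```python
import itertools


class EgressNve:
    """Receiving NVE: allocates a locally significant VNI per domain."""

    def __init__(self, name, first_vni):
        self.name = name
        self._next_vni = itertools.count(first_vni)
        self.vni_for_domain = {}

    def advertise(self, domain):
        # Allocate once per broadcast domain, then reuse the same VNI.
        if domain not in self.vni_for_domain:
            self.vni_for_domain[domain] = next(self._next_vni)
        return (self.name, domain, self.vni_for_domain[domain])


class IngressNve:
    """Transmitting NVE: uses whatever VNI the receiver advertised."""

    def __init__(self):
        self.tx_vni = {}

    def on_route(self, route):
        egress, domain, vni = route
        self.tx_vni[(egress, domain)] = vni

    def vni_to_send(self, egress, domain):
        return self.tx_vni[(egress, domain)]


nve3 = EgressNve("NVE3", first_vni=500)
nve4 = EgressNve("NVE4", first_vni=7000)
nve1 = IngressNve()
nve1.on_route(nve3.advertise("tenant-a"))
nve1.on_route(nve4.advertise("tenant-a"))
```

   In this sketch NVE1 sends the same tenant segment with VNI 500
   toward NVE3 and VNI 7000 toward NVE4, so the two VNI spaces never
   need to agree.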

5.1.2. Virtual Identifiers to EVI Mapping

   Just like in [RFC7432], where two options existed for mapping
   broadcast domains (represented by VLAN IDs) to an EVI, when the EVPN
   control plane is used in conjunction with VXLAN (or NVGRE
   encapsulation), there are also two options for mapping broadcast
   domains represented by VXLAN VNIs (or NVGRE VSIDs) to an EVI:

   Option 1: A Single Broadcast Domain per EVI

      In this option, a single Ethernet broadcast domain (e.g., subnet)
      represented by a VNI is mapped to a unique EVI.  This corresponds
      to the VLAN-Based Service in [RFC7432], where a tenant-facing
      interface, logical interface (e.g., represented by a VID), or
      physical interface gets mapped to an EVI.  As such, a BGP Route
      Distinguisher (RD) and Route Target (RT) are needed per VNI on
      every NVE.  The advantage of this model is that it allows the BGP
      RT constraint mechanisms to be used in order to limit the
      propagation and import of routes to only the NVEs that are
      interested in a given VNI.  The disadvantage of this model may be
      the provisioning overhead if the RD and RT are not derived
      automatically from the VNI.

      In this option, the MAC-VRF table is identified by the RT in the
      control plane and by the VNI in the data plane.  In this option,
      the specific MAC-VRF table corresponds to only a single bridge
      table.

   Option 2: Multiple Broadcast Domains per EVI

      In this option, multiple subnets, each represented by a unique
      VNI, are mapped to a single EVI.  For example, if a tenant has
      multiple segments/subnets each represented by a VNI, then all the
      VNIs for that tenant are mapped to a single EVI; for example, the
      EVI in this case represents the tenant and not a subnet.  This
      corresponds to the VLAN-aware bundle service in [RFC7432].  The
      advantage of this model is that it doesn't require the
      provisioning of an RD/RT per VNI.  However, this is a moot point
      when compared to Option 1 where auto-derivation is used.  The
      disadvantage of this model is that routes would be imported by
      NVEs that may not be interested in a given VNI.

      In this option, the MAC-VRF table is identified by the RT in the
      control plane; a specific bridge table for that MAC-VRF is
      identified by the <RT, Ethernet Tag ID> in the control plane.  In
      this option, the VNI in the data plane is sufficient to identify a
      specific bridge table.

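
   The two mapping options can be contrasted with a small data-structure
   sketch.  The MacVrf class, the helper names, and the RT strings are
   hypothetical; they only illustrate how many bridge tables sit behind
   each MAC-VRF in each option.

```python
from dataclasses import dataclass, field


@dataclass
class MacVrf:
    route_target: str
    # VNI -> bridge table (MAC -> next hop); one entry in Option 1.
    bridge_tables: dict = field(default_factory=dict)


def single_domain_per_evi(vni_to_rt):
    """Option 1: one MAC-VRF (with its own RT) per VNI."""
    return [MacVrf(rt, {vni: {}}) for vni, rt in vni_to_rt.items()]


def multiple_domains_per_evi(route_target, vnis):
    """Option 2: one MAC-VRF per tenant, one bridge table per VNI."""
    return MacVrf(route_target, {vni: {} for vni in vnis})


def bridge_table_for(mac_vrfs, vni):
    """Data-plane lookup: in both options the VNI selects the table."""
    for vrf in mac_vrfs:
        if vni in vrf.bridge_tables:
            return vrf.bridge_tables[vni]
    raise KeyError(vni)


option1 = single_domain_per_evi({100: "65000:100", 200: "65000:200"})
option2 = multiple_domains_per_evi("65000:1", [100, 200])
```

   Option 1 produces two MAC-VRFs with one bridge table each, whereas
   Option 2 produces a single MAC-VRF holding both bridge tables.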
5.1.2.1. Auto-Derivation of RT

   In order to simplify configuration, when the option of a single VNI
   per EVI is used, the RT used for EVPN can be auto-derived.  RD can be
   auto-generated as described in [RFC7432], and RT can be auto-derived
   as described next.

   Since a Gateway PE as depicted in Figure 1 participates in both the
   DCN and WAN BGP sessions, it is important that, when RT values are
   auto-derived from VNIs, there be no conflict in RT spaces between
   DCNs and WANs, assuming that both are operating within the same
   Autonomous System (AS).  Also, there can be scenarios where both
   VXLAN and NVGRE encapsulations may be needed within the same DCN, and
   their corresponding VNIs are administered independently, which means
   VNI spaces can overlap.  In order to avoid conflict in RT spaces, the
   6-byte RT values with 2-octet AS number for DCNs can be auto-derived
   as follows:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Global Administrator         |    Local Administrator        |
    +-----------------------------------------------+---------------+
    | Local Administrator (Cont.)   |
    +-------------------------------+

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Global Administrator          |A| TYPE| D-ID  | Service ID    |
    +-----------------------------------------------+---------------+
    | Service ID (Cont.)            |
    +-------------------------------+

   The 6-octet RT field consists of two sub-fields:

   -  Global Administrator sub-field: 2 octets.  This sub-field contains
      an AS number assigned by IANA
      <https://www.iana.org/assignments/as-numbers/>.

   -  Local Administrator sub-field: 4 octets

      *  A: A single-bit field indicating if this RT is auto-derived

            0 : auto-derived
            1 : manually derived

      *  Type: A 3-bit field that identifies the space in which the
         other 3 bytes are defined.  The following spaces are defined:

            0 : VID (802.1Q VLAN ID)
            1 : VXLAN
            2 : NVGRE
            3 : I-SID
            4 : EVI
            5 : dual-VID (QinQ VLAN ID)

      *  D-ID: A 4-bit field that identifies domain-id.  The default
         value of domain-id is zero, indicating that only a single
         numbering space exists for a given technology.  However, if
         more than one number space exists for a given technology (e.g.,
         overlapping VXLAN spaces), then each of the number spaces needs
         to be identified by its corresponding domain-id starting from
         1.

      *  Service ID: This 3-octet field is set to VNI, VSID, I-SID, or
         VID.

   It should be noted that RT auto-derivation is applicable for 2-octet
   AS numbers.  For 4-octet AS numbers, the RT needs to be manually
   configured because a 3-octet VNI field cannot fit within the 2-octet
   local administrator field.
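
   The auto-derivation above amounts to simple bit packing, sketched
   below.  The function name and the RT_TYPE table are hypothetical;
   the sketch returns only the 6-octet RT value shown in the diagrams
   and does not build the surrounding BGP extended community.

```python
import struct

# Type values as listed in Section 5.1.2.1.
RT_TYPE = {"VID": 0, "VXLAN": 1, "NVGRE": 2, "I-SID": 3, "EVI": 4,
           "dual-VID": 5}


def auto_derive_rt(as_number, space, service_id, domain_id=0):
    """Return the 6-octet auto-derived RT value as bytes."""
    if not 0 <= as_number < 2**16:
        raise ValueError("auto-derivation requires a 2-octet AS number")
    if not 0 <= domain_id < 2**4:
        raise ValueError("D-ID is a 4-bit field")
    if not 0 <= service_id < 2**24:
        raise ValueError("Service ID is a 3-octet field")
    a_bit = 0  # 0 means auto-derived
    local_admin = ((a_bit << 31) | (RT_TYPE[space] << 28)
                   | (domain_id << 24) | service_id)
    # 2-octet Global Administrator followed by 4-octet Local Administrator.
    return struct.pack("!HI", as_number, local_admin)
```

   For AS 65000 and VXLAN VNI 10000 in the default domain, the result is
   the octets fd e8 10 00 27 10.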

5.1.3. Constructing EVPN BGP Routes

   In EVPN, an MPLS label, for instance, identifying the forwarding
   table is distributed by the egress PE via the EVPN control plane and
   is placed in the MPLS header of a given packet by the ingress PE.
   This label is used upon receipt of that packet by the egress PE for
   disposition of that packet.  This is very similar to the use of the
   VNI by the egress NVE, with the difference being that an MPLS label
   has local significance while a VNI typically has global significance.
   Accordingly, and specifically to support the option of locally
   assigned VNIs, the MPLS Label1 field in the MAC/IP Advertisement
   route, the MPLS label field in the Ethernet A-D per EVI route, and
   the MPLS label field in the P-Multicast Service Interface (PMSI)
   Tunnel attribute of the Inclusive Multicast Ethernet Tag (IMET) route
   are used to carry the VNI.  For the balance of this memo, the above
   MPLS label fields will be referred to as the VNI field.  The VNI
   field is used for both local and global VNIs; for either case, the
   entire 24-bit field is used to encode the VNI value.

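
   The difference between carrying a VNI and carrying an MPLS label in
   the same 3-octet field can be sketched as follows.  The function
   names are hypothetical, and the MPLS case assumes the [RFC7432]
   convention that the label occupies the high-order 20 bits of the
   field.

```python
def encode_vni_field(vni: int) -> bytes:
    """VNI case: the entire 24-bit field carries the VNI value."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return vni.to_bytes(3, "big")


def encode_mpls_label_field(label: int) -> bytes:
    """MPLS case (assumed RFC 7432 layout): label in the top 20 bits."""
    if not 0 <= label < 2**20:
        raise ValueError("MPLS label must fit in 20 bits")
    return (label << 4).to_bytes(3, "big")
```

   The same numeric value therefore appears on the wire differently:
   10000 is encoded as 00 27 10 when it is a VNI and as 02 71 00 when
   it is an MPLS label, which is one reason a receiver needs the
   encapsulation type to interpret the field.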
   For the VLAN-Based Service (a single VNI per MAC-VRF), the Ethernet
   Tag field in the MAC/IP Advertisement, Ethernet A-D per EVI, and IMET
   route MUST be set to zero just as in the VLAN-Based Service in
   [RFC7432].

   For the VLAN-Aware Bundle Service (multiple VNIs per MAC-VRF with
   each VNI associated with its own bridge table), the Ethernet Tag
   field in the MAC Advertisement, Ethernet A-D per EVI, and IMET route
   MUST identify a bridge table within a MAC-VRF; the set of Ethernet
   Tags for that EVI needs to be configured consistently on all PEs
   within that EVI.  For locally assigned VNIs, the value advertised in
   the Ethernet Tag field MUST be set to a VID just as in the VLAN-aware
   bundle service in [RFC7432].  Such setting must be done consistently
   on all PE devices participating in that EVI within a given domain.
   For global VNIs, the value advertised in the Ethernet Tag field
   SHOULD be set to a VNI as long as it matches the existing semantics
   of the Ethernet Tag, i.e., it identifies a bridge table within a
   MAC-VRF and the set of VNIs are configured consistently on each PE in
   that EVI.
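
   The Ethernet Tag rules above can be summarized as a small decision
   function.  This is a sketch with hypothetical parameter names; it
   follows the SHOULD for global VNIs as if it were always applied.

```python
def ethernet_tag(service, vni_scope=None, vid=None, vni=None):
    """Pick the Ethernet Tag value for an EVPN route (illustrative)."""
    if service == "vlan-based":
        # A single VNI per MAC-VRF: the Ethernet Tag MUST be zero.
        return 0
    if service == "vlan-aware-bundle":
        if vni_scope == "local":
            # Locally assigned VNIs: the tag MUST be set to a VID.
            if vid is None:
                raise ValueError("a VID is required for local VNIs")
            return vid
        if vni_scope == "global":
            # Global VNIs: the tag SHOULD be set to the VNI.
            if vni is None:
                raise ValueError("a VNI is required for global VNIs")
            return vni
    raise ValueError("unsupported service or VNI scope")
```

   For instance, ethernet_tag("vlan-aware-bundle", "global", vni=10000)
   returns 10000, while any VLAN-Based Service route gets 0.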

   In order to indicate which type of data-plane encapsulation (i.e.,
   VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP
   Encapsulation Extended Community defined in [RFC5512] is included
   with all EVPN routes (i.e., MAC Advertisement, Ethernet A-D per EVI,
   Ethernet A-D per ESI, IMET, and Ethernet Segment) advertised by an
   egress PE.  Five new values have been assigned by IANA to extend the
   list of encapsulation types defined in [RFC5512]; they are listed in
   Section 12.

   The MPLS encapsulation tunnel type, listed in Section 12, is needed
   in order to distinguish between an advertising node that only
   supports non-MPLS encapsulations and one that supports MPLS and
   non-MPLS encapsulations.  An advertising node that only supports MPLS
   encapsulation does not need to advertise any encapsulation tunnel
   types; i.e., if the BGP Encapsulation Extended Community is not
   present, then either MPLS encapsulation or a statically configured
   encapsulation is assumed.

   The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
   be set to the IPv4 or IPv6 address of the NVE.  The remaining fields
   in each route are set as per [RFC7432].

   Note that the procedure defined here -- to use the MPLS Label field
   to carry the VNI in the presence of a Tunnel Encapsulation Extended
   Community specifying the use of a VNI -- is aligned with the
   procedures described in Section 8.2.2.2 of [TUNNEL-ENCAP] ("When a
   Valid VNI has not been Signaled").

5.2. MPLS over GRE

   The EVPN data plane is modeled as an EVPN MPLS client layer sitting
   over an MPLS PSN tunnel server layer.  Some of the EVPN functions
   (split-horizon, Aliasing, and Backup Path) are tied to the MPLS
   client layer.  If MPLS over GRE encapsulation is used, then the EVPN
   MPLS client layer can be carried over an IP PSN tunnel transparently.
   Therefore, there is no impact to the EVPN procedures and associated
   data-plane operation.

   [RFC4023] defines the standard for using MPLS over GRE encapsulation,
   which can be used for this purpose.  However, when MPLS over GRE is
   used in conjunction with EVPN, it is recommended that the GRE key
   field be present and be used to provide a 32-bit entropy value only
   if the P nodes can perform Equal-Cost Multipath (ECMP) hashing based
   on the GRE key; otherwise, the GRE header SHOULD NOT include the GRE
   key field.  The Checksum and Sequence Number fields MUST NOT be
   included, and the corresponding C and S bits in the GRE header MUST
   be set to zero.  A PE capable of supporting this encapsulation SHOULD
   advertise its EVPN routes along with the Tunnel Encapsulation
   Extended Community indicating MPLS over GRE encapsulation as
   described in the previous section.
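
   The GRE header constraints above can be sketched as follows.  The
   flag bit positions and the MPLS unicast protocol type (0x8847) come
   from the GRE and MPLS specifications rather than from this document,
   and the function name is hypothetical.

```python
import struct

GRE_FLAG_CHECKSUM = 0x8000  # C bit, MUST be zero here
GRE_FLAG_KEY = 0x2000       # K bit, set only when a key is carried
GRE_FLAG_SEQUENCE = 0x1000  # S bit, MUST be zero here
ETHERTYPE_MPLS_UNICAST = 0x8847


def gre_header(entropy=None) -> bytes:
    """Build a GRE header for MPLS over GRE, optionally with a key."""
    flags = 0  # C and S stay cleared; no Checksum or Sequence Number
    if entropy is None:
        # P nodes cannot hash on the key: omit the key field entirely.
        return struct.pack("!HH", flags, ETHERTYPE_MPLS_UNICAST)
    if not 0 <= entropy < 2**32:
        raise ValueError("entropy must fit in the 32-bit key field")
    return struct.pack("!HHI", flags | GRE_FLAG_KEY,
                       ETHERTYPE_MPLS_UNICAST, entropy)
```

   Without a key the header is 4 octets (00 00 88 47); with the key
   used for entropy it grows to 8 octets and only the K bit is set.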

6. EVPN with Multiple Data-Plane Encapsulations

   The use of the BGP Encapsulation Extended Community per [RFC5512]
   allows each NVE in a given EVI to know each of the encapsulations
   supported by each of the other NVEs in that EVI.  That is, each of
   the NVEs in a given EVI may support multiple data-plane
   encapsulations.  An ingress NVE can send a frame to an egress NVE
   only if the set of encapsulations advertised by the egress NVE forms
   a non-empty intersection with the set of encapsulations supported by
   the ingress NVE; it is at the discretion of the ingress NVE which
   encapsulation to choose from this intersection.  (As noted in
   Section 5.1.3, if the BGP Encapsulation extended community is not
   present, then the default MPLS encapsulation or a locally configured
   encapsulation is assumed.)

   When a PE advertises multiple supported encapsulations, it MUST
   advertise encapsulations that use the same EVPN procedures including
   procedures associated with split-horizon filtering described in
   Section 8.3.1.  For example, VXLAN and NVGRE (or MPLS and MPLS over
   GRE) encapsulations use the same EVPN procedures; thus, a PE can
   advertise both of them and can support either of them or both of them
   simultaneously.  However, a PE MUST NOT advertise VXLAN and MPLS
   encapsulations together because (a) the MPLS field of EVPN routes is
   set to either an MPLS label or a VNI, but not both and (b) some EVPN
   procedures (such as split-horizon filtering) are different for VXLAN/
   NVGRE and MPLS encapsulations.
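
   The two rules above (choose from the intersection, and never mix
   encapsulations with different EVPN procedures) can be sketched as
   follows.  The names and the grouping table are hypothetical, and the
   sketch assumes MPLS as the default when no community is present,
   ignoring the locally configured alternative.

```python
# Encapsulations that share EVPN procedures are grouped together.
PROCEDURE_FAMILY = {
    "VXLAN": "vni-based",
    "NVGRE": "vni-based",
    "MPLS": "label-based",
    "MPLS-in-GRE": "label-based",
}


def validate_advertised(encaps):
    """Reject a set that mixes procedure families (e.g., VXLAN + MPLS)."""
    families = {PROCEDURE_FAMILY[e] for e in encaps}
    if len(families) > 1:
        raise ValueError("encapsulations use different EVPN procedures")


def usable_encaps(ingress_supported, egress_advertised):
    """Return the encapsulations the ingress NVE may choose from."""
    # Assumption: an absent community means default MPLS encapsulation.
    advertised = set(egress_advertised) or {"MPLS"}
    return set(ingress_supported) & advertised
```

   An empty result from usable_encaps() means the ingress NVE cannot
   send frames to that egress NVE, which is the condition the operator
   is expected to avoid.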

   An ingress node that uses shared multicast trees for sending
   broadcast or multicast frames MAY maintain distinct trees for each
   different encapsulation type.

   It is the responsibility of the operator of a given EVI to ensure
   that all of the NVEs in that EVI support at least one common
   encapsulation.  If this condition is violated, it could result in
   service disruption or failure.  The use of the BGP Encapsulation
   Extended Community provides a method to detect when this condition is
   violated, but the actions to be taken are at the discretion of the
   operator and are outside the scope of this document.

7. Single-Homing NVEs - NVE Residing in Hypervisor

When an NVE and its hosts/VMs are co-located in the same physical device, e.g., when they reside in a server, the links between them are virtual and they typically share fate. That is, the subject hosts/VMs are typically not multihomed or, if they are multihomed, the multihoming is a purely local matter to the server hosting the VM and the NVEs, and it need not be "visible" to any other NVEs residing on other servers. Thus, it does not require any specific protocol mechanisms. The most common case of this is when the NVE resides on the hypervisor. In the subsections that follow, we will discuss the impact on EVPN procedures for the case when the NVE resides on the hypervisor and the VXLAN (or NVGRE) encapsulation is used.

7.1. Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE Encapsulations

   In scenarios where different groups of data centers are under
   different administrative domains, and these data centers are
   connected via one or more backbone core providers as described in
   [RFC7365], the RD must be a unique value per EVI or per NVE as
   described in [RFC7432].  In other words, whenever there is more than
   one administrative domain for global VNI, a unique RD must be used;
   or, whenever the VNI value has local significance, a unique RD must
   be used.  Therefore, it is recommended to use a unique RD as
   described in [RFC7432] at all times.

   When the NVEs reside on the hypervisor, the EVPN BGP routes and
   attributes associated with multihoming are no longer required.  This
   reduces the required routes and attributes to the following subset of
   four out of the total of eight listed in Section 7 of [RFC7432]:

   -  MAC/IP Advertisement Route

   -  Inclusive Multicast Ethernet Tag Route

   -  MAC Mobility Extended Community

   -  Default Gateway Extended Community

   However, as noted in Section 8.6 of [RFC7432], in order to enable a
   single-homing ingress NVE to take advantage of fast convergence,
   Aliasing, and Backup Path when interacting with multihomed egress
   NVEs attached to a given ES, the single-homing ingress NVE should be
   able to receive and process routes that are Ethernet A-D per ES and
   Ethernet A-D per EVI.

7.2. Impact on EVPN Procedures for VXLAN/NVGRE Encapsulations

   When the NVEs reside on the hypervisors, the EVPN procedures
   associated with multihoming are no longer required.  This limits the
   procedures on the NVE to the following subset.

   1.  Local learning of MAC addresses received from the VMs per
       Section 9.1 of [RFC7432].

   2.  Advertising locally learned MAC addresses in BGP using the MAC/IP
       Advertisement routes.

   3.  Performing remote learning using BGP per Section 9.2 of
       [RFC7432].

   4.  Discovering other NVEs and constructing the multicast tunnels
       using the IMET routes.

   5.  Handling MAC address mobility events per the procedures of
       Section 15 in [RFC7432].

   However, as noted in Section 8.6 of [RFC7432], in order to enable a
   single-homing ingress NVE to take advantage of fast convergence,
   Aliasing, and Backup Path when interacting with multihomed egress
   NVEs attached to a given ES, a single-homing ingress NVE should
   implement the ingress node processing of routes that are Ethernet A-D
   per ES and Ethernet A-D per EVI as defined in Sections 8.2 ("Fast
   Convergence") and 8.4 ("Aliasing and Backup Path") of [RFC7432].
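
   Steps 1 through 3 of the subset above can be sketched with a toy
   NVE model.  The class, the dictionary standing in for a MAC/IP
   Advertisement route, and the addresses are all hypothetical; BGP
   itself, IMET handling, and MAC mobility are left out.

```python
class SingleHomedNve:
    """Toy single-homing NVE with local and remote MAC learning."""

    def __init__(self, address):
        self.address = address
        self.mac_table = {}  # MAC -> "local" or the remote NVE address

    def learn_local(self, mac):
        # Steps 1 and 2: learn from a VM, then build the advertisement.
        self.mac_table[mac] = "local"
        return {"type": "MAC/IP Advertisement", "mac": mac,
                "next_hop": self.address}

    def on_remote_route(self, route):
        # Step 3: install MACs advertised by other NVEs; ignore our own.
        if (route["type"] == "MAC/IP Advertisement"
                and route["next_hop"] != self.address):
            self.mac_table[route["mac"]] = route["next_hop"]


nve_a = SingleHomedNve("192.0.2.1")
nve_b = SingleHomedNve("192.0.2.2")
route = nve_a.learn_local("00:00:5e:00:53:0a")
nve_b.on_remote_route(route)
nve_a.on_remote_route(route)  # a reflected copy of our own route
```

   After this exchange NVE B forwards frames for the MAC address toward
   192.0.2.1 without any data-plane flooding, and NVE A still treats
   the address as local.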

