Data centers (DCs) are critical components of the infrastructure used by network operators to provide services to their customers. DCs (sites) are interconnected by a backbone network, which consists of any number of private networks and/or the Internet. DCs are attached to the backbone network by routers that are gateways (GWs). One DC typically has more than one GW for various reasons including commercial preferences, load balancing, or resiliency against connection or device failure.
Segment Routing (SR) ([RFC 8402]) is a protocol mechanism that can be used within a DC as well as for steering traffic that flows between two DC sites. In order for a source site (also known as an ingress site) that uses SR to load-balance the flows it sends to a destination site (also known as an egress site), it needs to know the complete set of entry nodes (i.e., GWs) for that egress DC from the backbone network connecting the two DCs. Note that it is assumed that the connected set of DC sites and the border nodes in the backbone network on the paths that connect the DC sites are part of the same SR BGP - Link State (LS) instance (see [RFC 7752] and [RFC 9086]) so that traffic engineering using SR may be used for these flows.
Other sites, such as access networks, also need to be connected across backbone networks through gateways. For illustrative purposes, consider the ingress and egress sites shown in Figure 1 as separate Autonomous Systems (ASes) (noting that the sites could be implemented as part of the ASes to which they are attached, or as separate ASes). The various ASes that provide connectivity between the ingress and egress sites could each be constructed differently and use different technologies such as IP; MPLS using global table routing information from BGP; MPLS IP VPN; SR-MPLS IP VPN; or SRv6 IP VPN. That is, the ingress and egress sites can be connected by tunnels across a variety of technologies. This document describes how SR Segment Identifiers (SIDs) are used to identify the paths between the ingress and egress sites.
The solution described in this document is agnostic as to whether the transit ASes do or do not have SR capabilities. The solution uses SR to stitch together path segments between GWs and through the Autonomous System Border Routers (ASBRs). Thus, there is a requirement that the GWs and ASBRs are SR capable. The solution supports the SR path being extended into the ingress and egress sites if they are SR capable.
The solution defined in this document can be seen in the broader context of site interconnection in [SR-INTERCONNECT]. That document shows how other existing protocol elements may be combined with the solution defined in this document to provide a full system, but it is not a necessary reference for understanding this document.
Suppose that there are two gateways, GW1 and GW2 as shown in Figure 1, for a given egress site and that they each advertise a route to prefix X, which is located within the egress site, with each setting itself as the next hop. One might think that the GWs for X could be inferred from the routes' next-hop fields, but typically both routes do not get distributed across the backbone: only the best route, as selected by BGP, is distributed. This precludes load-balancing flows across both GWs.
 -----------------                    ---------------------
| Ingress         |                  | Egress     ------   |
| Site            |                  | Site      |Prefix|  |
|                 |                  |           |   X  |  |
|                 |                  |            ------   |
|       --        |                  |   ---          ---  |
|      |GW|       |                  |  |GW1|        |GW2| |
 -------++--------                    ----+-----------+-+--
        | \                               |          /  |
        |  \                              |         /   |
        |  -+-------------        --------+--------+--  |
        | ||ASBR|     ----|      |----  |ASBR| |ASBR| | |
        | | ----     |ASBR+------+ASBR|  ----   ----  | |
        | |           ----|      |----                | |
        | |               |      |                    | |
        | |           ----|      |----                | |
        | | AS1      |ASBR+------+ASBR|      AS2      | |
        | |           ----|      |----                | |
        |  ---------------        --------------------  |
      --+-----------------------------------------------+--
     | |ASBR|                                       |ASBR| |
     |  ----                  AS3                    ----  |
     |                                                     |
      -----------------------------------------------------

                          Figure 1
The obvious solution to this problem is to use the BGP feature that allows the advertisement of multiple paths in BGP (known as Add-Paths) ([RFC 7911]) to ensure that all routes to X get advertised by BGP. However, even if this is done, the identity of the GWs will be lost as soon as the routes get distributed through an ASBR that will set itself to be the next hop. And if there are multiple ASes in the backbone, not only will the next hop change several times, but the Add-Paths technique will experience scaling issues. This all means that the Add-Paths approach is effectively limited to sites connected over a single AS.
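The two failure modes described above can be sketched with a toy model. This is only an illustration under stated assumptions, not a BGP implementation: the `Route` class, the helper functions, the node name `ASBR-a`, and the lowest-name tie-break are all hypothetical stand-ins for real route objects and real best-path selection.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    path_id: int = 0  # stands in for an Add-Paths path identifier (illustrative)


def best_path_only(routes):
    """Without Add-Paths, only one route per prefix is propagated."""
    best = {}
    for r in routes:
        # Toy tie-break (lowest next-hop name); real BGP selection is far richer.
        if r.prefix not in best or r.next_hop < best[r.prefix].next_hop:
            best[r.prefix] = r
    return list(best.values())


def asbr_readvertise(routes, asbr):
    """An ASBR that sets itself as next hop rewrites that field on every route."""
    return [replace(r, next_hop=asbr) for r in routes]


# Both egress GWs advertise prefix X with themselves as next hop.
egress_routes = [Route("X", "GW1", 1), Route("X", "GW2", 2)]

# Case 1: best route only, so a single GW is visible from the next hop.
gws_seen_single = {r.next_hop for r in best_path_only(egress_routes)}

# Case 2: Add-Paths keeps both routes, but the ASBR hides the GW identities.
with_add_paths = asbr_readvertise(egress_routes, "ASBR-a")
gws_seen_add_paths = {r.next_hop for r in with_add_paths}
```

In this model, `gws_seen_single` contains one GW only, and `gws_seen_add_paths` contains only the hypothetical ASBR even though both routes survived, which is the loss of GW identity the text describes.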
This document defines a solution that uses the Tunnel Encapsulation attribute ([RFC 9012]) to overcome this limitation and that works equally well with a backbone constructed from one or more ASes, as follows:
- When a GW to a given site advertises a route to a prefix X within that site, it will include a Tunnel Encapsulation attribute that contains the union of the Tunnel Encapsulation attributes advertised by each of the GWs to that site, including itself.
In other words, each route advertised by a GW identifies all of the GWs to the same site (see Section 3 for a discussion of how GWs discover each other). That is, the Tunnel Encapsulation attribute advertised by each GW contains multiple Tunnel TLVs, one or more from each active GW, and each Tunnel TLV contains a Tunnel Egress Endpoint sub-TLV that identifies the GW for that Tunnel TLV. Therefore, even if only one of the routes is distributed to other ASes, it will not matter how many times the next hop changes, as the Tunnel Encapsulation attribute will remain unchanged.
To put this in the context of Figure 1, GW1 and GW2 discover each other as gateways for the egress site. Both GW1 and GW2 advertise themselves as having routes to prefix X. Furthermore, GW1 includes a Tunnel Encapsulation attribute, which is the union of its Tunnel Encapsulation attribute and GW2's Tunnel Encapsulation attribute. Similarly, GW2 includes a Tunnel Encapsulation attribute, which is the union of its Tunnel Encapsulation attribute and GW1's Tunnel Encapsulation attribute. The gateway in the ingress site can now see all possible paths to X in the egress site regardless of which route is propagated to it, and it can choose one or balance traffic flows as it sees fit.
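The union behavior in this example can be sketched as a small model. It is a simplified illustration under stated assumptions, not the encoding defined in [RFC 9012]: `TunnelTLV`, `Route`, the helper functions, the `example-tunnel` type label, and the ASBR names are hypothetical, and the attribute is modeled as a plain set rather than as wire-format TLVs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TunnelTLV:
    tunnel_type: str      # placeholder label, not a registered tunnel type
    egress_endpoint: str  # models the Tunnel Egress Endpoint sub-TLV


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    tunnel_encap: frozenset  # models the Tunnel Encapsulation attribute


def advertise(prefix, gw, tlvs_by_gw):
    """A GW attaches the union of the Tunnel TLVs of all GWs to its site."""
    union = frozenset(tlv for tlvs in tlvs_by_gw.values() for tlv in tlvs)
    return Route(prefix, gw, union)


def asbr_readvertise(route, asbr):
    """The next hop is rewritten; the modeled attribute is carried unchanged."""
    return replace(route, next_hop=asbr)


def gateways_for(route):
    """Recover the GW set from the Tunnel Egress Endpoint values."""
    return sorted({tlv.egress_endpoint for tlv in route.tunnel_encap})


# GW1 and GW2 have already discovered each other (discovery is not modeled).
tlvs_by_gw = {
    "GW1": [TunnelTLV("example-tunnel", "GW1")],
    "GW2": [TunnelTLV("example-tunnel", "GW2")],
}
route_gw1 = advertise("X", "GW1", tlvs_by_gw)
route_gw2 = advertise("X", "GW2", tlvs_by_gw)

# Only GW1's route is propagated, and it crosses two ASBRs on the way.
seen = asbr_readvertise(asbr_readvertise(route_gw1, "ASBR-a"), "ASBR-b")
all_gws = gateways_for(seen)
```

Here `all_gws` lists both GW1 and GW2 even though only one route reached the ingress side and its next hop now names an ASBR, so the ingress gateway can choose one GW or balance flows across both.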