Network Working Group Y. Rekhter, Ed. Request for Comments: 4271 T. Li, Ed. Obsoletes: 1771 S. Hares, Ed. Category: Standards Track January 2006 A Border Gateway Protocol 4 (BGP-4) Status of This Memo This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The Internet Society (2006).Abstract
This document discusses the Border Gateway Protocol (BGP), which is an inter-Autonomous System routing protocol. The primary function of a BGP speaking system is to exchange network reachability information with other BGP systems. This network reachability information includes information on the list of Autonomous Systems (ASes) that reachability information traverses. This information is sufficient for constructing a graph of AS connectivity for this reachability from which routing loops may be pruned, and, at the AS level, some policy decisions may be enforced. BGP-4 provides a set of mechanisms for supporting Classless Inter- Domain Routing (CIDR). These mechanisms include support for advertising a set of destinations as an IP prefix, and eliminating the concept of network "class" within BGP. BGP-4 also introduces mechanisms that allow aggregation of routes, including aggregation of AS paths. This document obsoletes RFC 1771.
Table of Contents
1. Introduction ....................................................4 1.1. Definition of Commonly Used Terms ..........................4 1.2. Specification of Requirements ..............................6 2. Acknowledgements ................................................6 3. Summary of Operation ............................................7 3.1. Routes: Advertisement and Storage ..........................9 3.2. Routing Information Base ..................................10 4. Message Formats ................................................11 4.1. Message Header Format .....................................12 4.2. OPEN Message Format .......................................13 4.3. UPDATE Message Format .....................................14 4.4. KEEPALIVE Message Format ..................................21 4.5. NOTIFICATION Message Format ...............................21 5. Path Attributes ................................................23 5.1. Path Attribute Usage ......................................25 5.1.1. ORIGIN .............................................25 5.1.2. AS_PATH ............................................25 5.1.3. NEXT_HOP ...........................................26 5.1.4. MULTI_EXIT_DISC ....................................28 5.1.5. LOCAL_PREF .........................................29 5.1.6. ATOMIC_AGGREGATE ...................................29 5.1.7. AGGREGATOR .........................................30 6. BGP Error Handling. ............................................30 6.1. Message Header Error Handling .............................31 6.2. OPEN Message Error Handling ...............................31 6.3. UPDATE Message Error Handling .............................32 6.4. NOTIFICATION Message Error Handling .......................34 6.5. Hold Timer Expired Error Handling .........................34 6.6. Finite State Machine Error Handling .......................35 6.7. Cease .....................................................35 6.8. BGP Connection Collision Detection ........................35 7. BGP Version Negotiation ........................................36 8. BGP Finite State Machine (FSM) .................................37 8.1. Events for the BGP FSM ....................................38 8.1.1. Optional Events Linked to Optional Session Attributes .........................................38 8.1.2. Administrative Events ..............................42 8.1.3. Timer Events .......................................46 8.1.4. TCP Connection-Based Events ........................47 8.1.5. BGP Message-Based Events ...........................49 8.2. Description of FSM ........................................51 8.2.1. FSM Definition .....................................51 8.2.1.1. Terms "active" and "passive" ..............52 8.2.1.2. FSM and Collision Detection ...............52 8.2.1.3. FSM and Optional Session Attributes .......52 8.2.1.4. FSM Event Numbers .........................53
8.2.1.5. FSM Actions that are Implementation Dependent .................................53 8.2.2. Finite State Machine ...............................53 9. UPDATE Message Handling ........................................75 9.1. Decision Process ..........................................76 9.1.1. Phase 1: Calculation of Degree of Preference .......77 9.1.2. Phase 2: Route Selection ...........................77 9.1.2.1. Route Resolvability Condition .............79 9.1.2.2. Breaking Ties (Phase 2) ...................80 9.1.3. Phase 3: Route Dissemination .......................82 9.1.4. Overlapping Routes .................................83 9.2. Update-Send Process .......................................84 9.2.1. Controlling Routing Traffic Overhead ...............85 9.2.1.1. Frequency of Route Advertisement ..........85 9.2.1.2. Frequency of Route Origination ............85 9.2.2. Efficient Organization of Routing Information ......86 9.2.2.1. Information Reduction .....................86 9.2.2.2. Aggregating Routing Information ...........87 9.3. Route Selection Criteria ..................................89 9.4. Originating BGP routes ....................................89 10. BGP Timers ....................................................90 Appendix A. Comparison with RFC 1771 .............................92 Appendix B. Comparison with RFC 1267 .............................93 Appendix C. Comparison with RFC 1163 .............................93 Appendix D. Comparison with RFC 1105 .............................94 Appendix E. TCP Options that May Be Used with BGP ................94 Appendix F. Implementation Recommendations .......................95 Appendix F.1. Multiple Networks Per Message .........95 Appendix F.2. Reducing Route Flapping ...............96 Appendix F.3. Path Attribute Ordering ...............96 Appendix F.4. AS_SET Sorting ........................96 Appendix F.5. Control Over Version Negotiation ......96 Appendix F.6. Complex AS_PATH Aggregation ...........96 Security Considerations ...........................................97 IANA Considerations ...............................................99 Normative References .............................................101 Informative References ...........................................101
1. Introduction
The Border Gateway Protocol (BGP) is an inter-Autonomous System routing protocol. The primary function of a BGP speaking system is to exchange network reachability information with other BGP systems. This network reachability information includes information on the list of Autonomous Systems (ASes) that reachability information traverses. This information is sufficient for constructing a graph of AS connectivity for this reachability, from which routing loops may be pruned and, at the AS level, some policy decisions may be enforced. BGP-4 provides a set of mechanisms for supporting Classless Inter- Domain Routing (CIDR) [RFC1518, RFC1519]. These mechanisms include support for advertising a set of destinations as an IP prefix and eliminating the concept of network "class" within BGP. BGP-4 also introduces mechanisms that allow aggregation of routes, including aggregation of AS paths. Routing information exchanged via BGP supports only the destination- based forwarding paradigm, which assumes that a router forwards a packet based solely on the destination address carried in the IP header of the packet. This, in turn, reflects the set of policy decisions that can (and cannot) be enforced using BGP. BGP can support only those policies conforming to the destination-based forwarding paradigm.1.1. Definition of Commonly Used Terms
This section provides definitions for terms that have a specific meaning to the BGP protocol and that are used throughout the text. Adj-RIB-In The Adj-RIBs-In contains unprocessed routing information that has been advertised to the local BGP speaker by its peers. Adj-RIB-Out The Adj-RIBs-Out contains the routes for advertisement to specific peers by means of the local speaker's UPDATE messages. Autonomous System (AS) The classic definition of an Autonomous System is a set of routers under a single technical administration, using an interior gateway protocol (IGP) and common metrics to determine how to route packets within the AS, and using an inter-AS routing protocol to determine how to route packets to other ASes. Since this classic definition was developed, it has become common for a single AS to
use several IGPs and, sometimes, several sets of metrics within an AS. The use of the term Autonomous System stresses the fact that, even when multiple IGPs and metrics are used, the administration of an AS appears to other ASes to have a single coherent interior routing plan, and presents a consistent picture of the destinations that are reachable through it. BGP Identifier A 4-octet unsigned integer that indicates the BGP Identifier of the sender of BGP messages. A given BGP speaker sets the value of its BGP Identifier to an IP address assigned to that BGP speaker. The value of the BGP Identifier is determined upon startup and is the same for every local interface and BGP peer. BGP speaker A router that implements BGP. EBGP External BGP (BGP connection between external peers). External peer Peer that is in a different Autonomous System than the local system. Feasible route An advertised route that is available for use by the recipient. IBGP Internal BGP (BGP connection between internal peers). Internal peer Peer that is in the same Autonomous System as the local system. IGP Interior Gateway Protocol - a routing protocol used to exchange routing information among routers within a single Autonomous System. Loc-RIB The Loc-RIB contains the routes that have been selected by the local BGP speaker's Decision Process. NLRI Network Layer Reachability Information. Route A unit of information that pairs a set of destinations with the attributes of a path to those destinations. The set of
destinations are systems whose IP addresses are contained in one IP address prefix carried in the Network Layer Reachability Information (NLRI) field of an UPDATE message. The path is the information reported in the path attributes field of the same UPDATE message. RIB Routing Information Base. Unfeasible route A previously advertised feasible route that is no longer available for use.1.2. Specification of Requirements
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].2. Acknowledgements
This document was originally published as [RFC1267] in October 1991, jointly authored by Kirk Lougheed and Yakov Rekhter. We would like to express our thanks to Guy Almes, Len Bosack, and Jeffrey C. Honig for their contributions to the earlier version (BGP-1) of this document. We would like to specially acknowledge numerous contributions by Dennis Ferguson to the earlier version of this document. We would like to explicitly thank Bob Braden for the review of the earlier version (BGP-2) of this document, and for his constructive and valuable comments. We would also like to thank Bob Hinden, Director for Routing of the Internet Engineering Steering Group, and the team of reviewers he assembled to review the earlier version (BGP-2) of this document. This team, consisting of Deborah Estrin, Milo Medin, John Moy, Radia Perlman, Martha Steenstrup, Mike St. Johns, and Paul Tsuchiya, acted with a strong combination of toughness, professionalism, and courtesy. Certain sections of the document borrowed heavily from IDRP [IS10747], which is the OSI counterpart of BGP. For this, credit should be given to the ANSI X3S3.3 group chaired by Lyman Chapin and to Charles Kunzinger, who was the IDRP editor within that group.
We would also like to thank Benjamin Abarbanel, Enke Chen, Edward Crabbe, Mike Craren, Vincent Gillet, Eric Gray, Jeffrey Haas, Dimitry Haskin, Stephen Kent, John Krawczyk, David LeRoy, Dan Massey, Jonathan Natale, Dan Pei, Mathew Richardson, John Scudder, John Stewart III, Dave Thaler, Paul Traina, Russ White, Curtis Villamizar, and Alex Zinin for their comments. We would like to specially acknowledge Andrew Lange for his help in preparing the final version of this document. Finally, we would like to thank all the members of the IDR Working Group for their ideas and the support they have given to this document.3. Summary of Operation
The Border Gateway Protocol (BGP) is an inter-Autonomous System routing protocol. It is built on experience gained with EGP (as defined in [RFC904]) and EGP usage in the NSFNET Backbone (as described in [RFC1092] and [RFC1093]). For more BGP-related information, see [RFC1772], [RFC1930], [RFC1997], and [RFC2858]. The primary function of a BGP speaking system is to exchange network reachability information with other BGP systems. This network reachability information includes information on the list of Autonomous Systems (ASes) that reachability information traverses. This information is sufficient for constructing a graph of AS connectivity, from which routing loops may be pruned, and, at the AS level, some policy decisions may be enforced. In the context of this document, we assume that a BGP speaker advertises to its peers only those routes that it uses itself (in this context, a BGP speaker is said to "use" a BGP route if it is the most preferred BGP route and is used in forwarding). All other cases are outside the scope of this document. In the context of this document, the term "IP address" refers to an IP Version 4 address [RFC791]. Routing information exchanged via BGP supports only the destination- based forwarding paradigm, which assumes that a router forwards a packet based solely on the destination address carried in the IP header of the packet. This, in turn, reflects the set of policy decisions that can (and cannot) be enforced using BGP. Note that some policies cannot be supported by the destination-based forwarding paradigm, and thus require techniques such as source routing (aka explicit routing) to be enforced. Such policies cannot be enforced using BGP either. For example, BGP does not enable one AS to send
traffic to a neighboring AS for forwarding to some destination (reachable through but) beyond that neighboring AS, intending that the traffic take a different route to that taken by the traffic originating in the neighboring AS (for that same destination). On the other hand, BGP can support any policy conforming to the destination-based forwarding paradigm. BGP-4 provides a new set of mechanisms for supporting Classless Inter-Domain Routing (CIDR) [RFC1518, RFC1519]. These mechanisms include support for advertising a set of destinations as an IP prefix and eliminating the concept of a network "class" within BGP. BGP-4 also introduces mechanisms that allow aggregation of routes, including aggregation of AS paths. This document uses the term `Autonomous System' (AS) throughout. The classic definition of an Autonomous System is a set of routers under a single technical administration, using an interior gateway protocol (IGP) and common metrics to determine how to route packets within the AS, and using an inter-AS routing protocol to determine how to route packets to other ASes. Since this classic definition was developed, it has become common for a single AS to use several IGPs and, sometimes, several sets of metrics within an AS. The use of the term Autonomous System stresses the fact that, even when multiple IGPs and metrics are used, the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of the destinations that are reachable through it. BGP uses TCP [RFC793] as its transport protocol. This eliminates the need to implement explicit update fragmentation, retransmission, acknowledgement, and sequencing. BGP listens on TCP port 179. The error notification mechanism used in BGP assumes that TCP supports a "graceful" close (i.e., that all outstanding data will be delivered before the connection is closed). A TCP connection is formed between two systems. They exchange messages to open and confirm the connection parameters. The initial data flow is the portion of the BGP routing table that is allowed by the export policy, called the Adj-Ribs-Out (see 3.2). Incremental updates are sent as the routing tables change. BGP does not require a periodic refresh of the routing table. To allow local policy changes to have the correct effect without resetting any BGP connections, a BGP speaker SHOULD either (a) retain the current version of the routes advertised to it by all of its peers for the duration of the connection, or (b) make use of the Route Refresh extension [RFC2918].
KEEPALIVE messages may be sent periodically to ensure that the connection is live. NOTIFICATION messages are sent in response to errors or special conditions. If a connection encounters an error condition, a NOTIFICATION message is sent and the connection is closed. A peer in a different AS is referred to as an external peer, while a peer in the same AS is referred to as an internal peer. Internal BGP and external BGP are commonly abbreviated as IBGP and EBGP. If a particular AS has multiple BGP speakers and is providing transit service for other ASes, then care must be taken to ensure a consistent view of routing within the AS. A consistent view of the interior routes of the AS is provided by the IGP used within the AS. For the purpose of this document, it is assumed that a consistent view of the routes exterior to the AS is provided by having all BGP speakers within the AS maintain IBGP with each other. This document specifies the base behavior of the BGP protocol. This behavior can be, and is, modified by extension specifications. When the protocol is extended, the new behavior is fully documented in the extension specifications.3.1. Routes: Advertisement and Storage
For the purpose of this protocol, a route is defined as a unit of information that pairs a set of destinations with the attributes of a path to those destinations. The set of destinations are systems whose IP addresses are contained in one IP address prefix that is carried in the Network Layer Reachability Information (NLRI) field of an UPDATE message, and the path is the information reported in the path attributes field of the same UPDATE message. Routes are advertised between BGP speakers in UPDATE messages. Multiple routes that have the same path attributes can be advertised in a single UPDATE message by including multiple prefixes in the NLRI field of the UPDATE message. Routes are stored in the Routing Information Bases (RIBs): namely, the Adj-RIBs-In, the Loc-RIB, and the Adj-RIBs-Out, as described in Section 3.2. If a BGP speaker chooses to advertise a previously received route, it MAY add to, or modify, the path attributes of the route before advertising it to a peer.
BGP provides mechanisms by which a BGP speaker can inform its peers that a previously advertised route is no longer available for use. There are three methods by which a given BGP speaker can indicate that a route has been withdrawn from service: a) the IP prefix that expresses the destination for a previously advertised route can be advertised in the WITHDRAWN ROUTES field in the UPDATE message, thus marking the associated route as being no longer available for use, b) a replacement route with the same NLRI can be advertised, or c) the BGP speaker connection can be closed, which implicitly removes all routes the pair of speakers had advertised to each other from service. Changing the attribute(s) of a route is accomplished by advertising a replacement route. The replacement route carries new (changed) attributes and has the same address prefix as the original route.3.2. Routing Information Base
The Routing Information Base (RIB) within a BGP speaker consists of three distinct parts: a) Adj-RIBs-In: The Adj-RIBs-In stores routing information learned from inbound UPDATE messages that were received from other BGP speakers. Their contents represent routes that are available as input to the Decision Process. b) Loc-RIB: The Loc-RIB contains the local routing information the BGP speaker selected by applying its local policies to the routing information contained in its Adj-RIBs-In. These are the routes that will be used by the local BGP speaker. The next hop for each of these routes MUST be resolvable via the local BGP speaker's Routing Table. c) Adj-RIBs-Out: The Adj-RIBs-Out stores information the local BGP speaker selected for advertisement to its peers. The routing information stored in the Adj-RIBs-Out will be carried in the local BGP speaker's UPDATE messages and advertised to its peers. In summary, the Adj-RIBs-In contains unprocessed routing information that has been advertised to the local BGP speaker by its peers; the Loc-RIB contains the routes that have been selected by the local BGP
speaker's Decision Process; and the Adj-RIBs-Out organizes the routes for advertisement to specific peers (by means of the local speaker's UPDATE messages). Although the conceptual model distinguishes between Adj-RIBs-In, Loc-RIB, and Adj-RIBs-Out, this neither implies nor requires that an implementation must maintain three separate copies of the routing information. The choice of implementation (for example, 3 copies of the information vs 1 copy with pointers) is not constrained by the protocol. Routing information that the BGP speaker uses to forward packets (or to construct the forwarding table used for packet forwarding) is maintained in the Routing Table. The Routing Table accumulates routes to directly connected networks, static routes, routes learned from the IGP protocols, and routes learned from BGP. Whether a specific BGP route should be installed in the Routing Table, and whether a BGP route should override a route to the same destination installed by another source, is a local policy decision, and is not specified in this document. In addition to actual packet forwarding, the Routing Table is used for resolution of the next-hop addresses specified in BGP updates (see Section 5.1.3).