5. INTERNET LAYER - FORWARDING

5.1 INTRODUCTION

This section describes the process of forwarding packets.

5.2 FORWARDING WALK-THROUGH

There is no separate specification of the forwarding function in IP. Instead, forwarding is covered by the protocol specifications for the internet layer protocols ([INTERNET:1], [INTERNET:2], [INTERNET:3], [INTERNET:8], and [ROUTE:11]).

5.2.1 Forwarding Algorithm

Since none of the primary protocol documents describe the forwarding algorithm in any detail, we present it here. This is just a general outline, and omits important details, such as handling of congestion, that are dealt with in later sections.

It is not required that an implementation follow exactly the algorithms given in sections [5.2.1.1], [5.2.1.2], and [5.2.1.3]. Much of the challenge of writing router software is to maximize the rate at which the router can forward packets while still achieving the same effect as the algorithm. Details of how to do that are beyond the scope of this document, in part because they are heavily dependent on the architecture of the router. Instead, we merely point out the order dependencies among the steps:

(1) A router MUST verify the IP header, as described in section [5.2.2], before performing any actions based on the contents of the header. This allows the router to detect and discard bad packets before the expenditure of other resources.

(2) Processing of certain IP options requires that the router insert its IP address into the option. As noted in Section [5.2.4], the address inserted MUST be the address of the logical interface on which the packet is sent or the router's router-id if the packet is sent over an unnumbered interface. Thus, processing of these options cannot be completed until after the output interface is chosen.

(3) The router cannot check and decrement the TTL before checking whether the packet should be delivered to the router itself, for reasons mentioned in Section [4.2.2.9].

(4) More generally, when a packet is delivered locally to the router, its IP header MUST NOT be modified in any way (except that a
router may be required to insert a timestamp into any Timestamp options in the IP header). Thus, before the router determines whether the packet is to be delivered locally to the router, it cannot update the IP header in any way that it is not prepared to undo. 5.2.1.1 General This section covers the general forwarding algorithm. This algorithm applies to all forms of packets to be forwarded: unicast, multicast, and broadcast. (1) The router receives the IP packet (plus additional information about it, as described in Section [3.1]) from the Link Layer. (2) The router validates the IP header, as described in Section [5.2.2]. Note that IP reassembly is not done, except on IP fragments to be queued for local delivery in step (4). (3) The router performs most of the processing of any IP options. As described in Section [5.2.4], some IP options require additional processing after the routing decision has been made. (4) The router examines the destination IP address of the IP datagram, as described in Section [5.2.3], to determine how it should continue to process the IP datagram. There are three possibilities: o The IP datagram is destined for the router, and should be queued for local delivery, doing reassembly if needed. o The IP datagram is not destined for the router, and should be queued for forwarding. o The IP datagram should be queued for forwarding, but (a copy) must also be queued for local delivery. 5.2.1.2 Unicast Since the local delivery case is well covered by [INTRO:2], the following assumes that the IP datagram was queued for forwarding. If the destination is an IP unicast address: (5) The forwarder determines the next hop IP address for the packet, usually by looking up the packet's destination in the router's routing table. This procedure is described in more detail in Section [5.2.4]. This procedure also decides which network
interface should be used to send the packet.

(6) The forwarder verifies that forwarding the packet is permitted. The source and destination addresses should be valid, as described in Section [5.3.7] and Section [5.3.4]. If the router supports administrative constraints on forwarding, such as those described in Section [5.3.9], those constraints must be satisfied.

(7) The forwarder decrements (by at least one) and checks the packet's TTL, as described in Section [5.3.1].

(8) The forwarder performs any IP option processing that could not be completed in step (3).

(9) The forwarder performs any necessary IP fragmentation, as described in Section [4.2.2.7]. Since this step occurs after outbound interface selection (step (5)), all fragments of the same datagram will be transmitted out the same interface.

(10) The forwarder determines the Link Layer address of the packet's next hop. The mechanisms for doing this are Link Layer-dependent (see chapter 3).

(11) The forwarder encapsulates the IP datagram (or each of the fragments thereof) in an appropriate Link Layer frame and queues it for output on the interface selected in step (5).

(12) The forwarder sends an ICMP redirect if necessary, as described in Section [4.3.3.2].

5.2.1.3 Multicast

If the destination is an IP multicast, the following steps are taken.

Note that the main differences between the forwarding of IP unicasts and the forwarding of IP multicasts are

o IP multicasts are usually forwarded based on both the datagram's source and destination IP addresses,

o IP multicast uses an expanding ring search,

o IP multicasts are forwarded as Link Level multicasts, and

o ICMP errors are never sent in response to IP multicast datagrams.

Note that the forwarding of IP multicasts is still somewhat experimental. As a result, the algorithm presented below is not mandatory, and is provided as an example only. (5a) Based on the IP source and destination addresses found in the datagram header, the router determines whether the datagram has been received on the proper interface for forwarding. If not, the datagram is dropped silently. The method for determining the proper receiving interface depends on the multicast routing algorithm(s) in use. In one of the simplest algorithms, reverse path forwarding (RPF), the proper interface is the one that would be used to forward unicasts back to the datagram source. (6a) Based on the IP source and destination addresses found in the datagram header, the router determines the datagram's outgoing interfaces. To implement IP multicast's expanding ring search (see [INTERNET:4]) a minimum TTL value is specified for each outgoing interface. A copy of the multicast datagram is forwarded out each outgoing interface whose minimum TTL value is less than or equal to the TTL value in the datagram header, by separately applying the remaining steps on each such interface. (7a) The router decrements the packet's TTL by one. (8a) The forwarder performs any IP option processing that could not be completed in step (3). (9a) The forwarder performs any necessary IP fragmentation, as described in Section [4.2.2.7]. (10a) The forwarder determines the Link Layer address to use in the Link Level encapsulation. The mechanisms for doing this are Link Layer-dependent. On LANs a Link Level multicast or broadcast is selected, as an algorithmic translation of the datagrams' IP multicast address. See the various IP-over-xxx specifications for more details. (11a) The forwarder encapsulates the packet (or each of the fragments thereof) in an appropriate Link Layer frame and queues it for output on the appropriate interface.
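
As an illustration only, steps (5a) through (7a) might be sketched as follows in Python. The names OutInterface, min_ttl, and multicast_copies are hypothetical and do not come from this document. The sketch assumes that the multicast routing algorithm has already supplied both the RPF interface and the candidate outgoing interfaces, and it leaves out option processing, fragmentation, encapsulation, and TTL expiry handling.

```python
from dataclasses import dataclass


@dataclass
class OutInterface:
    """Hypothetical outgoing interface with its expanding-ring TTL threshold."""
    name: str
    min_ttl: int


def multicast_copies(ttl, arrival_if, rpf_if, candidates):
    """Return (interface name, outgoing TTL) for each copy to be forwarded.

    arrival_if: interface the datagram arrived on.
    rpf_if:     interface that would be used to send unicasts back to the
                source (the reverse path forwarding check of step 5a).
    candidates: outgoing interfaces chosen by the multicast routing
                algorithm (assumed to be given).
    """
    # Step (5a): wrong receiving interface, so the datagram is dropped silently.
    if arrival_if != rpf_if:
        return []
    # Step (6a): keep interfaces whose minimum TTL is <= the datagram's TTL.
    selected = [i for i in candidates if i.min_ttl <= ttl]
    # Step (7a): each forwarded copy carries the TTL decremented by one.
    return [(i.name, ttl - 1) for i in selected]
```

With a datagram TTL of 32 and thresholds of 1, 16, and 64, only the first two interfaces receive a copy in this sketch.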
5.2.2 IP Header Validation

Before a router can process any IP packet, it MUST perform the following basic validity checks on the packet's IP header to ensure that the header is meaningful. If the packet fails any of the following tests, it MUST be silently discarded, and the error SHOULD be logged.

(1) The packet length reported by the Link Layer must be large enough to hold the minimum length legal IP datagram (20 bytes).

(2) The IP checksum must be correct.

(3) The IP version number must be 4. If the version number is not 4 then the packet may be another version of IP, such as IPng or ST-II.

(4) The IP header length field must be large enough to hold the minimum length legal IP datagram (20 bytes = 5 words).

(5) The IP total length field must be large enough to hold the IP datagram header, whose length is specified in the IP header length field.

A router MUST NOT have a configuration option that allows disabling any of these tests.

If the packet passes the second and third tests, the IP header length field is at least 4, and both the IP total length field and the packet length reported by the Link Layer are at least 16 then, despite the above rule, the router MAY respond with an ICMP Parameter Problem message, whose pointer points at the IP header length field (if it failed the fourth test) or the IP total length field (if it failed the fifth test). However, it still MUST discard the packet and still SHOULD log the error.

These rules (and this entire document) apply only to version 4 of the Internet Protocol. These rules should not be construed as prohibiting routers from supporting other versions of IP. Furthermore, if a router can truly classify a packet as being some other version of IP then it ought not treat that packet as an error packet within the context of this memo.

IMPLEMENTATION

It is desirable for purposes of error reporting, though not always entirely possible, to determine why a header was invalid. There are four possible reasons:
o The Link Layer truncated the IP header o The datagram is using a version of IP other than the standard one (version 4). o The IP header has been corrupted in transit. o The sender generated an illegal IP header. It is probably desirable to perform the checks in the order listed, since we believe that this ordering is most likely to correctly categorize the cause of the error. For purposes of error reporting, it may also be desirable to check if a packet that fails these tests has an IP version number indicating IPng or ST-II; these should be handled according to their respective specifications. Additionally, the router SHOULD verify that the packet length reported by the Link Layer is at least as large as the IP total length recorded in the packet's IP header. If it appears that the packet has been truncated, the packet MUST be discarded, the error SHOULD be logged, and the router SHOULD respond with an ICMP Parameter Problem message whose pointer points at the IP total length field. DISCUSSION Because any higher layer protocol that concerns itself with data corruption will detect truncation of the packet data when it reaches its final destination, it is not absolutely necessary for routers to perform the check suggested above to maintain protocol correctness. However, by making this check a router can simplify considerably the task of determining which hop in the path is truncating the packets. It will also reduce the expenditure of resources down-stream from the router in that down-stream systems will not need to deal with the packet. Finally, if the destination address in the IP header is not one of the addresses of the router, the router SHOULD verify that the packet does not contain a Strict Source and Record Route option. If a packet fails this test (if it contains a strict source route option), the router SHOULD log the error and SHOULD respond with an ICMP Parameter Problem error with the pointer pointing at the offending packet's IP destination address. 
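
The five basic validity tests of this section, applied in the order listed, together with the additional truncation check, might be sketched as follows. This is a non-normative Python illustration: the function names are hypothetical, and the choice to compute the checksum over the first 20 bytes when the header length field is unusable is an assumption of the sketch, not something this document specifies. The Strict Source and Record Route check is not covered.

```python
import struct

MIN_HEADER = 20  # minimum length legal IP datagram, in bytes


def ones_complement_sum(data):
    """16-bit one's complement sum of data, with end-around carry."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def first_failed_test(packet):
    """Return the number (1-5) of the first failed test, or None if all pass.

    packet holds the bytes handed up by the Link Layer, starting at the
    IP header.
    """
    if len(packet) < MIN_HEADER:                      # test (1)
        return 1
    version, ihl = packet[0] >> 4, packet[0] & 0x0F
    header_len = ihl * 4
    total_length = struct.unpack("!H", packet[2:4])[0]
    # Assumption: sum the declared header when its length is usable,
    # otherwise fall back to the first 20 bytes.
    span = header_len if MIN_HEADER <= header_len <= len(packet) else MIN_HEADER
    if ones_complement_sum(packet[:span]) != 0xFFFF:  # test (2)
        return 2
    if version != 4:                                  # test (3)
        return 3
    if header_len < MIN_HEADER:                       # test (4)
        return 4
    if total_length < header_len:                     # test (5)
        return 5
    return None


def is_truncated(packet):
    """Additional check: the Link Layer delivered fewer bytes than total length."""
    return len(packet) < struct.unpack("!H", packet[2:4])[0]
```

Returning the number of the failed test, rather than a bare boolean, keeps enough information to log the likely cause of the error and to choose the pointer for an optional ICMP Parameter Problem message.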
DISCUSSION Some people might suggest that the router should respond with a Bad Source Route message instead of a Parameter Problem message. However, when a packet fails this test, it usually indicates a
protocol error by the previous hop router, whereas Bad Source Route would suggest that the source host had requested a nonexistent or broken path through the network.

5.2.3 Local Delivery Decision

When a router receives an IP packet, it must decide whether the packet is addressed to the router (and should be delivered locally) or the packet is addressed to another system (and should be handled by the forwarder). There is also a hybrid case, where certain IP broadcasts and IP multicasts are both delivered locally and forwarded. A router MUST determine which of these three cases applies using the following rules.

o An unexpired source route option is one whose pointer value does not point past the last entry in the source route. If the packet contains an unexpired source route option, the pointer in the option is advanced until either the pointer does point past the last address in the option or else the next address is not one of the router's own addresses. In the latter (normal) case, the packet is forwarded (and not delivered locally) regardless of the rules below.

o The packet is delivered locally and not considered for forwarding in the following cases:

- The packet's destination address exactly matches one of the router's IP addresses,

- The packet's destination address is a limited broadcast address ({-1, -1}), or

- The packet's destination is an IP multicast address which is never forwarded (such as 224.0.0.1 or 224.0.0.2) and (at least) one of the logical interfaces associated with the physical interface on which the packet arrived is a member of the destination multicast group.

o The packet is passed to the forwarder AND delivered locally in the following cases:

- The packet's destination address is an IP broadcast address that addresses at least one of the router's logical interfaces but does not address any of the logical interfaces associated with the physical interface on which the packet arrived
- The packet's destination is an IP multicast address which is permitted to be forwarded (unlike 224.0.0.1 and 224.0.0.2) and (at least) one of the logical interfaces associated with the physical interface on which the packet arrived is a member of the destination multicast group. o The packet is delivered locally if the packet's destination address is an IP broadcast address (other than a limited broadcast address) that addresses at least one of the logical interfaces associated with the physical interface on which the packet arrived. The packet is ALSO passed to the forwarder unless the link on which the packet arrived uses an IP encapsulation that does not encapsulate broadcasts differently than unicasts (e.g., by using different Link Layer destination addresses). o The packet is passed to the forwarder in all other cases. DISCUSSION The purpose of the requirement in the last sentence of the fourth bullet is to deal with a directed broadcast to another network prefix on the same physical cable. Normally, this works as expected: the sender sends the broadcast to the router as a Link Layer unicast. The router notes that it arrived as a unicast, and therefore must be destined for a different network prefix than the sender sent it on. Therefore, the router can safely send it as a Link Layer broadcast out the same (physical) interface over which it arrived. However, if the router can't tell whether the packet was received as a Link Layer unicast, the sentence ensures that the router does the safe but wrong thing rather than the unsafe but right thing. IMPLEMENTATION As described in Section [5.3.4], packets received as Link Layer broadcasts are generally not forwarded. It may be advantageous to avoid passing to the forwarder packets it would later discard because of the rules in that section. 
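
The local delivery rules of this section might be sketched as follows. This is a simplified, non-normative Python illustration: Action, LogicalInterface, and decide are hypothetical names, the set of never-forwarded groups contains only the two example addresses given in the rules, and source route options are not handled.

```python
import ipaddress
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    LOCAL = "deliver locally"
    FORWARD = "pass to forwarder"
    BOTH = "forward and deliver locally"


@dataclass
class LogicalInterface:
    physical: str                       # physical interface it belongs to
    address: ipaddress.IPv4Interface    # address and network prefix
    groups: frozenset = frozenset()     # joined multicast groups


LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")
# Only the two examples named in the rules; a real router knows more.
NEVER_FORWARDED = {ipaddress.IPv4Address("224.0.0.1"),
                   ipaddress.IPv4Address("224.0.0.2")}


def decide(dest, arrival_physical, interfaces, link_marks_broadcasts=True):
    """Classify a packet that has no unexpired source route option."""
    dest = ipaddress.IPv4Address(dest)
    on_arrival = [i for i in interfaces if i.physical == arrival_physical]
    if dest == LIMITED_BROADCAST or any(dest == i.address.ip for i in interfaces):
        return Action.LOCAL
    if dest.is_multicast:
        if not any(dest in i.groups for i in on_arrival):
            return Action.FORWARD       # "all other cases"
        return Action.LOCAL if dest in NEVER_FORWARDED else Action.BOTH
    # Logical interfaces whose directed broadcast address equals dest.
    addressed = [i for i in interfaces
                 if dest == i.address.network.broadcast_address]
    if any(i.physical == arrival_physical for i in addressed):
        # Also forwarded unless the link cannot tell broadcasts from unicasts.
        return Action.BOTH if link_marks_broadcasts else Action.LOCAL
    if addressed:
        return Action.BOTH
    return Action.FORWARD
```

Passing link_marks_broadcasts=False models a link whose IP encapsulation does not distinguish broadcasts from unicasts, in which case a broadcast addressed to the arrival interface is delivered locally only.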
Some Link Layers (either because of the hardware or because of special code in the drivers) can deliver to the router copies of all Link Layer broadcasts and multicasts it transmits. Use of this feature can simplify the implementation of cases where a packet has to both be passed to the forwarder and delivered locally, since forwarding the packet will automatically cause the router to receive a copy of the packet that it can then deliver locally. One must use care in these circumstances to prevent treating a received loop-back packet as a normal packet that was received (and then being subject to the rules of forwarding, etc.).
Even without such a Link Layer, it is of course hardly necessary to make a copy of an entire packet to queue it both for forwarding and for local delivery, though care must be taken with fragments, since reassembly is performed on locally delivered packets but not on forwarded packets. One simple scheme is to associate a flag with each packet on the router's output queue that indicates whether it should be queued for local delivery after it has been sent. 5.2.4 Determining the Next Hop Address When a router is going to forward a packet, it must determine whether it can send it directly to its destination, or whether it needs to pass it through another router. If the latter, it needs to determine which router to use. This section explains how these determinations are made. This section makes use of the following definitions: o LSRR - IP Loose Source and Record Route option o SSRR - IP Strict Source and Record Route option o Source Route Option - an LSRR or an SSRR o Ultimate Destination Address - where the packet is being sent to: the last address in the source route of a source-routed packet, or the destination address in the IP header of a non-source-routed packet o Adjacent - reachable without going through any IP routers o Next Hop Address - the IP address of the adjacent host or router to which the packet should be sent next o IP Destination Address - the ultimate destination address, except in source routed packets, where it is the next address specified in the source route o Immediate Destination - the node, System, router, end-system, or whatever that is addressed by the IP Destination Address.
5.2.4.1 IP Destination Address If: o the destination address in the IP header is one of the addresses of the router, o the packet contains a Source Route Option, and o the pointer in the Source Route Option does not point past the end of the option, then the next IP Destination Address is the address pointed at by the pointer in that option. If: o the destination address in the IP header is one of the addresses of the router, o the packet contains a Source Route Option, and o the pointer in the Source Route Option points past the end of the option, then the message is addressed to the system analyzing the message. A router MUST use the IP Destination Address, not the Ultimate Destination Address (the last address in the source route option), when determining how to handle a packet. It is an error for more than one source route option to appear in a datagram. If it receives such a datagram, it SHOULD discard the packet and reply with an ICMP Parameter Problem message whose pointer points at the beginning of the second source route option. 5.2.4.2 Local/Remote Decision After it has been determined that the IP packet needs to be forwarded according to the rules specified in Section [5.2.3], the following algorithm MUST be used to determine if the Immediate Destination is directly accessible (see [INTERNET:2]). (1) For each network interface that has not been assigned any IP address (the unnumbered lines as described in Section [2.2.7]), compare the router-id of the other end of the line to the IP Destination Address. If they are exactly equal, the packet can be transmitted through this interface.
DISCUSSION In other words, the router or host at the remote end of the line is the destination of the packet or is the next step in the source route of a source routed packet. (2) If no network interface has been selected in the first step, for each IP address assigned to the router: (a) isolate the network prefix used by the interface. IMPLEMENTATION The result of this operation will usually have been computed and saved during initialization. (b) Isolate the corresponding set of bits from the IP Destination Address of the packet. (c) Compare the resulting network prefixes. If they are equal to each other, the packet can be transmitted through the corresponding network interface. (3) If the destination was neither the router-id of a neighbor on an unnumbered interface nor a member of a directly connected network prefix, the IP Destination is accessible only through some other router. The selection of the router and the next hop IP address is described in Section [5.2.4.3]. In the case of a host that is not also a router, this may be the configured default router. Ongoing work in the IETF [ARCH:9, NRHP] considers some cases such as when multiple IP (sub)networks are overlaid on the same link layer network. Barring policy restrictions, hosts and routers using a common link layer network can directly communicate even if they are not in the same IP (sub)network, if there is adequate information present. The Next Hop Routing Protocol (NHRP) enables IP entities to determine the "optimal" link layer address to be used to traverse such a link layer network towards a remote destination. (4) If the selected "next hop" is reachable through an interface configured to use NHRP, then the following additional steps apply: (a) Compare the IP Destination Address to the destination addresses in the NHRP cache. If the address is in the cache, then send the datagram to the corresponding cached link layer address. 
(b) If the address is not in the cache, then construct an NHRP request packet containing the IP Destination Address. This message is sent to the NHRP server configured for that interface. This may be a logically separate process or entity in the router itself.
(c) The NHRP server will respond with the proper link layer address to use to transmit the datagram and subsequent datagrams to the same destination. The system MAY transmit the datagram(s) to the traditional "next hop" router while awaiting the NHRP reply.

5.2.4.3 Next Hop Address

The router applies the algorithm in the previous section to determine if the IP Destination Address is adjacent. If so, the next hop address is the same as the IP Destination Address. Otherwise, the packet must be forwarded through another router to reach its Immediate Destination. The selection of this router is the topic of this section.

If the packet contains an SSRR, the router MUST discard the packet and reply with an ICMP Bad Source Route error. Otherwise, the router looks up the IP Destination Address in its routing table to determine an appropriate next hop address.

DISCUSSION

Per the IP specification, a Strict Source Route must specify a sequence of nodes through which the packet must traverse; the packet must go from one node of the source route to the next, traversing intermediate networks only. Thus, if the router is not adjacent to the next step of the source route, the source route cannot be fulfilled. Therefore, the router rejects such packets with an ICMP Bad Source Route error.

The goal of the next-hop selection process is to examine the entries in the router's Forwarding Information Base (FIB) and select the best route (if there is one) for the packet from those available in the FIB.

Conceptually, any route lookup algorithm starts out with a set of candidate routes that consists of the entire contents of the FIB. The algorithm consists of a series of steps that discard routes from the set. These steps are referred to as Pruning Rules. Normally, when the algorithm terminates there is exactly one route remaining in the set. If the set ever becomes empty, the packet is discarded because the destination is unreachable.

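
The pruning model described above might be sketched as follows, using simplified forms of the Basic Match, Longest Match, Weak TOS, and Best Metric rules that this section goes on to define. This is a non-normative Python illustration: the Route fields mirror the route.dest, route.length, route.tos, route.metric, and route.domain attributes used in the text, next_hop and select_routes are hypothetical names, and the sketch assumes that a numerically lower metric is the better one. Vendor Policy is omitted.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    dest: ipaddress.IPv4Network   # route.dest together with route.length
    next_hop: str                 # hypothetical label for the next hop
    tos: int = 0                  # route.tos, 0 being the default TOS
    metric: int = 0               # route.metric (assumed: lower is better)
    domain: str = "static"        # route.domain


def basic_match(routes, ip_dest, ip_tos):
    # Keep routes whose significant prefix bits equal those of ip.dest.
    return [r for r in routes if ip_dest in r.dest]


def longest_match(routes, ip_dest, ip_tos):
    longest = max(r.dest.prefixlen for r in routes)
    return [r for r in routes if r.dest.prefixlen == longest]


def weak_tos(routes, ip_dest, ip_tos):
    # Prefer routes with the requested TOS, else fall back to default TOS.
    exact = [r for r in routes if r.tos == ip_tos]
    return exact or [r for r in routes if r.tos == 0]


def best_metric(routes, ip_dest, ip_tos):
    # Discard a route if another route in the same domain is strictly better.
    return [r for r in routes
            if not any(o.domain == r.domain and o.metric < r.metric
                       for o in routes)]


PRUNING_RULES = (basic_match, longest_match, weak_tos, best_metric)


def select_routes(fib, ip_dest, ip_tos=0):
    """Apply the pruning rules in order; an empty result means unreachable."""
    candidates = list(fib)
    dest = ipaddress.IPv4Address(ip_dest)
    for rule in PRUNING_RULES:
        candidates = rule(candidates, dest, ip_tos)
        if not candidates:
            return []
    return candidates
```

The function returns a list rather than a single route because more than one route may survive every rule; choosing among the survivors is left to the caller.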
It is also possible for the algorithm to terminate when more than one route remains in the set. In this case, the router may arbitrarily discard all but one of them, or may perform "load-splitting" by choosing whichever of the routes has been least recently used. With the exception of rule 3 (Weak TOS), a router MUST use the following Pruning Rules when selecting a next hop for a packet. If a
router does consider TOS when making next-hop decisions, then Rule 3 must be applied in the order indicated below. These rules MUST be (conceptually) applied to the FIB in the order that they are presented. (For some historical perspective, additional pruning rules, and other common algorithms in use, see Appendix E.)

DISCUSSION

Rule 3 is optional in that Section [5.3.2] says that a router only SHOULD consider TOS when making forwarding decisions.

(1) Basic Match

This rule discards any routes to destinations other than the IP Destination Address of the packet. For example, if a packet's IP Destination Address is 10.144.2.5, this step would discard a route to net 128.12.0.0/16 but would retain any routes to the network prefixes 10.0.0.0/8 and 10.144.0.0/16, and any default routes.

More precisely, we assume that each route has a destination attribute, called route.dest and a corresponding prefix length, called route.length, to specify which bits of route.dest are significant. The IP Destination Address of the packet being forwarded is ip.dest. This rule discards all routes from the set of candidates except those for which the most significant route.length bits of route.dest and ip.dest are equal.

For example, if a packet's IP Destination Address is 10.144.2.5 and there are network prefixes 10.144.1.0/24, 10.144.2.0/24, and 10.144.3.0/24, this rule would keep only 10.144.2.0/24; it is the only route whose prefix has the same value as the corresponding bits in the IP Destination Address of the packet.

(2) Longest Match

Longest Match is a refinement of Basic Match, described above. After performing Basic Match pruning, the algorithm examines the remaining routes to determine which among them have the largest route.length values. All except these are discarded.

For example, if a packet's IP Destination Address is 10.144.2.5 and there are network prefixes 10.144.2.0/24, 10.144.0.0/16, and 10.0.0.0/8, then this rule would keep only the first (10.144.2.0/24) because its prefix length is longest.
(3) Weak TOS Each route has a type of service attribute, called route.tos, whose possible values are assumed to be identical to those used in the TOS field of the IP header. Routing protocols that distribute TOS information fill in route.tos appropriately in routes they add to the FIB; routes from other routing protocols are treated as if they have the default TOS (0000). The TOS field in the IP header of the packet being routed is called ip.tos. The set of candidate routes is examined to determine if it contains any routes for which route.tos = ip.tos. If so, all routes except those for which route.tos = ip.tos are discarded. If not, all routes except those for which route.tos = 0000 are discarded from the set of candidate routes. Additional discussion of routing based on Weak TOS may be found in [ROUTE:11]. DISCUSSION The effect of this rule is to select only those routes that have a TOS that matches the TOS requested in the packet. If no such routes exist then routes with the default TOS are considered. Routes with a non-default TOS that is not the TOS requested in the packet are never used, even if such routes are the only available routes that go to the packet's destination. (4) Best Metric Each route has a metric attribute, called route.metric, and a routing domain identifier, called route.domain. Each member of the set of candidate routes is compared with each other member of the set. If route.domain is equal for the two routes and route.metric is strictly inferior for one when compared with the other, then the one with the inferior metric is discarded from the set. The determination of inferior is usually by a simple arithmetic comparison, though some protocols may have structured metrics requiring more complex comparisons. (5) Vendor Policy Vendor Policy is sort of a catch-all to make up for the fact that the previously listed rules are often inadequate to choose from the possible routes. Vendor Policy pruning rules are extremely vendor-specific. 
See section [5.2.4.4]. This algorithm has two distinct disadvantages. Presumably, a router implementor might develop techniques to deal with these
disadvantages and make them a part of the Vendor Policy pruning rule.

(1) IS-IS and OSPF route classes are not directly handled.

(2) Path properties other than type of service (e.g., MTU) are ignored.

It is also worth noting a deficiency in the way that TOS is supported: routing protocols that support TOS are implicitly preferred when forwarding packets that have non-zero TOS values.

The Basic Match and Longest Match pruning rules generalize the treatment of a number of particular types of routes. These routes are selected in the following, decreasing, order of preference:

(1) Host Route: This is a route to a specific end system.

(2) Hierarchical Network Prefix Routes: This is a route to a particular network prefix. Note that the FIB may contain several routes to network prefixes that subsume each other (one prefix is the other prefix with additional bits). These are selected in order of decreasing prefix length.

(3) Default Route: This is a route to all networks for which there are no explicit routes. It is by definition the route whose prefix length is zero.

If, after application of the pruning rules, the set of routes is empty (i.e., no routes were found), the packet MUST be discarded and an appropriate ICMP error generated (ICMP Bad Source Route if the IP Destination Address came from a source route option; otherwise, whichever of ICMP Destination Host Unreachable or Destination Network Unreachable is appropriate, as described in Section [4.3.3.1]).

5.2.4.4 Administrative Preference

One suggested mechanism for the Vendor Policy Pruning Rule is to use administrative preference, which is a simple prioritization algorithm. The idea is to manually prioritize the routes that one might need to select among.

Each route has associated with it a preference value, based on various attributes of the route (specific mechanisms for assignment of preference values are suggested below).
This preference value is an integer in the range [0..255], with zero being the most preferred and 254 being the least preferred. 255 is a special
value that means that the route should never be used. The first step in the Vendor Policy pruning rule discards all but the most preferable routes (and always discards routes whose preference value is 255).

This policy is not safe in that it can easily be misused to create routing loops. Since no protocol ensures that the preferences configured for a router are consistent with the preferences configured in its neighbors, network managers must exercise care in configuring preferences.

o Address Match

It is useful to be able to assign a single preference value to all routes (learned from the same routing domain) to any of a specified set of destinations, where the set of destinations is all destinations that match a specified network prefix.

o Route Class

For routing protocols which maintain the distinction, it is useful to be able to assign a single preference value to all routes (learned from the same routing domain) which have a particular route class (intra-area, inter-area, external with internal metrics, or external with external metrics).

o Interface

It is useful to be able to assign a single preference value to all routes (learned from a particular routing domain) that would cause packets to be routed out a particular logical interface on the router (logical interfaces generally map one-to-one onto the router's network interfaces, except that any network interface that has multiple IP addresses will have multiple logical interfaces associated with it).

o Source router

It is useful to be able to assign a single preference value to all routes (learned from the same routing domain) that were learned from any of a set of routers, where the set of routers is those whose updates have a source address that matches a specified network prefix.

o Originating AS
   For routing protocols which provide the information, it is useful to be able to assign a single preference value to all routes (learned from a particular routing domain) which originated in another particular routing domain. For BGP routes, the originating AS is the first AS listed in the route's AS_PATH attribute. For OSPF external routes, the originating AS may be considered to be the low order 16 bits of the route's external route tag if the tag's Automatic bit is set and the tag's Path Length is not equal to 3.

o External route tag
   It is useful to be able to assign a single preference value to all OSPF external routes (learned from the same routing domain) whose external route tags match any of a list of specified values. Because the external route tag may contain a structured value, it may be useful to provide the ability to match particular subfields of the tag.

o AS path
   It may be useful to be able to assign a single preference value to all BGP routes (learned from the same routing domain) whose AS path "matches" any of a set of specified values. It is not yet clear exactly what kinds of matches are most useful. A simple option would be to allow matching of all routes for which a particular AS number appears (or alternatively, does not appear) anywhere in the route's AS_PATH attribute. A more general but somewhat more difficult alternative would be to allow matching all routes for which the AS path matches a specified regular expression.

5.2.4.5 Load Splitting

At the end of the Next-hop selection process, multiple routes may still remain. A router has several options when this occurs. It may arbitrarily discard some of the routes. It may reduce the number of candidate routes by comparing metrics of routes from routing domains that are not considered equivalent. It may retain more than one route and employ a load-splitting mechanism to divide traffic among them. Perhaps the only thing that can be said about the relative merits of the options is that load-splitting is useful in some situations but not in others, so a wise implementor who implements load-splitting will also provide a way for the network manager to disable it.

5.2.5 Unused IP Header Bits: RFC-791 Section 3.1

The IP header contains several reserved bits, in the Type of Service field and in the Flags field. Routers MUST NOT drop packets merely because one or more of these reserved bits has a non-zero value.
Routers MUST ignore and MUST pass through unchanged the values of these reserved bits. If a router fragments a packet, it MUST copy these bits into each fragment.
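
The rule about copying reserved bits into fragments can be illustrated with a minimal sketch. It is not a complete fragmentation routine: it models only the Type of Service octet and the 16-bit flags/fragment-offset word, and the function name `fragment`, its parameters, and the tuple layout it returns are assumptions made for this example, not part of any specification.

```python
# Bit layout of the 16-bit flags/fragment-offset word in the IP header.
RESERVED_FLAG = 0x8000  # reserved high-order flag bit
DF_FLAG = 0x4000        # Don't Fragment
MF_FLAG = 0x2000        # More Fragments
OFFSET_MASK = 0x1FFF    # fragment offset, in units of 8 octets


def fragment(tos, flags_offset, payload, max_payload):
    """Hypothetical helper: split payload into (tos, flags_offset, chunk) tuples.

    The TOS octet and the reserved flag bit are copied unchanged into
    every fragment; only MF and the offset vary between fragments.
    """
    if flags_offset & DF_FLAG:
        raise ValueError("DF set: datagram must not be fragmented")
    # Every fragment except the last carries a multiple of 8 octets.
    step = (max_payload // 8) * 8
    if step == 0:
        raise ValueError("max_payload is smaller than 8 octets")

    reserved = flags_offset & RESERVED_FLAG
    original_mf = flags_offset & MF_FLAG
    base_offset = flags_offset & OFFSET_MASK

    fragments = []
    for start in range(0, len(payload), step):
        chunk = payload[start:start + step]
        is_last = start + step >= len(payload)
        # The last fragment inherits the original MF bit; the others set MF.
        mf = original_mf if is_last else MF_FLAG
        word = reserved | mf | (base_offset + start // 8)
        fragments.append((tos, word, chunk))
    return fragments
```

For a 20-octet payload with the reserved flag bit set and room for 8 payload octets per fragment, this yields three fragments, each still carrying the original TOS octet and the reserved bit.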
DISCUSSION
   Future revisions to the IP protocol may make use of these unused bits. These rules are intended to ensure that these revisions can be deployed without having to simultaneously upgrade all routers in the Internet.

5.2.6 Fragmentation and Reassembly: RFC-791 Section 3.2

As was discussed in Section [4.2.2.7], a router MUST support IP fragmentation.

A router MUST NOT reassemble any datagram before forwarding it.

DISCUSSION
   A few people have suggested that there might be some topologies where reassembly of transit datagrams by routers might improve performance. The fact that fragments may take different paths to the destination precludes safe use of such a feature.

   Nothing in this section should be construed to control or limit fragmentation or reassembly performed as a link layer function by the router.

   Similarly, if an IP datagram is encapsulated in another IP datagram (e.g., it is tunnelled), and that datagram is in turn fragmented, the fragments must be reassembled in order to forward the original datagram. This section does not preclude this.

5.2.7 Internet Control Message Protocol - ICMP

General requirements for ICMP were discussed in Section [4.3]. This section discusses ICMP messages that are sent only by routers.

5.2.7.1 Destination Unreachable

The ICMP Destination Unreachable message is sent by a router in response to a packet which it cannot forward because the destination (or next hop) is unreachable or a service is unavailable. Examples of such cases include a message addressed to a host which is not there and therefore does not respond to ARP requests, and messages addressed to network prefixes for which the router has no valid route.

A router MUST be able to generate ICMP Destination Unreachable messages and SHOULD choose a response code that most closely matches the reason the message is being generated.

The following codes are defined in [INTERNET:8] and [INTRO:2]:
0 = Network Unreachable - generated by a router if a forwarding path (route) to the destination network is not available;

1 = Host Unreachable - generated by a router if a forwarding path (route) to the destination host on a directly connected network is not available (does not respond to ARP);

2 = Protocol Unreachable - generated if the transport protocol designated in a datagram is not supported in the transport layer of the final destination;

3 = Port Unreachable - generated if the designated transport protocol (e.g., UDP) is unable to demultiplex the datagram in the transport layer of the final destination but has no protocol mechanism to inform the sender;

4 = Fragmentation Needed and DF Set - generated if a router needs to fragment a datagram but cannot since the DF flag is set;

5 = Source Route Failed - generated if a router cannot forward a packet to the next hop in a source route option;

6 = Destination Network Unknown - This code SHOULD NOT be generated since it would imply on the part of the router that the destination network does not exist (net unreachable code 0 SHOULD be used in place of code 6);

7 = Destination Host Unknown - generated only when a router can determine (from link layer advice) that the destination host does not exist;

11 = Network Unreachable For Type Of Service - generated by a router if a forwarding path (route) to the destination network with the requested or default TOS is not available;

12 = Host Unreachable For Type Of Service - generated if a router cannot forward a packet because its route(s) to the destination do not match either the TOS requested in the datagram or the default TOS (0).

The following additional codes are hereby defined:

13 = Communication Administratively Prohibited - generated if a router cannot forward a packet due to administrative filtering;

14 = Host Precedence Violation. Sent by the first hop router to a host to indicate that a requested precedence is not permitted for the particular combination of source/destination host or network, upper layer protocol, and source/destination port;

15 = Precedence cutoff in effect. The network operators have imposed a minimum level of precedence required for operation, and the datagram was sent with a precedence below this level.

NOTE: [INTRO:2] defined Code 8 for source host isolated. Routers SHOULD NOT generate Code 8; whichever of Codes 0 (Network Unreachable) and 1 (Host Unreachable) is appropriate SHOULD be used instead. [INTRO:2] also defined Code 9 for communication with destination network administratively prohibited and Code 10 for communication with destination host administratively prohibited. These codes were intended for use by end-to-end encryption devices used by U.S. military agencies. Routers SHOULD use the newly defined Code 13 (Communication Administratively Prohibited) if they administratively filter packets.

Routers MAY have a configuration option that causes Code 13 (Communication Administratively Prohibited) messages not to be generated. When this option is enabled, no ICMP error message is sent in response to a packet that is dropped because its forwarding is administratively prohibited.

Similarly, routers MAY have a configuration option that causes Code 14 (Host Precedence Violation) and Code 15 (Precedence Cutoff in Effect) messages not to be generated. When this option is enabled, no ICMP error message is sent in response to a packet that is dropped because of a precedence violation.

Routers MUST use Host Unreachable or Destination Host Unknown codes whenever other hosts on the same destination network might be reachable; otherwise, the source host may erroneously conclude that all hosts on the network are unreachable, and that may not be the case.

[INTERNET:14] describes a slight modification to the form of Destination Unreachable messages containing Code 4 (Fragmentation needed and DF set). A router MUST use this modified form when originating Code 4 Destination Unreachable messages.
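
The code selection rules above can be sketched as a small lookup with two suppression options. This is only an illustration under stated assumptions: the `DropReason` names, the `unreachable_code` function, and its two keyword parameters are hypothetical and chosen for this example. Codes 2 and 3 are left out because they are generated at the final destination rather than while forwarding, and Codes 6, 8, 9, and 10 are left out because routers are told not to generate them or to prefer Code 13.

```python
from enum import Enum, auto


class DropReason(Enum):
    """Hypothetical reasons a router might fail to forward a packet."""
    NO_ROUTE_TO_NETWORK = auto()
    HOST_NOT_ANSWERING_ARP = auto()
    FRAGMENTATION_NEEDED_DF_SET = auto()
    SOURCE_ROUTE_FAILED = auto()
    HOST_UNKNOWN_PER_LINK_LAYER = auto()
    NO_ROUTE_TO_NETWORK_FOR_TOS = auto()
    NO_ROUTE_TO_HOST_FOR_TOS = auto()
    ADMINISTRATIVELY_FILTERED = auto()
    HOST_PRECEDENCE_VIOLATION = auto()
    PRECEDENCE_CUTOFF = auto()


# Destination Unreachable code for each reason, following the list above.
CODE_FOR_REASON = {
    DropReason.NO_ROUTE_TO_NETWORK: 0,
    DropReason.HOST_NOT_ANSWERING_ARP: 1,
    DropReason.FRAGMENTATION_NEEDED_DF_SET: 4,
    DropReason.SOURCE_ROUTE_FAILED: 5,
    DropReason.HOST_UNKNOWN_PER_LINK_LAYER: 7,
    DropReason.NO_ROUTE_TO_NETWORK_FOR_TOS: 11,
    DropReason.NO_ROUTE_TO_HOST_FOR_TOS: 12,
    DropReason.ADMINISTRATIVELY_FILTERED: 13,
    DropReason.HOST_PRECEDENCE_VIOLATION: 14,
    DropReason.PRECEDENCE_CUTOFF: 15,
}


def unreachable_code(reason, suppress_admin=False, suppress_precedence=False):
    """Return the code to send, or None if the message is suppressed.

    suppress_admin models the optional switch that silences Code 13;
    suppress_precedence models the one that silences Codes 14 and 15.
    """
    code = CODE_FOR_REASON[reason]
    if code == 13 and suppress_admin:
        return None
    if code in (14, 15) and suppress_precedence:
        return None
    return code
```

With both options left at their defaults every drop reason maps to a code; enabling `suppress_admin` makes an administratively filtered packet produce no ICMP error at all, while other reasons are unaffected.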
5.2.7.2 Redirect

The ICMP Redirect message is generated to inform a local host that it should use a different next hop router for a certain class of traffic.

Routers MUST NOT generate the Redirect for Network or Redirect for Network and Type of Service messages (Codes 0 and 2) specified in [INTERNET:8]. Routers MUST be able to generate the Redirect for Host message (Code 1) and SHOULD be able to generate the Redirect for Type of Service and Host message (Code 3) specified in [INTERNET:8].

DISCUSSION
   If the directly connected network is not subnetted (in the classical sense), a router can normally generate a network Redirect that applies to all hosts on a specified remote network. Using a network rather than a host Redirect may economize slightly on network traffic and on host routing table storage. However, the savings are not significant, and subnets create an ambiguity about the subnet mask to be used to interpret a network Redirect. In a CIDR environment, it is difficult to specify precisely the cases in which network Redirects can be used. Therefore, routers must send only host (or host and type of service) Redirects.

A Code 3 (Redirect for Host and Type of Service) message is generated when the packet provoking the redirect has a destination for which the path chosen by the router would depend (in part) on the TOS requested.

Routers that can generate Code 3 redirects (Host and Type of Service) MUST have a configuration option (which defaults to on) to enable Code 1 (Host) redirects to be substituted for Code 3 redirects. A router MUST send a Code 1 Redirect in place of a Code 3 Redirect if it has been configured to do so.

If a router is not able to generate Code 3 Redirects then it MUST generate Code 1 Redirects in situations where a Code 3 Redirect is called for.

Routers MUST NOT generate a Redirect Message unless all the following conditions are met:

o The packet is being forwarded out the same physical interface that it was received from,

o The IP source address in the packet is on the same Logical IP (sub)network as the next-hop IP address, and

o The packet does not contain an IP source route option.

The source address used in the ICMP Redirect MUST belong to the same logical (sub)net as the destination address.
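
The three conditions and the Code 1 versus Code 3 choice can be combined in one decision function. The sketch below is illustrative only: `redirect_code` and all of its parameter names are assumptions, and the "same Logical IP (sub)network" test is approximated by checking that both addresses fall within one prefix supplied by the caller.

```python
import ipaddress


def redirect_code(in_iface, out_iface, src, next_hop, subnet,
                  has_source_route, tos_dependent=False,
                  can_send_code3=False, substitute_code1=True):
    """Return the Redirect code to send (1 or 3), or None for no Redirect.

    subnet is the prefix of the logical (sub)network on the interface.
    substitute_code1 models the required option, on by default, that
    replaces Code 3 with Code 1.
    """
    # Condition 1: the packet leaves on the interface it arrived on.
    if in_iface != out_iface:
        return None
    # Condition 2: source and next hop share the logical (sub)network.
    network = ipaddress.ip_network(subnet)
    if ipaddress.ip_address(src) not in network:
        return None
    if ipaddress.ip_address(next_hop) not in network:
        return None
    # Condition 3: the packet carries no IP source route option.
    if has_source_route:
        return None
    # Codes 0 and 2 (network Redirects) are never produced here.
    if tos_dependent and can_send_code3 and not substitute_code1:
        return 3
    return 1
```

For example, a packet from 10.0.0.5 that arrives on and leaves through the same interface toward next hop 10.0.0.254, with both addresses inside 10.0.0.0/24 and no source route option, yields Code 1. The same call with `tos_dependent=True`, `can_send_code3=True`, and `substitute_code1=False` yields Code 3.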
A router using a routing protocol (other than static routes) MUST NOT consider paths learned from ICMP Redirects when forwarding a packet. If a router is not using a routing protocol, a router MAY have a configuration that, if set, allows the router to consider routes learned through ICMP Redirects when forwarding packets.

DISCUSSION
   ICMP Redirect is a mechanism for routers to convey routing information to hosts. Routers use other mechanisms to learn routing information, and therefore have no reason to obey redirects. Believing a redirect which contradicted the router's other information would likely create routing loops.

   On the other hand, when a router is not acting as a router, it MUST comply with the behavior required of a host.

5.2.7.3 Time Exceeded

A router MUST generate a Time Exceeded message Code 0 (In Transit) when it discards a packet due to an expired TTL field. A router MAY have a per-interface option to disable origination of these messages on that interface, but that option MUST default to allowing the messages to be originated.

5.2.8 INTERNET GROUP MANAGEMENT PROTOCOL - IGMP

IGMP [INTERNET:4] is a protocol used between hosts and multicast routers on a single physical network to establish hosts' membership in particular multicast groups. Multicast routers use this information, in conjunction with a multicast routing protocol, to support IP multicast forwarding across the Internet.

A router SHOULD implement the multicast router part of IGMP.