5.2.1.2 Unicast

Since the local delivery case is well-covered by [INTRO:2], the following assumes that the IP datagram was queued for forwarding. If the destination is an IP unicast address:

(5) The forwarder determines the next hop IP address for the packet, usually by looking up the packet's destination in the router's routing table. This procedure is described in more detail in Section [5.2.4]. This procedure also decides which network interface should be used to send the packet.

(6) The forwarder verifies that forwarding the packet is permitted. The source and destination addresses should be valid, as described in Section [5.3.7] and Section [5.3.4]. If the router supports administrative constraints on forwarding, such as those described in Section [5.3.9], those constraints must be satisfied.

(7) The forwarder decrements (by at least one) and checks the packet's TTL, as described in Section [5.3.1].

(8) The forwarder performs any IP option processing that could not be completed in step 3.

(9) The forwarder performs any necessary IP fragmentation, as described in Section [4.2.2.7]. Since this step occurs after outbound interface selection (step 5), all fragments of the same datagram will be transmitted out the same interface.

(10) The forwarder determines the Link Layer address of the packet's next hop. The mechanisms for doing this are Link Layer-dependent (see chapter 3).

(11) The forwarder encapsulates the IP datagram (or each of the fragments thereof) in an appropriate Link Layer frame and queues it for output on the interface selected in step 5.

(12) The forwarder sends an ICMP redirect if necessary, as described in Section [4.3.3.2].
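The ordering of unicast steps 5 through 9 can be illustrated with a minimal Python sketch. It is not a definitive implementation: the `Packet` and `Route` types, the `mtu` field, the `lookup` and `permitted` callbacks, and the verdict strings are all hypothetical names introduced for illustration, and option processing (step 8), Link Layer resolution (step 10), and encapsulation (step 11) are left out.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Packet:
    dst: str      # destination IP address
    ttl: int      # Time To Live from the IP header
    length: int   # total datagram length in bytes


@dataclass(frozen=True)
class Route:
    next_hop: str    # next hop IP address chosen in step 5
    interface: str   # outbound interface chosen in step 5
    mtu: int         # hypothetical per-interface MTU, consulted in step 9


def forward_unicast(
    pkt: Packet,
    lookup: Callable[[str], Optional[Route]],
    permitted: Callable[[Packet], bool],
) -> Tuple[str, Optional[Route], Packet]:
    """Walk steps 5-9 in order and return (verdict, route, packet)."""
    route = lookup(pkt.dst)                  # step 5: next hop and interface
    if route is None:
        return ("drop: no route", None, pkt)
    if not permitted(pkt):                   # step 6: validity and admin checks
        return ("drop: not permitted", None, pkt)
    pkt = replace(pkt, ttl=pkt.ttl - 1)      # step 7: decrement, then check TTL
    if pkt.ttl <= 0:
        return ("drop: TTL expired", None, pkt)
    # Step 8 (remaining IP option processing) is omitted from this sketch.
    verdict = "fragment" if pkt.length > route.mtu else "send"   # step 9
    return (verdict, route, pkt)             # steps 10-11 would follow here


# Usage: a one-entry table standing in for the router's routing table.
table = {"10.0.0.5": Route("192.168.1.1", "eth1", 1500)}
result = forward_unicast(Packet("10.0.0.5", 64, 100), table.get, lambda p: True)
```

Because the interface is fixed in step 5 before the fragmentation decision in step 9, every fragment of a datagram inherits the same `Route`, which mirrors the observation made in step 9.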
5.2.1.3 Multicast If the destination is an IP multicast, the following steps are taken. Note that the main differences between the forwarding of IP unicasts and the forwarding of IP multicasts are o IP multicasts are usually forwarded based on both the datagram's source and destination IP addresses, o IP multicast uses an expanding ring search, o IP multicasts are forwarded as Link Level multicasts, and o ICMP errors are never sent in response to IP multicast datagrams. Note that the forwarding of IP multicasts is still somewhat experimental. As a result, the algorithm presented below is not mandatory, and is provided as an example only. (5a) Based on the IP source and destination addresses found in the datagram header, the router determines whether the datagram has been received on the proper interface for forwarding. If not, the datagram is dropped silently. The method for determining the proper receiving interface depends on the multicast routing algorithm(s) in use. In one of the simplest algorithms, reverse path forwarding (RPF), the proper interface is the one that would be used to forward unicasts back to the datagram source. (6a) Based on the IP source and destination addresses found in the datagram header, the router determines the datagram's outgoing interfaces. In order to implement IP multicast's expanding ring search (see [INTERNET:4]) a minimum TTL value is specified for each outgoing interface. A copy of the multicast datagram is forwarded out each outgoing interface whose minimum TTL value is less than or equal to the TTL value in the datagram header, by separately applying the remaining steps on each such interface. (7a) The router decrements the packet's TTL by one. (8a) The forwarder performs any IP option processing that could not be completed in step (3).
(9a) The forwarder performs any necessary IP fragmentation, as described in Section [4.2.2.7].

(10a) The forwarder determines the Link Layer address to use in the Link Level encapsulation. The mechanisms for doing this are Link Layer-dependent. On LANs a Link Level multicast or broadcast is selected, as an algorithmic translation of the datagram's class D destination address. See the various IP-over-xxx specifications for more details.

(11a) The forwarder encapsulates the packet (or each of the fragments thereof) in an appropriate Link Layer frame and queues it for output on the appropriate interface.

5.2.2 IP Header Validation

Before a router can process any IP packet, it MUST perform the following basic validity checks on the packet's IP header to ensure that the header is meaningful. If the packet fails any of the following tests, it MUST be silently discarded, and the error SHOULD be logged.

(1) The packet length reported by the Link Layer must be large enough to hold the minimum length legal IP datagram (20 bytes).

(2) The IP checksum must be correct.

(3) The IP version number must be 4. If the version number is not 4 then the packet may well be another version of IP, such as ST-II.

(4) The IP header length field must be at least 5.

(5) The IP total length field must be at least 4 * IP header length field.

A router MUST NOT have a configuration option which allows disabling any of these tests.

If the packet passes the second and third tests, the IP header length field is at least 4, and both the IP total length field and the packet length reported by the Link Layer are at least 16 then, despite the above rule, the router MAY respond with an ICMP Parameter Problem message, whose pointer points at the IP header length field (if it failed the fourth test) or the IP total length
field (if it failed the fifth test). However, it still MUST discard the packet and still SHOULD log the error. These rules (and this entire document) apply only to version 4 of the Internet Protocol. These rules should not be construed as prohibiting routers from supporting other versions of IP. Furthermore, if a router can truly classify a packet as being some other version of IP then it ought not treat that packet as an error packet within the context of this memo. IMPLEMENTATION: It is desirable for purposes of error reporting, though not always entirely possible, to determine why a header was invalid. There are four possible reasons: o The Link Layer truncated the IP header o The datagram is using a version of IP other than the standard one (version 4). o The IP header has been corrupted in transit. o The sender generated an illegal IP header. It is probably desirable to perform the checks in the order listed, since we believe that this ordering is most likely to correctly categorize the cause of the error. For purposes of error reporting, it may also be desirable to check if a packet which fails these tests has an IP version number equal to 6. If it does, the packet is probably an ST-II datagram and should be treated as such. ST-II is described in [FORWARD:1]. Additionally, the router SHOULD verify that the packet length reported by the Link Layer is at least as large as the IP total length recorded in the packet's IP header. If it appears that the packet has been truncated, the packet MUST be discarded, the error SHOULD be logged, and the router SHOULD respond with an ICMP Parameter Problem message whose pointer points at the IP total length field. DISCUSSION: Because any higher layer protocol which concerns itself with data corruption will detect truncation of the packet data when it reaches its final destination, it is not absolutely necessary for routers to perform the check suggested above in order to maintain protocol correctness. 
However, by making this check a router can simplify considerably the task of
determining which hop in the path is truncating the packets. It will also reduce the expenditure of resources down-stream from the router in that down-stream systems will not need to deal with the packet.

Finally, if the destination address in the IP header is not one of the addresses of the router, the router SHOULD verify that the packet does not contain a Strict Source and Record Route option. If a packet fails this test, the router SHOULD log the error and SHOULD respond with an ICMP Parameter Problem error with the pointer pointing at the offending packet's IP destination address.

DISCUSSION: Some people might suggest that the router should respond with a Bad Source Route message instead of a Parameter Problem message. However, when a packet fails this test, it usually indicates a protocol error by the previous hop router, whereas Bad Source Route would suggest that the source host had requested a nonexistent or broken path through the network.

5.2.3 Local Delivery Decision

When a router receives an IP packet, it must decide whether the packet is addressed to the router (and should be delivered locally) or the packet is addressed to another system (and should be handled by the forwarder). There is also a hybrid case, where certain IP broadcasts and IP multicasts are both delivered locally and forwarded. A router MUST determine which of these three cases applies using the following rules:

o An unexpired source route option is one whose pointer value does not point past the last entry in the source route. If the packet contains an unexpired source route option, the pointer in the option is advanced until either the pointer does point past the last address in the option or else the next address is not one of the router's own addresses. In the latter (normal) case, the packet is forwarded (and not delivered locally) regardless of the rules below.
o The packet is delivered locally and not considered for forwarding in the following cases:

- The packet's destination address exactly matches one of the router's IP addresses,

- The packet's destination address is a limited broadcast address ({-1, -1}), or

- The packet's destination is an IP multicast address which is limited to a single subnet (such as 224.0.0.1 or 224.0.0.2) and (at least) one of the logical interfaces associated with the physical interface on which the packet arrived is a member of the destination multicast group.

o The packet is passed to the forwarder AND delivered locally in the following cases:

- The packet's destination address is an IP broadcast address that addresses at least one of the router's logical interfaces but does not address any of the logical interfaces associated with the physical interface on which the packet arrived, or

- The packet's destination is an IP multicast address which is not limited to a single subnetwork (as 224.0.0.1 and 224.0.0.2 are) and (at least) one of the logical interfaces associated with the physical interface on which the packet arrived is a member of the destination multicast group.

o The packet is delivered locally if the packet's destination address is an IP broadcast address (other than a limited broadcast address) that addresses at least one of the logical interfaces associated with the physical interface on which the packet arrived. The packet is ALSO passed to the forwarder unless the link on which the packet arrived uses an IP encapsulation that does not encapsulate broadcasts differently than unicasts (e.g. by using different Link Layer destination addresses).

o The packet is passed to the forwarder in all other cases.

DISCUSSION: The purpose of the requirement in the last sentence of the fourth bullet is to deal with a directed broadcast to another net or subnet on the same physical cable. Normally, this works as expected: the sender sends the broadcast to the router as a Link Layer unicast. The router notes that it arrived as a unicast, and therefore must be destined for a different logical net (or subnet) than the sender sent it on.
Therefore, the router can safely send it as a Link Layer broadcast out the same (physical) interface over which it arrived. However, if the router can't tell whether the packet was received as a Link Layer unicast, the sentence ensures that the router does the
safe but wrong thing rather than the unsafe but right thing. IMPLEMENTATION: As described in Section [5.3.4], packets received as Link Layer broadcasts are generally not forwarded. It may be advantageous to avoid passing to the forwarder packets it would later discard because of the rules in that section. Some Link Layers (either because of the hardware or because of special code in the drivers) can deliver to the router copies of all Link Layer broadcasts and multicasts it transmits. Use of this feature can simplify the implementation of cases where a packet has to both be passed to the forwarder and delivered locally, since forwarding the packet will automatically cause the router to receive a copy of the packet that it can then deliver locally. One must use care in these circumstances in order to prevent treating a received loop-back packet as a normal packet that was received (and then being subject to the rules of forwarding, etc etc). Even in the absence of such a Link Layer, it is of course hardly necessary to make a copy of an entire packet in order to queue it both for forwarding and for local delivery, though care must be taken with fragments, since reassembly is performed on locally delivered packets but not on forwarded packets. One simple scheme is to associate a flag with each packet on the router's output queue which indicates whether it should be queued for local delivery after it has been sent. 5.2.4 Determining the Next Hop Address When a router is going to forward a packet, it must determine whether it can send it directly to its destination, or whether it needs to pass it through another router. If the latter, it needs to determine which router to use. This section explains how these determinations are made. 
This section makes use of the following definitions:

o LSRR - IP Loose Source and Record Route option

o SSRR - IP Strict Source and Record Route option

o Source Route Option - an LSRR or an SSRR

o Ultimate Destination Address - where the packet is being sent to: the last address in the source route of a source-routed packet, or the destination address in the IP header of a non-source-routed packet

o Adjacent - reachable without going through any IP routers

o Next Hop Address - the IP address of the adjacent host or router to which the packet should be sent next

o Immediate Destination Address - the ultimate destination address, except in source routed packets, where it is the next address specified in the source route

o Immediate Destination - the node, system, router, end-system, or whatever that is addressed by the Immediate Destination Address.

5.2.4.1 Immediate Destination Address

If the destination address in the IP header is one of the addresses of the router and the packet contains a Source Route Option, the Immediate Destination Address is the address pointed at by the pointer in that option if the pointer does not point past the end of the option. Otherwise, the Immediate Destination Address is the same as the IP destination address in the IP header.

A router MUST use the Immediate Destination Address, not the Ultimate Destination Address, when determining how to handle a packet.

It is an error for more than one source route option to appear in a datagram. If a router receives such a datagram, it SHOULD discard the packet and reply with an ICMP Parameter Problem message whose pointer points at the beginning of the second source route option.

5.2.4.2 Local/Remote Decision

After it has been determined that the IP packet needs to be forwarded in accordance with the rules specified in Section [5.2.3], the following algorithm MUST be used to determine if the Immediate Destination is directly accessible (see [INTERNET:2]):

(1) For each network interface that has not been assigned any IP address (the unnumbered lines as described in Section
[2.2.7]), compare the router-id of the other end of the line to the Immediate Destination Address. If they are exactly equal, the packet can be transmitted through this interface. DISCUSSION: In other words, the router or host at the remote end of the line is the destination of the packet or is the next step in the source route of a source routed packet. (2) If no network interface has been selected in the first step, for each IP address assigned to the router: (a) Apply the subnet mask associated with the address to this IP address. IMPLEMENTATION: The result of this operation will usually have been computed and saved during initialization. (b) Apply the same subnet mask to the Immediate Destination Address of the packet. (c) Compare the resulting values. If they are equal to each other, the packet can be transmitted through the corresponding network interface. (3) If an interface has still not been selected, the Immediate Destination is accessible only through some other router. The selection of the router and the next hop IP address is described in Section [5.2.4.3]. 5.2.4.3 Next Hop Address EDITOR'S COMMENTS: Note that this section has been extensively rewritten. The original document indicated that Phil Almquist wished to revise this section to conform to his "Ruminations on the Next Hop" document. I am under the assumption that the working group generally agreed with this goal; there was an editor's note from Phil that remained in this document to that effect, and the RoNH document contains a "mandatory RRWG algorithm". So, I have taken said algorithm from RoNH and moved it into here.
Additional useful or interesting information from RoNH has been extracted and placed into an appendix to this note. The router applies the algorithm in the previous section to determine if the Immediate Destination Address is adjacent. If so, the next hop address is the same as the Immediate Destination Address. Otherwise, the packet must be forwarded through another router to reach its Immediate Destination. The selection of this router is the topic of this section. If the packet contains an SSRR, the router MUST discard the packet and reply with an ICMP Bad Source Route error. Otherwise, the router looks up the Immediate Destination Address in its routing table to determine an appropriate next hop address. DISCUSSION: Per the IP specification, a Strict Source Route must specify a sequence of nodes through which the packet must traverse; the packet must go from one node of the source route to the next, traversing intermediate networks only. Thus, if the router is not adjacent to the next step of the source route, the source route can not be fulfilled. Therefore, the ICMP Bad Source Route error. The goal of the next-hop selection process is to examine the entries in the router's Forwarding Information Base (FIB) and select the best route (if there is one) for the packet from those available in the FIB. Conceptually, any route lookup algorithm starts out with a set of candidate routes which consists of the entire contents of the FIB. The algorithm consists of a series of steps which discard routes from the set. These steps are referred to as Pruning Rules. Normally, when the algorithm terminates there is exactly one route remaining in the set. If the set ever becomes empty, the packet is discarded because the destination is unreachable. It is also possible for the algorithm to terminate when more than one route remains in the set. 
In this case, the router may arbitrarily discard all but one of them, or may perform "load-splitting" by choosing whichever of the routes has been least recently used. With the exception of Rule 3 (Weak TOS), a router MUST use the following Pruning Rules when selecting a next hop for a packet. If a router does consider TOS when making next-hop decisions, Rule 3 must be applied in the order indicated below. These
rules MUST be (conceptually) applied to the FIB in the order that they are presented. (For some historical perspective, additional pruning rules, and other common algorithms in use, see Appendix E). DISCUSSION: Rule 3 is optional in that Section [5.3.2] says that a router only SHOULD consider TOS when making forwarding decisions. (1) Basic Match This rule discards any routes to destinations other than the Immediate Destination Address of the packet. For example, if a packet's Immediate Destination Address is 36.144.2.5, this step would discard a route to net 128.12.0.0 but would retain any routes to net 36.0.0.0, any routes to subnet 36.144.0.0, and any default routes. More precisely, we assume that each route has a destination attribute, called route.dest, and a corresponding mask, called route.mask, to specify which bits of route.dest are significant. The Immediate Destination Address of the packet being forwarded is ip.dest. This rule discards all routes from the set of candidate routes except those for which (route.dest & route.mask) = (ip.dest & route.mask). (2) Longest Match Longest Match is a refinement of Basic Match, described above. After Basic Match pruning is performed, the remaining routes are examined to determine the maximum number of bits set in any of their route.mask attributes. The step then discards from the set of candidate routes any routes which have fewer than that maximum number of bits set in their route.mask attributes. For example, if a packet's Immediate Destination Address is 36.144.2.5 and there are {route.dest, route.mask} pairs of {36.144.2.0, 255.255.255.0}, {36.144.0.5, 255.255.0.255}, {36.144.0.0, 255.255.0.0}, and {36.0.0.0, 255.0.0.0}, then this rule would keep only the first two pairs; {36.144.2.0, 255.255.255.0} and {36.144.0.5, 255.255.0.255}.
(3) Weak TOS

Each route has a type of service attribute, called route.tos, whose possible values are assumed to be identical to those used in the TOS field of the IP header. Routing protocols which distribute TOS information fill in route.tos appropriately in routes they add to the FIB; routes from other routing protocols are treated as if they have the default TOS (0000). The TOS field in the IP header of the packet being routed is called ip.tos.

The set of candidate routes is examined to determine if it contains any routes for which route.tos = ip.tos. If so, all routes except those for which route.tos = ip.tos are discarded. If not, all routes except those for which route.tos = 0000 are discarded from the set of candidate routes.

Additional discussion of routing based on Weak TOS may be found in [ROUTE:11].

DISCUSSION: The effect of this rule is to select only those routes which have a TOS that matches the TOS requested in the packet. If no such routes exist then routes with the default TOS are considered. Routes with a non-default TOS that is not the TOS requested in the packet are never used, even if such routes are the only available routes that go to the packet's destination.

(4) Best Metric

Each route has a metric attribute, called route.metric, and a routing domain identifier, called route.domain. Each member of the set of candidate routes is compared with each other member of the set. If route.domain is equal for the two routes and route.metric is strictly inferior for one when compared with the other, then the one with the inferior metric is discarded from the set. The determination of inferior is usually by a simple arithmetic comparison, though some protocols may have structured metrics requiring more complex comparisons.

(5) Vendor Policy

Vendor Policy is sort of a catch-all to make up for the fact that the previously listed rules are often inadequate to choose from among the possible routes. Vendor Policy pruning rules are extremely vendor-specific.
See section [5.2.4.4].
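Pruning Rules 1 through 4 can be sketched in Python as successive filters over a candidate set. This is a minimal illustration under stated assumptions, not a recommended FIB design: the `Route` dataclass and the function names are hypothetical, addresses and masks are held as 32-bit integers, "inferior" is assumed to mean a numerically larger metric, and Vendor Policy (Rule 5) is left out because it is vendor-specific.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List


@dataclass(frozen=True)
class Route:
    dest: int            # route.dest as a 32-bit integer
    mask: int            # route.mask as a 32-bit integer (need not be contiguous)
    tos: int = 0         # route.tos; 0 is the default TOS (0000)
    metric: int = 0      # route.metric; smaller is assumed to be better
    domain: str = "d1"   # route.domain


def addr(text: str) -> int:
    """Convert a dotted quad to a 32-bit integer."""
    return int(IPv4Address(text))


def mask_bits(route: Route) -> int:
    """Count the bits set in route.mask."""
    return bin(route.mask).count("1")


def basic_match(routes: List[Route], ip_dest: int) -> List[Route]:
    # Rule 1: keep routes where (route.dest & route.mask) = (ip.dest & route.mask).
    return [r for r in routes if (r.dest & r.mask) == (ip_dest & r.mask)]


def longest_match(routes: List[Route]) -> List[Route]:
    # Rule 2: keep only routes with the maximum number of mask bits set.
    if not routes:
        return []
    best = max(mask_bits(r) for r in routes)
    return [r for r in routes if mask_bits(r) == best]


def weak_tos(routes: List[Route], ip_tos: int) -> List[Route]:
    # Rule 3: prefer routes with route.tos = ip.tos, else fall back to TOS 0000.
    wanted = ip_tos if any(r.tos == ip_tos for r in routes) else 0
    return [r for r in routes if r.tos == wanted]


def best_metric(routes: List[Route]) -> List[Route]:
    # Rule 4: discard a route if a same-domain route has a strictly better metric.
    return [r for r in routes
            if not any(o.domain == r.domain and o.metric < r.metric for o in routes)]


def prune(fib: List[Route], ip_dest: int, ip_tos: int = 0) -> List[Route]:
    """Apply Rules 1-4 in the order in which they are presented."""
    return best_metric(weak_tos(longest_match(basic_match(fib, ip_dest)), ip_tos))


# The example from Rules 1 and 2: Immediate Destination Address 36.144.2.5.
fib = [
    Route(addr("36.144.2.0"), addr("255.255.255.0")),
    Route(addr("36.144.0.5"), addr("255.255.0.255")),
    Route(addr("36.144.0.0"), addr("255.255.0.0")),
    Route(addr("36.0.0.0"), addr("255.0.0.0")),
    Route(addr("128.12.0.0"), addr("255.255.0.0")),
]
survivors = prune(fib, addr("36.144.2.5"))
```

With this table, Basic Match removes the route to net 128.12.0.0 and Longest Match keeps the two routes whose masks have 24 bits set, as in the example given under Rule 2. Two routes remain, so a real router would go on to apply Vendor Policy or load-splitting.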
This algorithm has two distinct disadvantages. Presumably, a router implementor might develop techniques to deal with these disadvantages and make them a part of the Vendor Policy pruning rule. (1) IS-IS and OSPF route classes are not directly handled. (2) Path properties other than type of service (e.g. MTU) are ignored. It is also worth noting a deficiency in the way that TOS is supported: routing protocols which support TOS are implicitly preferred when forwarding packets which have non-zero TOS values. The Basic Match and Longest Match pruning rules generalize the treatment of a number of particular types of routes. These routes are selected in the following, decreasing, order of preference: (1) Host Route: This is a route to a specific end system. (2) Subnetwork Route: This is a route to a particular subnet of a network. (3) Default Subnetwork Route: This is a route to all subnets of a particular net for which there are not (explicit) subnet routes. (4) Network Route: This is a route to a particular network. (5) Default Network Route (also known as the default route): This is a route to all networks for which there are no explicit routes to the net or any of its subnets. If, after application of the pruning rules, the set of routes is empty (i.e., no routes were found), the packet MUST be discarded and an appropriate ICMP error generated (ICMP Bad Source Route if the Immediate Destination Address came from a source route option; otherwise, whichever of ICMP Destination Host Unreachable or Destination Network Unreachable is appropriate, as described in Section [4.3.3.1]).
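The choice of ICMP error for an empty candidate set can be written as a small decision function. The sketch below is illustrative only. It assumes that "ICMP Bad Source Route" corresponds to Destination Unreachable Code 5 (Source Route Failed) as listed in Section [5.2.7.1], and it takes the host-versus-network decision of Section [4.3.3.1] as an input (`host_specific`, a hypothetical parameter) rather than reproducing that section's rules.

```python
from enum import IntEnum


class Unreachable(IntEnum):
    """Destination Unreachable codes used when no route remains."""
    NETWORK = 0
    HOST = 1
    SOURCE_ROUTE_FAILED = 5


def error_for_empty_route_set(from_source_route: bool,
                              host_specific: bool) -> Unreachable:
    """Pick the ICMP error once the pruning rules leave no candidate route.

    from_source_route: the Immediate Destination Address came from a
        source route option.
    host_specific: outcome of the Section 4.3.3.1 decision (assumed to be
        computed elsewhere); True selects Host Unreachable.
    """
    if from_source_route:
        return Unreachable.SOURCE_ROUTE_FAILED
    return Unreachable.HOST if host_specific else Unreachable.NETWORK
```

The packet itself is discarded in every case; only the code carried in the ICMP error differs.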
5.2.4.4 Administrative Preference One suggested mechanism for the Vendor Policy Pruning Rule is to use administrative preference. Each route has associated with it a preference value, based on various attributes of the route (specific mechanisms for assignment of preference values are suggested below). This preference value is an integer in the range [0..255], with zero being the most preferred and 254 being the least preferred. 255 is a special value that means that the route should never be used. The first step in the Vendor Policy pruning rule discards all but the most preferable routes (and always discards routes whose preference value is 255). This policy is not safe in that it can easily be misused to create routing loops. Since no protocol ensures that the preferences configured for a router are consistent with the preferences configured in its neighbors, network managers must exercise care in configuring preferences. o Address Match It is useful to be able to assign a single preference value to all routes (learned from the same routing domain) to any of a specified set of destinations, where the set of destinations is all destinations that match a specified address/mask pair. o Route Class For routing protocols which maintain the distinction, it is useful to be able to assign a single preference value to all routes (learned from the same routing domain) which have a particular route class (intra-area, inter-area, external with internal metrics, or external with external metrics). o Interface It is useful to be able to assign a single preference value to all routes (learned from a particular routing domain) that would cause packets to be routed out a particular logical interface on the router (logical interfaces generally map one-to-one onto the router's network interfaces, except that any network interface which has multiple IP addresses will have multiple logical interfaces associated with it). 
o Source router

It is useful to be able to assign a single preference value to all routes (learned from the same routing domain) which were learned from any of a set of routers, where the set consists of those routers whose updates have a source address which matches a specified address/mask pair.

o Originating AS

For routing protocols which provide the information, it is useful to be able to assign a single preference value to all routes (learned from a particular routing domain) which originated in another particular routing domain. For BGP routes, the originating AS is the first AS listed in the route's AS_PATH attribute. For OSPF external routes, the originating AS may be considered to be the low order 16 bits of the route's external route tag if the tag's Automatic bit is set and the tag's PathLength is not equal to 3.

o External route tag

It is useful to be able to assign a single preference value to all OSPF external routes (learned from the same routing domain) whose external route tags match any of a list of specified values. Because the external route tag may contain a structured value, it may be useful to provide the ability to match particular subfields of the tag.

o AS path

It may be useful to be able to assign a single preference value to all BGP routes (learned from the same routing domain) whose AS path "matches" any of a set of specified values. It is not yet clear exactly what kinds of matches are most useful. A simple option would be to allow matching of all routes for which a particular AS number appears (or alternatively, does not appear) anywhere in the route's AS_PATH attribute. A more general but somewhat more difficult alternative would be to allow matching all routes for which the AS path matches a specified regular expression.

5.2.4.6 Load Splitting

At the end of the Next-hop selection process, multiple routes may still remain. A router has several options when this occurs. It may arbitrarily discard some of the routes.
It may reduce the number of candidate routes by comparing metrics of routes from routing domains which are not considered equivalent. It may retain more than one route and employ a load-splitting mechanism to divide traffic among them. Perhaps the only thing that can be said about the relative merits of
the options is that load-splitting is useful in some situations but not in others, so a wise implementor who implements load-splitting will also provide a way for the network manager to disable it.

5.2.5 Unused IP Header Bits: RFC-791 Section 3.1

The IP header contains several reserved bits, in the Type of Service field and in the Flags field. Routers MUST NOT drop packets merely because one or more of these reserved bits has a non-zero value.

Routers MUST ignore and MUST pass through unchanged the values of these reserved bits. If a router fragments a packet, it MUST copy these bits into each fragment.

DISCUSSION: Future revisions to the IP protocol may make use of these unused bits. These rules are intended to ensure that these revisions can be deployed without having to simultaneously upgrade all routers in the Internet.

5.2.6 Fragmentation and Reassembly: RFC-791 Section 3.2

As was discussed in Section [4.2.2.7], a router MUST support IP fragmentation.

A router MUST NOT reassemble any datagram before forwarding it.

DISCUSSION: A few people have suggested that there might be some topologies where reassembly of transit datagrams by routers might improve performance. In general, however, the fact that fragments may take different paths to the destination precludes safe use of such a feature.

Nothing in this section should be construed to control or limit fragmentation or reassembly performed as a link layer function by the router.
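The requirement in Section 5.2.5 that reserved bits be copied into each fragment can be illustrated with a short Python sketch. It is a simplified model rather than a complete fragmentation routine: the `Datagram` type and the `max_payload` parameter (the largest payload that fits in one fragment) are hypothetical, only the TOS octet, the 3-bit Flags field, and the fragment offset are modelled, and IP options are ignored. The flag positions follow RFC 791, where the Flags field holds a reserved bit, DF, and MF in that order.

```python
from dataclasses import dataclass
from typing import List

# The 3-bit Flags field of RFC 791, written as a 3-bit integer.
RESERVED_FLAG = 0b100   # reserved bit; a router passes it through unchanged
DF = 0b010              # Don't Fragment
MF = 0b001              # More Fragments


@dataclass(frozen=True)
class Datagram:
    tos: int           # whole Type of Service octet, including reserved bits
    flags: int         # 3-bit Flags field
    frag_offset: int   # fragment offset in 8-byte units
    payload: bytes


def fragment(d: Datagram, max_payload: int) -> List[Datagram]:
    """Split d so that no fragment carries more than max_payload bytes."""
    if len(d.payload) <= max_payload:
        return [d]
    if d.flags & DF:
        raise ValueError("fragmentation needed and DF set")
    chunk = max_payload - (max_payload % 8)   # offsets count 8-byte units
    if chunk == 0:
        raise ValueError("max_payload is too small to fragment")
    fragments = []
    for start in range(0, len(d.payload), chunk):
        is_last = start + chunk >= len(d.payload)
        flags = d.flags & RESERVED_FLAG       # copy the reserved flag bit
        if not is_last or (d.flags & MF):     # keep MF if d was itself a fragment
            flags |= MF
        fragments.append(Datagram(
            tos=d.tos,                        # TOS octet copied unchanged
            flags=flags,
            frag_offset=d.frag_offset + start // 8,
            payload=d.payload[start:start + chunk],
        ))
    return fragments


# Usage: a 20-byte payload split for a link that carries 10 payload bytes.
original = Datagram(tos=0b00000011, flags=RESERVED_FLAG, frag_offset=0,
                    payload=bytes(20))
pieces = fragment(original, 10)
```

The sketch raises an exception for the DF case only to keep the example short; a router would instead discard the datagram and generate the Code 4 Destination Unreachable message described in Section 5.2.7.1.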
5.2.7 Internet Control Message Protocol - ICMP

General requirements for ICMP were discussed in Section [4.3]. This section discusses ICMP messages which are sent only by routers.

5.2.7.1 Destination Unreachable

The ICMP Destination Unreachable message is sent by a router in response to a packet which it cannot forward because the destination (or next hop) is unreachable or a service is unavailable.

A router MUST be able to generate ICMP Destination Unreachable messages and SHOULD choose a response code that most closely matches the reason why the message is being generated.

The following codes are defined in [INTERNET:8] and [INTRO:2]:

0 = Network Unreachable - generated by a router if a forwarding path (route) to the destination network is not available;

1 = Host Unreachable - generated by a router if a forwarding path (route) to the destination host on a directly connected network is not available;

2 = Protocol Unreachable - generated if the transport protocol designated in a datagram is not supported in the transport layer of the final destination;

3 = Port Unreachable - generated if the designated transport protocol (e.g. UDP) is unable to demultiplex the datagram in the transport layer of the final destination but has no protocol mechanism to inform the sender;

4 = Fragmentation Needed and DF Set - generated if a router needs to fragment a datagram but cannot since the DF flag is set;

5 = Source Route Failed - generated if a router cannot forward a packet to the next hop in a source route option;

6 = Destination Network Unknown - This code SHOULD NOT be generated since it would imply on the part of the router that the destination network does not exist (net unreachable code 0 SHOULD be used in place of code 6);
7 = Destination Host Unknown - generated only when a router can determine (from link layer advice) that the destination host does not exist; 11 = Network Unreachable For Type Of Service - generated by a router if a forwarding path (route) to the destination network with the requested or default TOS is not available; 12 = Host Unreachable For Type Of Service - generated if a router cannot forward a packet because its route(s) to the destination do not match either the TOS requested in the datagram or the default TOS (0). The following additional codes are hereby defined: 13 = Communication Administratively Prohibited - generated if a router cannot forward a packet due to administrative filtering; 14 = Host Precedence Violation. Sent by the first hop router to a host to indicate that a requested precedence is not permitted for the particular combination of source/destination host or network, upper layer protocol, and source/destination port; 15 = Precedence cutoff in effect. The network operators have imposed a minimum level of precedence required for operation, the datagram was sent with a precedence below this level; NOTE: [INTRO:2] defined Code 8 for source host isolated. Routers SHOULD NOT generate Code 8; whichever of Codes 0 (Network Unreachable) and 1 (Host Unreachable) is appropriate SHOULD be used instead. [INTRO:2] also defined Code 9 for communication with destination network administratively prohibited and Code 10 for communication with destination host administratively prohibited. These codes were intended for use by end-to-end encryption devices used by U.S military agencies. Routers SHOULD use the newly defined Code 13 (Communication Administratively Prohibited) if they administratively filter packets. Routers MAY have a configuration option that causes Code 13 (Communication Administratively Prohibited) messages not to be generated. When this option is enabled, no ICMP error message is sent in response to a packet which is dropped because its
forwarding is administratively prohibited. Similarly, routers MAY have a configuration option that causes Code 14 (Host Precedence Violation) and Code 15 (Precedence Cutoff in Effect) messages not to be generated. When this option is enabled, no ICMP error message is sent in response to a packet which is dropped because of a precedence violation. Routers MUST use Host Unreachable or Destination Host Unknown codes whenever other hosts on the same destination network might be reachable; otherwise, the source host may erroneously conclude that all hosts on the network are unreachable, and that may not be the case. [INTERNET:14] describes a slight modification to the form of Destination Unreachable messages containing Code 4 (Fragmentation needed and DF set). A router MUST use this modified form when originating Code 4 Destination Unreachable messages. 5.2.7.2 Redirect The ICMP Redirect message is generated to inform a host on the same subnet that the router used by the host to route certain packets should be changed. Routers MUST NOT generate the Redirect for Network or Redirect for Network and Type of Service messages (Codes 0 and 2) specified in [INTERNET:8]. Routers MUST be able to generate the Redirect for Host message (Code 1) and SHOULD be able to generate the Redirect for Type of Service and Host message (Code 3) specified in [INTERNET:8]. DISCUSSION: If the directly-connected network is not subnetted, a router can normally generate a network Redirect which applies to all hosts on a specified remote network. Using a network rather than a host Redirect may economize slightly on network traffic and on host routing table storage. However, the savings are not significant, and subnets create an ambiguity about the subnet mask to be used to interpret a network Redirect. In a general subnet environment, it is difficult to specify precisely the cases in which network Redirects can be used. Therefore, routers must send only host (or host and type of service) Redirects.
A Code 3 (Redirect for Host and Type of Service) message is
generated when the packet provoking the redirect has a destination for which the path chosen by the router would depend (in part) on the TOS requested. Routers which can generate Code 3 redirects (Host and Type of Service) MUST have a configuration option (which defaults to on) to enable Code 1 (Host) redirects to be substituted for Code 3 redirects. A router MUST send a Code 1 Redirect in place of a Code 3 Redirect if it has been configured to do so. If a router is not able to generate Code 3 Redirects then it MUST generate Code 1 Redirects in situations where a Code 3 Redirect is called for. Routers MUST NOT generate a Redirect Message unless all of the following conditions are met: o The packet is being forwarded out the same physical interface that it was received from, o The IP source address in the packet is on the same Logical IP (sub)network as the next-hop IP address, and o The packet does not contain an IP source route option. The source address used in the ICMP Redirect MUST belong to the same logical (sub)net as the destination address. A router using a routing protocol (other than static routes) MUST NOT consider paths learned from ICMP Redirects when forwarding a packet. If a router is not using a routing protocol, a router MAY have a configuration which, if set, allows the router to consider routes learned via ICMP Redirects when forwarding packets. DISCUSSION: ICMP Redirect is a mechanism for routers to convey routing information to hosts. Routers use other mechanisms to learn routing information, and therefore have no reason to obey redirects. Believing a redirect which contradicted the router's other information would likely create routing loops. On the other hand, when a router is not acting as a router, it MUST comply with the behavior required of a host.
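The Redirect conditions above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the ForwardingDecision record, the redirect_code function, and the idea of passing the interface's logical subnet as local_net are all assumptions made for the example.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network
from typing import Optional


@dataclass
class ForwardingDecision:
    """Hypothetical summary of one forwarding decision."""
    in_interface: str          # physical interface the packet arrived on
    out_interface: str         # physical interface chosen for output
    source: IPv4Address        # IP source address of the packet
    next_hop: IPv4Address      # next-hop IP address chosen by the router
    has_source_route: bool     # packet carries an IP source route option
    tos_dependent: bool        # chosen path depended on the requested TOS


def redirect_code(d: ForwardingDecision,
                  local_net: IPv4Network,
                  can_send_code3: bool = False,
                  substitute_code1: bool = True) -> Optional[int]:
    """Return the Redirect code to send (1 or 3), or None for no Redirect."""
    # Condition 1: forwarded out the interface it was received on.
    if d.in_interface != d.out_interface:
        return None
    # Condition 2: source and next hop share one logical (sub)network.
    if d.source not in local_net or d.next_hop not in local_net:
        return None
    # Condition 3: no IP source route option in the packet.
    if d.has_source_route:
        return None
    # Code 3 only if supported and not configured to be replaced by Code 1
    # (the substitution option defaults to on).
    if d.tos_dependent and can_send_code3 and not substitute_code1:
        return 3
    return 1
```

Codes 0 and 2 never appear as return values, which mirrors the rule that network Redirects are not generated.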
5.2.7.3 Time Exceeded A router MUST generate a Time Exceeded message Code 0 (In Transit) when it discards a packet due to an expired TTL field. A router MAY have a per-interface option to disable origination of these messages on that interface, but that option MUST default to allowing the messages to be originated. 5.2.8 INTERNET GROUP MANAGEMENT PROTOCOL - IGMP IGMP [INTERNET:4] is a protocol used between hosts and multicast routers on a single physical network to establish hosts' membership in particular multicast groups. Multicast routers use this information, in conjunction with a multicast routing protocol, to support IP multicast forwarding across the Internet. A router SHOULD implement the multicast router part of IGMP. 5.3 SPECIFIC ISSUES 5.3.1 Time to Live (TTL) The Time-to-Live (TTL) field of the IP header is defined to be a timer limiting the lifetime of a datagram. It is an 8-bit field and the units are seconds. Each router (or other module) that handles a packet MUST decrement the TTL by at least one, even if the elapsed time was much less than a second. Since this is very often the case, the TTL is effectively a hop count limit on how far a datagram can propagate through the Internet. When a router forwards a packet, it MUST reduce the TTL by at least one. If it holds a packet for more than one second, it MAY decrement the TTL by one for each second. If the TTL is reduced to zero (or less), the packet MUST be discarded, and if the destination is not a multicast address the router MUST send an ICMP Time Exceeded message, Code 0 (TTL Exceeded in Transit), to the source. Note that a router MUST NOT discard an IP unicast or broadcast packet with a non-zero TTL merely because it can predict that another router on the path to the packet's final destination will decrement the TTL to zero. However, a router MAY do so for IP multicasts, in order to more efficiently implement IP multicast's expanding ring search algorithm (see [INTERNET:4]).
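The TTL rules above can be sketched in Python as follows. The function name process_ttl, its return tuple, and the seconds_held parameter are assumptions made for illustration. The optional per-second decrement is shown only as one permitted behavior.

```python
from ipaddress import IPv4Address
from typing import Tuple


def process_ttl(ttl: int, destination: str,
                seconds_held: int = 0) -> Tuple[str, int, bool]:
    """Return (action, new_ttl, send_time_exceeded) for a forwarded packet.

    action is "forward" or "discard". send_time_exceeded is True when an
    ICMP Time Exceeded (Code 0) message should go back to the source.
    """
    # Always decrement by at least one. Optionally decrement by one per
    # whole second the packet was held (a MAY in the text above).
    decrement = max(1, seconds_held)
    new_ttl = ttl - decrement

    if new_ttl <= 0:
        # Discard. Time Exceeded is sent only for non-multicast destinations.
        is_multicast = IPv4Address(destination).is_multicast
        return ("discard", 0, not is_multicast)

    return ("forward", new_ttl, False)
```

For example, process_ttl(1, "224.0.0.5") discards the packet without requesting an ICMP message, while process_ttl(1, "192.0.2.1") discards it and requests one.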
DISCUSSION: The IP TTL is used, somewhat schizophrenically, as both a hop count limit and a time limit. Its hop count function is critical to ensuring that routing problems can't melt down the network by causing packets to loop infinitely in the network. The time limit function is used by transport protocols such as TCP to ensure reliable data transfer. Many current implementations treat TTL as a pure hop count, and in parts of the Internet community there is a strong sentiment that the time limit function should instead be performed by the transport protocols that need it. In this specification, we have reluctantly decided to follow the strong belief among the router vendors that the time limit function should be optional. They argued that implementation of the time limit function is difficult enough that it is currently not generally done. They further pointed to the lack of documented cases where this shortcut has caused TCP to corrupt data (of course, we would expect the problems created to be rare and difficult to reproduce, so the lack of documented cases provides little reassurance that there haven't been a number of undocumented cases). IP multicast notions such as the expanding ring search may not work as expected unless the TTL is treated as a pure hop count. The same thing is somewhat true of traceroute. ICMP Time Exceeded messages are required because the traceroute diagnostic tool depends on them. Thus, the tradeoff is between severely crippling, if not eliminating, two very useful tools vs. a very rare and transient data transport problem (which may not occur at all). 5.3.2 Type of Service (TOS) The Type-of-Service byte in the IP header is divided into three sections: the Precedence field (high-order 3 bits), a field that is customarily called Type of Service or "TOS" (next 4 bits), and a reserved bit (the low order bit). Rules governing the reserved bit were described in Section [4.2.2.3]. The Precedence field will be discussed in Section [5.3.3].
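The three-part layout of the Type-of-Service byte described above can be illustrated with a short Python sketch. The function name split_tos_byte is an assumption made for this example.

```python
from typing import Tuple


def split_tos_byte(tos_byte: int) -> Tuple[int, int, int]:
    """Split the Type-of-Service byte into (precedence, tos, reserved)."""
    if not 0 <= tos_byte <= 0xFF:
        raise ValueError("Type-of-Service byte must fit in 8 bits")
    precedence = (tos_byte >> 5) & 0x07   # high-order 3 bits
    tos = (tos_byte >> 1) & 0x0F          # next 4 bits
    reserved = tos_byte & 0x01            # low-order bit
    return precedence, tos, reserved
```

A router that only passes the reserved bit through unchanged never needs to interpret the third element; it is returned here only to show where the bit sits.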
A more extensive discussion of the TOS field and its use can be found in [ROUTE:11]. A router SHOULD consider the TOS field in a packet's IP header when deciding how to forward it. The remainder of this section
describes the rules that apply to routers that conform to this requirement. A router MUST maintain a TOS value for each route in its routing table. Routes learned via a routing protocol which does not support TOS MUST be assigned a TOS of zero (the default TOS). To choose a route to a destination, a router MUST use an algorithm equivalent to the following: (1) The router locates in its routing table all available routes to the destination (see Section [5.2.4]). (2) If there are none, the router drops the packet because the destination is unreachable. See section [5.2.4]. (3) If one or more of those routes have a TOS that exactly matches the TOS specified in the packet, the router chooses the route with the best metric. (4) Otherwise, the router repeats the above step, except looking at routes whose TOS is zero. (5) If no route was chosen above, the router drops the packet because the destination is unreachable. The router returns an ICMP Destination Unreachable error specifying the appropriate code: either Network Unreachable with Type of Service (code 11) or Host Unreachable with Type of Service (code 12). DISCUSSION: Although TOS has been little used in the past, its use by hosts is now mandated by the Requirements for Internet Hosts RFCs ([INTRO:2] and [INTRO:3]). Support for TOS in routers may become a MUST in the future, but is a SHOULD for now until we get more experience with it and can better judge both its benefits and its costs. Various people have proposed that TOS should affect other aspects of the forwarding function. For example: (1) A router could place packets which have the Low Delay bit set ahead of other packets in its output queues. (2) If a router is forced to discard packets, it could try to avoid discarding those which have the High Reliability bit set.
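The route selection steps (1) to (5) given earlier in this section can be sketched in Python as follows. The Route record and the choose_route function are hypothetical, and the sketch assumes that a lower metric is better and that the candidate list has already been produced by the routing table lookup of step (1).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    """Hypothetical routing table entry for one destination."""
    next_hop: str
    tos: int      # TOS value of the route; 0 is the default TOS
    metric: int   # assumed: lower is better


def choose_route(routes: List[Route],
                 packet_tos: int) -> Tuple[Optional[Route], Optional[str]]:
    """Return (route, None) on success or (None, reason) on failure."""
    # Step 2: no routes at all, so the destination is unreachable.
    if not routes:
        return None, "unreachable"

    # Step 3 tries an exact TOS match; step 4 repeats it with TOS zero.
    for wanted_tos in (packet_tos, 0):
        matching = [r for r in routes if r.tos == wanted_tos]
        if matching:
            return min(matching, key=lambda r: r.metric), None

    # Step 5: routes exist, but none for this TOS or the default TOS.
    # The caller would send Destination Unreachable code 11 or 12.
    return None, "unreachable-for-tos"
```

Returning a reason string instead of an ICMP code keeps the choice between code 11 and code 12 with the caller, which knows whether the failed lookup was for a network or for a host.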
These ideas have been explored in more detail in [INTERNET:17] but we don't yet have enough experience with such schemes to make requirements in this area. 5.3.3 IP Precedence This section specifies requirements and guidelines for appropriate processing of the IP Precedence field in routers. Precedence is a scheme for allocating resources in the network based on the relative importance of different traffic flows. The IP specification defines specific values to be used in this field for various types of traffic. The basic mechanisms for precedence processing in a router are preferential resource allocation, including both precedence- ordered queue service and precedence-based congestion control, and selection of Link Layer priority features. The router also selects the IP precedence for routing, management and control traffic it originates. For a more extensive discussion of IP Precedence and its implementation see [FORWARD:6]. Precedence-ordered queue service, as discussed in this section, includes but is not limited to the queue for the forwarding process and queues for outgoing links. It is intended that a router supporting precedence should also use the precedence indication at whatever points in its processing are concerned with allocation of finite resources, such as packet buffers or Link Layer connections. The set of such points is implementation- dependent. DISCUSSION: Although the Precedence field was originally provided for use in DOD systems where large traffic surges or major damage to the network are viewed as inherent threats, it has useful applications for many non-military IP networks. Although the traffic handling capacity of networks has grown greatly in recent years, the traffic generating ability of the users has also grown, and network overload conditions still occur at times. 
Since IP-based routing and management protocols have become more critical to the successful operation of the Internet, overloads present two additional risks to the network: (1) High delays may result in routing protocol packets being lost. This may cause the routing protocol to falsely deduce a topology change and propagate this false
information to other routers. Not only can this cause routes to oscillate, but an extra processing burden may be placed on other routers. (2) High delays may interfere with the use of network management tools to analyze and perhaps correct or relieve the problem in the network that caused the overload condition to occur. Implementation and appropriate use of the Precedence mechanism alleviates both of these problems. 5.3.3.1 Precedence-Ordered Queue Service Routers SHOULD implement precedence-ordered queue service. Precedence-ordered queue service means that when a packet is selected for output on a (logical) link, the packet of highest precedence that has been queued for that link is sent. Routers that implement precedence-ordered queue service MUST also have a configuration option to suppress precedence-ordered queue service in the Internet Layer. Any router MAY implement other policy-based throughput management procedures that result in other than strict precedence ordering, but it MUST be configurable to suppress them (i.e., use strict ordering). As detailed in Section [5.3.6], routers that implement precedence-ordered queue service discard low precedence packets before discarding high precedence packets for congestion control purposes. Preemption (interruption of processing or transmission of a packet) is not envisioned as a function of the Internet Layer. Some protocols at other layers may provide preemption features. 5.3.3.2 Lower Layer Precedence Mappings Routers that implement precedence-ordered queueing MUST IMPLEMENT, and other routers SHOULD IMPLEMENT, Lower Layer Precedence Mapping. A router which implements Lower Layer Precedence Mapping: o MUST be able to map IP Precedence to Link Layer priority mechanisms for link layers that have such a feature defined.
o MUST have a configuration option to select the Link Layer's default priority treatment for all IP traffic o SHOULD be able to configure specific nonstandard mappings of IP precedence values to Link Layer priority values for each interface. DISCUSSION: Some research questions the workability of the priority features of some Link Layer protocols, and some networks may have faulty implementations of the link layer priority mechanism. It seems prudent to provide an escape mechanism in case such problems show up in a network. On the other hand, there are proposals to use novel queueing strategies to implement special services such as low-delay service. Special services and queueing strategies to support them need further research and experimentation before they are put into widespread use in the Internet. Since these requirements are intended to encourage (but not force) the use of precedence features in the hope of providing better Internet service to all users, routers supporting precedence-ordered queue service should default to maintaining strict precedence ordering regardless of the type of service requested. Implementors may wish to consider that correct link layer mapping of IP precedence is required by DOD policy for TCP/IP systems used on DOD networks. 5.3.3.3 Precedence Handling For All Routers A router (whether or not it employs precedence-ordered queue service): (1) MUST accept and process incoming traffic of all precedence levels normally, unless it has been administratively configured to do otherwise. (2) MAY implement a validation filter to administratively restrict the use of precedence levels by particular traffic sources. If provided, this filter MUST NOT filter out or cut off the following sorts of ICMP error messages: Destination Unreachable, Redirect, Time Exceeded, and Parameter Problem. If this filter is provided, the procedures required for packet filtering by addresses are
required for this filter also. DISCUSSION: Precedence filtering should be applicable to specific source/destination IP Address pairs, specific protocols, specific ports, and so on. An ICMP Destination Unreachable message with code 14 SHOULD be sent when a packet is dropped by the validation filter, unless this has been suppressed by configuration choice. (3) MAY implement a cutoff function which allows the router to be set to refuse or drop traffic with precedence below a specified level. This function may be activated by management actions or by some implementation dependent heuristics, but there MUST be a configuration option to disable any heuristic mechanism that operates without human intervention. An ICMP Destination Unreachable message with code 15 SHOULD be sent when a packet is dropped by the cutoff function, unless this has been suppressed by configuration choice. A router MUST NOT refuse to forward datagrams with IP precedence of 6 (Internetwork Control) or 7 (Network Control) solely due to precedence cutoff. However, other criteria may be used in conjunction with precedence cutoff to filter high precedence traffic. DISCUSSION: Unrestricted precedence cutoff could result in an unintentional cutoff of routing and control traffic. In general, host traffic should be restricted to a value of 5 (CRITIC/ECP) or below although this is not a requirement and may not be valid in certain systems. (4) MUST NOT change precedence settings on packets it did not originate. (5) SHOULD be able to configure distinct precedence values to be used for each routing or management protocol supported (except for those protocols, such as OSPF, which specify which precedence value must be used). (6) MAY be able to configure routing or management traffic precedence values independently for each peer address.
(7) MUST respond appropriately to Link Layer precedence- related error indications where provided. An ICMP Destination Unreachable message with code 15 SHOULD be sent when a packet is dropped because a link cannot accept it due to a precedence-related condition, unless this has been suppressed by configuration choice. DISCUSSION: The precedence cutoff mechanism described in (3) is somewhat controversial. Depending on the topological location of the area affected by the cutoff, transit traffic may be directed by routing protocols into the area of the cutoff, where it will be dropped. This is only a problem if another path which is unaffected by the cutoff exists between the communicating points. Proposed ways of avoiding this problem include providing some minimum bandwidth to all precedence levels even under overload conditions, or propagating cutoff information in routing protocols. In the absence of a widely accepted (and implemented) solution to this problem, great caution is recommended in activating cutoff mechanisms in transit networks. A transport layer relay could legitimately provide the function prohibited by (4) above. Changing precedence levels may cause subtle interactions with TCP and perhaps other protocols; a correct design is a non- trivial task. The intent of (5) and (6) (and the discussion of IP Precedence in ICMP messages in Section [4.3.2]) is that the IP precedence bits should be appropriately set, whether or not this router acts upon those bits in any other way. We expect that in the future specifications for routing protocols and network management protocols will specify how the IP Precedence should be set for messages sent by those protocols. The appropriate response for (7) depends on the link layer protocol in use. 
Typically, the router should stop trying to send offensive traffic to that destination for some period of time, and should return an ICMP Destination Unreachable message with code 15 (service not available for precedence requested) to the traffic source. It also should not try to reestablish a preempted Link Layer connection for some period of time.
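The precedence cutoff function of item (3) in Section 5.3.3.3 can be sketched in Python as follows. The function name, its parameters, and the tuple it returns are assumptions made for illustration. The sketch covers only the cutoff decision, not any additional filtering criteria a router might combine with it.

```python
from typing import Optional, Tuple

INTERNETWORK_CONTROL = 6   # precedence values that cutoff alone
NETWORK_CONTROL = 7        # must never block


def precedence_cutoff(precedence: int, cutoff_level: int,
                      send_icmp: bool = True) -> Tuple[bool, Optional[int]]:
    """Return (drop, icmp_code) for a packet under precedence cutoff.

    icmp_code is the Destination Unreachable code to send (15), or None
    when the packet is forwarded or the message is suppressed by
    configuration.
    """
    if not 0 <= precedence <= 7:
        raise ValueError("IP precedence is a 3-bit value")

    # Precedence 6 and 7 are never refused solely due to cutoff.
    if precedence >= INTERNETWORK_CONTROL:
        return False, None

    # Traffic below the configured level is dropped.
    if precedence < cutoff_level:
        return True, (15 if send_icmp else None)

    return False, None
```

With a cutoff level of 7, routing and control traffic marked 6 still passes, which is the protection the text above requires against an unintentional cutoff.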
5.3.4 Forwarding of Link Layer Broadcasts The encapsulation of IP packets in most Link Layer protocols (except PPP) allows a receiver to distinguish broadcasts and multicasts from unicasts simply by examining the Link Layer protocol headers (most commonly, the Link Layer destination address). The rules in this section which refer to Link Layer broadcasts apply only to Link Layer protocols which allow broadcasts to be distinguished; likewise, the rules which refer to Link Layer multicasts apply only to Link Layer protocols which allow multicasts to be distinguished. A router MUST NOT forward any packet which the router received as a Link Layer broadcast (even if the IP destination address is also some form of broadcast address) unless the packet is an all- subnets-directed broadcast being forwarded as specified in [INTERNET:3]. DISCUSSION: As noted in Section [5.3.5.3], forwarding of all-subnets- directed broadcasts in accordance with [INTERNET:3] is optional and is not something that routers do by default. A router MUST NOT forward any packet which the router received as a Link Layer multicast unless the packet's destination address is an IP multicast address. A router SHOULD silently discard a packet that is received via a Link Layer broadcast but does not specify an IP multicast or IP broadcast destination address. When a router sends a packet as a Link Layer broadcast, the IP destination address MUST be a legal IP broadcast or IP multicast address. 5.3.5 Forwarding of Internet Layer Broadcasts There are two major types of IP broadcast addresses; limited broadcast and directed broadcast. In addition, there are three subtypes of directed broadcast; a broadcast directed to a specified network, a broadcast directed to a specified subnetwork, and a broadcast directed to all subnets of a specified network. 
Classification by a router of a broadcast into one of these categories depends on the broadcast address and on the router's understanding (if any) of the subnet structure of the destination network. The same broadcast will be classified differently by different routers.
A limited IP broadcast address is defined to be all-ones: { -1, -1 } or 255.255.255.255. A net-directed broadcast is composed of the network portion of the IP address with a local part of all-ones, { <Network-number>, -1 }. For example, a Class A net broadcast address is net.255.255.255, a Class B net broadcast address is net.net.255.255 and a Class C net broadcast address is net.net.net.255 where net is a byte of the network address. An all-subnets-directed broadcast is composed of the network part of the IP address with a subnet and a host part of all-ones, { <Network-number>, -1, -1 }. For example, an all-subnets broadcast on a subnetted class B network is net.net.255.255. A network must be known to be subnetted and the subnet part must be all-ones before a broadcast can be classified as all-subnets-directed. A subnet-directed broadcast address is composed of the network and subnet part of the IP address with a host part of all-ones, { <Network-number>, <Subnet-number>, -1 }. For example, a subnet- directed broadcast to subnet 2 of a class B network might be net.net.2.255 (if the subnet mask was 255.255.255.0) or net.net.1.127 (if the subnet mask was 255.255.255.128). A network must be known to be subnetted and the net and subnet part must not be all-ones before an IP broadcast can be classified as subnet- directed. As was described in Section [4.2.3.1], a router may encounter certain non-standard IP broadcast addresses: o 0.0.0.0 is an obsolete form of the limited broadcast address. o { <Network-number>, 0 } is an obsolete form of a network-directed broadcast address. o { <Network-number>, 0, 0 } is an obsolete form of an all-subnets-directed broadcast address. o { <Network-number>, <Subnet-number>, 0 } is an obsolete form of a subnet-directed broadcast address. As was described in that section, packets addressed to any of these addresses SHOULD be silently discarded, but if they are not, they MUST be treated in accordance with the same rules that apply to packets addressed to the non-obsolete forms of the broadcast addresses described above. These rules are described in the next few sections.
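The classification of the standard (all-ones) broadcast forms described above can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the network part is derived from the classful (A, B, C) address rules, and the router's knowledge of the destination's subnet structure is modeled as an optional subnet_mask argument. The obsolete all-zeros forms are not handled.

```python
from ipaddress import IPv4Address
from typing import Optional


def classful_prefix_len(addr: int) -> Optional[int]:
    """Network-part length for a class A, B or C address, else None."""
    first_octet = addr >> 24
    if first_octet < 128:
        return 8     # class A
    if first_octet < 192:
        return 16    # class B
    if first_octet < 224:
        return 24    # class C
    return None      # class D or E: no network-directed forms


def classify_broadcast(address: str,
                       subnet_mask: Optional[str] = None) -> str:
    """Classify an address as seen by a router with the given knowledge.

    subnet_mask is None when the router does not know the destination
    network to be subnetted.
    """
    addr = int(IPv4Address(address))
    if addr == 0xFFFFFFFF:
        return "limited"
    plen = classful_prefix_len(addr)
    if plen is None:
        return "not-broadcast"

    net_mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
    local_mask = ~net_mask & 0xFFFFFFFF          # subnet + host bits
    smask = int(IPv4Address(subnet_mask)) if subnet_mask else net_mask
    subnet_bits = smask & local_mask             # subnet part only

    if subnet_bits == 0:
        # Network not known to be subnetted: only the net-directed form.
        if addr & local_mask == local_mask:
            return "net-directed"
        return "not-broadcast"

    host_mask = ~smask & 0xFFFFFFFF
    if addr & host_mask != host_mask:
        return "not-broadcast"                   # host part not all-ones
    if addr & subnet_bits == subnet_bits:
        return "all-subnets-directed"            # subnet part all-ones
    return "subnet-directed"
```

The same address can yield different results depending on subnet_mask: 172.16.255.255 is net-directed to a router with no subnet knowledge and all-subnets-directed to one that knows the mask 255.255.255.0. This matches the observation that different routers may classify the same broadcast differently.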