Tech-invite3GPPspaceIETFspace
9796959493929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 3168

The Addition of Explicit Congestion Notification (ECN) to IP

Pages: 63
Proposed Standard
Errata
Obsoletes:  2481
Updates:  2003247424010793
Updated by:  430160408311
Part 2 of 3 – Pages 12 to 38
First   Prev   Next

Top   ToC   RFC3168 - Page 12   prevText

6. Support from the Transport Protocol

ECN requires support from the transport protocol, in addition to the functionality given by the ECN field in the IP packet header. The transport protocol might require negotiation between the endpoints during setup to determine that all of the endpoints are ECN-capable, so that the sender can set the ECT codepoint in transmitted packets. Second, the transport protocol must be capable of reacting appropriately to the receipt of CE packets. This reaction could be in the form of the data receiver informing the data sender of the received CE packet (e.g., TCP), of the data receiver unsubscribing to a layered multicast group (e.g., RLM [MJV96]), or of some other action that ultimately reduces the arrival rate of that flow on that congested link. CE packets indicate persistent rather than transient congestion (see Section 5.1), and hence reactions to the receipt of CE packets should be those appropriate for persistent congestion. This document only addresses the addition of ECN Capability to TCP, leaving issues of ECN in other transport protocols to further research. For TCP, ECN requires three new pieces of functionality:
Top   ToC   RFC3168 - Page 13
   negotiation between the endpoints during connection setup to
   determine if they are both ECN-capable; an ECN-Echo (ECE) flag in the
   TCP header so that the data receiver can inform the data sender when
   a CE packet has been received; and a Congestion Window Reduced (CWR)
   flag in the TCP header so that the data sender can inform the data
   receiver that the congestion window has been reduced. The support
   required from other transport protocols is likely to be different,
   particularly for unreliable or reliable multicast transport
   protocols, and will have to be determined as other transport
   protocols are brought to the IETF for standardization.

   In a mild abuse of terminology, in this document we refer to `TCP
   packets' instead of `TCP segments'.

6.1. TCP

The following sections describe in detail the proposed use of ECN in TCP. This proposal is described in essentially the same form in [Floyd94]. We assume that the source TCP uses the standard congestion control algorithms of Slow-start, Fast Retransmit and Fast Recovery [RFC2581]. This proposal specifies two new flags in the Reserved field of the TCP header. The TCP mechanism for negotiating ECN-Capability uses the ECN-Echo (ECE) flag in the TCP header. Bit 9 in the Reserved field of the TCP header is designated as the ECN-Echo flag. The location of the 6-bit Reserved field in the TCP header is shown in Figure 4 of RFC 793 [RFC793] (and is reproduced below for completeness). This specification of the ECN Field leaves the Reserved field as a 4-bit field using bits 4-7. To enable the TCP receiver to determine when to stop setting the ECN-Echo flag, we introduce a second new flag in the TCP header, the CWR flag. The CWR flag is assigned to Bit 8 in the Reserved field of the TCP header. 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ | | | U | A | P | R | S | F | | Header Length | Reserved | R | C | S | S | Y | I | | | | G | K | H | T | N | N | +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ Figure 3: The old definition of bytes 13 and 14 of the TCP header.
Top   ToC   RFC3168 - Page 14
        0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |               |               | C | E | U | A | P | R | S | F |
      | Header Length |    Reserved   | W | C | R | C | S | S | Y | I |
      |               |               | R | E | G | K | H | T | N | N |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

      Figure 4: The new definition of bytes 13 and 14 of the TCP
                Header.

   Thus, ECN uses the ECT and CE flags in the IP header (as shown in
   Figure 1) for signaling between routers and connection endpoints, and
   uses the ECN-Echo and CWR flags in the TCP header (as shown in Figure
   4) for TCP-endpoint to TCP-endpoint signaling.  For a TCP connection,
   a typical sequence of events in an ECN-based reaction to congestion
   is as follows:

      * An ECT codepoint is set in packets transmitted by the sender to
        indicate that ECN is supported by the transport entities for
        these packets.

      * An ECN-capable router detects impending congestion and detects
        that an ECT codepoint is set in the packet it is about to drop.
        Instead of dropping the packet, the router chooses to set the CE
        codepoint in the IP header and forwards the packet.

      * The receiver receives the packet with the CE codepoint set, and
        sets the ECN-Echo flag in its next TCP ACK sent to the sender.

      * The sender receives the TCP ACK with ECN-Echo set, and reacts to
        the congestion as if a packet had been dropped.

      * The sender sets the CWR flag in the TCP header of the next
        packet sent to the receiver to acknowledge its receipt of and
        reaction to the ECN-Echo flag.

   The negotiation for using ECN by the TCP transport entities and the
   use of the ECN-Echo and CWR flags is described in more detail in the
   sections below.

6.1.1 TCP Initialization

In the TCP connection setup phase, the source and destination TCPs exchange information about their willingness to use ECN. Subsequent to the completion of this negotiation, the TCP sender sets an ECT codepoint in the IP header of data packets to indicate to the network that the transport is capable and willing to participate in ECN for this packet. This indicates to the routers that they may mark this
Top   ToC   RFC3168 - Page 15
   packet with the CE codepoint, if they would like to use that as a
   method of congestion notification. If the TCP connection does not
   wish to use ECN notification for a particular packet, the sending TCP
   sets the ECN codepoint to not-ECT, and the TCP receiver ignores the
   CE codepoint in the received packet.

   For this discussion, we designate the initiating host as Host A and
   the responding host as Host B.  We call a SYN packet with the ECE and
   CWR flags set an "ECN-setup SYN packet", and we call a SYN packet
   with at least one of the ECE and CWR flags not set a "non-ECN-setup
   SYN packet".  Similarly, we call a SYN-ACK packet with only the ECE
   flag set but the CWR flag not set an "ECN-setup SYN-ACK packet", and
   we call a SYN-ACK packet with any other configuration of the ECE and
   CWR flags a "non-ECN-setup SYN-ACK packet".

   Before a TCP connection can use ECN, Host A sends an ECN-setup SYN
   packet, and Host B sends an ECN-setup SYN-ACK packet.  For a SYN
   packet, the setting of both ECE and CWR in the ECN-setup SYN packet
   is defined as an indication that the sending TCP is ECN-Capable,
   rather than as an indication of congestion or of response to
   congestion. More precisely, an ECN-setup SYN packet indicates that
   the TCP implementation transmitting the SYN packet will participate
   in ECN as both a sender and receiver.  Specifically, as a receiver,
   it will respond to incoming data packets that have the CE codepoint
   set in the IP header by setting ECE in outgoing TCP Acknowledgement
   (ACK) packets.  As a sender, it will respond to incoming packets that
   have ECE set by reducing the congestion window and setting CWR when
   appropriate.  An ECN-setup SYN packet does not commit the TCP sender
   to setting the ECT codepoint in any or all of the packets it may
   transmit.  However, the commitment to respond appropriately to
   incoming packets with the CE codepoint set remains even if the TCP
   sender in a later transmission, within this TCP connection, sends a
   SYN packet without ECE and CWR set.

   When Host B sends an ECN-setup SYN-ACK packet, it sets the ECE flag
   but not the CWR flag.  An ECN-setup SYN-ACK packet is defined as an
   indication that the TCP transmitting the SYN-ACK packet is ECN-
   Capable.  As with the SYN packet, an ECN-setup SYN-ACK packet does
   not commit the TCP host to setting the ECT codepoint in transmitted
   packets.

   The following rules apply to the sending of ECN-setup packets within
   a TCP connection, where a TCP connection is defined by the standard
   rules for TCP connection establishment and termination.

      * If a host has received an ECN-setup SYN packet, then it MAY send
        an ECN-setup SYN-ACK packet.  Otherwise, it MUST NOT send an
        ECN-setup SYN-ACK packet.
Top   ToC   RFC3168 - Page 16
      * A host MUST NOT set ECT on data packets unless it has sent at
        least one ECN-setup SYN or ECN-setup SYN-ACK packet, and has
        received at least one ECN-setup SYN or ECN-setup SYN-ACK packet,
        and has sent no non-ECN-setup SYN or non-ECN-setup SYN-ACK
        packet.  If a host has received at least one non-ECN-setup SYN
        or non-ECN-setup SYN-ACK packet, then it SHOULD NOT set ECT on
        data packets.

      * If a host ever sets the ECT codepoint on a data packet, then
        that host MUST correctly set/clear the CWR TCP bit on all
        subsequent packets in the connection.

      * If a host has sent at least one ECN-setup SYN or ECN-setup SYN-
        ACK packet, and has received no non-ECN-setup SYN or non-ECN-
        setup SYN-ACK packet, then if that host receives TCP data
        packets with ECT and CE codepoints set in the IP header, then
        that host MUST process these packets as specified for an ECN-
        capable connection.

      * A host that is not willing to use ECN on a TCP connection SHOULD
        clear both the ECE and CWR flags in all non-ECN-setup SYN and/or
        SYN-ACK packets that it sends to indicate this unwillingness.
        Receivers MUST correctly handle all forms of the non-ECN-setup
        SYN and SYN-ACK packets.

      * A host MUST NOT set ECT on SYN or SYN-ACK packets.

   A TCP client enters TIME-WAIT state after receiving a FIN-ACK, and
   transitions to CLOSED state after a timeout.  Many TCP
   implementations create a new TCP connection if they receive an in-
   window SYN packet during TIME-WAIT state.  When a TCP host enters
   TIME-WAIT or CLOSED state, it should ignore any previous state about
   the negotiation of ECN for that connection.

6.1.1.1. Middlebox Issues
ECN introduces the use of the ECN-Echo and CWR flags in the TCP header (as shown in Figure 3) for initialization. There exist some faulty firewalls, load balancers, and intrusion detection systems in the Internet that either drop an ECN-setup SYN packet or respond with a RST, in the belief that such a packet (with these bits set) is a signature for a port-scanning tool that could be used in a denial- of-service attack. Some of the offending equipment has been identified, and a web page [FIXES] contains a list of non-compliant products and the fixes posted by the vendors, where these are available. The TBIT web page [TBIT] lists some of the web servers affected by this faulty equipment. We mention this in this document as a warning to the community of this problem.
Top   ToC   RFC3168 - Page 17
   To provide robust connectivity even in the presence of such faulty
   equipment, a host that receives a RST in response to the transmission
   of an ECN-setup SYN packet MAY resend a SYN with CWR and ECE cleared.
   This could result in a TCP connection being established without using
   ECN.

   A host that receives no reply to an ECN-setup SYN within the normal
   SYN retransmission timeout interval MAY resend the SYN and any
   subsequent SYN retransmissions with CWR and ECE cleared.  To overcome
   normal packet loss that results in the original SYN being lost, the
   originating host may retransmit one or more ECN-setup SYN packets
   before giving up and retransmitting the SYN with the CWR and ECE bits
   cleared.

   We note that in this case, the following example scenario is
   possible:

   (1) Host A: Sends an ECN-setup SYN.
   (2) Host B: Sends an ECN-setup SYN/ACK, packet is dropped or delayed.
   (3) Host A: Sends a non-ECN-setup SYN.
   (4) Host B: Sends a non-ECN-setup SYN/ACK.

   We note that in this case, following the procedures above, neither
   Host A nor Host B may set the ECT bit on data packets.  Further, an
   important consequence of the rules for ECN setup and usage in Section
   6.1.1 is that a host is forbidden from using the reception of ECT
   data packets as an implicit signal that the other host is ECN-
   capable.

6.1.1.2. Robust TCP Initialization with an Echoed Reserved Field
There is the question of why we chose to have the TCP sending the SYN set two ECN-related flags in the Reserved field of the TCP header for the SYN packet, while the responding TCP sending the SYN-ACK sets only one ECN-related flag in the SYN-ACK packet. This asymmetry is necessary for the robust negotiation of ECN-capability with some deployed TCP implementations. There exists at least one faulty TCP implementation in which TCP receivers set the Reserved field of the TCP header in ACK packets (and hence the SYN-ACK) simply to reflect the Reserved field of the TCP header in the received data packet. Because the TCP SYN packet sets the ECN-Echo and CWR flags to indicate ECN-capability, while the SYN-ACK packet sets only the ECN- Echo flag, the sending TCP correctly interprets a receiver's reflection of its own flags in the Reserved field as an indication that the receiver is not ECN-capable. The sending TCP is not mislead by a faulty TCP implementation sending a SYN-ACK packet that simply reflects the Reserved field of the incoming SYN packet.
Top   ToC   RFC3168 - Page 18

6.1.2. The TCP Sender

For a TCP connection using ECN, new data packets are transmitted with an ECT codepoint set in the IP header. When only one ECT codepoint is needed by a sender for all packets sent on a TCP connection, ECT(0) SHOULD be used. If the sender receives an ECN-Echo (ECE) ACK packet (that is, an ACK packet with the ECN-Echo flag set in the TCP header), then the sender knows that congestion was encountered in the network on the path from the sender to the receiver. The indication of congestion should be treated just as a congestion loss in non- ECN-Capable TCP. That is, the TCP source halves the congestion window "cwnd" and reduces the slow start threshold "ssthresh". The sending TCP SHOULD NOT increase the congestion window in response to the receipt of an ECN-Echo ACK packet. TCP should not react to congestion indications more than once every window of data (or more loosely, more than once every round-trip time). That is, the TCP sender's congestion window should be reduced only once in response to a series of dropped and/or CE packets from a single window of data. In addition, the TCP source should not decrease the slow-start threshold, ssthresh, if it has been decreased within the last round trip time. However, if any retransmitted packets are dropped, then this is interpreted by the source TCP as a new instance of congestion. After the source TCP reduces its congestion window in response to a CE packet, incoming acknowledgments that continue to arrive can "clock out" outgoing packets as allowed by the reduced congestion window. If the congestion window consists of only one MSS (maximum segment size), and the sending TCP receives an ECN-Echo ACK packet, then the sending TCP should in principle still reduce its congestion window in half. However, the value of the congestion window is bounded below by a value of one MSS. If the sending TCP were to continue to send, using a congestion window of 1 MSS, this results in the transmission of one packet per round-trip time. It is necessary to still reduce the sending rate of the TCP sender even further, on receipt of an ECN-Echo packet when the congestion window is one. We use the retransmit timer as a means of reducing the rate further in this circumstance. Therefore, the sending TCP MUST reset the retransmit timer on receiving the ECN-Echo packet when the congestion window is one. The sending TCP will then be able to send a new packet only when the retransmit timer expires. When an ECN-Capable TCP sender reduces its congestion window for any reason (because of a retransmit timeout, a Fast Retransmit, or in response to an ECN Notification), the TCP sender sets the CWR flag in the TCP header of the first new data packet sent after the window reduction. If that data packet is dropped in the network, then the
Top   ToC   RFC3168 - Page 19
   sending TCP will have to reduce the congestion window again and
   retransmit the dropped packet.

   We ensure that the "Congestion Window Reduced" information is
   reliably delivered to the TCP receiver.  This comes about from the
   fact that if the new data packet carrying the CWR flag is dropped,
   then the TCP sender will have to again reduce its congestion window,
   and send another new data packet with the CWR flag set.  Thus, the
   CWR bit in the TCP header SHOULD NOT be set on retransmitted packets.

   When the TCP data sender is ready to set the CWR bit after reducing
   the congestion window, it SHOULD set the CWR bit only on the first
   new data packet that it transmits.

   [Floyd94] discusses TCP's response to ECN in more detail.  [Floyd98]
   discusses the validation test in the ns simulator, which illustrates
   a wide range of ECN scenarios. These scenarios include the following:
   an ECN followed by another ECN, a Fast Retransmit, or a Retransmit
   Timeout; a Retransmit Timeout or a Fast Retransmit followed by an
   ECN; and a congestion window of one packet followed by an ECN.

   TCP follows existing algorithms for sending data packets in response
   to incoming ACKs, multiple duplicate acknowledgments, or retransmit
   timeouts [RFC2581].  TCP also follows the normal procedures for
   increasing the congestion window when it receives ACK packets without
   the ECN-Echo bit set [RFC2581].

6.1.3. The TCP Receiver

When TCP receives a CE data packet at the destination end-system, the TCP data receiver sets the ECN-Echo flag in the TCP header of the subsequent ACK packet. If there is any ACK withholding implemented, as in current "delayed-ACK" TCP implementations where the TCP receiver can send an ACK for two arriving data packets, then the ECN-Echo flag in the ACK packet will be set to '1' if the CE codepoint is set in any of the data packets being acknowledged. That is, if any of the received data packets are CE packets, then the returning ACK has the ECN-Echo flag set. To provide robustness against the possibility of a dropped ACK packet carrying an ECN-Echo flag, the TCP receiver sets the ECN-Echo flag in a series of ACK packets sent subsequently. The TCP receiver uses the CWR flag received from the TCP sender to determine when to stop setting the ECN-Echo flag. After a TCP receiver sends an ACK packet with the ECN-Echo bit set, that TCP receiver continues to set the ECN-Echo flag in all the ACK packets it sends (whether they acknowledge CE data packets or non-CE
Top   ToC   RFC3168 - Page 20
   data packets) until it receives a CWR packet (a packet with the CWR
   flag set).  After the receipt of the CWR packet, acknowledgments for
   subsequent non-CE data packets do not have the ECN-Echo flag set. If
   another CE packet is received by the data receiver, the receiver
   would once again send ACK packets with the ECN-Echo flag set.  While
   the receipt of a CWR packet does not guarantee that the data sender
   received the ECN-Echo message, this does suggest that the data sender
   reduced its congestion window at some point *after* it sent the data
   packet for which the CE codepoint was set.

   We have already specified that a TCP sender is not required to reduce
   its congestion window more than once per window of data.  Some care
   is required if the TCP sender is to avoid unnecessary reductions of
   the congestion window when a window of data includes both dropped
   packets and (marked) CE packets.  This is illustrated in [Floyd98].

6.1.4. Congestion on the ACK-path

For the current generation of TCP congestion control algorithms, pure acknowledgement packets (e.g., packets that do not contain any accompanying data) MUST be sent with the not-ECT codepoint. Current TCP receivers have no mechanisms for reducing traffic on the ACK-path in response to congestion notification. Mechanisms for responding to congestion on the ACK-path are areas for current and future research. (One simple possibility would be for the sender to reduce its congestion window when it receives a pure ACK packet with the CE codepoint set). For current TCP implementations, a single dropped ACK generally has only a very small effect on the TCP's sending rate.

6.1.5. Retransmitted TCP packets

This document specifies ECN-capable TCP implementations MUST NOT set either ECT codepoint (ECT(0) or ECT(1)) in the IP header for retransmitted data packets, and that the TCP data receiver SHOULD ignore the ECN field on arriving data packets that are outside of the receiver's current window. This is for greater security against denial-of-service attacks, as well as for robustness of the ECN congestion indication with packets that are dropped later in the network. First, we note that if the TCP sender were to set an ECT codepoint on a retransmitted packet, then if an unnecessarily-retransmitted packet was later dropped in the network, the end nodes would never receive the indication of congestion from the router setting the CE codepoint. Thus, setting an ECT codepoint on retransmitted data packets is not consistent with the robust delivery of the congestion indication even for packets that are later dropped in the network.
Top   ToC   RFC3168 - Page 21
   In addition, an attacker capable of spoofing the IP source address of
   the TCP sender could send data packets with arbitrary sequence
   numbers, with the CE codepoint set in the IP header.  On receiving
   this spoofed data packet, the TCP data receiver would determine that
   the data does not lie in the current receive window, and return a
   duplicate acknowledgement.  We define an out-of-window packet at the
   TCP data receiver as a data packet that lies outside the receiver's
   current window.  On receiving an out-of-window packet, the TCP data
   receiver has to decide whether or not to treat the CE codepoint in
   the packet header as a valid indication of congestion, and therefore
   whether to return ECN-Echo indications to the TCP data sender.  If
   the TCP data receiver ignored the CE codepoint in an out-of-window
   packet, then the TCP data sender would not receive this possibly-
   legitimate indication of congestion from the network, resulting in a
   violation of end-to-end congestion control.  On the other hand, if
   the TCP data receiver honors the CE indication in the out-of-window
   packet, and reports the indication of congestion to the TCP data
   sender, then the malicious node that created the spoofed, out-of-
   window packet has successfully "attacked" the TCP connection by
   forcing the data sender to unnecessarily reduce (halve) its
   congestion window.  To prevent such a denial-of-service attack, we
   specify that a legitimate TCP data sender MUST NOT set an ECT
   codepoint on retransmitted data packets, and that the TCP data
   receiver SHOULD ignore the CE codepoint on out-of-window packets.

   One drawback of not setting ECT(0) or ECT(1) on retransmitted packets
   is that it denies ECN protection for retransmitted packets.  However,
   for an ECN-capable TCP connection in a fully-ECN-capable environment
   with mild congestion, packets should rarely be dropped due to
   congestion in the first place, and so instances of retransmitted
   packets should rarely arise.  If packets are being retransmitted,
   then there are already packet losses (from corruption or from
   congestion) that ECN has been unable to prevent.

   We note that if the router sets the CE codepoint for an ECN-capable
   data packet within a TCP connection, then the TCP connection is
   guaranteed to receive that indication of congestion, or to receive
   some other indication of congestion within the same window of data,
   even if this packet is dropped or reordered in the network.  We
   consider two cases, when the packet is later retransmitted, and when
   the packet is not later retransmitted.

   In the first case, if the packet is either dropped or delayed, and at
   some point retransmitted by the data sender, then the retransmission
   is a result of a Fast Retransmit or a Retransmit Timeout for either
   that packet or for some prior packet in the same window of data.  In
   this case, because the data sender already has retransmitted this
   packet, we know that the data sender has already responded to an
Top   ToC   RFC3168 - Page 22
   indication of congestion for some packet within the same window of
   data as the original packet.  Thus, even if the first transmission of
   the packet is dropped in the network, or is delayed, if it had the CE
   codepoint set, and is later ignored by the data receiver as an out-
   of-window packet, this is not a problem, because the sender has
   already responded to an indication of congestion for that window of
   data.

   In the second case, if the packet is never retransmitted by the data
   sender, then this data packet is the only copy of this data received
   by the data receiver, and therefore arrives at the data receiver as
   an in-window packet, regardless of how much the packet might be
   delayed or reordered.  In this case, if the CE codepoint is set on
   the packet within the network, this will be treated by the data
   receiver as a valid indication of congestion.

6.1.6. TCP Window Probes.

When the TCP data receiver advertises a zero window, the TCP data sender sends window probes to determine if the receiver's window has increased. Window probe packets do not contain any user data except for the sequence number, which is a byte. If a window probe packet is dropped in the network, this loss is not detected by the receiver. Therefore, the TCP data sender MUST NOT set either an ECT codepoint or the CWR bit on window probe packets. However, because window probes use exact sequence numbers, they cannot be easily spoofed in denial-of-service attacks. Therefore, if a window probe arrives with the CE codepoint set, then the receiver SHOULD respond to the ECN indications.

7. Non-compliance by the End Nodes

This section discusses concerns about the vulnerability of ECN to non-compliant end-nodes (i.e., end nodes that set the ECT codepoint in transmitted packets but do not respond to received CE packets). We argue that the addition of ECN to the IP architecture will not significantly increase the current vulnerability of the architecture to unresponsive flows. Even for non-ECN environments, there are serious concerns about the damage that can be done by non-compliant or unresponsive flows (that is, flows that do not respond to congestion control indications by reducing their arrival rate at the congested link). For example, an end-node could "turn off congestion control" by not reducing its congestion window in response to packet drops. This is a concern for the current Internet. It has been argued that routers will have to deploy mechanisms to detect and differentially treat packets from
Top   ToC   RFC3168 - Page 23
   non-compliant flows [RFC2309,FF99].  It has also been suggested that
   techniques such as end-to-end per-flow scheduling and isolation of
   one flow from another, differentiated services, or end-to-end
   reservations could remove some of the more damaging effects of
   unresponsive flows.

   It might seem that dropping packets in itself is an adequate
   deterrent for non-compliance, and that the use of ECN removes this
   deterrent.  We would argue in response that (1) ECN-capable routers
   preserve packet-dropping behavior in times of high congestion; and
   (2) even in times of high congestion, dropping packets in itself is
   not an adequate deterrent for non-compliance.

   First, ECN-Capable routers will only mark packets (as opposed to
   dropping them) when the packet marking rate is reasonably low. During
   periods where the average queue size exceeds an upper threshold, and
   therefore the potential packet marking rate would be high, our
   recommendation is that routers drop packets rather then set the CE
   codepoint in packet headers.

   During the periods of low or moderate packet marking rates when ECN
   would be deployed, there would be little deterrent effect on
   unresponsive flows of dropping rather than marking those packets. For
   example, delay-insensitive flows using reliable delivery might have
   an incentive to increase rather than to decrease their sending rate
   in the presence of dropped packets.  Similarly, delay-sensitive flows
   using unreliable delivery might increase their use of FEC in response
   to an increased packet drop rate, increasing rather than decreasing
   their sending rate.  For the same reasons, we do not believe that
   packet dropping itself is an effective deterrent for non-compliance
   even in an environment of high packet drop rates, when all flows are
   sharing the same packet drop rate.

   Several methods have been proposed to identify and restrict non-
   compliant or unresponsive flows. The addition of ECN to the network
   environment would not in any way increase the difficulty of designing
   and deploying such mechanisms. If anything, the addition of ECN to
   the architecture would make the job of identifying unresponsive flows
   slightly easier.  For example, in an ECN-Capable environment routers
   are not limited to information about packets that are dropped or have
   the CE codepoint set at that router itself; in such an environment,
   routers could also take note of arriving CE packets that indicate
   congestion encountered by that packet earlier in the path.
Top   ToC   RFC3168 - Page 24

8. Non-compliance in the Network

This section considers the issues when a router is operating, possibly maliciously, to modify either of the bits in the ECN field. We note that in IPv4, the IP header is protected from bit errors by a header checksum; this is not the case in IPv6. Thus for IPv6 the ECN field can be accidentally modified by bit errors on links or in routers without being detected by an IP header checksum. By tampering with the bits in the ECN field, an adversary (or a broken router) could do one or more of the following: falsely report congestion, disable ECN-Capability for an individual packet, erase the ECN congestion indication, or falsely indicate ECN-Capability. Section 18 systematically examines the various cases by which the ECN field could be modified. The important criterion considered in determining the consequences of such modifications is whether it is likely to lead to poorer behavior in any dimension (throughput, delay, fairness or functionality) than if a router were to drop a packet. The first two possible changes, falsely reporting congestion or disabling ECN-Capability for an individual packet, are no worse than if the router were to simply drop the packet. From a congestion control point of view, setting the CE codepoint in the absence of congestion by a non-compliant router would be no worse than a router dropping a packet unnecessarily. By "erasing" an ECT codepoint of a packet that is later dropped in the network, a router's actions could result in an unnecessary packet drop for that packet later in the network. However, as discussed in Section 18, a router that erases the ECN congestion indication or falsely indicates ECN-Capability could potentially do more damage to the flow that if it has simply dropped the packet. A rogue or broken router that "erased" the CE codepoint in arriving CE packets would prevent that indication of congestion from reaching downstream receivers. This could result in the failure of congestion control for that flow and a resulting increase in congestion in the network, ultimately resulting in subsequent packets dropped for this flow as the average queue size increased at the congested gateway. Section 19 considers the potential repercussions of subverting end- to-end congestion control by either falsely indicating ECN- Capability, or by erasing the congestion indication in ECN (the CE- codepoint). We observe in Section 19 that the consequence of subverting ECN-based congestion control may lead to potential unfairness, but this is likely to be no worse than the subversion of either ECN-based or packet-based congestion control by the end nodes.
Top   ToC   RFC3168 - Page 25

8.1. Complications Introduced by Split Paths

If a router or other network element has access to all of the packets of a flow, then that router could do no more damage to a flow by altering the ECN field than it could by simply dropping all of the packets from that flow. However, in some cases, a malicious or broken router might have access to only a subset of the packets from a flow. The question is as follows: can this router, by altering the ECN field in this subset of the packets, do more damage to that flow than if it has simply dropped that set of the packets? This is also discussed in detail in Section 18, which concludes as follows: It is true that the adversary that has access only to a subset of packets in an aggregate might, by subverting ECN-based congestion control, be able to deny the benefits of ECN to the other packets in the aggregate. While this is undesirable, this is not a sufficient concern to result in disabling ECN.

9. Encapsulated Packets

9.1. IP packets encapsulated in IP

The encapsulation of IP packet headers in tunnels is used in many places, including IPsec and IP in IP [RFC2003]. This section considers issues related to interactions between ECN and IP tunnels, and specifies two alternative solutions. This discussion is complemented by RFC 2983's discussion of interactions between Differentiated Services and IP tunnels of various forms [RFC 2983], as Differentiated Services uses the remaining six bits of the IP header octet that is used by ECN (see Figure 2 in Section 5). Some IP tunnel modes are based on adding a new "outer" IP header that encapsulates the original, or "inner" IP header and its associated packet. In many cases, the new "outer" IP header may be added and removed at intermediate points along a connection, enabling the network to establish a tunnel without requiring endpoint participation. We denote tunnels that specify that the outer header be discarded at tunnel egress as "simple tunnels". ECN uses the ECN field in the IP header for signaling between routers and connection endpoints. ECN interacts with IP tunnels based on the treatment of the ECN field in the IP header. In simple IP tunnels the octet containing the ECN field is copied or mapped from the inner IP header to the outer IP header at IP tunnel ingress, and the outer header's copy of this field is discarded at IP tunnel egress. If the outer header were to be simply discarded without taking care to deal with the ECN field, and an ECN-capable router were to set the CE
Top   ToC   RFC3168 - Page 26
   (Congestion Experienced) codepoint within a packet in a simple IP
   tunnel, this indication would be discarded at tunnel egress, losing
   the indication of congestion.

   Thus, the use of ECN over simple IP tunnels would result in routers
   attempting to use the outer IP header to signal congestion to
   endpoints, but those congestion warnings never arriving because the
   outer header is discarded at the tunnel egress point.  This problem
   was encountered with ECN and IPsec in tunnel mode, and RFC 2481
   recommended that ECN not be used with the older simple IPsec tunnels
   in order to avoid this behavior and its consequences.  When ECN
   becomes widely deployed, then simple tunnels likely to carry ECN-
   capable traffic will have to be changed.  If ECN-capable traffic is
   carried by a simple tunnel through a congested, ECN-capable router,
   this could result in subsequent packets being dropped for this flow
   as the average queue size increases at the congested router, as
   discussed in Section 8 above.

   From a security point of view, the use of ECN in the outer header of
   an IP tunnel might raise security concerns because an adversary could
   tamper with the ECN information that propagates beyond the tunnel
   endpoint.  Based on an analysis in Sections 18 and 19 of these
   concerns and the resultant risks, our overall approach is to make
   support for ECN an option for IP tunnels, so that an IP tunnel can be
   specified or configured either to use ECN or not to use ECN in the
   outer header of the tunnel.  Thus, in environments or tunneling
   protocols where the risks of using ECN are judged to outweigh its
   benefits, the tunnel can simply not use ECN in the outer header.
   Then the only indication of congestion experienced at routers within
   the tunnel would be through packet loss.

   The result is that there are two viable options for the behavior of
   ECN-capable connections over an IP tunnel, including IPsec tunnels:

      * A limited-functionality option in which ECN is preserved in the
        inner header, but disabled in the outer header.  The only
        mechanism available for signaling congestion occurring within
        the tunnel in this case is dropped packets.

      * A full-functionality option that supports ECN in both the inner
        and outer headers, and propagates congestion warnings from nodes
        within the tunnel to endpoints.

   Support for these options requires varying amounts of changes to IP
   header processing at tunnel ingress and egress.  A small subset of
   these changes sufficient to support only the limited-functionality
   option would be sufficient to eliminate any incompatibility between
   ECN and IP tunnels.
Top   ToC   RFC3168 - Page 27
   One goal of this document is to give guidance about the tradeoffs
   between the limited-functionality and full-functionality options.  A
   full discussion of the potential effects of an adversary's
   modifications of the ECN field is given in Sections 18 and 19.

9.1.1. The Limited-functionality and Full-functionality Options

The limited-functionality option for ECN encapsulation in IP tunnels is for the not-ECT codepoint to be set in the outside (encapsulating) header regardless of the value of the ECN field in the inside (encapsulated) header. With this option, the ECN field in the inner header is not altered upon de-capsulation. The disadvantage of this approach is that the flow does not have ECN support for that part of the path that is using IP tunneling, even if the encapsulated packet (from the original TCP sender) is ECN-Capable. That is, if the encapsulated packet arrives at a congested router that is ECN- capable, and the router can decide to drop or mark the packet as an indication of congestion to the end nodes, the router will not be permitted to set the CE codepoint in the packet header, but instead will have to drop the packet. The full-functionality option for ECN encapsulation is to copy the ECN codepoint of the inside header to the outside header on encapsulation if the inside header is not-ECT or ECT, and to set the ECN codepoint of the outside header to ECT(0) if the ECN codepoint of the inside header is CE. On decapsulation, if the CE codepoint is set on the outside header, then the CE codepoint is also set in the inner header. Otherwise, the ECN codepoint on the inner header is left unchanged. That is, for full ECN support the encapsulation and decapsulation processing involves the following: At tunnel ingress, the full-functionality option sets the ECN codepoint in the outer header. If the ECN codepoint in the inner header is not-ECT or ECT, then it is copied to the ECN codepoint in the outer header. If the ECN codepoint in the inner header is CE, then the ECN codepoint in the outer header is set to ECT(0). Upon decapsulation at the tunnel egress, the full-functionality option sets the CE codepoint in the inner header if the CE codepoint is set in the outer header. Otherwise, no change is made to this field of the inner header. With the full-functionality option, a flow can take advantage of ECN in those parts of the path that might use IP tunneling. The disadvantage of the full-functionality option from a security perspective is that the IP tunnel cannot protect the flow from certain modifications to the ECN bits in the IP header within the tunnel. The potential dangers from modifications to the ECN bits in the IP header are described in detail in Sections 18 and 19.
Top   ToC   RFC3168 - Page 28
      (1) An IP tunnel MUST modify the handling of the DS field octet at
      IP tunnel endpoints by implementing either the limited-
      functionality or the full-functionality option.

      (2) Optionally, an IP tunnel MAY enable the endpoints of an IP
      tunnel to negotiate the choice between the limited-functionality
      and the full-functionality option for ECN in the tunnel.

   The minimum required to make ECN usable with IP tunnels is the
   limited-functionality option, which prevents ECN from being enabled
   in the outer header of the tunnel.  Full support for ECN requires the
   use of the full-functionality option.  If there are no optional
   mechanisms for the tunnel endpoints to negotiate a choice between the
   limited-functionality or full-functionality option, there can be a
   pre-existing agreement between the tunnel endpoints about whether to
   support the limited-functionality or the full-functionality ECN
   option.

   All IP tunnels MUST implement the limited-functionality option, and
   SHOULD support the full-functionality option.

   In addition, it is RECOMMENDED that packets with the CE codepoint in
   the outer header be dropped if they arrive at the tunnel egress point
   for a tunnel that uses the limited-functionality option, or for a
   tunnel that uses the full-functionality option but for which the
   not-ECT codepoint is set in the inner header.  This is motivated by
   backwards compatibility and to ensure that no unauthorized
   modifications of the ECN field take place, and is discussed further
   in the next Section (9.1.2).

9.1.2. Changes to the ECN Field within an IP Tunnel.

The presence of a copy of the ECN field in the inner header of an IP tunnel mode packet provides an opportunity for detection of unauthorized modifications to the ECN field in the outer header. Comparison of the ECT fields in the inner and outer headers falls into two categories for implementations that conform to this document: * If the IP tunnel uses the full-functionality option, then the not-ECT codepoint should be set in the outer header if and only if it is also set in the inner header. * If the tunnel uses the limited-functionality option, then the not-ECT codepoint should be set in the outer header. Receipt of a packet not satisfying the appropriate condition could be a cause of concern.
Top   ToC   RFC3168 - Page 29
   Consider the case of an IP tunnel where the tunnel ingress point has
   not been updated to this document's requirements, while the tunnel
   egress point has been updated to support ECN.  In this case, the IP
   tunnel is not explicitly configured to support the full-functionality
   ECN option. However, the tunnel ingress point is behaving identically
   to a tunnel ingress point that supports the full-functionality
   option.  If packets from an ECN-capable connection use this tunnel,
   the ECT codepoint will be set in the outer header at the tunnel
   ingress point.  Congestion within the tunnel may then result in ECN-
   capable routers setting CE in the outer header.  Because the tunnel
   has not been explicitly configured to support the full-functionality
   option, the tunnel egress point expects the not-ECT codepoint to be
   set in the outer header.  When an ECN-capable tunnel egress point
   receives a packet with the ECT or CE codepoint in the outer header,
   in a tunnel that has not been configured to support the full-
   functionality option, that packet should be processed, according to
   whether the CE codepoint was set, as follows.  It is RECOMMENDED that
   on a tunnel that has not been configured to support the full-
   functionality option, packets should be dropped at the egress point
   if the CE codepoint is set in the outer header but not in the inner
   header, and should be forwarded otherwise.

   An IP tunnel cannot provide protection against erasure of congestion
   indications based on changing the ECN codepoint from CE to ECT.  The
   erasure of congestion indications may impact the network and other
   flows in ways that would not be possible in the absence of ECN.  It
   is important to note that erasure of congestion indications can only
   be performed to congestion indications placed by nodes within the
   tunnel; the copy of the ECN field in the inner header preserves
   congestion notifications from nodes upstream of the tunnel ingress
   (unless the inner header is also erased).  If erasure of congestion
   notifications is judged to be a security risk that exceeds the
   congestion management benefits of ECN, then tunnels could be
   specified or configured to use the limited-functionality option.

9.2. IPsec Tunnels

IPsec supports secure communication over potentially insecure network components such as intermediate routers. IPsec protocols support two operating modes, transport mode and tunnel mode, that span a wide range of security requirements and operating environments. Transport mode security protocol header(s) are inserted between the IP (IPv4 or IPv6) header and higher layer protocol headers (e.g., TCP), and hence transport mode can only be used for end-to-end security on a connection. IPsec tunnel mode is based on adding a new "outer" IP header that encapsulates the original, or "inner" IP header and its associated packet. Tunnel mode security headers are inserted between these two IP headers. In contrast to transport mode, the new "outer"
Top   ToC   RFC3168 - Page 30
   IP header and tunnel mode security headers can be added and removed
   at intermediate points along a connection, enabling security gateways
   to secure vulnerable portions of a connection without requiring
   endpoint participation in the security protocols.  An important
   aspect of tunnel mode security is that in the original specification,
   the outer header is discarded at tunnel egress, ensuring that
   security threats based on modifying the IP header do not propagate
   beyond that tunnel endpoint.  Further discussion of IPsec can be
   found in [RFC2401].

   The IPsec protocol as originally defined in [ESP, AH] required that
   the inner header's ECN field not be changed by IPsec decapsulation
   processing at a tunnel egress node; this would have ruled out the
   possibility of full-functionality mode for ECN.  At the same time,
   this would ensure that an adversary's modifications to the ECN field
   cannot be used to launch theft- or denial-of-service attacks across
   an IPsec tunnel endpoint, as any such modifications will be discarded
   at the tunnel endpoint.

   In principle, permitting the use of ECN functionality in the outer
   header of an IPsec tunnel raises security concerns because an
   adversary could tamper with the information that propagates beyond
   the tunnel endpoint.  Based on an analysis (included in Sections 18
   and 19) of these concerns and the associated risks, our overall
   approach has been to provide configuration support for IPsec changes
   to remove the conflict with ECN.

   In particular, in tunnel mode the IPsec tunnel MUST support the
   limited-functionality option outlined in Section 9.1.1, and SHOULD
   support the full-functionality option outlined in Section 9.1.1.

   This makes permission to use ECN functionality in the outer header of
   an IPsec tunnel a configurable part of the corresponding IPsec
   Security Association (SA), so that it can be disabled in situations
   where the risks are judged to outweigh the benefits.  The result is
   that an IPsec security administrator is presented with two
   alternatives for the behavior of ECN-capable connections within an
   IPsec tunnel, the limited-functionality alternative and full-
   functionality alternative described earlier.

   In addition, this document specifies how the endpoints of an IPsec
   tunnel could negotiate enabling ECN functionality in the outer
   headers of that tunnel based on security policy.  The ability to
   negotiate ECN usage between tunnel endpoints would enable a security
   administrator to disable ECN in situations where she believes the
   risks (e.g., of lost congestion notifications) outweigh the benefits
   of ECN.
Top   ToC   RFC3168 - Page 31
   The IPsec protocol, as defined in [ESP, AH], does not include the IP
   header's ECN field in any of its cryptographic calculations (in the
   case of tunnel mode, the outer IP header's ECN field is not
   included).  Hence modification of the ECN field by a network node has
   no effect on IPsec's end-to-end security, because it cannot cause any
   IPsec integrity check to fail.  As a consequence, IPsec does not
   provide any defense against an adversary's modification of the ECN
   field (i.e., a man-in-the-middle attack), as the adversary's
   modification will also have no effect on IPsec's end-to-end security.
   In some environments, the ability to modify the ECN field without
   affecting IPsec integrity checks may constitute a covert channel; if
   it is necessary to eliminate such a channel or reduce its bandwidth,
   then the IPsec tunnel should be run in limited-functionality mode.

9.2.1. Negotiation between Tunnel Endpoints

This section describes the detailed changes to enable usage of ECN over IPsec tunnels, including the negotiation of ECN support between tunnel endpoints. This is supported by three changes to IPsec: * An optional Security Association Database (SAD) field indicating whether tunnel encapsulation and decapsulation processing allows or forbids ECN usage in the outer IP header. * An optional Security Association Attribute that enables negotiation of this SAD field between the two endpoints of an SA that supports tunnel mode. * Changes to tunnel mode encapsulation and decapsulation processing to allow or forbid ECN usage in the outer IP header based on the value of the SAD field. When ECN usage is allowed in the outer IP header, the ECT codepoint is set in the outer header for ECN-capable connections and congestion notifications (indicated by the CE codepoint) from such connections are propagated to the inner header at tunnel egress. If negotiation of ECN usage is implemented, then the SAD field SHOULD also be implemented. On the other hand, negotiation of ECN usage is OPTIONAL in all cases, even for implementations that support the SAD field. The encapsulation and decapsulation processing changes are REQUIRED, but MAY be implemented without the other two changes by assuming that ECN usage is always forbidden. The full-functionality alternative for ECN usage over IPsec tunnels consists of the SAD field and the full version of encapsulation and decapsulation processing changes, with or without the OPTIONAL negotiation support. The limited-functionality alternative consists of a subset of the encapsulation and decapsulation changes that always forbids ECN usage.
Top   ToC   RFC3168 - Page 32
   These changes are covered further in the following three subsections.

9.2.1.1. ECN Tunnel Security Association Database Field
Full ECN functionality adds a new field to the SAD (see [RFC2401]): ECN Tunnel: allowed or forbidden. Indicates whether ECN-capable connections using this SA in tunnel mode are permitted to receive ECN congestion notifications for congestion occurring within the tunnel. The allowed value enables ECN congestion notifications. The forbidden value disables such notifications, causing all congestion to be indicated via dropped packets. [OPTIONAL. The value of this field SHOULD be assumed to be "forbidden" in implementations that do not support it.] If this attribute is implemented, then the SA specification in a Security Policy Database (SPD) entry MUST support a corresponding attribute, and this SPD attribute MUST be covered by the SPD administrative interface (currently described in Section 4.4.1 of [RFC2401]).
9.2.1.2. ECN Tunnel Security Association Attribute
A new IPsec Security Association Attribute is defined to enable the support for ECN congestion notifications based on the outer IP header to be negotiated for IPsec tunnels (see [RFC2407]). This attribute is OPTIONAL, although implementations that support it SHOULD also support the SAD field defined in Section 9.2.1.1. Attribute Type class value type ------------------------------------------------- ECN Tunnel 10 Basic The IPsec SA Attribute value 10 has been allocated by IANA to indicate that the ECN Tunnel SA Attribute is being negotiated; the type of this attribute is Basic (see Section 4.5 of [RFC2407]). The Class Values are used to conduct the negotiation. See [RFC2407, RFC2408, RFC2409] for further information including encoding formats and requirements for negotiating this SA attribute.
Top   ToC   RFC3168 - Page 33
   Class Values

   ECN Tunnel

   Specifies whether ECN functionality is allowed to be used with Tunnel
   Encapsulation Mode.  This affects tunnel encapsulation and
   decapsulation processing - see Section 9.2.1.3.

   RESERVED          0
   Allowed           1
   Forbidden         2

   Values 3-61439 are reserved to IANA.  Values 61440-65535 are for
   private use.

   If unspecified, the default shall be assumed to be Forbidden.

   ECN Tunnel is a new SA attribute, and hence initiators that use it
   can expect to encounter responders that do not understand it, and
   therefore reject proposals containing it.  For backwards
   compatibility with such implementations initiators SHOULD always also
   include a proposal without the ECN Tunnel attribute to enable such a
   responder to select a transform or proposal that does not contain the
   ECN Tunnel attribute.  RFC 2407 currently requires responders to
   reject all proposals if any proposal contains an unknown attribute;
   this requirement is expected to be changed to require a responder not
   to select proposals or transforms containing unknown attributes.

9.2.1.3. Changes to IPsec Tunnel Header Processing
For full ECN support, the encapsulation and decapsulation processing for the IPv4 TOS field and the IPv6 Traffic Class field are changed from that specified in [RFC2401] to the following: <-- How Outer Hdr Relates to Inner Hdr --> Outer Hdr at Inner Hdr at IPv4 Encapsulator Decapsulator Header fields: -------------------- ------------ DS Field copied from inner hdr (5) no change ECN Field constructed (7) constructed (8) IPv6 Header fields: DS Field copied from inner hdr (6) no change ECN Field constructed (7) constructed (8)
Top   ToC   RFC3168 - Page 34
      (5)(6) If the packet will immediately enter a domain for which the
      DSCP value in the outer header is not appropriate, that value MUST
      be mapped to an appropriate value for the domain [RFC 2474].  Also
      see [RFC 2475] for further information.

      (7) If the value of the ECN Tunnel field in the SAD entry for this
      SA is "allowed" and the ECN field in the inner header is set to
      any value other than CE, copy this ECN field to the outer header.
      If the ECN field in the inner header is set to CE, then set the
      ECN field in the outer header to ECT(0).

      (8) If the value of the ECN tunnel field in the SAD entry for this
      SA is "allowed" and the ECN field in the inner header is set to
      ECT(0) or ECT(1) and the ECN field in the outer header is set to
      CE, then copy the ECN field from the outer header to the inner
      header.  Otherwise, make no change to the ECN field in the inner
      header.

      (5) and (6) are identical to match usage in [RFC2401], although
      they are different in [RFC2401].

   The above description applies to implementations that support the ECN
   Tunnel field in the SAD; such implementations MUST implement this
   processing instead of the processing of the IPv4 TOS octet and IPv6
   Traffic Class octet defined in [RFC2401].  This constitutes the
   full-functionality alternative for ECN usage with IPsec tunnels.

   An implementation that does not support the ECN Tunnel field in the
   SAD MUST implement this processing by assuming that the value of the
   ECN Tunnel field of the SAD is "forbidden" for every SA.  In this
   case, the processing of the ECN field reduces to:

      (7) Set the ECN field to not-ECT in the outer header.
      (8) Make no change to the ECN field in the inner header.

   This constitutes the limited functionality alternative for ECN usage
   with IPsec tunnels.

   For backwards compatibility, packets with the CE codepoint set in the
   outer header SHOULD be dropped if they arrive on an SA that is using
   the limited-functionality option, or that is using the full-
   functionality option with the not-ECN codepoint set in the inner
   header.
Top   ToC   RFC3168 - Page 35

9.2.2. Changes to the ECN Field within an IPsec Tunnel.

If the ECN Field is changed inappropriately within an IPsec tunnel, and this change is detected at the tunnel egress, then the receipt of a packet not satisfying the appropriate condition for its SA is an auditable event. An implementation MAY create audit records with per-SA counts of incorrect packets over some time period rather than creating an audit record for each erroneous packet. Any such audit record SHOULD contain the headers from at least one erroneous packet, but need not contain the headers from every packet represented by the entry.

9.2.3. Comments for IPsec Support

Substantial comments were received on two areas of this document during review by the IPsec working group. This section describes these comments and explains why the proposed changes were not incorporated. The first comment indicated that per-node configuration is easier to implement than per-SA configuration. After serious thought and despite some initial encouragement of per-node configuration, it no longer seems to be a good idea. The concern is that as ECN-awareness is progressively deployed in IPsec, many ECN-aware IPsec implementations will find themselves communicating with a mixture of ECN-aware and ECN-unaware IPsec tunnel endpoints. In such an environment with per-node configuration, the only reasonable thing to do is forbid ECN usage for all IPsec tunnels, which is not the desired outcome. In the second area, several reviewers noted that SA negotiation is complex, and adding to it is non-trivial. One reviewer suggested using ICMP after tunnel setup as a possible alternative. The addition to SA negotiation in this document is OPTIONAL and will remain so; implementers are free to ignore it. The authors believe that the assurance it provides can be useful in a number of situations. In practice, if this is not implemented, it can be deleted at a subsequent stage in the standards process. Extending ICMP to negotiate ECN after tunnel setup is more complex than extending SA attribute negotiation. Some tunnels do not permit traffic to be addressed to the tunnel egress endpoint, hence the ICMP packet would have to be addressed to somewhere else, scanned for by the egress endpoint, and discarded there or at its actual destination. In addition, ICMP delivery is unreliable, and hence there is a possibility of an ICMP packet being dropped, entailing the invention of yet another ack/retransmit mechanism. It seems better simply to specify an OPTIONAL extension to the existing SA negotiation mechanism.
Top   ToC   RFC3168 - Page 36

9.3. IP packets encapsulated in non-IP Packet Headers.

A different set of issues are raised, relative to ECN, when IP packets are encapsulated in tunnels with non-IP packet headers. This occurs with MPLS [MPLS], GRE [GRE], L2TP [L2TP], and PPTP [PPTP]. For these protocols, there is no conflict with ECN; it is just that ECN cannot be used within the tunnel unless an ECN codepoint can be specified for the header of the encapsulating protocol. Earlier work considered a preliminary proposal for incorporating ECN into MPLS, and proposals for incorporating ECN into GRE, L2TP, or PPTP will be considered as the need arises.

10. Issues Raised by Monitoring and Policing Devices

One possibility is that monitoring and policing devices (or more informally, "penalty boxes") will be installed in the network to monitor whether best-effort flows are appropriately responding to congestion, and to preferentially drop packets from flows determined not to be using adequate end-to-end congestion control procedures. We recommend that any "penalty box" that detects a flow or an aggregate of flows that is not responding to end-to-end congestion control first change from marking to dropping packets from that flow, before taking any additional action to restrict the bandwidth available to that flow. Thus, initially, the router may drop packets in which the router would otherwise would have set the CE codepoint. This could include dropping those arriving packets for that flow that are ECN-Capable and that already have the CE codepoint set. In this way, any congestion indications seen by that router for that flow will be guaranteed to also be seen by the end nodes, even in the presence of malicious or broken routers elsewhere in the path. If we assume that the first action taken at any "penalty box" for an ECN- capable flow will be to drop packets instead of marking them, then there is no way that an adversary that subverts ECN-based end-to-end congestion control can cause a flow to be characterized as being non-cooperative and placed into a more severe action within the "penalty box". The monitoring and policing devices that are actually deployed could fall short of the `ideal' monitoring device described above, in that the monitoring is applied not to a single flow, but to an aggregate of flows (e.g., those sharing a single IPsec tunnel). In this case, the switch from marking to dropping would apply to all of the flows in that aggregate, denying the benefits of ECN to the other flows in the aggregate also. At the highest level of aggregation, another form of the disabling of ECN happens even in the absence of
Top   ToC   RFC3168 - Page 37
   monitoring and policing devices, when ECN-Capable RED queues switch
   from marking to dropping packets as an indication of congestion when
   the average queue size has exceeded some threshold.

11. Evaluations of ECN

11.1. Related Work Evaluating ECN

This section discusses some of the related work evaluating the use of ECN. The ECN Web Page [ECN] has pointers to other papers, as well as to implementations of ECN. [Floyd94] considers the advantages and drawbacks of adding ECN to the TCP/IP architecture. As shown in the simulation-based comparisons, one advantage of ECN is to avoid unnecessary packet drops for short or delay-sensitive TCP connections. A second advantage of ECN is in avoiding some unnecessary retransmit timeouts in TCP. This paper discusses in detail the integration of ECN into TCP's congestion control mechanisms. The possible disadvantages of ECN discussed in the paper are that a non-compliant TCP connection could falsely advertise itself as ECN-capable, and that a TCP ACK packet carrying an ECN-Echo message could itself be dropped in the network. The first of these two issues is discussed in the appendix of this document, and the second is addressed by the addition of the CWR flag in the TCP header. Experimental evaluations of ECN include [RFC2884,K98]. The conclusions of [K98] and [RFC2884] are that ECN TCP gets moderately better throughput than non-ECN TCP; that ECN TCP flows are fair towards non-ECN TCP flows; and that ECN TCP is robust with two-way traffic (with congestion in both directions) and with multiple congested gateways. Experiments with many short web transfers show that, while most of the short connections have similar transfer times with or without ECN, a small percentage of the short connections have very long transfer times for the non-ECN experiments as compared to the ECN experiments.

11.2. A Discussion of the ECN nonce.

The use of two ECT codepoints, ECT(0) and ECT(1), can provide a one- bit ECN nonce in packet headers [SCWA99]. The primary motivation for this is the desire to allow mechanisms for the data sender to verify that network elements are not erasing the CE codepoint, and that data receivers are properly reporting to the sender the receipt of packets with the CE codepoint set, as required by the transport protocol. This section discusses issues of backwards compatibility with IP ECN implementations in routers conformant with RFC 2481, in which only one ECT codepoint was defined. We do not believe that the
Top   ToC   RFC3168 - Page 38
   incremental deployment of ECN implementations that understand the
   ECT(1) codepoint will cause significant operational problems.  This
   is particularly likely to be the case when the deployment of the
   ECT(1) codepoint begins with routers, before the ECT(1) codepoint
   starts to be used by end-nodes.

11.2.1. The Incremental Deployment of ECT(1) in Routers.

ECN has been an Experimental standard since January 1999, and there are already implementations of ECN in routers that do not understand the ECT(1) codepoint. When the use of the ECT(1) codepoint is standardized for TCP or for other transport protocols, this could mean that a data sender is using the ECT(1) codepoint, but that this codepoint is not understood by a congested router on the path. If allowed by the transport protocol, a data sender would be free not to make use of ECT(1) at all, and to send all ECN-capable packets with the codepoint ECT(0). However, if an ECN-capable sender is using ECT(1), and the congested router on the path did not understand the ECT(1) codepoint, then the router would end up marking some of the ECT(0) packets, and dropping some of the ECT(1) packets, as indications of congestion. Since TCP is required to react to both marked and dropped packets, this behavior of dropping packets that could have been marked poses no significant threat to the network, and is consistent with the overall approach to ECN that allows routers to determine when and whether to mark packets as they see fit (see Section 5).


(page 38 continued on part 3)

Next Section