Network Working Group                                    H. Balakrishnan
Request for Comments: 3449                                       MIT LCS
BCP: 69                                                V. N. Padmanabhan
Category: Best Current Practice                       Microsoft Research
                                                             G. Fairhurst
                                                        M. Sooriyabandara
                                              University of Aberdeen, U.K.
                                                            December 2002

          TCP Performance Implications of Network Path Asymmetry

Status of this Memo

This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2002). All Rights Reserved.

Abstract
This document describes TCP performance problems that arise because of asymmetric effects. These problems arise in several access networks, including bandwidth-asymmetric networks and packet radio subnetworks, for different underlying reasons. However, the end result on TCP performance is the same in both cases: performance often degrades significantly because of imperfection and variability in the ACK feedback from the receiver to the sender. The document details several mitigations to these effects, which have either been proposed or evaluated in the literature, or are currently deployed in networks. These solutions use a combination of local link-layer techniques, subnetwork, and end-to-end mechanisms, consisting of: (i) techniques to manage the channel used for the upstream bottleneck link carrying the ACKs, typically using header compression or reducing the frequency of TCP ACKs, (ii) techniques to handle this reduced ACK frequency to retain the TCP sender's acknowledgment-triggered self-clocking and (iii) techniques to schedule the data and ACK packets in the reverse direction to improve performance in the presence of two-way traffic. Each technique is described, together with known issues, and recommendations for use. A summary of the recommendations is provided at the end of the document.
Table of Contents
1. Conventions used in this Document
2. Motivation
   2.1 Asymmetry due to Differences in Transmit and Receive Capacity
   2.2 Asymmetry due to Shared Media in the Reverse Direction
   2.3 The General Problem
3. How does Asymmetry Degrade TCP Performance?
   3.1 Asymmetric Capacity
   3.2 MAC Protocol Interactions
   3.3 Bidirectional Traffic
   3.4 Loss in Asymmetric Network Paths
4. Improving TCP Performance using Host Mitigations
   4.1 Modified Delayed ACKs
   4.2 Use of Large MSS
   4.3 ACK Congestion Control
   4.4 Window Prediction Mechanism
   4.5 Acknowledgement based on Cwnd Estimation
   4.6 TCP Sender Pacing
   4.7 TCP Byte Counting
   4.8 Backpressure
5. Improving TCP performance using Transparent Modifications
   5.1 TYPE 0: Header Compression
      5.1.1 TCP Header Compression
      5.1.2 Alternate Robust Header Compression Algorithms
   5.2 TYPE 1: Reverse Link Bandwidth Management
      5.2.1 ACK Filtering
      5.2.2 ACK Decimation
   5.3 TYPE 2: Handling Infrequent ACKs
      5.3.1 ACK Reconstruction
      5.3.2 ACK Compaction and Companding
      5.3.3 Mitigating TCP packet bursts generated by Infrequent ACKs
   5.4 TYPE 3: Upstream Link Scheduling
      5.4.1 Per-Flow queuing at the Upstream Bottleneck Link
      5.4.2 ACKs-first Scheduling
6. Security Considerations
7. Summary
8. Acknowledgments
9. References
10. IANA Considerations
Appendix A. Examples of Subnetworks Exhibiting Network Path Asymmetry
Authors' Addresses
Full Copyright Statement
1. Conventions used in this Document
FORWARD DIRECTION: The dominant direction of data transfer over an asymmetric network path. It corresponds to the direction with better characteristics in terms of capacity, latency, error rate, etc. Data transfer in the forward direction is called "forward transfer". Packets travelling in the forward direction follow the forward path through the IP network.

REVERSE DIRECTION: The direction in which acknowledgments of a forward TCP transfer flow. Data transfer could also happen in this direction (and is termed "reverse transfer"), but it is typically less voluminous than that in the forward direction. The reverse direction typically exhibits worse characteristics than the forward direction. Packets travelling in the reverse direction follow the reverse path through the IP network.

UPSTREAM LINK: The specific bottleneck link that normally has much less capability than the corresponding downstream link. Congestion is not confined to this link alone, and may also occur at any point along the forward and reverse directions (e.g., due to sharing with other traffic flows).

DOWNSTREAM LINK: A link on the forward path, corresponding to the upstream link.

ACK: A cumulative TCP acknowledgment [RFC793]. In this document, this term refers to a TCP segment that carries a cumulative acknowledgement (ACK), but no data.

DELAYED ACK FACTOR, d: The number of TCP data segments acknowledged by a TCP ACK. The minimum value of d is 1, since at most one ACK should be sent for each data packet [RFC1122, RFC2581].

STRETCH ACK: Stretch ACKs are acknowledgements that cover more than 2 segments of previously unacknowledged data (d>2) [RFC2581]. Stretch ACKs can occur by design (although this is not standard), due to implementation bugs [All97b, RFC2525], or due to ACK loss [RFC2760].

NORMALIZED BANDWIDTH RATIO, k: The ratio of the raw bandwidth (capacity) of the forward direction to the return direction, divided by the ratio of the packet sizes used in the two directions [LMS97].

SOFTSTATE: Per-flow state established in a network device that is used by the protocol [Cla88]. The state expires after a period of time (i.e., is not required to be explicitly deleted when a session expires), and is continuously refreshed while a flow continues (i.e., lost state may be reconstructed without needing to exchange additional control messages).
2. Motivation
Asymmetric characteristics are exhibited by several network technologies, including cable data networks (e.g., DOCSIS cable TV networks [DS00, DS01]), direct broadcast satellite (e.g., an IP service using Digital Video Broadcast, DVB, [EN97] with an interactive return channel), Very Small Aperture satellite Terminals (VSAT), Asymmetric Digital Subscriber Line (ADSL) [ITU02, ANS01], and several packet radio networks. These networks are increasingly being deployed as high-speed Internet access networks, and it is therefore highly desirable to achieve good TCP performance. However, the asymmetry of the network paths often makes this challenging. Examples of some networks that exhibit asymmetry are provided in the Appendix.

Asymmetry may manifest itself as a difference in transmit and receive capacity, an imbalance in the packet loss rate, or differences between the transmit and receive paths [RFC3077]. For example, when capacity is asymmetric, such that there is reduced capacity on the reverse path used by TCP ACKs, slow or infrequent ACK feedback degrades TCP performance in the forward direction. Similarly, asymmetry in the underlying Medium Access Control (MAC) and Physical (PHY) protocols could make it expensive to transmit TCP ACKs (disproportionately to their size), even when capacity is symmetric.

2.1 Asymmetry due to Differences in Transmit and Receive Capacity
Network paths may be asymmetric because the upstream and downstream links operate at different rates and/or are implemented using different technologies. The asymmetry in capacity may be substantially increased when best effort IP flows carrying TCP ACKs share the available upstream capacity with other traffic flows (e.g., telephony), especially flows that have reserved upstream capacity. This includes service guarantees at the IP layer (e.g., the Guaranteed Service [RFC2212]) or at the subnet layer (e.g., support of Voice over IP [ITU01] using the Unsolicited Grant service in DOCSIS [DS01], or CBR virtual connections in ATM over ADSL [ITU02, ANS01]). When multiple upstream links exist, the asymmetry may be reduced by dividing upstream traffic between a number of available upstream links.
2.2 Asymmetry due to Shared Media in the Reverse Direction
In networks employing centralized multiple access control, asymmetry may be a fundamental consequence of the hub-and-spoke architecture of the network (i.e., a single base node communicating with multiple downstream nodes). The central node often incurs less transmission overhead and does not incur latency in scheduling its own downstream transmissions. In contrast, upstream transmission is subject to additional overhead and latency (e.g., due to guard times between transmission bursts, and contention intervals). This can produce significant network path asymmetry.

Upstream capacity may be further limited by the requirement that each node must first request per-packet bandwidth using a contention MAC protocol (e.g., the DOCSIS 1.0 MAC restricts each node to sending at most a single packet in each upstream time-division interval [DS00]). A satellite network employing dynamic Bandwidth on Demand (BoD) also consumes MAC resources for each packet sent (e.g., [EN00]). In these schemes, the available uplink capacity is a function of the MAC algorithm. The MAC and PHY schemes also introduce a per-transmission overhead in the upstream direction, which can be so significant that transmitting short packets (including TCP ACKs) becomes as costly as transmitting MTU-sized data packets.

2.3 The General Problem
Despite the technological differences between capacity-dependent and MAC-dependent asymmetries, both kinds of network path suffer reduced TCP performance for the same fundamental reason: the imperfection and variability of ACK feedback. This document discusses the problem in detail and describes several techniques that may reduce or eliminate the constraints.

3. How does Asymmetry Degrade TCP Performance?
This section describes the implications of network path asymmetry on TCP performance. The reader is referred to [BPK99, Bal98, Pad98, FSS01, Sam99] for more details and experimental results.

3.1 Asymmetric Capacity
The problems that degrade unidirectional transfer performance when the forward and return paths have very different capacities depend on the characteristics of the upstream link. Two types of situations arise for unidirectional traffic over such network paths: when the upstream bottleneck link has sufficient queuing to prevent packet (ACK) losses, and when the upstream bottleneck link has a small buffer. Each is considered in turn.
If the upstream bottleneck link has deep queues, so that it does not drop ACKs in the reverse direction, then performance is a strong function of the normalized bandwidth ratio, k. For example, for a 10 Mbps downstream link and a 50 Kbps upstream link, the raw capacity ratio is 200. With 1000-byte data packets and 40-byte ACKs, the ratio of the packet sizes is 25. This implies that k is 200/25 = 8. Thus, if the receiver acknowledges more frequently than one ACK every 8 (i.e., k) data packets, the upstream link will become saturated before the downstream link, limiting the throughput in the forward direction.

Note that the achieved TCP throughput is determined by the minimum of the receiver advertised window and the TCP congestion window, cwnd [RFC2581]. If ACKs are not dropped (at the upstream bottleneck link) and k > 1 (or k > 2 when delayed ACKs are used [RFC1122]), TCP ACK-clocking breaks down.

Consider two data packets transmitted by the sender in quick succession. En route to the receiver, these packets get spaced apart according to the capacity of the smallest bottleneck link in the forward direction. The principle of ACK clocking is that the ACKs generated in response to receiving these data packets reflect this temporal spacing all the way back to the sender, enabling it to transmit new data packets that maintain the same spacing [Jac88]. ACK clocking with delayed ACKs reflects the spacing between the data packets that actually trigger ACKs.

However, the limited upstream capacity and queuing at the upstream bottleneck router alter the inter-ACK spacing on the reverse path, and hence the spacing observed at the sender. When ACKs arrive at the upstream bottleneck link at a faster rate than the link can support, they get queued behind one another. The spacing between them when they emerge from the link is dilated with respect to their original spacing, and is a function of the upstream bottleneck capacity. Thus the TCP sender clocks out new data packets at a slower rate than if there had been no queuing of ACKs. The performance of the connection is no longer dependent on the downstream bottleneck link alone; instead, it is throttled by the rate of arriving ACKs. As a side effect, the sender's rate of cwnd growth also slows down.

A second side effect arises when the upstream bottleneck link on the reverse path is saturated. The saturated link causes persistent queuing of packets, leading to an increasing path Round Trip Time (RTT) [RFC2988] observed by all end hosts using the bottleneck link. This can impact the protocol control loops, and may also trigger a false timeout (caused by underestimation of the path RTT by the sending host).

A different situation arises when the upstream bottleneck link has a relatively small amount of buffer space to accommodate ACKs. As the transmission window grows, this queue fills, and ACKs are dropped. If the receiver were to acknowledge every packet, only one of every k
ACKs would get through to the sender, and the remaining (k-1) would be dropped due to buffer overflow at the upstream link buffer (here k is the normalized bandwidth ratio as before). In this case, the reverse bottleneck link capacity and slow ACK arrival rate are not directly responsible for any degraded performance. However, the infrequency of ACKs leads to three reasons for degraded performance:

1. The sender transmits data in large bursts of packets, limited only by the available cwnd. If the sender receives only one ACK in k, it transmits data in bursts of k (or more) packets, because each ACK shifts the sliding window by at least k (acknowledged) data packets (TCP data segments). This increases the likelihood of data packet loss along the forward path, especially when k is large, because routers do not handle large bursts of packets well.

2. Current TCP sender implementations increase their cwnd by counting the number of ACKs they receive and not by how much data is actually acknowledged by each ACK. The latter approach, also known as byte counting (Section 4.7), is a standard implementation option for cwnd increase during the congestion avoidance period [RFC2581]. Thus fewer ACKs imply a slower rate of growth of the cwnd, which degrades performance over long-delay connections.

3. The sender TCP's Fast Retransmission and Fast Recovery algorithms [RFC2581] are less effective when ACKs are lost. The sender may not receive the threshold number of duplicate ACKs even if the receiver transmits more than the DupACK threshold (> 3 DupACKs) [RFC2581]. Furthermore, the sender may not receive enough duplicate ACKs to adequately inflate its cwnd during Fast Recovery.
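As a non-normative illustration of the arithmetic used in this section, the short Python sketch below computes the normalized bandwidth ratio k for the example figures above (10 Mbps downstream, 50 Kbps upstream, 1000-byte data packets, 40-byte ACKs) and reports, for a given delayed ACK factor d, whether the ACK traffic exceeds what the upstream link can carry. The function and variable names are purely illustrative.

   # Illustrative sketch only: the normalized bandwidth ratio k (Section 1)
   # and the condition under which the upstream ACK path saturates.

   def normalized_bandwidth_ratio(down_bps, up_bps, data_bytes, ack_bytes):
       """k = (raw capacity ratio) / (packet size ratio)."""
       return (down_bps / up_bps) / (data_bytes / ack_bytes)

   down_bps, up_bps = 10_000_000, 50_000   # 10 Mbps downstream, 50 Kbps upstream
   data_bytes, ack_bytes = 1000, 40        # data segment and pure-ACK sizes

   k = normalized_bandwidth_ratio(down_bps, up_bps, data_bytes, ack_bytes)
   print("k =", k)                         # -> 8.0

   # The upstream link saturates when the receiver sends more than one ACK
   # per k data packets, i.e., when k exceeds the delayed ACK factor d.
   for d in (1, 2):
       state = "saturates" if k > d else "keeps up"
       print("d =", d, ": upstream ACK path", state)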
3.2 MAC Protocol Interactions

The interaction of TCP with MAC protocols may degrade end-to-end performance. Variable round-trip delays and ACK queuing are the main symptoms of this problem. One example is the impact on terrestrial wireless networks [Bal98]. A high per-packet overhead may arise from the need for communicating link nodes to first synchronise (e.g., via a Ready To Send / Clear To Send (RTS/CTS) exchange) before communicating, and from the significant turn-around time of the wireless channel. This overhead is variable, since the RTS/CTS exchange may need to back off exponentially when the remote node is busy (e.g., engaged in a conversation with a different node). This leads to large and variable communication latencies in packet-radio networks.
An asymmetric workload (more downstream than upstream traffic) may cause ACKs to be queued in some wireless nodes (especially in the end host modems), exacerbating the variable latency. Queuing may also occur in other shared media, e.g., cable modem uplinks and the BoD access systems often employed on shared satellite channels.

Variable latency and ACK queuing reduce the smoothness of the TCP data flow. In particular, ACK traffic can interfere with the flow of data packets, increasing the traffic load of the system. TCP measures the path RTT, and from this calculates a smoothed RTT estimate (srtt) and a linear deviation, rttvar. These are used to estimate a path retransmission timeout (RTO) [RFC2988], set to srtt + 4*rttvar. For most wired TCP connections, the srtt remains constant or has a low linear deviation. The RTO therefore tracks the path RTT, and the TCP sender will respond promptly when multiple losses occur in a window. In contrast, some wireless networks exhibit a high variability in RTT, causing the RTO to increase significantly (e.g., to the order of 10 seconds). Paths traversing multiple wireless hops are especially vulnerable to this effect, because this increases the probability that the intermediate nodes may already be engaged in conversation with other nodes.

The overhead in most MAC schemes is a function of both the number and size of packets. However, the MAC contention problem depends significantly on the number of packets (e.g., ACKs) transmitted, rather than on their size. In other words, there is a significant cost to transmitting a packet regardless of packet size.

Experiments conducted on the Ricochet packet radio network in 1996 and 1997 demonstrated the impact of radio turnarounds and the corresponding increased RTT variability, resulting in degraded TCP performance. It was not uncommon for TCP connections to experience timeouts of 9 - 12 seconds, with the result that many connections were idle for a significant fraction of their lifetime (e.g., sometimes 35% of the total transfer time). This leads to under-utilization of the available capacity. These effects may also occur in other wireless subnetworks.
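To illustrate why such RTT variability inflates the retransmission timer, the following sketch applies the standard RTO estimator described above (the [RFC2988] gains of 1/8 and 1/4, the 4*rttvar term, and the 1 second minimum). The RTT samples are invented purely for illustration.

   # Illustrative sketch of the RTO estimator of [RFC2988]: a stable RTT
   # keeps the RTO near its floor, while a highly variable RTT inflates it.

   ALPHA, BETA, MIN_RTO = 1.0 / 8, 1.0 / 4, 1.0   # gains and 1 s lower bound

   def rto_after(rtt_samples):
       srtt = rttvar = None
       rto = MIN_RTO
       for r in rtt_samples:                       # RTT samples in seconds
           if srtt is None:                        # first RTT measurement
               srtt, rttvar = r, r / 2
           else:
               rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - r)
               srtt = (1 - ALPHA) * srtt + ALPHA * r
           rto = max(MIN_RTO, srtt + 4 * rttvar)
       return rto

   print(rto_after([0.3] * 10))                      # stable path: ~1 s (floor)
   print(rto_after([0.3, 2.5, 0.4, 3.0, 0.5, 2.8]))  # variable path: ~5-6 s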
3.3 Bidirectional Traffic

Bidirectional traffic arises when there are simultaneous TCP transfers in the forward and reverse directions over an asymmetric network path, e.g., a user who sends an e-mail message in the reverse direction while simultaneously receiving a web page in the forward direction. To simplify the discussion, only one TCP connection in each direction is considered. In many practical cases, several simultaneous connections need to share the available capacity, increasing the level of congestion.
Bidirectional traffic makes the effects discussed in section 3.1 more pronounced, because part of the upstream link bandwidth is consumed by the reverse transfer. This effectively increases the degree of bandwidth asymmetry.

Other effects also arise due to the interaction between data packets of the reverse transfer and ACKs of the forward transfer. Suppose that, at the time the forward TCP connection is initiated, the reverse TCP connection has already saturated the bottleneck upstream link with data packets. There is then a high probability that many ACKs of the new forward TCP connection will encounter a full upstream link buffer and hence get dropped. Even after these initial problems, ACKs of the forward connection could get queued behind large data packets of the reverse connection. The larger data packets may have correspondingly long transmission times (e.g., it takes about 280 ms to transmit a 1 Kbyte data packet over a 28.8 kbps line). This causes the forward transfer to stall for long periods of time. It is only at times when the reverse connection loses packets (due to a buffer overflow at an intermediate router) and slows down that the forward connection gets the opportunity to make rapid progress and build up its cwnd.

When ACKs are queued behind other traffic for appreciable periods of time, the bursty nature of TCP traffic and self-synchronizing effects can result in a phenomenon known as ACK Compression [ZSC91], which reduces the throughput of TCP. It occurs when a series of ACKs in one direction is queued behind a burst of other packets (e.g., data packets travelling in the same direction) and becomes compressed in time. This results in an intense burst of data packets in the other direction, in response to the burst of compressed ACKs arriving at the sender. This phenomenon has been investigated in detail for bidirectional traffic, and analytical work [LMS97] has predicted that ACK Compression may also result from bidirectional transmission with asymmetry; it has also been observed in practical asymmetric satellite subnetworks [FSS01]. In the case of extreme asymmetry (k >> 1), the inter-ACK spacing can instead increase due to queuing (section 3.1), resulting in ACK dilation.

In summary, sharing of the upstream bottleneck link by multiple flows (e.g., IP flows to the same end host, or flows to a number of end hosts sharing a common upstream link) increases the level of ACK congestion. The presence of bidirectional traffic exacerbates the constraints introduced by bandwidth asymmetry because of the adverse interaction between (large) data packets of a reverse direction connection and the ACKs of a forward direction connection.
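The stalls described above are largely a consequence of serialization delay on the slow upstream link. The sketch below (reusing the 28.8 kbps and 1 Kbyte figures from the text; the burst sizes are invented) estimates how long a forward-transfer ACK waits when queued behind reverse-transfer data packets.

   # Illustrative arithmetic only: time for which an ACK of the forward
   # transfer is held up behind reverse-transfer data packets on a slow
   # upstream link.

   def tx_time(num_bytes, link_bps):
       """Serialization time (seconds) of one packet on the link."""
       return num_bytes * 8 / link_bps

   up_bps = 28_800                 # upstream link rate (28.8 kbps)
   data_bytes = 1024               # reverse-transfer data segment (1 Kbyte)

   print("one data packet: %.0f ms" % (tx_time(data_bytes, up_bps) * 1000))

   # An ACK queued behind a small burst of reverse-transfer data packets
   # is delayed by their combined serialization time.
   for queued in (1, 2, 4):
       wait_ms = queued * tx_time(data_bytes, up_bps) * 1000
       print("ACK behind %d data packet(s): ~%.0f ms" % (queued, wait_ms))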
3.4 Loss in Asymmetric Network Paths
Loss may occur in either the forward or reverse direction. For a data transfer in the forward direction, this results respectively in loss of data packets or ACK packets. Loss of ACKs is less significant than loss of data packets, because ACKs are cumulative; a later ACK generally recovers the lost information, so ACK loss typically results only in stretch ACKs [CR98, FSS01].

In the case of long delay paths, a slow upstream link [RFC3150] can lead to another complication when the end host uses TCP large windows [RFC1323] to maximize throughput in the forward direction. Loss of data packets on the forward path, due to congestion or link loss (common for some wireless links), will generate a large number of back-to-back duplicate ACKs (or TCP SACK packets [RFC2018]), one for each correctly received data packet following a loss. The TCP sender employs Fast Retransmission and Fast Recovery [RFC2581] to recover from the loss, but even if this is successful, the ACK to the retransmitted data segment may be significantly delayed by other duplicate ACKs still queued at the upstream link buffer. This can ultimately lead to a timeout [RFC2988], prematurely ending the loss recovery and forcing the sender back into Slow Start [RFC2581]. This results in poor forward path throughput. Section 5.3 describes some mitigations to counter this.
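The delay contributed by a backlog of duplicate ACKs can be estimated with the following sketch. The upstream rate, window size, and ACK size used here are invented for illustration and are not taken from the text.

   # Illustrative sketch only: how long the ACK that completes loss
   # recovery may wait behind a backlog of duplicate ACKs queued on a
   # slow upstream link.  All figures are invented for illustration.

   def tx_time(num_bytes, link_bps):
       return num_bytes * 8 / link_bps

   up_bps = 9_600           # slow upstream link (9.6 kbps)
   ack_bytes = 40           # size of each (duplicate) ACK
   window_segments = 100    # large forward window -> up to ~100 dupACKs per loss

   backlog_delay = window_segments * tx_time(ack_bytes, up_bps)
   print("ACK delayed roughly %.1f s behind the dupACK backlog" % backlog_delay)
   # A delay of a few seconds can exceed the retransmission timeout,
   # forcing the sender back into Slow Start.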