Appendix A. Literature Review
This appendix summarizes the literature with respect to link indications on wireless local area networks.

A.1. Link Layer
The characteristics of wireless links have been found to vary considerably depending on the environment.  In "Performance of Multihop Wireless Networks: Shortest Path is Not Enough" [Shortest], the authors studied the performance of both an indoor and outdoor mesh network.  By measuring inter-node throughput, the best path between nodes was computed.  The throughput of the best path was compared with the throughput of the shortest path computed based on a hop-count metric.  In almost all cases, the shortest path route offered considerably lower throughput than the best path.  In examining link behavior, the authors found that rather than exhibiting a bi-modal distribution between "up" (low loss rate) and "down" (high loss rate), many links exhibited intermediate loss rates.  Asymmetry was also common, with 30 percent of links demonstrating substantial differences in the loss rates in each direction.  As a result, on wireless networks the measured throughput can differ substantially from the negotiated rate due to retransmissions, and successful delivery of routing packets is not necessarily an indication that the link is useful for delivery of data.

In "Measurement and Analysis of the Error Characteristics of an In-Building Wireless Network" [Eckhardt], the authors characterize the performance of an AT&T Wavelan 2 Mbps in-building WLAN operating in Infrastructure mode on the Carnegie Mellon campus.  In this study, very low frame loss was experienced.  As a result, links could be assumed to operate either very well or not at all.

In "Link-level Measurements from an 802.11b Mesh Network" [Aguayo], the authors analyze the causes of frame loss in a 38-node urban multi-hop 802.11 ad-hoc network.  In most cases, links that are very bad in one direction tend to be bad in both directions, and links that are very good in one direction tend to be good in both directions.  However, 30 percent of links exhibited loss rates differing substantially in each direction.  Signal to noise ratio (SNR) and distance showed little value in predicting loss rates, and rather than exhibiting a step-function transition between "up" (low loss) or "down" (high loss) states, inter-node loss rates varied widely, demonstrating a nearly uniform
distribution over the range at the lower rates.  The authors attribute the observed effects to multi-path fading, rather than attenuation or interference.

The findings of [Eckhardt] and [Aguayo] demonstrate the diversity of link conditions observed in practice.  While for indoor infrastructure networks site surveys and careful measurement can assist in promoting ideal behavior, in ad-hoc/mesh networks node mobility and external factors such as weather may not be easily controlled.

Considerable diversity in behavior is also observed due to implementation effects.  "Techniques to reduce IEEE 802.11b MAC layer handover time" [Velayos] measured handover times for a stationary STA after the AP was turned off.  This study divided handover times into detection (determination of disconnection from the existing point of attachment), search (discovery of alternative attachment points), and execution (connection to an alternative point of attachment) phases.  These measurements indicated that the duration of the detection phase (the largest component of handoff delay) is determined by the number of non-acknowledged frames triggering the search phase and delays due to precursors such as RTS/CTS and rate adaptation.  Detection behavior varied widely between implementations.  For example, network interface cards (NICs) designed for desktops attempted more retransmissions prior to triggering search as compared with laptop designs, since they assumed that the AP was always in range, regardless of whether the Beacon was received.  The study recommends that the duration of the detection phase be reduced by initiating the search phase as soon as collisions can be excluded as the cause of non-acknowledged transmissions; the authors recommend three consecutive transmission failures as the cutoff.  This approach is both quicker and more immune to multi-path interference than monitoring of the SNR.  Where the STA is not sending or receiving frames, it is recommended that Beacon reception be tracked in order to detect disconnection, and that Beacon spacing be reduced to 60 ms in order to reduce detection times.  In order to compensate for more frequent triggering of the search phase, the authors recommend algorithms for wait time reduction, as well as interleaving of search and data frame transmission.
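For illustration, the detection heuristic recommended by [Velayos] might be expressed as follows.  The counter names, the missed-Beacon threshold, and the means by which collisions are excluded are assumptions of this example, not part of [Velayos] or of any driver interface; only the three-consecutive-failure cutoff is taken from the study.

   # Illustrative sketch of the [Velayos] detection heuristic: trigger the
   # search phase after a small number of consecutive non-acknowledged
   # transmissions (once collisions can be excluded), and, when idle,
   # track missed Beacons instead.  Names and values other than the
   # three-failure cutoff are assumptions of this example.

   CONSECUTIVE_FAILURE_LIMIT = 3     # cutoff recommended in [Velayos]
   MISSED_BEACON_LIMIT = 3           # example value only

   class DetectionMonitor:
       def __init__(self, start_search):
           self.start_search = start_search   # callback into the search phase
           self.consecutive_failures = 0
           self.missed_beacons = 0

       def on_tx_result(self, acked, collision_suspected):
           """Called for each transmitted frame with its ACK outcome."""
           if acked:
               self.consecutive_failures = 0
               return
           if collision_suspected:
               return                         # do not count losses attributable to collisions
           self.consecutive_failures += 1
           if self.consecutive_failures >= CONSECUTIVE_FAILURE_LIMIT:
               self.start_search()            # begin scanning for alternative APs

       def on_beacon_interval(self, beacon_received):
           """Called once per expected Beacon while no data is being exchanged."""
           if beacon_received:
               self.missed_beacons = 0
           else:
               self.missed_beacons += 1
               if self.missed_beacons >= MISSED_BEACON_LIMIT:
                   self.start_search()

How collisions are "excluded" is left to the implementation (for example, via carrier-sense statistics); [Velayos] does not mandate a particular mechanism.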
"An Empirical Analysis of the IEEE 802.11 MAC Layer Handoff Process" [Mishra] investigates handoff latencies obtained with three mobile STA implementations communicating with two APs.  The study found that there is a large variation in handoff latency among STA and AP implementations and that implementations utilize different message sequences.  For example, one STA sends a Reassociation Request prior to authentication, which results in receipt of a Deauthenticate message.  The study divided handoff latency into discovery, authentication, and reassociation exchanges, concluding that the discovery phase was the dominant component of handoff delay.  Latency in the detection phase was not investigated.

"SyncScan: Practical Fast Handoff for 802.11 Infrastructure Networks" [Ramani] weighs the pros and cons of active versus passive scanning.  The authors point out the advantages of timed Beacon reception, which had previously been incorporated into [IEEE-802.11k].  Timed Beacon reception allows the station to continually keep up to date on the signal to noise ratio of neighboring APs, allowing handoff to occur earlier.  Since the station does not need to wait for initial and subsequent responses to a broadcast Probe Request (MinChannelTime and MaxChannelTime, respectively), performance is comparable to what is achievable with 802.11k Neighbor Reports and unicast Probe Requests.  The authors measured the channel switching delay, the time it takes to switch to a new frequency and begin receiving frames.  Measurements ranged from 5 ms to 19 ms per channel; where timed Beacon reception or interleaved active scanning is used, switching time contributes significantly to overall handoff latency.  The authors propose deployment of APs with Beacons synchronized via the Network Time Protocol (NTP) [RFC1305], enabling a driver implementing SyncScan to work with legacy APs without requiring implementation of new protocols.  The authors measured the distribution of inter-arrival times for stations implementing SyncScan, with excellent results.

"Roaming Interval Measurements" [Alimian] presents data on the behavior of stationary STAs after the AP signal has been shut off.  This study highlighted implementation differences in rate adaptation as well as detection, scanning, and handoff.  As in [Velayos], performance varied widely between implementations, from half an order of magnitude variation in rate adaptation to an order of magnitude difference in detection times, two orders of magnitude in scanning, and one and a half orders of magnitude in handoff times.

"An experimental study of IEEE 802.11b handoff performance and its effect on voice traffic" [Vatn] describes handover behavior observed when the signal from the AP is gradually attenuated, which is more representative of field experience than the shutoff techniques used in [Velayos].  Stations were configured to initiate handover when signal strength dipped below a threshold, rather than purely based on frame loss, so that they could begin handover while still connected to the current AP.  It was noted that stations continued to receive data frames during the search phase.  Station-initiated
Disassociation and pre-authentication were not observed in this study.

A.1.1. Link Indications
Within a link layer, the definition of "Link Up" and "Link Down" may vary according to the deployment scenario.  For example, within PPP [RFC1661], either peer may send an LCP-Terminate frame in order to terminate the PPP link layer, and a link may only be assumed to be usable for sending network protocol packets once Network Control Protocol (NCP) negotiation has completed for that protocol.

Unlike PPP, IEEE 802 does not include facilities for network layer configuration, and the definition of "Link Up" and "Link Down" varies by implementation.  Empirical evidence suggests that the definition of "Link Up" and "Link Down" may depend on whether the station is mobile or stationary, whether infrastructure or ad-hoc mode is in use, and whether security and the Inter-Access Point Protocol (IAPP) are implemented.

Where a STA encounters a series of consecutive non-acknowledged frames while having missed one or more Beacons, the most likely cause is that the station has moved out of range of the AP.  As a result, [Velayos] recommends that the station begin the search phase after collisions can be ruled out; since this approach does not take rate adaptation into account, it may be somewhat aggressive.  Only when no alternative workable rate or point of attachment is found is a "Link Down" indication returned.

In a stationary point-to-point installation, the most likely cause of an outage is that the link has become impaired, and alternative points of attachment may not be available.  As a result, implementations configured to operate in this mode tend to be more persistent.  For example, within 802.11 the short interframe space (SIFS) interval may be increased and MIB variables relating to timeouts (such as dot11AuthenticationResponseTimeout, dot11AssociationResponseTimeout, dot11ShortRetryLimit, and dot11LongRetryLimit) may be set to larger values.  In addition, a "Link Down" indication may be returned later.

In IEEE 802.11 ad-hoc mode with no security, reception of data frames is enabled in State 1 ("Unauthenticated" and "Unassociated").  As a result, reception of data frames is enabled at any time, and no explicit "Link Up" indication exists.

In Infrastructure mode, IEEE 802.11-2003 enables reception of data frames only in State 3 ("Authenticated" and "Associated").  As a result, a transition to State 3 (e.g., completion of a successful
Association or Reassociation exchange) enables sending and receiving of network protocol packets, and a transition from State 3 to State 2 (reception of a "Disassociate" frame) or State 1 (reception of a "Deauthenticate" frame) disables sending and receiving of network protocol packets.  As a result, IEEE 802.11 stations typically signal "Link Up" on receipt of a successful Association/Reassociation Response.

As described within [IEEE-802.11F], after sending a Reassociation Response, an Access Point will send a frame with the station's source address to a multicast destination.  This causes switches within the Distribution System (DS) to update their learning tables, readying the DS to forward frames to the station at its new point of attachment.  Were the AP not to send this "spoofed" frame, the station's location would not be updated within the distribution system until it sent its first frame at the new location.  Thus, the purpose of spoofing is to equalize uplink and downlink handover times.  However, this mechanism also enables an attacker to deny service to authenticated and associated stations by spoofing a Reassociation Request using the victim's MAC address, from anywhere within the ESS.  Without spoofing, such an attack would only be able to disassociate stations on the AP to which the Reassociation Request was sent.

The signaling of "Link Down" is considerably more complex.  Even though a transition to State 2 or State 1 results in the station being unable to send or receive IP packets, this does not necessarily imply that such a transition should be considered a "Link Down" indication.  In an infrastructure network, a station may have a choice of multiple Access Points offering connection to the same network.  In such an environment, a station that is unable to reach State 3 with one Access Point may instead choose to attach to another Access Point.  Rather than registering a "Link Down" indication with each move, the station may instead register a series of "Link Up" indications.

In [IEEE-802.11i], forwarding of frames from the station to the distribution system is only feasible after the completion of the 4-way handshake and group-key handshake, so that entering State 3 is no longer sufficient.  This has resulted in several observed problems.  For example, where a "Link Up" indication is triggered on the station by receipt of an Association/Reassociation Response, DHCP [RFC2131] or Router Solicitation/Router Advertisement (RS/RA) exchanges may be triggered before the link is usable by the Internet layer, resulting in configuration delays or failures.  Similarly, transport layer connections will encounter packet loss, resulting in back-off of retransmission timers.
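One way to avoid this problem is for the driver or supplicant to defer the "Link Up" indication until the link can actually carry network protocol packets.  The sketch below illustrates such gating; the event names and callback interface are assumptions of this example, not part of [IEEE-802.11i] or of any particular operating system.

   # Illustrative sketch: defer the "Link Up" indication until the link
   # can carry network protocol packets.  With [IEEE-802.11i], that point
   # is completion of the 4-way and group-key handshakes rather than
   # Association/Reassociation.  Event names and the callback interface
   # are assumptions of this example.

   class LinkUpGate:
       def __init__(self, notify_link_up, security_enabled):
           self.notify_link_up = notify_link_up   # e.g., triggers DHCP or RS/RA
           self.security_enabled = security_enabled
           self.associated = False
           self.keys_installed = False

       def on_association_response(self, success):
           self.associated = success
           self._maybe_signal()

       def on_key_handshake_complete(self):
           # Both the 4-way handshake and the group-key handshake are done.
           self.keys_installed = True
           self._maybe_signal()

       def on_disassociate_or_deauthenticate(self):
           self.associated = False
           self.keys_installed = False

       def _maybe_signal(self):
           if self.associated and (self.keys_installed or not self.security_enabled):
               self.notify_link_up()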
A.1.2. Smart Link Layer Proposals
In order to improve link layer performance, several studies have investigated "smart link layer" proposals.  "Advice to link designers on link Automatic Repeat reQuest (ARQ)" [RFC3366] provides advice to the designers of digital communication equipment and link-layer protocols employing link-layer Automatic Repeat reQuest (ARQ) techniques for IP.  It discusses the use of ARQ, timers, persistency in retransmission, and the challenges that arise from sharing links between multiple flows and from different transport requirements.

In "Link-layer Enhancements for TCP/IP over GSM" [Ludwig], the authors describe how the Global System for Mobile Communications (GSM) reliable and unreliable link layer modes can be simultaneously utilized without higher layer control.  Where a reliable link layer protocol is required (where reliable transports such as TCP and the Stream Control Transmission Protocol (SCTP) [RFC2960] are used), the Radio Link Protocol (RLP) can be engaged; with delay-sensitive applications such as those based on UDP, the transparent mode (no RLP) can be used.  The authors also describe how PPP negotiation can be optimized over high-latency GSM links using "Quickstart-PPP".

In "Link Layer Based TCP Optimisation for Disconnecting Networks" [Scott], the authors describe performance problems that occur with reliable transport protocols facing periodic network disconnections, such as those due to signal fading or handoff.  The authors define a disconnection as a period of connectivity loss that exceeds a retransmission timeout, but is shorter than the connection lifetime.  One issue is that link-unaware senders continue to back off during periods of disconnection.  The authors suggest that a link-aware reliable transport implementation halt retransmission after receiving a "Link Down" indication.  Another issue is that on reconnection the lengthened retransmission times cause delays in utilizing the link.  To improve performance, a "smart link layer" is proposed, which stores the first packet that was not successfully transmitted on a connection, then retransmits it upon receipt of a "Link Up" indication.  Since a disconnection can result in hosts experiencing different network conditions upon reconnection, the authors do not advocate bypassing slow start or attempting to raise the congestion window.  Where IPsec is used and connections cannot be differentiated because transport headers are not visible, the first untransmitted packet for a given sender and destination IP address can be retransmitted.  In addition to looking at retransmission of a single packet per connection, the authors also examined other schemes such
as retransmission of multiple packets and simulated duplicate reception of single or multiple packets (known as rereception).  In general, retransmission schemes were superior to rereception schemes, since rereception cannot stimulate fast retransmit after a timeout.  Retransmission of multiple packets did not appreciably improve performance over retransmission of a single packet.  Since the focus of the research was on disconnection rather than just lossy channels, a two-state Markov model was used, with the "up" state representing no loss, and the "down" state representing 100 percent loss.

In "Multi Service Link Layers: An Approach to Enhancing Internet Performance over Wireless Links" [Xylomenos], the authors use ns-2 to simulate the performance of various link layer recovery schemes (raw link without retransmission, go back N, XOR-based FEC, selective repeat, Karn's RLP, out-of-sequence RLP, and Berkeley Snoop) in stand-alone file transfer, Web browsing, and continuous media distribution.  While selective repeat and Karn's RLP provide the highest throughput for the file transfer and Web browsing scenarios, continuous media distribution requires a combination of low delay and low loss, and the out-of-sequence RLP performed best in this scenario.  Since the results indicate that no single link layer recovery scheme is optimal for all applications, the authors propose that the link layer implement multiple recovery schemes.  Simulations of the multi-service architecture showed that the combination of a low-error-rate recovery scheme for TCP (such as Karn's RLP) and a low-delay scheme for UDP traffic (such as out-of-sequence RLP) provides good performance in all scenarios.  The authors then describe how a multi-service link layer can be integrated with Differentiated Services.

In "WaveLAN-II: A High-Performance Wireless LAN for the Unlicensed Band" [Kamerman], the authors propose an open-loop rate adaptation algorithm known as Automatic Rate Fallback (ARF).  In ARF, the sender adjusts the rate upwards after a fixed number of successful transmissions, and adjusts the rate downwards after one or two consecutive failures.  If after an upwards rate adjustment the transmission fails, the rate is immediately readjusted downwards.

In "A Rate-Adaptive MAC Protocol for Multi-Hop Wireless Networks" [RBAR], the authors propose a closed-loop rate adaptation approach that requires incompatible changes to the IEEE 802.11 MAC.  In order to enable the sender to better determine the transmission rate, the receiver determines the packet length and signal to noise ratio (SNR) of a received RTS frame and calculates the corresponding rate based on a theoretical channel model, rather than channel usage statistics.  The recommended rate is sent back in the CTS frame.  This allows the
rate (and potentially the transmit power) to be optimized on each transmission, albeit at the cost of requiring RTS/CTS for every frame transmission.

In "MiSer: An Optimal Low-Energy Transmission Strategy for IEEE 802.11 a/h" [Qiao], the authors propose a scheme for optimizing transmit power.  The proposal mandates the use of RTS/CTS in order to deal with hidden nodes, requiring that CTS and ACK frames be sent at full power.  The authors utilize a theoretical channel model rather than one based on channel usage statistics.

In "IEEE 802.11 Rate Adaptation: A Practical Approach" [Lacage], the authors distinguish between low-latency implementations, which enable per-packet rate decisions, and high-latency implementations, which do not.  The former implementations typically include dedicated CPUs in their design, enabling them to meet real-time requirements.  The latter implementations are typically based on highly integrated designs in which the upper MAC is implemented on the host.  As a result, due to operating system latencies, the information required to make per-packet rate decisions may not be available in time.  The authors propose an Adaptive ARF (AARF) algorithm for use with low-latency implementations.  This enables rapid downward rate negotiation on failure to receive an ACK, while increasing the number of successful transmissions required for upward rate negotiation.  The AARF algorithm is therefore highly stable in situations where channel properties are changing slowly, but slow to adapt upwards when channel conditions improve.  In order to test the algorithm, the authors utilized ns-2 simulations as well as implementing a version of AARF adapted to a high-latency implementation, the Atheros AR5212 chipset.  The Multiband Atheros Driver for WiFi (MadWiFi) enables a fixed schedule of rates and retries to be provided when a frame is queued for transmission.  The adapted algorithm, known as Adaptive Multi Rate Retry (AMRR), requests only one transmission at each of three rates, the last of which is the minimum available rate.  This enables adaptation to short-term fluctuations in the channel with minimal latency.  The AMRR algorithm provides performance considerably better than the existing MadWiFi driver.

In "Link Adaptation Strategy for IEEE 802.11 WLAN via Received Signal Strength Measurement" [Pavon], the authors propose an algorithm by which a STA adjusts the transmission rate based on a comparison of the received signal strength (RSS) from the AP with dynamically estimated threshold values for each transmission rate.  Upon reception of a frame, the STA updates the average RSS, and on transmission the STA selects a rate and adjusts the RSS threshold values based on whether or not the transmission is successful.  In order to validate the algorithm, the authors utilized an OPNET
simulation without interference, and an ideal curve of bit error rate (BER) vs. signal to noise ratio (SNR) was assumed.  Not surprisingly, the simulation results closely matched the maximum throughput achievable for a given signal to noise ratio, based on the ideal BER vs. SNR curve.

In "Hybrid Rate Control for IEEE 802.11" [Haratcherev], the authors describe a hybrid technique utilizing Signal Strength Indication (SSI) data to constrain the potential rates selected by statistics-based automatic rate control.  Statistics-based rate control techniques include:

Maximum Throughput
   This technique, which was chosen as the statistics-based technique in the hybrid scheme, sends a fraction of data at adjacent rates in order to estimate which rate provides the maximum throughput.  Since accurate estimation of throughput requires a minimum number of frames to be sent at each rate, and only a fraction of frames are utilized for this purpose, this technique adapts more slowly at lower rates; with 802.11b rates, the adaptation time scale is typically on the order of a second.  Depending on how many rates are tested, this technique can enable adaptation beyond adjacent rates.  However, where the maximum rate and low frame loss are already being encountered, this technique results in lower throughput.

Frame Error Rate (FER) Control
   This technique estimates the FER, attempting to keep it between a lower limit (if the FER moves below it, increase the rate) and an upper limit (if the FER moves above it, decrease the rate).  Since this technique can utilize all the transmitted data, it can respond faster than maximum throughput techniques.  However, there is a tradeoff of reaction time versus FER estimation accuracy; at lower rates, either reaction times slow or FER estimation accuracy will suffer.  Since this technique only measures the FER at the current rate, it can only enable adaptation to adjacent rates.  (A sketch of this control loop appears below.)

Retry-based
   This technique modifies FER control techniques by enabling rapid downward rate adaptation after a number (5-10) of unsuccessful retransmissions.  Since fewer packets are required, the sensitivity of reaction time to rate is reduced.  However, upward rate adaptation proceeds more slowly since it is based on a collection of FER data.  This technique is limited to adaptation to adjacent rates, and it has the disadvantage of potentially worsening frame loss due to contention.
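For illustration, the FER-control technique can be expressed as a simple control loop.  The rate set, window size, and limits below are placeholder values chosen for this example, not parameters from [Haratcherev].

   # Illustrative sketch of a statistics-based FER-control loop: keep the
   # measured frame error rate between a lower and an upper limit, moving
   # only between adjacent rates.  The window size and limits are
   # placeholder values, not parameters taken from [Haratcherev].

   RATES_MBPS = [1, 2, 5.5, 11]          # 802.11b rate set
   FER_LOWER, FER_UPPER = 0.10, 0.30     # example limits
   WINDOW = 50                           # frames per FER estimate

   class FERControl:
       def __init__(self):
           self.rate_index = 0
           self.sent = 0
           self.lost = 0

       def current_rate(self):
           return RATES_MBPS[self.rate_index]

       def on_tx_result(self, acked):
           self.sent += 1
           self.lost += 0 if acked else 1
           if self.sent < WINDOW:
               return
           fer = self.lost / self.sent
           if fer > FER_UPPER and self.rate_index > 0:
               self.rate_index -= 1          # too many losses: step down one rate
           elif fer < FER_LOWER and self.rate_index < len(RATES_MBPS) - 1:
               self.rate_index += 1          # losses are rare: step up one rate
           self.sent = self.lost = 0         # start a new estimation window

Because a full window must be observed at each rate, the reaction time of this loop grows as the rate drops, which is the tradeoff noted above.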
While statistics-based techniques are robust against short-lived link quality changes, they do not respond quickly to long-lived changes.  By constraining the rate selected by statistics-based techniques based on ACK SSI versus rate data (not theoretical curves), more rapid link adaptation was enabled.  In order to ensure rapid adaptation during rapidly varying conditions, the rate constraints are tightened when the SSI values are changing rapidly, encouraging rate transitions.  The authors validated their algorithms by implementing a driver for the Atheros AR5000 chipset, and then testing its response to insertion into and removal from a microwave oven acting as a Faraday cage.  The hybrid algorithm dropped many fewer packets than the maximum throughput technique by itself.

In order to estimate the SSI of data at the receiver, the ACK SSI was used.  This approach does not require the receiver to provide the sender with the received power, so that it can be implemented without changing the IEEE 802.11 MAC.  Calibration of the rate versus ACK SSI curves does not require a symmetric channel, but it does require that channel properties in both directions vary in a proportional way and that the ACK transmit power remain constant.  The authors checked the proportionality assumption and found that the SSI of received data correlated highly (74 percent) with the SSI of received ACKs.  Low-pass filtering and monotonicity constraints were applied to remove noise in the rate versus SSI curves.  The resulting hybrid rate adaptation algorithm demonstrated the ability to respond to rapid deterioration (and improvement) in channel properties, since it is not restricted to moving to adjacent rates.

In "CARA: Collision-Aware Rate Adaptation for IEEE 802.11 WLANs" [CARA], the authors propose Collision-Aware Rate Adaptation (CARA).  This involves utilization of Clear Channel Assessment (CCA) along with adaptation of the Request-to-Send/Clear-to-Send (RTS/CTS) mechanism to differentiate losses caused by frame collisions from losses caused by channel conditions.  Rather than decreasing the rate as the result of frame loss due to collisions, which leads to increased contention, CARA selectively enables RTS/CTS (e.g., after a frame loss), reducing the likelihood of frame loss due to hidden stations.  CARA can also utilize CCA to determine whether a collision has occurred after a transmission; however, since CCA may not detect a significant fraction of all collisions (particularly when transmitting at a low rate), its use is optional.  As compared with ARF, in simulations the authors show large improvements in aggregate throughput due to the addition of adaptive RTS/CTS, and additional modest improvements with the additional help of CCA.

In "Robust Rate Adaptation for 802.11 Wireless Networks" [Robust], the authors implemented the ARF, AARF, and SampleRate [SampleRate] algorithms on a programmable Access Point platform, and
experimentally examined the performance of these algorithms as well as the ONOE [ONOE] algorithm implemented in MadWiFi.  Based on their experiments, the authors critically examine the assumptions underlying existing rate negotiation algorithms:

Decrease transmission rate upon severe frame loss
   Where severe frame loss is due to channel conditions, rate reduction can improve throughput.  However, where frame loss is due to contention (such as from hidden stations), reducing the transmission rate increases congestion, lowering throughput and potentially leading to congestive collapse.  Instead, the authors propose adaptive enabling of RTS/CTS so as to reduce contention due to hidden stations.  Once RTS/CTS is enabled, remaining losses are more likely to be due to channel conditions, providing more reliable guidance on increasing or decreasing the transmission rate.

Use probe frames to assess possible new rates
   Probe frames reliably estimate frame loss at a given rate only if the sample size is sufficient and the probe frames are of comparable length to data frames.  The authors argue that rate adaptation schemes such as SampleRate are too sensitive to loss of probe packets.  In order to satisfy sample size constraints, a significant number of probe frames are required.  This can increase frame loss if the probed rate is too high, and can lower throughput if the probed rate is too low.  Instead, the authors propose assessment of the channel condition by tracking the frame loss ratio within a window of 5 to 40 frames.

Use consecutive transmission successes/losses to increase/decrease rate
   The authors argue that consecutive successes or losses are not a reliable basis for rate increases or decreases; a greater sample size is needed.

Use PHY metrics like SNR to infer new transmission rate
   The authors argue that the received signal to noise ratio (SNR) routinely varies 5 dB per packet and that variations of 10-14 dB are common.  As a result, rate decisions based on SNR or signal strength can cause the transmission rate to vary rapidly.  The authors question the value of such rapid variation, since studies such as [Aguayo] show little correlation between SNR and frame loss probability.  As a result, the authors argue that neither the received signal strength indication (RSSI) nor the background energy level can be used to distinguish losses due to contention from those due to channel conditions.  While multi-path interference can simultaneously result in high signal strength and frame loss, the relationship between low signal
   strength and high frame loss is stronger.  Therefore, transmission rate decreases due to low received signal strength probably do reflect a sudden worsening in channel conditions, although sudden increases may not necessarily indicate that channel conditions have improved.

Long-term smoothened operation produces best average performance
   The authors present evidence that frame losses more than 150 ms apart are uncorrelated.  Therefore, collection of statistical data over intervals of 1 second or greater reduces responsiveness, but does not improve the quality of transmission rate decisions.  Rather, the authors argue that a sampling period of 100 ms provides the best average performance.  Such small sampling periods also argue against the use of probes, since probe packets can only represent a fraction of all data frames, and probes collected more than 150 ms apart may not provide reliable information on channel conditions.

Based on these flaws, the authors propose the Robust Rate Adaptation Algorithm (RRAA).  RRAA utilizes only the frame loss ratio at the current transmission rate to determine whether to increase or decrease the transmission rate; PHY layer information and probe packets are not used.  Each transmission rate is associated with an estimation window, a maximum tolerable loss threshold (MTL), and an opportunistic rate increase threshold (ORI).  If the loss ratio is larger than the MTL, the transmission rate is decreased, and if it is smaller than the ORI, the transmission rate is increased; otherwise, the transmission rate remains the same.  The thresholds are selected in order to maximize throughput.  Although RRAA only allows movement between adjacent transmission rates, the algorithm does not require collection of an entire estimation window prior to increasing or decreasing transmission rates; if additional data collection would not change the decision, the change is made immediately.

The authors validate the RRAA algorithm using experiments and field trials; the results indicate that RRAA without adaptive RTS/CTS outperforms the ARF, AARF, and SampleRate algorithms.  This occurs because RRAA is not as sensitive to transient frame loss and does not use probing, enabling it to more frequently utilize higher transmission rates.  Where there are no hidden stations, turning on adaptive RTS/CTS reduces performance by at most a few percent.  However, where there is substantial contention from hidden stations, adaptive RTS/CTS provides large performance gains, due to the reduction in frame loss that enables selection of a higher transmission rate.
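The core RRAA decision rule can be sketched as follows.  The per-rate windows and thresholds shown are placeholders chosen for illustration; [Robust] derives them per rate so as to maximize throughput.

   # Illustrative sketch of the RRAA decision rule from [Robust]: compare
   # the frame loss ratio over an estimation window against a per-rate
   # Maximum Tolerable Loss (MTL) and Opportunistic Rate Increase (ORI)
   # threshold, moving only between adjacent rates.  Windows and
   # thresholds below are placeholders, not values from [Robust].

   RATES = [
       # (rate in Mbps, estimation window, MTL, ORI) -- placeholder values
       (6,  20, 0.50, 0.10),
       (12, 20, 0.40, 0.10),
       (24, 40, 0.35, 0.08),
       (48, 40, 0.30, 0.05),
   ]

   class RRAA:
       def __init__(self):
           self.i = 0                 # index of the current rate
           self.sent = 0
           self.lost = 0

       def current_rate(self):
           return RATES[self.i][0]

       def on_tx_result(self, acked):
           _, window, mtl, ori = RATES[self.i]
           self.sent += 1
           if not acked:
               self.lost += 1
           remaining = window - self.sent
           if self.lost > mtl * window:
               # Even if every remaining frame succeeded, the window's loss
               # ratio would exceed the MTL: step down immediately.
               self._move(-1)
           elif self.lost + remaining < ori * window:
               # Even if every remaining frame were lost, the loss ratio
               # would stay below the ORI: step up immediately.
               self._move(+1)
           elif self.sent >= window:
               loss_ratio = self.lost / self.sent
               if loss_ratio > mtl:
                   self._move(-1)
               elif loss_ratio < ori:
                   self._move(+1)
               else:
                   # Between ORI and MTL: keep the rate, start a new window.
                   self.sent = self.lost = 0

       def _move(self, step):
           self.i = max(0, min(len(RATES) - 1, self.i + step))
           self.sent = self.lost = 0

The two early-exit branches capture the property noted above: the rate changes as soon as further samples could not alter the decision, without waiting for the full estimation window.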
In "Efficient Mobility Management for Vertical Handoff between WWAN and WLAN" [Vertical], the authors propose use of signal strength and link utilization in order to optimize vertical handoff.  WLAN to WWAN handoff is driven by SSI decay.  When the IEEE 802.11 SSI falls below a threshold (S1), Fast Fourier Transform (FFT)-based decay detection is undertaken to determine if the signal is likely to continue to decay.  If so, then handoff to the WWAN is initiated when the signal falls below the minimum acceptable level (S2).  WWAN to WLAN handoff is driven by both the PHY and MAC characteristics of the IEEE 802.11 target network.  At the PHY layer, characteristics such as the SSI are examined to determine if the signal strength is greater than a minimum value (S3).  At the MAC layer, the IEEE 802.11 Network Allocation Vector (NAV) occupation is examined in order to estimate the maximum available bandwidth and mean access delay.  Note that depending on the value of S3, it is possible for the negotiated rate to be less than the available bandwidth.  In order to prevent premature handoff between WLAN and WWAN, S1 and S2 are separated by 6 dB; in order to prevent oscillation between WLAN and WWAN media, S3 needs to be greater than S1 by an appropriate margin.

A.2. Internet Layer
Within the Internet layer, proposals have been made for utilizing link indications to optimize IP configuration, to improve the usefulness of routing metrics, and to optimize aspects of Mobile IP handoff.

In "Analysis of link failures in an IP backbone" [Iannaccone], the authors investigate link failures in Sprint's IP backbone.  They identify the causes of convergence delay, including delays in detection of whether an interface is down or up.  While it is fastest for a router to utilize link indications if available, there are situations in which it is necessary to depend on loss of routing packets to determine the state of the link.  Once the link state has been determined, a delay may occur within the routing protocol in order to dampen link flaps.  Finally, another delay may be introduced in propagating the link state change, in order to rate limit link state advertisements and guard against instability.

"Bidirectional Forwarding Detection" [BFD] notes that link layers may provide only limited failure indications, and that relatively slow "Hello" mechanisms are used in routing protocols to detect failures when no link layer indications are available.  This results in failure detection times on the order of a second, which is too long for some applications.  The authors describe a mechanism that can be used for liveness detection over any media, enabling rapid detection of failures in the path between adjacent forwarding engines.  A path is declared operational when bidirectional reachability has been confirmed.
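The liveness-detection pattern that [BFD] describes (a path is declared down when hellos stop arriving within the detection time, and operational only once bidirectional reachability is confirmed) might be sketched as follows.  This is a simplification with invented timer values and a minimal state model, not the BFD state machine or packet format.

   # Illustrative sketch of hello-based liveness detection in the spirit
   # of [BFD]: declare the path down when no hello arrives within the
   # detection time, and up only after bidirectional reachability has
   # been confirmed.  Values and interfaces are assumptions of this
   # example, not the BFD protocol.

   import time

   class LivenessDetector:
       def __init__(self, tx_interval=0.05, detect_mult=3):
           self.tx_interval = tx_interval      # seconds between hellos we send
           self.detect_mult = detect_mult      # missed-hello multiplier
           self.last_rx = None                 # time of last hello from the peer
           self.up = False

       def on_hello_received(self, peer_reports_reception):
           self.last_rx = time.monotonic()
           if peer_reports_reception:
               self.up = True                  # bidirectional reachability confirmed

       def poll(self):
           """Call periodically; returns True while the path is considered up."""
           if self.last_rx is None:
               return False
           if time.monotonic() - self.last_rx > self.detect_mult * self.tx_interval:
               self.up = False                 # detection time expired
           return self.up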
In "Detecting Network Attachment (DNA) in IPv4" [RFC4436], a host that has moved to a new point of attachment utilizes a bidirectional reachability test in parallel with DHCP [RFC2131] to rapidly reconfirm an operable configuration. In "L2 Triggers Optimized Mobile IPv6 Vertical Handover: The 802.11/GPRS Example" [Park], the authors propose that the mobile node send a router solicitation on receipt of a "Link Up" indication in order to provide lower handoff latency than would be possible using generic movement detection [RFC3775]. The authors also suggest immediate invalidation of the Care-of Address (CoA) on receipt of a "Link Down" indication. However, this is problematic where a "Link Down" indication can be followed by a "Link Up" indication without a resulting change in IP configuration, as described in [RFC4436]. In "Layer 2 Handoff for Mobile-IPv4 with 802.11" [Mun], the authors suggest that MIPv4 Registration messages be carried within Information Elements of IEEE 802.11 Association/Reassociation frames, in order to minimize handoff delays. This requires modification to the mobile node as well as 802.11 APs. However, prior to detecting network attachment, it is difficult for the mobile node to determine whether or not the new point of attachment represents a change of network. For example, even where a station remains within the same ESS, it is possible that the network will change. Where no change of network results, sending a MIPv4 Registration message with each Association/Reassociation is unnecessary. Where a change of network results, it is typically not possible for the mobile node to anticipate its new CoA at Association/Reassociation; for example, a DHCP server may assign a CoA not previously given to the mobile node. When dynamic VLAN assignment is used, the VLAN assignment is not even determined until IEEE 802.1X authentication has completed, which is after Association/Reassociation in [IEEE-802.11i]. In "Link Characteristics Information for Mobile IP" [Lee], link characteristics are included in registration/Binding Update messages sent by the mobile node to the home agent and correspondent node. Where the mobile node is acting as a receiver, this allows the correspondent node to adjust its transport parameters window more rapidly than might otherwise be possible. Link characteristics that may be communicated include the link type (e.g., 802.11b, CDMA (Code Division Multiple Access), GPRS (General Packet Radio Service), etc.) and link bandwidth. While the document suggests that the correspondent node should adjust its sending rate based on the advertised link bandwidth, this may not be wise in some circumstances. For example, where the mobile node link is not the bottleneck, adjusting the sending rate based on the link bandwidth could cause congestion. Also, where the transmission rate changes frequently, sending registration messages on each transmission rate
change could by itself consume significant bandwidth.  Even where the advertised link characteristics indicate the need for a smaller congestion window, it may be non-trivial to adjust the sending rates of individual connections where there are multiple connections open between a mobile node and a correspondent node.  A more conservative approach would be to trigger parameter re-estimation and slow start based on the receipt of a registration message or Binding Update.

In "Hotspot Mitigation Protocol (HMP)" [HMP], it is noted that Mobile Ad-hoc NETwork (MANET) routing protocols have a tendency to concentrate traffic, since they utilize shortest-path metrics and allow nodes to respond to route queries with cached routes.  The authors propose that nodes participating in an ad-hoc wireless mesh monitor local conditions such as MAC delay, buffer consumption, and packet loss.  Where congestion is detected, this is communicated to neighboring nodes via an IP option.  In response to moderate congestion, nodes suppress route requests; where major congestion is detected, nodes rate control transport connections flowing through them.  The authors argue that for ad-hoc networks, throttling by intermediate nodes is more effective than end-to-end congestion control mechanisms.

A.3. Transport Layer
Within the transport layer, proposals have focused on countering the effects of handoff-induced packet loss and non-congestive loss caused by lossy wireless links.

Where a mobile host moves to a new network, the transport parameters (including the RTT, RTO, and congestion window) may no longer be valid.  Where the path change occurs on the sender (e.g., a change in the outgoing or incoming interface), the sender can reset its congestion window and parameter estimates.  However, where it occurs on the receiver, the sender may not be aware of the path change.

In "The BU-trigger method for improving TCP performance over Mobile IPv6" [Kim], the authors note that handoff-related packet loss is interpreted as congestion by the transport layer.  In the case where the correspondent node is sending to the mobile node, it is proposed that receipt of a Binding Update by the correspondent node be used as a signal to the transport layer to adjust cwnd and ssthresh values, which may have been reduced due to handoff-induced packet loss.  The authors recommend that cwnd and ssthresh be recovered to their pre-timeout values, regardless of whether the link parameters have changed.  The paper does not discuss the behavior of a mobile node sending a Binding Update in the case where the mobile node is sending to the correspondent node.
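A minimal sketch of the BU-trigger idea follows, assuming a hypothetical hook between the Mobile IPv6 stack and a TCP connection's congestion state; [Kim] does not define such an API, and the snapshot/restore interface here is invented for illustration.

   # Minimal sketch of the BU-trigger idea from [Kim]: on receipt of a
   # Binding Update from the mobile node, the correspondent node restores
   # cwnd and ssthresh to their pre-timeout values instead of leaving them
   # at the reduced post-loss values.  The interface is an assumption of
   # this example.

   class TcpCongestionState:
       def __init__(self, cwnd, ssthresh):
           self.cwnd = cwnd                  # in segments
           self.ssthresh = ssthresh
           self._saved = None

       def on_retransmission_timeout(self):
           # Snapshot values before the standard timeout response shrinks them.
           self._saved = (self.cwnd, self.ssthresh)
           self.ssthresh = max(self.cwnd // 2, 2)
           self.cwnd = 1

       def on_binding_update_received(self):
           # [Kim]: recover cwnd/ssthresh to pre-timeout values on handoff,
           # regardless of whether the link parameters have changed.
           if self._saved is not None:
               self.cwnd, self.ssthresh = self._saved
               self._saved = None

By contrast, [Eddy] (discussed below) resets the congestion window to its initial value and re-estimates transport parameters on subnet change, rather than restoring prior values.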
In "Effect of Vertical Handovers on Performance of TCP-Friendly Rate Control" [Gurtov], the authors examine the effect of explicit handover notifications on TCP-friendly rate control (TFRC). Where explicit handover notification includes information on the loss rate and throughput of the new link, this can be used to instantaneously change the transmission rate of the sender. The authors also found that resetting the TFRC receiver state after handover enabled parameter estimates to adjust more quickly. In "Adapting End Host Congestion Control for Mobility" [Eddy], the authors note that while MIPv6 with route optimization allows a receiver to communicate a subnet change to the sender via a Binding Update, this is not available within MIPv4. To provide a communication vehicle that can be universally employed, the authors propose a TCP option that allows a connection endpoint to inform a peer of a subnet change. The document does not advocate utilization of "Link Up" or "Link Down" events since these events are not necessarily indicative of subnet change. On detection of subnet change, it is advocated that the congestion window be reset to INIT_WINDOW and that transport parameters be re-estimated. The authors argue that recovery from slow start results in higher throughput both when the subnet change results in lower bottleneck bandwidth as well as when bottleneck bandwidth increases. In "Efficient Mobility Management for Vertical Handoff between WWAN and WLAN" [Vertical], the authors propose a "Virtual Connectivity Manager", which utilizes local connection translation (LCT) and a subscription/notification service supporting simultaneous movement in order to enable end-to-end mobility and maintain TCP throughput during vertical handovers. In an early version of "Datagram Congestion Control Protocol (DCCP)" [RFC4340], a "Reset Congestion State" option was proposed in Section 11. This option was removed in part because the use conditions were not fully understood: An HC-Receiver sends the Reset Congestion State option to its sender to force the sender to reset its congestion state -- that is, to "slow start", as if the connection were beginning again. ... The Reset Congestion State option is reserved for the very few cases when an endpoint knows that the congestion properties of a path have changed. Currently, this reduces to mobility: a DCCP endpoint on a mobile host MUST send Reset Congestion State to its peer after the mobile host changes address or path.
"Framework and Requirements for TRIGTRAN" [TRIGTRAN] discusses optimizations to recover earlier from a retransmission timeout incurred during a period in which an interface or intervening link was down. "End-to-end, Implicit 'Link-Up' Notification" [E2ELinkup] describes methods by which a TCP implementation that has backed off its retransmission timer due to frame loss on a remote link can learn that the link has once again become operational. This enables retransmission to be attempted prior to expiration of the backed-off retransmission timer. "Link-layer Triggers Protocol" [Yegin] describes transport issues arising from lack of host awareness of link conditions on downstream Access Points and routers. Transport of link layer triggers is proposed to address the issue. "TCP Extensions for Immediate Retransmissions" [Eggert] describes how a transport layer implementation may utilize existing "end-to-end connectivity restored" indications. It is proposed that in addition to regularly scheduled retransmissions that retransmission be attempted by the transport layer on receipt of an indication that connectivity to a peer node may have been restored. End-to-end connectivity restoration indications include "Link Up", confirmation of first-hop router reachability, confirmation of Internet layer configuration, and receipt of other traffic from the peer. In "Discriminating Congestion Losses from Wireless Losses Using Interarrival Times at the Receiver" [Biaz], the authors propose a scheme for differentiating congestive losses from wireless transmission losses based on inter-arrival times. Where the loss is due to wireless transmission rather than congestion, congestive backoff and cwnd adjustment is omitted. However, the scheme appears to assume equal spacing between packets, which is not realistic in an environment exhibiting link layer frame loss. The scheme is shown to function well only when the wireless link is the bottleneck, which is often the case with cellular networks, but not with IEEE 802.11 deployment scenarios such as home or hotspot use. In "Improving Performance of TCP over Wireless Networks" [Bakshi], the authors focus on the performance of TCP over wireless networks with burst losses. The authors simulate performance of TCP Tahoe within ns-2, utilizing a two-state Markov model, representing "good" and "bad" states. Where the receiver is connected over a wireless link, the authors simulate the effect of an Explicit Bad State Notification (EBSN) sent by an Access Point unable to reach the receiver. In response to an EBSN, it is advocated that the existing retransmission timer be canceled and replaced by a new dynamically
estimated timeout, rather than being backed off.  In the simulations, EBSN prevents unnecessary timeouts, decreasing RTT variance and improving throughput.

In "A Feedback-Based Scheme for Improving TCP Performance in Ad-Hoc Wireless Networks" [Chandran], the authors propose an explicit Route Failure Notification (RFN), allowing the sender to stop its retransmission timers when the receiver becomes unreachable.  On route reestablishment, a Route Reestablishment Notification (RRN) is sent, unfreezing the timer.  Simulations indicate that the scheme significantly improves throughput and reduces unnecessary retransmissions.

In "Analysis of TCP Performance over Mobile Ad Hoc Networks" [Holland], the authors explore how explicit link failure notification (ELFN) can improve the performance of TCP in mobile ad hoc networks.  ELFN informs the TCP sender about link and route failures so that it need not treat the ensuing packet loss as due to congestion.  Using an ns-2 simulation of TCP Reno over 802.11, with routing provided by the Dynamic Source Routing (DSR) protocol, it is demonstrated that TCP performance falls considerably short of the expected throughput based on the percentage of the time that the network is partitioned.  A portion of the problem was attributed to the inability of the routing protocol to quickly recognize and purge stale routes, leading to excessive link failures; performance improved dramatically when route caching was turned off.  Interactions between the route request and transport retransmission timers were also noted.  Where the route request timer is too large, new routes cannot be supplied in time to prevent the transport timer from expiring, and where the route request timer is too small, network congestion may result.

For their implementation of ELFN, the authors piggybacked additional information (the sender and receiver addresses and ports, and the TCP sequence number) on an existing "route failure" notice to enable the sender to identify the affected connection.  Where a TCP sender receives an ELFN, it disables the retransmission timer and enters a "stand-by" mode, in which packets are sent at periodic intervals to determine if the route has been reestablished.  If an acknowledgment is received, the retransmission timers are restored.  Simulations show that performance is sensitive to the probe interval, with intervals of 30 seconds or greater giving worse performance than TCP Reno.  The effect of resetting the congestion window and RTO values was also investigated.  In the study, resetting the congestion window to one did not have much of an effect on throughput, since the bandwidth-delay product of the network was only a few packets.  However, resetting the RTO to a high initial value (6 seconds) did have a substantial detrimental effect, particularly at high speed.  In terms of the probe packet sent, the simulations showed little difference
between sending the first packet in the congestion window and retransmitting the packet with the lowest sequence number among those signaled as lost via the ELFNs.

In "Improving TCP Performance over Wireless Links" [Goel], the authors propose use of an ICMP-DEFER message, sent by a wireless Access Point on failure of a transmission attempt.  After exhaustion of retransmission attempts, an ICMP-RETRANSMIT message is sent.  On receipt of an ICMP-DEFER message, the expiry of the retransmission timer is postponed by the current RTO estimate.  On receipt of an ICMP-RETRANSMIT message, the segment is retransmitted.  On retransmission, the congestion window is not reduced; when coming out of fast recovery, the congestion window is reset to its value prior to fast retransmission and fast recovery.  Using a two-state Markov model simulated within ns-2, the authors show that the scheme improves throughput.

In "Explicit Transport Error Notification (ETEN) for Error-Prone Wireless and Satellite Networks" [Krishnan], the authors examine the use of explicit transport error notification (ETEN) to aid TCP in distinguishing congestive losses from those due to corruption.  Both per-packet and cumulative ETEN mechanisms were simulated in ns-2, using both TCP Reno and TCP SACK, over a wide range of bit error rates and traffic conditions.  While per-packet ETEN mechanisms provided substantial gains in TCP goodput in the absence of congestion, where congestion was also present, the gains were not significant.  Cumulative ETEN mechanisms did not perform as well in the study.  The authors point out that ETEN faces significant deployment barriers, since it can create new security vulnerabilities and requires implementations to obtain reliable information from the headers of corrupt packets.

In "Towards More Expressive Transport-Layer Interfaces" [Eggert2], the authors propose extensions to existing network/transport and transport/application interfaces to improve the performance of the transport layer in the face of changes in path characteristics that vary more quickly than the round-trip time.

In "Protocol Enhancements for Intermittently Connected Hosts" [Schuetz], the authors note that intermittent connectivity can lead to poor performance and connectivity failures.  To address these problems, the authors combine the use of the Host Identity Protocol (HIP) [RFC4423] with a TCP User Timeout Option and a TCP Retransmission trigger, demonstrating significant improvement.
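For illustration, the sender-side handling proposed in [Goel] might look as follows.  The timer and segment interfaces assumed here are invented for this example; ICMP-DEFER and ICMP-RETRANSMIT are the messages proposed in [Goel], not standard ICMP types.

   # Illustrative sketch of the sender-side handling proposed in [Goel]:
   # an ICMP-DEFER from the Access Point postpones the retransmission
   # timer by the current RTO estimate, and an ICMP-RETRANSMIT triggers
   # immediate retransmission without reducing the congestion window.
   # The timer and segment interfaces are assumptions of this example.

   class DeferAwareSender:
       def __init__(self, rto, retransmit_segment, timer):
           self.rto = rto                          # current RTO estimate (seconds)
           self.retransmit_segment = retransmit_segment
           self.timer = timer                      # object with a postpone(seconds) method

       def on_icmp_defer(self):
           # The AP is still retrying: push out the retransmission timer
           # instead of letting it expire and back off.
           self.timer.postpone(self.rto)

       def on_icmp_retransmit(self, segment):
           # The AP has exhausted its retransmission attempts: resend now,
           # leaving the congestion window unchanged.
           self.retransmit_segment(segment)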
A.4. Application Layer
In "Application-oriented Link Adaptation for IEEE 802.11" [Haratcherev2], rate information generated by a link layer utilizing improved rate adaptation algorithms is provided to a video application, and used for codec adaptation. Coupling the link and application layers results in major improvements in the Peak Signal to Noise Ratio (PSNR). Since this approach assumes that the link represents the path bottleneck bandwidth, it is not universally applicable to use over the Internet. At the application layer, the usage of "Link Down" indications has been proposed to augment presence systems. In such systems, client devices periodically refresh their presence state using application layer protocols such as SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE) [RFC3428] or Extensible Messaging and Presence Protocol (XMPP) [RFC3921]. If the client should become disconnected, their unavailability will not be detected until the presence status times out, which can take many minutes. However, if a link goes down, and a disconnect indication can be sent to the presence server (presumably by the Access Point, which remains connected), the status of the user's communication application can be updated nearly instantaneously.Appendix B. IAB Members at the Time of This Writing
   Bernard Aboba
   Loa Andersson
   Brian Carpenter
   Leslie Daigle
   Elwyn Davies
   Kevin Fall
   Olaf Kolkman
   Kurtis Lindqvist
   David Meyer
   David Oran
   Eric Rescorla
   Dave Thaler
   Lixia Zhang
Author's Address
   Bernard Aboba, Ed.
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052

   EMail: bernarda@microsoft.com
   Phone: +1 425 706 6605
   Fax: +1 425 936 7329

   IAB EMail: iab@iab.org
   URI: http://www.iab.org/