Internet Engineering Task Force (IETF)                        C. Perkins
Request for Comments: 8083                         University of Glasgow
Updates: 3550                                                   V. Singh
Category: Standards Track                                   callstats.io
ISSN: 2070-1721                                               March 2017

Multimedia Congestion Control: Circuit Breakers for Unicast RTP Sessions

Abstract
The Real-time Transport Protocol (RTP) is widely used in telephony, video conferencing, and telepresence applications. Such applications are often run on best-effort UDP/IP networks. If congestion control is not implemented in these applications, then network congestion can lead to uncontrolled packet loss and a resulting deterioration of the user's multimedia experience. The congestion control algorithm acts as a safety measure by stopping RTP flows from using excessive resources and protecting the network from overload. At the time of this writing, however, while there are several proprietary solutions, there is no standard algorithm for congestion control of interactive RTP flows.

This document does not propose a congestion control algorithm. It instead defines a minimal set of RTP circuit breakers: conditions under which an RTP sender needs to stop transmitting media data to protect the network from excessive congestion. It is expected that, in the absence of long-lived excessive congestion, RTP applications running on best-effort IP networks will be able to operate without triggering these circuit breakers. To avoid triggering the RTP circuit breaker, any Standards Track congestion control algorithms defined for RTP will need to operate within the envelope set by these RTP circuit breaker algorithms.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc8083.
Copyright Notice

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Background
3. Terminology
4. RTP Circuit Breakers for Systems Using the RTP/AVP Profile
   4.1. RTP/AVP Circuit Breaker #1: RTCP Timeout
   4.2. RTP/AVP Circuit Breaker #2: Media Timeout
   4.3. RTP/AVP Circuit Breaker #3: Congestion
   4.4. RTP/AVP Circuit Breaker #4: Media Usability
   4.5. Ceasing Transmission
5. RTP Circuit Breakers and the RTP/AVPF and RTP/SAVPF Profiles
6. Impact of RTCP Extended Reports (XR)
7. Impact of Explicit Congestion Notification (ECN)
8. Impact of Bundled Media and Layered Coding
9. Security Considerations
10. References
   10.1. Normative References
   10.2. Informative References
Acknowledgements
Authors' Addresses
1. Introduction
The Real-time Transport Protocol (RTP) [RFC3550] is widely used in voice-over-IP, video teleconferencing, and telepresence systems. Many of these systems run over best-effort UDP/IP networks and can suffer from packet loss and increased latency if network congestion occurs. Designing effective RTP congestion control algorithms to adapt the transmission of RTP-based media to match the available network capacity while also maintaining the user experience is a difficult but important problem. Many such congestion control and media adaptation algorithms have been proposed, but to date there is no consensus on the correct approach or even that a single standard algorithm is desirable.

This memo does not attempt to propose a new RTP congestion control algorithm. Instead, we propose a small set of RTP circuit breakers: mechanisms that terminate RTP flows in conditions under which there is general agreement that serious network congestion is occurring. The RTP circuit breakers proposed in this memo are a specific instance of the general class of network transport circuit breakers [RFC8084] designed to act as a protection mechanism of last resort to avoid persistent excessive congestion. To avoid triggering the RTP circuit breaker, any Standards Track congestion control algorithms defined for RTP will need to operate within the envelope set by the RTP circuit breaker algorithms defined by this memo.

2. Background
We consider congestion control for unicast RTP traffic flows. This is the problem of adapting the transmission of an audio/visual data flow, encapsulated within an RTP transport session, from one sender to one receiver so that it does not use more capacity than is available along the network path. Such adaptation needs to be done in a way that limits the disruption to the user experience caused by both packet loss and excessive rate changes. Congestion control for multicast flows is outside the scope of this memo. Multicast traffic needs different solutions since the available capacity estimator for a group of receivers will differ from that for a single receiver, and because multicast congestion control has to consider issues of fairness across groups of receivers that do not apply to unicast flows. Congestion control for unicast RTP traffic can be implemented in one of two places in the protocol stack. One approach is to run the RTP traffic over a congestion-controlled transport protocol (for example, over TCP), and to adapt the media encoding to match the dictates of the transport-layer congestion control algorithm. This is safe for the network but can be suboptimal for the media quality unless the
transport protocol is designed to support real-time media flows. We do not consider this class of applications further in this memo, as their network safety is guaranteed by the underlying transport. Alternatively, RTP flows can be run over a non-congestion-controlled transport protocol (for example, UDP) performing rate adaptation at the application layer based on RTP Control Protocol (RTCP) feedback. With a well-designed, network-aware application, this allows highly effective media quality adaptation, but there is potential to cause persistent congestion in the network if the application does not adapt its sending rate in a timely and effective manner. We consider this class of applications in this memo. Congestion control relies on monitoring the delivery of a media flow and responding to adapt the transmission of that flow when there are signs that the network path is congested. Network congestion can be detected in one of three ways: 1) a receiver can infer the onset of congestion by observing an increase in one-way delay caused by queue build-up within the network; 2) if Explicit Congestion Notification (ECN) [RFC3168] is supported, the network can signal the presence of congestion by marking packets using ECN Congestion Experienced (CE) marks (this could potentially be augmented by mechanisms such as Congestion Exposure (ConEx) [RFC7713] or other future protocol extensions for network signaling of congestion); or 3) in the extreme case, congestion will cause packet loss that can be detected by observing a gap in the received RTP sequence numbers. Once the onset of congestion is observed, the receiver has to send feedback to the sender to indicate that the transmission rate needs to be reduced. How the sender reduces the transmission rate is highly dependent on the media codec being used and is outside the scope of this memo. 
There are several ways in which a receiver can send feedback to a media sender within the RTP framework: o The base RTP specification [RFC3550] defines RTCP Receiver Report (RR) packets to convey reception quality feedback information and Sender Report (SR) packets to convey information about the media transmission. RTCP SR packets contain data that can be used to reconstruct media timing at a receiver along with a count of the total number of octets and packets sent. RTCP RR packets report
on the fraction of packets lost in the last reporting interval, the cumulative number of packets lost, the highest sequence number received, and the inter-arrival jitter. The RTCP RR packets also contain timing information that allows the sender to estimate the network Round-Trip Time (RTT) to the receivers. RTCP reports are sent periodically, with the reporting interval being determined by the number of Synchronization Sources (SSRCs) used in the session and a configured session bandwidth estimate (the number of SSRCs used is usually two in a unicast session, one for each participant, but can be greater if the participants send multiple media streams). The interval between reports sent from each receiver is on the order of a few seconds on average. It varies with the session bandwidth and is randomized to avoid synchronization of reports from multiple receivers. The interval can be less than a second in a high-bandwidth session. RTCP RR packets allow a receiver to report ongoing network congestion to the sender. However, if a receiver detects the onset of congestion part way through a reporting interval, the base RTP specification contains no provision for sending the RTCP RR packet early, and the receiver has to wait until the next scheduled reporting interval.

o The RTCP Extended Reports (XR) [RFC3611] allow reporting of more complex and sophisticated reception quality metrics but do not change the RTCP timing rules. RTCP extended reports of potential interest for congestion control purposes are the extended packet loss, discard, and burst metrics [RFC3611] [RFC7002] [RFC7097] [RFC7003] [RFC6958] as well as the extended delay metrics [RFC6843] [RFC6798]. Other RTCP Extended Reports that could be helpful for congestion control purposes might be developed in the future.
o Rapid feedback about the occurrence of congestion events can be achieved using the Extended RTP Profile for RTCP-Based Feedback (RTP/AVPF) [RFC4585] (or its secure variant, RTP/SAVPF [RFC5124]) in place of the RTP/AVP profile [RFC3551]. This modifies the RTCP timing rules to allow RTCP reports to be sent early, in some cases immediately, provided the RTCP transmission rate keeps within its bandwidth allocation. It also defines transport-layer feedback messages, including Negative Acknowledgements (NACKs), that can be used to report on specific congestion events. RTP Codec Control Messages [RFC5104] extend the RTP/AVPF profile with additional feedback messages that can be used to influence the way in which rate adaptation occurs but do not further change the dynamics of how rapidly feedback can be sent. Use of the RTP/AVPF profile is dependent on signaling.
o Finally, ECN for RTP over UDP [RFC6679] can be used to provide feedback on the number of packets that received an ECN-CE mark. This RTCP extension builds on the RTP/AVPF profile to allow rapid congestion feedback when ECN is supported.

In addition to these mechanisms for providing feedback, the sender can include an RTP header extension in each packet to record packet transmission times [RFC5450]. Accurate transmission timestamps can be helpful for estimating queuing delays to get an early indication of the onset of congestion.

Taken together, these various mechanisms allow receivers to provide feedback to the senders when congestion events occur, with varying degrees of timeliness and accuracy. The key distinction is between systems that use only the basic RTCP mechanisms, without RTP/AVPF rapid feedback, and those that use the RTP/AVPF extensions to respond to congestion more rapidly.

3. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. This interpretation of these key words applies only when written in ALL CAPS. Mixed- or lower-case uses of these key words are not to be interpreted as carrying special significance in this memo. The definition of the RTP circuit breaker is specified in terms of the following variables: o Td is the deterministic RTCP reporting interval, as defined in Section 6.3.1 of [RFC3550]. o Tdr is the sender's estimate of the deterministic RTCP reporting interval, Td, calculated by a receiver of the data it is sending. Tdr is not known at the sender but can be estimated by executing the algorithm in Section 6.2 of [RFC3550] using the average RTCP packet size seen at the sender, the number of members reported in the receiver's SR/RR report blocks, and whether the receiver is sending SR or RR packets. Tdr is recalculated when each new RTCP SR/RR report is received, but the media timeout circuit breaker (see Section 4.2) is only reconsidered when Tdr increases.
o Tr is the network round-trip time, which is calculated by the sender using the algorithm in Section 6.4.1 of [RFC3550] and is smoothed using an exponentially weighted moving average as Tr = (0.8 * Tr) + (0.2 * Tr_new) where Tr_new is the latest RTT estimate obtained from an RTCP report. The weight is chosen so old estimates decay over k intervals. o k is the non-reporting threshold (see Section 4.2). o Tf is the media framing interval at the sender. For applications sending at a constant frame rate, Tf is the inter-frame interval. For applications that switch between a small set of possible frame rates (for example, when sending speech with comfort noise, such that comfort noise frames are sent less often than speech frames), Tf is set to the longest of the inter-frame intervals of the different frame rates. For applications that send periodic frames but dynamically vary their frame rate, Tf is set to the largest inter-frame interval used in the last 10 seconds. For applications that send less than one frame every 10 seconds, or that have no concept of periodic frames (e.g., text conversation [RFC4103], or pointer events [RFC2862]), when each frame is sent, Tf is set to the time interval since the previous frame. o G is the frame group size. That is, the number of frames that are coded together based on a particular sending rate setting. If the codec used by the sender can change its rate on each frame, then G = 1; otherwise, G is set to the number of frames before the codec can adjust to the new rate. For codecs that have the concept of a Group of Pictures (GOP), G is likely the GOP length. o T_rr_interval is the minimal interval between RTCP reports, as defined in Section 3.4 of [RFC4585]; it is only meaningful for implementations of RTP/AVPF profile [RFC4585] or the RTP/SAVPF profile [RFC5124]. o X is the estimated throughput a TCP connection would achieve over a path, in bytes per second. o s is the size of RTP packets being sent, in bytes. 
If the RTP packets being sent vary in size, then the average size over the packets comprising the last 4 * G frames MUST be used (this is intended to be comparable to the four loss intervals used in [RFC5348]). o p is the loss event rate, between 0.0 and 1.0, that would be seen by a TCP connection over a particular path. When used in the RTP congestion circuit breaker, this is approximated as described in Section 4.3.
o t_RTO is the retransmission timeout value that would be used by a TCP connection over a particular path, in seconds. This MUST be approximated using t_RTO = 4 * Tr when used as part of the RTP congestion circuit breaker.

o b is the number of packets that are acknowledged by a single TCP acknowledgement. Following [RFC5348], it is RECOMMENDED that the value b = 1 is used as part of the RTP congestion circuit breaker.

4. RTP Circuit Breakers for Systems Using the RTP/AVP Profile
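For reference in the sections that follow, the smoothing of Tr and the t_RTO approximation defined in Section 3 can be sketched as below. This is only an illustrative sketch: the function names are hypothetical, and seeding the estimate with the first RTT sample is an assumption, since the text does not say how Tr is initialized.

```python
from typing import Optional


def smooth_rtt(tr: Optional[float], tr_new: float) -> float:
    """Apply Tr = (0.8 * Tr) + (0.2 * Tr_new) to one new RTT sample.

    Assumption: when no previous estimate exists (tr is None), the first
    sample is used directly as the initial value of Tr.
    """
    if tr is None:
        return tr_new
    return (0.8 * tr) + (0.2 * tr_new)


def approx_t_rto(tr: float) -> float:
    """Approximate the TCP retransmission timeout as t_RTO = 4 * Tr."""
    return 4.0 * tr


# Feed three RTT samples (in seconds) taken from successive RTCP reports.
tr = None
for sample in (0.100, 0.120, 0.300):
    tr = smooth_rtt(tr, sample)

t_rto = approx_t_rto(tr)
```

With these samples the single 300 ms outlier moves the estimate only part of the way toward the new value, which is the intended effect of the 0.8/0.2 weighting.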
The feedback mechanisms defined in [RFC3550] and available under the RTP/AVP profile [RFC3551] are the minimum that can be assumed for a baseline circuit breaker mechanism that is suitable for all unicast applications of RTP. Accordingly, for an RTP circuit breaker to be useful, it needs to be able to detect that an RTP flow is causing excessive congestion using only basic RTCP features without needing RTCP XR feedback or the RTP/AVPF profile for rapid RTCP reports. RTCP is a fundamental part of the RTP protocol, and the mechanisms described here rely on the implementation of RTCP. Implementations that claim to support RTP, but that do not implement RTCP, will be unable to use the circuit breaker mechanisms described in this memo. Such implementations SHOULD NOT be used on networks that might be subject to congestion unless equivalent mechanisms are defined using some non-RTCP feedback channel to report congestion and signal circuit breaker conditions. The RTCP timeout circuit breaker (Section 4.1) will trigger if an implementation of this memo attempts to interwork with an endpoint that does not support RTCP. Implementations that sometimes need to interwork with endpoints that do not support RTCP need to disable the RTP circuit breakers if they don't receive some confirmation via signaling that the remote endpoint implements RTCP (the presence of a Session Description Protocol (SDP) "a=rtcp:" attribute in an answer might be such an indication). The RTP circuit breaker SHOULD NOT be disabled on networks that might be subject to congestion unless equivalent mechanisms are defined using some non-RTCP feedback channel to report congestion and signal circuit breaker conditions [RFC8084]. Three potential congestion signals are available from the basic RTCP SR/RR packets and are reported for each SSRC in the RTP session: 1. The sender can estimate the network round-trip time once per RTCP reporting interval based on the contents and timing of RTCP SR and RR packets.
2. Receivers report a jitter estimate (the statistical variance of the RTP data packet inter-arrival time) calculated over the RTCP reporting interval. Due to the nature of the jitter calculation (Section 6.4.4. of [RFC3550]), the jitter is only meaningful for RTP flows that send a single data packet for each RTP timestamp value (i.e., audio flows, or video flows where each packet comprises one video frame). 3. Receivers report the fraction of RTP data packets lost during the RTCP reporting interval and the cumulative number of RTP packets lost over the entire RTP session. These congestion signals limit the possible circuit breakers since they give only limited visibility into the behavior of the network. RTT estimates are widely used in congestion control algorithms as a proxy for queuing delay measures in delay-based congestion control or to determine connection timeouts. RTT estimates derived from RTCP SR and RR packets sent according to the RTP/AVP timing rules are too infrequent to be useful for congestion control and don't give enough information to distinguish a delay change due to routing updates from queuing delay caused by congestion. Accordingly, we cannot use the RTT estimate alone as an RTP circuit breaker. Increased jitter can be a signal of transient network congestion, but in the highly aggregated form reported in RTCP RR packets, it offers insufficient information to estimate the extent or persistence of congestion. Jitter reports are a useful early warning of potential network congestion but provide an insufficiently strong signal to be used as a circuit breaker. The remaining congestion signals are the packet loss fraction and the cumulative number of packets lost. 
If considered carefully, and over an appropriate time frame to distinguish transient problems from long term issues [RFC8084], these can be effective indicators that persistent excessive congestion is occurring in networks where packet loss is primarily due to queue overflows, although loss caused by non-congestive packet corruption can distort the result in some networks. TCP congestion control [RFC5681] intentionally tries to fill the router queues and uses the resulting packet loss as congestion feedback. An RTP flow competing with TCP traffic will therefore expect to see a non-zero packet loss fraction, and some variation in queuing latency, in normal operation when sharing a path with other flows, which needs to be accounted for when determining the circuit breaker threshold [RFC8084]. This behavior of TCP is reflected in the congestion circuit breaker below and will affect the design of any RTP congestion control protocol.
Two packet loss regimes can be observed: 1) RTCP RR packets show a non-zero packet loss fraction while the extended highest sequence number received continues to increment; and 2) RR packets show a loss fraction of zero, but the extended highest sequence number received does not increment even though the sender has been transmitting RTP data packets. The former corresponds to the TCP congestion avoidance state and indicates a congested path that is still delivering data; the latter corresponds to a TCP timeout and is most likely due to a path failure. A third condition is that data is being sent but no RTCP feedback is received at all, corresponding to a failure of the reverse path. We derive circuit breaker conditions for these loss regimes in the following.

4.1. RTP/AVP Circuit Breaker #1: RTCP Timeout
An RTCP timeout can occur when RTP data packets are being sent, but there are no RTCP reports returned from the receiver. This is either due to a failure of the receiver to send RTCP reports or a failure of the return path that is preventing those RTCP reports from being delivered. In either case, it is not safe to continue transmission since the sender has no way of knowing if it is causing congestion.

An RTP sender that has not received any RTCP SR or RTCP RR packets reporting on the SSRC it is using, for a time period of at least three times its deterministic RTCP reporting interval, Td (where Td is calculated without the randomization factor and using the fixed minimum interval of Tmin=5 seconds), SHOULD cease transmission (see Section 4.5). The rationale for this choice of timeout is as described in Section 6.2 of [RFC3550] ("so that implementations which do not use the reduced value for transmitting RTCP packets are not timed out by other participants prematurely") and has been updated by Section 6.1.4 of [RFC8108] to account for the use of the RTP/AVPF profile [RFC4585] or the RTP/SAVPF profile [RFC5124].

To reduce the risk of premature timeout, implementations SHOULD NOT configure the RTCP bandwidth such that Td is larger than 5 seconds. Similarly, implementations that use the RTP/AVPF profile [RFC4585] or the RTP/SAVPF profile [RFC5124] SHOULD NOT configure T_rr_interval to values larger than 4 seconds (the reduced limit for T_rr_interval follows Section 6.1.3 of [RFC8108]).

The choice of three RTCP reporting intervals as the timeout is made following Section 6.3.5 of RFC 3550 [RFC3550]. This specifies that participants in an RTP session will time out and remove an RTP sender from the list of active RTP senders if no RTP data packets have been received from that RTP sender within the last two RTCP reporting intervals. Using a timeout of three RTCP reporting intervals is therefore large enough that the other participants will have timed out the sender if a network problem stops the data packets it is sending from reaching the receivers, even allowing for loss of some RTCP packets.

If a sender is transmitting a large number of RTP media streams, such that the corresponding RTCP SR or RR packets are too large to fit into the network MTU, the receiver will generate RTCP SR or RR packets in a round-robin manner. In this case, the sender SHOULD treat receipt of an RTCP SR or RR packet corresponding to any SSRC it sent on the same 5-tuple of source and destination IP address, port, and protocol as an indication that the receiver and return path are working, which prevents the RTCP timeout circuit breaker from triggering.

4.2. RTP/AVP Circuit Breaker #2: Media Timeout
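Before turning to the media timeout, the RTCP timeout check of Section 4.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function and parameter names are hypothetical, Td is taken as an input that the caller has already computed per [RFC3550] (without randomization and with Tmin=5 seconds), and the time a stream started sending is assumed to stand in for the last report time until a first report arrives.

```python
def rtcp_timeout_interval(td: float) -> float:
    """Return the RTCP timeout: three deterministic reporting intervals."""
    return 3.0 * td


def rtcp_timeout_triggered(now: float, last_report_time: float, td: float) -> bool:
    """Return True if the sender SHOULD cease transmission.

    last_report_time is the arrival time of the most recent RTCP SR or RR
    reporting on any SSRC sent on the same 5-tuple. Assumption: before any
    report has arrived, the caller passes the time sending started.
    All times are in seconds on the same monotonic clock.
    """
    return (now - last_report_time) >= rtcp_timeout_interval(td)


# With Td = 5 seconds, the breaker fires once 15 seconds pass without a report.
still_ok = rtcp_timeout_triggered(now=14.9, last_report_time=0.0, td=5.0)
fired = rtcp_timeout_triggered(now=15.0, last_report_time=0.0, td=5.0)
```

Tracking a single last_report_time per 5-tuple, rather than per SSRC, follows the round-robin reporting case described in Section 4.1.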
If RTP data packets are being sent but the RTCP SR or RR packets reporting on that SSRC indicate a non-increasing extended highest sequence number received, this is an indication that those RTP data packets are not reaching the receiver. This could be a short-term issue affecting only a few RTP packets, perhaps caused by a slow-to-open firewall or a transient connectivity problem, but if the issue persists, it is a sign of a more persistent and significant problem (a "media timeout").

The time needed to declare a media timeout depends on the parameters Tdr, Tr, Tf, and on the non-reporting threshold k. The value of k is chosen so that when Tdr is large compared to Tr and Tf, receipt of at least k RTCP reports with non-increasing extended highest sequence number received gives reasonable assurance that the forward path has failed and that the RTP data packets have not been lost by chance. The RECOMMENDED value for k is 5 reports.

When Tdr < Tf, then RTP data packets are being sent at a rate less than one per RTCP reporting interval of the receiver, so the extended highest sequence number received can be expected to be non-increasing for some receiver RTCP reporting intervals. Similarly, when Tdr < Tr, some receiver RTCP reporting intervals might pass before the RTP data packets arrive at the receiver, also leading to reports where the extended highest sequence number received is non-increasing. Both issues require the media timeout interval to be scaled relative to the threshold, k.

The media timeout RTP circuit breaker is therefore as follows. When starting sending, calculate MEDIA_TIMEOUT using:

   MEDIA_TIMEOUT = ceil(k * max(Tf, Tr, Tdr) / Tdr)
When a sender receives an RTCP packet that indicates reception of the media it has been sending, then it cancels the media timeout circuit breaker. If it is still sending, then it MUST calculate a new value for MEDIA_TIMEOUT and set a new media timeout circuit breaker. If a sender receives an RTCP packet indicating that its media was not received, it MUST calculate a new value for MEDIA_TIMEOUT. If the new value is larger than the previous, it replaces MEDIA_TIMEOUT with the new value, extending the media timeout circuit breaker; otherwise, it keeps the original value of MEDIA_TIMEOUT. This process is known as reconsidering the media timeout circuit breaker. If MEDIA_TIMEOUT consecutive RTCP packets are received indicating that the media being sent was not received, and the media timeout circuit breaker has not been canceled, then the media timeout circuit breaker triggers. When the media timeout circuit breaker triggers, the sender SHOULD cease transmission (see Section 4.5). When stopping sending an RTP stream, a sender MUST cancel the corresponding media timeout circuit breaker.
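The media timeout procedure above can be sketched as a small state machine. This is an illustrative sketch only: the class, method, and parameter names are hypothetical, the sender is assumed to be transmitting continuously, and reception is assumed to be inferred from whether the extended highest sequence number in a report exceeds the previously seen value (with baseline_ehsn, the value just below the first sequence number sent, as the starting point).

```python
import math

K = 5  # RECOMMENDED non-reporting threshold, in reports


def media_timeout(tf: float, tr: float, tdr: float, k: int = K) -> int:
    """MEDIA_TIMEOUT = ceil(k * max(Tf, Tr, Tdr) / Tdr), in RTCP reports."""
    return math.ceil(k * max(tf, tr, tdr) / tdr)


class MediaTimeoutBreaker:
    """Tracks consecutive RTCP reports that show no media being received."""

    def __init__(self, baseline_ehsn: int, tf: float, tr: float, tdr: float):
        self.last_ehsn = baseline_ehsn
        self.limit = media_timeout(tf, tr, tdr)
        self.missed = 0  # consecutive reports with non-increasing sequence number

    def on_report(self, ehsn: int, tf: float, tr: float, tdr: float) -> bool:
        """Process one RTCP SR/RR; return True if the breaker triggers."""
        new_limit = media_timeout(tf, tr, tdr)
        if ehsn > self.last_ehsn:
            # Media was received: cancel the breaker and set a new one.
            self.last_ehsn = ehsn
            self.missed = 0
            self.limit = new_limit
            return False
        # Media was not received: reconsider, only ever extending the limit.
        self.limit = max(self.limit, new_limit)
        self.missed += 1
        return self.missed >= self.limit


# Tdr = 5 s dominates Tf and Tr, so MEDIA_TIMEOUT equals k = 5 reports.
breaker = MediaTimeoutBreaker(baseline_ehsn=999, tf=0.02, tr=0.1, tdr=5.0)
first = breaker.on_report(1050, tf=0.02, tr=0.1, tdr=5.0)
stalled = [breaker.on_report(1050, tf=0.02, tr=0.1, tdr=5.0) for _ in range(5)]
```

When on_report returns True, the sender would cease transmission as described in Section 4.5. A sender that stops a stream for other reasons would simply discard the corresponding breaker object.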