RFC 7928

Characterization Guidelines for Active Queue Management (AQM)

Pages: 37
Informational

RFC7928 - Page 1

Internet Engineering Task Force (IETF)                      N. Kuhn, Ed.
Request for Comments: 7928                        CNES, Telecom Bretagne
Category: Informational                                P. Natarajan, Ed.
ISSN: 2070-1721                                            Cisco Systems
                                                         N. Khademi, Ed.
                                                      University of Oslo
                                                                  D. Ros
                                           Simula Research Laboratory AS
                                                               July 2016


     Characterization Guidelines for Active Queue Management (AQM)

Abstract

   Unmanaged large buffers in today's networks have given rise to a slew
   of performance issues.  These performance issues can be addressed by
   some form of Active Queue Management (AQM) mechanism, optionally in
   combination with a packet-scheduling scheme such as fair queuing.
   This document describes various criteria for performing
   characterizations of AQM schemes that can be used in lab testing
   during development, prior to deployment.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc7928.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Reducing the Latency and Maximizing the Goodput
     1.2.  Goals of This Document
     1.3.  Requirements Language
     1.4.  Glossary
   2.  End-to-End Metrics
     2.1.  Flow Completion Time
     2.2.  Flow Startup Time
     2.3.  Packet Loss
     2.4.  Packet Loss Synchronization
     2.5.  Goodput
     2.6.  Latency and Jitter
     2.7.  Discussion on the Trade-Off between Latency and Goodput
   3.  Generic Setup for Evaluations
     3.1.  Topology and Notations
     3.2.  Buffer Size
     3.3.  Congestion Controls
   4.  Methodology, Metrics, AQM Comparisons, Packet Sizes,
       Scheduling, and ECN
     4.1.  Methodology
     4.2.  Comments on Metrics Measurement
     4.3.  Comparing AQM Schemes
       4.3.1.  Performance Comparison
       4.3.2.  Deployment Comparison
     4.4.  Packet Sizes and Congestion Notification
     4.5.  Interaction with ECN
     4.6.  Interaction with Scheduling
   5.  Transport Protocols
     5.1.  TCP-Friendly Sender
       5.1.1.  TCP-Friendly Sender with the Same Initial Congestion
               Window
       5.1.2.  TCP-Friendly Sender with Different Initial Congestion
               Windows
     5.2.  Aggressive Transport Sender
     5.3.  Unresponsive Transport Sender
     5.4.  Less-than-Best-Effort Transport Sender
   6.  Round-Trip Time Fairness
     6.1.  Motivation
     6.2.  Recommended Tests
     6.3.  Metrics to Evaluate the RTT Fairness
   7.  Burst Absorption
     7.1.  Motivation
     7.2.  Recommended Tests
   8.  Stability
     8.1.  Motivation
     8.2.  Recommended Tests
       8.2.1.  Definition of the Congestion Level
       8.2.2.  Mild Congestion
       8.2.3.  Medium Congestion
       8.2.4.  Heavy Congestion
       8.2.5.  Varying the Congestion Level
       8.2.6.  Varying Available Capacity
     8.3.  Parameter Sensitivity and Stability Analysis
   9.  Various Traffic Profiles
     9.1.  Traffic Mix
     9.2.  Bidirectional Traffic
   10. Example of a Multi-AQM Scenario
     10.1.  Motivation
     10.2.  Details on the Evaluation Scenario
   11. Implementation Cost
     11.1.  Motivation
     11.2.  Recommended Discussion
   12. Operator Control and Auto-Tuning
     12.1.  Motivation
     12.2.  Recommended Discussion
   13. Summary
   14. Security Considerations
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Acknowledgements
   Authors' Addresses

1.  Introduction

   Active Queue Management (AQM) improves network and application
   performance by addressing the concerns that arise from unnecessarily
   large and unmanaged buffers, such as those presented in Section 1.2
   of the AQM recommendations document [RFC7567].  Several AQM
   algorithms have been proposed in the past years, most notably Random
   Early Detection (RED) [FLOY1993], BLUE [FENG2002], Proportional
   Integral controller (PI) [HOLLO2001], and more recently, Controlled
   Delay (CoDel) [CODEL] and Proportional Integral controller Enhanced
   (PIE) [AQMPIE].  In general, these algorithms actively interact with
   the Transmission Control Protocol (TCP) and any other transport
   protocol that deploys a congestion control scheme to manage the
   amount of data they keep in the network.  The available buffer space
   in the routers and switches should be large enough to accommodate the
   short-term buffering requirements.  AQM schemes aim at reducing
   buffer occupancy, and therefore the end-to-end delay.  Some of these
   algorithms, notably RED, have also been widely implemented in some
   network devices.  However, the potential benefits of the RED scheme
   have not been realized since RED is reported to be usually turned
   off.

   A buffer is a physical volume of memory in which a queue or set of
   queues is stored.  When speaking of a specific queue in this
   document, "buffer occupancy" refers to the amount of data (measured
   in bytes or packets) that is in the queue, and the "maximum buffer
   size" refers to the maximum buffer occupancy.  In switches and
   routers, a global memory space is often shared between the available
   interfaces, and thus, the maximum buffer size for any given interface
   may vary over time.

   Bufferbloat [BB2011] is the consequence of deploying large, unmanaged
   buffers on the Internet -- the buffering has often been measured to
   be ten times or a hundred times larger than needed.  Large buffer
   sizes in combination with TCP and/or unresponsive flows increase
   end-to-end delay.  This results in poor performance for latency-
   sensitive applications such as real-time multimedia (e.g., voice,
   video, and gaming).  The effect on modern networking equipment,
   especially consumer-grade equipment, is severe enough to cause
   problems even with commonly used web services.  Active queue
   management is thus essential to control queuing delay and decrease
   network latency.

   The Active Queue Management and Packet Scheduling Working Group (AQM
   WG) was chartered to address the problems with large unmanaged
   buffers in the Internet.  Specifically, the AQM WG is tasked with
   standardizing AQM schemes that not only address concerns with such
   buffers, but are also robust under a wide variety of operating
   conditions.  This document provides characterization guidelines that
   can be used to assess the applicability, performance, and
   deployability of an AQM, whether it is a candidate for
   standardization at IETF or not.

   The AQM algorithm implemented in a router can be separated from the
   scheduling of packets sent out by the router as discussed in the AQM
   recommendations document [RFC7567].  The rest of this memo refers to
   the AQM as a dropping/marking policy that is a separate feature from
   any interface-scheduling scheme.  This document may be complemented
   with another one on guidelines for assessing the combination of
   packet scheduling and AQM.  We note that such a document will inherit
   all the guidelines from this document, plus any additional scenarios
   relevant for packet scheduling such as flow-starvation evaluation or
   the impact of the number of hash buckets.

1.1.  Reducing the Latency and Maximizing the Goodput

   The trade-off between reducing the latency and maximizing the goodput
   is intrinsically linked to each AQM scheme and is key to evaluating
   its performance.  To ensure the safe deployment of an AQM, its
   behavior should be assessed in a variety of scenarios.  Whenever
   possible, solutions ought to aim at both maximizing goodput and
   minimizing latency.

1.2.  Goals of This Document

   This document recommends a generic list of scenarios against which an
   AQM proposal should be evaluated, considering both potential
   performance gain and safety of deployment.  The guidelines help to
   quantify performance of AQM schemes in terms of latency reduction,
   goodput maximization, and the trade-off between these two.  The
   document presents central aspects of an AQM algorithm that should be
   considered, whatever the context, such as burst absorption capacity,
   RTT fairness, or resilience to fluctuating network conditions.  The
   guidelines also discuss methods to understand the various aspects
   associated with safely deploying and operating the AQM scheme.  Thus,
   one of the key objectives behind formulating the guidelines is to
   help ascertain whether a specific AQM is not only better than drop-
   tail (i.e., without AQM and with a BDP-sized buffer), but also safe
   to deploy: the guidelines can be used to compare several AQM
   proposals with each other, but should be used to compare a proposal
   with drop-tail.

   This memo details generic characterization scenarios against which
   any AQM proposal should be evaluated, irrespective of whether or not
   an AQM is standardized by the IETF.  This document recommends the
   relevant scenarios and metrics to be considered.  This document
   presents central aspects of an AQM algorithm that should be
   considered whatever the context, such as burst absorption capacity,
   RTT fairness, or resilience to fluctuating network conditions.

   These guidelines do not define and are not bound to a particular
   deployment scenario or evaluation toolset.  Instead, the guidelines
   can be used to assess the potential gain of introducing an AQM for
   the particular environment that is of interest to the testers.
   These guidelines do not cover every possible aspect of a particular
   algorithm.  These guidelines do not present context-dependent
   scenarios (such as IEEE 802.11 WLANs, data centers, or rural
   broadband networks).  To keep the guidelines generic, a number of
   potential router components and algorithms (such as Diffserv) are
   omitted.

   The goals of this document can thus be summarized as follows:

   o  The present characterization guidelines provide a non-exhaustive
      list of scenarios to help ascertain whether an AQM is not only
      better than drop-tail (with a BDP-sized buffer), but also safe to
      deploy; the guidelines can also be used to compare several AQM
      proposals with each other.

   o  The present characterization guidelines (1) are not bound to a
      particular evaluation toolset and (2) can be used for various
      deployment contexts; testers are free to select a toolset that is
      best suited for the environment in which their proposal will be
      deployed.

   o  The present characterization guidelines are intended to provide
      guidance for better selecting an AQM for a specific environment;
      it is not required that an AQM proposal is evaluated following
      these guidelines for its standardization.

1.3.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.4.  Glossary

   o  Application-limited traffic: A type of traffic that does not have
      an unlimited amount of data to transmit.

   o  AQM: The Active Queue Management (AQM) algorithm implemented in a
      router can be separated from the scheduling of packets sent by the
      router.  The rest of this memo refers to the AQM as a
      dropping/marking policy that is a separate feature from any
      interface-scheduling scheme [RFC7567].

   o  BDP: Bandwidth Delay Product.

   o  Buffer: A physical volume of memory in which a queue or set of
      queues is stored.

   o  Buffer Occupancy: The amount of data stored in a buffer, measured
      in bytes or packets.

   o  Buffer Size: The maximum buffer occupancy, that is, the maximum
      amount of data that may be stored in a buffer, measured in bytes
      or packets.

   o  Initial Window 10 (IW10): TCP initial congestion window set to 10
      packets.

   o  Latency: One-way delay of packets across Internet paths.  This
      definition matches the transport-layer view of latency, which
      should not be confused with an application-layer view of
      latency.

   o  Goodput: Goodput is defined as the number of bits per unit of time
      forwarded to the correct destination, minus any bits lost or
      retransmitted [RFC2647].  The goodput should be determined for
      each flow and not for aggregates of flows.

   o  SQRT: The square root function.

   o  ROUND: The round function.

2.  End-to-End Metrics

   End-to-end delay is the result of propagation delay, serialization
   delay, service delay in a switch, medium-access delay, and queuing
   delay, summed over the network elements along the path.  AQM schemes
   may reduce the queuing delay by providing signals to the sender on
   the emergence of congestion, but any impact on the goodput must be
   carefully considered.  This section presents the metrics that could
   be used to better quantify (1) the reduction of latency, (2)
   maximization of goodput, and (3) the trade-off between these two.
   This section provides normative requirements for metrics that can be
   used to assess the performance of an AQM scheme.

   Some metrics listed in this section are not suited to every type of
   traffic detailed in the rest of this document.  It is therefore not
   necessary to measure all of the following metrics: the chosen metric
   may not be relevant to the context of the evaluation scenario (e.g.,
   latency vs. goodput trade-off in application-limited traffic
   scenarios).  Guidance is provided for each metric.

2.1.  Flow Completion Time

   The flow completion time is an important performance metric for the
   end-user when the flow size is finite.  The definition of the flow
   size may be a source of contradictions; thus, this metric can
   consider a flow as a single file.  Considering the fact that an AQM
   scheme may drop/mark packets, the flow completion time is directly
   linked to the dropping/marking policy of the AQM scheme.  This metric
   helps to better assess the performance of an AQM depending on the
   flow size.  The Flow Completion Time (FCT) is related to the flow
   size (Fs) and the goodput for the flow (G) as follows:

   FCT [s] = Fs [byte] / ( G [bit/s] / 8 [bit/byte] )

   where the flow size is the size of the transport-layer payload in
   bytes and the goodput is the transport-layer payload transfer rate
   (described in Section 2.5).
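
   The relation above can be checked numerically.  The following is a
   minimal sketch, not part of these guidelines; the function name and
   the example values are assumptions chosen for illustration.

```python
def flow_completion_time(flow_size_bytes: float, goodput_bps: float) -> float:
    """Return the flow completion time (FCT) in seconds.

    flow_size_bytes: transport-layer payload size Fs, in bytes.
    goodput_bps:     per-flow goodput G, in bits per second.
    """
    if goodput_bps <= 0:
        raise ValueError("goodput must be positive")
    # FCT [s] = Fs [byte] / ( G [bit/s] / 8 [bit/byte] )
    return flow_size_bytes / (goodput_bps / 8.0)


# Hypothetical example: a 1,000,000-byte file delivered at 8 Mbit/s.
fct = flow_completion_time(1_000_000, 8_000_000)
print(f"FCT = {fct:.3f} s")
```

   Dividing the goodput by 8 converts it to bytes per second, so that
   the units match those of the flow size.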

   If this metric is used to evaluate the performance of web transfers,
   it is suggested to consider the time needed to download all the
   objects that compose the web page rather than the time needed to
   download each object, as this better reflects user experience.

2.2.  Flow Startup Time

   The flow startup time is the time between when the request was sent
   from the client and when the server starts to transmit data.  The
   number of packets dropped by an AQM may seriously affect the waiting
   period during which the data transfer has not started.  This metric
   would specifically focus on operations such as DNS lookups, TCP
   opens, and Secure Sockets Layer (SSL) handshakes.

2.3.  Packet Loss

   Packet loss can occur en route; this can impact the end-to-end
   performance measured at the receiver end.

   The tester should evaluate the loss experienced at the receiver end
   using one of two metrics:

   o  The packet loss ratio: This metric is to be frequently measured
      during the experiment.  The long-term loss ratio is of interest
      for steady-state scenarios only;

   o  The interval between consecutive losses: The time between two
      losses is to be measured.

   The packet loss ratio can be assessed by simply evaluating the loss
   ratio as a function of the number of lost packets and the total
   number of packets sent.  This might not be easily done in laboratory
   testing, for which these guidelines advise the tester:

   o  To check that for every packet, a corresponding packet was
      received within a reasonable time, as presented in the document
      that proposes a metric for one-way packet loss across Internet
      paths [RFC7680].

   o  To keep a count of all packets sent, and a count of the non-
      duplicate packets received, as discussed in [RFC2544], which
      presents a benchmarking methodology.

   The interval between consecutive losses, which is also called a
   "gap", is a metric of interest for Voice over IP (VoIP) traffic
   [RFC3611].
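
   As an illustration, both loss metrics can be derived from simple
   counters and timestamps.  The sketch below assumes that the tester
   has a count of packets sent, a count of non-duplicate packets
   received, and a list of loss timestamps; the function names are
   hypothetical.

```python
def packet_loss_ratio(packets_sent: int, unique_packets_received: int) -> float:
    """Return lost packets divided by packets sent."""
    if packets_sent <= 0:
        raise ValueError("at least one packet must have been sent")
    lost = packets_sent - unique_packets_received
    return lost / packets_sent


def loss_intervals(loss_times_s):
    """Return the time between each pair of consecutive losses."""
    ordered = sorted(loss_times_s)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical measurements from one run.
print(packet_loss_ratio(packets_sent=1000, unique_packets_received=990))
print(loss_intervals([0.5, 1.0, 2.5]))
```

   Computing the ratio over short, successive periods rather than over
   the whole run gives the frequent measurements recommended above.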

2.4.  Packet Loss Synchronization

   One goal of an AQM algorithm is to help to avoid global
   synchronization of flows sharing a bottleneck buffer on which the AQM
   operates ([RFC2309] and [RFC7567]).  The "degree" of packet-loss
   synchronization between flows should be assessed, with and without
   the AQM under consideration.

   Loss synchronization among flows may be quantified by several
   slightly different metrics that capture different aspects of the same
   issue [HASS2008].  However, in real-world measurements the choice of
   metric could be imposed by practical considerations -- e.g., whether
   fine-grained information on packet losses at the bottleneck is
   available or not.  For the purpose of AQM characterization, a good
   candidate metric is the global synchronization ratio, measuring the
   proportion of flows losing packets during a loss event.  This metric
   can be used in real-world experiments to characterize synchronization
   along arbitrary Internet paths [JAY2006].

   If an AQM scheme is evaluated using real-life network environments,
   it is worth pointing out that some network events, such as failed
   link restoration, may cause synchronized losses between active flows
   and thus confuse the meaning of this metric.
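
   The global synchronization ratio can be sketched as follows.  This
   document does not define how individual losses are grouped into a
   loss event; the fixed time window used here, like the function
   names, is an assumption made for illustration only.

```python
def group_loss_events(losses, window_s):
    """Group (time_s, flow_id) losses into loss events.

    Assumption: a new event starts when a loss occurs more than
    window_s after the first loss of the current event.
    """
    events = []
    event_start = None
    for time_s, flow_id in sorted(losses, key=lambda loss: loss[0]):
        if event_start is None or time_s - event_start > window_s:
            events.append(set())
            event_start = time_s
        events[-1].add(flow_id)
    return events


def global_sync_ratios(losses, n_flows, window_s):
    """Return, per loss event, the proportion of flows that lost packets."""
    if n_flows <= 0:
        raise ValueError("n_flows must be positive")
    return [len(flows) / n_flows
            for flows in group_loss_events(losses, window_s)]


# Hypothetical trace: four flows share the bottleneck.
losses = [(1.00, "flow1"), (1.01, "flow2"), (1.02, "flow1"), (5.00, "flow3")]
print(global_sync_ratios(losses, n_flows=4, window_s=0.1))
```

   A ratio close to 1 for most events would suggest that the flows are
   losing packets together; the same computation should be run with and
   without the AQM under consideration.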

2.5.  Goodput

   The goodput has been defined as the number of bits per unit of time
   forwarded to the correct destination interface, minus any bits lost
   or retransmitted, as proposed in Section 3.17 of [RFC2647], which
   describes the benchmarking terminology for firewall performance.
   This definition requires that the test setup be qualified to ensure
   that it is not generating losses on its own.

   Measuring the end-to-end goodput provides an appreciation of how well
   an AQM scheme improves transport and application performance.  The
   measured end-to-end goodput is linked to the dropping/marking policy
   of the AQM scheme -- e.g., the fewer the number of packet drops, the
   fewer packets need retransmission, minimizing the impact of AQM on
   transport and application performance.  Additionally, an AQM scheme
   may resort to Explicit Congestion Notification (ECN) marking as an
   initial means to control delay.  Again, marking packets instead of
   dropping them reduces the number of packet retransmissions and
   increases goodput.  End-to-end goodput values help to evaluate the
   AQM scheme's effectiveness in minimizing packet drops that impact
   application performance and to estimate how well the AQM scheme works
   with ECN.

   The measurement of the goodput allows the tester to evaluate to what
   extent an AQM is able to maintain a high bottleneck utilization.
   This metric should also be obtained frequently during an experiment,
   as the long-term goodput is relevant for steady-state scenarios only
   and may not necessarily reflect how the introduction of an AQM
   actually impacts the link utilization during a certain period of
   time.  Fluctuations in the values obtained from these measurements
   may depend on factors other than the introduction of an AQM, such as
   link-layer losses due to external noise or corruption, fluctuating
   bandwidths (IEEE 802.11 WLANs), heavy congestion levels, or the
   transport layer's rate reduction by the congestion control mechanism.

2.6.  Latency and Jitter

   The latency, or the one-way delay metric, is discussed in [RFC7679].
   There is a consensus on an adequate metric for the jitter that
   represents the one-way delay variations for packets from the same
   flow: the Packet Delay Variation (PDV) serves well in all use cases
   [RFC5481].

   The end-to-end latency includes components other than just the
   queuing delay, such as the signal-processing delay, transmission
   delay, and processing delay.  Moreover, the jitter is caused by
   variations in queuing and processing delay (e.g., scheduling
   effects).  The introduction of an AQM scheme would impact end-to-end
   latency and jitter, and therefore these metrics should be considered
   in the end-to-end evaluation of performance.
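
   In [RFC5481], the PDV of a packet is its one-way delay minus the
   minimum one-way delay observed in the stream.  A minimal sketch of
   that computation follows; the function name and the sample delays
   are hypothetical.

```python
def packet_delay_variation(one_way_delays):
    """Return the PDV of each packet: its delay minus the minimum delay."""
    if not one_way_delays:
        raise ValueError("at least one delay sample is required")
    minimum_delay = min(one_way_delays)
    return [delay - minimum_delay for delay in one_way_delays]


# Hypothetical one-way delays of five packets of one flow, in ms.
delays_ms = [20, 22, 35, 20, 27]
pdv_ms = packet_delay_variation(delays_ms)
print(pdv_ms)
print("largest PDV:", max(pdv_ms), "ms")
```

   Because the minimum delay approximates the path delay without
   queuing, the PDV values mostly reflect the delay added by the
   queues along the path.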

2.7.  Discussion on the Trade-Off between Latency and Goodput

   The metrics presented in this section may be considered in order to
   discuss and quantify the trade-off between latency and goodput.

   With regard to the goodput, and in addition to the long-term
   stationary goodput value, it is recommended to take measurements at
   every multiple of the minimum RTT (minRTT) between A and B.  It is
   suggested to take measurements at least every K * minRTT (to smooth
   out the fluctuations), with K=10.  Higher values for K can be
   considered whenever it is more appropriate for the presentation of
   the results, since the value for K may depend on the network's path
   characteristics.  The measurement period must be disclosed for each
   experiment, and when results/values are compared across different AQM
   schemes, the comparisons should use exactly the same measurement
   periods.  With regard to latency, it is recommended to take the
   samples on a per-packet basis whenever possible, depending on the
   features provided by the hardware and software and the impact of
   sampling itself on the hardware performance.
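
   The periodic goodput measurement can be sketched as follows,
   assuming that the tester has, for one flow, a list of (arrival
   time, payload bytes) records for data delivered for the first time
   (that is, excluding retransmissions).  The function and parameter
   names are hypothetical.

```python
import math


def goodput_samples(deliveries, min_rtt_s, duration_s, k=10):
    """Return one goodput value (bit/s) per window of k * minRTT.

    deliveries: (arrival_time_s, payload_bytes) records for one flow,
                with time measured from the start of the experiment.
    """
    window_s = k * min_rtt_s
    if window_s <= 0 or duration_s <= 0:
        raise ValueError("window and duration must be positive")
    n_windows = math.ceil(duration_s / window_s)
    bits = [0] * n_windows
    for time_s, payload_bytes in deliveries:
        index = int(time_s // window_s)
        if 0 <= index < n_windows:  # ignore records outside the run
            bits[index] += 8 * payload_bytes
    return [window_bits / window_s for window_bits in bits]


# Hypothetical run: minRTT = 50 ms, K = 10, so one sample every 0.5 s.
records = [(0.1, 1250), (0.3, 1250), (0.6, 1250), (1.9, 2500)]
print(goodput_samples(records, min_rtt_s=0.05, duration_s=2.0))
```

   When comparing AQM schemes, the same values of K and minRTT should
   be passed for every scheme so that the measurement periods match.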

   From each of these sets of measurements, the cumulative distribution
   function (CDF) of the considered metrics should be computed.  If the
   considered scenario introduces dynamically varying parameters,
   temporal evolution of the metrics could also be generated.  For each
   scenario, the following graph may be generated: the x-axis shows a
   queuing delay (that is, the average per-packet delay in excess of
   minimum RTT), the y-axis the goodput.  Ellipses are computed as
   detailed in [WINS2014]: "We take each individual [...] run [...] as
   one point, and then compute the 1-epsilon elliptic contour of the
   maximum-likelihood 2D Gaussian distribution that explains the points.
   [...] we plot the median per-sender throughput and queueing delay as
   a circle. [...] The orientation of an ellipse represents the
   covariance between the throughput and delay measured for the
   protocol."  This graph helps to understand (1) the delay/goodput
   trade-off for a given congestion control mechanism (Section 5) and
   (2) how the goodput and average queue delay vary as a function of the
   traffic load (Section 8.2).
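
   One possible way to obtain such an ellipse is sketched below: the
   maximum-likelihood 2D Gaussian is given by the sample mean and the
   covariance normalized by the number of points, and the ellipse axes
   follow from the eigenvalues of that covariance matrix.  This is an
   illustration under stated assumptions, not the procedure of
   [WINS2014]; in particular, the "scale" parameter that selects the
   contour is hypothetical.

```python
import math


def gaussian_ellipse(points, scale=1.0):
    """Fit a 2D Gaussian to (delay, goodput) points and describe a contour.

    Returns (center, semi_major, semi_minor, angle_rad), where angle_rad
    is the orientation of the major axis measured from the x-axis.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Maximum-likelihood (1/n) variances and covariance.
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form eigenvalues of the symmetric 2x2 covariance matrix.
    half_trace = (var_x + var_y) / 2.0
    radius = math.hypot((var_x - var_y) / 2.0, cov_xy)
    semi_major = scale * math.sqrt(half_trace + radius)
    semi_minor = scale * math.sqrt(max(half_trace - radius, 0.0))
    angle_rad = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    return (mean_x, mean_y), semi_major, semi_minor, angle_rad


# Hypothetical runs: (queuing delay in ms, goodput in Mbit/s).
runs = [(12.0, 9.1), (15.0, 9.4), (9.0, 8.7), (20.0, 9.6), (11.0, 9.0)]
center, major, minor, angle = gaussian_ellipse(runs)
print(center, major, minor, math.degrees(angle))
```

   The returned angle reflects the sign of the covariance: a positive
   angle indicates that runs with a higher queuing delay also tend to
   show a higher goodput.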

3.  Generic Setup for Evaluations

   This section presents the topology that can be used for each of the
   following scenarios, the corresponding notations, and discusses
   various assumptions that have been made in the document.

3.1.  Topology and Notations

   +--------------+                                +--------------+
   |sender A_i    |                                |receiver B_i  |
   |--------------|                                |--------------|
   | SEN.Flow1.1 +---------+            +-----------+ REC.Flow1.1 |
   |        +     |        |            |          |        +     |
   |        |     |        |            |          |        |     |
   |        +     |        |            |          |        +     |
   | SEN.Flow1.X +-----+   |            |  +--------+ REC.Flow1.X |
   +--------------+    |   |            |  |       +--------------+
        +            +-+---+---+     +--+--+---+            +
        |            |Router L |     |Router R |            |
        |            |---------|     |---------|            |
        |            | AQM     |     |         |            |
        |            | BuffSize|     | BuffSize|            |
        |            | (Bsize) +-----+ (Bsize) |            |
        |            +-----+--++     ++-+------+            |
        +                  |  |       | |                   +
   +--------------+        |  |       | |          +--------------+
   |sender A_n    |        |  |       | |          |receiver B_n  |
   |--------------|        |  |       | |          |--------------|
   | SEN.FlowN.1 +---------+  |       | +-----------+ REC.FlowN.1 |
   |        +     |           |       |            |        +     |
   |        |     |           |       |            |        |     |
   |        +     |           |       |            |        +     |
   | SEN.FlowN.Y +------------+       +-------------+ REC.FlowN.Y |
   +--------------+                                +--------------+

                     Figure 1: Topology and Notations

   Figure 1 is a generic topology where:

   o  The traffic profile is a set of flows with similar characteristics
      -- RTT, congestion control scheme, transport protocol, etc.;

   o  Senders with different traffic characteristics (i.e., traffic
      profiles) can be introduced;

   o  The timing of each flow could be different (i.e., when does each
      flow start and stop?);

   o  Each traffic profile can comprise a varying number of flows;

   o  Each link is characterized by a pair (one-way delay, capacity);

   o  Sender A_i is instantiated for each traffic profile.  A
      corresponding receiver B_i is instantiated for receiving the flows
      in the profile;

   o  Flows share a bottleneck (the link between routers L and R);

   o  The tester should consider both scenarios of asymmetric and
      symmetric bottleneck links in terms of bandwidth.  In the case of
      an asymmetric link, the capacity from senders to receivers is
      higher than the one from receivers to senders; the symmetric link
      scenario provides a basic understanding of the operation of the
      AQM mechanism, whereas the asymmetric link scenario evaluates an
      AQM mechanism in a more realistic setup;

   o  In asymmetric link scenarios, the tester should study the
      bidirectional traffic between A and B (downlink and uplink) with
      the AQM mechanism deployed in one direction only.  The tester may
      additionally consider a scenario with the AQM mechanism being
      deployed in both directions.  In each scenario, the tester should
      investigate the impact of the drop policy of the AQM on TCP ACK
      packets and its impact on the performance (Section 9.2).
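
   For testers who script their experiments, the parameters of the
   topology listed above could be captured in a small configuration
   structure.  The sketch below is only one possible layout; all class
   and field names are hypothetical and are not defined by this
   document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Link:
    """A link, characterized by the pair (one-way delay, capacity)."""
    one_way_delay_s: float
    capacity_bps: float


@dataclass
class TrafficProfile:
    """A set of flows with similar characteristics."""
    name: str
    transport: str                  # e.g., "TCP"
    congestion_control: str         # e.g., "NewReno" or "CUBIC"
    rtt_s: float
    n_flows: int
    start_s: float = 0.0            # when the flows of this profile start
    stop_s: Optional[float] = None  # None: run until the end of the test


@dataclass
class Scenario:
    """The bottleneck between routers L and R and the traffic profiles."""
    forward: Link                   # senders (A) to receivers (B)
    reverse: Link                   # receivers (B) to senders (A)
    aqm_on_forward: bool = True
    aqm_on_reverse: bool = False
    profiles: List[TrafficProfile] = field(default_factory=list)

    def is_asymmetric(self) -> bool:
        return self.forward.capacity_bps != self.reverse.capacity_bps

    def total_flows(self) -> int:
        return sum(profile.n_flows for profile in self.profiles)


# Hypothetical asymmetric scenario with the AQM in one direction only.
scenario = Scenario(
    forward=Link(one_way_delay_s=0.010, capacity_bps=10_000_000),
    reverse=Link(one_way_delay_s=0.010, capacity_bps=1_000_000),
    profiles=[
        TrafficProfile("standard", "TCP", "NewReno", rtt_s=0.100, n_flows=5),
        TrafficProfile("aggressive", "TCP", "CUBIC", rtt_s=0.100, n_flows=2),
    ],
)
print(scenario.is_asymmetric(), scenario.total_flows())
```

   Recording such a structure alongside the results makes it easier to
   disclose the setup and to reuse the same topology when comparing AQM
   schemes.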

   Although this topology may not perfectly reflect actual topologies,
   the simple topology is commonly used in the world of simulations and
   small testbeds.  It can be considered as adequate to evaluate AQM
   proposals [TCPEVAL].  Testers ought to pay attention to the topology
   used to evaluate an AQM scheme when comparing it with a newly
   proposed AQM scheme.

3.2.  Buffer Size

   The size of the buffers should be carefully chosen, and may be set to
   the bandwidth-delay product; the bandwidth being the bottleneck
   capacity and the delay being the largest RTT in the considered
   network.  The size of the buffer can impact the AQM performance and
   is a dimensioning parameter that will be considered when comparing
   AQM proposals.
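
   As a worked illustration of the bandwidth-delay product sizing
   described above, the following minimal sketch computes a buffer size
   in bytes and in packets.  The function name, the 1500-byte packet
   size, and the example capacity and RTT values are assumptions chosen
   for illustration; they are not values recommended by this document.

```python
import math


def bdp_buffer_size(capacity_bps, max_rtt_s, packet_bytes=1500):
    """Return the bandwidth-delay product as (bytes, packets).

    capacity_bps: bottleneck capacity, in bits per second.
    max_rtt_s:    largest RTT in the considered network, in seconds.
    packet_bytes: assumed packet size used to express the result
                  in packets (hypothetical default).
    """
    bdp_bytes = capacity_bps * max_rtt_s / 8.0
    bdp_packets = math.ceil(bdp_bytes / packet_bytes)
    return bdp_bytes, bdp_packets


# Illustrative values only: 10 Mbps bottleneck, 100 ms largest RTT.
size_bytes, size_packets = bdp_buffer_size(10e6, 0.100)
print(f"BDP buffer: {size_bytes:.0f} bytes, {size_packets} packets")
```

   Whatever value is chosen, it would be kept identical across the AQM
   schemes being compared.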

   If a specific buffer size is required, the tester must justify and
   detail the way the maximum queue size is set.  Indeed, the maximum
   size of the buffer may affect the AQM's performance and its choice
   should be elaborated for a fair comparison between AQM proposals.
   While comparing AQM schemes, the buffer size should remain the same
   across the tests.

3.3.  Congestion Controls

   This document considers running three different congestion control
   algorithms between A and B:

   o  Standard TCP congestion control: The base-line congestion control
      is TCP NewReno with selective acknowledgment (SACK) [RFC5681].

   o  Aggressive congestion controls: A base-line congestion control for
      this category is CUBIC [CUBIC].

   o  Less-than-Best-Effort (LBE) congestion controls: Per [RFC6297], an
      LBE service "results in smaller bandwidth and/or delay impact on
      standard TCP than standard TCP itself, when sharing a bottleneck
      with it."  A base-line congestion control for this category is Low
      Extra Delay Background Transport (LEDBAT) [RFC6817].

   Other transport congestion controls can OPTIONALLY be evaluated in
   addition.  Recent transport layer protocols are not mentioned in the
   following sections, for the sake of simplicity.

4.  Methodology, Metrics, AQM Comparisons, Packet Sizes, Scheduling, and
    ECN

4.1.  Methodology

   A description of each test setup should be detailed to allow this
   test to be compared with other tests.  This also allows others to
   replicate the tests if needed.  This test setup should detail
   software and hardware versions.  The tester could make its data
   available.

   The proposals should be evaluated on real-life systems, or they may
   be evaluated with event-driven simulations (such as ns-2, ns-3,
   OMNET, etc.).  The proposed scenarios are not bound to a particular
   evaluation toolset.

   The tester is encouraged to make the detailed test setup and the
   results publicly available.
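
   One possible way to capture such a description is a small
   machine-readable record stored next to the results.  The sketch
   below is only an assumption about what such a record might contain;
   the field names and the example values ("example-aqm", the capacity,
   and the RTT) are hypothetical and are not defined by this document.

```python
import json
import platform
import sys


def describe_setup(extra):
    """Collect basic software/hardware details plus test parameters."""
    record = {
        "os": platform.platform(),
        "kernel_release": platform.release(),
        "machine": platform.machine(),
        "python_version": sys.version.split()[0],
    }
    # Test-specific parameters are supplied by the tester.
    record.update(extra)
    return record


# Hypothetical test parameters, for illustration only.
setup = describe_setup({
    "aqm_scheme": "example-aqm",
    "evaluation_toolset": "testbed",
    "bottleneck_capacity_bps": 10_000_000,
    "largest_rtt_ms": 100,
})
print(json.dumps(setup, indent=2, sort_keys=True))
```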

4.2.  Comments on Metrics Measurement

   This document presents the end-to-end metrics that ought to be used
   to evaluate the trade-off between latency and goodput as described in
   Section 2.  In addition to the end-to-end metrics, the queue-level
   metrics (normally collected at the device operating the AQM) provide
   a better understanding of the AQM behavior under study and the impact
   of its internal parameters.  Whenever it is possible (e.g., depending
   on the features provided by the hardware/software), these guidelines
   advise considering queue-level metrics, such as link utilization,
   queuing delay, queue size, or packet drop/mark statistics in addition
   to the AQM-specific parameters.  However, the evaluation must be
   primarily based on externally observed end-to-end metrics.

   These guidelines do not aim to detail the way these metrics can be
   measured, since that is expected to depend on the evaluation toolset.

4.3.  Comparing AQM Schemes

   This document recognizes that these guidelines may be used for
   comparing AQM schemes.

   AQM schemes need to be compared against both performance and
   deployment categories.  In addition, this section details how best to
   achieve a fair comparison of AQM schemes by avoiding certain
   pitfalls.

4.3.1.  Performance Comparison

   AQM schemes should be compared against the generic scenarios that are
   summarized in Section 13.  AQM schemes may be compared for specific
   network environments such as data centers, home networks, etc.  If an
   AQM scheme has parameter(s) that were externally tuned for
   optimization or other purposes, these values must be disclosed.

   AQM schemes come in different varieties, such as queue-length based
   schemes (for example, RED) or queuing-delay based schemes (for
   example, CoDel, PIE).  AQM schemes expose different control knobs
   associated with different semantics.  For example, while both PIE and
   CoDel are queuing-delay based schemes and each exposes a knob to
   control the queuing delay (PIE's "queuing delay reference" vs.
   CoDel's "queuing delay target"), the two tuning parameters of the two
   schemes have different semantics, resulting in different control
   points.  Such differences in AQM schemes can be easily overlooked
   while making comparisons.

   This document recommends the following procedures for a fair
   performance comparison between the AQM schemes:

   1.  Similar control parameters and implications: Testers should be
       aware of the control parameters of the different schemes that
       control similar behavior.  Testers should also be aware of the
       input value ranges and corresponding implications.  For example,
       consider two different schemes -- (A) queue-length based AQM
       scheme, and (B) queuing-delay based scheme.  A and B are likely
       to have different kinds of control inputs to control the target
       delay -- the target queue length in A vs. target queuing delay in
       B, for example.  Setting parameter values such as 100 MB for A
       vs. 10 ms for B will have different implications depending on
       evaluation context.  Such context-dependent implications must be
       considered before drawing conclusions on performance comparisons.
       Also, it would be preferable if an AQM proposal listed such
       parameters and discussed how each relates to network
       characteristics such as capacity, average RTT, etc.

   2.  Compare over a range of input configurations: There could be
       situations when the set of control parameters that affect a
       specific behavior have different semantics between the two AQM
       schemes.  As mentioned above, PIE has tuning parameters to
       control queue delay that have different semantics from those used
       in CoDel.  In such situations, these schemes need to be compared
       over a range of input configurations.  For example, compare PIE
       vs. CoDel over the range of target delay input configurations.
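
   The two procedures above can be sketched as follows.  The first part
   converts a queue-length target into the queuing delay it implies at
   a given drain rate, which shows why "100 MB" and "10 ms" are only
   comparable in a given context.  The second part enumerates a test
   matrix over a range of delay settings.  The capacities, the delay
   range, and the reading of 100 MB as 100e6 bytes are assumptions made
   for illustration.

```python
def queue_bytes_to_delay_ms(queue_bytes, capacity_bps):
    """Time to drain queue_bytes at capacity_bps, in milliseconds."""
    return queue_bytes * 8.0 / capacity_bps * 1000.0


# Procedure 1: the same 100 MB queue-length target (taken here as
# 100e6 bytes) implies very different queuing delays depending on
# the bottleneck capacity.  Capacities are illustrative assumptions.
QUEUE_TARGET_BYTES = 100e6
for capacity_bps in (10e6, 1e9, 100e9):
    delay_ms = queue_bytes_to_delay_ms(QUEUE_TARGET_BYTES, capacity_bps)
    print(f"{capacity_bps / 1e6:>9.0f} Mbps -> {delay_ms:>8.1f} ms")

# Procedure 2: compare the schemes over a range of delay settings
# rather than at a single point.  The same number fed to PIE's
# reference and CoDel's target does not imply the same operating
# point, so results are compared across the whole range.
schemes = ("PIE", "CoDel")
target_delays_ms = (5, 10, 20, 50)  # hypothetical sweep values
test_matrix = [(scheme, delay)
               for scheme in schemes
               for delay in target_delays_ms]
for scheme, delay in test_matrix:
    print(f"run scheme={scheme} delay_knob={delay} ms")
```

   With these assumed capacities, the 100 MB target only approaches a
   delay on the order of 10 ms at the highest capacity.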

4.3.2.  Deployment Comparison

   AQM schemes must be compared against deployment criteria such as the
   parameter sensitivity (Section 8.3), auto-tuning (Section 12), or
   implementation cost (Section 11).

4.4.  Packet Sizes and Congestion Notification

   An AQM scheme may consider packet sizes while generating
   congestion signals [RFC7141].  For example, control packets such as
   DNS requests/responses and TCP SYNs/ACKs are small, but their loss
   can severely impact application performance.  An AQM scheme may
   therefore be biased towards small packets by dropping them with lower
   probability compared to larger packets.  However, such an AQM scheme
   is unfair to data senders generating larger packets.  Data senders,
   malicious or otherwise, are motivated to take advantage of such an
   AQM scheme by transmitting smaller packets, and this could result in
   unsafe deployments and unhealthy transport and/or application
   designs.

   An AQM scheme should adhere to the recommendations outlined in the
   Best Current Practice for dropping and marking packets [BCP41], and
   should not provide undue advantage to flows with smaller packets,
   such as discussed in Section 4.4 of the AQM recommendation document
   [RFC7567].  In order to evaluate if an AQM scheme is biased towards
   flows with smaller size packets, traffic can be generated, as defined
   in Section 8.2.2, where half of the flows have smaller packets (e.g.,
   500-byte packets) than the other half of the flows (e.g., 1500-byte
   packets).  In this case, the metrics reported could be the same as in
   Section 6.3, where Category I is the set of flows with smaller
   packets and Category II the one with larger packets.  The
   bidirectional scenario could also be considered (Section 9.2).
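
   A minimal sketch of such a two-category comparison is given below.
   It groups flows by packet size and computes one aggregate figure per
   category.  The drop ratio is used here only as an assumed example
   metric (the metrics themselves are those of Section 6.3), and the
   per-flow counters are made-up placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    packet_bytes: int      # packet size used by the flow
    packets_sent: int
    packets_dropped: int


def drop_ratio(flows):
    """Aggregate drop ratio over a set of flows (0.0 if nothing sent)."""
    sent = sum(f.packets_sent for f in flows)
    dropped = sum(f.packets_dropped for f in flows)
    return dropped / sent if sent else 0.0


# Placeholder counters for illustration only (not real results).
flows = [
    FlowStats(500, 10000, 100),
    FlowStats(500, 10000, 120),
    FlowStats(1500, 10000, 110),
    FlowStats(1500, 10000, 130),
]

# Category I: flows with smaller packets; Category II: larger packets.
category_1 = [f for f in flows if f.packet_bytes == 500]
category_2 = [f for f in flows if f.packet_bytes == 1500]

ratio_1 = drop_ratio(category_1)
ratio_2 = drop_ratio(category_2)
print(f"Category I  drop ratio: {ratio_1:.4f}")
print(f"Category II drop ratio: {ratio_2:.4f}")
```

   A large gap between the two categories under otherwise identical
   conditions would suggest that the scheme treats packet sizes
   differently.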

4.5.  Interaction with ECN

   ECN [RFC3168] is an alternative that allows AQM schemes to signal
   network congestion to receivers without using packet drops.
   There are benefits to providing ECN support for an AQM scheme
   [WELZ2015].

   If the tested AQM scheme can support ECN, the testers must discuss
   and describe the support of ECN, such as discussed in the AQM
   recommendation document [RFC7567].  Also, the AQM's ECN support can
   be studied and verified by replicating tests in Section 6.2 with ECN
   turned ON at the TCP senders.  The results can be used not only to
   evaluate the performance of the tested AQM with and without ECN
   markings, but also to quantify the benefit of enabling ECN.

4.6.  Interaction with Scheduling

   A network device may use per-flow or per-class queuing with a
   scheduling algorithm to either prioritize certain applications or
   classes of traffic, limit the rate of transmission, or to provide
   isolation between different traffic flows within a common class, such
   as discussed in Section 2.1 of the AQM recommendation document
   [RFC7567].

   The scheduling and the AQM jointly impact the end-to-end
   performance.  Therefore, the AQM proposal must discuss the
   feasibility of adding scheduling combined with the AQM algorithm.
   The proposal can also explain whether the dropping policy is applied
   when packets are being enqueued or dequeued.

   These guidelines do not cover how to assess the performance of
   scheduling algorithms.  Indeed, whereas characterizing an AQM scheme
   relates to its capacity to control the queuing delay in a queue,
   characterizing a scheduling scheme relates to the scheduling itself
   and its interaction with the AQM scheme.  As one example, the
   scheduler may create sub-queues and the AQM scheme may be applied on
   each of the sub-queues, and/or the AQM could be applied on the whole
   queue.  Also, schedulers such as FQ-CoDel [HOEI2015] or FavorQueue
   [ANEL2014] might introduce flow prioritization.  In these cases,
   specific scenarios should be proposed to ascertain that these
   scheduler schemes not only help in tackling bufferbloat, but also are
   robust under a wide variety of operating conditions.  This is out of
   the scope of this document, which focuses on dropping and/or marking
   AQM schemes.
