It is
REQUIRED per the architecture of the method that two cooperating hosts operate in the roles of Src (test packet Sender) and Dst (Receiver) with a measured path and return path between them.
The duration of a test, Parameter I,
MUST be constrained in a production network, since this is an active test method and it will likely cause congestion on the path from the Src host to the Dst host during a test.
The algorithm described in this section
MUST NOT be used as a general Congestion Control Algorithm (CCA). As stated in
Section 2 ("Scope, Goals, and Applicability"), the load rate adjustment algorithm's goal is to help determine the Maximum IP-Layer Capacity in the context of an infrequent, diagnostic, short-term measurement. There is a trade-off between test duration (also the test data volume) and algorithm aggressiveness (speed of ramp-up and ramp-down to the Maximum IP-Layer Capacity). The Parameter values chosen below strike a well-tested balance among these factors.
A table
SHALL be pre-built (by the test administrator), defining all the offered load rates that will be supported (R1 through Rn, in ascending order, corresponding to indexed rows in the table). It is
RECOMMENDED that rates begin with 0.5 Mbps at index zero, use 1 Mbps at index one, and then continue in 1 Mbps increments to 1 Gbps. Above 1 Gbps, and up to 10 Gbps, it is
RECOMMENDED that 100 Mbps increments be used. Above 10 Gbps, increments of 1 Gbps are
RECOMMENDED. A higher initial IP-Layer Sender Bit Rate might be configured when the test operator is certain that the Maximum IP-Layer Capacity is well above the initial IP-Layer Sender Bit Rate and factors such as test duration and total test traffic play an important role. The sending rate table
SHOULD bracket the Maximum Capacity where it will make measurements, including constrained rates less than 500 kbps if applicable.
Each rate is defined as datagrams of size ss, sent as a burst of count cc, each time interval tt (the default for tt is 100 microsec, a likely system tick interval). While it is advantageous to use datagrams of as large a size as possible, it may be prudent to use a slightly smaller maximum that allows for secondary protocol headers and/or tunneling without resulting in IP-Layer fragmentation. Selection of a new rate is indicated by a calculation on the current row, Rx. For example:
"Rx+1":  The Sender uses the next-higher rate in the table.

"Rx-10":  The Sender uses the rate 10 rows lower in the table.
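As a non-normative illustration, the following Python sketch builds the RECOMMENDED rate table described above (the function and variable names, such as build_rate_table, are hypothetical and not part of the method):

   # Sketch of the RECOMMENDED offered-load rate table, rates in Mbps.
   # Function and variable names are illustrative only.

   def build_rate_table(max_rate_mbps: int = 10_000) -> list:
       """Return the offered load rates R0..Rn in ascending order (Mbps)."""
       rates = [0.5]                                # index 0: 0.5 Mbps
       rates += [float(r) for r in range(1, 1001)]  # 1 Mbps steps to 1 Gbps
       r = 1000
       while r < min(max_rate_mbps, 10_000):        # 100 Mbps steps to 10 Gbps
           r += 100
           rates.append(float(r))
       while r < max_rate_mbps:                     # 1 Gbps steps above 10 Gbps
           r += 1000
           rates.append(float(r))
       return rates

   table = build_rate_table()
   assert len(table) - 1 == 1090   # 1000 + 90 steps to 10 Gbps, excluding index 0

In an implementation, each row would also carry the burst Parameters (ss, cc, tt), chosen so that cc * ss * 8 / tt roughly yields the row's sending rate (with ss in bytes and tt in seconds), subject to the header-size considerations noted above.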
At the beginning of a test, the Sender begins sending at rate R1 and the Receiver starts a feedback timer of duration FT (while awaiting inbound datagrams). As datagrams are received, they are checked for sequence number anomalies (loss, out-of-order, duplication, etc.) and the delay range is measured (one-way or round-trip). This information is accumulated until the feedback timer FT expires and a status feedback message is sent from the Receiver back to the Sender, to communicate this information. The accumulated statistics are then reset by the Receiver for the next feedback interval. As feedback messages are received back at the Sender, they are evaluated to determine how to adjust the current offered load rate (Rx).
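A minimal sketch of the Receiver-side accumulation over one feedback interval might look as follows (Python; the IntervalStats structure and its field names are assumptions for illustration, and the delay range is taken as the maximum minus the minimum delay observed in the interval):

   # Illustrative Receiver-side accumulation over one feedback interval FT.
   from dataclasses import dataclass, field

   @dataclass
   class IntervalStats:
       seq_anomalies: int = 0                    # losses, reordering, duplicates
       delays: list = field(default_factory=list)

       def on_load_packet(self, seq: int, expected_seq: int, delay_ms: float) -> None:
           if seq != expected_seq:
               self.seq_anomalies += 1
           self.delays.append(delay_ms)

       def feedback(self) -> dict:
           """Build the status feedback message, then reset for the next FT."""
           delay_range = (max(self.delays) - min(self.delays)) if self.delays else 0.0
           msg = {"seq_anomalies": self.seq_anomalies, "delay_range_ms": delay_range}
           self.seq_anomalies, self.delays = 0, []
           return msg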
If the feedback indicates that no sequence number anomalies were detected AND the delay range was below the lower threshold, the offered load rate is increased. If congestion has not been confirmed up to this point (see below for the method for declaring congestion), the offered load rate is increased by more than one rate setting (e.g., Rx+10). This allows the offered load to quickly reach a near-maximum rate. Conversely, if congestion has been previously confirmed, the offered load rate is only increased by one (Rx+1). However, once the offered load rate exceeds a high-rate threshold (such as 1 Gbps), it is only increased by one (Rx+1), regardless of the congestion state.
If the feedback indicates that sequence number anomalies were detected OR the delay range was above the upper threshold, the offered load rate is decreased. The
RECOMMENDED threshold values are 10 for sequence number gaps, 30 msec for the lower delay threshold, and 90 msec for the upper delay threshold. Also, if congestion is now confirmed for the first time by the current feedback message being processed, then the offered load rate is decreased by more than one rate setting (e.g., Rx-30). This one-time reduction is intended to compensate for the fast initial ramp-up. In all other cases, the offered load rate is only decreased by one (Rx-1).
If the feedback indicates that there were no sequence number anomalies AND the delay range was above the lower threshold but below the upper threshold, the offered load rate is not changed. This allows time for recent changes in the offered load rate to stabilize and for the feedback to represent current conditions more accurately.
Lastly, the method for inferring congestion is that there were sequence number anomalies AND/OR the delay range was above the upper threshold for three consecutive feedback intervals. The algorithm described above is also illustrated in Annex B of ITU-T Recommendation Y.1540, 2020 version [
Y.1540] and is implemented in
Appendix A ("Load Rate Adjustment Pseudocode") in this memo.
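Appendix A remains the normative pseudocode; the following Python sketch only illustrates the per-feedback decision described above, assuming the default Parameter values from Table 1 (the function and state names are illustrative):

   # Sketch of the Sender's decision per status feedback message,
   # using the default Parameter values from Table 1.

   SEQ_ERR_THRESH = 10      # sequence error threshold
   LOW_DELAY_MS   = 30.0    # low delay range threshold
   HIGH_DELAY_MS  = 90.0    # high delay range threshold
   FAST_UP        = 10      # fast mode increase, in table index steps
   FAST_DOWN      = 30      # fast mode decrease = 3 * fast mode increase
   CONSEC_ERR     = 3       # consecutive errored reports => congestion confirmed
   HIGH_RATE_MBPS = 1000.0  # above this rate, increase only one step at a time

   def adjust(idx, fb, state, table):
       """Return the new table index after processing feedback message 'fb'."""
       errored = (fb["seq_anomalies"] > SEQ_ERR_THRESH
                  or fb["delay_range_ms"] > HIGH_DELAY_MS)
       state["consec_errors"] = state["consec_errors"] + 1 if errored else 0
       newly_confirmed = (not state["congestion_confirmed"]
                          and state["consec_errors"] >= CONSEC_ERR)
       if newly_confirmed:
           state["congestion_confirmed"] = True

       if errored:                          # decrease (Rx-30 once, then Rx-1)
           return max(idx - (FAST_DOWN if newly_confirmed else 1), 0)
       if fb["seq_anomalies"] == 0 and fb["delay_range_ms"] < LOW_DELAY_MS:
           fast = (not state["congestion_confirmed"]
                   and table[idx] <= HIGH_RATE_MBPS)
           return min(idx + (FAST_UP if fast else 1), len(table) - 1)
       return idx                           # hold: delay between thresholds

   # Example initial state:
   # state = {"consec_errors": 0, "congestion_confirmed": False}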
The load rate adjustment algorithm
MUST include timers that stop the test when received packet streams cease unexpectedly. The timeout thresholds are provided in
Table 1, along with values for all other Parameters and variables described in this section. Operations of non-obvious Parameters appear below:
load packet timeout:  The load packet timeout SHALL be reset to the configured value each time a load packet is received. If the timeout expires, the Receiver SHALL be closed and no further feedback sent.

feedback message timeout:  The feedback message timeout SHALL be reset to the configured value each time a feedback message is received. If the timeout expires, the Sender SHALL be closed and no further load packets sent.
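A sketch of such a stop-test watchdog is shown below (Python asyncio, with illustrative names; last_rx["t"] is assumed to be updated with the event loop clock by the receive path):

   # Illustrative stop-test watchdog for the timeouts described above.
   import asyncio

   async def watchdog(timeout_s: float, last_rx: dict, close) -> None:
       """Invoke close() if nothing has been received within timeout_s."""
       loop = asyncio.get_running_loop()
       while True:
           await asyncio.sleep(timeout_s / 4)      # poll a few times per timeout
           if loop.time() - last_rx["t"] > timeout_s:
               close()   # Receiver: stop feedback; Sender: stop load packets
               return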
Parameter | Default | Tested Range or Values | Expected Safe Range (not entirely tested, other values NOT RECOMMENDED)
--------- | ------- | ---------------------- | ------------------------------------------------------------------------
FT, feedback time interval | 50 msec | 20 msec, 50 msec, 100 msec | 20 msec <= FT <= 250 msec; larger values may slow the rate increase and fail to find the max
Feedback message timeout (stop test) | L*FT, L=20 (1 sec with FT=50 msec) | L=100 with FT=50 msec (5 sec) | 0.5 sec <= L*FT <= 30 sec; upper limit for very unreliable test paths only
Load packet timeout (stop test) | 1 sec | 5 sec | 0.250-30 sec; upper limit for very unreliable test paths only
Table index 0 | 0.5 Mbps | 0.5 Mbps | When testing <= 10 Gbps
Table index 1 | 1 Mbps | 1 Mbps | When testing <= 10 Gbps
Table index (step) size | 1 Mbps | 1 Mbps <= rate <= 1 Gbps | Same as tested
Table index (step) size, rate > 1 Gbps | 100 Mbps | 1 Gbps <= rate <= 10 Gbps | Same as tested
Table index (step) size, rate > 10 Gbps | 1 Gbps | Untested | > 10 Gbps
ss, UDP payload size, bytes | None | <= 1222 | Recommend max at largest value that avoids fragmentation; using a payload size that is too small might result in unexpected Sender limitations
cc, burst count | None | 1 <= cc <= 100 | Same as tested. Vary cc as needed to create the desired maximum sending rate. Sender buffer size may limit cc in the implementation
tt, burst interval | 100 microsec | 100 microsec, 1 msec | Available range of "tick" values (HZ param)
Low delay range threshold | 30 msec | 5 msec, 30 msec | Same as tested
High delay range threshold | 90 msec | 10 msec, 90 msec | Same as tested
Sequence error threshold | 10 | 0, 1, 5, 10, 100 | Same as tested
Consecutive errored status report threshold | 3 | 2, 3, 4, 5 | Use values > 1 to avoid misinterpreting transient loss
Fast mode increase, in table index steps | 10 | 10 | 2 <= steps <= 30
Fast mode decrease, in table index steps | 3 * Fast mode increase | 3 * Fast mode increase | Same as tested

Table 1: Parameters for Load Rate Adjustment Algorithm
As a consequence of the default parameterization, the total number of table steps for rates up to 10 Gbps is 1090 (excluding index 0).
A related Sender backoff response to network conditions occurs when one or more status feedback messages fail to arrive at the Sender.
If no status feedback messages arrive at the Sender for an interval greater than the Lost Status Backoff timeout:
UDRT + (2+w)*FT = Lost Status Backoff timeout
where:
UDRT = upper delay range threshold (default 90 msec)
FT = feedback time interval (default 50 msec)
w = number of repeated timeouts (w=0 initially, w++ on each
timeout, and reset to 0 when a message is received)
Beginning when the last message (of any type) was successfully received at the Sender:
The offered load
SHALL then be decreased, following the same process as when the feedback indicates the presence of one or more sequence number anomalies OR the delay range was above the upper threshold (as described above), with the same load rate adjustment algorithm variables in their current state. This means that lost status feedback messages OR sequence errors OR delay variation can result in rate reduction and congestion confirmation.
The
RECOMMENDED initial value for w is 0, taking a Round-Trip Time (RTT) of less than FT into account. A test with an RTT longer than FT is a valid reason to increase the initial value of w appropriately. Variable w
SHALL be incremented by one whenever the Lost Status Backoff timeout is exceeded. So, with FT = 50 msec and UDRT = 90 msec, a status feedback message loss would be declared at 190 msec following a successful message, again at 50 msec after that (240 msec total), and so on.
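The worked example above can be reproduced with a few lines (Python, with the default values UDRT = 90 msec and FT = 50 msec; names follow the formula above):

   # Lost Status Backoff timeout with default Parameters.
   UDRT_MS = 90.0   # upper delay range threshold
   FT_MS   = 50.0   # feedback time interval

   def lost_status_backoff_ms(w: int) -> float:
       """Elapsed time since the last received message at which loss w+1 is declared."""
       return UDRT_MS + (2 + w) * FT_MS

   print([lost_status_backoff_ms(w) for w in range(3)])   # [190.0, 240.0, 290.0]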
Also, if congestion is now confirmed for the first time by a Lost Status Backoff timeout, then the offered load rate is decreased by more than one rate setting (e.g., Rx-30). This one-time reduction is intended to compensate for the fast initial ramp-up. In all other cases, the offered load rate is only decreased by one (Rx-1).
Appendix B discusses compliance with the applicable mandatory requirements of [
RFC 8085], consistent with the goals of the IP-Layer Capacity Metric and Method, including the load rate adjustment algorithm described in this section.
It is of course necessary to calibrate the equipment performing the IP-Layer Capacity measurement, to ensure that the expected capacity can be measured accurately and that equipment choices (processing speed, interface bandwidth, etc.) are suitably matched to the measurement range.
When assessing a maximum rate as the metric specifies, artificially high (optimistic) values might be measured until some buffer on the path is filled. Another cause is a path that delivers bursts of back-to-back packets separated by idle intervals, when the measurement interval (dt) is small and aligned with the bursts. Such artificial values might lead the Method of Measurement to report an unsustainable Maximum Capacity while it is searching for the maximum, which must be avoided. This situation is different from the bimodal service rates (discussed in Section 6.6, "Reporting the Metric"), which are characterized by a multi-second duration (much longer than the measured RTT) and repeatable behavior.
There are many ways that the Method of Measurement could handle this false-max issue. The default value for measurement of Singletons (dt = 1 second) has proven to be of practical value during tests of this method, allows the bimodal service rates to be characterized, and has an obvious alignment with the reporting units (Mbps).
Another approach comes from
Section 24 of
RFC 2544 and its discussion of trial duration, where relatively short trials conducted as part of the search are followed by longer trials to make the final determination. In the production network, measurements of Singletons and Samples (the terms for trials and tests of Lab Benchmarking) must be limited in duration because they may affect service. But there is sufficient value in repeating a Sample with a fixed sending rate determined by the previous search for the Maximum IP-Layer Capacity, to qualify the result in terms of the other performance metrics measured at the same time.
A Qualification measurement for the search result is a subsequent measurement, sending at a fixed 99.x percent of the Maximum IP-Layer Capacity for the duration I, or for an indefinite period. The same Maximum Capacity Metric is applied, and the result Qualifies if the Sample shows neither supra-threshold packet losses nor a growing minimum delay trend in subsequent Singletons (or in each dt of the measurement interval, I). Samples exhibiting supra-threshold packet losses or increasing queue occupation require a repeated search and/or a test at a reduced fixed Sender rate for Qualification.
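One way such a Qualification check might be expressed is sketched below (Python; the loss threshold and the simple monotonic-trend test are assumptions for illustration, not normative criteria):

   # Sketch of a Qualification check over the Singletons of a fixed-rate re-test.

   def qualifies(singletons, loss_threshold: int = 10) -> bool:
       """singletons: one dict per dt with 'losses' and 'min_delay_ms' keys."""
       if any(s["losses"] > loss_threshold for s in singletons):
           return False                    # supra-threshold packet losses
       min_delays = [s["min_delay_ms"] for s in singletons]
       # A steadily growing minimum delay suggests a standing queue is building.
       growing = (len(min_delays) > 1
                  and all(b > a for a, b in zip(min_delays, min_delays[1:])))
       return not growing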
Here, as with any Active Capacity test, the test duration must be kept short. Ten-second tests for each direction of transmission are common today. The default measurement interval specified here is I = 10 seconds. The combination of a fast and congestion-aware search method and user-network coordination makes a unique contribution to production testing. The Maximum IP Capacity Metric and Method for assessing performance is very different from the classic Throughput Metric and Methods provided in [
RFC 2544]: it uses near-real-time load adjustments that are sensitive to loss and delay, similar to other congestion control algorithms used on the Internet every day, along with limited duration. On the other hand, Throughput measurements [
RFC 2544] can produce sustained overload conditions for extended periods of time. Individual trials in a test governed by a binary search can last 60 seconds for each step, and the final confirmation trial may be even longer. This is very different from "normal" traffic levels, but overload conditions are not a concern in the isolated test environment. The concerns raised in [
RFC 6815] were that the methods discussed in [
RFC 2544] would be let loose on production networks, and instead the authors challenged the standards community to develop Metrics and Methods like those described in this memo.
In general, the widespread measurements that this memo encourages will encounter widespread behaviors. The bimodal IP Capacity behaviors already discussed in
Section 6.6 are good examples.
In general, it is
RECOMMENDED to locate test endpoints as close to the intended measured link(s) as practical (for reasons of scale, this is not always possible; many factors, such as management and measurement traffic, limit the number of test endpoints that can be deployed). The testing operator
MUST set a value for the MaxHops Parameter, based on the expected path length. This Parameter can keep measurement traffic from straying too far beyond the intended path.
The measured path may be stateful based on many factors, and the Parameter "Time of day" when a test starts may not be enough information. Repeatable testing may require knowledge of the time from the beginning of a measured flow -- and how the flow is constructed, including how much traffic has already been sent on that flow when a state change is observed -- because the state change may be based on time, bytes sent, or both. Both load packets and status feedback messages
MUST contain sequence numbers; this helps with measurements based on those packets.
Many different types of traffic shapers and on-demand communications access technologies may be encountered, as anticipated in [
RFC 7312], and play a key role in measurement results. Methods
MUST be prepared to provide a short preamble transmission to activate on-demand communications access and to discard the preamble from subsequent test results.
The following conditions might be encountered during measurement, where packet losses may occur independently of the measurement sending rate:
-  Congestion of an interconnection or backbone interface may appear as packet losses distributed over time in the test stream, due to much-higher-rate interfaces in the backbone.

-  Packet loss due to the use of Random Early Detection (RED) or other active queue management may or may not affect the measurement flow if competing background traffic (other flows) is simultaneously present.

-  There may be only a small delay variation independent of the sending rate under these conditions as well.

-  Persistent competing traffic on measurement paths that include shared transmission media may cause random packet losses in the test stream.
It is possible to mitigate these conditions using the flexibility of the load rate adjustment algorithm described in
Section 8.1 above (tuning specific Parameters).
If the measurement flow's bursts are on the order of, or smaller than, the burst size of a shaper or policer in the path, then the line rate might be measured rather than the bandwidth limit imposed by the shaper or policer. If this condition is suspected, alternate configurations
SHOULD be used.
In general, results depend on the sending stream's characteristics; the measurement community has known this for a long time and needs to keep it foremost in mind. Although the default is a single flow (F=1) for testing, the use of multiple flows may be advantageous for the following reasons:
-  The test hosts may be able to create a higher load than with a single flow, or parallel test hosts may be used to generate one flow each.

-  Link aggregation may be present (flow-based load balancing), and multiple flows are needed to occupy each member of the aggregate.

-  Internet access policies may limit the IP-Layer Capacity depending on the Type-P of the packets, possibly reserving capacity for various stream types.
Each flow would be controlled using its own implementation of the load rate adjustment (search) algorithm.
It is obviously counterproductive to run more than one independent and concurrent test (regardless of the number of flows in the test stream) attempting to measure the Maximum Capacity on a single path. The number of concurrent, independent tests of a path
SHALL be limited to one.
Tests of a v4-v6 transition mechanism might well be the intended subject of a capacity test. As long as both IPv4 packets and IPv6 packets sent/received are standard-formed, this should be allowed (and the change in header size easily accounted for on a per-packet basis).
As testing continues, implementers should expect the methods to evolve. The ITU-T has published a supplement (Supplement 60) to the Y-series of ITU-T Recommendations, "Interpreting ITU-T Y.1540 maximum IP-layer capacity measurements" [
Y.Sup60], which is the result of continued testing with the metric. Those results have improved the methods described here.