In this section, we review specific network management and measurement techniques and how QUIC's design impacts them.
Limited RTT measurement is possible by passive observation of QUIC traffic; see
Section 3.8. No passive measurement of loss is possible with the present wire image. Limited observation of upstream congestion may be possible via the observation of Congestion Experienced (CE) markings in the IP header [
RFC 3168] on ECN-enabled QUIC traffic.
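For illustration, the following minimal sketch (Python; the capture and flow-key plumbing are assumed, not specified here) counts CE-marked packets per flow by reading the ECN field, which occupies the low two bits of the second octet of the IPv4 header per [RFC 3168]:

   ECN_CE = 0b11  # Congestion Experienced codepoint

   def ecn_codepoint(ipv4_header: bytes) -> int:
       # Return the 2-bit ECN field from a raw IPv4 header.
       return ipv4_header[1] & 0b11

   ce_counts = {}  # 5-tuple -> number of CE-marked packets observed

   def observe(ipv4_header: bytes, flow_key: tuple) -> None:
       # Count CE marks as one coarse indicator of upstream congestion
       # on ECN-enabled QUIC flows.
       if ecn_codepoint(ipv4_header) == ECN_CE:
           ce_counts[flow_key] = ce_counts.get(flow_key, 0) + 1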
On-path devices can also make measurements of RTT, loss, and other performance metrics when information is carried in an additional network-layer packet header (
Section 6 of
RFC 9065 describes the use of Operations, Administration, and Maintenance (OAM) information). Using network-layer approaches also has the advantage that common observation and analysis tools can be consistently used for multiple transport protocols; however, these techniques are often limited to measurements within one or multiple cooperating domains.
Stateful treatment of QUIC traffic (e.g., at a firewall or NAT middlebox) is possible through QUIC traffic and version identification (
Section 3.1) and observation of the handshake for connection confirmation (
Section 3.2). The lack of any visible end-of-flow signal (
Section 3.6) means that this state must be purged either through timers or least-recently-used eviction depending on application requirements.
While QUIC has no clear network-visible end-of-flow signal and therefore does require timer-based state removal, the QUIC handshake indicates confirmation by both ends of a valid bidirectional transmission. As soon as the handshake has completed, timers should be set long enough to also allow for short idle times during a valid transmission.
[
RFC 4787] requires a network state timeout that is not less than 2 minutes for most UDP traffic. However, in practice, a QUIC endpoint can experience lower timeouts in the range of 30 to 60 seconds [
QUIC-TIMEOUT].
In contrast, [
RFC 5382] recommends a state timeout of more than 2 hours for TCP given that TCP is a connection-oriented protocol with well-defined closure semantics. Even though QUIC has explicitly been designed to tolerate NAT rebindings, decreasing the NAT timeout is not recommended as it may negatively impact application performance or incentivize endpoints to send very frequent keep-alive packets.
Therefore, a state timeout of at least two minutes is recommended for QUIC traffic, even when lower state timeouts are used for other UDP traffic.
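As a minimal sketch of such timer-based state management (Python; the handshake-confirmation check of Section 3.2 is abstracted behind a hypothetical flag supplied by the caller), a NAT or firewall might keep per-5-tuple state and switch to an idle timeout of at least two minutes once the handshake is confirmed:

   import time

   UNCONFIRMED_TIMEOUT = 30   # seconds, for flows that never confirm
   CONFIRMED_TIMEOUT = 120    # seconds, at least 2 minutes as recommended above

   flows = {}  # 5-tuple -> {"last_seen": float, "confirmed": bool}

   def on_packet(five_tuple, handshake_confirmed: bool) -> None:
       entry = flows.setdefault(five_tuple,
                                {"last_seen": 0.0, "confirmed": False})
       entry["last_seen"] = time.monotonic()
       entry["confirmed"] = entry["confirmed"] or handshake_confirmed

   def expire_idle_flows() -> None:
       now = time.monotonic()
       for key, entry in list(flows.items()):
           timeout = CONFIRMED_TIMEOUT if entry["confirmed"] else UNCONFIRMED_TIMEOUT
           if now - entry["last_seen"] > timeout:
               del flows[key]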
If state is removed too early, incoming packets could be black-holed after a short idle period. Detecting this situation requires a timer at the client to expire before re-establishment can happen (if it happens at all), leading to unnecessarily long delays on an otherwise working connection.
Furthermore, not all endpoints use routing architectures where connections will survive a port or address change. Even when the client revives the connection, a NAT rebinding can cause a routing mismatch where a packet is not even delivered to the server that might support address migration. For these reasons, the limits in [
RFC 4787] are important to avoid black-holing of packets (and hence avoid interrupting the flow of data to the client), especially where devices are able to distinguish QUIC traffic from other UDP payloads.
The QUIC header optionally contains a connection ID, which could provide additional entropy beyond the 5-tuple. The QUIC handshake needs to be observed in order to understand whether the connection ID is present and what length it has. However, connection IDs may be renegotiated after the handshake, and this renegotiation is not visible to the path. Therefore, using the connection ID as a flow key field for stateful treatment of flows is not recommended as connection ID changes will cause undetectable and unrecoverable loss of state in the middle of a connection. In particular, the use of the connection ID for functions that require state to make a forwarding decision is not viable as it will break connectivity, or at minimum, cause long timeout-based delays before this problem is detected by the endpoints and the connection can potentially be re-established.
Use of connection IDs is specifically discouraged for NAT applications. If a NAT hits an operational limit, it is recommended to rather drop the initial packets of a flow (see also
Section 4.5), which potentially triggers TCP fallback. Use of the connection ID to multiplex multiple connections on the same IP address/port pair is not a viable solution as it risks connectivity breakage in case the connection ID changes.
While QUIC's migration capability makes it possible for a connection to survive client address changes, this does not work if the routers or switches in the server infrastructure route using the address-port 4-tuple. If infrastructure routes on addresses only, NAT rebinding or address migration will cause packets to be delivered to the wrong server. [
QUIC-LB] describes a way to address this problem by coordinating the selection and use of connection IDs between load balancers and servers.
Applying address translation at a middlebox to maintain a stable address-port mapping for flows based on connection ID might seem like a solution to this problem. However, hiding information about the change of the IP address or port conceals important and security-relevant information from QUIC endpoints, and as such, would facilitate amplification attacks (see
Section 8 of [
QUIC-TRANSPORT]). A NAT function that hides peer address changes prevents the other end from detecting and mitigating attacks as the endpoint cannot verify connectivity to the new address using QUIC PATH_CHALLENGE and PATH_RESPONSE frames.
In addition, a change of IP address or port is also an input signal to other internal mechanisms in QUIC. When a path change is detected, path-dependent variables like congestion control parameters will be reset, which protects the new path from overload.
In the case of networking architectures that include load balancers, the connection ID can be used as a way for the server to signal information about the desired treatment of a flow to the load balancers. Guidance on assigning connection IDs is given in [
QUIC-APPLICABILITY]. [
QUIC-LB] describes a system for coordinating selection and use of connection IDs between load balancers and servers.
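As a simplified illustration of this idea (Python; the layout below is purely hypothetical and does not follow the concrete encodings defined in [QUIC-LB]), a server could embed a routable server identifier in the connection IDs it issues, and the load balancer could extract that identifier from the Destination Connection ID of observed packets to select a backend:

   import os

   SERVER_ID_LEN = 2  # bytes of server ID embedded in each connection ID
   CID_LEN = 8        # connection ID length agreed across the server pool

   def issue_cid(server_id: int) -> bytes:
       # Server side: build a connection ID carrying this server's ID.
       return (server_id.to_bytes(SERVER_ID_LEN, "big")
               + os.urandom(CID_LEN - SERVER_ID_LEN))

   def route(dcid: bytes, backends: dict) -> str:
       # Load balancer side: map the Destination Connection ID to a backend.
       return backends[int.from_bytes(dcid[:SERVER_ID_LEN], "big")]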
[
RFC 4787] describes possible packet-filtering behaviors that relate to NATs but are often also used in other scenarios where packet filtering is desired. Though the guidance there holds, a particularly unwise behavior admits a handful of UDP packets and then decides whether or not to filter later packets in the same connection. QUIC applications are encouraged to fall back to TCP if early packets do not arrive at their destination [
QUIC-APPLICABILITY], as QUIC is based on UDP and there are known blocks of UDP traffic (see
Section 4.6). Admitting a few packets allows the QUIC endpoint to determine that the path accepts QUIC. Sudden drops afterwards will result in slow and costly timeouts before abandoning the connection.
Today, UDP is the most prevalent DDoS vector, since it is easy for compromised non-admin applications to send a flood of large UDP packets (while with TCP the attacker gets throttled by the congestion controller) or to craft reflection and amplification attacks; therefore, some networks block UDP traffic. With increased deployment of QUIC, there is also an increased need to allow UDP traffic on ports used for QUIC. However, if UDP is generally enabled on these ports, UDP flood attacks may also use the same ports. One possible response to this threat is to throttle UDP traffic on the network, allocating a fixed portion of the network capacity to UDP and blocking UDP datagrams over that cap. As the portion of QUIC traffic compared to TCP is also expected to increase over time, using such a limit is not recommended; if this is done, limits might need to be adapted dynamically.
Further, if UDP traffic is desired to be throttled, it is recommended to block individual QUIC flows entirely rather than dropping packets indiscriminately. When the handshake is blocked, QUIC-capable applications may fall back to TCP. However, blocking a random fraction of QUIC packets across 4-tuples will allow many QUIC handshakes to complete, preventing TCP fallback, but these connections will suffer from severe packet loss (see also
Section 4.5). Therefore, UDP throttling should be realized by per-flow policing as opposed to per-packet policing. Note that this per-flow policing should be stateless to avoid problems with stateful treatment of QUIC flows (see
Section 4.2), for example, blocking a portion of the space of values of a hash function over the addresses and ports in the UDP datagram. While QUIC endpoints are often able to survive address changes, e.g., by NAT rebindings, blocking a portion of the traffic based on 5-tuple hashing increases the risk of black-holing an active connection when the address changes.
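The following sketch (Python; the thresholds and field extraction are assumptions for illustration) shows such stateless per-flow policing: each datagram's addresses and ports are hashed, and flows whose hash falls into a configured fraction of the hash space are blocked entirely, so an affected QUIC handshake fails cleanly and can trigger TCP fallback rather than random in-connection loss:

   import hashlib

   BLOCKED_FRACTION = 0.10  # e.g., block 10% of flows under overload

   def flow_hash(src_ip: str, dst_ip: str,
                 src_port: int, dst_port: int) -> float:
       data = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
       digest = hashlib.sha256(data).digest()
       return int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)

   def admit(src_ip, dst_ip, src_port, dst_port) -> bool:
       # All packets of a 5-tuple share one hash value, so a blocked
       # flow is blocked consistently rather than suffering random loss.
       return flow_hash(src_ip, dst_ip, src_port, dst_port) >= BLOCKED_FRACTION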
Note that some source ports are assumed to be reflection attack vectors by some servers; see
Section 8.1 of [
QUIC-APPLICABILITY]. As a result, NAT binding to these source ports can result in that traffic being blocked.
On-path observation of the transport headers of packets can be used for various security functions. For example, Denial of Service (DoS) and Distributed DoS (DDoS) attacks against the infrastructure or against an endpoint can be detected and mitigated by characterizing anomalous traffic. Other uses include support for security audits (e.g., verifying the compliance with cipher suites), client and application fingerprinting for inventory, and providing alerts for network intrusion detection and other next-generation firewall functions.
Current practices in detection and mitigation of DDoS attacks generally involve classification of incoming traffic (as packets, flows, or some other aggregate) into "good" (productive) and "bad" (DDoS) traffic, and then differential treatment of this traffic to forward only good traffic. This operation is often done in a separate specialized mitigation environment through which all traffic is filtered; a generalized architecture for separation of concerns in mitigation is given in [
DOTS-ARCH].
Efficient classification of this DDoS traffic in the mitigation environment is key to the success of this approach. Limited first packet garbage detection as in
Section 3.1.2 and stateful tracking of QUIC traffic as mentioned in
Section 4.2 above may be useful during classification.
Note that using a connection ID to support connection migration renders 5-tuple-based filtering insufficient to detect active flows and requires more state to be maintained by DDoS defense systems if support of migration of QUIC flows is desired. For the common case of NAT rebinding, where the client's address changes without the client's intent or knowledge, DDoS defense systems can detect a change in the client's endpoint address by linking flows based on the server's connection IDs. However, QUIC's linkability resistance ensures that a deliberate connection migration is accompanied by a change in the connection ID. In this case, the connection ID cannot be used to distinguish valid, active traffic from new attack traffic.
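A defense system might therefore link flows across a NAT rebinding by keying state on the server-chosen connection ID in addition to the 5-tuple, as in the following sketch (Python; how the connection ID is learned and parsed from observed packets is assumed, not shown):

   known_server_cids = set()   # server-chosen connection IDs seen in handshakes
   active_five_tuples = set()  # 5-tuples of currently active, validated flows

   def classify(five_tuple, dcid: bytes) -> str:
       if five_tuple in active_five_tuples:
           return "active flow"
       if dcid in known_server_cids:
           # Same connection ID on a new 5-tuple: likely a NAT rebinding
           # of an existing connection rather than new attack traffic.
           return "rebound flow"
       return "unclassified (candidate for mitigation)"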
It is also possible for endpoints to directly support security functions such as DoS classification and mitigation. Endpoints can cooperate with an in-network device directly by e.g., sharing information about connection IDs.
Another potential method could use an on-path network device that relies on pattern inferences in the traffic and heuristics or machine learning instead of processing observed header information.
However, it is questionable whether connection migrations must be supported during a DDoS attack. While unintended migration without a connection ID change can be supported much more easily, it might be acceptable to not support migrations of active QUIC connections that are not visible to the network functions performing the DDoS detection. As soon as the connection blocking is detected by the client, the client may be able to rely on the 0-RTT data mechanism provided by QUIC. When clients migrate to a new path, they should be prepared for the migration to fail and attempt to reconnect quickly.
Beyond in-network DDoS protection mechanisms, TCP SYN cookies [
RFC 4987] are a well-established method of mitigating some kinds of TCP DDoS attacks. QUIC Retry packets are the functional analogue to SYN cookies, forcing clients to prove possession of their IP address before committing server state. However, there are safeguards in QUIC against unsolicited injection of these packets by intermediaries who do not have consent of the end server. See [
QUIC-RETRY] for standard ways for intermediaries to send Retry packets on behalf of consenting servers.
It is expected that any QoS handling in the network, e.g., based on use of Diffserv Code Points (DSCPs) [
RFC 2475] as well as Equal-Cost Multi-Path (ECMP) routing, is applied on a per-flow basis (and not per-packet) and as such that all packets belonging to the same active QUIC connection get uniform treatment.
Using ECMP to distribute packets from a single flow across multiple network paths or any other nonuniform treatment of packets belonging to the same connection could result in variations in order, delivery rate, and drop rate. As feedback about loss or delay of each packet is used as input to the congestion controller, these variations could adversely affect performance. Depending on the loss recovery mechanism that is implemented, QUIC may be more tolerant of packet reordering than typical TCP traffic (see
Section 2.7). However, the recovery mechanism used by a flow cannot be known by the network and therefore reordering tolerance should be considered as unknown.
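For example, a router applying ECMP could select the next hop from a hash of the UDP 5-tuple rather than per packet, as in the following sketch (Python; purely illustrative), so that all packets of a connection follow one path and receive uniform treatment:

   import hashlib

   def pick_next_hop(src_ip, dst_ip, src_port, dst_port,
                     next_hops: list) -> str:
       key = f"17|{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()  # 17 = UDP
       index = int.from_bytes(hashlib.sha256(key).digest()[:4], "big") \
               % len(next_hops)
       return next_hops[index]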
Note that the 5-tuple of a QUIC connection can change due to migration. In this case different flows are observed by the path and may be treated differently, as congestion control is usually reset on migration (see also
Section 3.5).
Datagram Packetization Layer PMTU Discovery (DPLPMTUD) can be used by QUIC to probe for the supported PMTU. DPLPMTUD optionally uses ICMP messages (e.g., IPv6 Packet Too Big (PTB) messages). Given known attacks with the use of ICMP messages, the use of DPLPMTUD in QUIC has been designed to safely use but not rely on receiving ICMP feedback (see
Section 14.2.1 of [
QUIC-TRANSPORT]).
Networks are recommended to forward these ICMP messages and retain as much of the original packet as possible without exceeding the minimum MTU for the IP version when generating ICMP messages as recommended in [
RFC 1812] and [
RFC 4443].
Some network segments support 1500-byte packets, but can only do so by fragmenting at a lower layer before traversing a network segment with a smaller MTU, and then reassembling within the network segment. This is permissible even when the IP layer is IPv6 or IPv4 with the Don't Fragment (DF) bit set, because fragmentation occurs below the IP layer. However, this process can add to compute and memory costs, leading to a bottleneck that limits network capacity. In such networks, this generates a desire to influence a majority of senders to use smaller packets to avoid exceeding limited reassembly capacity.
For TCP, Maximum Segment Size (MSS) clamping (
Section 3.2 of
RFC 4459) is often used to change the sender's TCP maximum segment size, but QUIC requires a different approach.
Section 14 of [
QUIC-TRANSPORT] advises senders to probe larger sizes using DPLPMTUD [
DPLPMTUD] or Path Maximum Transmission Unit Discovery (PMTUD) [
RFC 1191] [
RFC 8201]. This mechanism encourages senders to approach the maximum packet size, which could then cause fragmentation within a network segment of which they may not be aware.
If path performance is limited when forwarding larger packets, an on-path device should support a maximum packet size for a specific transport flow and then consistently drop all packets that exceed the configured size when the inner IPv4 packet has DF set or IPv6 is used.
Networks with configurations that would lead to fragmentation of large packets within a network segment should drop such packets rather than fragmenting them. Network operators who plan to implement a more selective policy may start by focusing on QUIC.
QUIC flows cannot always be easily distinguished from other UDP traffic, but we assume at least some portion of QUIC traffic can be identified (see
Section 3.1). For networks supporting QUIC, it is recommended that a path drop any packet larger than the fragmentation size. When a QUIC endpoint uses DPLPMTUD, it will use a QUIC probe packet to discover the PMTU. If this probe is lost, it will not impact the flow of QUIC data.
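A minimal sketch of such a per-flow size limit follows (Python; the configured limit and packet classification are assumptions for illustration). Packets within the limit are forwarded; larger packets are dropped when fragmentation is not permitted, so DPLPMTUD probes above the limit are lost and the sender settles on a smaller packet size:

   CONFIGURED_LIMIT = 1280  # bytes; example value, at least the IPv6 minimum MTU

   def forward_decision(packet_len: int, is_ipv6: bool,
                        ipv4_df_set: bool) -> str:
       if packet_len <= CONFIGURED_LIMIT:
           return "forward"
       if is_ipv6 or ipv4_df_set:
           return "drop"  # and generate a PTB message, as discussed below
       return "fragment"  # IPv4 without DF may still be fragmented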
IPv4 routers generate an ICMP message when a packet is dropped because the link MTU was exceeded. [
RFC 8504] specifies how an IPv6 node generates an ICMPv6 PTB in this case. PMTUD relies upon an endpoint receiving such PTB messages [
RFC 8201], whereas DPLPMTUD does not rely upon these messages but can still optionally use them to improve performance (see Section 4.6 of [DPLPMTUD]).
A network cannot know in advance which discovery method is used by a QUIC endpoint, so it should send a PTB message in addition to dropping an oversized packet. A generated PTB message should be compliant with the validation requirements of
Section 14.2.1 of [
QUIC-TRANSPORT], otherwise it will be ignored for PMTU discovery. This provides a signal to the endpoint to prevent the packet size from growing too large, which can entirely avoid network segment fragmentation for that flow.
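As an illustration of generating such a message (Python; checksum computation and the enclosing IPv6 header are left to the sending stack), an ICMPv6 PTB (Type 2, per [RFC 4443]) can quote as much of the dropped packet as fits without the resulting ICMPv6 message exceeding the minimum IPv6 MTU, which also gives the endpoint enough context to apply the validation in Section 14.2.1 of [QUIC-TRANSPORT]:

   import struct

   IPV6_MIN_MTU = 1280
   IPV6_HEADER_LEN = 40
   ICMPV6_PTB_HEADER_LEN = 8  # type, code, checksum, MTU

   def build_ptb_body(original_packet: bytes, link_mtu: int) -> bytes:
       quote_limit = IPV6_MIN_MTU - IPV6_HEADER_LEN - ICMPV6_PTB_HEADER_LEN
       quoted = original_packet[:quote_limit]
       # Type = 2 (Packet Too Big), Code = 0, checksum filled in by the stack.
       return struct.pack("!BBHI", 2, 0, 0, link_mtu) + quoted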
Endpoints can cache PMTU information in the IP-layer cache. This short-term consistency between the PMTU for flows can help avoid an endpoint using a PMTU that is inefficient. The IP cache can also influence the PMTU value of other IP flows that use the same path [
RFC 8201] [
DPLPMTUD], including IP packets carrying protocols other than QUIC. The representation of an IP path is implementation specific [
RFC 8201].