7. Sequence Numbers
DCCP uses sequence numbers to arrange packets into sequence, to detect losses and network duplicates, and to protect against attackers, half-open connections, and the delivery of very old packets. Every packet carries a Sequence Number; most packet types carry an Acknowledgement Number as well. DCCP sequence numbers are packet based. That is, Sequence Numbers generated by each endpoint increase by one, modulo 2^48, per packet. Even DCCP-Ack and DCCP-Sync packets, and other packets that don't carry user data, increment the Sequence Number. Since DCCP is an unreliable protocol, there are no true retransmissions, but effective retransmissions, such as retransmissions of DCCP-Request packets, also increment the Sequence Number. This lets DCCP implementations
detect network duplication, retransmissions, and acknowledgement loss; it is a significant departure from TCP practice.7.1. Variables
DCCP endpoints maintain a set of sequence number variables for each connection. ISS The Initial Sequence Number Sent by this endpoint. This equals the Sequence Number of the first DCCP-Request or DCCP-Response sent. ISR The Initial Sequence Number Received from the other endpoint. This equals the Sequence Number of the first DCCP-Request or DCCP-Response received. GSS The Greatest Sequence Number Sent by this endpoint. Here, and elsewhere, "greatest" is measured in circular sequence space. GSR The Greatest Sequence Number Received from the other endpoint on an acknowledgeable packet. (Section 7.4 defines this term.) GAR The Greatest Acknowledgement Number Received from the other endpoint on an acknowledgeable packet that was not a DCCP- Sync. Some other variables are derived from these primitives. SWL and SWH (Sequence Number Window Low and High) The extremes of the validity window for received packets' Sequence Numbers. AWL and AWH (Acknowledgement Number Window Low and High) The extremes of the validity window for received packets' Acknowledgement Numbers.7.2. Initial Sequence Numbers
The endpoints' initial sequence numbers are set by the first DCCP- Request and DCCP-Response packets sent. Initial sequence numbers MUST be chosen to avoid two problems: o delivery of old packets, where packets lingering in the network from an old connection are delivered to a new connection with the same addresses and port numbers; and
o sequence number attacks, where an attacker can guess the sequence numbers that a future connection would use [M85]. These problems are the same as those faced by TCP, and DCCP implementations SHOULD use TCP's strategies to avoid them [RFC793, RFC1948]. The rest of this section explains these strategies in more detail. To address the first problem, an implementation MUST ensure that the initial sequence number for a given <source address, source port, destination address, destination port> 4-tuple doesn't overlap with recent sequence numbers on previous connections with the same 4-tuple. ("Recent" means sent within 2 maximum segment lifetimes, or 4 minutes.) The implementation MUST additionally ensure that the lower 24 bits of the initial sequence number don't overlap with the lower 24 bits of recent sequence numbers (unless the implementation plans to avoid short sequence numbers; see Section 7.6). An implementation that has state for a recent connection with the same 4-tuple can pick a good initial sequence number explicitly. Otherwise, it could tie initial sequence number selection to some clock, such as the 4-microsecond clock used by TCP [RFC793]. Two separate clocks may be required, one for the upper 24 bits and one for the lower 24 bits. To address the second problem, an implementation MUST provide each 4-tuple with an independent initial sequence number space. Then, opening a connection doesn't provide any information about initial sequence numbers on other connections to the same host. [RFC1948] achieves this by adding a cryptographic hash of the 4-tuple and a secret to each initial sequence number. For the secret, [RFC1948] recommends a combination of some truly random data [RFC4086], an administratively installed passphrase, the endpoint's IP address, and the endpoint's boot time, but truly random data is sufficient. Care should be taken when the secret is changed; such a change alters all initial sequence number spaces, which might make an initial sequence number for some 4-tuple equal a recently sent sequence number for the same 4-tuple. To avoid this problem, the endpoint might remember dead connection state for each 4-tuple or stay quiet for 2 maximum segment lifetimes around such a change.7.3. Quiet Time
DCCP endpoints, like TCP endpoints, must take care before initiating connections when they boot. In particular, they MUST NOT send packets whose sequence numbers are close to the sequence numbers of packets lingering in the network from before the boot. The simplest way to enforce this rule is for DCCP endpoints to avoid sending any packets until one maximum segment lifetime (2 minutes) after boot.
Other enforcement mechanisms include remembering recent sequence numbers across boots and reserving the upper 8 or so bits of initial sequence numbers for a persistent counter that decrements by two each boot. (The latter mechanism would require disallowing packets with short sequence numbers; see Section 7.6.1.)7.4. Acknowledgement Numbers
Cumulative acknowledgements are meaningless in an unreliable protocol. Therefore, DCCP's Acknowledgement Number field has a different meaning from TCP's. A received packet is classified as acknowledgeable if and only if its header was successfully processed by the receiving DCCP. In terms of the pseudocode in Section 8.5, a received packet becomes acknowledgeable when the receiving endpoint reaches Step 8. This means, for example, that all acknowledgeable packets have valid header checksums and sequence numbers. A sent packet's Acknowledgement Number MUST equal the sending endpoint's GSR, the Greatest Sequence Number Received on an acknowledgeable packet, for all packet types except DCCP-Sync and DCCP-SyncAck. "Acknowledgeable" does not refer to data processing. Even acknowledgeable packets may have their application data dropped, due to receive buffer overflow or corruption, for instance. Data Dropped options report these data losses when necessary, letting congestion control mechanisms distinguish between network losses and endpoint losses. This issue is discussed further in Sections 11.4 and 11.7. DCCP-Sync and DCCP-SyncAck packets' Acknowledgement Numbers differ as follows: The Acknowledgement Number on a DCCP-Sync packet corresponds to a received packet, but not necessarily to an acknowledgeable packet; in particular, it might correspond to an out-of-sync packet whose options were not processed. The Acknowledgement Number on a DCCP-SyncAck packet always corresponds to an acknowledgeable DCCP- Sync packet; it might be less than GSR in the presence of reordering.7.5. Validity and Synchronization
Any DCCP endpoint might receive packets that are not actually part of the current connection. For instance, the network might deliver an old packet, an attacker might attempt to hijack a connection, or the other endpoint might crash, causing a half-open connection. DCCP, like TCP, uses sequence number checks to detect these cases. Packets whose Sequence and/or Acknowledgement Numbers are out of range are called sequence-invalid and are not processed normally.
Unlike TCP, DCCP requires a synchronization mechanism to recover from large bursts of loss. One endpoint might send so many packets during a burst of loss that when one of its packets finally got through, the other endpoint would label its Sequence Number as invalid. A handshake of DCCP-Sync and DCCP-SyncAck packets recovers from this case.7.5.1. Sequence and Acknowledgement Number Windows
Each DCCP endpoint defines sequence validity windows that are subsets of the Sequence and Acknowledgement Number spaces. These windows correspond to packets the endpoint expects to receive in the next few round-trip times. The Sequence and Acknowledgement Number windows always contain GSR and GSS, respectively. The window widths are controlled by Sequence Window features for the two half-connections. The Sequence Number validity window for packets from DCCP B is [SWL, SWH]. This window always contains GSR, the Greatest Sequence Number Received on a sequence-valid packet from DCCP B. It is W packets wide, where W is the value of the Sequence Window/B feature. One- fourth of the sequence window, rounded down, is less than or equal to GSR, and three-fourths is greater than GSR. (This asymmetric placement assumes that bursts of loss are more common in the network than significant reorderings.) invalid | valid Sequence Numbers | invalid <---------*|*===========*=======================*|*---------> GSR -|GSR + 1 - GSR GSR +|GSR + 1 + floor(W/4)|floor(W/4) ceil(3W/4)|ceil(3W/4) = SWL = SWH The Acknowledgement Number validity window for packets from DCCP B is [AWL, AWH]. The high end of the window, AWH, equals GSS, the Greatest Sequence Number Sent by DCCP A; the window is W' packets wide, where W' is the value of the Sequence Window/A feature. invalid | valid Acknowledgement Numbers | invalid <---------*|*===================================*|*---------> GSS - W'|GSS + 1 - W' GSS|GSS + 1 = AWL = AWH SWL and AWL are initially adjusted so that they are not less than the initial Sequence Numbers received and sent, respectively: SWL := max(GSR + 1 - floor(W/4), ISR), AWL := max(GSS + 1 - W', ISS).
These adjustments MUST be applied only at the beginning of the connection. (Long-lived connections may wrap sequence numbers so that they appear to be less than ISR or ISS; the adjustments MUST NOT be applied in that case.)7.5.2. Sequence Window Feature
The Sequence Window/A feature determines the width of the Sequence Number validity window used by DCCP B and the width of the Acknowledgement Number validity window used by DCCP A. DCCP A sends a "Change L(Sequence Window, W)" option to notify DCCP B that the Sequence Window/A value is W. Sequence Window has feature number 3 and is non-negotiable. It takes 48-bit (6-byte) integer values, like DCCP sequence numbers. Change and Confirm options for Sequence Window are therefore 9 bytes long. New connections start with Sequence Window 100 for both endpoints. The minimum valid Sequence Window value is Wmin = 32. The maximum valid Sequence Window value is Wmax = 2^46 - 1 = 70368744177663. Change options suggesting Sequence Window values out of this range are invalid and MUST be handled accordingly. A proper Sequence Window/A value must reflect the number of packets DCCP A expects to be in flight. Only DCCP A can anticipate this number. Values that are too small increase the risk of the endpoints getting out sync after bursts of loss, and values that are much too small can prevent productive communication whether or not there is loss. On the other hand, too-large values increase the risk of connection hijacking; Section 7.5.5 quantifies this risk. One good guideline is for each endpoint to set Sequence Window to about five times the maximum number of packets it expects to send in a round- trip time. Endpoints SHOULD send Change L(Sequence Window) options, as necessary, as the connection progresses. Also, an endpoint MUST NOT persistently send more than its Sequence Window number of packets per round-trip time; that is, DCCP A MUST NOT persistently send more than Sequence Window/A packets per RTT.7.5.3. Sequence-Validity Rules
Sequence-validity depends on the received packet's type. This table shows the sequence and acknowledgement number checks applied to each packet; a packet is sequence-valid if it passes both tests, and sequence-invalid if it does not. Many of the checks refer to the sequence and acknowledgement number validity windows [SWL, SWH] and [AWL, AWH] defined in Section 7.5.1.
Acknowledgement Number Packet Type Sequence Number Check Check ----------- --------------------- ---------------------- DCCP-Request SWL <= seqno <= SWH (*) N/A DCCP-Response SWL <= seqno <= SWH (*) AWL <= ackno <= AWH DCCP-Data SWL <= seqno <= SWH N/A DCCP-Ack SWL <= seqno <= SWH AWL <= ackno <= AWH DCCP-DataAck SWL <= seqno <= SWH AWL <= ackno <= AWH DCCP-CloseReq GSR < seqno <= SWH GAR <= ackno <= AWH DCCP-Close GSR < seqno <= SWH GAR <= ackno <= AWH DCCP-Reset GSR < seqno <= SWH GAR <= ackno <= AWH DCCP-Sync SWL <= seqno AWL <= ackno <= AWH DCCP-SyncAck SWL <= seqno AWL <= ackno <= AWH (*) Check not applied if connection is in LISTEN or REQUEST state. In general, packets are sequence-valid if their Sequence and Acknowledgement Numbers lie within the corresponding valid windows, [SWL, SWH] and [AWL, AWH]. The exceptions to this rule are as follows: o Since DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets end a connection, they cannot have Sequence Numbers less than or equal to GSR, or Acknowledgement Numbers less than GAR. o DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly checked. These packet types exist specifically to get the endpoints back into sync; checking their Sequence Numbers would eliminate their usefulness. The lenient checks on DCCP-Sync and DCCP-SyncAck packets allow continued operation after unusual events, such as endpoint crashes and large bursts of loss, but there's no need for leniency in the absence of unusual events -- that is, during ongoing successful communication. Therefore, DCCP implementations SHOULD use the following, more stringent checks for active connections, where a connection is considered active if it has received valid packets from the other endpoint within the last three round-trip times. Acknowledgement Number Packet Type Sequence Number Check Check ----------- --------------------- ---------------------- DCCP-Sync SWL <= seqno <= SWH AWL <= ackno <= AWH DCCP-SyncAck SWL <= seqno <= SWH AWL <= ackno <= AWH Finally, an endpoint MAY apply the following more stringent checks to DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets, further lowering the probability of successful blind attacks using those packet types.
Since these checks can cause extra synchronization overhead and delay connection closing when packets are lost, they should be considered experimental. Acknowledgement Number Packet Type Sequence Number Check Check ----------- --------------------- ---------------------- DCCP-CloseReq seqno == GSR + 1 GAR <= ackno <= AWH DCCP-Close seqno == GSR + 1 GAR <= ackno <= AWH DCCP-Reset seqno == GSR + 1 GAR <= ackno <= AWH Note that sequence-validity is only one of the validity checks applied to received packets.7.5.4. Handling Sequence-Invalid Packets
Endpoints respond to received sequence-invalid packets as follows. o Any sequence-invalid DCCP-Sync or DCCP-SyncAck packet MUST be ignored. o A sequence-invalid DCCP-Reset packet MUST elicit a DCCP-Sync packet in response (subject to a possible rate limit). This response packet MUST use a new Sequence Number, and thus will increase GSS; GSR will not change, however, since the received packet was sequence-invalid. The response packet's Acknowledgement Number MUST equal GSR. o Any other sequence-invalid packet MUST elicit a similar DCCP-Sync packet, except that the response packet's Acknowledgement Number MUST equal the sequence-invalid packet's Sequence Number. On receiving a sequence-valid DCCP-Sync packet, the peer endpoint (say, DCCP B) MUST update its GSR variable and reply with a DCCP- SyncAck packet. The DCCP-SyncAck packet's Acknowledgement Number will equal the DCCP-Sync's Sequence Number, which is not necessarily GSR. Upon receiving this DCCP-SyncAck, which will be sequence-valid since it acknowledges the DCCP-Sync, DCCP A will update its GSR variable, and the endpoints will be back in sync. As an exception, if the peer endpoint is in the REQUEST state, it MUST respond with a DCCP-Reset instead of a DCCP-SyncAck. This serves to clean up DCCP A's half-open connection. To protect against denial-of-service attacks, DCCP implementations SHOULD impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets, such as not more than eight DCCP-Syncs per second.
DCCP endpoints MUST NOT process sequence-invalid packets except, perhaps, by generating a DCCP-Sync. For instance, options MUST NOT be processed. An endpoint MAY temporarily preserve sequence-invalid packets in case they become valid later, however; this can reduce the impact of bursts of loss by delivering more packets to the application. In particular, an endpoint MAY preserve sequence- invalid packets for up to 2 round-trip times. If, within that time, the relevant sequence windows change so that the packets become sequence-valid, the endpoint MAY process them again. Note that sequence-invalid DCCP-Reset packets cause DCCP-Syncs to be generated. This is because endpoints in an unsynchronized state (CLOSED, REQUEST, and LISTEN) might not have enough information to generate a proper DCCP-Reset on the first try. For example, if a peer endpoint is in CLOSED state and receives a DCCP-Data packet, it cannot guess the right Sequence Number to use on the DCCP-Reset it generates (since the DCCP-Data packet has no Acknowledgement Number). The DCCP-Sync generated in response to this bad reset serves as a challenge, and contains enough information for the peer to generate a proper DCCP-Reset. However, the new DCCP-Reset may carry a different Reset Code than the original DCCP-Reset; probably the new Reset Code will be 3, "No Connection". The endpoint SHOULD use information from the original DCCP-Reset when possible.7.5.5. Sequence Number Attacks
Sequence and Acknowledgement Numbers form DCCP's main line of defense against attackers. An attacker that cannot guess sequence numbers cannot easily manipulate or hijack a DCCP connection, and requirements like careful initial sequence number choice eliminate the most serious attacks. An attacker might still send many packets with randomly chosen Sequence and Acknowledgement Numbers, however. If one of those probes ends up sequence-valid, it may shut down the connection or otherwise cause problems. The easiest such attacks to execute are as follows: o Send DCCP-Data packets with random Sequence Numbers. If one of these packets hits the valid sequence number window, the attack packet's application data may be inserted into the data stream. o Send DCCP-Sync packets with random Sequence and Acknowledgement Numbers. If one of these packets hits the valid acknowledgement number window, the receiver will shift its sequence number window accordingly, getting out of sync with the correct endpoint -- perhaps permanently.
The attacker has to guess both Source and Destination Ports for any of these attacks to succeed. Additionally, the connection would have to be inactive for the DCCP-Sync attack to succeed, assuming the victim implemented the more stringent checks for active connections recommended in Section 7.5.3. To quantify the probability of success, let N be the number of attack packets the attacker is willing to send, W be the relevant sequence window width, and L be the length of sequence numbers (24 or 48). The attacker's best strategy is to space the attack packets evenly over sequence space. Then the probability of hitting one sequence number window is P = WN/2^L. The success probability for a DCCP-Data attack using short sequence numbers thus equals P = WN/2^24. For W = 100, then, the attacker must send more than 83,000 packets to achieve a 50% chance of success. For reference, the easiest TCP attack -- sending a SYN with a random sequence number, which will cause a connection reset if it falls within the window -- with W = 8760 (a common default) and L = 32 requires more than 245,000 packets to achieve a 50% chance of success. A fast connection's W will generally be high, increasing the attack success probability for fixed N. If this probability gets uncomfortably high with L = 24, the endpoint SHOULD prevent the use of short sequence numbers by manipulating the Allow Short Sequence Numbers feature (see Section 7.6.1). The probability limit depends on the application, however. Some applications, such as those already designed to handle corruption, are quite resilient to data injection attacks. The DCCP-Sync attack has L = 48, since DCCP-Sync packets use long sequence numbers exclusively; in addition, the success probability is halved, since only half the Sequence Number space is valid. Attacks have a correspondingly smaller probability of success. For a large W of 2000 packets, then, the attacker must send more than 10^11 packets to achieve a 50% chance of success. Attacks involving DCCP-Ack, DCCP-DataAck, DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets are more difficult, since Sequence and Acknowledgement Numbers must both be guessed. The probability of attack success for these packet types equals P = WXN/2^(2L), where W is the Sequence Number window, X is the Acknowledgement Number window, and N and L are as before. Since DCCP-Data attacks with short sequence numbers are relatively easy for attackers to execute, DCCP has been engineered to prevent these attacks from escalating to connection resets or other serious
consequences. In particular, any options whose processing might cause the connection to be reset are ignored when they appear on DCCP-Data packets.7.5.6. Sequence Number Handling Examples
In the following example, DCCP A and DCCP B recover from a large burst of loss that runs DCCP A's sequence numbers out of DCCP B's appropriate sequence number window. DCCP A DCCP B (GSS=1,GSR=10) (GSS=10,GSR=1) --> DCCP-Data(seq 2) XXX ... --> DCCP-Data(seq 100) XXX --> DCCP-Data(seq 101) --> ??? seqno out of range; send Sync OK <-- DCCP-Sync(seq 11, ack 101) <-- (GSS=11,GSR=1) --> DCCP-SyncAck(seq 102, ack 11) --> OK (GSS=102,GSR=11) (GSS=11,GSR=102) In the next example, a DCCP connection recovers from a simple blind attack. DCCP A DCCP B (GSS=1,GSR=10) (GSS=10,GSR=1) *ATTACKER* --> DCCP-Data(seq 10^6) --> ??? seqno out of range; send Sync ??? <-- DCCP-Sync(seq 11, ack 10^6) <-- ackno out of range; ignore (GSS=1,GSR=10) (GSS=11,GSR=1) The final example demonstrates recovery from a half-open connection. DCCP A DCCP B (GSS=1,GSR=10) (GSS=10,GSR=1) (Crash) CLOSED OPEN REQUEST --> DCCP-Request(seq 400) --> ??? !! <-- DCCP-Sync(seq 11, ack 400) <-- OPEN REQUEST --> DCCP-Reset(seq 401, ack 11) --> (Abort) REQUEST CLOSED REQUEST --> DCCP-Request(seq 402) --> ...
7.6. Short Sequence Numbers
DCCP sequence numbers are 48 bits long. This large sequence space protects DCCP connections against some blind attacks, such as the injection of DCCP-Resets into the connection. However, DCCP-Data, DCCP-Ack, and DCCP-DataAck packets, which make up the body of any DCCP connection, may reduce header space by transmitting only the lower 24 bits of the relevant Sequence and Acknowledgement Numbers. The receiving endpoint will extend these numbers to 48 bits using the following pseudocode: procedure Extend_Sequence_Number(S, REF) /* S is a 24-bit sequence number from the packet header. REF is the relevant 48-bit reference sequence number: GSS if S is an Acknowledgement Number, and GSR if S is a Sequence Number. */ Set REF_low := low 24 bits of REF Set REF_hi := high 24 bits of REF If REF_low (<) S /* circular comparison mod 2^24 */ and S |<| REF_low, /* conventional, non-circular comparison */ Return (((REF_hi + 1) mod 2^24) << 24) | S Otherwise, if S (<) REF_low and REF_low |<| S, Return (((REF_hi - 1) mod 2^24) << 24) | S Otherwise, Return (REF_hi << 24) | S The two different kinds of comparison in the if statements detect when the low-order bits of the sequence space have wrapped. (The circular comparison "REF_low (<) S" returns true if and only if (S - REF_low), calculated using two's-complement arithmetic and then represented as an unsigned number, is less than or equal to 2^23 (mod 2^24).) When this happens, the high-order bits are incremented or decremented, as appropriate.7.6.1. Allow Short Sequence Numbers Feature
Endpoints can require that all packets use long sequence numbers by leaving the Allow Short Sequence Numbers feature value at its default of zero. This can reduce the risk that data will be inappropriately injected into the connection. DCCP A sends a "Change L(Allow Short Seqnos, 1)" option to indicate its desire to send packets with short sequence numbers. Allow Short Sequence Numbers has feature number 2 and is server- priority. It takes one-byte Boolean values. When Allow Short Seqnos/B is zero, DCCP B MUST NOT send packets with short sequence numbers and DCCP A MUST ignore any packets with short sequence
numbers that are received. Values of two or more are reserved. New connections start with Allow Short Sequence Numbers 0 for both endpoints.7.6.2. When to Avoid Short Sequence Numbers
Short sequence numbers reduce the rate DCCP connections can safely achieve and increase the risks of certain kinds of attacks, including blind data injection. Very-high-rate DCCP connections, and connections with large sequence windows (Section 7.5.2), SHOULD NOT use short sequence numbers on their data packets. The attack risk issues have been discussed in Section 7.5.5; we discuss the rate limitation issue here. The sequence-validity mechanism assumes that the network does not deliver extremely old data. In particular, it assumes that the network must have dropped any packet by the time the connection wraps around and uses its sequence number again. This constraint limits the maximum connection rate that can be safely achieved. Let MSL equal the maximum segment lifetime, P equal the average DCCP packet size in bits, and L equal the length of sequence numbers (24 or 48 bits). Then the maximum safe rate, in bits per second, is R = P*(2^L)/2MSL. For the default MSL of 2 minutes, 1500-byte DCCP packets, and short sequence numbers, the safe rate is therefore approximately 800 Mb/s. Although 2 minutes is a very large MSL for any networks that could sustain that rate with such small packets, long sequence numbers allow much higher rates under the same constraints: up to 14 petabits a second for 1500-byte packets and the default MSL.7.7. NDP Count and Detecting Application Loss
DCCP's sequence numbers increment by one on every packet, including non-data packets (packets that don't carry application data). This makes DCCP sequence numbers suitable for detecting any network loss, but not for detecting the loss of application data. The NDP Count option reports the length of each burst of non-data packets. This lets the receiving DCCP reliably determine when a burst of loss included application data. +--------+--------+-------- ... --------+ |00100101| Length | NDP Count | +--------+--------+-------- ... --------+ Type=37 Len=3-8 (1-6 bytes) If a DCCP endpoint's Send NDP Count feature is one (see below), then that endpoint MUST send an NDP Count option on every packet whose
immediate predecessor was a non-data packet. Non-data packets consist of DCCP packet types DCCP-Ack, DCCP-Close, DCCP-CloseReq, DCCP-Reset, DCCP-Sync, and DCCP-SyncAck. The other packet types, namely DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, are considered data packets, although not all DCCP-Request and DCCP- Response packets will actually carry application data. The value stored in NDP Count equals the number of consecutive non- data packets in the run immediately previous to the current packet. Packets with no NDP Count option are considered to have NDP Count zero. The NDP Count option can carry one to six bytes of data. The smallest option format that can hold the NDP Count SHOULD be used. With NDP Count, the receiver can reliably tell only whether a burst of loss contained at least one data packet. For example, the receiver cannot always tell whether a burst of loss contained a non- data packet.7.7.1. NDP Count Usage Notes
Say that K consecutive sequence numbers are missing in some burst of loss, and that the Send NDP Count feature is on. Then some application data was lost within those sequence numbers unless the packet following the hole contains an NDP Count option whose value is greater than or equal to K. For example, say that an endpoint sent the following sequence of non-data packets (Nx) and data packets (Dx). N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 Those packets would have NDP Counts as follows. N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 - 1 2 - 1 - - 1 - - - - 1 2 NDP Count is not useful for applications that include their own sequence numbers with their packet headers.7.7.2. Send NDP Count Feature
The Send NDP Count feature lets DCCP endpoints negotiate whether they should send NDP Count options on their packets. DCCP A sends a "Change R(Send NDP Count, 1)" option to ask DCCP B to send NDP Count options.
Send NDP Count has feature number 7 and is server-priority. It takes one-byte Boolean values. DCCP B MUST send NDP Count options as described above when Send NDP Count/B is one, although it MAY send NDP Count options even when Send NDP Count/B is zero. Values of two or more are reserved. New connections start with Send NDP Count 0 for both endpoints.8. Event Processing
This section describes how DCCP connections move between states and which packets are sent when. Note that feature negotiation takes place in parallel with the connection-wide state transitions described here.8.1. Connection Establishment
DCCP connections' initiation phase consists of a three-way handshake: an initial DCCP-Request packet sent by the client, a DCCP-Response sent by the server in reply, and finally an acknowledgement from the client, usually via a DCCP-Ack or DCCP-DataAck packet. The client moves from the REQUEST state to PARTOPEN, and finally to OPEN; the server moves from LISTEN to RESPOND, and finally to OPEN. Client State Server State CLOSED LISTEN 1. REQUEST --> Request --> 2. <-- Response <-- RESPOND 3. PARTOPEN --> Ack, DataAck --> 4. <-- Data, Ack, DataAck <-- OPEN 5. OPEN <-> Data, Ack, DataAck <-> OPEN8.1.1. Client Request
When a client decides to initiate a connection, it enters the REQUEST state, chooses an initial sequence number (Section 7.2), and sends a DCCP-Request packet using that sequence number to the intended server. DCCP-Request packets will commonly carry feature negotiation options that open negotiations for various connection parameters, such as preferred congestion control IDs for each half-connection. They may also carry application data, but the client should be aware that the server may not accept such data. A client in the REQUEST state SHOULD use an exponential-backoff timer to send new DCCP-Request packets if no response is received. The first retransmission should occur after approximately one second, backing off to not less than one packet every 64 seconds; or the
endpoint can use whatever retransmission strategy is followed for retransmitting TCP SYNs. Each new DCCP-Request MUST increment the Sequence Number by one and MUST contain the same Service Code and application data as the original DCCP-Request. A client MAY give up on its DCCP-Requests after some time (3 minutes, for example). When it does, it SHOULD send a DCCP-Reset packet to the server with Reset Code 2, "Aborted", to clean up state in case one or more of the Requests actually arrived. A client in REQUEST state has never received an initial sequence number from its peer, so the DCCP-Reset's Acknowledgement Number MUST be set to zero. The client leaves the REQUEST state for PARTOPEN when it receives a DCCP-Response from the server.8.1.2. Service Codes
Each DCCP-Request contains a 32-bit Service Code, which identifies the application-level service to which the client application is trying to connect. Service Codes should correspond to application services and protocols. For example, there might be a Service Code for SIP control connections and one for RTP audio connections. Middleboxes, such as firewalls, can use the Service Code to identify the application running on a nonstandard port (assuming the DCCP header has not been encrypted). Endpoints MUST associate a Service Code with every DCCP socket, both actively and passively opened. The application will generally supply this Service Code. Each active socket MUST have exactly one Service Code. Passive sockets MAY, at the implementation's discretion, be associated with more than one Service Code; this might let multiple applications, or multiple versions of the same application, listen on the same port, differentiated by Service Code. If the DCCP-Request's Service Code doesn't equal any of the server's Service Codes for the given port, the server MUST reject the request by sending a DCCP- Reset packet with Reset Code 8, "Bad Service Code". A middlebox MAY also send such a DCCP-Reset in response to packets whose Service Code is considered unsuitable. Service Codes are not intended to be DCCP-specific and are allocated by IANA. Following the policies outlined in [RFC2434], most Service Codes are allocated First Come First Served, subject to the following guidelines. o Service Codes are allocated one at a time, or in small blocks. A short English description of the intended service is REQUIRED to obtain a Service Code assignment, but no specification, standards
track or otherwise, is necessary. IANA maintains an association of Service Codes to the corresponding phrases. o Users request specific Service Code values. We suggest that users request Service Codes that can be represented using the "SC:" formatting convention described below. Thus, the "Frobodyne Plotz Protocol" might correspond to Service Code 17178548426 or, equivalently, "SC:fdpz". The canonical interpretation of a Service Code field is numeric. o Service Codes whose bytes each have values in the set {32, 45-57, 65-90} use a Specification Required allocation policy. That is, these Service Codes are used for international standard or standards-track specifications, IETF or otherwise. (This set consists of the ASCII digits, uppercase letters, and characters space, '-', '.', and '/'.) o Service Codes whose high-order byte equals 63 (ASCII '?') are reserved for Private Use. o Service Code 0 represents the absence of a meaningful Service Code and MUST NOT be allocated. o The value 4294967295 is an invalid Service Code. Servers MUST reject any DCCP-Request with this Service Code value by sending a DCCP-Reset packet with Reset Code 8, "Bad Service Code". This design for Service Code allocation is based on the allocation of 4-byte identifiers for Macintosh resources, PNG chunks, and TrueType and OpenType tables. In text settings, we recommend that Service Codes be written in one of three forms, prefixed by the ASCII letters SC and either a colon ":" or equals sign "=". These forms are interpreted as follows. SC: Indicates a Service Code representable using a subset of the ASCII characters. The colon is followed by one to four characters taken from the following set: letters, digits, and the characters in "-_+.*/?@" (not including quotes). Numerically, these characters have values in {42-43, 45-57, 63-90, 95, 97-122}. The Service Code is calculated by padding the string on the right with spaces (value 32) and intepreting the four-character result as a 32-bit big-endian number. SC= Indicates a decimal Service Code. The equals sign is followed by any number of decimal digits, which specify the Service Code. Values above 4294967294 are illegal.
SC=x or SC=X Indicates a hexadecimal Service Code. The "x" or "X" is followed by any number of hexadecimal digits (upper or lower case), which specify the Service Code. Values above 4294967294 are illegal. Thus, the Service Code 1717858426 might be represented in text as either SC:fdpz, SC=1717858426, or SC=x6664707A.8.1.3. Server Response
In the second phase of the three-way handshake, the server moves from the LISTEN state to RESPOND and sends a DCCP-Response message to the client. In this phase, a server will often specify the features it would like to use, either from among those the client requested or in addition to those. Among these options is the congestion control mechanism the server expects to use. The server MAY respond to a DCCP-Request packet with a DCCP-Reset packet to refuse the connection. Relevant Reset Codes for refusing a connection include 7, "Connection Refused", when the DCCP-Request's Destination Port did not correspond to a DCCP port open for listening; 8, "Bad Service Code", when the DCCP-Request's Service Code did not correspond to the service code registered with the Destination Port; and 9, "Too Busy", when the server is currently too busy to respond to requests. The server SHOULD limit the rate at which it generates these resets; for example, to not more than 1024 per second. The server SHOULD NOT retransmit DCCP-Response packets; the client will retransmit the DCCP-Request if necessary. (Note that the "retransmitted" DCCP-Request will have, at least, a different sequence number from the "original" DCCP-Request. The server can thus distinguish true retransmissions from network duplicates.) The server will detect that the retransmitted DCCP-Request applies to an existing connection because of its Source and Destination Ports. Every valid DCCP-Request received while the server is in the RESPOND state MUST elicit a new DCCP-Response. Each new DCCP-Response MUST increment the server's Sequence Number by one and MUST include the same application data, if any, as the original DCCP-Response. The server MUST NOT accept more than one piece of DCCP-Request application data per connection. In particular, the DCCP-Response sent in reply to a retransmitted DCCP-Request with application data SHOULD contain a Data Dropped option, in which the retransmitted DCCP-Request data is reported with Drop Code 0, Protocol Constraints. The original DCCP-Request SHOULD also be reported in the Data Dropped option, either in a Normal Block (if the server accepted the data or
there was no data) or in a Drop Code 0 Drop Block (if the server refused the data the first time as well). The Data Dropped and Init Cookie options are particularly useful for DCCP-Response packets (Sections 11.7 and 8.1.4). The server leaves the RESPOND state for OPEN when it receives a valid DCCP-Ack from the client, completing the three-way handshake. It MAY also leave the RESPOND state for CLOSED after a timeout of not less than 4MSL (8 minutes); when doing so, it SHOULD send a DCCP-Reset with Reset Code 2, "Aborted", to clean up state at the client.8.1.4. Init Cookie Option
+--------+--------+--------+--------+--------+-------- |00100100| Length | Init Cookie Value ... +--------+--------+--------+--------+--------+-------- Type=36 The Init Cookie option lets a DCCP server avoid having to hold any state until the three-way connection setup handshake has completed, in a similar fashion as for TCP SYN cookies [SYNCOOKIES]. The server wraps up the Service Code, server port, and any options it cares about from both the DCCP-Request and DCCP-Response in an opaque cookie. Typically the cookie will be encrypted using a secret known only to the server and will include a cryptographic checksum or magic value so that correct decryption can be verified. When the server receives the cookie back in the response, it can decrypt the cookie and instantiate all the state it avoided keeping. In the meantime, it need not move from the LISTEN state. The Init Cookie option MUST NOT be sent on DCCP-Request or DCCP-Data packets. Any Init Cookie options received on DCCP-Request or DCCP- Data packets, or after the connection has been established (when the connection's state is >= OPEN), MUST be ignored. The server MAY include Init Cookie options in its DCCP-Response. If so, then the client MUST echo the same Init Cookie options, in the same order, in each succeeding DCCP packet until one of those packets is acknowledged (showing that the three-way handshake has completed) or the connection is reset. As a result, the client MUST NOT use DCCP- Data packets until the three-way handshake completes or the connection is reset. The Init Cookie options on a client packet MUST equal those received on the DCCP-Request indicated by the client packet's Acknowledgement Number. The server SHOULD design its Init Cookie format so that Init Cookies can be checked for tampering; it SHOULD respond to a tampered Init Cookie option by resetting the connection with Reset Code 10, "Bad Init Cookie".
Init Cookie's precise implementation need not be specified here; since Init Cookies are opaque to the client, there are no interoperability concerns. An example cookie format might encrypt (using a secret key) the connection's initial sequence and acknowledgement numbers, ports, Service Code, any options included on the DCCP-Request packet and the corresponding DCCP-Response, a random salt, and a magic number. On receiving a reflected Init Cookie, the server would decrypt the cookie, validate it by checking its magic number, sequence numbers, and ports, and, if valid, create a corresponding socket using the options. Each individual Init Cookie option can hold at most 253 bytes of data, but a server can send multiple Init Cookie options to gain more space.8.1.5. Handshake Completion
When the client receives a DCCP-Response from the server, it moves from the REQUEST state to PARTOPEN and completes the three-way handshake by sending a DCCP-Ack packet to the server. The client remains in PARTOPEN until it can be sure that the server has received some packet the client sent from PARTOPEN (either the initial DCCP- Ack or a later packet). Clients in the PARTOPEN state that want to send data MUST do so using DCCP-DataAck packets, not DCCP-Data packets. This is because DCCP-Data packets lack Acknowledgement Numbers, so the server can't tell from a DCCP-Data packet whether the client saw its DCCP-Response. Furthermore, if the DCCP-Response included an Init Cookie, that Init Cookie MUST be included on every packet sent in PARTOPEN. The single DCCP-Ack sent when entering the PARTOPEN state might, of course, be dropped by the network. The client SHOULD ensure that some packet gets through eventually. The preferred mechanism would be a roughly 200-millisecond timer, set every time a packet is transmitted in PARTOPEN. If this timer goes off and the client is still in PARTOPEN, the client generates another DCCP-Ack and backs off the timer. If the client remains in PARTOPEN for more than 4MSL (8 minutes), it SHOULD reset the connection with Reset Code 2, "Aborted". The client leaves the PARTOPEN state for OPEN when it receives a valid packet other than DCCP-Response, DCCP-Reset, or DCCP-Sync from the server.8.2. Data Transfer
In the central data transfer phase of the connection, both server and client are in the OPEN state.
DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to application events on host A. These packets are congestion- controlled by the CCID for the A-to-B half-connection. In contrast, DCCP-Ack packets sent by DCCP A are controlled by the CCID for the B-to-A half-connection. Generally, DCCP A will piggyback acknowledgement information on DCCP-Data packets when acceptable, creating DCCP-DataAck packets. DCCP-Ack packets are used when there is no data to send from DCCP A to DCCP B, or when the congestion state of the A-to-B CCID will not allow data to be sent. DCCP-Sync and DCCP-SyncAck packets may also occur in the data transfer phase. Some cases causing DCCP-Sync generation are discussed in Section 7.5. One important distinction between DCCP- Sync packets and other packet types is that DCCP-Sync elicits an immediate acknowledgement. On receiving a valid DCCP-Sync packet, a DCCP endpoint MUST immediately generate and send a DCCP-SyncAck response (subject to any implementation rate limits); the Acknowledgement Number on that DCCP-SyncAck MUST equal the Sequence Number of the DCCP-Sync. A particular DCCP implementation might decide to initiate feature negotiation only once the OPEN state was reached, in which case it might not allow data transfer until some time later. Data received during that time SHOULD be rejected and reported using a Data Dropped Drop Block with Drop Code 0, Protocol Constraints (see Section 11.7).8.3. Termination
DCCP connection termination uses a handshake consisting of an optional DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset packet. The server moves from the OPEN state, possibly through the CLOSEREQ state, to CLOSED; the client moves from OPEN through CLOSING to TIMEWAIT, and after 2MSL wait time (4 minutes) to CLOSED. The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the server decides to close the connection but doesn't want to hold TIMEWAIT state: Client State Server State OPEN OPEN 1. <-- CloseReq <-- CLOSEREQ 2. CLOSING --> Close --> 3. <-- Reset <-- CLOSED (LISTEN) 4. TIMEWAIT 5. CLOSED
A shorter sequence occurs when the client decides to close the connection. Client State Server State OPEN OPEN 1. CLOSING --> Close --> 2. <-- Reset <-- CLOSED (LISTEN) 3. TIMEWAIT 4. CLOSED Finally, the server can decide to hold TIMEWAIT state: Client State Server State OPEN OPEN 1. <-- Close <-- CLOSING 2. CLOSED --> Reset --> 3. TIMEWAIT 4. CLOSED (LISTEN) In all cases, the receiver of the DCCP-Reset packet holds TIMEWAIT state for the connection. As in TCP, TIMEWAIT state, where an endpoint quietly preserves a socket for 2MSL (4 minutes) after its connection has closed, ensures that no connection duplicating the current connection's source and destination addresses and ports can start up while old packets might remain in the network. The termination handshake proceeds as follows. The receiver of a valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet. The receiver of a valid DCCP-Close packet MUST respond with a DCCP- Reset packet with Reset Code 1, "Closed". The receiver of a valid DCCP-Reset packet -- which is also the sender of the DCCP-Close packet (and possibly the receiver of the DCCP-CloseReq packet) -- will hold TIMEWAIT state for the connection. A DCCP-Reset packet completes every DCCP connection, whether the termination is clean (due to application close; Reset Code 1, "Closed") or unclean. Unlike TCP, which has two distinct termination mechanisms (FIN and RST), DCCP ends all connections in a uniform manner. This is justified because some aspects of connection termination are the same independent of whether termination was clean. For instance, the endpoint that receives a valid DCCP-Reset SHOULD hold TIMEWAIT state for the connection. Processors that must distinguish between clean and unclean termination can examine the Reset Code. DCCP implementations generally transition to the CLOSED state after sending a DCCP-Reset packet. Endpoints in the CLOSEREQ and CLOSING states MUST retransmit DCCP- CloseReq and DCCP-Close packets, respectively, until leaving those
states. The retransmission timer should initially be set to go off in two round-trip times and should back off to not less than once every 64 seconds if no relevant response is received. Only the server can send a DCCP-CloseReq packet or enter the CLOSEREQ state. A server receiving a sequence-valid DCCP-CloseReq packet MUST respond with a DCCP-Sync packet and otherwise ignore the DCCP- CloseReq. DCCP-Data, DCCP-DataAck, and DCCP-Ack packets received in CLOSEREQ or CLOSING states MAY be either processed or ignored.8.3.1. Abnormal Termination
DCCP endpoints generate DCCP-Reset packets to terminate connections abnormally; a DCCP-Reset packet may be generated from any state. Resets sent in the CLOSED, LISTEN, and TIMEWAIT states use Reset Code 3, "No Connection", unless otherwise specified. Resets sent in the REQUEST or RESPOND states use Reset Code 4, "Packet Error", unless otherwise specified. DCCP endpoints in CLOSED, LISTEN, or TIMEWAIT state may need to generate a DCCP-Reset packet in response to a packet received from a peer. Since these states have no associated sequence number variables, the Sequence and Acknowledgement Numbers on the DCCP-Reset packet R are taken from the received packet P, as follows. 1. If P.ackno exists, then set R.seqno := P.ackno + 1. Otherwise, set R.seqno := 0. 2. Set R.ackno := P.seqno. 3. If the packet used short sequence numbers (P.X == 0), then set the upper 24 bits of R.seqno and R.ackno to 0.8.4. DCCP State Diagram
The most common state transitions discussed above can be summarized in the following state diagram. The diagram is illustrative; the text in Section 8.5 and elsewhere should be considered definitive. For example, there are arcs (not shown) from every state except CLOSED to TIMEWAIT, contingent on the receipt of a valid DCCP-Reset.
+---------------------------+ +---------------------------+ | v v | | +----------+ | | +-------------+ CLOSED +------------+ | | | passive +----------+ active | | | | open open | | | | snd Request | | | v v | | +----------+ +----------+ | | | LISTEN | | REQUEST | | | +----+-----+ +----+-----+ | | | rcv Request rcv Response | | | | snd Response snd Ack | | | v v | | +----------+ +----------+ | | | RESPOND | | PARTOPEN | | | +----+-----+ +----+-----+ | | | rcv Ack/DataAck rcv packet | | | | | | | | +----------+ | | | +------------>| OPEN |<-----------+ | | +--+-+--+--+ | | server active close | | | active close | | snd CloseReq | | | or rcv CloseReq | | | | | snd Close | | | | | | | +----------+ | | | +----------+ | | | CLOSEREQ |<---------+ | +--------->| CLOSING | | | +----+-----+ | +----+-----+ | | | rcv Close | rcv Reset | | | | snd Reset | | | |<---------+ | v | | | +----+-----+ | | rcv Close | | TIMEWAIT | | | snd Reset | +----+-----+ | +-----------------------------+ | | +-----------+ 2MSL timer expires8.5. Pseudocode
This section presents an algorithm describing the processing steps a DCCP endpoint must go through when it receives a packet. A DCCP implementation need not implement the algorithm as it is described here, but any implementation MUST generate observable effects exactly as indicated by this pseudocode, except where allowed otherwise by another part of this document.
The received packet is written as P, the socket as S. Socket variables are: S.SWL - sequence number window low S.SWH - sequence number window high S.AWL - acknowledgement number window low S.AWH - acknowledgement number window high S.ISS - initial sequence number sent S.ISR - initial sequence number received S.OSR - first OPEN sequence number received S.GSS - greatest sequence number sent S.GSR - greatest valid sequence number received S.GAR - greatest valid acknowledgement number received on a non-Sync; initialized to S.ISS "Send packet" actions always use, and increment, S.GSS. Step 1: Check header basics /* This step checks for malformed packets. Packets that fail these checks are ignored -- they do not receive Resets in response */ If the packet is shorter than 12 bytes, drop packet and return If P.type is not understood, drop packet and return If P.Data Offset is smaller than the given packet type's fixed header length or larger than the packet's length, drop packet and return If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet has short sequence numbers), drop packet and return If the header checksum is incorrect, drop packet and return If P.CsCov is too large for the packet size, drop packet and return Step 2: Check ports and process TIMEWAIT state /* Flow ID is <src addr, src port, dst addr, dst port> 4-tuple */ Look up flow ID in table and get corresponding socket If no socket, or S.state == TIMEWAIT, /* The following Reset's Sequence and Acknowledgement Numbers are taken from the input packet; see Section 8.3.1. */ Generate Reset(No Connection) unless P.type == Reset Drop packet and return
Step 3: Process LISTEN state If S.state == LISTEN, If P.type == Request or P contains a valid Init Cookie option, /* Must scan the packet's options to check for Init Cookies. Only Init Cookies are processed here, however; other options are processed in Step 8. This scan need only be performed if the endpoint uses Init Cookies */ /* Generate a new socket and switch to that socket */ Set S := new socket for this port pair S.state = RESPOND Choose S.ISS (initial seqno) or set from Init Cookies Initialize S.GAR := S.ISS Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookies Continue with S.state == RESPOND /* A Response packet will be generated in Step 11 */ Otherwise, Generate Reset(No Connection) unless P.type == Reset Drop packet and return Step 4: Prepare sequence numbers in REQUEST If S.state == REQUEST, If (P.type == Response or P.type == Reset) and S.AWL <= P.ackno <= S.AWH, /* Set sequence number variables corresponding to the other endpoint, so P will pass the tests in Step 6 */ Set S.GSR, S.ISR, S.SWL, S.SWH /* Response processing continues in Step 10; Reset processing continues in Step 9 */ Otherwise, /* Only Response and Reset are valid in REQUEST state */ Generate Reset(Packet Error) Drop packet and return Step 5: Prepare sequence numbers for Sync If P.type == Sync or P.type == SyncAck, If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL, /* P is valid, so update sequence number variables accordingly. After this update, P will pass the tests in Step 6. A SyncAck is generated if necessary in Step 15 */ Update S.GSR, S.SWL, S.SWH Otherwise, Drop packet and return
Step 6: Check sequence numbers If P.X == 0 and the relevant Allow Short Seqnos feature is 0, /* Packet has short seqnos, but short seqnos not allowed */ Drop packet and return Otherwise, if P.X == 0, Extend P.seqno and P.ackno to 48 bits using the procedure in Section 7.6 Let LSWL = S.SWL and LAWL = S.AWL If P.type == CloseReq or P.type == Close or P.type == Reset, LSWL := S.GSR + 1, LAWL := S.GAR If LSWL <= P.seqno <= S.SWH and (P.ackno does not exist or LAWL <= P.ackno <= S.AWH), Update S.GSR, S.SWL, S.SWH If P.type != Sync, Update S.GAR Otherwise, If P.type == Reset, Send Sync packet acknowledging S.GSR Otherwise, Send Sync packet acknowledging P.seqno Drop packet and return Step 7: Check for unexpected packet types If (S.is_server and P.type == CloseReq) or (S.is_server and P.type == Response) or (S.is_client and P.type == Request) or (S.state >= OPEN and P.type == Request and P.seqno >= S.OSR) or (S.state >= OPEN and P.type == Response and P.seqno >= S.OSR) or (S.state == RESPOND and P.type == Data), Send Sync packet acknowledging P.seqno Drop packet and return Step 8: Process options and mark acknowledgeable /* Option processing is not specifically described here. Certain options, such as Mandatory, may cause the connection to be reset, in which case Steps 9 and on are not executed */ Mark packet as acknowledgeable (in Ack Vector terms, Received or Received ECN Marked) Step 9: Process Reset If P.type == Reset, Tear down connection S.state := TIMEWAIT Set TIMEWAIT timer Drop packet and return
Step 10: Process REQUEST state (second part) If S.state == REQUEST, /* If we get here, P is a valid Response from the server (see Step 4), and we should move to PARTOPEN state. PARTOPEN means send an Ack, don't send Data packets, retransmit Acks periodically, and always include any Init Cookie from the Response */ S.state := PARTOPEN Set PARTOPEN timer Continue with S.state == PARTOPEN /* Step 12 will send the Ack completing the three-way handshake */ Step 11: Process RESPOND state If S.state == RESPOND, If P.type == Request, Send Response, possibly containing Init Cookie If Init Cookie was sent, Destroy S and return /* Step 3 will create another socket when the client completes the three-way handshake */ Otherwise, S.OSR := P.seqno S.state := OPEN Step 12: Process PARTOPEN state If S.state == PARTOPEN, If P.type == Response, Send Ack Otherwise, if P.type != Sync, S.OSR := P.seqno S.state := OPEN Step 13: Process CloseReq If P.type == CloseReq and S.state < CLOSEREQ, Generate Close S.state := CLOSING Set CLOSING timer Step 14: Process Close If P.type == Close, Generate Reset(Closed) Tear down connection Drop packet and return
Step 15: Process Sync If P.type == Sync, Generate SyncAck Step 16: Process data /* At this point any application data on P can be passed to the application, except that the application MUST NOT receive data from more than one Request or Response */9. Checksums
DCCP uses a header checksum to protect its header against corruption. Generally, this checksum also covers any application data. DCCP applications can, however, request that the header checksum cover only part of the application data, or perhaps no application data at all. Link layers may then reduce their protection on unprotected parts of DCCP packets. For some noisy links, and for applications that can tolerate corruption, this can greatly improve delivery rates and perceived performance. Checksum coverage may eventually impact congestion control mechanisms as well. A packet with corrupt application data and complete checksum coverage is treated as lost. This incurs a heavy-duty loss response from the sender's congestion control mechanism, which can unfairly penalize connections on links with high background corruption. The combination of reduced checksum coverage and Data Checksum options may let endpoints report packets as corrupt rather than dropped, using Data Dropped options and Drop Code 3 (see Section 11.7). This may eventually benefit applications. However, further research is required to determine an appropriate response to corruption, which can sometimes correlate with congestion. Corrupt packets currently incur a loss response. The Data Checksum option, which contains a strong CRC, lets endpoints detect application data corruption. An API can then be used to avoid delivering corrupt data to the application, even if links deliver corrupt data to the endpoint due to reduced checksum coverage. However, the use of reduced checksum coverage for applications that demand correct data is currently considered experimental. This is because the combined loss-plus-corruption rate for packets with reduced checksum coverage may be significantly higher than that for packets with full checksum coverage, although the loss rate will generally be lower. Actual behavior will depend on link design; further research and experience is required. Reduced checksum coverage introduces some security considerations; see Section 18.1. See Appendix B for further motivation and
discussion. DCCP's implementation of reduced checksum coverage was inspired by UDP-Lite [RFC3828].9.1. Header Checksum Field
DCCP uses the TCP/IP checksum algorithm. The Checksum field in the DCCP generic header (see Section 5.1) equals the 16-bit one's complement of the one's complement sum of all 16-bit words in the DCCP header, DCCP options, a pseudoheader taken from the network- layer header, and, depending on the value of the Checksum Coverage field, some or all of the application data. When calculating the checksum, the Checksum field itself is treated as 0. If a packet contains an odd number of header and payload bytes to be checksummed, 8 zero bits are added on the right to form a 16-bit word for checksum purposes. The pad byte is not transmitted as part of the packet. The pseudoheader is calculated as for TCP. For IPv4, it is 96 bits long and consists of the IPv4 source and destination addresses, the IP protocol number for DCCP (padded on the left with 8 zero bits), and the DCCP length as a 16-bit quantity (the length of the DCCP header with options, plus the length of any data); see [RFC793], Section 3.1. For IPv6, it is 320 bits long, and consists of the IPv6 source and destination addresses, the DCCP length as a 32-bit quantity, and the IP protocol number for DCCP (padded on the left with 24 zero bits); see [RFC2460], Section 8.1. Packets with invalid header checksums MUST be ignored. In particular, their options MUST NOT be processed.9.2. Header Checksum Coverage Field
The Checksum Coverage field in the DCCP generic header (see Section 5.1) specifies what parts of the packet are covered by the Checksum field, as follows: CsCov = 0 The Checksum field covers the DCCP header, DCCP options, network-layer pseudoheader, and all application data in the packet, possibly padded on the right with zeros to an even number of bytes. CsCov = 1-15 The Checksum field covers the DCCP header, DCCP options, network-layer pseudoheader, and the initial (CsCov-1)*4 bytes of the packet's application data. Thus, if CsCov is 1, none of the application data is protected by the header checksum. The value (CsCov-1)*4 MUST be less than or equal to the length of the application data. Packets with invalid CsCov values MUST be ignored; in particular, their options MUST NOT be
processed. The meanings of values other than 0 and 1 should be considered experimental. Values other than 0 specify that corruption is acceptable in some or all of the DCCP packet's application data. In fact, DCCP cannot even detect corruption in areas not covered by the header checksum, unless the Data Checksum option is used. Applications should not make any assumptions about the correctness of received data not covered by the checksum and should, if necessary, introduce their own validity checks. A DCCP application interface should let sending applications suggest a value for CsCov for sent packets, defaulting to 0 (full coverage). The Minimum Checksum Coverage feature, described below, lets an endpoint refuse delivery of application data on packets with partial checksum coverage; by default, only fully covered application data is accepted. Lower layers that support partial error detection MAY use the Checksum Coverage field as a hint of where errors do not need to be detected. Lower layers MUST use a strong error detection mechanism to detect at least errors that occur in the sensitive part of the packet, and to discard damaged packets. The sensitive part consists of the bytes between the first byte of the IP header and the last byte identified by Checksum Coverage. For more details on application and lower-layer interface issues relating to partial checksumming, see [RFC3828].9.2.1. Minimum Checksum Coverage Feature
The Minimum Checksum Coverage feature lets a DCCP endpoint determine whether its peer is willing to accept packets with reduced Checksum Coverage. For example, DCCP A sends a "Change R(Minimum Checksum Coverage, 1)" option to DCCP B to check whether B is willing to accept packets with Checksum Coverage set to 1. Minimum Checksum Coverage has feature number 8 and is server- priority. It takes one-byte integer values between 0 and 15; values of 16 or more are reserved. Minimum Checksum Coverage/B reflects values of Checksum Coverage that DCCP B finds unacceptable. Say that the value of Minimum Checksum Coverage/B is MinCsCov. Then: o If MinCsCov = 0, then DCCP B only finds packets with CsCov = 0 acceptable. o If MinCsCov > 0, then DCCP B additionally finds packets with CsCov >= MinCsCov acceptable.
DCCP B MAY refuse to process application data from packets with unacceptable Checksum Coverage. Such packets SHOULD be reported using Data Dropped options (Section 11.7) with Drop Code 0, Protocol Constraints. New connections start with Minimum Checksum Coverage 0 for both endpoints.9.3. Data Checksum Option
The Data Checksum option holds a 32-bit CRC-32c cyclic redundancy- check code of a DCCP packet's application data. +--------+--------+--------+--------+--------+--------+ |00101100|00000110| CRC-32c | +--------+--------+--------+--------+--------+--------+ Type=44 Length=6 The sending DCCP computes the CRC of the bytes comprising the application data area and stores it in the option data. The CRC-32c algorithm used for Data Checksum is the same as that used for SCTP [RFC3309]; note that the CRC-32c of zero bytes of data equals zero. The DCCP header checksum will cover the Data Checksum option, so the data checksum must be computed before the header checksum. A DCCP endpoint receiving a packet with a Data Checksum option either MUST or MAY check the Data Checksum; the choice depends on the value of the Check Data Checksum feature described below. If it checks the checksum, it computes the received application data's CRC-32c using the same algorithm as the sender and compares the result with the Data Checksum value. If the CRCs differ, the endpoint reacts in one of two ways: o The receiving application may have requested delivery of known- corrupt data via some optional API. In this case, the packet's data MUST be delivered to the application, with a note that it is known to be corrupt. Furthermore, the receiving endpoint MUST report the packet as delivered corrupt using a Data Dropped option (Drop Code 7, Delivered Corrupt). o Otherwise, the receiving endpoint MUST drop the application data and report that data as dropped due to corruption using a Data Dropped option (Drop Code 3, Corrupt). In either case, the packet is considered acknowledgeable (since its header was processed) and will therefore be acknowledged using the equivalent of Ack Vector's Received or Received ECN Marked states. Although Data Checksum is intended for packets containing application data, it may be included on other packets, such as DCCP-Ack, DCCP-
Sync, and DCCP-SyncAck. The receiver SHOULD calculate the application data area's CRC-32c on such packets, just as it does for DCCP-Data and similar packets. If the CRCs differ, the packets similarly MUST be reported using Data Dropped options (Drop Code 3), although their application data areas would not be delivered to the application in any case.9.3.1. Check Data Checksum Feature
The Check Data Checksum feature lets a DCCP endpoint determine whether its peer will definitely check Data Checksum options. DCCP A sends a Mandatory "Change R(Check Data Checksum, 1)" option to DCCP B to require it to check Data Checksum options (the connection will be reset if it cannot). Check Data Checksum has feature number 9 and is server-priority. It takes one-byte Boolean values. DCCP B MUST check any received Data Checksum options when Check Data Checksum/B is one, although it MAY check them even when Check Data Checksum/B is zero. Values of two or more are reserved. New connections start with Check Data Checksum 0 for both endpoints.9.3.2. Checksum Usage Notes
Internet links must normally apply strong integrity checks to the packets they transmit [RFC3828, RFC3819]. This is the default case when the DCCP header's Checksum Coverage value equals zero (full coverage). However, the DCCP Checksum Coverage value might not be zero. By setting partial Checksum Coverage, the application indicates that it can tolerate corruption in the unprotected part of the application data. Recognizing this, link layers may reduce error detection and/or correction strength when transmitting this unprotected part. This, in turn, can significantly increase the likelihood of the endpoint's receiving corrupt data; Data Checksum lets the receiver detect that corruption with very high probability.