Tech-invite3GPPspaceIETFspace
96959493929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 4340

Datagram Congestion Control Protocol (DCCP)

Pages: 129
Proposed Standard
Errata
Updated by:  5595559663356773
Part 3 of 5 – Pages 44 to 76
First   Prev   Next

Top   ToC   RFC4340 - Page 44   prevText

7. Sequence Numbers

DCCP uses sequence numbers to arrange packets into sequence, to detect losses and network duplicates, and to protect against attackers, half-open connections, and the delivery of very old packets. Every packet carries a Sequence Number; most packet types carry an Acknowledgement Number as well. DCCP sequence numbers are packet based. That is, Sequence Numbers generated by each endpoint increase by one, modulo 2^48, per packet. Even DCCP-Ack and DCCP-Sync packets, and other packets that don't carry user data, increment the Sequence Number. Since DCCP is an unreliable protocol, there are no true retransmissions, but effective retransmissions, such as retransmissions of DCCP-Request packets, also increment the Sequence Number. This lets DCCP implementations
Top   ToC   RFC4340 - Page 45
   detect network duplication, retransmissions, and acknowledgement
   loss; it is a significant departure from TCP practice.

7.1. Variables

DCCP endpoints maintain a set of sequence number variables for each connection. ISS The Initial Sequence Number Sent by this endpoint. This equals the Sequence Number of the first DCCP-Request or DCCP-Response sent. ISR The Initial Sequence Number Received from the other endpoint. This equals the Sequence Number of the first DCCP-Request or DCCP-Response received. GSS The Greatest Sequence Number Sent by this endpoint. Here, and elsewhere, "greatest" is measured in circular sequence space. GSR The Greatest Sequence Number Received from the other endpoint on an acknowledgeable packet. (Section 7.4 defines this term.) GAR The Greatest Acknowledgement Number Received from the other endpoint on an acknowledgeable packet that was not a DCCP- Sync. Some other variables are derived from these primitives. SWL and SWH (Sequence Number Window Low and High) The extremes of the validity window for received packets' Sequence Numbers. AWL and AWH (Acknowledgement Number Window Low and High) The extremes of the validity window for received packets' Acknowledgement Numbers.

7.2. Initial Sequence Numbers

The endpoints' initial sequence numbers are set by the first DCCP- Request and DCCP-Response packets sent. Initial sequence numbers MUST be chosen to avoid two problems: o delivery of old packets, where packets lingering in the network from an old connection are delivered to a new connection with the same addresses and port numbers; and
Top   ToC   RFC4340 - Page 46
   o  sequence number attacks, where an attacker can guess the sequence
      numbers that a future connection would use [M85].

   These problems are the same as those faced by TCP, and DCCP
   implementations SHOULD use TCP's strategies to avoid them [RFC793,
   RFC1948].  The rest of this section explains these strategies in more
   detail.

   To address the first problem, an implementation MUST ensure that the
   initial sequence number for a given <source address, source port,
   destination address, destination port> 4-tuple doesn't overlap with
   recent sequence numbers on previous connections with the same
   4-tuple.  ("Recent" means sent within 2 maximum segment lifetimes, or
   4 minutes.)  The implementation MUST additionally ensure that the
   lower 24 bits of the initial sequence number don't overlap with the
   lower 24 bits of recent sequence numbers (unless the implementation
   plans to avoid short sequence numbers; see Section 7.6).  An
   implementation that has state for a recent connection with the same
   4-tuple can pick a good initial sequence number explicitly.
   Otherwise, it could tie initial sequence number selection to some
   clock, such as the 4-microsecond clock used by TCP [RFC793].  Two
   separate clocks may be required, one for the upper 24 bits and one
   for the lower 24 bits.

   To address the second problem, an implementation MUST provide each
   4-tuple with an independent initial sequence number space.  Then,
   opening a connection doesn't provide any information about initial
   sequence numbers on other connections to the same host.  [RFC1948]
   achieves this by adding a cryptographic hash of the 4-tuple and a
   secret to each initial sequence number.  For the secret, [RFC1948]
   recommends a combination of some truly random data [RFC4086], an
   administratively installed passphrase, the endpoint's IP address, and
   the endpoint's boot time, but truly random data is sufficient.  Care
   should be taken when the secret is changed; such a change alters all
   initial sequence number spaces, which might make an initial sequence
   number for some 4-tuple equal a recently sent sequence number for the
   same 4-tuple.  To avoid this problem, the endpoint might remember
   dead connection state for each 4-tuple or stay quiet for 2 maximum
   segment lifetimes around such a change.

7.3. Quiet Time

DCCP endpoints, like TCP endpoints, must take care before initiating connections when they boot. In particular, they MUST NOT send packets whose sequence numbers are close to the sequence numbers of packets lingering in the network from before the boot. The simplest way to enforce this rule is for DCCP endpoints to avoid sending any packets until one maximum segment lifetime (2 minutes) after boot.
Top   ToC   RFC4340 - Page 47
   Other enforcement mechanisms include remembering recent sequence
   numbers across boots and reserving the upper 8 or so bits of initial
   sequence numbers for a persistent counter that decrements by two each
   boot.  (The latter mechanism would require disallowing packets with
   short sequence numbers; see Section 7.6.1.)

7.4. Acknowledgement Numbers

Cumulative acknowledgements are meaningless in an unreliable protocol. Therefore, DCCP's Acknowledgement Number field has a different meaning from TCP's. A received packet is classified as acknowledgeable if and only if its header was successfully processed by the receiving DCCP. In terms of the pseudocode in Section 8.5, a received packet becomes acknowledgeable when the receiving endpoint reaches Step 8. This means, for example, that all acknowledgeable packets have valid header checksums and sequence numbers. A sent packet's Acknowledgement Number MUST equal the sending endpoint's GSR, the Greatest Sequence Number Received on an acknowledgeable packet, for all packet types except DCCP-Sync and DCCP-SyncAck. "Acknowledgeable" does not refer to data processing. Even acknowledgeable packets may have their application data dropped, due to receive buffer overflow or corruption, for instance. Data Dropped options report these data losses when necessary, letting congestion control mechanisms distinguish between network losses and endpoint losses. This issue is discussed further in Sections 11.4 and 11.7. DCCP-Sync and DCCP-SyncAck packets' Acknowledgement Numbers differ as follows: The Acknowledgement Number on a DCCP-Sync packet corresponds to a received packet, but not necessarily to an acknowledgeable packet; in particular, it might correspond to an out-of-sync packet whose options were not processed. The Acknowledgement Number on a DCCP-SyncAck packet always corresponds to an acknowledgeable DCCP- Sync packet; it might be less than GSR in the presence of reordering.

7.5. Validity and Synchronization

Any DCCP endpoint might receive packets that are not actually part of the current connection. For instance, the network might deliver an old packet, an attacker might attempt to hijack a connection, or the other endpoint might crash, causing a half-open connection. DCCP, like TCP, uses sequence number checks to detect these cases. Packets whose Sequence and/or Acknowledgement Numbers are out of range are called sequence-invalid and are not processed normally.
Top   ToC   RFC4340 - Page 48
   Unlike TCP, DCCP requires a synchronization mechanism to recover from
   large bursts of loss.  One endpoint might send so many packets during
   a burst of loss that when one of its packets finally got through, the
   other endpoint would label its Sequence Number as invalid.  A
   handshake of DCCP-Sync and DCCP-SyncAck packets recovers from this
   case.

7.5.1. Sequence and Acknowledgement Number Windows

Each DCCP endpoint defines sequence validity windows that are subsets of the Sequence and Acknowledgement Number spaces. These windows correspond to packets the endpoint expects to receive in the next few round-trip times. The Sequence and Acknowledgement Number windows always contain GSR and GSS, respectively. The window widths are controlled by Sequence Window features for the two half-connections. The Sequence Number validity window for packets from DCCP B is [SWL, SWH]. This window always contains GSR, the Greatest Sequence Number Received on a sequence-valid packet from DCCP B. It is W packets wide, where W is the value of the Sequence Window/B feature. One- fourth of the sequence window, rounded down, is less than or equal to GSR, and three-fourths is greater than GSR. (This asymmetric placement assumes that bursts of loss are more common in the network than significant reorderings.) invalid | valid Sequence Numbers | invalid <---------*|*===========*=======================*|*---------> GSR -|GSR + 1 - GSR GSR +|GSR + 1 + floor(W/4)|floor(W/4) ceil(3W/4)|ceil(3W/4) = SWL = SWH The Acknowledgement Number validity window for packets from DCCP B is [AWL, AWH]. The high end of the window, AWH, equals GSS, the Greatest Sequence Number Sent by DCCP A; the window is W' packets wide, where W' is the value of the Sequence Window/A feature. invalid | valid Acknowledgement Numbers | invalid <---------*|*===================================*|*---------> GSS - W'|GSS + 1 - W' GSS|GSS + 1 = AWL = AWH SWL and AWL are initially adjusted so that they are not less than the initial Sequence Numbers received and sent, respectively: SWL := max(GSR + 1 - floor(W/4), ISR), AWL := max(GSS + 1 - W', ISS).
Top   ToC   RFC4340 - Page 49
   These adjustments MUST be applied only at the beginning of the
   connection.  (Long-lived connections may wrap sequence numbers so
   that they appear to be less than ISR or ISS; the adjustments MUST NOT
   be applied in that case.)

7.5.2. Sequence Window Feature

The Sequence Window/A feature determines the width of the Sequence Number validity window used by DCCP B and the width of the Acknowledgement Number validity window used by DCCP A. DCCP A sends a "Change L(Sequence Window, W)" option to notify DCCP B that the Sequence Window/A value is W. Sequence Window has feature number 3 and is non-negotiable. It takes 48-bit (6-byte) integer values, like DCCP sequence numbers. Change and Confirm options for Sequence Window are therefore 9 bytes long. New connections start with Sequence Window 100 for both endpoints. The minimum valid Sequence Window value is Wmin = 32. The maximum valid Sequence Window value is Wmax = 2^46 - 1 = 70368744177663. Change options suggesting Sequence Window values out of this range are invalid and MUST be handled accordingly. A proper Sequence Window/A value must reflect the number of packets DCCP A expects to be in flight. Only DCCP A can anticipate this number. Values that are too small increase the risk of the endpoints getting out sync after bursts of loss, and values that are much too small can prevent productive communication whether or not there is loss. On the other hand, too-large values increase the risk of connection hijacking; Section 7.5.5 quantifies this risk. One good guideline is for each endpoint to set Sequence Window to about five times the maximum number of packets it expects to send in a round- trip time. Endpoints SHOULD send Change L(Sequence Window) options, as necessary, as the connection progresses. Also, an endpoint MUST NOT persistently send more than its Sequence Window number of packets per round-trip time; that is, DCCP A MUST NOT persistently send more than Sequence Window/A packets per RTT.

7.5.3. Sequence-Validity Rules

Sequence-validity depends on the received packet's type. This table shows the sequence and acknowledgement number checks applied to each packet; a packet is sequence-valid if it passes both tests, and sequence-invalid if it does not. Many of the checks refer to the sequence and acknowledgement number validity windows [SWL, SWH] and [AWL, AWH] defined in Section 7.5.1.
Top   ToC   RFC4340 - Page 50
                                             Acknowledgement Number
   Packet Type      Sequence Number Check    Check
   -----------      ---------------------    ----------------------
   DCCP-Request     SWL <= seqno <= SWH (*)  N/A
   DCCP-Response    SWL <= seqno <= SWH (*)  AWL <= ackno <= AWH
   DCCP-Data        SWL <= seqno <= SWH      N/A
   DCCP-Ack         SWL <= seqno <= SWH      AWL <= ackno <= AWH
   DCCP-DataAck     SWL <= seqno <= SWH      AWL <= ackno <= AWH
   DCCP-CloseReq    GSR <  seqno <= SWH      GAR <= ackno <= AWH
   DCCP-Close       GSR <  seqno <= SWH      GAR <= ackno <= AWH
   DCCP-Reset       GSR <  seqno <= SWH      GAR <= ackno <= AWH
   DCCP-Sync        SWL <= seqno             AWL <= ackno <= AWH
   DCCP-SyncAck     SWL <= seqno             AWL <= ackno <= AWH

   (*) Check not applied if connection is in LISTEN or REQUEST state.

   In general, packets are sequence-valid if their Sequence and
   Acknowledgement Numbers lie within the corresponding valid windows,
   [SWL, SWH] and [AWL, AWH].  The exceptions to this rule are as
   follows:

   o  Since DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets end a
      connection, they cannot have Sequence Numbers less than or equal
      to GSR, or Acknowledgement Numbers less than GAR.

   o  DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly
      checked.  These packet types exist specifically to get the
      endpoints back into sync; checking their Sequence Numbers would
      eliminate their usefulness.

   The lenient checks on DCCP-Sync and DCCP-SyncAck packets allow
   continued operation after unusual events, such as endpoint crashes
   and large bursts of loss, but there's no need for leniency in the
   absence of unusual events -- that is, during ongoing successful
   communication.  Therefore, DCCP implementations SHOULD use the
   following, more stringent checks for active connections, where a
   connection is considered active if it has received valid packets from
   the other endpoint within the last three round-trip times.

                                             Acknowledgement Number
   Packet Type      Sequence Number Check    Check
   -----------      ---------------------    ----------------------
   DCCP-Sync        SWL <= seqno <= SWH      AWL <= ackno <= AWH
   DCCP-SyncAck     SWL <= seqno <= SWH      AWL <= ackno <= AWH

   Finally, an endpoint MAY apply the following more stringent checks to
   DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets, further lowering
   the probability of successful blind attacks using those packet types.
Top   ToC   RFC4340 - Page 51
   Since these checks can cause extra synchronization overhead and delay
   connection closing when packets are lost, they should be considered
   experimental.

                                             Acknowledgement Number
   Packet Type      Sequence Number Check    Check
   -----------      ---------------------    ----------------------
   DCCP-CloseReq    seqno == GSR + 1         GAR <= ackno <= AWH
   DCCP-Close       seqno == GSR + 1         GAR <= ackno <= AWH
   DCCP-Reset       seqno == GSR + 1         GAR <= ackno <= AWH

   Note that sequence-validity is only one of the validity checks
   applied to received packets.

7.5.4. Handling Sequence-Invalid Packets

Endpoints respond to received sequence-invalid packets as follows. o Any sequence-invalid DCCP-Sync or DCCP-SyncAck packet MUST be ignored. o A sequence-invalid DCCP-Reset packet MUST elicit a DCCP-Sync packet in response (subject to a possible rate limit). This response packet MUST use a new Sequence Number, and thus will increase GSS; GSR will not change, however, since the received packet was sequence-invalid. The response packet's Acknowledgement Number MUST equal GSR. o Any other sequence-invalid packet MUST elicit a similar DCCP-Sync packet, except that the response packet's Acknowledgement Number MUST equal the sequence-invalid packet's Sequence Number. On receiving a sequence-valid DCCP-Sync packet, the peer endpoint (say, DCCP B) MUST update its GSR variable and reply with a DCCP- SyncAck packet. The DCCP-SyncAck packet's Acknowledgement Number will equal the DCCP-Sync's Sequence Number, which is not necessarily GSR. Upon receiving this DCCP-SyncAck, which will be sequence-valid since it acknowledges the DCCP-Sync, DCCP A will update its GSR variable, and the endpoints will be back in sync. As an exception, if the peer endpoint is in the REQUEST state, it MUST respond with a DCCP-Reset instead of a DCCP-SyncAck. This serves to clean up DCCP A's half-open connection. To protect against denial-of-service attacks, DCCP implementations SHOULD impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets, such as not more than eight DCCP-Syncs per second.
Top   ToC   RFC4340 - Page 52
   DCCP endpoints MUST NOT process sequence-invalid packets except,
   perhaps, by generating a DCCP-Sync.  For instance, options MUST NOT
   be processed.  An endpoint MAY temporarily preserve sequence-invalid
   packets in case they become valid later, however; this can reduce the
   impact of bursts of loss by delivering more packets to the
   application.  In particular, an endpoint MAY preserve sequence-
   invalid packets for up to 2 round-trip times.  If, within that time,
   the relevant sequence windows change so that the packets become
   sequence-valid, the endpoint MAY process them again.

   Note that sequence-invalid DCCP-Reset packets cause DCCP-Syncs to be
   generated.  This is because endpoints in an unsynchronized state
   (CLOSED, REQUEST, and LISTEN) might not have enough information to
   generate a proper DCCP-Reset on the first try.  For example, if a
   peer endpoint is in CLOSED state and receives a DCCP-Data packet, it
   cannot guess the right Sequence Number to use on the DCCP-Reset it
   generates (since the DCCP-Data packet has no Acknowledgement Number).
   The DCCP-Sync generated in response to this bad reset serves as a
   challenge, and contains enough information for the peer to generate a
   proper DCCP-Reset.  However, the new DCCP-Reset may carry a different
   Reset Code than the original DCCP-Reset; probably the new Reset Code
   will be 3, "No Connection".  The endpoint SHOULD use information from
   the original DCCP-Reset when possible.

7.5.5. Sequence Number Attacks

Sequence and Acknowledgement Numbers form DCCP's main line of defense against attackers. An attacker that cannot guess sequence numbers cannot easily manipulate or hijack a DCCP connection, and requirements like careful initial sequence number choice eliminate the most serious attacks. An attacker might still send many packets with randomly chosen Sequence and Acknowledgement Numbers, however. If one of those probes ends up sequence-valid, it may shut down the connection or otherwise cause problems. The easiest such attacks to execute are as follows: o Send DCCP-Data packets with random Sequence Numbers. If one of these packets hits the valid sequence number window, the attack packet's application data may be inserted into the data stream. o Send DCCP-Sync packets with random Sequence and Acknowledgement Numbers. If one of these packets hits the valid acknowledgement number window, the receiver will shift its sequence number window accordingly, getting out of sync with the correct endpoint -- perhaps permanently.
Top   ToC   RFC4340 - Page 53
   The attacker has to guess both Source and Destination Ports for any
   of these attacks to succeed.  Additionally, the connection would have
   to be inactive for the DCCP-Sync attack to succeed, assuming the
   victim implemented the more stringent checks for active connections
   recommended in Section 7.5.3.

   To quantify the probability of success, let N be the number of attack
   packets the attacker is willing to send, W be the relevant sequence
   window width, and L be the length of sequence numbers (24 or 48).
   The attacker's best strategy is to space the attack packets evenly
   over sequence space.  Then the probability of hitting one sequence
   number window is P = WN/2^L.

   The success probability for a DCCP-Data attack using short sequence
   numbers thus equals P = WN/2^24.  For W = 100, then, the attacker
   must send more than 83,000 packets to achieve a 50% chance of
   success.  For reference, the easiest TCP attack -- sending a SYN with
   a random sequence number, which will cause a connection reset if it
   falls within the window -- with W = 8760 (a common default) and
   L = 32 requires more than 245,000 packets to achieve a 50% chance of
   success.

   A fast connection's W will generally be high, increasing the attack
   success probability for fixed N.  If this probability gets
   uncomfortably high with L = 24, the endpoint SHOULD prevent the use
   of short sequence numbers by manipulating the Allow Short Sequence
   Numbers feature (see Section 7.6.1).  The probability limit depends
   on the application, however.  Some applications, such as those
   already designed to handle corruption, are quite resilient to data
   injection attacks.

   The DCCP-Sync attack has L = 48, since DCCP-Sync packets use long
   sequence numbers exclusively; in addition, the success probability is
   halved, since only half the Sequence Number space is valid.  Attacks
   have a correspondingly smaller probability of success.  For a large W
   of 2000 packets, then, the attacker must send more than 10^11 packets
   to achieve a 50% chance of success.

   Attacks involving DCCP-Ack, DCCP-DataAck, DCCP-CloseReq, DCCP-Close,
   and DCCP-Reset packets are more difficult, since Sequence and
   Acknowledgement Numbers must both be guessed.  The probability of
   attack success for these packet types equals P = WXN/2^(2L), where W
   is the Sequence Number window, X is the Acknowledgement Number
   window, and N and L are as before.

   Since DCCP-Data attacks with short sequence numbers are relatively
   easy for attackers to execute, DCCP has been engineered to prevent
   these attacks from escalating to connection resets or other serious
Top   ToC   RFC4340 - Page 54
   consequences.  In particular, any options whose processing might
   cause the connection to be reset are ignored when they appear on
   DCCP-Data packets.

7.5.6. Sequence Number Handling Examples

In the following example, DCCP A and DCCP B recover from a large burst of loss that runs DCCP A's sequence numbers out of DCCP B's appropriate sequence number window. DCCP A DCCP B (GSS=1,GSR=10) (GSS=10,GSR=1) --> DCCP-Data(seq 2) XXX ... --> DCCP-Data(seq 100) XXX --> DCCP-Data(seq 101) --> ??? seqno out of range; send Sync OK <-- DCCP-Sync(seq 11, ack 101) <-- (GSS=11,GSR=1) --> DCCP-SyncAck(seq 102, ack 11) --> OK (GSS=102,GSR=11) (GSS=11,GSR=102) In the next example, a DCCP connection recovers from a simple blind attack. DCCP A DCCP B (GSS=1,GSR=10) (GSS=10,GSR=1) *ATTACKER* --> DCCP-Data(seq 10^6) --> ??? seqno out of range; send Sync ??? <-- DCCP-Sync(seq 11, ack 10^6) <-- ackno out of range; ignore (GSS=1,GSR=10) (GSS=11,GSR=1) The final example demonstrates recovery from a half-open connection. DCCP A DCCP B (GSS=1,GSR=10) (GSS=10,GSR=1) (Crash) CLOSED OPEN REQUEST --> DCCP-Request(seq 400) --> ??? !! <-- DCCP-Sync(seq 11, ack 400) <-- OPEN REQUEST --> DCCP-Reset(seq 401, ack 11) --> (Abort) REQUEST CLOSED REQUEST --> DCCP-Request(seq 402) --> ...
Top   ToC   RFC4340 - Page 55

7.6. Short Sequence Numbers

DCCP sequence numbers are 48 bits long. This large sequence space protects DCCP connections against some blind attacks, such as the injection of DCCP-Resets into the connection. However, DCCP-Data, DCCP-Ack, and DCCP-DataAck packets, which make up the body of any DCCP connection, may reduce header space by transmitting only the lower 24 bits of the relevant Sequence and Acknowledgement Numbers. The receiving endpoint will extend these numbers to 48 bits using the following pseudocode: procedure Extend_Sequence_Number(S, REF) /* S is a 24-bit sequence number from the packet header. REF is the relevant 48-bit reference sequence number: GSS if S is an Acknowledgement Number, and GSR if S is a Sequence Number. */ Set REF_low := low 24 bits of REF Set REF_hi := high 24 bits of REF If REF_low (<) S /* circular comparison mod 2^24 */ and S |<| REF_low, /* conventional, non-circular comparison */ Return (((REF_hi + 1) mod 2^24) << 24) | S Otherwise, if S (<) REF_low and REF_low |<| S, Return (((REF_hi - 1) mod 2^24) << 24) | S Otherwise, Return (REF_hi << 24) | S The two different kinds of comparison in the if statements detect when the low-order bits of the sequence space have wrapped. (The circular comparison "REF_low (<) S" returns true if and only if (S - REF_low), calculated using two's-complement arithmetic and then represented as an unsigned number, is less than or equal to 2^23 (mod 2^24).) When this happens, the high-order bits are incremented or decremented, as appropriate.

7.6.1. Allow Short Sequence Numbers Feature

Endpoints can require that all packets use long sequence numbers by leaving the Allow Short Sequence Numbers feature value at its default of zero. This can reduce the risk that data will be inappropriately injected into the connection. DCCP A sends a "Change L(Allow Short Seqnos, 1)" option to indicate its desire to send packets with short sequence numbers. Allow Short Sequence Numbers has feature number 2 and is server- priority. It takes one-byte Boolean values. When Allow Short Seqnos/B is zero, DCCP B MUST NOT send packets with short sequence numbers and DCCP A MUST ignore any packets with short sequence
Top   ToC   RFC4340 - Page 56
   numbers that are received.  Values of two or more are reserved.  New
   connections start with Allow Short Sequence Numbers 0 for both
   endpoints.

7.6.2. When to Avoid Short Sequence Numbers

Short sequence numbers reduce the rate DCCP connections can safely achieve and increase the risks of certain kinds of attacks, including blind data injection. Very-high-rate DCCP connections, and connections with large sequence windows (Section 7.5.2), SHOULD NOT use short sequence numbers on their data packets. The attack risk issues have been discussed in Section 7.5.5; we discuss the rate limitation issue here. The sequence-validity mechanism assumes that the network does not deliver extremely old data. In particular, it assumes that the network must have dropped any packet by the time the connection wraps around and uses its sequence number again. This constraint limits the maximum connection rate that can be safely achieved. Let MSL equal the maximum segment lifetime, P equal the average DCCP packet size in bits, and L equal the length of sequence numbers (24 or 48 bits). Then the maximum safe rate, in bits per second, is R = P*(2^L)/2MSL. For the default MSL of 2 minutes, 1500-byte DCCP packets, and short sequence numbers, the safe rate is therefore approximately 800 Mb/s. Although 2 minutes is a very large MSL for any networks that could sustain that rate with such small packets, long sequence numbers allow much higher rates under the same constraints: up to 14 petabits a second for 1500-byte packets and the default MSL.

7.7. NDP Count and Detecting Application Loss

DCCP's sequence numbers increment by one on every packet, including non-data packets (packets that don't carry application data). This makes DCCP sequence numbers suitable for detecting any network loss, but not for detecting the loss of application data. The NDP Count option reports the length of each burst of non-data packets. This lets the receiving DCCP reliably determine when a burst of loss included application data. +--------+--------+-------- ... --------+ |00100101| Length | NDP Count | +--------+--------+-------- ... --------+ Type=37 Len=3-8 (1-6 bytes) If a DCCP endpoint's Send NDP Count feature is one (see below), then that endpoint MUST send an NDP Count option on every packet whose
Top   ToC   RFC4340 - Page 57
   immediate predecessor was a non-data packet.  Non-data packets
   consist of DCCP packet types DCCP-Ack, DCCP-Close, DCCP-CloseReq,
   DCCP-Reset, DCCP-Sync, and DCCP-SyncAck.  The other packet types,
   namely DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, are
   considered data packets, although not all DCCP-Request and DCCP-
   Response packets will actually carry application data.

   The value stored in NDP Count equals the number of consecutive non-
   data packets in the run immediately previous to the current packet.
   Packets with no NDP Count option are considered to have NDP Count
   zero.

   The NDP Count option can carry one to six bytes of data.  The
   smallest option format that can hold the NDP Count SHOULD be used.

   With NDP Count, the receiver can reliably tell only whether a burst
   of loss contained at least one data packet.  For example, the
   receiver cannot always tell whether a burst of loss contained a non-
   data packet.

7.7.1. NDP Count Usage Notes

Say that K consecutive sequence numbers are missing in some burst of loss, and that the Send NDP Count feature is on. Then some application data was lost within those sequence numbers unless the packet following the hole contains an NDP Count option whose value is greater than or equal to K. For example, say that an endpoint sent the following sequence of non-data packets (Nx) and data packets (Dx). N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 Those packets would have NDP Counts as follows. N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 - 1 2 - 1 - - 1 - - - - 1 2 NDP Count is not useful for applications that include their own sequence numbers with their packet headers.

7.7.2. Send NDP Count Feature

The Send NDP Count feature lets DCCP endpoints negotiate whether they should send NDP Count options on their packets. DCCP A sends a "Change R(Send NDP Count, 1)" option to ask DCCP B to send NDP Count options.
Top   ToC   RFC4340 - Page 58
   Send NDP Count has feature number 7 and is server-priority.  It takes
   one-byte Boolean values.  DCCP B MUST send NDP Count options as
   described above when Send NDP Count/B is one, although it MAY send
   NDP Count options even when Send NDP Count/B is zero.  Values of two
   or more are reserved.  New connections start with Send NDP Count 0
   for both endpoints.

8. Event Processing

This section describes how DCCP connections move between states and which packets are sent when. Note that feature negotiation takes place in parallel with the connection-wide state transitions described here.

8.1. Connection Establishment

DCCP connections' initiation phase consists of a three-way handshake: an initial DCCP-Request packet sent by the client, a DCCP-Response sent by the server in reply, and finally an acknowledgement from the client, usually via a DCCP-Ack or DCCP-DataAck packet. The client moves from the REQUEST state to PARTOPEN, and finally to OPEN; the server moves from LISTEN to RESPOND, and finally to OPEN. Client State Server State CLOSED LISTEN 1. REQUEST --> Request --> 2. <-- Response <-- RESPOND 3. PARTOPEN --> Ack, DataAck --> 4. <-- Data, Ack, DataAck <-- OPEN 5. OPEN <-> Data, Ack, DataAck <-> OPEN

8.1.1. Client Request

When a client decides to initiate a connection, it enters the REQUEST state, chooses an initial sequence number (Section 7.2), and sends a DCCP-Request packet using that sequence number to the intended server. DCCP-Request packets will commonly carry feature negotiation options that open negotiations for various connection parameters, such as preferred congestion control IDs for each half-connection. They may also carry application data, but the client should be aware that the server may not accept such data. A client in the REQUEST state SHOULD use an exponential-backoff timer to send new DCCP-Request packets if no response is received. The first retransmission should occur after approximately one second, backing off to not less than one packet every 64 seconds; or the
Top   ToC   RFC4340 - Page 59
   endpoint can use whatever retransmission strategy is followed for
   retransmitting TCP SYNs.  Each new DCCP-Request MUST increment the
   Sequence Number by one and MUST contain the same Service Code and
   application data as the original DCCP-Request.

   A client MAY give up on its DCCP-Requests after some time (3 minutes,
   for example).  When it does, it SHOULD send a DCCP-Reset packet to
   the server with Reset Code 2, "Aborted", to clean up state in case
   one or more of the Requests actually arrived.  A client in REQUEST
   state has never received an initial sequence number from its peer, so
   the DCCP-Reset's Acknowledgement Number MUST be set to zero.

   The client leaves the REQUEST state for PARTOPEN when it receives a
   DCCP-Response from the server.

8.1.2. Service Codes

Each DCCP-Request contains a 32-bit Service Code, which identifies the application-level service to which the client application is trying to connect. Service Codes should correspond to application services and protocols. For example, there might be a Service Code for SIP control connections and one for RTP audio connections. Middleboxes, such as firewalls, can use the Service Code to identify the application running on a nonstandard port (assuming the DCCP header has not been encrypted). Endpoints MUST associate a Service Code with every DCCP socket, both actively and passively opened. The application will generally supply this Service Code. Each active socket MUST have exactly one Service Code. Passive sockets MAY, at the implementation's discretion, be associated with more than one Service Code; this might let multiple applications, or multiple versions of the same application, listen on the same port, differentiated by Service Code. If the DCCP-Request's Service Code doesn't equal any of the server's Service Codes for the given port, the server MUST reject the request by sending a DCCP- Reset packet with Reset Code 8, "Bad Service Code". A middlebox MAY also send such a DCCP-Reset in response to packets whose Service Code is considered unsuitable. Service Codes are not intended to be DCCP-specific and are allocated by IANA. Following the policies outlined in [RFC2434], most Service Codes are allocated First Come First Served, subject to the following guidelines. o Service Codes are allocated one at a time, or in small blocks. A short English description of the intended service is REQUIRED to obtain a Service Code assignment, but no specification, standards
Top   ToC   RFC4340 - Page 60
      track or otherwise, is necessary.  IANA maintains an association
      of Service Codes to the corresponding phrases.

   o  Users request specific Service Code values.  We suggest that users
      request Service Codes that can be represented using the "SC:"
      formatting convention described below.  Thus, the "Frobodyne Plotz
      Protocol" might correspond to Service Code 17178548426 or,
      equivalently, "SC:fdpz".  The canonical interpretation of a
      Service Code field is numeric.

   o  Service Codes whose bytes each have values in the set {32, 45-57,
      65-90} use a Specification Required allocation policy.  That is,
      these Service Codes are used for international standard or
      standards-track specifications, IETF or otherwise.  (This set
      consists of the ASCII digits, uppercase letters, and characters
      space, '-', '.', and '/'.)

   o  Service Codes whose high-order byte equals 63 (ASCII '?') are
      reserved for Private Use.

   o  Service Code 0 represents the absence of a meaningful Service Code
      and MUST NOT be allocated.

   o  The value 4294967295 is an invalid Service Code.  Servers MUST
      reject any DCCP-Request with this Service Code value by sending a
      DCCP-Reset packet with Reset Code 8, "Bad Service Code".

   This design for Service Code allocation is based on the allocation of
   4-byte identifiers for Macintosh resources, PNG chunks, and TrueType
   and OpenType tables.

   In text settings, we recommend that Service Codes be written in one
   of three forms, prefixed by the ASCII letters SC and either a colon
   ":" or equals sign "=".  These forms are interpreted as follows.

   SC:     Indicates a Service Code representable using a subset of the
           ASCII characters.  The colon is followed by one to four
           characters taken from the following set: letters, digits, and
           the characters in "-_+.*/?@" (not including quotes).
           Numerically, these characters have values in {42-43, 45-57,
           63-90, 95, 97-122}.  The Service Code is calculated by
           padding the string on the right with spaces (value 32) and
           intepreting the four-character result as a 32-bit big-endian
           number.

   SC=     Indicates a decimal Service Code.  The equals sign is
           followed by any number of decimal digits, which specify the
           Service Code.  Values above 4294967294 are illegal.
Top   ToC   RFC4340 - Page 61
   SC=x or SC=X
           Indicates a hexadecimal Service Code.  The "x" or "X" is
           followed by any number of hexadecimal digits (upper or lower
           case), which specify the Service Code.  Values above
           4294967294 are illegal.

   Thus, the Service Code 1717858426 might be represented in text as
   either SC:fdpz, SC=1717858426, or SC=x6664707A.

8.1.3. Server Response

In the second phase of the three-way handshake, the server moves from the LISTEN state to RESPOND and sends a DCCP-Response message to the client. In this phase, a server will often specify the features it would like to use, either from among those the client requested or in addition to those. Among these options is the congestion control mechanism the server expects to use. The server MAY respond to a DCCP-Request packet with a DCCP-Reset packet to refuse the connection. Relevant Reset Codes for refusing a connection include 7, "Connection Refused", when the DCCP-Request's Destination Port did not correspond to a DCCP port open for listening; 8, "Bad Service Code", when the DCCP-Request's Service Code did not correspond to the service code registered with the Destination Port; and 9, "Too Busy", when the server is currently too busy to respond to requests. The server SHOULD limit the rate at which it generates these resets; for example, to not more than 1024 per second. The server SHOULD NOT retransmit DCCP-Response packets; the client will retransmit the DCCP-Request if necessary. (Note that the "retransmitted" DCCP-Request will have, at least, a different sequence number from the "original" DCCP-Request. The server can thus distinguish true retransmissions from network duplicates.) The server will detect that the retransmitted DCCP-Request applies to an existing connection because of its Source and Destination Ports. Every valid DCCP-Request received while the server is in the RESPOND state MUST elicit a new DCCP-Response. Each new DCCP-Response MUST increment the server's Sequence Number by one and MUST include the same application data, if any, as the original DCCP-Response. The server MUST NOT accept more than one piece of DCCP-Request application data per connection. In particular, the DCCP-Response sent in reply to a retransmitted DCCP-Request with application data SHOULD contain a Data Dropped option, in which the retransmitted DCCP-Request data is reported with Drop Code 0, Protocol Constraints. The original DCCP-Request SHOULD also be reported in the Data Dropped option, either in a Normal Block (if the server accepted the data or
Top   ToC   RFC4340 - Page 62
   there was no data) or in a Drop Code 0 Drop Block (if the server
   refused the data the first time as well).

   The Data Dropped and Init Cookie options are particularly useful for
   DCCP-Response packets (Sections 11.7 and 8.1.4).

   The server leaves the RESPOND state for OPEN when it receives a valid
   DCCP-Ack from the client, completing the three-way handshake.  It MAY
   also leave the RESPOND state for CLOSED after a timeout of not less
   than 4MSL (8 minutes); when doing so, it SHOULD send a DCCP-Reset
   with Reset Code 2, "Aborted", to clean up state at the client.

8.1.4. Init Cookie Option

+--------+--------+--------+--------+--------+-------- |00100100| Length | Init Cookie Value ... +--------+--------+--------+--------+--------+-------- Type=36 The Init Cookie option lets a DCCP server avoid having to hold any state until the three-way connection setup handshake has completed, in a similar fashion as for TCP SYN cookies [SYNCOOKIES]. The server wraps up the Service Code, server port, and any options it cares about from both the DCCP-Request and DCCP-Response in an opaque cookie. Typically the cookie will be encrypted using a secret known only to the server and will include a cryptographic checksum or magic value so that correct decryption can be verified. When the server receives the cookie back in the response, it can decrypt the cookie and instantiate all the state it avoided keeping. In the meantime, it need not move from the LISTEN state. The Init Cookie option MUST NOT be sent on DCCP-Request or DCCP-Data packets. Any Init Cookie options received on DCCP-Request or DCCP- Data packets, or after the connection has been established (when the connection's state is >= OPEN), MUST be ignored. The server MAY include Init Cookie options in its DCCP-Response. If so, then the client MUST echo the same Init Cookie options, in the same order, in each succeeding DCCP packet until one of those packets is acknowledged (showing that the three-way handshake has completed) or the connection is reset. As a result, the client MUST NOT use DCCP- Data packets until the three-way handshake completes or the connection is reset. The Init Cookie options on a client packet MUST equal those received on the DCCP-Request indicated by the client packet's Acknowledgement Number. The server SHOULD design its Init Cookie format so that Init Cookies can be checked for tampering; it SHOULD respond to a tampered Init Cookie option by resetting the connection with Reset Code 10, "Bad Init Cookie".
Top   ToC   RFC4340 - Page 63
   Init Cookie's precise implementation need not be specified here;
   since Init Cookies are opaque to the client, there are no
   interoperability concerns.  An example cookie format might encrypt
   (using a secret key) the connection's initial sequence and
   acknowledgement numbers, ports, Service Code, any options included on
   the DCCP-Request packet and the corresponding DCCP-Response, a random
   salt, and a magic number.  On receiving a reflected Init Cookie, the
   server would decrypt the cookie, validate it by checking its magic
   number, sequence numbers, and ports, and, if valid, create a
   corresponding socket using the options.

   Each individual Init Cookie option can hold at most 253 bytes of
   data, but a server can send multiple Init Cookie options to gain more
   space.

8.1.5. Handshake Completion

When the client receives a DCCP-Response from the server, it moves from the REQUEST state to PARTOPEN and completes the three-way handshake by sending a DCCP-Ack packet to the server. The client remains in PARTOPEN until it can be sure that the server has received some packet the client sent from PARTOPEN (either the initial DCCP- Ack or a later packet). Clients in the PARTOPEN state that want to send data MUST do so using DCCP-DataAck packets, not DCCP-Data packets. This is because DCCP-Data packets lack Acknowledgement Numbers, so the server can't tell from a DCCP-Data packet whether the client saw its DCCP-Response. Furthermore, if the DCCP-Response included an Init Cookie, that Init Cookie MUST be included on every packet sent in PARTOPEN. The single DCCP-Ack sent when entering the PARTOPEN state might, of course, be dropped by the network. The client SHOULD ensure that some packet gets through eventually. The preferred mechanism would be a roughly 200-millisecond timer, set every time a packet is transmitted in PARTOPEN. If this timer goes off and the client is still in PARTOPEN, the client generates another DCCP-Ack and backs off the timer. If the client remains in PARTOPEN for more than 4MSL (8 minutes), it SHOULD reset the connection with Reset Code 2, "Aborted". The client leaves the PARTOPEN state for OPEN when it receives a valid packet other than DCCP-Response, DCCP-Reset, or DCCP-Sync from the server.

8.2. Data Transfer

In the central data transfer phase of the connection, both server and client are in the OPEN state.
Top   ToC   RFC4340 - Page 64
   DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to
   application events on host A.  These packets are congestion-
   controlled by the CCID for the A-to-B half-connection.  In contrast,
   DCCP-Ack packets sent by DCCP A are controlled by the CCID for the
   B-to-A half-connection.  Generally, DCCP A will piggyback
   acknowledgement information on DCCP-Data packets when acceptable,
   creating DCCP-DataAck packets.  DCCP-Ack packets are used when there
   is no data to send from DCCP A to DCCP B, or when the congestion
   state of the A-to-B CCID will not allow data to be sent.

   DCCP-Sync and DCCP-SyncAck packets may also occur in the data
   transfer phase.  Some cases causing DCCP-Sync generation are
   discussed in Section 7.5.  One important distinction between DCCP-
   Sync packets and other packet types is that DCCP-Sync elicits an
   immediate acknowledgement.  On receiving a valid DCCP-Sync packet, a
   DCCP endpoint MUST immediately generate and send a DCCP-SyncAck
   response (subject to any implementation rate limits); the
   Acknowledgement Number on that DCCP-SyncAck MUST equal the Sequence
   Number of the DCCP-Sync.

   A particular DCCP implementation might decide to initiate feature
   negotiation only once the OPEN state was reached, in which case it
   might not allow data transfer until some time later.  Data received
   during that time SHOULD be rejected and reported using a Data Dropped
   Drop Block with Drop Code 0, Protocol Constraints (see Section 11.7).

8.3. Termination

DCCP connection termination uses a handshake consisting of an optional DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset packet. The server moves from the OPEN state, possibly through the CLOSEREQ state, to CLOSED; the client moves from OPEN through CLOSING to TIMEWAIT, and after 2MSL wait time (4 minutes) to CLOSED. The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the server decides to close the connection but doesn't want to hold TIMEWAIT state: Client State Server State OPEN OPEN 1. <-- CloseReq <-- CLOSEREQ 2. CLOSING --> Close --> 3. <-- Reset <-- CLOSED (LISTEN) 4. TIMEWAIT 5. CLOSED
Top   ToC   RFC4340 - Page 65
   A shorter sequence occurs when the client decides to close the
   connection.

     Client State                             Server State
        OPEN                                     OPEN
   1.   CLOSING   -->        Close         -->
   2.             <--        Reset         <--   CLOSED (LISTEN)
   3.   TIMEWAIT
   4.   CLOSED

   Finally, the server can decide to hold TIMEWAIT state:

     Client State                             Server State
        OPEN                                     OPEN
   1.             <--        Close         <--   CLOSING
   2.   CLOSED    -->        Reset         -->
   3.                                            TIMEWAIT
   4.                                            CLOSED (LISTEN)

   In all cases, the receiver of the DCCP-Reset packet holds TIMEWAIT
   state for the connection.  As in TCP, TIMEWAIT state, where an
   endpoint quietly preserves a socket for 2MSL (4 minutes) after its
   connection has closed, ensures that no connection duplicating the
   current connection's source and destination addresses and ports can
   start up while old packets might remain in the network.

   The termination handshake proceeds as follows.  The receiver of a
   valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet.
   The receiver of a valid DCCP-Close packet MUST respond with a DCCP-
   Reset packet with Reset Code 1, "Closed".  The receiver of a valid
   DCCP-Reset packet -- which is also the sender of the DCCP-Close
   packet (and possibly the receiver of the DCCP-CloseReq packet) --
   will hold TIMEWAIT state for the connection.

   A DCCP-Reset packet completes every DCCP connection, whether the
   termination is clean (due to application close; Reset Code 1,
   "Closed") or unclean.  Unlike TCP, which has two distinct termination
   mechanisms (FIN and RST), DCCP ends all connections in a uniform
   manner.  This is justified because some aspects of connection
   termination are the same independent of whether termination was
   clean.  For instance, the endpoint that receives a valid DCCP-Reset
   SHOULD hold TIMEWAIT state for the connection.  Processors that must
   distinguish between clean and unclean termination can examine the
   Reset Code.  DCCP implementations generally transition to the CLOSED
   state after sending a DCCP-Reset packet.

   Endpoints in the CLOSEREQ and CLOSING states MUST retransmit DCCP-
   CloseReq and DCCP-Close packets, respectively, until leaving those
Top   ToC   RFC4340 - Page 66
   states.  The retransmission timer should initially be set to go off
   in two round-trip times and should back off to not less than once
   every 64 seconds if no relevant response is received.

   Only the server can send a DCCP-CloseReq packet or enter the CLOSEREQ
   state.  A server receiving a sequence-valid DCCP-CloseReq packet MUST
   respond with a DCCP-Sync packet and otherwise ignore the DCCP-
   CloseReq.

   DCCP-Data, DCCP-DataAck, and DCCP-Ack packets received in CLOSEREQ or
   CLOSING states MAY be either processed or ignored.

8.3.1. Abnormal Termination

DCCP endpoints generate DCCP-Reset packets to terminate connections abnormally; a DCCP-Reset packet may be generated from any state. Resets sent in the CLOSED, LISTEN, and TIMEWAIT states use Reset Code 3, "No Connection", unless otherwise specified. Resets sent in the REQUEST or RESPOND states use Reset Code 4, "Packet Error", unless otherwise specified. DCCP endpoints in CLOSED, LISTEN, or TIMEWAIT state may need to generate a DCCP-Reset packet in response to a packet received from a peer. Since these states have no associated sequence number variables, the Sequence and Acknowledgement Numbers on the DCCP-Reset packet R are taken from the received packet P, as follows. 1. If P.ackno exists, then set R.seqno := P.ackno + 1. Otherwise, set R.seqno := 0. 2. Set R.ackno := P.seqno. 3. If the packet used short sequence numbers (P.X == 0), then set the upper 24 bits of R.seqno and R.ackno to 0.

8.4. DCCP State Diagram

The most common state transitions discussed above can be summarized in the following state diagram. The diagram is illustrative; the text in Section 8.5 and elsewhere should be considered definitive. For example, there are arcs (not shown) from every state except CLOSED to TIMEWAIT, contingent on the receipt of a valid DCCP-Reset.
Top   ToC   RFC4340 - Page 67
   +---------------------------+    +---------------------------+
   |                           v    v                           |
   |                        +----------+                        |
   |          +-------------+  CLOSED  +------------+           |
   |          | passive     +----------+  active    |           |
   |          |  open                      open     |           |
   |          |                         snd Request |           |
   |          v                                     v           |
   |     +----------+                          +----------+     |
   |     |  LISTEN  |                          | REQUEST  |     |
   |     +----+-----+                          +----+-----+     |
   |          | rcv Request            rcv Response |           |
   |          | snd Response             snd Ack    |           |
   |          v                                     v           |
   |     +----------+                          +----------+     |
   |     | RESPOND  |                          | PARTOPEN |     |
   |     +----+-----+                          +----+-----+     |
   |          | rcv Ack/DataAck         rcv packet  |           |
   |          |                                     |           |
   |          |             +----------+            |           |
   |          +------------>|   OPEN   |<-----------+           |
   |                        +--+-+--+--+                        |
   |       server active close | |  |   active close            |
   |           snd CloseReq    | |  | or rcv CloseReq           |
   |                           | |  |    snd Close              |
   |                           | |  |                           |
   |     +----------+          | |  |          +----------+     |
   |     | CLOSEREQ |<---------+ |  +--------->| CLOSING  |     |
   |     +----+-----+            |             +----+-----+     |
   |          | rcv Close        |        rcv Reset |           |
   |          | snd Reset        |                  |           |
   |<---------+                  |                  v           |
   |                             |             +----+-----+     |
   |                   rcv Close |             | TIMEWAIT |     |
   |                   snd Reset |             +----+-----+     |
   +-----------------------------+                  |           |
                                                    +-----------+
                                                 2MSL timer expires

8.5. Pseudocode

This section presents an algorithm describing the processing steps a DCCP endpoint must go through when it receives a packet. A DCCP implementation need not implement the algorithm as it is described here, but any implementation MUST generate observable effects exactly as indicated by this pseudocode, except where allowed otherwise by another part of this document.
Top   ToC   RFC4340 - Page 68
   The received packet is written as P, the socket as S.  Socket
   variables are:

   S.SWL - sequence number window low
   S.SWH - sequence number window high
   S.AWL - acknowledgement number window low
   S.AWH - acknowledgement number window high
   S.ISS - initial sequence number sent
   S.ISR - initial sequence number received
   S.OSR - first OPEN sequence number received
   S.GSS - greatest sequence number sent
   S.GSR - greatest valid sequence number received
   S.GAR - greatest valid acknowledgement number received on a
           non-Sync; initialized to S.ISS
   "Send packet" actions always use, and increment, S.GSS.

   Step 1: Check header basics
      /* This step checks for malformed packets.  Packets that fail
         these checks are ignored -- they do not receive Resets in
         response */
      If the packet is shorter than 12 bytes, drop packet and return
      If P.type is not understood, drop packet and return
      If P.Data Offset is smaller than the given packet type's
            fixed header length or larger than the packet's length,
            drop packet and return
      If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
            has short sequence numbers), drop packet and return
      If the header checksum is incorrect, drop packet and return
      If P.CsCov is too large for the packet size, drop packet and
            return

   Step 2: Check ports and process TIMEWAIT state
      /* Flow ID is <src addr, src port, dst addr, dst port> 4-tuple */
      Look up flow ID in table and get corresponding socket
      If no socket, or S.state == TIMEWAIT,
         /* The following Reset's Sequence and Acknowledgement Numbers
            are taken from the input packet; see Section 8.3.1. */
         Generate Reset(No Connection) unless P.type == Reset
         Drop packet and return
Top   ToC   RFC4340 - Page 69
   Step 3: Process LISTEN state
      If S.state == LISTEN,
         If P.type == Request or P contains a valid Init Cookie option,
            /* Must scan the packet's options to check for Init
               Cookies.  Only Init Cookies are processed here,
               however; other options are processed in Step 8.  This
               scan need only be performed if the endpoint uses Init
               Cookies */
            /* Generate a new socket and switch to that socket */
            Set S := new socket for this port pair
            S.state = RESPOND
            Choose S.ISS (initial seqno) or set from Init Cookies
            Initialize S.GAR := S.ISS
            Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookies
            Continue with S.state == RESPOND
            /* A Response packet will be generated in Step 11 */
         Otherwise,
            Generate Reset(No Connection) unless P.type == Reset
            Drop packet and return

   Step 4: Prepare sequence numbers in REQUEST
      If S.state == REQUEST,
         If (P.type == Response or P.type == Reset)
               and S.AWL <= P.ackno <= S.AWH,
            /* Set sequence number variables corresponding to the
               other endpoint, so P will pass the tests in Step 6 */
            Set S.GSR, S.ISR, S.SWL, S.SWH
            /* Response processing continues in Step 10; Reset
               processing continues in Step 9 */
         Otherwise,
            /* Only Response and Reset are valid in REQUEST state */
            Generate Reset(Packet Error)
            Drop packet and return

   Step 5: Prepare sequence numbers for Sync
      If P.type == Sync or P.type == SyncAck,
         If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL,
            /* P is valid, so update sequence number variables
               accordingly.  After this update, P will pass the tests
               in Step 6.  A SyncAck is generated if necessary in
               Step 15 */
            Update S.GSR, S.SWL, S.SWH
         Otherwise,
            Drop packet and return
Top   ToC   RFC4340 - Page 70
   Step 6: Check sequence numbers
      If P.X == 0 and the relevant Allow Short Seqnos feature is 0,
         /* Packet has short seqnos, but short seqnos not allowed */
         Drop packet and return
      Otherwise, if P.X == 0,
         Extend P.seqno and P.ackno to 48 bits using the procedure
         in Section 7.6
      Let LSWL = S.SWL and LAWL = S.AWL
      If P.type == CloseReq or P.type == Close or P.type == Reset,
         LSWL := S.GSR + 1, LAWL := S.GAR
      If LSWL <= P.seqno <= S.SWH
            and (P.ackno does not exist or LAWL <= P.ackno <= S.AWH),
         Update S.GSR, S.SWL, S.SWH
         If P.type != Sync,
            Update S.GAR
      Otherwise,
         If P.type == Reset,
            Send Sync packet acknowledging S.GSR
         Otherwise,
            Send Sync packet acknowledging P.seqno
         Drop packet and return

   Step 7: Check for unexpected packet types
      If (S.is_server and P.type == CloseReq)
           or (S.is_server and P.type == Response)
           or (S.is_client and P.type == Request)
           or (S.state >= OPEN and P.type == Request
               and P.seqno >= S.OSR)
           or (S.state >= OPEN and P.type == Response
               and P.seqno >= S.OSR)
           or (S.state == RESPOND and P.type == Data),
         Send Sync packet acknowledging P.seqno
         Drop packet and return

   Step 8: Process options and mark acknowledgeable
      /* Option processing is not specifically described here.
         Certain options, such as Mandatory, may cause the connection
         to be reset, in which case Steps 9 and on are not executed */
      Mark packet as acknowledgeable (in Ack Vector terms, Received
           or Received ECN Marked)

   Step 9: Process Reset
      If P.type == Reset,
         Tear down connection
         S.state := TIMEWAIT
         Set TIMEWAIT timer
         Drop packet and return
Top   ToC   RFC4340 - Page 71
   Step 10: Process REQUEST state (second part)
      If S.state == REQUEST,
         /* If we get here, P is a valid Response from the server (see
            Step 4), and we should move to PARTOPEN state.  PARTOPEN
            means send an Ack, don't send Data packets, retransmit
            Acks periodically, and always include any Init Cookie from
            the Response */
         S.state := PARTOPEN
         Set PARTOPEN timer
         Continue with S.state == PARTOPEN
         /* Step 12 will send the Ack completing the three-way
            handshake */

   Step 11: Process RESPOND state
      If S.state == RESPOND,
         If P.type == Request,
            Send Response, possibly containing Init Cookie
            If Init Cookie was sent,
               Destroy S and return
               /* Step 3 will create another socket when the client
                  completes the three-way handshake */
         Otherwise,
            S.OSR := P.seqno
            S.state := OPEN

   Step 12: Process PARTOPEN state
      If S.state == PARTOPEN,
         If P.type == Response,
            Send Ack
         Otherwise, if P.type != Sync,
            S.OSR := P.seqno
            S.state := OPEN

   Step 13: Process CloseReq
      If P.type == CloseReq and S.state < CLOSEREQ,
         Generate Close
         S.state := CLOSING
         Set CLOSING timer

   Step 14: Process Close
      If P.type == Close,
         Generate Reset(Closed)
         Tear down connection
         Drop packet and return
Top   ToC   RFC4340 - Page 72
   Step 15: Process Sync
      If P.type == Sync,
         Generate SyncAck

   Step 16: Process data
      /* At this point any application data on P can be passed to the
         application, except that the application MUST NOT receive
         data from more than one Request or Response */

9. Checksums

DCCP uses a header checksum to protect its header against corruption. Generally, this checksum also covers any application data. DCCP applications can, however, request that the header checksum cover only part of the application data, or perhaps no application data at all. Link layers may then reduce their protection on unprotected parts of DCCP packets. For some noisy links, and for applications that can tolerate corruption, this can greatly improve delivery rates and perceived performance. Checksum coverage may eventually impact congestion control mechanisms as well. A packet with corrupt application data and complete checksum coverage is treated as lost. This incurs a heavy-duty loss response from the sender's congestion control mechanism, which can unfairly penalize connections on links with high background corruption. The combination of reduced checksum coverage and Data Checksum options may let endpoints report packets as corrupt rather than dropped, using Data Dropped options and Drop Code 3 (see Section 11.7). This may eventually benefit applications. However, further research is required to determine an appropriate response to corruption, which can sometimes correlate with congestion. Corrupt packets currently incur a loss response. The Data Checksum option, which contains a strong CRC, lets endpoints detect application data corruption. An API can then be used to avoid delivering corrupt data to the application, even if links deliver corrupt data to the endpoint due to reduced checksum coverage. However, the use of reduced checksum coverage for applications that demand correct data is currently considered experimental. This is because the combined loss-plus-corruption rate for packets with reduced checksum coverage may be significantly higher than that for packets with full checksum coverage, although the loss rate will generally be lower. Actual behavior will depend on link design; further research and experience is required. Reduced checksum coverage introduces some security considerations; see Section 18.1. See Appendix B for further motivation and
Top   ToC   RFC4340 - Page 73
   discussion.  DCCP's implementation of reduced checksum coverage was
   inspired by UDP-Lite [RFC3828].

9.1. Header Checksum Field

DCCP uses the TCP/IP checksum algorithm. The Checksum field in the DCCP generic header (see Section 5.1) equals the 16-bit one's complement of the one's complement sum of all 16-bit words in the DCCP header, DCCP options, a pseudoheader taken from the network- layer header, and, depending on the value of the Checksum Coverage field, some or all of the application data. When calculating the checksum, the Checksum field itself is treated as 0. If a packet contains an odd number of header and payload bytes to be checksummed, 8 zero bits are added on the right to form a 16-bit word for checksum purposes. The pad byte is not transmitted as part of the packet. The pseudoheader is calculated as for TCP. For IPv4, it is 96 bits long and consists of the IPv4 source and destination addresses, the IP protocol number for DCCP (padded on the left with 8 zero bits), and the DCCP length as a 16-bit quantity (the length of the DCCP header with options, plus the length of any data); see [RFC793], Section 3.1. For IPv6, it is 320 bits long, and consists of the IPv6 source and destination addresses, the DCCP length as a 32-bit quantity, and the IP protocol number for DCCP (padded on the left with 24 zero bits); see [RFC2460], Section 8.1. Packets with invalid header checksums MUST be ignored. In particular, their options MUST NOT be processed.

9.2. Header Checksum Coverage Field

The Checksum Coverage field in the DCCP generic header (see Section 5.1) specifies what parts of the packet are covered by the Checksum field, as follows: CsCov = 0 The Checksum field covers the DCCP header, DCCP options, network-layer pseudoheader, and all application data in the packet, possibly padded on the right with zeros to an even number of bytes. CsCov = 1-15 The Checksum field covers the DCCP header, DCCP options, network-layer pseudoheader, and the initial (CsCov-1)*4 bytes of the packet's application data. Thus, if CsCov is 1, none of the application data is protected by the header checksum. The value (CsCov-1)*4 MUST be less than or equal to the length of the application data. Packets with invalid CsCov values MUST be ignored; in particular, their options MUST NOT be
Top   ToC   RFC4340 - Page 74
   processed.  The meanings of values other than 0 and 1 should be
   considered experimental.

   Values other than 0 specify that corruption is acceptable in some or
   all of the DCCP packet's application data.  In fact, DCCP cannot even
   detect corruption in areas not covered by the header checksum, unless
   the Data Checksum option is used.  Applications should not make any
   assumptions about the correctness of received data not covered by the
   checksum and should, if necessary, introduce their own validity
   checks.

   A DCCP application interface should let sending applications suggest
   a value for CsCov for sent packets, defaulting to 0 (full coverage).
   The Minimum Checksum Coverage feature, described below, lets an
   endpoint refuse delivery of application data on packets with partial
   checksum coverage; by default, only fully covered application data is
   accepted.  Lower layers that support partial error detection MAY use
   the Checksum Coverage field as a hint of where errors do not need to
   be detected.  Lower layers MUST use a strong error detection
   mechanism to detect at least errors that occur in the sensitive part
   of the packet, and to discard damaged packets.  The sensitive part
   consists of the bytes between the first byte of the IP header and the
   last byte identified by Checksum Coverage.

   For more details on application and lower-layer interface issues
   relating to partial checksumming, see [RFC3828].

9.2.1. Minimum Checksum Coverage Feature

The Minimum Checksum Coverage feature lets a DCCP endpoint determine whether its peer is willing to accept packets with reduced Checksum Coverage. For example, DCCP A sends a "Change R(Minimum Checksum Coverage, 1)" option to DCCP B to check whether B is willing to accept packets with Checksum Coverage set to 1. Minimum Checksum Coverage has feature number 8 and is server- priority. It takes one-byte integer values between 0 and 15; values of 16 or more are reserved. Minimum Checksum Coverage/B reflects values of Checksum Coverage that DCCP B finds unacceptable. Say that the value of Minimum Checksum Coverage/B is MinCsCov. Then: o If MinCsCov = 0, then DCCP B only finds packets with CsCov = 0 acceptable. o If MinCsCov > 0, then DCCP B additionally finds packets with CsCov >= MinCsCov acceptable.
Top   ToC   RFC4340 - Page 75
   DCCP B MAY refuse to process application data from packets with
   unacceptable Checksum Coverage.  Such packets SHOULD be reported
   using Data Dropped options (Section 11.7) with Drop Code 0, Protocol
   Constraints.  New connections start with Minimum Checksum Coverage 0
   for both endpoints.

9.3. Data Checksum Option

The Data Checksum option holds a 32-bit CRC-32c cyclic redundancy- check code of a DCCP packet's application data. +--------+--------+--------+--------+--------+--------+ |00101100|00000110| CRC-32c | +--------+--------+--------+--------+--------+--------+ Type=44 Length=6 The sending DCCP computes the CRC of the bytes comprising the application data area and stores it in the option data. The CRC-32c algorithm used for Data Checksum is the same as that used for SCTP [RFC3309]; note that the CRC-32c of zero bytes of data equals zero. The DCCP header checksum will cover the Data Checksum option, so the data checksum must be computed before the header checksum. A DCCP endpoint receiving a packet with a Data Checksum option either MUST or MAY check the Data Checksum; the choice depends on the value of the Check Data Checksum feature described below. If it checks the checksum, it computes the received application data's CRC-32c using the same algorithm as the sender and compares the result with the Data Checksum value. If the CRCs differ, the endpoint reacts in one of two ways: o The receiving application may have requested delivery of known- corrupt data via some optional API. In this case, the packet's data MUST be delivered to the application, with a note that it is known to be corrupt. Furthermore, the receiving endpoint MUST report the packet as delivered corrupt using a Data Dropped option (Drop Code 7, Delivered Corrupt). o Otherwise, the receiving endpoint MUST drop the application data and report that data as dropped due to corruption using a Data Dropped option (Drop Code 3, Corrupt). In either case, the packet is considered acknowledgeable (since its header was processed) and will therefore be acknowledged using the equivalent of Ack Vector's Received or Received ECN Marked states. Although Data Checksum is intended for packets containing application data, it may be included on other packets, such as DCCP-Ack, DCCP-
Top   ToC   RFC4340 - Page 76
   Sync, and DCCP-SyncAck.  The receiver SHOULD calculate the
   application data area's CRC-32c on such packets, just as it does for
   DCCP-Data and similar packets.  If the CRCs differ, the packets
   similarly MUST be reported using Data Dropped options (Drop Code 3),
   although their application data areas would not be delivered to the
   application in any case.

9.3.1. Check Data Checksum Feature

The Check Data Checksum feature lets a DCCP endpoint determine whether its peer will definitely check Data Checksum options. DCCP A sends a Mandatory "Change R(Check Data Checksum, 1)" option to DCCP B to require it to check Data Checksum options (the connection will be reset if it cannot). Check Data Checksum has feature number 9 and is server-priority. It takes one-byte Boolean values. DCCP B MUST check any received Data Checksum options when Check Data Checksum/B is one, although it MAY check them even when Check Data Checksum/B is zero. Values of two or more are reserved. New connections start with Check Data Checksum 0 for both endpoints.

9.3.2. Checksum Usage Notes

Internet links must normally apply strong integrity checks to the packets they transmit [RFC3828, RFC3819]. This is the default case when the DCCP header's Checksum Coverage value equals zero (full coverage). However, the DCCP Checksum Coverage value might not be zero. By setting partial Checksum Coverage, the application indicates that it can tolerate corruption in the unprotected part of the application data. Recognizing this, link layers may reduce error detection and/or correction strength when transmitting this unprotected part. This, in turn, can significantly increase the likelihood of the endpoint's receiving corrupt data; Data Checksum lets the receiver detect that corruption with very high probability.


(page 76 continued on part 4)

Next Section