Tech-invite3GPPspaceIETFspace
96959493929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 0793

Transmission Control Protocol

Pages: 91
Obsoletes:  0761
Obsoleted by:  9293
Updated by:  1122316860936528
Part 2 of 4 – Pages 15 to 37
First   Prev   Next

Top   ToC   RFC0793 - Page 15   prevText
                      3.  FUNCTIONAL SPECIFICATION

3.1.  Header Format

  TCP segments are sent as internet datagrams.  The Internet Protocol
  header carries several information fields, including the source and
  destination host addresses [2].  A TCP header follows the internet
  header, supplying information specific to the TCP protocol.  This
  division allows for the existence of host level protocols other than
  TCP.

  TCP Header Format


    0                   1                   2                   3   
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                            TCP Header Format

          Note that one tick mark represents one bit position.

                               Figure 3.

  Source Port:  16 bits

    The source port number.

  Destination Port:  16 bits

    The destination port number.
Top   ToC   RFC0793 - Page 16
  Sequence Number:  32 bits

    The sequence number of the first data octet in this segment (except
    when SYN is present). If SYN is present the sequence number is the
    initial sequence number (ISN) and the first data octet is ISN+1.

  Acknowledgment Number:  32 bits

    If the ACK control bit is set this field contains the value of the
    next sequence number the sender of the segment is expecting to
    receive.  Once a connection is established this is always sent.

  Data Offset:  4 bits

    The number of 32 bit words in the TCP Header.  This indicates where
    the data begins.  The TCP header (even one including options) is an
    integral number of 32 bits long.

  Reserved:  6 bits

    Reserved for future use.  Must be zero.

  Control Bits:  6 bits (from left to right):

    URG:  Urgent Pointer field significant
    ACK:  Acknowledgment field significant
    PSH:  Push Function
    RST:  Reset the connection
    SYN:  Synchronize sequence numbers
    FIN:  No more data from sender

  Window:  16 bits

    The number of data octets beginning with the one indicated in the
    acknowledgment field which the sender of this segment is willing to
    accept.

  Checksum:  16 bits

    The checksum field is the 16 bit one's complement of the one's
    complement sum of all 16 bit words in the header and text.  If a
    segment contains an odd number of header and text octets to be
    checksummed, the last octet is padded on the right with zeros to
    form a 16 bit word for checksum purposes.  The pad is not
    transmitted as part of the segment.  While computing the checksum,
    the checksum field itself is replaced with zeros.

    The checksum also covers a 96 bit pseudo header conceptually
Top   ToC   RFC0793 - Page 17
    prefixed to the TCP header.  This pseudo header contains the Source
    Address, the Destination Address, the Protocol, and TCP length.
    This gives the TCP protection against misrouted segments.  This
    information is carried in the Internet Protocol and is transferred
    across the TCP/Network interface in the arguments or results of
    calls by the TCP on the IP.

                     +--------+--------+--------+--------+
                     |           Source Address          |
                     +--------+--------+--------+--------+
                     |         Destination Address       |
                     +--------+--------+--------+--------+
                     |  zero  |  PTCL  |    TCP Length   |
                     +--------+--------+--------+--------+

      The TCP Length is the TCP header length plus the data length in
      octets (this is not an explicitly transmitted quantity, but is
      computed), and it does not count the 12 octets of the pseudo
      header.

  Urgent Pointer:  16 bits

    This field communicates the current value of the urgent pointer as a
    positive offset from the sequence number in this segment.  The
    urgent pointer points to the sequence number of the octet following
    the urgent data.  This field is only be interpreted in segments with
    the URG control bit set.

  Options:  variable

    Options may occupy space at the end of the TCP header and are a
    multiple of 8 bits in length.  All options are included in the
    checksum.  An option may begin on any octet boundary.  There are two
    cases for the format of an option:

      Case 1:  A single octet of option-kind.

      Case 2:  An octet of option-kind, an octet of option-length, and
               the actual option-data octets.

    The option-length counts the two octets of option-kind and
    option-length as well as the option-data octets.

    Note that the list of options may be shorter than the data offset
    field might imply.  The content of the header beyond the
    End-of-Option option must be header padding (i.e., zero).

    A TCP must implement all options.
Top   ToC   RFC0793 - Page 18
    Currently defined options include (kind indicated in octal):

      Kind     Length    Meaning
      ----     ------    -------
       0         -       End of option list.
       1         -       No-Operation.
       2         4       Maximum Segment Size.
      

    Specific Option Definitions

      End of Option List

        +--------+
        |00000000|
        +--------+
         Kind=0

        This option code indicates the end of the option list.  This
        might not coincide with the end of the TCP header according to
        the Data Offset field.  This is used at the end of all options,
        not the end of each option, and need only be used if the end of
        the options would not otherwise coincide with the end of the TCP
        header.

      No-Operation

        +--------+
        |00000001|
        +--------+
         Kind=1

        This option code may be used between options, for example, to
        align the beginning of a subsequent option on a word boundary.
        There is no guarantee that senders will use this option, so
        receivers must be prepared to process options even if they do
        not begin on a word boundary.

      Maximum Segment Size

        +--------+--------+---------+--------+
        |00000010|00000100|   max seg size   |
        +--------+--------+---------+--------+
         Kind=2   Length=4
Top   ToC   RFC0793 - Page 19
        Maximum Segment Size Option Data:  16 bits

          If this option is present, then it communicates the maximum
          receive segment size at the TCP which sends this segment.
          This field must only be sent in the initial connection request
          (i.e., in segments with the SYN control bit set).  If this
          option is not used, any segment size is allowed.

  Padding:  variable

    The TCP header padding is used to ensure that the TCP header ends
    and data begins on a 32 bit boundary.  The padding is composed of
    zeros.

3.2.  Terminology

  Before we can discuss very much about the operation of the TCP we need
  to introduce some detailed terminology.  The maintenance of a TCP
  connection requires the remembering of several variables.  We conceive
  of these variables being stored in a connection record called a
  Transmission Control Block or TCB.  Among the variables stored in the
  TCB are the local and remote socket numbers, the security and
  precedence of the connection, pointers to the user's send and receive
  buffers, pointers to the retransmit queue and to the current segment.
  In addition several variables relating to the send and receive
  sequence numbers are stored in the TCB.

    Send Sequence Variables

      SND.UNA - send unacknowledged
      SND.NXT - send next
      SND.WND - send window
      SND.UP  - send urgent pointer
      SND.WL1 - segment sequence number used for last window update
      SND.WL2 - segment acknowledgment number used for last window
                update
      ISS     - initial send sequence number

    Receive Sequence Variables

      RCV.NXT - receive next
      RCV.WND - receive window
      RCV.UP  - receive urgent pointer
      IRS     - initial receive sequence number
Top   ToC   RFC0793 - Page 20
  The following diagrams may help to relate some of these variables to
  the sequence space.

  Send Sequence Space

                   1         2          3          4      
              ----------|----------|----------|---------- 
                     SND.UNA    SND.NXT    SND.UNA        
                                          +SND.WND        

        1 - old sequence numbers which have been acknowledged  
        2 - sequence numbers of unacknowledged data            
        3 - sequence numbers allowed for new data transmission 
        4 - future sequence numbers which are not yet allowed  

                          Send Sequence Space

                               Figure 4.
    
    

  The send window is the portion of the sequence space labeled 3 in
  figure 4.

  Receive Sequence Space

                       1          2          3      
                   ----------|----------|---------- 
                          RCV.NXT    RCV.NXT        
                                    +RCV.WND        

        1 - old sequence numbers which have been acknowledged  
        2 - sequence numbers allowed for new reception         
        3 - future sequence numbers which are not yet allowed  

                         Receive Sequence Space

                               Figure 5.
    
    

  The receive window is the portion of the sequence space labeled 2 in
  figure 5.

  There are also some variables used frequently in the discussion that
  take their values from the fields of the current segment.
Top   ToC   RFC0793 - Page 21
    Current Segment Variables

      SEG.SEQ - segment sequence number
      SEG.ACK - segment acknowledgment number
      SEG.LEN - segment length
      SEG.WND - segment window
      SEG.UP  - segment urgent pointer
      SEG.PRC - segment precedence value

  A connection progresses through a series of states during its
  lifetime.  The states are:  LISTEN, SYN-SENT, SYN-RECEIVED,
  ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK,
  TIME-WAIT, and the fictional state CLOSED.  CLOSED is fictional
  because it represents the state when there is no TCB, and therefore,
  no connection.  Briefly the meanings of the states are:

    LISTEN - represents waiting for a connection request from any remote
    TCP and port.

    SYN-SENT - represents waiting for a matching connection request
    after having sent a connection request.

    SYN-RECEIVED - represents waiting for a confirming connection
    request acknowledgment after having both received and sent a
    connection request.

    ESTABLISHED - represents an open connection, data received can be
    delivered to the user.  The normal state for the data transfer phase
    of the connection.

    FIN-WAIT-1 - represents waiting for a connection termination request
    from the remote TCP, or an acknowledgment of the connection
    termination request previously sent.

    FIN-WAIT-2 - represents waiting for a connection termination request
    from the remote TCP.

    CLOSE-WAIT - represents waiting for a connection termination request
    from the local user.

    CLOSING - represents waiting for a connection termination request
    acknowledgment from the remote TCP.

    LAST-ACK - represents waiting for an acknowledgment of the
    connection termination request previously sent to the remote TCP
    (which includes an acknowledgment of its connection termination
    request).
Top   ToC   RFC0793 - Page 22
    TIME-WAIT - represents waiting for enough time to pass to be sure
    the remote TCP received the acknowledgment of its connection
    termination request.

    CLOSED - represents no connection state at all.

  A TCP connection progresses from one state to another in response to
  events.  The events are the user calls, OPEN, SEND, RECEIVE, CLOSE,
  ABORT, and STATUS; the incoming segments, particularly those
  containing the SYN, ACK, RST and FIN flags; and timeouts.

  The state diagram in figure 6 illustrates only state changes, together
  with the causing events and resulting actions, but addresses neither
  error conditions nor actions which are not connected with state
  changes.  In a later section, more detail is offered with respect to
  the reaction of the TCP to events.

  NOTE BENE:  this diagram is only a summary and must not be taken as
  the total specification.
Top   ToC   RFC0793 - Page 23

                              +---------+ ---------\      active OPEN  
                              |  CLOSED |            \    -----------  
                              +---------+<---------\   \   create TCB  
                                |     ^              \   \  snd SYN    
                   passive OPEN |     |   CLOSE        \   \           
                   ------------ |     | ----------       \   \         
                    create TCB  |     | delete TCB         \   \       
                                V     |                      \   \     
                              +---------+            CLOSE    |    \   
                              |  LISTEN |          ---------- |     |  
                              +---------+          delete TCB |     |  
                   rcv SYN      |     |     SEND              |     |  
                  -----------   |     |    -------            |     V  
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------                  
   |                  x         |     |     snd ACK                    
   |                            V     V                                
   |  CLOSE                   +---------+                              
   | -------                  |  ESTAB  |                              
   | snd FIN                  +---------+                              
   |                   CLOSE    |     |    rcv FIN                     
   V                  -------   |     |    -------                     
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------                              |   WAIT  |
 +---------+          rcv FIN  \                            +---------+
   | rcv ACK of FIN   -------   |                            CLOSE  |  
   | --------------   snd ACK   |                           ------- |  
   V        x                   V                           snd FIN V  
 +---------+                  +---------+                   +---------+
 |FINWAIT-2|                  | CLOSING |                   | LAST-ACK|
 +---------+                  +---------+                   +---------+
   |                rcv ACK of FIN |                 rcv ACK of FIN |  
   |  rcv FIN       -------------- |    Timeout=2MSL -------------- |  
   |  -------              x       V    ------------        x       V  
    \ snd ACK                 +---------+delete TCB         +---------+
     ------------------------>|TIME WAIT|------------------>| CLOSED  |
                              +---------+                   +---------+

                      TCP Connection State Diagram
                               Figure 6.
Top   ToC   RFC0793 - Page 24
3.3.  Sequence Numbers

  A fundamental notion in the design is that every octet of data sent
  over a TCP connection has a sequence number.  Since every octet is
  sequenced, each of them can be acknowledged.  The acknowledgment
  mechanism employed is cumulative so that an acknowledgment of sequence
  number X indicates that all octets up to but not including X have been
  received.  This mechanism allows for straight-forward duplicate
  detection in the presence of retransmission.  Numbering of octets
  within a segment is that the first data octet immediately following
  the header is the lowest numbered, and the following octets are
  numbered consecutively.

  It is essential to remember that the actual sequence number space is
  finite, though very large.  This space ranges from 0 to 2**32 - 1.
  Since the space is finite, all arithmetic dealing with sequence
  numbers must be performed modulo 2**32.  This unsigned arithmetic
  preserves the relationship of sequence numbers as they cycle from
  2**32 - 1 to 0 again.  There are some subtleties to computer modulo
  arithmetic, so great care should be taken in programming the
  comparison of such values.  The symbol "=<" means "less than or equal"
  (modulo 2**32).

  The typical kinds of sequence number comparisons which the TCP must
  perform include:

    (a)  Determining that an acknowledgment refers to some sequence
         number sent but not yet acknowledged.

    (b)  Determining that all sequence numbers occupied by a segment
         have been acknowledged (e.g., to remove the segment from a
         retransmission queue).

    (c)  Determining that an incoming segment contains sequence numbers
         which are expected (i.e., that the segment "overlaps" the
         receive window).
Top   ToC   RFC0793 - Page 25
  In response to sending data the TCP will receive acknowledgments.  The
  following comparisons are needed to process the acknowledgments.

    SND.UNA = oldest unacknowledged sequence number

    SND.NXT = next sequence number to be sent

    SEG.ACK = acknowledgment from the receiving TCP (next sequence
              number expected by the receiving TCP)

    SEG.SEQ = first sequence number of a segment

    SEG.LEN = the number of octets occupied by the data in the segment
              (counting SYN and FIN)

    SEG.SEQ+SEG.LEN-1 = last sequence number of a segment

  A new acknowledgment (called an "acceptable ack"), is one for which
  the inequality below holds:

    SND.UNA < SEG.ACK =< SND.NXT

  A segment on the retransmission queue is fully acknowledged if the sum
  of its sequence number and length is less or equal than the
  acknowledgment value in the incoming segment.

  When data is received the following comparisons are needed:

    RCV.NXT = next sequence number expected on an incoming segments, and
        is the left or lower edge of the receive window

    RCV.NXT+RCV.WND-1 = last sequence number expected on an incoming
        segment, and is the right or upper edge of the receive window

    SEG.SEQ = first sequence number occupied by the incoming segment

    SEG.SEQ+SEG.LEN-1 = last sequence number occupied by the incoming
        segment

  A segment is judged to occupy a portion of valid receive sequence
  space if

    RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND

  or

    RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND
Top   ToC   RFC0793 - Page 26
  The first part of this test checks to see if the beginning of the
  segment falls in the window, the second part of the test checks to see
  if the end of the segment falls in the window; if the segment passes
  either part of the test it contains data in the window.

  Actually, it is a little more complicated than this.  Due to zero
  windows and zero length segments, we have four cases for the
  acceptability of an incoming segment:

    Segment Receive  Test
    Length  Window
    ------- -------  -------------------------------------------

       0       0     SEG.SEQ = RCV.NXT

       0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND

      >0       0     not acceptable

      >0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND
                  or RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND

  Note that when the receive window is zero no segments should be
  acceptable except ACK segments.  Thus, it is be possible for a TCP to
  maintain a zero receive window while transmitting data and receiving
  ACKs.  However, even when the receive window is zero, a TCP must
  process the RST and URG fields of all incoming segments.

  We have taken advantage of the numbering scheme to protect certain
  control information as well.  This is achieved by implicitly including
  some control flags in the sequence space so they can be retransmitted
  and acknowledged without confusion (i.e., one and only one copy of the
  control will be acted upon).  Control information is not physically
  carried in the segment data space.  Consequently, we must adopt rules
  for implicitly assigning sequence numbers to control.  The SYN and FIN
  are the only controls requiring this protection, and these controls
  are used only at connection opening and closing.  For sequence number
  purposes, the SYN is considered to occur before the first actual data
  octet of the segment in which it occurs, while the FIN is considered
  to occur after the last actual data octet in a segment in which it
  occurs.  The segment length (SEG.LEN) includes both data and sequence
  space occupying controls.  When a SYN is present then SEG.SEQ is the
  sequence number of the SYN.
Top   ToC   RFC0793 - Page 27
  Initial Sequence Number Selection

  The protocol places no restriction on a particular connection being
  used over and over again.  A connection is defined by a pair of
  sockets.  New instances of a connection will be referred to as
  incarnations of the connection.  The problem that arises from this is
  -- "how does the TCP identify duplicate segments from previous
  incarnations of the connection?"  This problem becomes apparent if the
  connection is being opened and closed in quick succession, or if the
  connection breaks with loss of memory and is then reestablished.

  To avoid confusion we must prevent segments from one incarnation of a
  connection from being used while the same sequence numbers may still
  be present in the network from an earlier incarnation.  We want to
  assure this, even if a TCP crashes and loses all knowledge of the
  sequence numbers it has been using.  When new connections are created,
  an initial sequence number (ISN) generator is employed which selects a
  new 32 bit ISN.  The generator is bound to a (possibly fictitious) 32
  bit clock whose low order bit is incremented roughly every 4
  microseconds.  Thus, the ISN cycles approximately every 4.55 hours.
  Since we assume that segments will stay in the network no more than
  the Maximum Segment Lifetime (MSL) and that the MSL is less than 4.55
  hours we can reasonably assume that ISN's will be unique.

  For each connection there is a send sequence number and a receive
  sequence number.  The initial send sequence number (ISS) is chosen by
  the data sending TCP, and the initial receive sequence number (IRS) is
  learned during the connection establishing procedure.

  For a connection to be established or initialized, the two TCPs must
  synchronize on each other's initial sequence numbers.  This is done in
  an exchange of connection establishing segments carrying a control bit
  called "SYN" (for synchronize) and the initial sequence numbers.  As a
  shorthand, segments carrying the SYN bit are also called "SYNs".
  Hence, the solution requires a suitable mechanism for picking an
  initial sequence number and a slightly involved handshake to exchange
  the ISN's.

  The synchronization requires each side to send it's own initial
  sequence number and to receive a confirmation of it in acknowledgment
  from the other side.  Each side must also receive the other side's
  initial sequence number and send a confirming acknowledgment.

    1) A --> B  SYN my sequence number is X
    2) A <-- B  ACK your sequence number is X
    3) A <-- B  SYN my sequence number is Y
    4) A --> B  ACK your sequence number is Y
Top   ToC   RFC0793 - Page 28
  Because steps 2 and 3 can be combined in a single message this is
  called the three way (or three message) handshake.

  A three way handshake is necessary because sequence numbers are not
  tied to a global clock in the network, and TCPs may have different
  mechanisms for picking the ISN's.  The receiver of the first SYN has
  no way of knowing whether the segment was an old delayed one or not,
  unless it remembers the last sequence number used on the connection
  (which is not always possible), and so it must ask the sender to
  verify this SYN.  The three way handshake and the advantages of a
  clock-driven scheme are discussed in [3].

  Knowing When to Keep Quiet

  To be sure that a TCP does not create a segment that carries a
  sequence number which may be duplicated by an old segment remaining in
  the network, the TCP must keep quiet for a maximum segment lifetime
  (MSL) before assigning any sequence numbers upon starting up or
  recovering from a crash in which memory of sequence numbers in use was
  lost.  For this specification the MSL is taken to be 2 minutes.  This
  is an engineering choice, and may be changed if experience indicates
  it is desirable to do so.  Note that if a TCP is reinitialized in some
  sense, yet retains its memory of sequence numbers in use, then it need
  not wait at all; it must only be sure to use sequence numbers larger
  than those recently used.

  The TCP Quiet Time Concept

    This specification provides that hosts which "crash" without
    retaining any knowledge of the last sequence numbers transmitted on
    each active (i.e., not closed) connection shall delay emitting any
    TCP segments for at least the agreed Maximum Segment Lifetime (MSL)
    in the internet system of which the host is a part.  In the
    paragraphs below, an explanation for this specification is given.
    TCP implementors may violate the "quiet time" restriction, but only
    at the risk of causing some old data to be accepted as new or new
    data rejected as old duplicated by some receivers in the internet
    system.

    TCPs consume sequence number space each time a segment is formed and
    entered into the network output queue at a source host. The
    duplicate detection and sequencing algorithm in the TCP protocol
    relies on the unique binding of segment data to sequence space to
    the extent that sequence numbers will not cycle through all 2**32
    values before the segment data bound to those sequence numbers has
    been delivered and acknowledged by the receiver and all duplicate
    copies of the segments have "drained" from the internet.  Without
    such an assumption, two distinct TCP segments could conceivably be
Top   ToC   RFC0793 - Page 29
    assigned the same or overlapping sequence numbers, causing confusion
    at the receiver as to which data is new and which is old.  Remember
    that each segment is bound to as many consecutive sequence numbers
    as there are octets of data in the segment.

    Under normal conditions, TCPs keep track of the next sequence number
    to emit and the oldest awaiting acknowledgment so as to avoid
    mistakenly using a sequence number over before its first use has
    been acknowledged.  This alone does not guarantee that old duplicate
    data is drained from the net, so the sequence space has been made
    very large to reduce the probability that a wandering duplicate will
    cause trouble upon arrival.  At 2 megabits/sec. it takes 4.5 hours
    to use up 2**32 octets of sequence space.  Since the maximum segment
    lifetime in the net is not likely to exceed a few tens of seconds,
    this is deemed ample protection for foreseeable nets, even if data
    rates escalate to l0's of megabits/sec.  At 100 megabits/sec, the
    cycle time is 5.4 minutes which may be a little short, but still
    within reason.

    The basic duplicate detection and sequencing algorithm in TCP can be
    defeated, however, if a source TCP does not have any memory of the
    sequence numbers it last used on a given connection. For example, if
    the TCP were to start all connections with sequence number 0, then
    upon crashing and restarting, a TCP might re-form an earlier
    connection (possibly after half-open connection resolution) and emit
    packets with sequence numbers identical to or overlapping with
    packets still in the network which were emitted on an earlier
    incarnation of the same connection.  In the absence of knowledge
    about the sequence numbers used on a particular connection, the TCP
    specification recommends that the source delay for MSL seconds
    before emitting segments on the connection, to allow time for
    segments from the earlier connection incarnation to drain from the
    system.

    Even hosts which can remember the time of day and used it to select
    initial sequence number values are not immune from this problem
    (i.e., even if time of day is used to select an initial sequence
    number for each new connection incarnation).

    Suppose, for example, that a connection is opened starting with
    sequence number S.  Suppose that this connection is not used much
    and that eventually the initial sequence number function (ISN(t))
    takes on a value equal to the sequence number, say S1, of the last
    segment sent by this TCP on a particular connection.  Now suppose,
    at this instant, the host crashes, recovers, and establishes a new
    incarnation of the connection. The initial sequence number chosen is
    S1 = ISN(t) -- last used sequence number on old incarnation of
    connection!  If the recovery occurs quickly enough, any old
Top   ToC   RFC0793 - Page 30
    duplicates in the net bearing sequence numbers in the neighborhood
    of S1 may arrive and be treated as new packets by the receiver of
    the new incarnation of the connection.

    The problem is that the recovering host may not know for how long it
    crashed nor does it know whether there are still old duplicates in
    the system from earlier connection incarnations.

    One way to deal with this problem is to deliberately delay emitting
    segments for one MSL after recovery from a crash- this is the "quite
    time" specification.  Hosts which prefer to avoid waiting are
    willing to risk possible confusion of old and new packets at a given
    destination may choose not to wait for the "quite time".
    Implementors may provide TCP users with the ability to select on a
    connection by connection basis whether to wait after a crash, or may
    informally implement the "quite time" for all connections.
    Obviously, even where a user selects to "wait," this is not
    necessary after the host has been "up" for at least MSL seconds.

    To summarize: every segment emitted occupies one or more sequence
    numbers in the sequence space, the numbers occupied by a segment are
    "busy" or "in use" until MSL seconds have passed, upon crashing a
    block of space-time is occupied by the octets of the last emitted
    segment, if a new connection is started too soon and uses any of the
    sequence numbers in the space-time footprint of the last segment of
    the previous connection incarnation, there is a potential sequence
    number overlap area which could cause confusion at the receiver.

3.4.  Establishing a connection

  The "three-way handshake" is the procedure used to establish a
  connection.  This procedure normally is initiated by one TCP and
  responded to by another TCP.  The procedure also works if two TCP
  simultaneously initiate the procedure.  When simultaneous attempt
  occurs, each TCP receives a "SYN" segment which carries no
  acknowledgment after it has sent a "SYN".  Of course, the arrival of
  an old duplicate "SYN" segment can potentially make it appear, to the
  recipient, that a simultaneous connection initiation is in progress.
  Proper use of "reset" segments can disambiguate these cases.

  Several examples of connection initiation follow.  Although these
  examples do not show connection synchronization using data-carrying
  segments, this is perfectly legitimate, so long as the receiving TCP
  doesn't deliver the data to the user until it is clear the data is
  valid (i.e., the data must be buffered at the receiver until the
  connection reaches the ESTABLISHED state).  The three-way handshake
  reduces the possibility of false connections.  It is the
Top   ToC   RFC0793 - Page 31
  implementation of a trade-off between memory and messages to provide
  information for this checking.

  The simplest three-way handshake is shown in figure 7 below.  The
  figures should be interpreted in the following way.  Each line is
  numbered for reference purposes.  Right arrows (-->) indicate
  departure of a TCP segment from TCP A to TCP B, or arrival of a
  segment at B from A.  Left arrows (<--), indicate the reverse.
  Ellipsis (...) indicates a segment which is still in the network
  (delayed).  An "XXX" indicates a segment which is lost or rejected.
  Comments appear in parentheses.  TCP states represent the state AFTER
  the departure or arrival of the segment (whose contents are shown in
  the center of each line).  Segment contents are shown in abbreviated
  form, with sequence number, control flags, and ACK field.  Other
  fields such as window, addresses, lengths, and text have been left out
  in the interest of clarity.

  

      TCP A                                                TCP B

  1.  CLOSED                                               LISTEN

  2.  SYN-SENT    --> <SEQ=100><CTL=SYN>               --> SYN-RECEIVED

  3.  ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED

  4.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK>       --> ESTABLISHED

  5.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK><DATA> --> ESTABLISHED

          Basic 3-Way Handshake for Connection Synchronization

                                Figure 7.

  In line 2 of figure 7, TCP A begins by sending a SYN segment
  indicating that it will use sequence numbers starting with sequence
  number 100.  In line 3, TCP B sends a SYN and acknowledges the SYN it
  received from TCP A.  Note that the acknowledgment field indicates TCP
  B is now expecting to hear sequence 101, acknowledging the SYN which
  occupied sequence 100.

  At line 4, TCP A responds with an empty segment containing an ACK for
  TCP B's SYN; and in line 5, TCP A sends some data.  Note that the
  sequence number of the segment in line 5 is the same as in line 4
  because the ACK does not occupy sequence number space (if it did, we
  would wind up ACKing ACK's!).
Top   ToC   RFC0793 - Page 32
  Simultaneous initiation is only slightly more complex, as is shown in
  figure 8.  Each TCP cycles from CLOSED to SYN-SENT to SYN-RECEIVED to
  ESTABLISHED.

  

      TCP A                                            TCP B

  1.  CLOSED                                           CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...

  6.  ESTABLISHED  <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED

  7.               ... <SEQ=101><ACK=301><CTL=ACK>     --> ESTABLISHED

                Simultaneous Connection Synchronization

                               Figure 8.

  The principle reason for the three-way handshake is to prevent old
  duplicate connection initiations from causing confusion.  To deal with
  this, a special control message, reset, has been devised.  If the
  receiving TCP is in a  non-synchronized state (i.e., SYN-SENT,
  SYN-RECEIVED), it returns to LISTEN on receiving an acceptable reset.
  If the TCP is in one of the synchronized states (ESTABLISHED,
  FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT), it
  aborts the connection and informs its user.  We discuss this latter
  case under "half-open" connections below.
Top   ToC   RFC0793 - Page 33
  

      TCP A                                                TCP B

  1.  CLOSED                                               LISTEN

  2.  SYN-SENT    --> <SEQ=100><CTL=SYN>               ...

  3.  (duplicate) ... <SEQ=90><CTL=SYN>               --> SYN-RECEIVED

  4.  SYN-SENT    <-- <SEQ=300><ACK=91><CTL=SYN,ACK>  <-- SYN-RECEIVED

  5.  SYN-SENT    --> <SEQ=91><CTL=RST>               --> LISTEN
  

  6.              ... <SEQ=100><CTL=SYN>               --> SYN-RECEIVED

  7.  SYN-SENT    <-- <SEQ=400><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED

  8.  ESTABLISHED --> <SEQ=101><ACK=401><CTL=ACK>      --> ESTABLISHED

                    Recovery from Old Duplicate SYN

                               Figure 9.

  As a simple example of recovery from old duplicates, consider
  figure 9.  At line 3, an old duplicate SYN arrives at TCP B.  TCP B
  cannot tell that this is an old duplicate, so it responds normally
  (line 4).  TCP A detects that the ACK field is incorrect and returns a
  RST (reset) with its SEQ field selected to make the segment
  believable.  TCP B, on receiving the RST, returns to the LISTEN state.
  When the original SYN (pun intended) finally arrives at line 6, the
  synchronization proceeds normally.  If the SYN at line 6 had arrived
  before the RST, a more complex exchange might have occurred with RST's
  sent in both directions.

  Half-Open Connections and Other Anomalies

  An established connection is said to be  "half-open" if one of the
  TCPs has closed or aborted the connection at its end without the
  knowledge of the other, or if the two ends of the connection have
  become desynchronized owing to a crash that resulted in loss of
  memory.  Such connections will automatically become reset if an
  attempt is made to send data in either direction.  However, half-open
  connections are expected to be unusual, and the recovery procedure is
  mildly involved.

  If at site A the connection no longer exists, then an attempt by the
Top   ToC   RFC0793 - Page 34
  user at site B to send any data on it will result in the site B TCP
  receiving a reset control message.  Such a message indicates to the
  site B TCP that something is wrong, and it is expected to abort the
  connection.

  Assume that two user processes A and B are communicating with one
  another when a crash occurs causing loss of memory to A's TCP.
  Depending on the operating system supporting A's TCP, it is likely
  that some error recovery mechanism exists.  When the TCP is up again,
  A is likely to start again from the beginning or from a recovery
  point.  As a result, A will probably try to OPEN the connection again
  or try to SEND on the connection it believes open.  In the latter
  case, it receives the error message "connection not open" from the
  local (A's) TCP.  In an attempt to establish the connection, A's TCP
  will send a segment containing SYN.  This scenario leads to the
  example shown in figure 10.  After TCP A crashes, the user attempts to
  re-open the connection.  TCP B, in the meantime, thinks the connection
  is open.

  

      TCP A                                           TCP B

  1.  (CRASH)                               (send 300,receive 100)

  2.  CLOSED                                           ESTABLISHED

  3.  SYN-SENT --> <SEQ=400><CTL=SYN>              --> (??)

  4.  (!!)     <-- <SEQ=300><ACK=100><CTL=ACK>     <-- ESTABLISHED

  5.  SYN-SENT --> <SEQ=100><CTL=RST>              --> (Abort!!)

  6.  SYN-SENT                                         CLOSED

  7.  SYN-SENT --> <SEQ=400><CTL=SYN>              -->

                     Half-Open Connection Discovery

                               Figure 10.

  When the SYN arrives at line 3, TCP B, being in a synchronized state,
  and the incoming segment outside the window, responds with an
  acknowledgment indicating what sequence it next expects to hear (ACK
  100).  TCP A sees that this segment does not acknowledge anything it
  sent and, being unsynchronized, sends a reset (RST) because it has
  detected a half-open connection.  TCP B aborts at line 5.  TCP A will
Top   ToC   RFC0793 - Page 35
  continue to try to establish the connection; the problem is now
  reduced to the basic 3-way handshake of figure 7.

  An interesting alternative case occurs when TCP A crashes and TCP B
  tries to send data on what it thinks is a synchronized connection.
  This is illustrated in figure 11.  In this case, the data arriving at
  TCP A from TCP B (line 2) is unacceptable because no such connection
  exists, so TCP A sends a RST.  The RST is acceptable so TCP B
  processes it and aborts the connection.

  

        TCP A                                              TCP B

  1.  (CRASH)                                   (send 300,receive 100)

  2.  (??)    <-- <SEQ=300><ACK=100><DATA=10><CTL=ACK> <-- ESTABLISHED

  3.          --> <SEQ=100><CTL=RST>                   --> (ABORT!!)

           Active Side Causes Half-Open Connection Discovery

                               Figure 11.

  In figure 12, we find the two TCPs A and B with passive connections
  waiting for SYN.  An old duplicate arriving at TCP B (line 2) stirs B
  into action.  A SYN-ACK is returned (line 3) and causes TCP A to
  generate a RST (the ACK in line 3 is not acceptable).  TCP B accepts
  the reset and returns to its passive LISTEN state.

  

      TCP A                                         TCP B

  1.  LISTEN                                        LISTEN

  2.       ... <SEQ=Z><CTL=SYN>                -->  SYN-RECEIVED

  3.  (??) <-- <SEQ=X><ACK=Z+1><CTL=SYN,ACK>   <--  SYN-RECEIVED

  4.       --> <SEQ=Z+1><CTL=RST>              -->  (return to LISTEN!)

  5.  LISTEN                                        LISTEN

       Old Duplicate SYN Initiates a Reset on two Passive Sockets

                               Figure 12.
Top   ToC   RFC0793 - Page 36
  A variety of other cases are possible, all of which are accounted for
  by the following rules for RST generation and processing.

  Reset Generation

  As a general rule, reset (RST) must be sent whenever a segment arrives
  which apparently is not intended for the current connection.  A reset
  must not be sent if it is not clear that this is the case.

  There are three groups of states:

    1.  If the connection does not exist (CLOSED) then a reset is sent
    in response to any incoming segment except another reset.  In
    particular, SYNs addressed to a non-existent connection are rejected
    by this means.

    If the incoming segment has an ACK field, the reset takes its
    sequence number from the ACK field of the segment, otherwise the
    reset has sequence number zero and the ACK field is set to the sum
    of the sequence number and segment length of the incoming segment.
    The connection remains in the CLOSED state.

    2.  If the connection is in any non-synchronized state (LISTEN,
    SYN-SENT, SYN-RECEIVED), and the incoming segment acknowledges
    something not yet sent (the segment carries an unacceptable ACK), or
    if an incoming segment has a security level or compartment which
    does not exactly match the level and compartment requested for the
    connection, a reset is sent.

    If our SYN has not been acknowledged and the precedence level of the
    incoming segment is higher than the precedence level requested then
    either raise the local precedence level (if allowed by the user and
    the system) or send a reset; or if the precedence level of the
    incoming segment is lower than the precedence level requested then
    continue as if the precedence matched exactly (if the remote TCP
    cannot raise the precedence level to match ours this will be
    detected in the next segment it sends, and the connection will be
    terminated then).  If our SYN has been acknowledged (perhaps in this
    incoming segment) the precedence level of the incoming segment must
    match the local precedence level exactly, if it does not a reset
    must be sent.

    If the incoming segment has an ACK field, the reset takes its
    sequence number from the ACK field of the segment, otherwise the
    reset has sequence number zero and the ACK field is set to the sum
    of the sequence number and segment length of the incoming segment.
    The connection remains in the same state.
Top   ToC   RFC0793 - Page 37
    3.  If the connection is in a synchronized state (ESTABLISHED,
    FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT),
    any unacceptable segment (out of window sequence number or
    unacceptible acknowledgment number) must elicit only an empty
    acknowledgment segment containing the current send-sequence number
    and an acknowledgment indicating the next sequence number expected
    to be received, and the connection remains in the same state.

    If an incoming segment has a security level, or compartment, or
    precedence which does not exactly match the level, and compartment,
    and precedence requested for the connection,a reset is sent and
    connection goes to the CLOSED state.  The reset takes its sequence
    number from the ACK field of the incoming segment.

  Reset Processing

  In all states except SYN-SENT, all reset (RST) segments are validated
  by checking their SEQ-fields.  A reset is valid if its sequence number
  is in the window.  In the SYN-SENT state (a RST received in response
  to an initial SYN), the RST is acceptable if the ACK field
  acknowledges the SYN.

  The receiver of a RST first validates it, then changes state.  If the
  receiver was in the LISTEN state, it ignores it.  If the receiver was
  in SYN-RECEIVED state and had previously been in the LISTEN state,
  then the receiver returns to the LISTEN state, otherwise the receiver
  aborts the connection and goes to the CLOSED state.  If the receiver
  was in any other state, it aborts the connection and advises the user
  and goes to the CLOSED state.



(page 37 continued on part 3)

Next Section