
RFC 0761

DoD standard Transmission Control Protocol

Pages: 88
Obsoleted by:  0793, 7805
Part 2 of 3 – Pages 20 to 45

ToP   noToC   RFC0761 - Page 20   prevText
        Buffer Size Option Data:  16 bits

          If this option is present, then it communicates the receive
          buffer size at the TCP which sends this segment.  This field
          should only be sent in the initial connection request (i.e.,
          in segments with the SYN control bit set).  If this option is
          not used, the default buffer size of one octet is assumed.

  Padding:  variable

    The TCP header padding is used to ensure that the TCP header ends
    and data begins on a 32 bit boundary.  The padding is composed of
    zeros.
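  The padding rule can be illustrated with a short sketch.  The helper
  name pad_header is hypothetical and is used here only to show the
  arithmetic; it is not part of the specification.

```python
def pad_header(header: bytes) -> bytes:
    """Append zero octets so the header ends on a 32 bit (4 octet) boundary."""
    pad_len = (-len(header)) % 4  # 0 to 3 octets of padding
    return header + b"\x00" * pad_len


# A 20 octet fixed header followed by 3 octets of options needs 1 pad octet.
padded = pad_header(bytes(20) + b"\x01\x02\x03")
print(len(padded))  # 24
```

  A header that already ends on a boundary is returned unchanged.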

3.2.  Terminology

  Before we can discuss the operation of the TCP in detail, we need to
  introduce some terminology.  Maintaining a TCP connection requires
  remembering several variables.  We conceive of these variables as
  being stored in a connection record called a Transmission Control
  Block, or TCB.  Among the variables stored in the TCB are the local
  and remote socket numbers, the security and precedence of the
  connection, pointers to the user's send and receive buffers, and
  pointers to the retransmit queue and to the current segment.  In
  addition, several variables relating to the send and receive sequence
  numbers are stored in the TCB.

    Send Sequence Variables

      SND.UNA - send unacknowledged
      SND.NXT - send sequence
      SND.WND - send window
      SND.BS  - send buffer size
      SND.UP  - send urgent pointer
      SND.WL  - send sequence number used for last window update
      SND.LBB - send last buffer beginning
      ISS     - initial send sequence number

    Receive Sequence Variables

      RCV.NXT - receive sequence
      RCV.WND - receive window
      RCV.BS  - receive buffer size
      RCV.UP  - receive urgent pointer
      RCV.LBB - receive last buffer beginning
      IRS     - initial receive sequence number

  The following diagrams may help to relate some of these variables to
  the sequence space.

  Send Sequence Space

                   1         2          3          4      
              ----------|----------|----------|---------- 
                     SND.UNA    SND.NXT    SND.UNA        
                                          +SND.WND        

        1 - old sequence numbers which have been acknowledged  
        2 - sequence numbers of unacknowledged data            
        3 - sequence numbers allowed for new data transmission 
        4 - future sequence numbers which are not yet allowed  

                          Send Sequence Space

                               Figure 4.
    
    

  Receive Sequence Space

                       1          2          3      
                   ----------|----------|---------- 
                          RCV.NXT    RCV.NXT        
                                    +RCV.WND        

        1 - old sequence numbers which have been acknowledged  
        2 - sequence numbers allowed for new reception         
        3 - future sequence numbers which are not yet allowed  

                         Receive Sequence Space

                               Figure 5.
    
    

  There are also some variables used frequently in the discussion that
  take their values from the fields of the current segment.

    Current Segment Variables

      SEG.SEQ - segment sequence number
      SEG.ACK - segment acknowledgment number
      SEG.LEN - segment length
      SEG.WND - segment window
      SEG.UP  - segment urgent pointer
      SEG.PRC - segment precedence value
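  The send, receive, and current segment variables listed above can be
  pictured as plain records.  The following is a minimal sketch only:
  the class names, the representation of a socket as an (address, port)
  pair, and the zero defaults are assumptions made for illustration and
  are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class SendVars:
    una: int = 0  # SND.UNA - send unacknowledged
    nxt: int = 0  # SND.NXT - send sequence
    wnd: int = 0  # SND.WND - send window
    bs: int = 0   # SND.BS  - send buffer size
    up: int = 0   # SND.UP  - send urgent pointer
    wl: int = 0   # SND.WL  - sequence number used for last window update
    lbb: int = 0  # SND.LBB - send last buffer beginning
    iss: int = 0  # ISS     - initial send sequence number


@dataclass
class RecvVars:
    nxt: int = 0  # RCV.NXT - receive sequence
    wnd: int = 0  # RCV.WND - receive window
    bs: int = 0   # RCV.BS  - receive buffer size
    up: int = 0   # RCV.UP  - receive urgent pointer
    lbb: int = 0  # RCV.LBB - receive last buffer beginning
    irs: int = 0  # IRS     - initial receive sequence number


@dataclass
class Segment:
    seq: int = 0     # SEG.SEQ - segment sequence number
    ack: int = 0     # SEG.ACK - segment acknowledgment number
    length: int = 0  # SEG.LEN - segment length
    wnd: int = 0     # SEG.WND - segment window
    up: int = 0      # SEG.UP  - segment urgent pointer
    prc: int = 0     # SEG.PRC - segment precedence value


@dataclass
class TCB:
    # Sockets are modeled as (address, port) pairs; this is an assumption.
    local_socket: Tuple[str, int]
    remote_socket: Tuple[str, int]
    snd: SendVars = field(default_factory=SendVars)
    rcv: RecvVars = field(default_factory=RecvVars)


tcb = TCB(("10.0.0.1", 1024), ("10.0.0.2", 23))
tcb.snd.iss = 100
```

  Each TCB gets its own SendVars and RecvVars, so setting ISS on one
  connection record does not affect another.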

  A connection progresses through a series of states during its
  lifetime.  The states are:  LISTEN, SYN-SENT, SYN-RECEIVED,
  ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, TIME-WAIT, CLOSE-WAIT, CLOSING,
  and the fictional state CLOSED.  CLOSED is fictional because it
  represents the state when there is no TCB, and therefore, no
  connection.  Briefly the meanings of the states are:

    LISTEN - represents waiting for a connection request from any remote
    TCP and port.

    SYN-SENT - represents waiting for a matching connection request
    after having sent a connection request.

    SYN-RECEIVED - represents waiting for a confirming connection
    request acknowledgment after having both received and sent a
    connection request.

    ESTABLISHED - represents an open connection, ready to transmit and
    receive data segments.

    FIN-WAIT-1 - represents waiting for a connection termination request
    from the remote TCP, or an acknowledgment of the connection
    termination request previously sent.

    FIN-WAIT-2 - represents waiting for a connection termination request
    from the remote TCP.

    TIME-WAIT - represents waiting for enough time to pass to be sure
    the remote TCP received the acknowledgment of its connection
    termination request.

    CLOSE-WAIT - represents waiting for a connection termination request
    from the local user.

    CLOSING - represents waiting for a connection termination request
    acknowledgment from the remote TCP.

    CLOSED - represents no connection state at all.

  A TCP connection progresses from one state to another in response to
  events.  The events are the user calls, OPEN, SEND, RECEIVE, CLOSE,
  ABORT, and STATUS; the incoming segments, particularly those
  containing the SYN and FIN flags; and timeouts.

  The Glossary contains a more complete list of terms and their
  definitions.

  The state diagram in figure 6 only illustrates state changes, together
  with the causing events and resulting actions, but addresses neither
  error conditions nor actions which are not connected with state
  changes.  In a later section, more detail is offered with respect to
  the reaction of the TCP to events.
                                    
                              +---------+ ---------\      active OPEN  
                              |  CLOSED |            \    -----------  
                              +---------+<---------\   \   create TCB  
                                |     ^              \   \  snd SYN    
                   passive OPEN |     |   CLOSE        \   \           
                   ------------ |     | ----------       \   \         
                    create TCB  |     | delete TCB         \   \       
                                V     |                      \   \     
                              +---------+            CLOSE    |    \   
                              |  LISTEN |          ---------- |     |  
                              +---------+          delete TCB |     |  
                   rcv SYN      |     |     SEND              |     |  
                  -----------   |     |    -------            |     V  
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------                  
   |                  x         |     |     snd ACK                    
   |                            V     V                                
   |  CLOSE                   +---------+                              
   | -------                  |  ESTAB  |                              
   | snd FIN                  +---------+                              
   |                   CLOSE    |     |    rcv FIN                     
   V                  -------   |     |    -------                     
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------           -------------------|   WAIT  |
 +---------+          rcv FIN  \       /   CLOSE            +---------+
   | rcv ACK of FIN   -------   |     |   -------                      
   | --------------   snd ACK   |     |   snd FIN                      
   V        x                   V     V                                
 +---------+                  +---------+                              
 |FINWAIT-2|                  | CLOSING |                              
 +---------+                  +---------+                              
   | rcv FIN                          | rcv ACK of FIN                 
   | -------    Timeout=2MSL          | --------------                 
   V snd ACK    ------------          V   delete TCB                   
 +---------+     delete TCB   +---------+                              
 |TIME WAIT|----------------->| CLOSED  |                              
 +---------+                  +---------+                              

                      TCP Connection State Diagram
                               Figure 6.
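  The transitions drawn in figure 6 can also be written as a lookup
  table keyed by (state, event).  The sketch below is one possible
  transcription of the diagram; the table layout, the full state names
  in place of the abbreviations used in the figure, and the function
  names step and run are assumptions made for illustration.  Like the
  figure, it ignores error conditions.

```python
# (state, event) -> (next state, action); None means no action ("x").
TRANSITIONS = {
    ("CLOSED", "passive OPEN"): ("LISTEN", "create TCB"),
    ("CLOSED", "active OPEN"): ("SYN-SENT", "create TCB, snd SYN"),
    ("LISTEN", "CLOSE"): ("CLOSED", "delete TCB"),
    ("LISTEN", "rcv SYN"): ("SYN-RECEIVED", "snd SYN,ACK"),
    ("LISTEN", "SEND"): ("SYN-SENT", "snd SYN"),
    ("SYN-SENT", "CLOSE"): ("CLOSED", "delete TCB"),
    ("SYN-SENT", "rcv SYN"): ("SYN-RECEIVED", "snd ACK"),
    ("SYN-SENT", "rcv SYN,ACK"): ("ESTABLISHED", "snd ACK"),
    ("SYN-RECEIVED", "rcv ACK of SYN"): ("ESTABLISHED", None),
    ("SYN-RECEIVED", "CLOSE"): ("FIN-WAIT-1", "snd FIN"),
    ("ESTABLISHED", "CLOSE"): ("FIN-WAIT-1", "snd FIN"),
    ("ESTABLISHED", "rcv FIN"): ("CLOSE-WAIT", "snd ACK"),
    ("FIN-WAIT-1", "rcv ACK of FIN"): ("FIN-WAIT-2", None),
    ("FIN-WAIT-1", "rcv FIN"): ("CLOSING", "snd ACK"),
    ("CLOSE-WAIT", "CLOSE"): ("CLOSING", "snd FIN"),
    ("FIN-WAIT-2", "rcv FIN"): ("TIME-WAIT", "snd ACK"),
    ("CLOSING", "rcv ACK of FIN"): ("CLOSED", "delete TCB"),
    ("TIME-WAIT", "Timeout=2MSL"): ("CLOSED", "delete TCB"),
}


def step(state, event):
    """Return (next_state, action) for one event; KeyError if not drawn."""
    return TRANSITIONS[(state, event)]


def run(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state, _action = step(state, event)
    return state


# An active open followed by a local close and an orderly shutdown.
final = run("CLOSED", ["active OPEN", "rcv SYN,ACK", "CLOSE",
                       "rcv ACK of FIN", "rcv FIN", "Timeout=2MSL"])
print(final)  # CLOSED
```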
3.3.  Sequence Numbers

  A fundamental notion in the design is that every octet of data sent
  over a TCP connection has a sequence number.  Since every octet is
  sequenced, each of them can be acknowledged.  The acknowledgment
  mechanism employed is cumulative so that an acknowledgment of sequence
  number X indicates that all octets up to but not including X have been
  received.  This mechanism allows for straightforward duplicate
  detection in the presence of retransmission.  Octets within a
  segment are numbered so that the first data octet immediately
  following the header is the lowest numbered, and the following octets
  are numbered consecutively.

  It is essential to remember that the actual sequence number space is
  finite, though very large.  This space ranges from 0 to 2**32 - 1.
  Since the space is finite, all arithmetic dealing with sequence
  numbers must be performed modulo 2**32.  This unsigned arithmetic
  preserves the relationship of sequence numbers as they cycle from
  2**32 - 1 to 0 again.  There are some subtleties to computer modulo
  arithmetic, so great care should be taken in programming the
  comparison of such values.  The typical kinds of sequence number
  comparisons which the TCP must perform include:

    (a)  Determining that an acknowledgment refers to some sequence
         number sent but not yet acknowledged.

    (b)  Determining that all sequence numbers occupied by a segment
         have been acknowledged (e.g., to remove the segment from a
         retransmission queue).

    (c)  Determining that an incoming segment contains sequence numbers
         which are expected (i.e., that the segment "overlaps" the
         receive window).

  On send connections the following comparisons are needed:

    older sequence numbers                        newer sequence numbers

                                    
        SND.UNA                SEG.ACK                 SND.NXT  
           |                      |                       |     
       ----|----XXXXXXX------XXXXXXXXXX---------XXXXXX----|---- 
           |    |            |    |             |         |     
                |            |                  |               
             Segment 1    Segment 2          Segment 3          

                      <----- sequence space ----->

                   Sending Sequence Space Information

                               Figure 7.

    SND.UNA = oldest unacknowledged sequence number

    SND.NXT = next sequence number to be sent

    SEG.ACK = acknowledgment (next sequence number expected by the
              acknowledging TCP)

    SEG.SEQ = first sequence number of a segment

    SEG.SEQ+SEG.LEN-1 = last sequence number of a segment

  A new acknowledgment (called an "acceptable ack") is one for which
  the inequality below holds:

    SND.UNA < SEG.ACK =< SND.NXT

  Note that all arithmetic is modulo 2**32 and that comparisons are
  unsigned.  "=<" means "less than or equal".

  A segment on the retransmission queue is fully acknowledged if the sum
  of its sequence number and length is less than or equal to the
  acknowledgment value in the incoming segment.

  SEG.LEN is the number of octets occupied by the data in the segment.
  It is important to note that SEG.LEN must be non-zero; segments which
  do not occupy any sequence space (e.g., empty acknowledgment segments)
  are never placed on the retransmission queue, so would not go through
  this particular test.
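  The two send side comparisons can be sketched with unsigned offsets
  measured forward from SND.UNA, which keeps the tests correct when the
  sequence space wraps from 2**32 - 1 to 0.  The function names are
  hypothetical.  The queue test uses "less than or equal" because an
  acknowledgment of X covers all octets up to but not including X, so a
  segment ending at SEG.SEQ+SEG.LEN-1 is covered once the acknowledgment
  reaches SEG.SEQ+SEG.LEN.

```python
MOD = 2**32  # size of the sequence number space


def offset(seq: int, base: int) -> int:
    """Unsigned distance from base forward to seq, modulo 2**32."""
    return (seq - base) % MOD


def is_acceptable_ack(snd_una: int, seg_ack: int, snd_nxt: int) -> bool:
    """SND.UNA < SEG.ACK =< SND.NXT, measured forward from SND.UNA."""
    return 0 < offset(seg_ack, snd_una) <= offset(snd_nxt, snd_una)


def is_fully_acked(seg_seq: int, seg_len: int, seg_ack: int, snd_una: int) -> bool:
    """True when seg_ack covers every octet of a queued segment."""
    return offset(seg_seq + seg_len, snd_una) <= offset(seg_ack, snd_una)


def prune_queue(queue, seg_ack: int, snd_una: int):
    """Keep only the (seq, length) entries not yet fully acknowledged."""
    return [(s, n) for (s, n) in queue
            if not is_fully_acked(s, n, seg_ack, snd_una)]


# Three 50 octet segments are outstanding; an ACK of 200 covers two of them.
queue = [(100, 50), (150, 50), (200, 50)]
print(prune_queue(queue, seg_ack=200, snd_una=100))  # [(200, 50)]
```

  The same functions give the expected answers across the wrap, for
  example with SND.UNA = 2**32 - 10 and SND.NXT = 20.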
  On receive connections the following comparisons are needed:

    older sequence numbers                        newer sequence numbers

                                    
                RCV.NXT                         RCV.NXT+RCV.WND 
                   |                               |            
       ---------XXX|XXX------XXXXXXXXXX---------XXX|XX--------- 
                |  |         |                  |  |            
                |            |                  |               
             Segment 1    Segment 2          Segment 3          

                      <----- sequence space ----->

                  Receiving Sequence Space Information

                                Figure 8.

    RCV.NXT = next sequence number expected on incoming segments

    RCV.NXT+RCV.WND = last sequence number expected on incoming
        segments, plus one

    SEG.SEQ = first sequence number occupied by the incoming segment

    SEG.SEQ+SEG.LEN-1 = last sequence number occupied by the incoming
        segment

  A segment is judged to occupy a portion of valid receive sequence
  space if

     0 =< (SEG.SEQ+SEG.LEN-1 - RCV.NXT) < (RCV.NXT+RCV.WND - RCV.NXT)

  SEG.SEQ+SEG.LEN-1 is the last sequence number occupied by the segment;
  RCV.NXT is the next sequence number expected on an incoming segment;
  and RCV.NXT+RCV.WND is the right edge of the receive window.

  Actually, it is a little more complicated than this.  Due to zero
  windows and zero length segments, we have four cases for the
  acceptability of an incoming segment:

    Segment Receive  Test
    Length  Window
    ------- -------  -------------------------------------------

       0       0     SEG.SEQ = RCV.NXT

       0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND

      >0       0     not acceptable

      >0      >0     RCV.NXT < SEG.SEQ+SEG.LEN =< RCV.NXT+RCV.WND

  Note that the acceptance test for a segment, since it requires the end
  of a segment to lie in the window, is somewhat more restrictive than
  is absolutely necessary.  If at least the first sequence number of the
  segment lies in the receive window, or if some part of the segment
  lies in the receive window, then the segment might be judged
  acceptable.  Thus, in figure 8, at least segments 1 and 2 are
  acceptable by the strict rule, and segment 3 may or may not be,
  depending on the strictness of interpretation of the rule.

  Note that when the receive window is zero no segments should be
  acceptable except ACK segments.  Thus, it should be possible for a TCP
  to maintain a zero receive window while transmitting data and
  receiving ACKs.
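  The four cases of the acceptance test can be written out as a single
  function.  This is a sketch of the strict rule only (the end of the
  segment must lie in the window); the function name is hypothetical and
  offsets are taken modulo 2**32 from RCV.NXT.

```python
MOD = 2**32


def segment_acceptable(seg_seq: int, seg_len: int,
                       rcv_nxt: int, rcv_wnd: int) -> bool:
    """Apply the four case acceptance test to an incoming segment."""
    if seg_len == 0:
        if rcv_wnd == 0:
            # Zero length, zero window: SEG.SEQ = RCV.NXT
            return seg_seq == rcv_nxt
        # Zero length, open window: RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND
        return (seg_seq - rcv_nxt) % MOD < rcv_wnd
    if rcv_wnd == 0:
        # Data arriving at a zero window is not acceptable.
        return False
    # RCV.NXT < SEG.SEQ+SEG.LEN =< RCV.NXT+RCV.WND
    end_offset = (seg_seq + seg_len - rcv_nxt) % MOD
    return 0 < end_offset <= rcv_wnd


# With RCV.NXT = 100 and RCV.WND = 50, in the manner of figure 8:
print(segment_acceptable(90, 20, 100, 50))   # True  (straddles left edge)
print(segment_acceptable(110, 10, 100, 50))  # True  (inside the window)
print(segment_acceptable(140, 20, 100, 50))  # False (end is past right edge)
```

  A more lenient implementation could instead accept any segment with
  some part in the window, as the preceding discussion allows.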

  We have taken advantage of the numbering scheme to protect certain
  control information as well.  This is achieved by implicitly including
  some control flags in the sequence space so they can be retransmitted
  and acknowledged without confusion (i.e., one and only one copy of the
  control will be acted upon).  Control information is not physically
  carried in the segment data space.  Consequently, we must adopt rules
  for implicitly assigning sequence numbers to control.  The SYN and FIN
  are the only controls requiring this protection, and these controls
  are used only at connection opening and closing.  For sequence number
  purposes, the SYN is considered to occur before the first actual data
  octet of the segment in which it occurs, while the FIN is considered
  to occur after the last actual data octet in a segment in which it
  occurs.  The segment length includes both data and sequence space
  occupying controls.  When a SYN is present then SEG.SEQ is the
  sequence number of the SYN.
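  As an illustration of these rules, the sketch below computes the
  segment length and the sequence numbers implicitly assigned to SYN,
  the first data octet, and FIN.  The function names and the dictionary
  layout are assumptions made for the example.

```python
MOD = 2**32


def seg_len(data: bytes, syn: bool = False, fin: bool = False) -> int:
    """Segment length: data octets plus one for each of SYN and FIN."""
    return len(data) + int(syn) + int(fin)


def sequence_layout(seg_seq: int, data: bytes,
                    syn: bool = False, fin: bool = False) -> dict:
    """Sequence numbers occupied by SYN, first data octet, and FIN."""
    first_data = (seg_seq + int(syn)) % MOD  # SYN comes before the data
    return {
        "syn": seg_seq if syn else None,
        "first_data": first_data if data else None,
        "fin": (first_data + len(data)) % MOD if fin else None,  # after data
    }


print(seg_len(b"abc", syn=True, fin=True))               # 5
print(sequence_layout(100, b"abc", syn=True, fin=True))
# {'syn': 100, 'first_data': 101, 'fin': 104}
```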

  Initial Sequence Number Selection

  The protocol places no restriction on a particular connection being
  used over and over again.  A connection is defined by a pair of
  sockets.  New instances of a connection will be referred to as
  incarnations of the connection.  The problem that arises owing to this
  is -- "how does the TCP identify duplicate segments from previous
  incarnations of the connection?"  This problem becomes apparent if the
  connection is being opened and closed in quick succession, or if the
  connection breaks with loss of memory and is then reestablished.

  To avoid confusion we must prevent segments from one incarnation of a
  connection from being used while the same sequence numbers may still
  be present in the network from an earlier incarnation.  We want to
  assure this, even if a TCP crashes and loses all knowledge of the
  sequence numbers it has been using.  When new connections are created,
  an initial sequence number (ISN) generator is employed which selects a
  new 32 bit ISN.  The generator is bound to a (possibly fictitious) 32
  bit clock whose low order bit is incremented roughly every 4
  microseconds.  Thus, the ISN cycles approximately every 4.77 hours.
  Since we assume that segments will stay in the network no more than
  tens of seconds or minutes, at worst, we can reasonably assume that
  ISN's will be unique.
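  A clock-driven generator of this kind can be sketched as follows.
  The function names are hypothetical, and the use of Python's
  monotonic clock is an assumption made only to keep the example
  self-contained; that clock does not survive a restart, so a real
  implementation would want a clock that keeps running across crashes.

```python
import time

MOD = 2**32            # ISNs are 32 bit values
TICK_MICROSECONDS = 4  # low order bit advances roughly every 4 microseconds


def isn_from_clock(now_microseconds: int) -> int:
    """Map a microsecond clock reading onto the 32 bit ISN space."""
    return (now_microseconds // TICK_MICROSECONDS) % MOD


# Time for the generator to cycle through all 2**32 values, in seconds.
ISN_WRAP_SECONDS = MOD * TICK_MICROSECONDS / 1_000_000


def new_isn() -> int:
    """Select an ISN from the current clock reading."""
    return isn_from_clock(time.monotonic_ns() // 1000)
```

  Two connections opened a few seconds apart therefore start at widely
  separated points in the sequence space.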

  For each connection there is a send sequence number and a receive
  sequence number.  The initial send sequence number (ISS) is chosen by
  the data sending TCP, and the initial receive sequence number (IRS) is
  learned during the connection establishing procedure.

  For a connection to be established or initialized, the two TCPs must
  synchronize on each other's initial sequence numbers.  This is done in
  an exchange of connection establishing messages carrying a control bit
  called "SYN" (for synchronize) and the initial sequence numbers.  As a
  shorthand, messages carrying the SYN bit are also called "SYNs".
  Hence, the solution requires a suitable mechanism for picking an
  initial sequence number and a slightly involved handshake to exchange
  the ISN's.  A "three way handshake" is necessary because sequence
  numbers are not tied to a global clock in the network, and TCPs may
  have different mechanisms for picking the ISN's.  The receiver of the
  first SYN has no way of knowing whether the segment was an old delayed
  one or not, unless it remembers the last sequence number used on the
  connection (which is not always possible), and so it must ask the
  sender to verify this SYN.

  The "three way handshake" and the advantages of a "clock-driven"
  scheme are discussed in [4].

  Knowing When to Keep Quiet

  To be sure that a TCP does not create a segment that carries a
  sequence number which may be duplicated by an old segment remaining in
  the network, the TCP must keep quiet for a maximum segment lifetime
  (MSL) before assigning any sequence numbers upon starting up or
  recovering from a crash in which memory of sequence numbers in use was
  lost.  For this specification the MSL is taken to be 2 minutes.  This
  is an engineering choice, and may be changed if experience indicates
  it is desirable to do so.  Note that if a TCP is reinitialized in some
  sense, yet retains its memory of sequence numbers in use, then it need
  not wait at all; it must only be sure to use sequence numbers larger
  than those recently used.

  It should be noted that this strategy does not protect against
  spoofing or other replay type duplicate message problems.

3.4.  Establishing a connection

  The "three-way handshake" is the procedure used to establish a
  connection.  This procedure normally is initiated by one TCP and
  responded to by another TCP.  The procedure also works if two TCPs
  simultaneously initiate the procedure.  When a simultaneous attempt
  occurs, each TCP receives a "SYN" segment which carries no
  acknowledgment after it has sent a "SYN".  Of course, the arrival of
  an old duplicate "SYN" segment can potentially make it appear, to the
  recipient, that a simultaneous connection initiation is in progress.
  Proper use of "reset" segments can disambiguate these cases.  Several
  examples of connection initiation follow.  Although these examples do
  not show connection synchronization using data-carrying segments, this
  is perfectly legitimate, so long as the receiving TCP doesn't deliver
  the data to the user until it is clear the data is valid (i.e., the
  data must be buffered at the receiver until the connection reaches the
  ESTABLISHED state).  The three-way handshake reduces the possibility
  of false connections.  It is the implementation of a trade-off between
  memory and messages to provide information for this checking.

  The simplest three-way handshake is shown in figure 9 below.  The
  figures should be interpreted in the following way.  Each line is
  numbered for reference purposes.  Right arrows (-->) indicate
  departure of a TCP segment from TCP A to TCP B, or arrival of a
  segment at B from A.  Left arrows (<--) indicate the reverse.
  Ellipsis (...) indicates a segment which is still in the network
  (delayed).  An "XXX" indicates a segment which is lost or rejected.
  Comments appear in parentheses.  TCP states represent the state AFTER
  the departure or arrival of the segment (whose contents are shown in
  the center of each line).  Segment contents are shown in abbreviated
  form, with sequence number, control flags, and ACK field.  Other
  fields such as window, addresses, lengths, and text have been left out
  in the interest of clarity.

      TCP A                                                TCP B

  1.  CLOSED                                               LISTEN

  2.  SYN-SENT    --> <SEQ=100><CTL=SYN>               --> SYN-RECEIVED

  3.  ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED

  4.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK>       --> ESTABLISHED

  5.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK><DATA> --> ESTABLISHED

          Basic 3-Way Handshake for Connection Synchronization

                                Figure 9.

  In line 2 of figure 9, TCP A begins by sending a SYN segment
  indicating that it will use sequence numbers starting with sequence
  number 100.  In line 3, TCP B sends a SYN and acknowledges the SYN it
  received from TCP A.  Note that the acknowledgment field indicates TCP
  B is now expecting to hear sequence 101, acknowledging the SYN which
  occupied sequence 100.

  At line 4, TCP A responds with an empty segment containing an ACK for
  TCP B's SYN; and in line 5, TCP A sends some data.  Note that the
  sequence number of the segment in line 5 is the same as in line 4
  because the ACK does not occupy sequence number space (if it did, we
  would wind up ACKing ACK's!).
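  The sequence and acknowledgment numbers in lines 2 through 4 of
  figure 9 follow directly from the two initial sequence numbers.  A
  minimal sketch, with a hypothetical function name and a dictionary
  layout chosen only for the example:

```python
MOD = 2**32


def three_way_handshake(iss_a: int, iss_b: int):
    """Return the three segments of a basic handshake, in order."""
    syn = {"from": "A", "seq": iss_a, "ack": None, "ctl": ("SYN",)}
    # B's SYN acknowledges A's SYN, which occupied sequence number iss_a.
    syn_ack = {"from": "B", "seq": iss_b, "ack": (iss_a + 1) % MOD,
               "ctl": ("SYN", "ACK")}
    # A's ACK acknowledges B's SYN; the ACK itself occupies no sequence space.
    ack = {"from": "A", "seq": (iss_a + 1) % MOD, "ack": (iss_b + 1) % MOD,
           "ctl": ("ACK",)}
    return [syn, syn_ack, ack]


for seg in three_way_handshake(100, 300):
    print(seg["from"], seg["seq"], seg["ack"], ",".join(seg["ctl"]))
# A 100 None SYN
# B 300 101 SYN,ACK
# A 101 301 ACK
```

  A data segment sent next by TCP A would again carry SEQ=101, as in
  line 5 of the figure.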

  Simultaneous initiation is only slightly more complex, as is shown in
  figure 10.  Each TCP cycles from CLOSED to SYN-SENT to SYN-RECEIVED to
  ESTABLISHED.

  The principal reason for the three-way handshake is to prevent old
  duplicate connection initiations from causing confusion.  To deal with
  this, a special control message, reset, has been devised.  If the
  receiving TCP is in a non-synchronized state (i.e., SYN-SENT,
  SYN-RECEIVED), it returns to LISTEN on receiving an acceptable reset.
  If the TCP is in one of the synchronized states (ESTABLISHED,
  FIN-WAIT-1, FIN-WAIT-2, TIME-WAIT, CLOSE-WAIT, CLOSING), it aborts the
  connection and informs its user.  We discuss this latter case under
  "half-open" connections below.

      TCP A                                        TCP B

  1.  CLOSED                                       CLOSED

  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>          ...

  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>          <-- SYN-SENT

  4.               ... <SEQ=100><CTL=SYN>          --> SYN-RECEIVED

  5.  SYN-RECEIVED --> <SEQ=101><ACK=301><CTL=ACK> ...

  6.  ESTABLISHED  <-- <SEQ=301><ACK=101><CTL=ACK> <-- SYN-RECEIVED

  7.               ... <SEQ=101><ACK=301><CTL=ACK> --> ESTABLISHED

                Simultaneous Connection Synchronization

                               Figure 10.

  

      TCP A                                                TCP B

  1.  CLOSED                                               LISTEN

  2.  SYN-SENT    --> <SEQ=100><CTL=SYN>               ...

  3.  (duplicate) ... <SEQ=1000><CTL=SYN>              --> SYN-RECEIVED

  4.  SYN-SENT    <-- <SEQ=300><ACK=1001><CTL=SYN,ACK> <-- SYN-RECEIVED

  5.  SYN-SENT    --> <SEQ=1001><CTL=RST>              --> LISTEN
  

  6.              ... <SEQ=100><CTL=SYN>               --> SYN-RECEIVED

  7.  SYN-SENT    <-- <SEQ=400><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED

  8.  ESTABLISHED --> <SEQ=101><ACK=401><CTL=ACK>      --> ESTABLISHED

                    Recovery from Old Duplicate SYN

                               Figure 11.

  As a simple example of recovery from old duplicates, consider
  figure 11.  At line 3, an old duplicate SYN arrives at TCP B.  TCP B
  cannot tell that this is an old duplicate, so it responds normally
  (line 4).  TCP A detects that the ACK field is incorrect and returns a
  RST (reset) with its SEQ field selected to make the segment
  believable.  TCP B, on receiving the RST, returns to the LISTEN state.
  When the original SYN (pun intended) finally arrives at line 6, the
  synchronization proceeds normally.  If the SYN at line 6 had arrived
  before the RST, a more complex exchange might have occurred with RST's
  sent in both directions.
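  The decision TCP A makes at lines 4 and 7 of figure 11 can be
  sketched as a check of the ACK field against what A has actually
  sent.  The function name and the reply format are hypothetical, and
  only the SYN,ACK case in the SYN-SENT state is covered.

```python
MOD = 2**32


def syn_sent_on_syn_ack(iss: int, snd_nxt: int, seg_seq: int, seg_ack: int) -> dict:
    """Reply of a TCP in SYN-SENT to an incoming SYN,ACK segment."""
    ack_offset = (seg_ack - iss) % MOD
    nxt_offset = (snd_nxt - iss) % MOD
    if 0 < ack_offset <= nxt_offset:
        # The ACK covers our SYN: acknowledge the peer's SYN in turn.
        return {"ctl": "ACK", "seq": snd_nxt, "ack": (seg_seq + 1) % MOD}
    # The ACK refers to something we never sent: reset, taking SEQ from
    # the ACK field so the segment is believable to the peer.
    return {"ctl": "RST", "seq": seg_ack}


# TCP A sent SYN with SEQ=100, so ISS=100 and SND.NXT=101.
print(syn_sent_on_syn_ack(100, 101, seg_seq=300, seg_ack=1001))
# {'ctl': 'RST', 'seq': 1001}
print(syn_sent_on_syn_ack(100, 101, seg_seq=400, seg_ack=101))
# {'ctl': 'ACK', 'seq': 101, 'ack': 401}
```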

  Half-Open Connections and Other Anomalies

  An established connection is said to be "half-open" if one of the
  TCPs has closed or aborted the connection at its end without the
  knowledge of the other, or if the two ends of the connection have
  become desynchronized owing to a crash that resulted in loss of
  memory.  Such connections will automatically become reset if an
  attempt is made to send data in either direction.  However, half-open
  connections are expected to be unusual, and the recovery procedure is
  mildly involved.

  If at site A the connection no longer exists, then an attempt by the
  user at site B to send any data on it will result in the site B TCP
  receiving a reset control message.  Such a message should indicate to
  the site B TCP that something is wrong, and it is expected to abort
  the connection.

  Assume that two user processes A and B are communicating with one
  another when a crash occurs causing loss of memory to A's TCP.
  Depending on the operating system supporting A's TCP, it is likely
  that some error recovery mechanism exists.  When the TCP is up again,
  A is likely to start again from the beginning or from a recovery
  point.  As a result, A will probably try to OPEN the connection again
  or try to SEND on the connection it believes open.  In the latter
  case, it receives the error message "connection not open" from the
  local (A's) TCP.  In an attempt to establish the connection, A's TCP
  will send a segment containing SYN.  This scenario leads to the
  example shown in figure 12.  After TCP A crashes, the user attempts to
  re-open the connection.  TCP B, in the meantime, thinks the connection
  is open.

      TCP A                                           TCP B

  1.  (CRASH)                               (send 300,receive 100)

  2.  CLOSED                                           ESTABLISHED

  3.  SYN-SENT --> <SEQ=400><CTL=SYN>              --> (??)

  4.  (!!)     <-- <SEQ=300><ACK=100><CTL=ACK>     <-- ESTABLISHED

  5.  SYN-SENT --> <SEQ=100><CTL=RST>              --> (Abort!!)

  6.                                                   CLOSED

  7.  SYN-SENT --> <SEQ=400><CTL=SYN>              -->

                     Half-Open Connection Discovery

                               Figure 12.

  When the SYN arrives at line 3, TCP B, being in a synchronized state,
  responds with an acknowledgment indicating what sequence it next
  expects to hear (ACK 100).  TCP A sees that this segment does not
  acknowledge anything it sent and, being unsynchronized, sends a reset
  (RST) because it has detected a half-open connection.  TCP B aborts at
  line 5.  TCP A will continue to try to establish the connection; the
  problem is now reduced to the basic 3-way handshake of figure 9.

  An interesting alternative case occurs when TCP A crashes and TCP B
  tries to send data on what it thinks is a synchronized connection.
  This is illustrated in figure 13.  In this case, the data arriving at
  TCP A from TCP B (line 2) is unacceptable because no such connection
  exists, so TCP A sends a RST.  The RST is acceptable so TCP B
  processes it and aborts the connection.

        TCP A                                              TCP B

  1.  (CRASH)                                   (send 300,receive 100)

  2.  (??)    <-- <SEQ=300><ACK=100><DATA=10><CTL=ACK> <-- ESTABLISHED

  3.          --> <SEQ=100><CTL=RST>                   --> (ABORT!!)

           Active Side Causes Half-Open Connection Discovery

                               Figure 13.

  In figure 14, we find the two TCPs A and B with passive connections
  waiting for SYN.  An old duplicate arriving at TCP B (line 2) stirs B
  into action.  A SYN-ACK is returned (line 3) and causes TCP A to
  generate a RST (the ACK in line 3 is not acceptable).  TCP B accepts
  the reset and returns to its passive LISTEN state.

  

      TCP A                                         TCP B

  1.  LISTEN                                        LISTEN

  2.       ... <SEQ=Z><CTL=SYN>                -->  SYN-RECEIVED

  3.  (??) <-- <SEQ=X><ACK=Z+1><CTL=SYN,ACK>   <--  SYN-RECEIVED

  4.       --> <SEQ=Z+1><CTL=RST>              -->  (return to LISTEN!)

  5.  LISTEN                                        LISTEN

       Old Duplicate SYN Initiates a Reset on two Passive Sockets

                               Figure 14.

  A variety of other cases are possible, all of which are accounted for
  by the following rules for RST generation and processing.

  Reset Generation

  As a general rule, reset (RST) should be sent whenever a segment
  arrives which apparently is not intended for the current or a future
  incarnation of the connection.  A reset should not be sent if it is
  not clear that this is the case.  Thus, if any segment arrives for a
  nonexistent connection, a reset should be sent.  If a segment ACKs
  something which has never been sent on the current connection, then
  one of the following two cases applies.

  1.  If the connection is in any non-synchronized state (LISTEN,
  SYN-SENT, SYN-RECEIVED) or if the connection does not exist, a reset
  (RST) should be formed and sent for any segment that acknowledges
  something not yet sent.  The RST should take its SEQ field from the
  ACK field of the offending segment (if the ACK control bit was set),
  and its ACK bit should be reset (zero), except to refuse an initial
  SYN.  A reset is also sent if an incoming segment has a security level
  or compartment which does not exactly match the level and compartment
  requested for the connection.  If the precedence of the incoming
  segment is less than the precedence level requested, a reset is sent.

  2.  If the connection is in a synchronized state (ESTABLISHED,
  FIN-WAIT-1, FIN-WAIT-2, TIME-WAIT, CLOSE-WAIT, CLOSING), any
  unacceptable segment should elicit only an empty acknowledgment
  segment containing the current send-sequence number and an
  acknowledgment indicating the next sequence number expected to be
  received.
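
  The two cases above can be sketched in Python.  This is only an
  illustration under stated assumptions, not part of the protocol
  definition:  the Segment class, the function name reply_to_bad_ack,
  and the convention that a state of None means "no connection exists"
  are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# States the text calls non-synchronized.
NON_SYNCHRONIZED = {"LISTEN", "SYN-SENT", "SYN-RECEIVED"}


@dataclass
class Segment:
    seq: int
    ack: Optional[int] = None  # None models "ACK control bit off"
    rst: bool = False


def reply_to_bad_ack(state: Optional[str], seg: Segment,
                     snd_nxt: int, rcv_nxt: int) -> Segment:
    """Build the reply to a segment that ACKs something never sent.

    state is None when no connection exists (hypothetical convention).
    """
    if seg.ack is None:
        raise ValueError("this sketch only covers segments with ACK set")
    if state is None or state in NON_SYNCHRONIZED:
        # Case 1: RST whose SEQ is the offending ACK value, ACK bit off.
        return Segment(seq=seg.ack, rst=True)
    # Case 2: empty acknowledgment carrying the current send sequence
    # number and the next sequence number expected.
    return Segment(seq=snd_nxt, ack=rcv_nxt)
```

  With Z = 100, line 4 of figure 14 corresponds to
  reply_to_bad_ack("LISTEN", Segment(seq=500, ack=101), 0, 0), which
  yields a RST with SEQ=101 and no ACK.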

  Reset Processing

  All reset (RST) segments are validated by checking their SEQ-fields.
  A reset is valid if its sequence number is in the window.  In the case
  of a RST received in response to an initial SYN any sequence number is
  acceptable if the ACK field acknowledges the SYN.

  The receiver of a RST first validates it, then changes state.  If the
  receiver was in the LISTEN state, it ignores it.  If the receiver was
  in SYN-RECEIVED state and had previously been in the LISTEN state,
  then the receiver returns to the LISTEN state, otherwise the receiver
  aborts the connection and goes to the CLOSED state.  If the receiver
  was in any other state, it aborts the connection, advises the user,
  and goes to the CLOSED state.
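
  A minimal Python sketch of reset validation and processing follows.
  The function names and signatures are hypothetical.  Whether the user
  is advised when a connection in SYN-RECEIVED that did not come from
  LISTEN is aborted is not stated above; the sketch assumes the user is
  not advised.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32 bits wide


def rst_is_valid(state, seg_seq, seg_ack, rcv_nxt, rcv_wnd, iss):
    """Validate a RST as described above (hypothetical signature)."""
    if state == "SYN-SENT":
        # Reply to our initial SYN: any SEQ is acceptable if the
        # ACK field acknowledges the SYN.
        return seg_ack == (iss + 1) % SEQ_MOD
    # Otherwise the sequence number must lie in the receive window.
    return (seg_seq - rcv_nxt) % SEQ_MOD < rcv_wnd


def on_reset(state, came_from_listen=False):
    """Return (new_state, advise_user) for an already validated RST."""
    if state == "LISTEN":
        return "LISTEN", False          # ignored
    if state == "SYN-RECEIVED":
        if came_from_listen:
            return "LISTEN", False      # back to passive waiting
        return "CLOSED", False          # abort (assumption: user not told)
    return "CLOSED", True               # abort and advise the user
```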

3.5.  Closing a Connection

  CLOSE is an operation meaning "I have no more data to send."  The
  notion of closing a full-duplex connection is subject to ambiguous
  interpretation, of course, since it may not be obvious how to treat
  the receiving side of the connection.  We have chosen to treat CLOSE
  in a simplex fashion.  The user who CLOSEs may continue to RECEIVE
  until he is told that the other side has CLOSED also.  Thus, a program
  could initiate several SENDs followed by a CLOSE, and then continue to
  RECEIVE until signaled that a RECEIVE failed because the other side
  has CLOSED.  We assume that the TCP will signal a user, even if no
  RECEIVEs are outstanding, that the other side has closed, so the user
  can terminate his side gracefully.  A TCP will reliably deliver all
  buffers SENT before the connection was CLOSED so a user who expects no
  data in return need only wait to hear the connection was CLOSED
  successfully to know that all his data was received at the destination
  TCP.

  There are essentially three cases:

    1) The user initiates by telling the TCP to CLOSE the connection

    2) The remote TCP initiates by sending a FIN control signal

    3) Both users CLOSE simultaneously

  Case 1:  Local user initiates the close

    In this case, a FIN segment can be constructed and placed on the
    outgoing segment queue.  No further SENDs from the user will be
    accepted by the TCP, and it enters the FIN-WAIT-1 state.  RECEIVEs
    are allowed in this state.  All segments preceding and including FIN
    will be retransmitted until acknowledged.  When the other TCP has
    both acknowledged the FIN and sent a FIN of its own, the first TCP
    can ACK this FIN.  It should be noted that a TCP receiving a FIN
    will ACK but not send its own FIN until its user has CLOSED the
    connection also.

  Case 2:  TCP receives a FIN from the network

    If an unsolicited FIN arrives from the network, the receiving TCP
    can ACK it and tell the user that the connection is closing.  The
    user should respond with a CLOSE, upon which the TCP can send a FIN
    to the other TCP.  The TCP then waits until its own FIN is
    acknowledged whereupon it deletes the connection.  If an ACK is not
    forthcoming, after a timeout the connection is aborted and the user
    is told.

  Case 3:  both users close simultaneously

    A simultaneous CLOSE by users at both ends of a connection causes
    FIN segments to be exchanged.  When all segments preceding the FINs
    have been processed and acknowledged, each TCP can ACK the FIN it
    has received.  Both will, upon receiving these ACKs, delete the
    connection.

      TCP A                                                TCP B

  1.  ESTABLISHED                                          ESTABLISHED

  2.  (Close)
      FIN-WAIT-1  --> <SEQ=100><CTL=FIN>               --> CLOSE-WAIT

  3.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><CTL=ACK>      <-- CLOSE-WAIT

  4.                                                       (Close)
      TIME-WAIT   <-- <SEQ=300><CTL=FIN>               <-- CLOSING

  5.  TIME-WAIT   --> <SEQ=101><ACK=301><CTL=ACK>      --> CLOSED

  6.  (2 MSL)
      CLOSED

                         Normal Close Sequence

                               Figure 15.
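
  The state changes shown in the normal and simultaneous close figures
  of this section can be collected into a small transition table.  The
  Python sketch below is illustrative only:  the event labels and the
  function name run are hypothetical, and the table holds only the
  transitions that appear in those two figures.

```python
# (state, event) -> next state, read off the close sequence figures.
TRANSITIONS = {
    ("ESTABLISHED", "user CLOSE"): "FIN-WAIT-1",
    ("ESTABLISHED", "rcv FIN"): "CLOSE-WAIT",
    ("FIN-WAIT-1", "rcv ACK of FIN"): "FIN-WAIT-2",
    ("FIN-WAIT-1", "rcv FIN"): "CLOSING",
    ("FIN-WAIT-2", "rcv FIN"): "TIME-WAIT",
    ("CLOSE-WAIT", "user CLOSE"): "CLOSING",
    ("CLOSING", "rcv ACK of FIN"): "CLOSED",
    ("TIME-WAIT", "2 MSL timeout"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Return the list of states visited while applying the events."""
    trace = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        trace.append(state)
    return trace
```

  For TCP A in the normal close, run(["user CLOSE", "rcv ACK of FIN",
  "rcv FIN", "2 MSL timeout"]) visits ESTABLISHED, FIN-WAIT-1,
  FIN-WAIT-2, TIME-WAIT, and CLOSED in that order.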

  

      TCP A                                                TCP B

  1.  ESTABLISHED                                          ESTABLISHED

  2.  (Close)                                              (Close)
      FIN-WAIT-1  --> <SEQ=100><CTL=FIN>               ... FIN-WAIT-1
                  <-- <SEQ=300><CTL=FIN>               <--
                  ... <SEQ=100><CTL=FIN>               -->

  3.  CLOSING     --> <SEQ=101><ACK=301><CTL=ACK>      ... CLOSING
                  <-- <SEQ=301><ACK=101><CTL=ACK>      <--
                  ... <SEQ=101><ACK=301><CTL=ACK>      -->

  4.  CLOSED                                               CLOSED

                      Simultaneous Close Sequence

                               Figure 16.

3.6.  Precedence and Security

  The intent is that a connection be allowed only between ports
  operating with exactly the same security and compartment values and
  at the higher of the precedence levels requested by the two ports.

  The precedence levels are:

    flash override - 111
    flash          - 110
    immediate      - 10X
    priority       - 01X
    routine        - 00X

  The security levels are:

    top secret    - 11
    secret        - 10
    confidential  - 01
    unclassified  - 00

  The compartments are assigned by the Defense Communications Agency.
  The defaults are precedence:  routine, security:  unclassified,
  compartment:  zero.  A host which does not implement the precedence or
  security features should clear these fields to zero for segments it
  sends.

  A connection attempt with mismatched security/compartment values or a
  lower precedence value should be rejected by sending a reset.

  Note that TCP modules which operate only at the default value of
  precedence will still have to check the precedence of incoming
  segments and possibly raise the precedence level they use on the
  connection.
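
  The matching rule can be sketched in Python as follows.  The names
  PRECEDENCE_RANK, precedence_name, and negotiate, and the tuple layout
  (precedence bits, security bits, compartment), are hypothetical.  The
  sketch treats the X positions in the precedence table as "don't care"
  bits and compares precedence by level rather than by raw bit value.

```python
PRECEDENCE_RANK = {
    "routine": 0,
    "priority": 1,
    "immediate": 2,
    "flash": 3,
    "flash override": 4,
}


def precedence_name(bits):
    """Decode a 3-bit precedence field using the table above."""
    if not 0 <= bits <= 0b111:
        raise ValueError("precedence is a 3-bit field")
    if bits == 0b111:
        return "flash override"
    if bits == 0b110:
        return "flash"
    # Remaining codes are 10X, 01X, 00X: the low bit is ignored.
    return {0b10: "immediate", 0b01: "priority", 0b00: "routine"}[bits >> 1]


def negotiate(local, incoming):
    """Return the connection precedence name, or None to send a reset.

    local and incoming are (precedence_bits, security_bits, compartment).
    """
    local_prec, local_sec, local_comp = local
    in_prec, in_sec, in_comp = incoming
    if (local_sec, local_comp) != (in_sec, in_comp):
        return None  # security/compartment must match exactly
    local_name = precedence_name(local_prec)
    in_name = precedence_name(in_prec)
    if PRECEDENCE_RANK[in_name] < PRECEDENCE_RANK[local_name]:
        return None  # lower precedence is rejected
    return in_name   # the higher of the two levels
```

  A module running at the default (routine) precedence that receives a
  flash request therefore raises the connection to flash:
  negotiate((0b000, 0b00, 0), (0b110, 0b00, 0)) returns "flash".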

3.7.  Data Communication

  Once the connection is established, data is communicated by the
  exchange of segments.  Because segments may be lost due to errors
  (checksum test failure) or network congestion, TCP uses
  retransmission (after a timeout) to ensure delivery of every segment.
  Duplicate segments may arrive due to network or TCP retransmission.
  As discussed in the section on sequence numbers, the TCP performs
  certain tests on the sequence and acknowledgment numbers in the
  segments to verify their acceptability.

  The sender of data keeps track of the next sequence number to use in
  the variable SND.NXT.  The receiver of data keeps track of the next
  sequence number to expect in the variable RCV.NXT.  The sender of data
  keeps track of the oldest unacknowledged sequence number in the
  variable SND.UNA.  If the data flow is momentarily idle and all data
  sent has been acknowledged then the three variables will be equal.

  When the sender creates a segment and transmits it the sender advances
  SND.NXT.  When the receiver accepts a segment it advances RCV.NXT and
  sends an acknowledgment.  When the data sender receives an
  acknowledgment it advances SND.UNA.  The extent to which the values of
  these variables differ is a measure of the delay in the communication.
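
  The way the three variables move can be illustrated with a small
  Python sketch.  The Sender and Receiver classes are hypothetical, and
  the sketch makes simplifying assumptions:  segments arrive in order,
  there are no letters (so no end of letter adjustment), and sequence
  number wraparound is ignored.

```python
class Sender:
    def __init__(self, iss):
        self.snd_una = iss  # oldest unacknowledged sequence number
        self.snd_nxt = iss  # next sequence number to use

    def send(self, length):
        """Transmit `length` octets; return the segment's SEQ."""
        seq = self.snd_nxt
        self.snd_nxt += length
        return seq

    def on_ack(self, ack):
        """Advance SND.UNA if the ACK covers new data."""
        if self.snd_una < ack <= self.snd_nxt:
            self.snd_una = ack


class Receiver:
    def __init__(self, irs):
        self.rcv_nxt = irs  # next sequence number expected

    def accept(self, seq, length):
        """Accept an in-order segment; return the ACK value to send."""
        if seq == self.rcv_nxt:
            self.rcv_nxt += length
        return self.rcv_nxt
```

  After one 10 octet segment is sent from sequence number 100, SND.NXT
  is 110 while SND.UNA is still 100; once the acknowledgment returns,
  SND.UNA, SND.NXT, and RCV.NXT are all 110.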

  Normally the amount by which the variables are advanced is the length
  of the data in the segment.  However, when letters are used there are
  special provisions for coordinating the sequence numbers, the letter
  boundaries, and the receive buffer boundaries.

  End of Letter Sequence Number Adjustments

  There is provision in TCP for the receiver of data to optionally
  communicate to the sender of data on a connection at the time of the
  connection synchronization the receiver's buffer size.  If this is
  done the receiver must use this fixed size of buffers for the lifetime
  of the connection.  If a buffer size is communicated then there is a
  coordination between receive buffers, letters, and sequence numbers.

  Each time a buffer is completed either due to being filled or due to
  an end of letter, the sequence number is incremented through the end
  of that buffer.

  That is, whenever an EOL is transmitted, the sender advances its send
  sequence number, SND.NXT, by an amount sufficient to consume all the
  unused space in the receiver's buffer.  The amount of space consumed
  in this fashion is subtracted from the send window just as is the
  space consumed by actual data.

  And, whenever an EOL is received, the receiver advances its receive
  sequence number, RCV.NXT, by an amount sufficient to consume all the
  unused space in the receiver's buffer.  The amount of space consumed
  in this fashion is subtracted from the receive window just as is the
  space consumed by actual data.

    older sequence numbers                        newer sequence numbers

            |           Buffer 1            |   Buffer 2       
            |                               |                  
        ----+-------------------------------+----------------- 
            XXXXXXXXXXXXXXXXXXXXX+++++++++++                   
            |                    |          |                  
            |<-----SEG.LEN------>|          |                  
            |                    |          |                  
            |                    |          |                  
         SEG.SEQ                 A          B                  

                    XXX - data octets from segment 
                    +++ - phantom data             

                      <----- sequence space ----->

                        End of Letter Adjustment

                               Figure 17.

  In the case illustrated above, if the segment does not carry an EOL
  flag, the next value of SND.NXT or RCV.NXT will be A.  If it does
  carry an EOL flag, the next value will be B.

  The exchange of buffer size and sequencing information is done in
  units of octets.  If no buffer size is stated, then the buffer size is
  assumed to be 1 octet.  The receiver tells the sender the size of the
  buffer in a SYN segment that contains the 16 bit buffer size data in
  an option field in the TCP header.

  Each EOL advances the sequence number (SN) to the next buffer boundary:

    While LBB < SEG.SEQ+SEG.LEN
    Do LBB <- LBB + BS End
    SN <- LBB

    where LBB is the Last Buffer Beginning, and BS is the buffer size.
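
  The loop above translates directly into Python.  The function name
  advance_past_eol is hypothetical, and the sketch uses plain integers,
  so sequence number wraparound is not handled.

```python
def advance_past_eol(lbb, bs, seg_seq, seg_len):
    """Return the sequence number (SN) after an EOL.

    lbb     -- Last Buffer Beginning
    bs      -- buffer size in octets
    seg_seq -- sequence number of the first octet in the segment
    seg_len -- number of data octets in the segment
    """
    while lbb < seg_seq + seg_len:
        lbb += bs
    return lbb  # SN <- LBB
```

  For example, with a buffer size of 32 octets and a 21 octet segment
  starting at the buffer beginning, advance_past_eol(0, 32, 0, 21)
  gives 32, so 11 octets of phantom data are consumed.  With the
  default buffer size of 1 octet no phantom data is ever consumed.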

  The CLOSE user call implies an end of letter, as does the FIN control
  flag in an incoming segment.

  The Communication of Urgent Information

  The objective of the TCP urgent mechanism is to allow the sending user
  to stimulate the receiving user to accept some urgent data and to
  permit the receiving TCP to indicate to the receiving user when all
  the currently known urgent data has been received by the user.
  This mechanism permits a point in the data stream to be designated as
  the end of "urgent" information.  Whenever this point is in advance of
  the receive sequence number (RCV.NXT) at the receiving TCP, that TCP
  should tell the user to go into "urgent mode"; when the receive
  sequence number catches up to the urgent pointer, the TCP should tell
  the user to go into "normal mode".  If the urgent pointer is updated
  while the user is in "urgent mode", the update will be invisible to
  the user.

  The method employs an urgent field which is carried in all segments
  transmitted.  The URG control flag indicates that the urgent field is
  meaningful and should be added to the segment sequence number to yield
  the urgent pointer.  The absence of this flag indicates that the
  urgent pointer has not changed.

  To send an urgent indication the user must also send at least one data
  octet.  If the sending user also indicates end of letter, timely
  delivery of the urgent information to the destination process is
  enhanced.
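
  As an illustration, the urgent pointer computation and the resulting
  user mode can be sketched in Python.  The function names are
  hypothetical, and the comparison uses plain integers, so sequence
  number wraparound is ignored.

```python
def urgent_pointer(seg_seq, urgent_field):
    """Urgent pointer = segment sequence number + urgent field.

    Only meaningful when the URG control flag is set in the segment.
    """
    return seg_seq + urgent_field


def user_mode(rcv_nxt, urg_ptr):
    """Mode the TCP should tell the user to be in."""
    # Urgent while the pointer is in advance of RCV.NXT, normal once
    # RCV.NXT has caught up with it.
    return "urgent" if urg_ptr > rcv_nxt else "normal"
```

  A segment with SEQ=1000 and an urgent field of 20 gives an urgent
  pointer of 1020; the user stays in urgent mode until RCV.NXT reaches
  1020.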

  Managing the Window

  The window sent in each segment indicates the range of sequence
  numbers the sender of the window (the data receiver) is currently
  prepared to accept.  There is an assumption that this is related to
  the data buffer space currently available for this connection.  The
  window information is a guideline to be aimed at.

  Indicating a large window encourages transmissions.  If more data
  arrives than can be accepted, it will be discarded.  This will result
  in excessive retransmissions, adding unnecessarily to the load on the
  network and the TCPs.  Indicating a small window may restrict the
  transmission of data to the point of introducing a round trip delay
  between each new segment transmitted.

  The mechanisms provided allow a TCP to advertise a large window and to
  subsequently advertise a much smaller window without having accepted
  that much data.  This so-called "shrinking the window" is strongly
  discouraged.  The robustness principle dictates that TCPs will not
  shrink the window themselves, but will be prepared for such behavior
  on the part of other TCPs.

  The sending TCP must be prepared to accept from the user and send at
  least one octet of new data even if the send window is zero.  The
  sending TCP should regularly retransmit to the receiving TCP even when
  the window is zero.  Two minutes is recommended for the retransmission
  interval when the window is zero.  This retransmission is essential to
  guarantee that when either TCP has a zero window the re-opening of the
  window will be reliably reported to the other.

  The sending TCP packages the data to be transmitted into segments
  which fit the current window, and may repackage segments on the
  retransmission queue.  Such repackaging is not required, but may be
  helpful.
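
  A rough Python sketch of packaging data to fit the window follows.
  The function name package and the max_segment parameter are
  hypothetical; this section does not define a maximum segment size, so
  it is introduced here only to show data being split into more than
  one segment.

```python
def package(data: bytes, window: int, max_segment: int):
    """Split pending data into segments that fit the send window."""
    if max_segment <= 0:
        raise ValueError("max_segment must be positive")
    if not data:
        return []
    if window == 0:
        # Even with a zero window, one octet of new data is sent.
        return [data[:1]]
    sendable = data[:window]  # never exceed the advertised window
    return [sendable[i:i + max_segment]
            for i in range(0, len(sendable), max_segment)]
```

  With 10 octets pending, a window of 7, and segments of at most 3
  octets, package(b"abcdefghij", 7, 3) yields b"abc", b"def", and b"g";
  the remaining 3 octets wait for the window to open.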

  Users must keep reading connections they close for sending until the
  TCP reports that there is no more data.

  In a connection with a one-way data flow, the window information will
  be carried in acknowledgment segments that all have the same sequence
  number so there will be no way to reorder them if they arrive out of
  order.  This is not a serious problem, but it will allow the window
  information to be on occasion temporarily based on old reports from
  the data receiver.

3.8.  Interfaces

  There are of course two interfaces of concern:  the user/TCP interface
  and the TCP/IP interface.  We have a fairly elaborate model of the
  user/TCP interface, but only a sketch of the interface to the lower
  level protocol module.

  User/TCP Interface

    The functional description of user commands to the TCP is, at best,
    fictional, since every operating system will have different
    facilities.  Consequently, we must warn readers that different TCP
    implementations may have different user interfaces.  However, all
    TCPs must provide a certain minimum set of services to guarantee
    that all TCP implementations can support the same protocol
    hierarchy.  This section specifies the functional interfaces
    required of all TCP implementations.

    TCP User Commands

      The following sections functionally characterize a USER/TCP
      interface.  The notation used is similar to most procedure or
      function calls in high level languages, but this usage is not
      meant to rule out trap type service calls (e.g., SVCs, UUOs,
      EMTs).

      The user commands described below specify the basic functions the
      TCP must perform to support interprocess communication.
      Individual implementations should define their own exact format,
      and may provide combinations or subsets of the basic functions in
      single calls.  In particular, some implementations may wish to
      automatically OPEN a connection on the first SEND or RECEIVE
      issued by the user for a given connection.

      In providing interprocess communication facilities, the TCP must
      not only accept commands, but must also return information to the
      processes it serves.  The latter consists of:

        (a) general information about a connection (e.g., interrupts,
        remote close, binding of unspecified foreign socket).

        (b) replies to specific user commands indicating success or
        various types of failure.

      Open

        Format:  OPEN (local port, foreign socket, active/passive
        [, buffer size] [, timeout] [, precedence]
        [, security/compartment]) -> local connection name

        We assume that the local TCP is aware of the identity of the
        processes it serves and will check the authority of the process
        to use the connection specified.  Depending upon the
        implementation of the TCP, the local network and TCP identifiers
        for the source address will either be supplied by the TCP or by
        the processes that serve it (e.g., the program which interfaces
        the TCP network).  These considerations are the result of
        concern about security, to the extent that no TCP be able to
        masquerade as another one, and so on.  Similarly, no process can
        masquerade as another without the collusion of the TCP.

        If the active/passive flag is set to passive, then this is a
        call to LISTEN for an incoming connection.  A passive open may
        have either a fully specified foreign socket to wait for a
        particular connection or an unspecified foreign socket to wait
        for any call.  A fully specified passive call can be made active
        by the subsequent execution of a SEND.

        A full-duplex transmission control block (TCB) is created and
        partially filled in with data from the OPEN command parameters.

        On an active OPEN command, the TCP will begin the procedure to
        synchronize (i.e., establish) the connection at once.

        The buffer size, if present, indicates that the caller will
        always receive data from the connection in that size of buffers.
        This buffer size is a measure of the buffer between the user and
        the local TCP.  The buffer size between the two TCPs may be
        different.

        The timeout, if present, permits the caller to set up a timeout
        for all buffers transmitted on the connection.  If a buffer is
        not successfully delivered to the destination within the timeout
        period, the TCP will abort the connection.  The present global
        default is 30 seconds.  The buffer retransmission rate may vary;
        most likely, it will be related to the measured time for
        responses from the remote TCP.

        The TCP or some component of the operating system will verify
        the user's authority to open a connection with the specified
        precedence or security/compartment.  The absence of precedence
        or security/compartment specification in the OPEN call indicates
        the default values should be used.

        TCP will accept incoming requests as matching only if the
        security/compartment information is exactly the same and only if
        the precedence is equal to or higher than the precedence
        requested in the OPEN call.

        The precedence for the connection is the higher of the values
        requested in the OPEN call and received from the incoming
        request, and fixed at that value for the life of the connection.

        Depending on the TCP implementation, either a local connection
        name will be returned to the user by the TCP, or the user will
        specify this local connection name (in which case another
        parameter is needed in the call).  The local connection name can
        then be used as a shorthand term for the connection defined by
        the <local socket, foreign socket> pair.

      Send

        Format:  SEND(local connection name, buffer address, byte count,
        EOL flag, URGENT flag [, timeout])

        This call causes the data contained in the indicated user buffer
        to be sent on the indicated connection.  If the connection has
        not been opened, the SEND is considered an error.  Some
        implementations may allow users to SEND first; in which case, an
        automatic OPEN would be done.  If the calling process is not
        authorized to use this connection, an error is returned.

        If the EOL flag is set, the data is the End Of a Letter, and the
        EOL bit will be set in the last TCP segment created from the

