RFC 1037

NFILE - a file access protocol

Pages: 86
Historic

Part 3 of 3 – Pages 53 to 86

noToC RFC1037 - Page 53 prevText

9.  NFILE RESYNCHRONIZATION PROCEDURE

   Ordinarily, the user side sends NFILE commands to the server side
   over the control connection; the server side responds to every user
   command, and file data is transmitted over the data channels.  This
   section describes a resynchronization procedure that takes place when
   something disturbs the usual course of events.

   First, if the server side aborts while sending or receiving data,
   nothing can be done to salvage the connection between the two hosts.
   The control connection and any data channels associated with this
   connection are broken.  This happens rarely, if at all.

   It is not unusual for the user side to abort file operations, either
   commands or data transfer.  On a Symbolics computer, the user can do
   this by pressing CONTROL-ABORT.  An important aspect of any file
   protocol is the way it handles the situation when the user side
   aborts file operations.

   An NFILE user side reacts to user side aborts by immediately marking
   the connection unsafe.  When a control connection is unsafe, it must
   be resynchronized before it can be used again.  Data channels can
   also be marked unsafe, and must also be resynchronized before further
   use.  The resynchronization process rids the connection (whether
   control or data connection) of bytes of data that are now unwanted,
   and thus cleans up the channel so it can be used again.

   The resynchronization procedure is somewhat complex, but it fulfills
   a genuine need.  For those interested, a brief design discussion is
   included as note <3>.

9.1  NFILE Control Connection Resynchronization

   NFILE requires any unsafe control connection to undergo a
   resynchronization procedure before further use.  Therefore, the
   resynchronization does not necessarily occur immediately after the
   control connection is marked unsafe.  The user side initiates the
   control connection resynchronization when another operation on the
   control connection is attempted.

   A "mark" is defined in the context of Byte Stream with Mark:  See the
   section "Discussion of Byte Stream with Mark", section 12.1.

   USER SIDE STEPS:  CONTROL CONNECTION RESYNCHRONIZATION

       1. The user side sends a mark over the control connection to
          the server.

       2. The user side sends the ASCII characters USER-RESYNC-DUMMY
          (as a data token) to the server.

       3. The user side sends a second mark to the server.

       4. The user side declares the control connection safe (at the
          token list level).

       5. The user side generates and sends a unique data token to
          the server.

       6. The user side then waits, expecting to detect a mark
          followed by the unique data token.  The user side reads and
          discards all tokens and marks until the desired match is
          found.

   Once the user side detects the mark and unique data token, the
   control connection has been fully resynchronized, and can be used
   again.


   SERVER SIDE STEPS:  CONTROL CONNECTION RESYNCHRONIZATION

        1. The server side detects a mark.  The server is thus alerted
           that the control connection is unsafe, and that
           resynchronization is in progress.

        2. The server continues to read data coming from the user side
           until it detects the second mark, and the token following
           it.

        3. The server checks to see if the token following the mark is
           USER-RESYNC-DUMMY.  This rare situation occurs if the user
           aborts during the course of the resynchronization itself.
           If so, the server side discards the USER-RESYNC-DUMMY
           token.  The control connection is still unsafe, and the
           user side restarts the resynchronization procedure; the
           server side therefore begins at Step 2 again.

        4. If the token following the mark is not USER-RESYNC-DUMMY
           (this is the expected circumstance), the server should have
           received a single data token that is the unique data token
           generated by the user side.

               a. The server sends a mark to the user side.

               b. The server declares the control connection safe (at
                  the token list level).

               c. The server sends the unique data token to the user
                  side.

        5. If the server detects something following the mark that was
           neither USER-RESYNC-DUMMY nor a single data token, a
           protocol error has occurred.
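
   The two step lists above can be sketched as follows.  This is a
   minimal Python illustration, not part of NFILE.  The MARK object
   stands in for a Byte Stream with Mark mark, data tokens are modeled
   as Python strings, and the names user_resync_items, server_resync,
   and user_await_reply are hypothetical.  As a simplification, the
   server loop folds steps 1 through 3 into one scan: a
   USER-RESYNC-DUMMY that follows a mark is discarded, and the first
   other data token that follows a mark is taken as the unique token.

```python
import itertools

MARK = object()                 # stand-in for a mark; not a real wire value
DUMMY = "USER-RESYNC-DUMMY"
_serial = itertools.count(1)


def user_resync_items():
    """User steps 1-5: return (items to send, unique token)."""
    unique = "resync-%d" % next(_serial)   # assumed format; any unique token works
    return [MARK, DUMMY, MARK, unique], unique


def server_resync(incoming):
    """Server steps 1-5: scan incoming items and return the reply items."""
    after_mark = False
    for item in incoming:
        if item is MARK:
            after_mark = True
        elif not after_mark:
            continue                        # unwanted data; discard
        elif item == DUMMY:
            after_mark = False              # step 3: discard, keep scanning
        elif isinstance(item, str):
            return [MARK, item]             # steps 4a and 4c
        else:
            raise ValueError("protocol error: expected a single data token")
    raise ValueError("stream ended before resynchronization completed")


def user_await_reply(reply_items, unique):
    """User step 6: discard items until a mark is followed by the unique token."""
    previous_was_mark = False
    for item in reply_items:
        if previous_was_mark and item == unique:
            return True
        previous_was_mark = item is MARK
    return False
```

   Feeding the output of user_resync_items through server_resync and
   then through user_await_reply exercises the whole exchange, with any
   leftover items from the aborted operation discarded on both sides.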

9.2  NFILE Data Connection Resynchronization

   The NFILE data channel resynchronization procedure is similar to the
   NFILE control connection resynchronization.  Both procedures are
   based on a mark signalling the unsafe condition, then a second mark
   followed by a unique identifier.  One important difference between
   the two procedures is the circumstances in which they occur.  Control
   connections are put into unsafe states only when the user aborts
   during control connection I/O operations.  Data channels are made
   unsafe by a larger set of circumstances:

       - User aborts occur during the file protocol operations that
         assign and deassign data channels.  This is the most common
         cause of data channels becoming unsafe.

       - A server receives a CLOSE command (with abort-p supplied as
         Boolean truth) specifying an open file that has not finished
         transmitting data.  That is, file reading is aborted.

       - The ABORT command is issued, causing data channels to be
         made unsafe.

       - The FILEPOS command is issued, causing the input data
         channel to become unsafe.

   The resynchronization clears the data channel of unwanted data from
   aborted operations and puts the data channel in a known state.  The
   data channel resynchronization procedure is invoked when the user
   side gives the RESYNCHRONIZE-DATA-CHANNEL command over the control
   connection.

   The following policies can be used to improve response time, but are
   not required by the NFILE protocol:  The user side can initiate
   resynchronization only if it needs the data channel, having first
   tried to use a free data channel that does not require
   resynchronization.  Also, the user side can periodically
   resynchronize all unsafe data channels.
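
   The first of these optional policies might look like the sketch
   below.  The DataChannel record, the pick_channel function, and the
   resynchronize callback are all hypothetical names chosen for
   illustration; the callback stands for whatever issues the
   RESYNCHRONIZE-DATA-CHANNEL command and completes the procedure.

```python
from dataclasses import dataclass


@dataclass
class DataChannel:
    handle: str
    in_use: bool = False
    unsafe: bool = False


def pick_channel(channels, resynchronize):
    """Return a free channel, resynchronizing only if no safe one is free."""
    free = [c for c in channels if not c.in_use]
    for channel in free:
        if not channel.unsafe:
            return channel                 # a safe free channel costs nothing
    for channel in free:
        resynchronize(channel)             # pay the resynchronization cost now
        channel.unsafe = False
        return channel
    return None                            # nothing free; caller must wait or open one
```

   A periodic background task could call the same resynchronize
   callback on every free unsafe channel to implement the second
   policy.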

   In giving the RESYNCHRONIZE-DATA-CHANNEL command, the user side
   indicates which data channel should be resynchronized.  Data channels
   are unidirectional, which means that depending on the direction
   (either input or output) of the data channel, either the user side or
   the server side sends the resynchronization data.  This is another
   difference from the resynchronization of the control connection, in
   which the resynchronization data is always sent by the user side.
   The resynchronization steps for input data channels are different
   from the steps for output data channels.

   INPUT DATA CHANNEL RESYNCHRONIZATION

      1. The user side gives the RESYNCHRONIZE-DATA-CHANNEL command
         on the control connection, with only one argument, the
         handle of the data channel to be resynchronized.

      2. The server side of the data channel generates a unique
         identifier, and sends that data token in its regular
         command response to the user side.

      3. The server side sends a mark over the data channel.

      4. The server side sends the unique identifier token over the
         data channel.

      5. The user side reads until it detects a mark followed by the
         unique identifier token.  The resynchronization is then
         complete.  The data channel is no longer in an unsafe
         state.

   OUTPUT DATA CHANNEL RESYNCHRONIZATION

      1. The user side gives the RESYNCHRONIZE-DATA-CHANNEL command
         on the control connection, with two arguments: the handle
         of the data channel to be resynchronized, and a unique
         identifier that it has just generated.

      2. The user side of the data channel sends a mark.

      3. The user side of the data channel sends a dummy identifier
         token.  The dummy identifier can be any token that the
         server could not interpret as being the unique identifier.
         One suggestion is the data token DUMMY-IDENTIFIER.

      4. The server side of the data channel was alerted by the
         RESYNCHRONIZE-DATA-CHANNEL command that resynchronization
         is in progress.  The server side now reads the data,
         seeking the first mark.

      5. The server side reads and discards the first mark and the
         dummy identifier.

      6. The user side sends a second mark.

      7. The user side sends the unique identifier.

      8. The server side recognizes the mark and the unique
         identifier that follows, and the resynchronization is
         complete.  The data channel is no longer in the unsafe
         state.
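
   Both data channel procedures end with the reading side scanning for
   a mark followed by the unique identifier.  The sketch below models
   that scan in Python under stated assumptions: MARK is a stand-in
   object for a mark, identifiers are Python strings, and the function
   names are hypothetical.  The command exchange on the control
   connection is not modeled.

```python
MARK = object()          # stand-in for a mark on the data channel


def input_channel_items(unique_id):
    """Items the server sends on an input data channel (steps 3 and 4)."""
    return [MARK, unique_id]


def output_channel_items(unique_id, dummy="DUMMY-IDENTIFIER"):
    """Items the user sends on an output data channel (steps 2, 3, 6, 7)."""
    if dummy == unique_id:
        raise ValueError("dummy must not be mistaken for the unique identifier")
    return [MARK, dummy, MARK, unique_id]


def read_until_resynchronized(channel_items, unique_id):
    """Discard items until a mark is directly followed by unique_id.

    Returns the items that follow the identifier, that is, the first
    data belonging to the next operation on the now-safe channel.
    """
    items = iter(channel_items)
    previous_was_mark = False
    for item in items:
        if previous_was_mark and item == unique_id:
            return list(items)
        previous_was_mark = item is MARK
    raise ValueError("channel ended before the unique identifier was seen")
```

   The same reader serves the user side of an input channel and the
   server side of an output channel; only the sender of the marks
   differs.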

10.  NFILE ERRORS AND NOTIFICATIONS

   NFILE recognizes two types of errors:  command response errors and
   asynchronous errors.  In addition to errors, NFILE supports
   notifications.

   Command response errors:

       - Signify an error that prevented the successful completion of
         the command; when such an error occurs, a command response
         error is sent instead of a normal command response.
       - Occur frequently in normal operations

   Asynchronous errors:

       - Are not related to any specific command
       - Are associated with an erring data channel
       - Typically indicate a problem in the transfer, such as
         running out of disk space or allocation, or an unreadable
         disk record
       - Occur rarely in normal operations

   Notifications:

       - Are not associated with an error
       - Are sent at the server's discretion
       - Provide general information, such as a warning that the
         system is going down

10.1  Notifications From the NFILE Server

   The NFILE server can send asynchronous notifications to the user side
   over the control connection.  The text of the notification contains
   information of interest to the person using NFILE, such as a warning
   that the server's operating system will be going down soon.
   Notifications can come from the server side at any time that the
   server is not sending something else.

   The format of NFILE notifications is:

             (NOTIFICATION "" text)

   The empty string "" takes the place of a transaction identifier.
   Notifications are initiated by the server, and are not associated
   with any transaction originated by the user side.

10.2  NFILE Command Response Errors

   When an error prevents the successful completion of an NFILE command,
   a command response error is sent instead of the normal command
   response.  A normal command response indicates success; a command
   response error indicates failure of the command.

   NFILE command response errors are sent from the server to the user
   across the control connection as top-level token lists, in this
   format:

             (ERROR tid three-letter-code error-vars message)

   ERROR is a keyword.  The tid is the transaction identifier of the
   command that encountered this error.  The arguments three-letter-
   code, error-vars, and message are all required.

   The three-letter-code provides the information on what kind of an
   error was encountered.  For a table of the three-letter codes and
   their meanings:  See the section "NFILE Three-letter Error Codes",
   section 10.4.

   message is a string that is displayed to the human user of the
   protocol.

   error-vars is a keyword/value list.  The three possible keywords are:
   PATHNAME, OPERATION, and NEW-PATHNAME.  Before transmitting an error,
   the server looks at the type of error to see if it can easily
   determine the value of any of the keywords.  If so, the server
   includes the keyword/value pair in its error.  If not, the
   keyword/value pair is omitted.  The value associated with OPERATION
   is the keyword naming the NFILE command that failed.  The values
   associated with PATHNAME and NEW-PATHNAME are strings in the full
   pathname syntax of the server host.

   For example, suppose the server on a file system with hierarchical
   directories could not access a file because its containing directory
   did not exist.  The command error response would use the PATHNAME
   keyword to indicate the first directory level that did not exist,
   instead of the full pathname which was supplied as the command
   argument.  This gives the user side valuable information that it
   otherwise would not have known.
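
   A user side might unpack a command response error as sketched
   below.  This assumes the token list has already been decoded into a
   Python list in which keywords and the three-letter code appear as
   plain strings; the CommandError record and parse_command_error
   function are hypothetical names, not part of NFILE.

```python
from dataclasses import dataclass

ERROR_VAR_KEYWORDS = {"PATHNAME", "OPERATION", "NEW-PATHNAME"}


@dataclass
class CommandError:
    tid: str
    code: str
    error_vars: dict
    message: str


def parse_command_error(token_list):
    """Turn a decoded (ERROR tid code error-vars message) list into a record."""
    if len(token_list) != 5 or token_list[0] != "ERROR":
        raise ValueError("not a command response error")
    _, tid, code, raw_vars, message = token_list
    if len(raw_vars) % 2:
        raise ValueError("error-vars must be keyword/value pairs")
    # Pair up alternating keywords and values; omitted pairs are simply absent.
    error_vars = dict(zip(raw_vars[0::2], raw_vars[1::2]))
    unknown = set(error_vars) - ERROR_VAR_KEYWORDS
    if unknown:
        raise ValueError("unexpected error-vars keywords: %s" % sorted(unknown))
    return CommandError(tid, code, error_vars, message)
```

   For the missing-directory case described above, a made-up response
   such as ["ERROR", "t105", "DNF", ["PATHNAME", "/usr/max"],
   "Directory not found"] would yield a record whose
   error_vars["PATHNAME"] names the first directory level that does
   not exist.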

10.3  NFILE Asynchronous Errors

   When a data channel process, in either direction, encounters an error
   condition, the server sends an asynchronous error description. An
   asynchronous error description consists of a top-level token list.
   Typically, asynchronous errors indicate error conditions in the
   transfer, such as running out of disk space or allocation, or an
   unreadable disk record.

   The format of asynchronous error descriptions is:

         (ASYNC-ERROR handle three-letter-code error-vars message)

   ASYNC-ERROR is a keyword.  The handle argument identifies the erring
   data channel.  The arguments three-letter-code, error-vars, and
   message are all required.  Their meanings are the same as in NFILE
   command error responses: See the section "NFILE Command Response
   Errors", section 10.2.

   When the server detects an asynchronous error on an input data
   channel, the server sends an asynchronous error description on that
   data channel itself.  When an asynchronous error occurs on an output
   data channel, the asynchronous error description is sent on the
   control connection.

   Some asynchronous errors are restartable.  In this context,
   restartable means it makes sense to try to resume the operation.  One
   example of a restartable error is an attempt to write a file to a
   file system that is out of room.  The server side indicates whether
   an asynchronous error is restartable by prepending the keyword
   RESTARTABLE and the associated value Boolean truth to the error-vars
   list.  To proceed from a restartable error, the user side sends a
   CONTINUE command over the control connection.

   On any asynchronous error, either input or output, the data channel
   on the server side enters an "asynchronous error outstanding" state.
   The server can exit that state in one of two ways:  by receiving a
   CONTINUE command or a CLOSE command with the abort-p argument
   supplied as Boolean truth.

   On a normal CLOSE (not a close-abort), the server side checks the
   channel it was requested to close.  If an asynchronous error
   description has been sent on the data channel, but not yet processed
   by CONTINUE, the server side does not close the channel, but sends a
   command error response.  The same thing happens on a FINISH command
   received on a channel that has an asynchronous error pending.  In
   both cases, the three-letter code included in the command error
   response is EPC, for Error Pending on Channel.
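
   The server-side rules in this section can be summarized as a small
   state machine.  The sketch below is illustrative only: the
   ServerDataChannel class and its method names are hypothetical,
   token lists are modeled as Python lists with keywords as strings,
   and None stands in for a normal command response, whose format is
   outside the scope of this section.

```python
class ServerDataChannel:
    """Tracks the "asynchronous error outstanding" state of one channel."""

    def __init__(self, handle, direction):
        self.handle = handle
        self.direction = direction          # "input" or "output"
        self.error_outstanding = False
        self.is_open = True

    def async_error(self, code, message, restartable=False):
        """Enter the error state; return (where to send, error description)."""
        self.error_outstanding = True
        error_vars = ["RESTARTABLE", True] if restartable else []
        description = ["ASYNC-ERROR", self.handle, code, error_vars, message]
        # Input channel errors travel on the data channel itself;
        # output channel errors travel on the control connection.
        route = "data-channel" if self.direction == "input" else "control"
        return route, description

    def command(self, name, tid, abort_p=False):
        """Apply CONTINUE, CLOSE, or FINISH; return an ERROR list or None."""
        if name not in ("CONTINUE", "CLOSE", "FINISH"):
            raise ValueError("command not modeled in this sketch: %s" % name)
        if name == "CONTINUE":
            self.error_outstanding = False
            return None
        if name == "CLOSE" and abort_p:
            self.error_outstanding = False
            self.is_open = False
            return None
        if self.error_outstanding:          # normal CLOSE or FINISH is refused
            return ["ERROR", tid, "EPC", ["OPERATION", name],
                    "Error pending on channel"]
        if name == "CLOSE":
            self.is_open = False
        return None
```

   The sketch lets CONTINUE clear any outstanding error; a fuller
   model would also check that the error was marked RESTARTABLE before
   resuming the transfer.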

10.4  NFILE Three-letter Error Codes

   Usually the server's operating system provides some description of an
   error that occurs.  NFILE has a mechanism for conveying that
   information to the user side.  Upon detecting an error, the NFILE
   server should characterize the error by choosing the three-letter
   code that best describes the error.  The three-letter code is an
   argument in both the command response error and asynchronous error
   messages from the server to the user.

   Each of the NFILE three-letter codes represents some system error.
   The set of codes enables all operating systems to use one error-
   reporting mechanism.  Some operating systems will never encounter
   certain of the error conditions.

   Some errors fit logically into two error codes.  For example, suppose
   the server could not delete a file because the file was not found.
   This error could be considered either CDF (Cannot Delete File) or FNF
   (File Not Found).  In this case, File Not Found gives more specific
   and valuable information than Cannot Delete File.  Since the protocol
   does not allow more than one error code to be reported when an error
   occurs, the server must choose the most appropriate error code, given
   the information available to it from the operating system.

   This is the set of three-letter codes:

     ACC   Access error.  This indicates a protection-violation error.

     ATD   Incorrect access to directory.  A directory could not be
           accessed because the user's access rights to it did not
           permit this type of access.

     ATF   Incorrect access to file.  A file could not be accessed
           because the user's access rights to it did not permit this
           type of access.

     BUG   File system bug.  This includes all protocol violations
           detected by the server, as well as by the host file system.

     CCD   Cannot create directory.  An error occurred in attempting to
           create a directory.

     CDF   Cannot delete file.  The file system reported that it cannot
           delete a file.

     CCL   Cannot create link.  An error occurred in attempting to
           create a link.

     CIR   Circular link.  An operation was attempted on a pathname that
           designates a link that eventually links back to itself.

     CRF   Cannot rename file.  An error occurred in attempting to
           rename a file.

     CSP   Cannot set property.  An error occurred in attempting to
           change the properties of a file.  This could mean that you
           tried to set a property that only the file system is allowed
           to set, or a property that is not defined on this type of
           file system.

     DAE   Directory already exists.  A directory could not be created
           because a directory or file of this name already exists.

     DAT   Data error.  The file system contains unreadable data.  This
           could mean data errors detected by hardware or inconsistent
           data inside the file system.

     DEV   Device not found.  The device of the file was not found or
           does not exist.

     DND   "Do Not Delete" flag set.  An attempt was made to delete a
           file that is marked by a "Do Not Delete" flag.

     DNE   Directory not empty.  An invalid deletion of a nonempty
           directory was attempted.

     DNF   Directory not found.  The directory was not found or does not
           exist.  This refers specifically to the containing directory;
           if you are trying to access a directory, and the actual
           directory you are trying to access is not found, FNF (for
           File Not Found) should be indicated instead.

     EPC   Error pending on channel.  The server cannot close or finish
           the channel because an asynchronous error is pending on it.

     FAE   File already exists.  The file could not be created because a
           file or directory of this name already exists.

     FNF   File not found.  The file was not found in the containing
           directory.  The TOPS-20 and TENEX "no such file type" and "no
           such file version" errors should also report this condition.

     FOO   File open for output.  Opening a file that was already opened
           for output was attempted.

     FOR   Filepos out of range.  Setting the file pointer past the
           end-of-file position or to a negative position was attempted.

     FTB   File too big.  File is larger than the maximum file size
           supported by the file system.

     HNA   Host not available.  The file server or file system is
           intentionally denying service to the user.  This does not
           mean that the network connection failed; it means that the
           file system is explicitly not available.

     IBS   Invalid byte size.  The value of the "byte size" option was
           not valid.

     ICO   Inconsistent options.  Some of the options given in this
           operation are inconsistent with others.

     IOD   Invalid operation for directory.  The specified operation is
           invalid for directories, and the given pathname specifies a
           directory, in directory pathname as file format.

     IOL   Invalid operation for link.  The specified operation is
           invalid for links, and this pathname is the name of a link.

     IP?   Invalid password.  The specified password was invalid.

     IPS   Invalid pathname syntax.  This includes all invalid pathname
           syntax errors.

     IPV   Invalid property value.  The new value provided for the
           property is invalid.

     IWC   Invalid wildcard.  The pathname is not a valid wildcard
           pathname.

     LCK   File locked.  The file is locked.  It cannot be accessed,
           possibly because it is in use by some other process.

     LIP   Login problems.  A problem was encountered while trying to
           log in to the file system.

     MSC   Miscellaneous problems.

     NAV   Not available.  The file or device exists but is not
           available.  Typically, the disk pack is not mounted on a
           drive, the drive is broken, or the like.  Operator
           intervention is probably required to fix the problem, but
           retrying the operation is likely to succeed after the problem
           is solved.

     NER   Not enough resources.  For example, a system limit on the
           number of open files or network connections has been reached.

     NET   Network problem.  The file server had some sort of trouble
           trying to create a new data connection, or perform some other
           network operation, and was unable to do so.

     NFS   No file system.  The file system was not available.  For
           example, this host does not have any file systems, or this
           host's file system cannot be initialized or accessed for some
           reason, or the file system simply does not exist.

     NLI   Not logged in.  A file operation was attempted before logging
           in.  Normally the file system interface always logs in before
           doing any operation, but this problem can occur in certain
           unusual cases in which logging in has been aborted.


     NMR   No more room.  The file system is out of room.  This can mean
           any of several things:

                      - The entire file system is full.
                      - The particular volume involved is full.
                      - The particular directory involved is full.
                      - The user's allocated quota has been exceeded.

     RAD   Rename across directories.  The devices or directories of the
           initial and target pathnames are not the same, but on this
           file system they are required to be.

     REF   Rename to existing file.  The target name of a rename
           operation is the name of a file that already exists.

     UKC   Unknown operation. An unsupported file system operation was
           attempted, or an unsupported command was attempted.

     UKP   Unknown property.  The property is unknown.

     UNK   Unknown user.  The specified user name is unknown to this
           host.

     UUO   Unimplemented option.  An option to a command is not
           implemented.

     WKF   Wrong kind of file.  This includes errors in which an invalid
           operation for a file, directory, or link was attempted.

     WNA   Wildcard not allowed.
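
   The rule of reporting the single most specific code can be sketched
   as a two-level lookup.  Everything in this example is an assumption
   made for illustration: the mapping from POSIX errno values to
   three-letter codes is not defined by NFILE, the operation names
   other than DELETE are assumed, and a real server would refine the
   choice further (for example, reporting DNF rather than FNF when it
   is the containing directory that is missing).

```python
import errno

# Assumed mapping from POSIX errno values to the more specific codes.
SPECIFIC_CODES = {
    errno.ENOENT: "FNF",
    errno.EACCES: "ACC",
    errno.ENOSPC: "NMR",
    errno.ENOTEMPTY: "DNE",
    errno.EFBIG: "FTB",
    errno.EMFILE: "NER",
}

# Generic "cannot do X" codes, keyed by (assumed) operation name.
GENERIC_CODES = {
    "DELETE": "CDF",
    "RENAME": "CRF",
    "CREATE-DIRECTORY": "CCD",
    "CREATE-LINK": "CCL",
}


def choose_error_code(operation, os_errno):
    """Prefer a specific code; fall back to the operation's generic code."""
    specific = SPECIFIC_CODES.get(os_errno)
    if specific is not None:
        return specific
    return GENERIC_CODES.get(operation, "MSC")
```

   With this lookup, a failed DELETE whose underlying cause is a
   missing file reports FNF, while a DELETE that fails for a reason the
   table does not recognize falls back to CDF.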

11.  TOKEN LIST TRANSPORT LAYER

   PURPOSE:  The Token List Transport Layer is a protocol that
   facilitates the transmission of simple structured data, such as
   lists.

11.1  Introduction to the Token List Transport Layer

   The Token List Transport Layer is a general-purpose protocol.  The
   Token List Transport Layer sends "tokens" through its underlying
   stream.  Each token usually represents a simple quantity, such as a
   string or integer.

   Tokens can be organized into "token lists".  Special tokens are
   provided to denote the starting and ending point of lists.  The token
   list transport layer differentiates between "top-level token lists",
   which are not contained in other lists, and "embedded token lists",
   which are contained in other lists.  Using lists makes it convenient
   to send structured records, such as commands and command responses of
   the client protocol.  The top-level token lists provide robustness.

   The Token List Transport Layer is a general term that includes two
   separate but related subjects:  the "token list stream" and the
   "token list data stream".  The token list stream is commonly used for
   applications that can easily organize the information to be
   transmitted into tokens and lists.  The token list data stream is
   more appropriate for transmitting a large volume of data that cannot
   easily be structured into tokens and lists, such as file data, which
   is simply a sequence of characters or bytes.

   The following table illustrates the main differences between token
   list streams and token list data streams:

                     Token List Data Stream      Token List Stream
                     ----------------------      -----------------

     Built on:     token list stream           Byte Stream with Mark

     Transmits:    stream data                 tokens, token lists

     Example
     of use:       NFILE data channels         NFILE control
                                               connection

   NFILE uses the Token List Transport Layer, and provides an
   excellent example of its usefulness.  The NFILE commands and command
   responses are sent over the control connection in a token list
   stream.  File data is sent across each data channel in a token list
   data stream.

11.2  Token List Stream

11.2.1  Types of Tokens and Token Lists

   All numbers in the token list documentation are represented in
   decimal notation.  Bytes are 8 bits long.

   TYPES OF TOKENS

   Tokens are of the following types:

            1. Atomic tokens.

               Atomic tokens are of the following subtypes:

              - Data tokens.  A data token consists of a sequence of
                bytes with an effectively infinite maximum length.  In
                some contexts a data token represents a string; in
                other contexts, a data token is other arbitrary data.

                Each data token is preceded in the token list stream
                by a representation of its length in bytes.

                Data tokens that are under 200 bytes long are preceded
                by one byte containing their length in bytes.  That
                is, a data token of 34 bytes is preceded by one byte
                of value 34.

                Data tokens 200 bytes or over are preceded by the byte
                known as PUNCTUATION-LONG, of value 201.  After the
                201 comes a four-byte-long number (least significant
                byte first) containing the length of the data token
                that follows.

              - Numeric tokens.  A sequence of bytes that represent
                and encode a nonnegative binary integer.  The largest
                valid integer is 2^63 - 1.

                Numeric tokens are either short integers (less than
                256) or long integers (greater than or equal to 256).
                Short integers are preceded by the byte known as
                PUNCTUATION-SHORT-INTEGER, of value 206.

                Long integers are begun by PUNCTUATION-LONG-INTEGER,
                of value 207.  One byte follows, containing the length
                (in bytes) of the long integer.  The integer itself is
                next, least significant byte first.

              - Keyword tokens.  A sequence of bytes that represent
                and encode a named identifier of the implemented
                protocol.  Keyword tokens are used by the client
                protocol to convey a name; the only significance of a
                keyword token is in its name.

                Each keyword is preceded by the byte known as
                PUNCTUATION-KEYWORD, of value 208.  The data token
                following PUNCTUATION-KEYWORD represents the name of
                the keyword as a string.  The characters are in
                upper-case standard ASCII.

              - Boolean truth.  A special token that represents the
                Boolean truth value.  This token is known as
                BOOLEAN-TRUTH, of value 209 <4>.

            2. Control tokens.

               The token list stream supports four control tokens to
               delimit token lists, and one padding token.

               TOP-LEVEL-LIST-BEGIN  202   This control token
                                           appears at the start of
                                           each top-level token list.

               TOP-LEVEL-LIST-END    203   This control token
                                           appears at the end of
                                           each top-level token list.

               LIST-BEGIN            204   This control token
                                           appears at the start of
                                           each embedded token list.

               LIST-END              205   This control token
                                           appears at the end of
                                           each embedded token list.

               PUNCTUATION-PAD       200   This padding token should
                                           be ignored by the token
                                           list stream.  It can be
                                           sent to fill buffers.
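
   The byte values above are enough to sketch an encoder.  The Python
   below is an illustration under stated assumptions, not a reference
   implementation: the Keyword class and function names are
   hypothetical, strings are encoded as ASCII, a short integer is
   assumed to be a single value byte after PUNCTUATION-SHORT-INTEGER,
   and long integers are written with the fewest bytes that hold the
   value.

```python
from dataclasses import dataclass

PAD, LONG = 200, 201
TOP_BEGIN, TOP_END, LIST_BEGIN, LIST_END = 202, 203, 204, 205
SHORT_INT, LONG_INT, KEYWORD, TRUTH = 206, 207, 208, 209


@dataclass(frozen=True)
class Keyword:
    name: str


def encode_data(data):
    """Data token: one length byte if under 200 bytes, else 201 + 4-byte length."""
    if len(data) < 200:
        return bytes([len(data)]) + data
    return bytes([LONG]) + len(data).to_bytes(4, "little") + data


def encode(value):
    """Encode one atomic token or one embedded token list."""
    if value is True:
        return bytes([TRUTH])
    if isinstance(value, Keyword):
        return bytes([KEYWORD]) + encode_data(value.name.upper().encode("ascii"))
    if isinstance(value, str):
        return encode_data(value.encode("ascii"))
    if isinstance(value, bytes):
        return encode_data(value)
    if isinstance(value, int) and not isinstance(value, bool):
        if not 0 <= value < 2 ** 63:
            raise ValueError("numeric tokens must be in 0 .. 2^63 - 1")
        if value < 256:
            return bytes([SHORT_INT, value])
        body = value.to_bytes((value.bit_length() + 7) // 8, "little")
        return bytes([LONG_INT, len(body)]) + body
    if isinstance(value, list):
        inner = b"".join(encode(item) for item in value)
        return bytes([LIST_BEGIN]) + inner + bytes([LIST_END])
    raise TypeError("cannot encode %r as a token" % (value,))


def encode_top_level(items):
    """Wrap the encoded items in TOP-LEVEL-LIST-BEGIN and TOP-LEVEL-LIST-END."""
    inner = b"".join(encode(item) for item in items)
    return bytes([TOP_BEGIN]) + inner + bytes([TOP_END])
```

   For example, a notification of the form (NOTIFICATION "" text)
   could be produced with encode_top_level([Keyword("NOTIFICATION"),
   "", "going down"]); the empty string becomes a data token whose
   length byte is 0.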

   TOKEN LISTS

   A token list consists of a sequence of atomic tokens or token lists.
   Token lists are begun and ended by control tokens that delimit the
   token lists.  There are three types of token lists:

         1. Top-level token lists.

            Top-level token lists begin with TOP-LEVEL-LIST-BEGIN and
            end with TOP-LEVEL-LIST-END.  Top-level token lists are not
            contained in other lists.

         2. Embedded token lists.

            These token lists occur inside other token lists.  They
            begin with LIST-BEGIN and end with LIST-END.

         3. The empty token list.

            This is a special example of the embedded token list.  In
            some contexts, the empty token list represents Boolean
            falsity.  An embedded empty token list is composed of a
            LIST-BEGIN followed immediately by a LIST-END.  A top-level
            empty token list is composed of TOP-LEVEL-LIST-BEGIN
            followed immediately by TOP-LEVEL-LIST-END.
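
   Reading a token list stream amounts to tracking which lists are
   currently open.  The following Python sketch decodes a complete
   buffer into nested Python lists.  It is a hedged illustration: the
   Keyword class and function names are hypothetical, a short integer
   is assumed to occupy one byte after PUNCTUATION-SHORT-INTEGER, data
   tokens are returned as raw bytes, and truncated input is not
   handled gracefully.

```python
from dataclasses import dataclass

PAD, LONG = 200, 201
TOP_BEGIN, TOP_END, LIST_BEGIN, LIST_END = 202, 203, 204, 205
SHORT_INT, LONG_INT, KEYWORD, TRUTH = 206, 207, 208, 209


@dataclass(frozen=True)
class Keyword:
    name: str


def _read_data(buf, i):
    """Read one data token starting at its length prefix; return (bytes, next)."""
    n = buf[i]
    i += 1
    if n == LONG:
        n = int.from_bytes(buf[i:i + 4], "little")
        i += 4
    elif n >= 200:
        raise ValueError("byte %d cannot start a data token" % n)
    return buf[i:i + n], i + n


def decode(buf):
    """Return the top-level items (token lists or loose tokens) in buf."""
    out, stack, i = [], [], 0

    def emit(value):
        (stack[-1] if stack else out).append(value)

    while i < len(buf):
        b = buf[i]
        i += 1
        if b == PAD:
            continue                        # padding is ignored between tokens
        if b in (TOP_BEGIN, LIST_BEGIN):
            if (b == TOP_BEGIN) == bool(stack):
                raise ValueError("list delimiter at the wrong nesting level")
            stack.append([])
        elif b in (TOP_END, LIST_END):
            if len(stack) < 1 or (b == TOP_END) != (len(stack) == 1):
                raise ValueError("unbalanced list delimiter")
            finished = stack.pop()
            emit(finished)
        elif b == TRUTH:
            emit(True)
        elif b == SHORT_INT:
            emit(buf[i])
            i += 1
        elif b == LONG_INT:
            n = buf[i]
            emit(int.from_bytes(buf[i + 1:i + 1 + n], "little"))
            i += 1 + n
        elif b == KEYWORD:
            name, i = _read_data(buf, i)
            emit(Keyword(name.decode("ascii")))
        else:
            data, i = _read_data(buf, i - 1)
            emit(data)
    if stack:
        raise ValueError("input ended inside a token list")
    return out
```

   An empty embedded list decodes to an empty Python list, which a
   client protocol can then interpret as Boolean falsity where that
   convention applies.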

11.2.2  Token List Stream Example

   This section contains an example of some data that can appear on a
   token list stream.  The example is a top-level token list encoding an
   NFILE DELETE command.

   The DELETE command is composed of the following pieces:  a
   TOP-LEVEL-LIST-BEGIN, the keyword DELETE, a data token containing the
   transaction identifier, a LIST-BEGIN, a LIST-END, a data token
   containing a pathname of a file to be deleted, and a
   TOP-LEVEL-LIST-END.  This example uses t105 as the transaction
   identifier, and /usr/max/temp as the pathname.

   All numbers in this section are expressed in decimal notation.

   The pieces of the command are displayed here in order:

            1. TOP-LEVEL-LIST-BEGIN
            2. The keyword token whose name is DELETE
            3. The data token containing the characters:  t105
            4. LIST-BEGIN
            5. LIST-END
            6. The data token containing the characters:  /usr/max/temp
            7. TOP-LEVEL-LIST-END

   Now, let's translate each piece of the command into the bytes that
   are transmitted through the token list stream.

        1. TOP-LEVEL-LIST-BEGIN

           202     represents TOP-LEVEL-LIST-BEGIN

        2. The keyword token whose name is DELETE.

           A keyword token is introduced by PUNCTUATION-KEYWORD, which
           is represented in the token list stream as the byte 208.

           A data token follows, containing the string "DELETE".  A
           data token under 200 bytes long is introduced by one byte
           containing its length in bytes.  The length of this data
           token is 6 bytes.

           The data token continues with the standard ASCII character
           set representation of each character in the string DELETE:

               208     represents PUNCTUATION-KEYWORD
               006     represents the length of this data token
               068     represents "D"
               069     represents "E"
               076     represents "L"
               069     represents "E"
               084     represents "T"
               069     represents "E"

        3. The data token containing the characters:  t105

           This data token is begun by its length in bytes (4), and
           continues with the NFILE character set representation of
           each character in the string:

               004     represents the length of this data token
               116     represents "t"
               049     represents "1"
               048     represents "0"
               053     represents "5"

        4. LIST-BEGIN

               204     represents LIST-BEGIN

        5. LIST-END

               205     represents LIST-END

        6. The data token containing the characters:  /usr/max/temp

               013     represents the length of this data token
               047     represents "/"
               117     represents "u"
               115     represents "s"
               114     represents "r"
               047     represents "/"
               109     represents "m"
               097     represents "a"
               120     represents "x"
               047     represents "/"
               116     represents "t"
               101     represents "e"
               109     represents "m"
               112     represents "p"

        7. TOP-LEVEL-LIST-END

               203     represents TOP-LEVEL-LIST-END
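
   The byte sequence worked out above can be reproduced with a short
   program.  The following Python sketch is illustrative only and is
   not part of the protocol definition.  The control-token values are
   the decimal values used in this section; the Keyword class and the
   function names are hypothetical, and the sketch assumes that every
   character falls in the ASCII subset of the NFILE character set and
   that every data token is under 200 bytes long.

```python
# Control-token byte values used in this section (decimal).
TOP_LEVEL_LIST_BEGIN = 202
TOP_LEVEL_LIST_END = 203
LIST_BEGIN = 204
LIST_END = 205
PUNCTUATION_KEYWORD = 208


class Keyword(str):
    """Hypothetical marker type: a string to be sent as a keyword token."""


def encode_data(text):
    """Encode a short data token: one length byte, then the characters."""
    raw = text.encode("ascii")  # assumption: ASCII subset only
    if len(raw) >= 200:
        raise ValueError("only data tokens under 200 bytes are handled here")
    return bytes([len(raw)]) + raw


def encode_item(item):
    """Encode one element of a token list."""
    if isinstance(item, Keyword):
        # PUNCTUATION-KEYWORD introduces a data token holding the name.
        return bytes([PUNCTUATION_KEYWORD]) + encode_data(item)
    if isinstance(item, str):
        return encode_data(item)
    if isinstance(item, list):
        body = b"".join(encode_item(x) for x in item)
        return bytes([LIST_BEGIN]) + body + bytes([LIST_END])
    raise TypeError(f"cannot encode {item!r}")


def encode_top_level(items):
    """Wrap the encoded items in the top-level list delimiters."""
    body = b"".join(encode_item(x) for x in items)
    return bytes([TOP_LEVEL_LIST_BEGIN]) + body + bytes([TOP_LEVEL_LIST_END])


# The DELETE command from this section; [] is the empty token list.
delete_command = encode_top_level(
    [Keyword("DELETE"), "t105", [], "/usr/max/temp"]
)
print(list(delete_command))
```

   The printed list should match the seven pieces shown above, byte
   for byte, starting with 202 and ending with 203.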

11.2.3  Mapping of Lisp Objects to Token List Stream Representation

   The Symbolics interface to the token list stream sends Lisp objects
   through the underlying Byte Stream with Mark and produces Lisp
   objects on the other end.  Not all Lisp objects can be sent in this
   way.  For example, compound objects other than lists are not handled.
   An appropriate analogy is the sending and reconstruction of list
   structure via printed representation.  These are the types of objects
   that can be sent, and their representations:

        - Lisp strings are represented as data tokens in the NFILE
          character set.  Only 8-bit strings can be sent <5>.

        - Keyword symbols are represented as keyword tokens.  Although
          identifiable and reconstructable as keyword symbols, only
          their names are sent.  Any properties, bindings, and the
          like are not sent.

        - T is represented as BOOLEAN-TRUTH.

        - NIL is represented as the empty token list.

        - Lists are represented as token lists.  Circular lists cannot
          be sent.  For the ambiguity between NIL and the empty list,
          see note <4> and the section "Types of Tokens and Token
          Lists", section 11.2.1.

        - Integers are represented as numeric tokens.  Only
          nonnegative integers less than 2^63 can be sent.

11.2.4  Aborting and the Token List Stream

   A token list stream accrues the benefits of the abort management
   policy of the Byte Stream with Mark on which it is built.  In order
   to fully realize this benefit, some simple rules must be obeyed by
   any implementation of the token list stream.

   The term "transmission" means either an atomic token or a complete
   top-level token list.  A top-level token list starts with the control
   token TOP-LEVEL-LIST-BEGIN and ends with TOP-LEVEL-LIST-END.  The
   top-level token list can contain embedded token lists.

   The interface that writes to the token list stream must be capable of
   writing the representation of entire transmissions.  When this
   interface is called, it must effectively lock the token list stream,
   and exclude access by other processes until the entire transmission
   has been encoded and sent.

   If the sending is aborted while the stream is locked, the stream
   enters an "unsafe" state.  Trying to send data while the stream is
   unsafe signals an error.  The application and the token list stream
   must send a mark to cause resynchronization, and allow the token list
   stream to be used again.  When the reading side encounters this mark,
   it resynchronizes itself according to whatever client protocol is in
   use.

   Similarly, the interface that reads from the token list stream must
   be capable of reading entire transmissions.  When this interface is
   called, it must lock the stream, excluding access by other processes
   until the entire transmission has been read.

   If the reading is aborted while the stream is locked, the stream
   enters an unsafe state.  The only exit from this unsafe state is by
   means of receiving a mark.  When the stream is unsafe, the only valid
   operation that can be performed upon it is "read and discard all
   tokens until a mark is encountered; read and discard that mark;
   declare the stream safe again".

   Depending on the client protocol, the receipt of a mark might cause
   the reading side to read for further marks.  NFILE implements the
   resynchronization of token list streams, and serves as a useful
   example: See the section "NFILE Control Connection
   Resynchronization", section 9.1.

   The Symbolics implementation provides the two mark-handling
   primitives in this way:


      1. Send token (or list) preceded by a mark.  When the stream
         is in the unsafe state (on the output side), this is the
         only permitted output operation (other than closing).

      2. Read through to a mark and read the token (or list)
         following the mark.  When the stream is in the unsafe state
         (on the input side), this is the only permitted input
         operation (other than closing).
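
   The reading-side rules can be illustrated with a small Python
   sketch.  It is not the Symbolics implementation.  Tokens are
   modelled as plain strings, MARK is a hypothetical stand-in for a
   mark delivered by the Byte Stream with Mark, and the class and
   method names are invented for this example.  The stream is marked
   unsafe before a transmission is read and marked safe only once the
   whole transmission has arrived, so an abort at any point in between
   leaves it unsafe.  A mark that arrives during a normal read (a
   sender-side abort) is left to the client protocol and is not
   handled here.

```python
import threading
from collections import deque

TOP_BEGIN = "TOP-LEVEL-LIST-BEGIN"
TOP_END = "TOP-LEVEL-LIST-END"
MARK = "MARK"  # hypothetical stand-in for a mark


class StreamUnsafe(Exception):
    """Raised when a normal read is attempted on an unsafe stream."""


class TokenListReader:
    def __init__(self, read_token):
        self._read_token = read_token   # callable returning the next token
        self._lock = threading.Lock()   # excludes other readers
        self.unsafe = False

    def read_transmission(self):
        """Read one atomic token or one complete top-level token list."""
        with self._lock:
            if self.unsafe:
                raise StreamUnsafe("read through a mark first")
            self.unsafe = True          # stays set if we are aborted below
            first = self._read_token()
            tokens = [first]
            while first == TOP_BEGIN and tokens[-1] != TOP_END:
                tokens.append(self._read_token())
            self.unsafe = False         # the whole transmission was read
            return tokens

    def read_through_mark(self):
        """Discard tokens up to and including the next mark."""
        with self._lock:
            while self._read_token() != MARK:
                pass
            self.unsafe = False


class Abort(Exception):
    """Simulates a user abort in the middle of a read."""


pending = deque([TOP_BEGIN, "x", Abort, "y", TOP_END,
                 MARK, TOP_BEGIN, "z", TOP_END])


def next_token():
    token = pending.popleft()
    if token is Abort:
        raise Abort()
    return token


reader = TokenListReader(next_token)
try:
    reader.read_transmission()
except Abort:
    pass
unsafe_after_abort = reader.unsafe      # the stream is now unsafe
reader.read_through_mark()              # discards "y", TOP_END and the mark
resumed = reader.read_transmission()    # the list following the mark
```

   In this run the aborted read leaves the stream unsafe, and the
   remainder of the interrupted list is discarded along with the mark
   before normal reading resumes.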

11.3  Token List Data Stream

   The token list data stream is a facility to transmit stream data
   through a token list stream.  The token list data stream imposes the
   following protocol on the data transmitted:

            - Data is sent in the format of loose data tokens, not
              contained in token lists.

            - The keyword token EOF indicates that the end of data has
              been reached.

            - Token lists can be transmitted through the token list
              data stream.

            - No loose tokens other than data tokens or the keyword
              token EOF can be sent.

            - Boundaries between data tokens are not significant.
              The data is considered to be a continuous stream, with
              the possible exception of marks.

   The token list data stream is most appropriate for sending file data.
   It is expected (but not required) that its typical mode of use is to
   send a large number of data tokens, with an occasional token list.
   The design intent was that token lists would be used by the
   application program to indicate exceptional situations.

   Data tokens, the keyword token EOF, and token lists are defined in
   the token list stream documentation:  See the section "Types of
   Tokens and Token Lists", section 11.2.1.

   The NFILE file protocol provides a good example of the use of token
   list data streams.  NFILE sends file data through token list data
   streams; each NFILE data channel is a token list data stream.  Errors
   such as disk errors during the reading of a file are conveyed as
   token lists through the token list data stream.
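
   As an illustration of these rules, the following Python sketch
   consumes already-decoded items from a token list data stream.  The
   representation is an assumption made for the example: data tokens
   appear as bytes objects, token lists as Python lists, and keyword
   tokens as instances of a hypothetical Keyword class.  The ERROR
   keyword inside the token list is invented for the demonstration.

```python
class Keyword(str):
    """Hypothetical marker type for keyword tokens."""


def read_file_data(items):
    """Collect stream data up to the EOF keyword token.

    Returns the concatenated data and any token lists seen on the way.
    """
    chunks, token_lists = [], []
    for item in items:
        if isinstance(item, Keyword) and item == "EOF":
            return b"".join(chunks), token_lists
        if isinstance(item, bytes):
            chunks.append(item)         # token boundaries are not kept
        elif isinstance(item, list):
            token_lists.append(item)    # exceptional situation
        else:
            raise ValueError(f"loose token not allowed here: {item!r}")
    raise EOFError("stream ended before the EOF keyword token")


data, events = read_file_data(
    [b"ab", b"c", [Keyword("ERROR"), "disk"], b"de", Keyword("EOF")]
)
```

   Because boundaries between data tokens are not significant, the
   same file data split into different data tokens yields the same
   result.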

12.  BYTE STREAM WITH MARK

   PURPOSE:  Byte Stream with Mark is a simple layer of protocol that
   guarantees that an out-of-band signal can be transmitted in the case
   of program interruption.  Byte Stream with Mark is designed to
   provide end-to-end stream consistency in the face of user program
   aborts.

12.1  Discussion of Byte Stream with Mark

   INTRODUCTION

   Byte Stream with Mark is a reliable, bidirectional byte stream with
   one out-of-band (but not out-of-sequence) signal called a "mark".
   The design of Byte Stream with Mark ensures that the mark is always
   recognizable on the receiving end.  The Byte Stream with Mark is
   built on an underlying stream, which must support the transmission of
   8-bit bytes.  Byte Stream with Mark has been implemented to run on
   TCP and Chaos.  Marks are implemented differently on the two
   protocols.

   Marks are used to resynchronize the stream when something has
   occurred to interrupt normal operations.  For example, an application
   layer sending data over the Byte Stream with Mark can abort in the
   middle of sending that data.  Recovery is handled by sending a mark.

   In the context of this document, "aborting" is defined as follows:
   Aborting the current execution of a program means to halt that
   execution and to abandon it, never to complete it.  The data
   representing the state of the execution are irrevocably discarded.

   EXAMPLE OF USE

   Byte Stream with Mark is the layer of protocol underlying NFILE.
   NFILE uses the marks implemented in Byte Stream with Mark to
   resynchronize control connections or data channels whose
   synchronization has been lost.  For a description of NFILE's use of
   marks to resynchronize streams:  See the section "NFILE
   Resynchronization Procedure", section 9.

   BYTE STREAM WITH MARK ON CHAOSNET

   A mark is recognized on Chaosnet by a packet bearing the opcode 201
   (octal).  There is no data in a mark packet, so the data portion of
   the packet is ignored.  Byte Stream with Mark transmits all data in
   packets bearing opcode 200 (octal).

   If Byte Stream with Mark is implemented on another (non-Chaos) stream
   that supports opcode-bearing packets, the recommended implementation
   is the reservation of an opcode for the mark.

   BYTE STREAM WITH MARK ON TCP:  RECORD MODE

   The purpose of Byte Stream with Mark is to guarantee that marks can
   always be unambiguously identified.  Therefore, for TCP (and for any
   transport layer that does not implement packets natively) a simple
   record stream is imposed on the stream.  The record boundaries serve
   only to distinguish where a mark can occur.  A record consists of a
   two-byte byte count, most significant byte first, followed by that
   many bytes of data.  A byte count of zero is recognized as a mark.

   Both the sending side and the receiving side must rigorously maintain
   the integrity of the record boundaries.  A writer to the stream must
   never output a byte count without that number of data bytes
   following.  Similarly, a reader of the stream, after reading a byte
   count, has effectively contracted to read that many bytes from the
   encapsulated stream, regardless of whether those bytes are requested
   by the application layer.
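
   The record format can be sketched in a few lines of Python.  This
   is an illustration under stated assumptions, not a reference
   implementation: the function names and the MARK sentinel are
   hypothetical, and an in-memory buffer stands in for the TCP
   connection.

```python
import io
import struct

MARK = object()  # hypothetical sentinel returned when a mark is read


def write_record(stream, data):
    """Write a two-byte count, most significant byte first, then data."""
    if not 0 < len(data) <= 0xFFFF:
        raise ValueError("a record carries 1..65535 bytes; 0 means mark")
    # Count and data go out together, so a count is never written
    # without the data bytes it promises.
    stream.write(struct.pack(">H", len(data)) + data)


def write_mark(stream):
    """A byte count of zero is recognized as a mark."""
    stream.write(struct.pack(">H", 0))


def read_exactly(stream, n):
    """Read exactly n bytes, looping over short reads."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed inside a record")
        buf += chunk
    return buf


def read_record(stream):
    """Return the data of the next record, or MARK."""
    (count,) = struct.unpack(">H", read_exactly(stream, 2))
    if count == 0:
        return MARK
    return read_exactly(stream, count)


wire = io.BytesIO()
write_record(wire, b"hello")
write_mark(wire)
write_record(wire, b"x" * 300)
wire.seek(0)
received = [read_record(wire) for _ in range(3)]
```

   Having read a count, read_record always consumes that many bytes,
   which is the reader's half of the contract described above.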

   MAINTAINING RECORD INTEGRITY

   This subsection deals with maintaining record integrity on non-Chaos
   networks.  Since Chaos implements packets natively, no special care
   is required to maintain record integrity on the Chaos network.

   The design discussed here guarantees record integrity; the underlying
   stream must guarantee data integrity.

   The basic design of Byte Stream with Mark on TCP (and other transport
   layers that do not implement packets natively) is to preserve record
   integrity by putting clearly demarcated, byte-counted records in the
   natural records of the encapsulated stream.  Therefore, when the
   outer stream requests a buffer's worth of file data from the
   encapsulated stream, it expects to receive a buffer containing one
   entire, integral record of that stream, complete with byte count.

   Because of diverse network implementations on different operating
   systems, the software that implements the encapsulated stream might
   not be able to provide integral record buffers to the Byte Stream
   with Mark implementation.  For example, the writing stream could have
   written records that are much longer than available buffers on the
   receiving system.  In this case, a request to read from the
   encapsulated stream returns some buffer or some amount of data
   representing less than an entire Byte Stream with Mark record.  The
   input subroutine of the Byte Stream with Mark implementation must
   therefore return a region of this (smaller) buffer, representing less
   than the full Byte Stream with Mark record.  Nevertheless, the Byte
   Stream with Mark must extract the count of the full Byte Stream with
   Mark record from the first such buffer of each Byte Stream with Mark
   record, and maintain and update this count as succeeding component
   buffers are read.

   In this case, if the program reading from the Byte Stream with Mark
   aborts while reading data, the implementation of Byte Stream with
   Mark must continue to read through the remaining buffers of the Byte
   Stream with Mark record that has been subdivided in this fashion.

   The user side program will have determined that an abort has
   occurred, and will request the Byte Stream with Mark to read up to
   and through the next mark.  The Byte Stream with Mark will have
   processed a fractional record, and must discard the remaining buffers
   of the record now being read.
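
   The bookkeeping described in this subsection might look like the
   following Python sketch.  The class and method names are
   hypothetical, and an in-memory buffer again stands in for the
   encapsulated stream.  The reader remembers how many bytes of the
   current record are still unread, hands out data in pieces no
   larger than the caller's buffer, and on request discards the
   fractional record and everything after it up to and through the
   next mark.

```python
import io


class RecordReader:
    """Tracks how much of the current record is still unread."""

    def __init__(self, stream):
        self._stream = stream
        self._remaining = 0     # unread data bytes of the current record

    def _read_exactly(self, n):
        buf = b""
        while len(buf) < n:
            chunk = self._stream.read(n - len(buf))
            if not chunk:
                raise EOFError("stream closed inside a record")
            buf += chunk
        return buf

    def read(self, limit):
        """Return up to `limit` bytes of data, or None for a mark."""
        if self._remaining == 0:
            count = int.from_bytes(self._read_exactly(2), "big")
            if count == 0:
                return None
            self._remaining = count
        chunk = self._stream.read(min(limit, self._remaining))
        if not chunk:
            raise EOFError("stream closed inside a record")
        self._remaining -= len(chunk)
        return chunk

    def read_through_mark(self):
        """Discard the rest of this record and all data up to a mark."""
        self._read_exactly(self._remaining)   # finish fractional record
        self._remaining = 0
        while True:
            count = int.from_bytes(self._read_exactly(2), "big")
            if count == 0:
                return
            self._read_exactly(count)


wire = io.BytesIO(
    b"\x00\x0a0123456789"   # a 10-byte record
    b"\x00\x03abc"          # a 3-byte record
    b"\x00\x00"             # a mark
    b"\x00\x02ok"           # data following the mark
)
reader = RecordReader(wire)
first = reader.read(4)          # a fraction of the first record
reader.read_through_mark()      # abort: skip "456789", "abc", the mark
after = reader.read(10)
```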

12.2  Byte Stream with Mark Abortable States

   Byte Stream with Mark is designed to provide end-to-end stream
   consistency in the face of user program aborts.  This section
   describes user program aborts, and how Byte Stream with Mark handles
   them.  In the context of this document, "aborting" is defined as
   follows:  Aborting the current execution of a program means to halt
   that execution and to abandon it, never to complete it.  The data
   representing the state of the execution are irrevocably discarded.

   USER PROGRAM ABORTS AND I/O STREAMS

   Aborting the execution of the code that manipulates I/O streams, in
   general, poses significant problems.  Given that a stream is a static
   data object, and is intended to be used over and over again, aborting
   the execution of any routine manipulating a stream can leave it in an
   inconsistent, unusable state.

   Many operating systems solve this problem by manipulating a large
   subset of streams within the confines of the supervisor or executive
   program, which is not vulnerable to aborts, short of system or
   network failure.  Nevertheless, the need still exists to implement
   streams outside of the boundaries of the supervisor.  Furthermore,
   the Symbolics computer environment has no supervisor or executive
   program, and is thus vulnerable to aborts everywhere.

   BYTE STREAM WITH MARK HANDLING OF USER PROGRAM ABORTS

   Byte Stream with Mark is designed to be nearly impervious to the
   aborting of programs using it.  Its design is based on careful
   analysis of all possible states of the stream, and of the effect of
   aborts of the programs using the stream in each of these states.
   This section provides that analysis.

   A "transmission" is a collection of user data sent by the application
   level through the Byte Stream with Mark whose end is well-defined,
   once its start has been recognized.  For instance, the token list
   stream, when using Byte Stream with Mark, sends token lists.  When a
   TOP-LEVEL-LIST-BEGIN has been sent, the containing transmission is
   not considered complete until the corresponding TOP-LEVEL-LIST-END is
   read.  See the section "Token List Transport Layer", section 11.

   The following cases are possible states of the stream when an abort
   occurs:

         1. Abort occurs when the user program is not manipulating the
            stream.

            This case presents no problem.

         2. Abort occurs after a transmission has been partially sent,
            at a packet or record boundary.

            This implies that the datum that would indicate the
            successful complete sending of that transmission has not
            yet been sent.

            The Byte Stream with Mark state is consistent, but the
            application level state is not.  The application level must
            determine that the execution of the code composing and
            sending its transmission was, in fact, aborted, and
            initiate resynchronization via marks.

            The receiving side must be careful not to act upon a
            transmission (that is, to perform any action or side
            effect) until the transmission has been successfully
            received in entirety.  This protects the user program from
            the possibility that an abort can occur after a
            transmission has been partially sent.

         3. Abort occurs during the sending or receiving of a record.

            This is the most vulnerable state of the mechanism.  This
            case does not occur on packet-oriented media; it is
            subsumed by the next case.

            This case is handled by minimizing the extent of this
            window, and killing the connection when and if the
            situation is detected.  Depending on the operating system
            involved, this window could be minimized by using
            interrupt-disabling mechanisms, auxiliary processes or
            tasks, or some other technique.

            For buffered streams, input and output waiting can be done
            in consistent states, thus minimizing the amount of time
            manipulating the actual encapsulated stream.  For
            unbuffered streams, a lot of time can be spent in this
            window.  It is expected that unbuffered streams will be
            exceedingly uncommon.  Nevertheless, the implementation of
            Byte Stream with Mark must detect this case.

         4. Abort occurs during the sending or receiving of fundamental
            units of the lowest-level underlying stream (packets,
            buffers, or bytes).

            This case is usually handled by inhibiting interrupts, or
            other forms of masking, in the code implementing the
            encapsulated stream, since no waiting is possible at
            unexpected times.

13.  POSSIBLE FUTURE EXTENSIONS

   NFILE was designed to be extended as the needs of its clients grow,
   or as new clients with different needs appear.  Currently it meets
   the needs of the Symbolics Genera 7.0 operating system, although its
   design is intentionally general.  If users of other operating systems
   identify new features that would be useful, they could be added to
   NFILE.  This section illustrates some areas where the design of
   NFILE intentionally accommodates extensions.

         - The NFILE protocol encodes commands and responses as text,
           rather than using prearranged numbers.  This means that new
           commands and responses can be added without having to obtain
           a new number from a central registry.

         - The Token List Transport Layer provides a general substrate
           for the value-transmission portion of network protocols.  In
           fact, it has been used at Symbolics for other protocols
           besides NFILE.  The Token List Transport Layer could
           conveniently be extended to support transmission of other
           types of values besides those it currently supports.

         - The character set to be used for file transfer could be made
           negotiable.

         - The command character set could be made negotiable.
           Currently there is no negotiation sequence, but one could be
           added.

         - Greater support for more complex file organizations could be
           added, such as record files, databases, and so on.  This
           could be an extension to the direct access mode facility.

         - Currently, the LOGIN command allows the user side to inform
           the server which version of NFILE it is running.  This
           feature is included in NFILE so that a server can continue
           to support older versions of the protocol even after new,
           extended versions have been implemented.  However, the
           specification is currently somewhat vague as to how the
           server can make use of the version.

         - NFILE is not restricted to using TCP or Chaos as its
           underlying protocol.  NFILE can be built on any byte stream
           protocol that supports reliable transmission of 8-bit bytes
           and multiple connections.

   In addition to the possible future extensions, we would like to
   mention a known limitation of NFILE.

   Currently NFILE requires multiple connections for a single session.
   That is, the control connection must be separate from the data
   connections.  If NFILE is to be used over a telephone line, this
   requirement poses an inconvenient restriction.  It is possible to
   implement a multiplexing scheme as a level between NFILE and the
   communication medium.

                                APPENDIX A
                          NORMAL TRANSLATION MODE


   NORMAL translation mode guarantees the following:

         - A file containing characters in the NFILE character set can
           be written to any NFILE server and read back intact
           (containing the same characters).

         - A file written by NFILE should not appear as "foreign" to a
           server operating system unless the file contains NFILE's
           extended characters.  That is, a server file that uses only
           the subset of the NFILE character set limited to standard
           ASCII characters (the 95 printing characters, and the native
           representation of return, linefeed, page, backspace, rubout,
           and tab) can be read and written, with the result being the
           same data in NFILE characters as exists in server
           characters.

   In this section, all numbers designating values of character codes
   are to be interpreted in octal.  The notation "x in c1..c2" means
   "for all character codes x such that c1 <= x <= c2."

   The NFILE character set is an extension of standard ASCII.  The 95
   ASCII printing characters have the same numerical codes in the NFILE
   character set.  Five ASCII non-printing characters have counterparts
   in the NFILE character set, as shown in the following table.  The
   NFILE character set includes a single Return character, rather than
   the carriage-return line-feed sequence typically used in ASCII.  The
   NFILE character set does not include the ASCII control characters,
   other than the five shown in the following table, but does include
   some additional printing and formatting characters that have no
   counterparts in ASCII.

                             NFILE     Standard ASCII

         Rubout:             207       177
         Backspace:          210       10
         Tab:                211       11
         Linefeed:           212       12
         Page:               214       14

   Note that the NFILE Return character is of code 215.  This character
   includes "going to the next line".  This is a notable difference from
   the convention used in PDP-10 ASCII in which lines are ended by a
   pair of characters, "carriage return" and "line feed".

   NORMAL TRANSLATION TO UNIX SERVERS

   The translation given in this table is appropriate for use by UNIX
   servers, or other servers that use 8-bit bytes to store ASCII
   characters.  Machines with 8-bit bytes usually place the extra NFILE
   characters in the top half of their character set.

       TABLE 1.   TRANSLATIONS FROM NFILE CHARACTERS TO UNIX CHARACTERS


            NFILE character       UNIX character

            x in 000..007         x
            x in 010..015         x + 200
            x in 016..176         x
            177                   377
            x in 200..207         x
            x in 210..211         x - 200
            212                   015
            x in 213..214         x - 200
            215                   012
            x in 216..376         x
            377                   177

       TABLE 2.   TRANSLATIONS FROM UNIX CHARACTERS TO NFILE CHARACTERS


            UNIX character        NFILE character

            x in 000..007         x
            x in 010..011         x + 200
            012                   215
            x in 013..014         x + 200
            015                   212
            x in 016..176         x
            177                   377
            x in 200..207         x
            x in 210..215         x - 200
            x in 216..376         x
            377                   177
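
   Tables 1 and 2 can be expressed directly as a pair of functions.
   The following Python sketch is a transcription of the two tables
   (the function names are hypothetical); character codes are written
   as octal literals to match the tables.

```python
def nfile_to_unix(x):
    """Table 1: one NFILE character code to one UNIX character code."""
    if 0o010 <= x <= 0o015:
        return x + 0o200
    if x == 0o177:
        return 0o377
    if x == 0o212:              # NFILE Linefeed
        return 0o015
    if x == 0o215:              # NFILE Return becomes UNIX newline
        return 0o012
    if 0o210 <= x <= 0o214:     # 210, 211, 213, 214
        return x - 0o200
    if x == 0o377:
        return 0o177
    return x


def unix_to_nfile(x):
    """Table 2: one UNIX character code to one NFILE character code."""
    if x == 0o012:              # UNIX newline becomes NFILE Return
        return 0o215
    if x == 0o015:
        return 0o212
    if 0o010 <= x <= 0o014:     # 010, 011, 013, 014
        return x + 0o200
    if x == 0o177:
        return 0o377
    if 0o210 <= x <= 0o215:
        return x - 0o200
    if x == 0o377:
        return 0o177
    return x


# The NFILE characters "a" and Return, as stored on a UNIX server.
unix_bytes = bytes(nfile_to_unix(c) for c in [0o141, 0o215])
```

   The two tables are inverses of each other over all 256 codes, which
   is what allows a file to be written to a server and read back
   intact.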

   NORMAL TRANSLATION TO PDP-10 FAMILY SERVERS

   The translation given in this table is appropriate for use by PDP-10
   family servers, or other servers that use 7-bit bytes to store ASCII
   characters.  On the PDP-10 the sequence CRLF, 015 012, represents a
   new line.

   The mechanism for this translation on machines with 7-bit bytes is to
   use the RUBOUT character (octal code 177) as an escape character.

         TABLE 3.   TRANSLATIONS FROM NFILE TO PDP-10 CHARACTERS


            NFILE character       PDP-10 character(s)

            x in 000..007         x
            x in 010..012         177 x
            013                   013
            x in 014..015         177 x
            x in 016..176         x
            177                   177 177
            x in 200..207         177 x - 200
            x in 210..212         x - 200
            213                   177 013
            214                   014
            215                   015 012
            x in 216..376         177 x - 200
            377                   no corresponding code

   These tables might seem confusing at first, but there are some
   general rules that should make them clearer.  First, NFILE
   characters in the range 000..177 are generally represented as
   themselves, and x in 200..377 is generally represented as 177
   followed by x - 200.  That is, 177 is used to quote the second 200
   NFILE characters.  It was deemed that 177 is a more useful and common
   character than 377, so 177 177 means 177, and there is no way to
   describe 377 with PDP-10 ASCII characters.  In the NFILE character
   set, the formatting control characters appear offset up by 200 with
   respect to standard ASCII.  This explains why the preferred mode of
   expressing 210 (backspace) is 010, and 010 turns into 177 010.  The
   same reasoning applies to 211 (Tab), 212 (Linefeed), 214 (Formfeed),
   and 215 (Return).

   Special care is needed for the Return character, which is the
   mapping of the system-dependent representation of "the start of a new
   line".  The NFILE Return (215) is equivalent to 015 012 (CRLF) in
   some ASCII systems.  In the NFILE character set there is no
   representation of a carriage return that doesn't go to a new line, so
   if there is one in a server file, it must be translated to something
   else.  When converting ASCII characters to NFILE characters, an 015
   followed by an 012 therefore turns into a 215.  A stray CR is
   arbitrarily translated into a single M (115).

     TABLE 4.   TRANSLATIONS FROM PDP-10 CHARACTERS TO NFILE CHARACTERS


            PDP-10 character      NFILE character

            x in 000..007         x
            x in 010..012         x + 200
            013                   013
            014                   214
            015 012               215
            015 not-012           115
            x in 016..176         x
            177 x in 000..007     x + 200
            177 x in 010..012     x
            177 013               213
            177 x in 014..015     x
            177 x in 016..176     x + 200
            177 177               177
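
   Tables 3 and 4 can be sketched as a pair of Python functions that
   operate on sequences of character codes.  The function names are
   hypothetical.  Two choices are assumptions of this sketch rather
   than statements of the tables: an 015 at the very end of the input
   is treated as a stray CR, and a 177 with nothing following it is
   reported as an error.

```python
QUOTED_AS_SELF = (0o010, 0o011, 0o012, 0o014, 0o015, 0o177)


def nfile_to_pdp10(codes):
    """Table 3: NFILE character codes to PDP-10 7-bit codes."""
    out = []
    for x in codes:
        if x == 0o377:
            raise ValueError("NFILE 377 has no PDP-10 representation")
        if x == 0o215:                          # Return becomes CRLF
            out += [0o015, 0o012]
        elif x in (0o210, 0o211, 0o212, 0o214):
            out.append(x - 0o200)
        elif x >= 0o200:                        # 200..207, 213, 216..376
            out += [0o177, x - 0o200]
        elif x in QUOTED_AS_SELF:
            out += [0o177, x]
        else:
            out.append(x)
    return out


def pdp10_to_nfile(codes):
    """Table 4: PDP-10 7-bit codes to NFILE character codes."""
    out = []
    i = 0
    while i < len(codes):
        x = codes[i]
        i += 1
        if x == 0o177:                          # escape character
            if i == len(codes):
                raise ValueError("177 escape with nothing after it")
            q = codes[i]
            i += 1
            out.append(q if q in QUOTED_AS_SELF else q + 0o200)
        elif x == 0o015:
            if i < len(codes) and codes[i] == 0o012:
                i += 1
                out.append(0o215)               # CRLF becomes Return
            else:
                out.append(0o115)               # stray CR becomes "M"
        elif x in (0o010, 0o011, 0o012, 0o014):
            out.append(x + 0o200)
        else:
            out.append(x)
    return out


crlf = nfile_to_pdp10([0o215])
stray = pdp10_to_nfile([0o015, 0o101])
```

   Every NFILE code except 377 survives a round trip through these
   two functions.  The stray CR case is the one place where the
   PDP-10 to NFILE direction loses information.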

                                APPENDIX B
                           RAW TRANSLATION MODE


   RAW mode means no translation should be performed.  In RAW mode the
   server operating system should treat the file as a character file and
   use the same data formatting that would be appropriate for a
   character file, but transfer the actual binary values of the
   character codes.

                                APPENDIX C
                       SUPER-IMAGE TRANSLATION MODE


   SUPER-IMAGE mode is intended for use by PDP-10 family machines only.
   It is included largely as an illustration of a system-dependent
   extension.  A server machine that has 8-bit bytes should treat
   SUPER-IMAGE mode the same as NORMAL mode.

   In this section, all numbers designating values of character codes
   are to be interpreted in octal.  The notation "x in c1..c2" means
   "for all character codes x such that c1 <= x <= c2."

   SUPER-IMAGE mode suppresses the use of the 177 character (Rubout) as
   an escape character.  Character translation should be done as in
   NORMAL mode, with one exception.  When a two-character sequence
   beginning with 177 is detected, the 177 should not be output at all.
   That is, for each entry beginning with a 177 in the PDP-10 character
   column in the NORMAL translation table, the PDP-10 character has the
   177 removed.

         TABLE 5.   SUPER-IMAGE TRANSLATION FROM NFILE TO ASCII


            NFILE character   PDP-10 character(s)


            x in 000..177     x
            x in 200..214     x - 200
            215               015 012
            x in 216..376     x - 200
            377               no corresponding code

         TABLE 6.   SUPER-IMAGE TRANSLATION FROM ASCII TO NFILE


            PDP-10 character  NFILE character


            x in 000..007     x
            x in 010..012     x + 200
            013               013
            014               214
            015 012           215
            015 not-012       115
            x in 016..176     x
            177               177
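
   Tables 5 and 6 can be sketched in the same way.  The function
   names in this Python illustration are hypothetical, and an 015 at
   the very end of the input is assumed to be a stray CR.

```python
def nfile_to_pdp10_super_image(codes):
    """Table 5: NFILE codes to PDP-10 codes, with no 177 quoting."""
    out = []
    for x in codes:
        if x == 0o377:
            raise ValueError("NFILE 377 has no PDP-10 representation")
        if x == 0o215:                  # Return becomes CRLF
            out += [0o015, 0o012]
        elif x >= 0o200:                # 200..214 and 216..376
            out.append(x - 0o200)
        else:
            out.append(x)
    return out


def pdp10_to_nfile_super_image(codes):
    """Table 6: PDP-10 codes to NFILE codes, with no 177 quoting."""
    out = []
    i = 0
    while i < len(codes):
        x = codes[i]
        i += 1
        if x == 0o015:
            if i < len(codes) and codes[i] == 0o012:
                i += 1
                out.append(0o215)       # CRLF becomes Return
            else:
                out.append(0o115)       # stray CR becomes "M"
        elif x in (0o010, 0o011, 0o012, 0o014):
            out.append(x + 0o200)
        else:
            out.append(x)               # includes 013 and 177
    return out


low = nfile_to_pdp10_super_image([0o010])
high = nfile_to_pdp10_super_image([0o210])
```

   Without the 177 escape the mapping is no longer invertible.  Both
   NFILE 010 and NFILE 210 are written as PDP-10 010, and Table 6
   reads that code back as 210.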

                                   NOTES

   1. NFILE's requirement for using the NFILE character set is
      recognized as a drawback for non-Symbolics machines.  A useful
      extension to NFILE would be a provision to make the character set
      negotiable.

   2. Implementation note:  Care must be taken that the freeing is done
      before the control connection is allowed to process another
      command, or else the control connection may find the data channel
      to be falsely indicated as being in use.

   3. The Symbolics operating system has the policy that whenever the
      user side is waiting for the server side, a user abort can occur.
      This user side waiting can occur in any context, such as awaiting a
      response, waiting in the middle of reading network input, or
      waiting in the middle of transmitting network output.  Thus there
      are no "hung" states.

   4. Note that the Token List Transport Layer supplies a special token
      to indicate Boolean truth, but no corresponding token to indicate
      Boolean falsity.  NFILE uses an empty token list to indicate
      Boolean falsity.  The historical reason for this asymmetry is the
      inability of the Lisp language to differentiate between the empty
      list and NIL, which is traditionally used to mean Boolean falsity.
      If the flexibility of both a Boolean falsity and an empty token
      list were allowed, it would create problems for an operating
      system that cannot distinguish between the two.  This aspect of
      the protocol is recognized as a concession to the Lisp language.
      The unfortunate effect is to prevent operating systems from
      distinguishing between Boolean falsity and an empty list.

   5. No so-called "fat strings" can be sent.