
RFC 1813

NFS Version 3 Protocol Specification

Pages: 126
Informational

Network Working Group                                       B. Callaghan
Request for Comments: 1813                                  B. Pawlowski
Category: Informational                                      P. Staubach
                                                  Sun Microsystems, Inc.
                                                               June 1995


                  NFS Version 3 Protocol Specification

Status of this Memo

   This memo provides information for the Internet community.
   This memo does not specify an Internet standard of any kind.
   Distribution of this memo is unlimited.

IESG Note

   Internet Engineering Steering Group comment: please note that
   the IETF is not involved in creating or maintaining this
   specification.  This is the significance of the specification
   not being on the standards track.

Abstract

   This paper describes the NFS version 3 protocol.  This paper is
   provided so that people can write compatible implementations.

Table of Contents

   1.    Introduction . . . . . . . . . . . . . . . . . . . . . . .   3
   1.1     Scope of the NFS version 3 protocol  . . . . . . . . . .   4
   1.2     Useful terms . . . . . . . . . . . . . . . . . . . . . .   5
   1.3     Remote Procedure Call  . . . . . . . . . . . . . . . . .   5
   1.4     External Data Representation . . . . . . . . . . . . . .   5
   1.5     Authentication and Permission Checking . . . . . . . . .   7
   1.6     Philosophy . . . . . . . . . . . . . . . . . . . . . . .   8
   1.7     Changes from the NFS version 2 protocol  . . . . . . . .  11
   2.    RPC Information  . . . . . . . . . . . . . . . . . . . . .  14
   2.1     Authentication . . . . . . . . . . . . . . . . . . . . .  14
   2.2     Constants  . . . . . . . . . . . . . . . . . . . . . . .  14
   2.3     Transport address  . . . . . . . . . . . . . . . . . . .  14
   2.4     Sizes  . . . . . . . . . . . . . . . . . . . . . . . . .  14
   2.5     Basic Data Types . . . . . . . . . . . . . . . . . . . .  15
   2.6     Defined Error Numbers  . . . . . . . . . . . . . . . . .  17
   3.    Server Procedures  . . . . . . . . . . . . . . . . . . . .  27
   3.1     General comments on attributes . . . . . . . . . . . . .  29
   3.2     General comments on filenames  . . . . . . . . . . . . .  30
   3.3.0   NULL: Do nothing . . . . . . . . . . . . . . . . . . . .  31
   3.3.1   GETATTR: Get file attributes . . . . . . . . . . . . . .  32
   3.3.2   SETATTR: Set file attributes . . . . . . . . . . . . . .  33
   3.3.3   LOOKUP: Lookup filename  . . . . . . . . . . . . . . . .  37
   3.3.4   ACCESS: Check access permission  . . . . . . . . . . . .  40
   3.3.5   READLINK: Read from symbolic link  . . . . . . . . . . .  44
   3.3.6   READ: Read from file . . . . . . . . . . . . . . . . . .  46
   3.3.7   WRITE: Write to file . . . . . . . . . . . . . . . . . .  49
   3.3.8   CREATE: Create a file  . . . . . . . . . . . . . . . . .  54
   3.3.9   MKDIR: Create a directory  . . . . . . . . . . . . . . .  58
   3.3.10  SYMLINK: Create a symbolic link  . . . . . . . . . . . .  61
   3.3.11  MKNOD: Create a special device . . . . . . . . . . . . .  63
   3.3.12  REMOVE: Remove a file  . . . . . . . . . . . . . . . . .  67
   3.3.13  RMDIR: Remove a directory  . . . . . . . . . . . . . . .  69
   3.3.14  RENAME: Rename a file or directory . . . . . . . . . . .  71
   3.3.15  LINK: Create link to an object . . . . . . . . . . . . .  74
   3.3.16  READDIR: Read From directory . . . . . . . . . . . . . .  76
   3.3.17  READDIRPLUS: Extended read from directory  . . . . . . .  80
   3.3.18  FSSTAT: Get dynamic file system information  . . . . . .  84
   3.3.19  FSINFO: Get static file system information . . . . . . .  86
   3.3.20  PATHCONF: Retrieve POSIX information . . . . . . . . . .  90
   3.3.21  COMMIT: Commit cached data on a server to stable storage  92
   4.    Implementation issues  . . . . . . . . . . . . . . . . . .  96
   4.1     Multiple version support . . . . . . . . . . . . . . . .  96
   4.2     Server/client relationship . . . . . . . . . . . . . . .  96
   4.3     Path name interpretation . . . . . . . . . . . . . . . .  97
   4.4     Permission issues  . . . . . . . . . . . . . . . . . . .  98
   4.5     Duplicate request cache  . . . . . . . . . . . . . . . .  99
   4.6     File name component handling . . . . . . . . . . . . . . 101
   4.7     Synchronous modifying operations . . . . . . . . . . . . 101
   4.8     Stable storage . . . . . . . . . . . . . . . . . . . . . 101
   4.9     Lookups and name resolution  . . . . . . . . . . . . . . 102
   4.10    Adaptive retransmission  . . . . . . . . . . . . . . . . 102
   4.11    Caching policies . . . . . . . . . . . . . . . . . . . . 102
   4.12    Stable versus unstable writes. . . . . . . . . . . . . . 103
   4.13    32 bit clients/servers and 64 bit clients/servers. . . . 104
   5.    Appendix I: Mount protocol . . . . . . . . . . . . . . . . 106
   5.1     RPC Information  . . . . . . . . . . . . . . . . . . . . 106
   5.1.1     Authentication . . . . . . . . . . . . . . . . . . . . 106
   5.1.2     Constants  . . . . . . . . . . . . . . . . . . . . . . 106
   5.1.3     Transport address  . . . . . . . . . . . . . . . . . . 106
   5.1.4     Sizes  . . . . . . . . . . . . . . . . . . . . . . . . 106
   5.1.5     Basic Data Types . . . . . . . . . . . . . . . . . . . 106
   5.2     Server Procedures  . . . . . . . . . . . . . . . . . . . 107
   5.2.0     NULL: Do nothing . . . . . . . . . . . . . . . . . . . 108
   5.2.1     MNT: Add mount entry . . . . . . . . . . . . . . . . . 109
   5.2.2     DUMP: Return mount entries . . . . . . . . . . . . . . 110
   5.2.3     UMNT: Remove mount entry . . . . . . . . . . . . . . . 111
   5.2.4     UMNTALL: Remove all mount entries  . . . . . . . . . . 112
   5.2.5     EXPORT: Return export list . . . . . . . . . . . . . . 113
   6.    Appendix II: Lock manager protocol . . . . . . . . . . . . 114
   6.1     RPC Information  . . . . . . . . . . . . . . . . . . . . 114
   6.1.1     Authentication . . . . . . . . . . . . . . . . . . . . 114
   6.1.2     Constants  . . . . . . . . . . . . . . . . . . . . . . 114
   6.1.3     Transport Address  . . . . . . . . . . . . . . . . . . 115
   6.1.4     Basic Data Types . . . . . . . . . . . . . . . . . . . 115
   6.2     NLM Procedures . . . . . . . . . . . . . . . . . . . . . 118
   6.2.0     NULL: Do nothing . . . . . . . . . . . . . . . . . . . 120
   6.3     Implementation issues  . . . . . . . . . . . . . . . . . 120
   6.3.1     64-bit offsets and lengths . . . . . . . . . . . . . . 120
   6.3.2     File handles . . . . . . . . . . . . . . . . . . . . . 120
   7.    Appendix III: Bibliography . . . . . . . . . . . . . . . . 122
   8.    Security Considerations  . . . . . . . . . . . . . . . . . 125
   9.    Acknowledgements . . . . . . . . . . . . . . . . . . . . . 125
   10.   Authors' Addresses . . . . . . . . . . . . . . . . . . . . 126

1. Introduction

   Sun's NFS protocol provides transparent remote access to shared
   file systems across networks. The NFS protocol is designed to be
   machine, operating system, network architecture, and transport
   protocol independent. This independence is achieved through the
   use of Remote Procedure Call (RPC) primitives built on top of an
   eXternal Data Representation (XDR).  Implementations of the NFS
   version 2 protocol exist for a variety of machines, from personal
   computers to supercomputers. The initial version of the NFS
   protocol is specified in the Network File System Protocol
   Specification [RFC1094]. A description of the initial
   implementation can be found in [Sandberg].

   The supporting MOUNT protocol performs the operating
   system-specific functions that allow clients to attach remote
   directory trees to a point within the local file system. The
   mount process also allows the server to grant remote access
   privileges to a restricted set of clients via export control.

   The Lock Manager provides support for file locking when used in
   the NFS environment. The Network Lock Manager (NLM) protocol
   isolates the inherently stateful aspects of file locking into a
   separate protocol.

   A complete description of the above protocols and their
   implementation is to be found in [X/OpenNFS].

   The purpose of this document is to:

        o Specify the NFS version 3 protocol.

        o Describe semantics of the protocol through annotation
          and description of intended implementation.

        o Specify the MOUNT version 3 protocol.

        o Briefly describe the changes between the NLM version 3
          protocol and the NLM version 4 protocol.

   The normative text is the description of the RPC procedures and
   arguments and results, which defines the over-the-wire protocol,
   and the semantics of those procedures. The material describing
   implementation practice aids the understanding of the protocol
   specification and describes some possible implementation issues
   and solutions. It is not possible to describe all implementations,
   and the UNIX operating system implementation of the NFS version 3
   protocol is most often used to provide examples. Given that, the
   implementation discussion does not bear the authority of the
   description of the over-the-wire protocol itself.

1.1 Scope of the NFS version 3 protocol

   This revision of the NFS protocol addresses new requirements.
   The need to support larger files and file systems has prompted
   extensions to allow 64 bit file sizes and offsets. The revision
   enhances security by adding support for an access check to be
   done on the server. Performance modifications are of three
   types:

   1. The number of over-the-wire packets for a given
      set of file operations is reduced by returning file
      attributes on every operation, thus decreasing the number
      of calls to get modified attributes.

   2. The write throughput bottleneck caused by the synchronous
      definition of write in the NFS version 2 protocol has been
      addressed by adding support so that the NFS server can do
      unsafe writes. Unsafe writes are writes which have not
      been committed to stable storage before the operation
      returns.  This specification defines a method for
      committing these unsafe writes to stable storage in a
      reliable way.

   3. Limitations on transfer sizes have been relaxed.

   The ability to support multiple versions of a protocol in RPC
   will allow implementors of the NFS version 3 protocol to define
   clients and servers that provide backwards compatibility with
   the existing installed base of NFS version 2 protocol
   implementations.

   The extensions described here represent an evolution of the
   existing NFS protocol and most of the design features of the
   NFS protocol described in [Sandberg] persist. See Changes
   from the NFS version 2 protocol on page 11 for a more
   detailed summary of the changes introduced by this revision.

1.2 Useful terms

   In this specification, a "server" is a machine that provides
   resources to the network; a "client" is a machine that accesses
   resources over the network; a "user" is a person logged in on a
   client; an "application" is a program that executes on a client.

1.3 Remote Procedure Call

   The Sun Remote Procedure Call specification provides a
   procedure-oriented interface to remote services. Each server
   supplies a program, which is a set of procedures. The NFS
   service is one such program. The combination of host address,
   program number, version number, and procedure number specifies one
   remote service procedure.  Servers can support multiple versions
   of a program by using different protocol version numbers.
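
   As an illustration, the addressing tuple can be modeled as a small
   record. This is only a sketch: 100003 and 3 are the program and
   version numbers assigned to the NFS version 3 service, the
   procedure numbers follow the 3.3.N numbering of the procedure
   sections, and the host name and the names RemoteProcedure,
   NFS3_PROCS, and nfs3_procedure are hypothetical.

```python
from typing import NamedTuple

NFS_PROGRAM = 100003   # RPC program number assigned to NFS
NFS_V3 = 3             # protocol version described by this document

# Procedure numbers follow the 3.3.N section numbering (subset shown).
NFS3_PROCS = {"NULL": 0, "GETATTR": 1, "SETATTR": 2, "LOOKUP": 3,
              "ACCESS": 4, "READ": 6, "WRITE": 7, "COMMIT": 21}


class RemoteProcedure(NamedTuple):
    """Host, program, version and procedure together name one procedure."""
    host: str
    program: int
    version: int
    procedure: int


def nfs3_procedure(host: str, name: str) -> RemoteProcedure:
    """Build the addressing tuple for a named NFS version 3 procedure."""
    return RemoteProcedure(host, NFS_PROGRAM, NFS_V3, NFS3_PROCS[name])


lookup = nfs3_procedure("server.example.com", "LOOKUP")
```

   A server that also offers the NFS version 2 protocol would be
   reached with the same host and program number but a different
   version number.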

   The NFS protocol was designed to not require any specific level
   of reliability from its lower levels so it could potentially be
   used on many underlying transport protocols. The NFS service is
   based on RPC which provides the abstraction above lower level
   network and transport protocols.

   The rest of this document assumes the NFS environment is
   implemented on top of Sun RPC, which is specified in [RFC1057].
   A complete discussion is found in [Corbin].

1.4 External Data Representation

   The eXternal Data Representation (XDR) specification provides a
   standard way of representing a set of data types on a network.
   This solves the problem of different byte orders, structure
   alignment, and data type representation on different,
   communicating machines.

   In this document, the RPC Data Description Language is used to
   specify the XDR format of the parameters and results of each RPC
   service procedure that an NFS server provides. The RPC Data
   Description Language is similar to declarations in the C
   programming language. A few new constructs have been added.
   The notation:

      string  name[SIZE];
      string  data<DSIZE>;

   defines name, which is a fixed size block of SIZE bytes, and
   data, which is a variable sized block of up to DSIZE bytes. This
   notation indicates fixed-length arrays and arrays with a
   variable number of elements up to a fixed maximum. A
   variable-length definition with no size specified means there is
   no maximum size for the field.
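
   The difference between the two notations shows up on the wire. The
   following sketch assumes the usual XDR rules: items are padded
   with zero bytes to a multiple of four, and a variable-size item is
   preceded by a four-byte big-endian length. The function names are
   hypothetical.

```python
import struct


def xdr_pad(data: bytes) -> bytes:
    """Pad with zero bytes to the next multiple of four, as XDR requires."""
    return data + b"\x00" * (-len(data) % 4)


def encode_fixed(data: bytes, size: int) -> bytes:
    """Fixed-size block: exactly `size` bytes, no length prefix."""
    if len(data) != size:
        raise ValueError("fixed-size field must be exactly %d bytes" % size)
    return xdr_pad(data)


def encode_variable(data: bytes, max_size=None) -> bytes:
    """Variable-size block: 4-byte big-endian length, then padded data."""
    if max_size is not None and len(data) > max_size:
        raise ValueError("field exceeds maximum of %d bytes" % max_size)
    return struct.pack(">I", len(data)) + xdr_pad(data)


def decode_variable(buf: bytes, offset: int = 0):
    """Return (data, next_offset) for a variable-size block at `offset`."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    return buf[start:end], end + (-length % 4)
```

   Passing max_size=None corresponds to a definition with no size
   specified.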

   The discriminated union definition:

      union example switch (enum status) {
           case OK:
              struct {
                 filename      file1;
                 filename      file2;
                 integer       count;
              }
           case ERROR:
              struct {
                 errstat       error;
                 integer       errno;
              }
           default:
              void;
      }

   defines a structure where the first thing over the network is an
   enumeration type called status. If the value of status is OK,
   the next thing on the network will be the structure containing
   file1, file2, and count. Else, if the value of status is ERROR,
   the next thing on the network will be a structure containing
   error and errno.  If the value of status is neither OK nor
   ERROR, then there is no more data in the structure.
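
   A hand-written encoder for this union might look like the sketch
   below. The text does not assign values to OK and ERROR or define
   the field types, so the sketch assumes OK = 0 and ERROR = 1, treats
   filename as a variable-length string, and treats errstat and
   integer as four-byte signed integers.

```python
import struct

OK, ERROR = 0, 1   # assumed enum values; the text does not assign them


def _string(data: bytes) -> bytes:
    # XDR variable-length string: length word, then data padded to 4 bytes
    return struct.pack(">I", len(data)) + data + b"\x00" * (-len(data) % 4)


def encode_example(status: int, **arm) -> bytes:
    """Encode the discriminant first, then only the arm it selects."""
    out = struct.pack(">i", status)
    if status == OK:
        out += _string(arm["file1"]) + _string(arm["file2"])
        out += struct.pack(">i", arm["count"])
    elif status == ERROR:
        out += struct.pack(">ii", arm["error"], arm["errno"])
    # default arm is void: nothing follows the discriminant
    return out
```

   A decoder reads the discriminant first and uses it to decide how
   many more bytes, if any, belong to the union.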

   The XDR type, hyper, is an 8 byte (64 bit) quantity. It is used
   in the same way as the integer type. For example:

      hyper          foo;
      unsigned hyper bar;

   foo is an 8 byte signed value, while bar is an 8 byte unsigned
   value.

   Although RPC/XDR compilers exist to generate client and server
   stubs from RPC Data Description Language input, NFS
   implementations do not require their use. Any software that
   provides equivalent encoding and decoding to the canonical
   network order of data defined by XDR can be used to interoperate
   with other NFS implementations.
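
   As a sketch of such hand-written encoding, the two hyper
   declarations shown earlier can be produced without a stub
   compiler. XDR transmits the most significant byte first; the
   function names here are hypothetical.

```python
import struct


def encode_hyper(value: int) -> bytes:
    """Signed 64-bit quantity, most significant byte first."""
    return struct.pack(">q", value)


def encode_unsigned_hyper(value: int) -> bytes:
    """Unsigned 64-bit quantity, most significant byte first."""
    return struct.pack(">Q", value)


foo = encode_hyper(-1)
bar = encode_unsigned_hyper(2**32)   # an offset just past the 32-bit range
```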

   XDR is described in [RFC1014].

1.5 Authentication and Permission Checking

   The RPC protocol includes a slot for authentication parameters
   on every call. The contents of the authentication parameters are
   determined by the type of authentication used by the server and
   client. A server may support several different flavors of
   authentication at once. The AUTH_NONE flavor provides null
   authentication, that is, no authentication information is
   passed. The AUTH_UNIX flavor provides UNIX-style user ID, group
   ID, and groups with each call. The AUTH_DES flavor provides
   DES-encrypted authentication parameters based on a network-wide
   name, with session keys exchanged via a public key scheme. The
   AUTH_KERB flavor provides DES encrypted authentication
   parameters based on a network-wide name with session keys
   exchanged via Kerberos secret keys.

   The NFS server checks permissions by taking the credentials from
   the RPC authentication information in each remote request. For
   example, using the AUTH_UNIX flavor of authentication, the
   server gets the user's effective user ID, effective group ID and
   groups on each call, and uses them to check access. Using user
   ids and group ids implies that the client and server either
   share the same ID list or do local user and group ID mapping.
   Servers and clients must agree on the mapping from user to uid
   and from group to gid, for those sites that do not implement a
   consistent user ID and group ID space. In practice, such mapping
   is typically performed on the server, following a static mapping
   scheme or a mapping established by the user from a client at
   mount time.
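
   The access check made with AUTH_UNIX credentials can be sketched
   as follows, assuming a server that relies only on UNIX-style mode
   bits. The names UnixCred and mode_allows are hypothetical, and the
   sketch deliberately omits superuser handling and any user or
   group ID mapping.

```python
from typing import NamedTuple, Tuple


class UnixCred(NamedTuple):
    """The identity carried by an AUTH_UNIX call."""
    uid: int
    gid: int
    gids: Tuple[int, ...] = ()


READ, WRITE, EXECUTE = 4, 2, 1   # UNIX permission bits within one class


def mode_allows(cred: UnixCred, owner: int, group: int,
                mode: int, want: int) -> bool:
    """Apply the owner, group, or other class of `mode` to `cred`."""
    if cred.uid == owner:
        granted = (mode >> 6) & 7
    elif cred.gid == group or group in cred.gids:
        granted = (mode >> 3) & 7
    else:
        granted = mode & 7
    return (granted & want) == want
```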

   The AUTH_DES and AUTH_KERB styles of authentication are based on a
   network-wide name. They provide greater security through the use
   of DES encryption and public keys in the case of AUTH_DES, and
   DES encryption and Kerberos secret keys (and tickets) in the
   AUTH_KERB case. Again, the server and client must agree on the
   identity of a particular name on the network, but the name to
   identity mapping is more operating system independent than the
   uid and gid mapping in AUTH_UNIX. Also, because the
   authentication parameters are encrypted, a malicious user must
   know another user's network password or private key to masquerade
   as that user. Similarly, the server returns a verifier that is
   also encrypted so that masquerading as a server requires knowing
   a network password.

   The NULL procedure typically requires no authentication.

1.6 Philosophy

   This specification defines the NFS version 3 protocol, that is,
   the over-the-wire protocol by which a client accesses a server.
   The protocol provides a well-defined interface to a server's
   file resources. A client or server implements the protocol and
   provides a mapping of the local file system semantics and
   actions into those defined in the NFS version 3 protocol.
   Implementations may differ to varying degrees, depending on the
   extent to which a given environment can support all the
   operations and semantics defined in the NFS version 3 protocol.
   Although implementations exist and are used to illustrate
   various aspects of the NFS version 3 protocol, the protocol
   specification itself is the final description of how clients
   access server resources.

   Because the NFS version 3 protocol is designed to be
   operating-system independent, it does not necessarily match the
   semantics of any existing system. Server implementations are
   expected to make a best effort at supporting the protocol.  If a
   server cannot support a particular protocol procedure, it may
   return the error, NFS3ERR_NOTSUPP, that indicates that the
   operation is not supported.  For example, many operating systems
   do not support the notion of a hard link. A server that cannot
   support hard links should return NFS3ERR_NOTSUPP in response to a
   LINK request. FSINFO describes the most commonly unsupported
   procedures in the properties bit map.  Alternatively, a server
   may not natively support a given operation, but can emulate it
   in the NFS version 3 protocol implementation to provide greater
   functionality.
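
   The hard link case can be sketched as below. The error value
   10004 is the "operation not supported" code (spelled
   NFS3ERR_NOTSUPP in the nfsstat3 definition), and FSF3_LINK and
   FSF3_SYMLINK are the FSINFO properties bits for hard and symbolic
   link support. The two functions are hypothetical and stand in for
   real server and client code.

```python
NFS3_OK = 0
NFS3ERR_NOTSUPP = 10004   # "operation is not supported" (nfsstat3 spelling)

# FSINFO properties bits
FSF3_LINK = 0x0001        # server supports hard links
FSF3_SYMLINK = 0x0002     # server supports symbolic links


def server_link(properties: int) -> int:
    """Hypothetical LINK handler for a server described by `properties`."""
    if not properties & FSF3_LINK:
        return NFS3ERR_NOTSUPP
    return NFS3_OK            # a real server would create the link here


def client_can_link(properties: int) -> bool:
    """A client can consult FSINFO first and avoid a doomed LINK call."""
    return bool(properties & FSF3_LINK)
```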

   In some cases, a server can support most of the semantics
   described by the protocol but not all. For example, the ctime
   field in the fattr structure gives the time that a file's
   attributes were last modified. Many systems do not keep this
   information. In this case, rather than not support the GETATTR
   operation, a server could simulate it by returning the last
   modified time in place of ctime.  Servers must be careful when
   simulating attribute information because of possible side
   effects on clients. For example, many clients use file
   modification times as a basis for their cache consistency
   scheme.

   NFS servers are dumb and NFS clients are smart. It is the
   clients that do the work required to convert the generalized
   file access that servers provide into a file access method that
   is useful to applications and users. In the LINK example given
   above, a UNIX client that received an NFS3ERR_NOTSUPP error from
   a server would do the recovery necessary to either make it look
   to the application like the link request had succeeded or return
   a reasonable error. In general, it is the burden of the client
   to recover.

   The NFS version 3 protocol assumes a stateless server
   implementation.  Statelessness means that the server does not
   need to maintain state about any of its clients in order to
   function correctly. Stateless servers have a distinct advantage
   over stateful servers in the event of a crash. With stateless
   servers, a client need only retry a request until the server
   responds; the client does not even need to know that the server
   has crashed. See additional comments in Duplicate request cache
   on page 99.
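
   The client side of this recovery model can be sketched as a
   simple retry loop. The send function is hypothetical: it is
   assumed to return the reply, or None if no reply arrived within
   the timeout. The doubling of the timeout is an illustrative
   policy, not something this section prescribes.

```python
def call_with_retry(send, request, timeout=1.0, max_timeout=30.0):
    """Resend `request` until `send` returns a reply.

    The same request is sent each time; the client neither knows nor
    cares whether the silence was caused by a lost packet or a crash.
    """
    while True:
        reply = send(request, timeout)
        if reply is not None:
            return reply
        timeout = min(timeout * 2, max_timeout)   # back off, with a ceiling
```

   Blind retransmission is harmless for requests such as GETATTR;
   requests that are not idempotent are the reason for the duplicate
   request cache mentioned above.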

   For a server to be useful, it holds nonvolatile state: data
   stored in the file system. Design assumptions in the NFS version
   3 protocol regarding flushing of modified data to stable storage
   reduce the number of failure modes in which data loss can occur.
   In this way, NFS version 3 protocol implementations can tolerate
   transient failures, including transient failures of the network.
   In general, server implementations of the NFS version 3 protocol
   cannot tolerate a non-transient failure of the stable storage
   itself. However, there exist fault tolerant implementations
   which attempt to address such problems.

   That is not to say that an NFS version 3 protocol server can't
   maintain noncritical state. In many cases, servers will maintain
   state (cache) about previous operations to increase performance.
   For example, a client READ request might trigger a read-ahead of
   the next block of the file into the server's data cache in the
   anticipation that the client is doing a sequential read and the
   next client READ request will be satisfied from the server's
   data cache instead of from the disk. Read-ahead on the server
   increases performance by overlapping server disk I/O with client
   requests. The important point here is that the read-ahead block
   is not necessary for correct server behavior. If the server
   crashes and loses its memory cache of read buffers, recovery is
   simple on reboot - clients will continue read operations
   retrieving data from the server disk.

   Most data-modifying operations in the NFS protocol are
   synchronous.  That is, when a data modifying procedure returns
   to the client, the client can assume that the operation has
   completed and any modified data associated with the request is
   now on stable storage. For example, a synchronous client WRITE
   request may cause the server to update data blocks, file system
   information blocks, and file attribute information - the latter
   information is usually referred to as metadata. When the WRITE
   operation completes, the client can assume that the write data
   is safe and discard it.  This is a very important part of the
   stateless nature of the server. If the server did not flush
   dirty data to stable storage before returning to the client, the
   client would have no way of knowing when it was safe to discard
   modified data. The following data modifying procedures are
   synchronous: WRITE (with stable flag set to FILE_SYNC), CREATE,
   MKDIR, SYMLINK, MKNOD, REMOVE, RMDIR, RENAME, LINK, and COMMIT.

   The NFS version 3 protocol introduces safe asynchronous writes
   on the server, when the WRITE procedure is used in conjunction
   with the COMMIT procedure. The COMMIT procedure provides a way
   for the client to flush data from previous asynchronous WRITE
   requests on the server to stable storage and to detect whether
   it is necessary to retransmit the data. See the procedure
   descriptions of WRITE on page 49 and COMMIT on page 92.
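
   The interaction between asynchronous WRITE and COMMIT can be
   sketched with an in-memory stand-in for a server. The sketch
   assumes that both replies carry a write verifier that changes when
   the server reboots; a small integer is used here in place of the
   opaque verifier carried on the wire. FakeWriteServer, its
   crash_on_next_commit hook, and flush are all hypothetical.

```python
UNSTABLE, DATA_SYNC, FILE_SYNC = 0, 1, 2   # stable_how values for WRITE


class FakeWriteServer:
    """In-memory stand-in for a server; not a real NFS implementation."""

    def __init__(self):
        self.verifier = 1                  # changes when the server reboots
        self.cache = {}                    # data not yet on stable storage
        self.stable = {}                   # data on stable storage
        self.crash_on_next_commit = False  # test hook: simulate a reboot

    def write(self, offset, data, stable=UNSTABLE):
        target = self.cache if stable == UNSTABLE else self.stable
        target[offset] = data
        return self.verifier

    def commit(self):
        if self.crash_on_next_commit:
            self.crash_on_next_commit = False
            self.cache.clear()             # uncommitted data is lost
            self.verifier += 1             # new verifier after the reboot
        self.stable.update(self.cache)
        self.cache.clear()
        return self.verifier


def flush(server, pending):
    """Write `pending` ({offset: data}) unstably, then COMMIT.

    The client keeps `pending` until COMMIT returns the same verifier
    that every WRITE returned; otherwise it retransmits everything.
    """
    if not pending:
        return
    while True:
        verifiers = {server.write(off, data) for off, data in pending.items()}
        if verifiers == {server.commit()}:
            return
```

   Only after flush returns may the client discard its copy of the
   data.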

   The LOOKUP procedure is used by the client to traverse
   multicomponent file names (pathnames). Each call to LOOKUP is
   used to resolve one segment of a pathname. There are two reasons
   for restricting LOOKUP to a single segment: it is hard to
   standardize a common format for hierarchical file names and the
   client and server may have different mappings of pathnames to
   file systems. This would imply that either the client must break
   the path name at file system attachment points, or the server
   must know about the client's file system attachment points. In
   NFS version 3 protocol implementations, it is the client that
   constructs the hierarchical file name space using mounts to
   build a hierarchy. Support utilities, such as the Automounter,
   provide a way to manage a shared, consistent image of the file
   name space while still being driven by the client mount
   process.
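
   Component-at-a-time resolution can be sketched as follows. The
   server is modeled as a dictionary from directory file handles to
   their entries; FakeDirServer and resolve are hypothetical names,
   and the "/" separator is a choice made by this client, not by the
   protocol.

```python
class FakeDirServer:
    """Directory tree keyed by file handle; a stand-in for a real server."""

    def __init__(self, tree):
        self.tree = tree      # {dir_handle: {name: child_handle}}

    def lookup(self, dir_handle, name):
        """Resolve one name within one directory, as LOOKUP does."""
        try:
            return self.tree[dir_handle][name]
        except KeyError:
            raise FileNotFoundError(name)


def resolve(server, root_handle, path, separator="/"):
    """Walk `path` from `root_handle`, one LOOKUP call per component.

    The client, not the server, decides how a pathname is split.
    """
    handle = root_handle
    for component in path.split(separator):
        if component:                     # skip empty segments such as "//"
            handle = server.lookup(handle, component)
    return handle
```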

   Clients can perform caching in a variety of ways. The general
   practice with the NFS version 2 protocol was to implement a
   time-based client-server cache consistency mechanism. It is
   expected NFS version 3 protocol implementations will use a
   similar mechanism. The NFS version 3 protocol has some explicit
   support, in the form of additional attribute information to
   eliminate explicit attribute checks. However, caching is not
   required, nor is any caching policy defined by the protocol.
   Neither the NFS version 2 protocol nor the NFS version 3
   protocol provides a means of maintaining strict client-server
   consistency (and, by implication, consistency across client
   caches).
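
   One possible time-based scheme is sketched below. It is an
   illustration of client policy only: the three second timeout, the
   fetch callback (standing in for a GETATTR call), and the AttrCache
   class are all assumptions made for this example.

```python
import time


class AttrCache:
    """Time-based attribute cache; `timeout` is a client policy choice."""

    def __init__(self, fetch, timeout=3.0, clock=time.monotonic):
        self.fetch = fetch        # hypothetical function issuing GETATTR
        self.timeout = timeout
        self.clock = clock
        self.entries = {}         # file handle -> (attributes, expiry time)

    def update(self, handle, attrs):
        """Record attributes piggybacked on any reply, avoiding a GETATTR."""
        self.entries[handle] = (attrs, self.clock() + self.timeout)

    def get(self, handle):
        """Return cached attributes, refetching once they have expired."""
        entry = self.entries.get(handle)
        if entry is None or self.clock() >= entry[1]:
            self.update(handle, self.fetch(handle))
        return self.entries[handle][0]
```

   Calling update with the attributes returned by other procedures is
   how the additional attribute information reduces explicit checks.
   Between refreshes the client may still observe stale attributes.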

1.7 Changes from the NFS version 2 protocol

   The ROOT and WRITECACHE procedures have been removed. A MKNOD
   procedure has been defined to allow the creation of special
   files, eliminating the overloading of CREATE. Caching on the
   client is not defined nor dictated by the NFS version 3
   protocol, but additional information and hints have been added
   to the protocol to allow clients that implement caching to
   manage their caches more effectively. Procedures that affect the
   attributes of a file or directory may now return the new
   attributes after the operation has completed to optimize out a
   subsequent GETATTR used in validating attribute caches. In
   addition, operations that modify the directory in which the
   target object resides return the old and new attributes of the
   directory to allow clients to implement more intelligent cache
   invalidation procedures.  The ACCESS procedure provides access
   permission checking on the server, the FSSTAT procedure returns
   dynamic information about a file system, the FSINFO procedure
   returns static information about a file system and server, the
   READDIRPLUS procedure returns file handles and attributes in
   addition to directory entries, and the PATHCONF procedure
   returns POSIX pathconf information about a file.

   Below is a list of the important changes between the NFS version
   2 protocol and the NFS version 3 protocol.

   File handle size
         The file handle has been increased to a variable-length
         array of 64 bytes maximum from a fixed array of 32
         bytes. This addresses some known requirements for a
         slightly larger file handle size. The file handle was
         converted from fixed length to variable length to
         reduce local storage and network bandwidth requirements
         for systems which do not utilize the full 64 bytes of
         length.

   Maximum data sizes
         The maximum size of a data transfer used in the READ
         and WRITE procedures is now set by values in the FSINFO
         return structure. In addition, preferred transfer sizes
         are returned by FSINFO. The protocol does not place any
         artificial limits on the maximum transfer sizes.

         Filenames and pathnames are now specified as strings of
         variable length. The actual length restrictions are
         determined by the client and server implementations as
         appropriate.  The protocol does not place any
         artificial limits on the length. The error,
         NFS3ERR_NAMETOOLONG, is provided to allow the server to
         return an indication to the client that it received a
         pathname that was too long for it to handle.

   Error return
         Error returns in some instances now return data (for
         example, attributes). nfsstat3 now defines the full set
         of errors that can be returned by a server. No other
         values are allowed.

   File type
         The file type now includes NF3CHR and NF3BLK for
         special files. Attributes for these types include
         subfields for UNIX major and minor device numbers.
         NF3SOCK and NF3FIFO are now defined for sockets and
         fifos in the file system.

   File attributes
         The blocksize (the size in bytes of a block in the
         file) field has been removed. The mode field no longer
         contains file type information. The size and fileid
         fields have been widened to eight-byte unsigned
         integers from four-byte integers. Major and minor
         device information is now presented in a distinct
         structure.  The blocks field name has been changed to
         used and now contains the total number of bytes used by
         the file. It is also an eight-byte unsigned integer.

   Set file attributes
         In the NFS version 2 protocol, the settable attributes
         were represented by a subset of the file attributes
         structure; the client indicated those attributes which
         were not to be modified by setting the corresponding
         field to -1, overloading some unsigned fields. The set
         file attributes structure now uses a discriminated
         union for each field to tell whether or how to set that
         field. The atime and mtime fields can be set to either
         the server's current time or a time supplied by the
         client.

   LOOKUP
         The LOOKUP return structure now includes the attributes
         for the directory searched.

   ACCESS
         An ACCESS procedure has been added to allow an explicit
         over-the-wire permissions check. This addresses known
         problems with the superuser ID mapping feature in many
         server implementations (where, due to mapping of root
         user, unexpected permission denied errors could occur
         while reading from or writing to a file).  This also
         removes the assumption which was made in the NFS
         version 2 protocol that access to files was based
         solely on UNIX style mode bits.

   READ
         The reply structure includes a Boolean that is TRUE if
         the end-of-file was encountered during the READ.  This
         allows the client to correctly detect end-of-file.

   WRITE
         The beginoffset and totalcount fields were removed from
         the WRITE arguments. The reply now includes a count so
         that the server can write less than the requested
         amount of data, if required. An indicator was added to
         the arguments to instruct the server as to the level of
         cache synchronization that is required by the client.

   CREATE
         An exclusive flag and a create verifier were added for
         the exclusive creation of regular files.

   MKNOD
         This procedure was added to support the creation of
         special files. This avoids overloading fields of CREATE
         as was done in some NFS version 2 protocol
         implementations.

   READDIR
         The READDIR arguments now include a verifier to allow
         the server to validate the cookie. The cookie is now a
         64-bit unsigned integer instead of the 4-byte array
         which was used in the NFS version 2 protocol.  This
         will help to reduce interoperability problems.

   READDIRPLUS
         This procedure was added to return file handles and
         attributes in an extended directory list.

   FSINFO
         FSINFO was added to provide nonvolatile information
         about a file system. The reply includes preferred and
         maximum read transfer size, preferred and maximum write
         transfer size, and flags stating whether links or
         symbolic links are supported.  Also returned are
         preferred transfer size for READDIR procedure replies,
         server time granularity, and whether times can be set
         in a SETATTR request.

   FSSTAT
         FSSTAT was added to provide volatile information about
         a file system, for use by utilities such as the Unix
         system df command. The reply includes the total size
         and free space in the file system specified in bytes,
         the total number of files and number of free file slots
         in the file system, and an estimate of time between
         file system modifications (for use in cache consistency
         checking algorithms).

   COMMIT
         The COMMIT procedure provides the synchronization
         mechanism to be used with asynchronous WRITE
         operations.

2. RPC Information

2.1 Authentication

   The NFS service uses AUTH_NONE in the NULL procedure. AUTH_UNIX,
   AUTH_DES, or AUTH_KERB are used for all other procedures. Other
   authentication types may be supported in the future.

2.2 Constants

   These are the RPC constants needed to call the NFS Version 3
   service.  They are given in decimal.

      PROGRAM  100003
      VERSION  3

2.3 Transport address

   The NFS protocol is normally supported over the TCP and UDP
   protocols.  It uses port 2049, the same as the NFS version 2
   protocol.
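
   As an illustration of how these constants are used, the
   following Python sketch builds the RPC call message for the NFS
   NULL procedure with AUTH_NONE. The program, version, and port
   numbers come from sections 2.2 and 2.3; the remaining values
   (message type, RPC version 2, the AUTH_NONE flavor number, and
   procedure number 0) are taken from the ONC RPC message format
   and are assumptions as far as this section is concerned. The
   function name build_null_call is hypothetical.

```python
import struct

NFS_PROGRAM = 100003   # section 2.2
NFS_VERSION = 3        # section 2.2
NFS_PORT = 2049        # section 2.3 (not used below; shown for reference)

# The values below come from the ONC RPC message format, not this section.
RPC_CALL = 0           # msg_type CALL
RPC_VERSION = 2        # ONC RPC protocol version
AUTH_NONE = 0          # authentication flavor number
NFSPROC3_NULL = 0      # procedure number of the NULL procedure


def build_null_call(xid: int) -> bytes:
    """Return the XDR-encoded RPC call message for the NFS NULL procedure."""
    header = struct.pack(">6I", xid, RPC_CALL, RPC_VERSION,
                         NFS_PROGRAM, NFS_VERSION, NFSPROC3_NULL)
    # Credential and verifier: AUTH_NONE flavor plus a zero-length body each.
    empty_auth = struct.pack(">2I", AUTH_NONE, 0)
    return header + empty_auth + empty_auth
```

   Over UDP this message could be sent as a single datagram; over
   TCP, RPC record marking would have to be added, which the
   sketch does not attempt.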

2.4 Sizes

   These are the sizes, given in decimal bytes, of various XDR
   structures used in the NFS version 3 protocol:

   NFS3_FHSIZE 64
      The maximum size in bytes of the opaque file handle.

   NFS3_COOKIEVERFSIZE 8
      The size in bytes of the opaque cookie verifier passed by
      READDIR and READDIRPLUS.

   NFS3_CREATEVERFSIZE 8
      The size in bytes of the opaque verifier used for
      exclusive CREATE.

   NFS3_WRITEVERFSIZE 8
      The size in bytes of the opaque verifier used for
      asynchronous WRITE.

2.5 Basic Data Types

   The following XDR definitions are basic definitions that are
   used in other structures.

   uint64
         typedef unsigned hyper uint64;

   int64
         typedef hyper int64;

   uint32
         typedef unsigned long uint32;

   int32
         typedef long int32;

   filename3
         typedef string filename3<>;

   nfspath3
         typedef string nfspath3<>;

   fileid3
         typedef uint64 fileid3;

   cookie3
         typedef uint64 cookie3;

   cookieverf3
         typedef opaque cookieverf3[NFS3_COOKIEVERFSIZE];

   createverf3
         typedef opaque createverf3[NFS3_CREATEVERFSIZE];

   writeverf3
         typedef opaque writeverf3[NFS3_WRITEVERFSIZE];

   uid3
         typedef uint32 uid3;

   gid3
         typedef uint32 gid3;

   size3
         typedef uint64 size3;

   offset3
         typedef uint64 offset3;

   mode3
         typedef uint32 mode3;

   count3
         typedef uint32 count3;
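
   The following Python sketch shows how a few of these basic
   types might look on the wire under standard XDR encoding
   (big-endian integers, length-prefixed strings padded to a
   multiple of four bytes, fixed-length opaque data without a
   length prefix). The helper names are hypothetical, and the
   choice of UTF-8 for file names is an assumption of the sketch;
   the typedefs above do not fix a character set.

```python
import struct

NFS3_COOKIEVERFSIZE = 8


def pack_uint32(value: int) -> bytes:
    """uint32, uid3, gid3, mode3, count3: 4 bytes, big-endian."""
    return struct.pack(">I", value)


def pack_uint64(value: int) -> bytes:
    """uint64, fileid3, cookie3, size3, offset3: 8 bytes, big-endian."""
    return struct.pack(">Q", value)


def pack_string(text: str) -> bytes:
    """filename3 / nfspath3: 4-byte length, the bytes, zero padding to 4."""
    data = text.encode("utf-8")  # assumption: UTF-8 names
    padding = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def pack_cookieverf3(verf: bytes) -> bytes:
    """Fixed-length opaque: exactly 8 bytes, no length prefix."""
    if len(verf) != NFS3_COOKIEVERFSIZE:
        raise ValueError("cookieverf3 must be exactly 8 bytes")
    return bytes(verf)
```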

   nfsstat3
      enum nfsstat3 {
         NFS3_OK             = 0,
         NFS3ERR_PERM        = 1,
         NFS3ERR_NOENT       = 2,
         NFS3ERR_IO          = 5,
         NFS3ERR_NXIO        = 6,
         NFS3ERR_ACCES       = 13,
         NFS3ERR_EXIST       = 17,
         NFS3ERR_XDEV        = 18,
         NFS3ERR_NODEV       = 19,
         NFS3ERR_NOTDIR      = 20,
         NFS3ERR_ISDIR       = 21,
         NFS3ERR_INVAL       = 22,
         NFS3ERR_FBIG        = 27,
         NFS3ERR_NOSPC       = 28,
         NFS3ERR_ROFS        = 30,
         NFS3ERR_MLINK       = 31,
         NFS3ERR_NAMETOOLONG = 63,
         NFS3ERR_NOTEMPTY    = 66,
         NFS3ERR_DQUOT       = 69,
         NFS3ERR_STALE       = 70,
         NFS3ERR_REMOTE      = 71,
         NFS3ERR_BADHANDLE   = 10001,
         NFS3ERR_NOT_SYNC    = 10002,
         NFS3ERR_BAD_COOKIE  = 10003,
         NFS3ERR_NOTSUPP     = 10004,
         NFS3ERR_TOOSMALL    = 10005,
         NFS3ERR_SERVERFAULT = 10006,
         NFS3ERR_BADTYPE     = 10007,
         NFS3ERR_JUKEBOX     = 10008
      };

   The nfsstat3 type is returned with every procedure's results
   except for the NULL procedure. A value of NFS3_OK indicates that
   the call completed successfully. Any other value indicates that
   some error occurred on the call, as identified by the error
   code. Note that the precise numeric encoding must be followed.
   No other values may be returned by a server. Servers are
   expected to make a best effort mapping of error conditions to
   the set of error codes defined. In addition, no error
   precedences are specified by this specification.  Error
   precedences determine the error value that should be returned
   when more than one error applies in a given situation. The error
   precedence will be determined by the individual server
   implementation. If the client requires specific error
   precedences, it should check for the specific errors for
   itself.
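
   A client might mirror this enumeration as follows. This is a
   minimal Python sketch; the class and function names are
   hypothetical. Because no other values may be returned by a
   server, the decoder rejects any number outside the defined set
   rather than passing it through.

```python
import enum
import struct


class NfsStat3(enum.IntEnum):
    NFS3_OK = 0
    NFS3ERR_PERM = 1
    NFS3ERR_NOENT = 2
    NFS3ERR_IO = 5
    NFS3ERR_NXIO = 6
    NFS3ERR_ACCES = 13
    NFS3ERR_EXIST = 17
    NFS3ERR_XDEV = 18
    NFS3ERR_NODEV = 19
    NFS3ERR_NOTDIR = 20
    NFS3ERR_ISDIR = 21
    NFS3ERR_INVAL = 22
    NFS3ERR_FBIG = 27
    NFS3ERR_NOSPC = 28
    NFS3ERR_ROFS = 30
    NFS3ERR_MLINK = 31
    NFS3ERR_NAMETOOLONG = 63
    NFS3ERR_NOTEMPTY = 66
    NFS3ERR_DQUOT = 69
    NFS3ERR_STALE = 70
    NFS3ERR_REMOTE = 71
    NFS3ERR_BADHANDLE = 10001
    NFS3ERR_NOT_SYNC = 10002
    NFS3ERR_BAD_COOKIE = 10003
    NFS3ERR_NOTSUPP = 10004
    NFS3ERR_TOOSMALL = 10005
    NFS3ERR_SERVERFAULT = 10006
    NFS3ERR_BADTYPE = 10007
    NFS3ERR_JUKEBOX = 10008


def decode_nfsstat3(reply: bytes, offset: int = 0) -> NfsStat3:
    """Decode the 4-byte status; raises ValueError for undefined values."""
    (raw,) = struct.unpack_from(">I", reply, offset)
    return NfsStat3(raw)
```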

2.6 Defined Error Numbers

   A description of each defined error follows:

   NFS3_OK
       Indicates the call completed successfully.

   NFS3ERR_PERM
       Not owner. The operation was not allowed because the
       caller is either not a privileged user (root) or not the
       owner of the target of the operation.

   NFS3ERR_NOENT
       No such file or directory. The file or directory name
       specified does not exist.

   NFS3ERR_IO
       I/O error. A hard error (for example, a disk error)
       occurred while processing the requested operation.

   NFS3ERR_NXIO
       I/O error. No such device or address.

   NFS3ERR_ACCES
       Permission denied. The caller does not have the correct
       permission to perform the requested operation. Contrast
       this with NFS3ERR_PERM, which restricts itself to owner
       or privileged user permission failures.

   NFS3ERR_EXIST
       File exists. The file specified already exists.

   NFS3ERR_XDEV
       Attempt to do a cross-device hard link.

   NFS3ERR_NODEV
       No such device.

   NFS3ERR_NOTDIR
       Not a directory. The caller specified a non-directory in
       a directory operation.

   NFS3ERR_ISDIR
       Is a directory. The caller specified a directory in a
       non-directory operation.

   NFS3ERR_INVAL
       Invalid argument or unsupported argument for an
       operation. Two examples are attempting a READLINK on an
       object other than a symbolic link or attempting to
       SETATTR a time field on a server that does not support
       this operation.

   NFS3ERR_FBIG
       File too large. The operation would have caused a file to
       grow beyond the server's limit.

   NFS3ERR_NOSPC
       No space left on device. The operation would have caused
       the server's file system to exceed its limit.

   NFS3ERR_ROFS
       Read-only file system. A modifying operation was
       attempted on a read-only file system.

   NFS3ERR_MLINK
       Too many hard links.

   NFS3ERR_NAMETOOLONG
       The filename in an operation was too long.

   NFS3ERR_NOTEMPTY
       An attempt was made to remove a directory that was not
       empty.

   NFS3ERR_DQUOT
       Resource (quota) hard limit exceeded. The user's resource
       limit on the server has been exceeded.

   NFS3ERR_STALE
       Invalid file handle. The file handle given in the
       arguments was invalid. The file referred to by that file
       handle no longer exists or access to it has been
       revoked.

   NFS3ERR_REMOTE
       Too many levels of remote in path. The file handle given
       in the arguments referred to a file on a non-local file
       system on the server.

   NFS3ERR_BADHANDLE
       Illegal NFS file handle. The file handle failed internal
       consistency checks.

   NFS3ERR_NOT_SYNC
       Update synchronization mismatch was detected during a
       SETATTR operation.

   NFS3ERR_BAD_COOKIE
       READDIR or READDIRPLUS cookie is stale.

   NFS3ERR_NOTSUPP
       Operation is not supported.

   NFS3ERR_TOOSMALL
       Buffer or request is too small.

   NFS3ERR_SERVERFAULT
       An error occurred on the server which does not map to any
       of the legal NFS version 3 protocol error values.  The
       client should translate this into an appropriate error.
       UNIX clients may choose to translate this to EIO.

   NFS3ERR_BADTYPE
       An attempt was made to create an object of a type not
       supported by the server.

   NFS3ERR_JUKEBOX
       The server initiated the request, but was not able to
       complete it in a timely fashion. The client should wait
       and then try the request with a new RPC transaction ID.
       For example, this error should be returned from a server
       that supports hierarchical storage and receives a request
       to process a file that has been migrated. In this case,
       the server should start the immigration process and
       respond to the client with this error.
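
   The client behavior described for NFS3ERR_JUKEBOX (wait, then
   retry with a new RPC transaction ID) can be sketched in Python
   as below. The callable send_call, the attempt limit, and the
   fixed delay are hypothetical; the text does not prescribe how
   long to wait or how many times to retry.

```python
import itertools
import time

NFS3_OK = 0
NFS3ERR_JUKEBOX = 10008

_xid_counter = itertools.count(1)


def call_with_jukebox_retry(send_call, max_attempts=5, delay=1.0,
                            sleep=time.sleep):
    """Repeat a call while the server answers NFS3ERR_JUKEBOX.

    send_call(xid) is a hypothetical callable that performs one RPC
    with the given transaction ID and returns (status, result).
    """
    for attempt in range(max_attempts):
        xid = next(_xid_counter)          # new transaction ID each attempt
        status, result = send_call(xid)
        if status != NFS3ERR_JUKEBOX:
            return status, result
        if attempt + 1 < max_attempts:
            sleep(delay)                  # give the server time to recall the file
    return NFS3ERR_JUKEBOX, None
```

   A real client would probably use a growing delay and draw
   transaction IDs from the same source as its other calls; both
   are left out to keep the sketch short.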

   ftype3

      enum ftype3 {
         NF3REG    = 1,
         NF3DIR    = 2,
         NF3BLK    = 3,
         NF3CHR    = 4,
         NF3LNK    = 5,
         NF3SOCK   = 6,
         NF3FIFO   = 7
      };

   The enumeration, ftype3, gives the type of a file. The type,
   NF3REG, is a regular file, NF3DIR is a directory, NF3BLK is a
   block special device file, NF3CHR is a character special device
   file, NF3LNK is a symbolic link, NF3SOCK is a socket, and
   NF3FIFO is a named pipe. Note that the precise enum encoding
   must be followed.

   specdata3

      struct specdata3 {
           uint32     specdata1;
           uint32     specdata2;
      };

   The interpretation of the two words depends on the type of file
   system object. For a block special (NF3BLK) or character special
   (NF3CHR) file, specdata1 and specdata2 are the major and minor
   device numbers, respectively.  (This is obviously a
   UNIX-specific interpretation.) For all other file types, these
   two elements should either be set to 0 or the values should be
   agreed upon by the client and server. If the client and server
   do not agree upon the values, the client should treat these
   fields as if they are set to 0. This data field is returned as
   part of the fattr3 structure and so is available from all
   replies returning attributes. Since these fields are otherwise
   unused for objects which are not devices, out of band
   information can be passed from the server to the client.
   However, once again, both the server and the client must agree
   on the values passed.

   nfs_fh3

      struct nfs_fh3 {
         opaque       data<NFS3_FHSIZE>;
      };

   The nfs_fh3 is the variable-length opaque object returned by the
   server on LOOKUP, CREATE, SYMLINK, MKNOD, LINK, or READDIRPLUS
   operations, which is used by the client on subsequent operations
   to reference the file. The file handle contains all the
   information the server needs to distinguish an individual file.
   To the client, the file handle is opaque. The client stores file
   handles for use in a later request and can compare two file
   handles from the same server for equality by doing a
   byte-by-byte comparison, but cannot otherwise interpret the
   contents of file handles. If two file handles from the same
   server are equal, they must refer to the same file, but if they
   are not equal, no conclusions can be drawn. Servers should try
   to maintain a one-to-one correspondence between file handles and
   files, but this is not required. Clients should use file handle
   comparisons only to improve performance, not for correct
   behavior.

   Servers can revoke the access provided by a file handle at any
   time.  If the file handle passed in a call refers to a file
   system object that no longer exists on the server or access for
   that file handle has been revoked, the error, NFS3ERR_STALE,
   should be returned.
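
   A minimal Python sketch of file handle encoding and comparison
   follows. It assumes standard XDR encoding for variable-length
   opaque data (a 4-byte length followed by the bytes, padded to a
   multiple of four); the function names are hypothetical.

```python
import struct

NFS3_FHSIZE = 64


def pack_nfs_fh3(handle: bytes) -> bytes:
    """Encode a file handle as variable-length opaque data."""
    if len(handle) > NFS3_FHSIZE:
        raise ValueError("file handle longer than NFS3_FHSIZE")
    padding = (-len(handle)) % 4
    return struct.pack(">I", len(handle)) + handle + b"\x00" * padding


def unpack_nfs_fh3(buf: bytes, offset: int = 0):
    """Decode a file handle; returns (handle, next_offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    if length > NFS3_FHSIZE:
        raise ValueError("file handle longer than NFS3_FHSIZE")
    start = offset + 4
    handle = bytes(buf[start:start + length])
    if len(handle) != length:
        raise ValueError("truncated file handle")
    return handle, start + length + (-length) % 4


def handles_equal(a: bytes, b: bytes) -> bool:
    """Byte-by-byte comparison of two handles from the same server.

    True means the handles refer to the same file. False proves
    nothing: two different handles may still name one file.
    """
    return a == b
```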

   nfstime3

      struct nfstime3 {
         uint32   seconds;
         uint32   nseconds;
      };

   The nfstime3 structure gives the number of seconds and
   nanoseconds since midnight January 1, 1970 Greenwich Mean Time.
   It is used to pass time and date information. The times
   associated with files are all server times except in the case of
   a SETATTR operation where the client can explicitly set the file
   time. A server converts to and from local time when processing
   time values, preserving as much accuracy as possible. If the
   precision of timestamps stored for a file is less than that
   defined by the NFS version 3 protocol, loss of precision can occur.
   An adjunct time maintenance protocol is recommended to reduce
   client and server time skew.
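
   The sketch below, in Python, encodes an nfstime3 and converts
   one to a datetime value. It also shows the loss of precision
   mentioned above from the other direction: Python's datetime
   stores microseconds, so the conversion drops the last three
   digits of nseconds. The function names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def pack_nfstime3(seconds: int, nseconds: int) -> bytes:
    """Encode nfstime3 as two big-endian uint32 values."""
    return struct.pack(">2I", seconds, nseconds)


def nfstime3_to_datetime(seconds: int, nseconds: int) -> datetime:
    """Convert to an aware UTC datetime, truncating to microseconds."""
    return EPOCH + timedelta(seconds=seconds, microseconds=nseconds // 1000)
```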

   fattr3

      struct fattr3 {
         ftype3     type;
         mode3      mode;
         uint32     nlink;
         uid3       uid;
         gid3       gid;
         size3      size;
         size3      used;
         specdata3  rdev;
         uint64     fsid;
         fileid3    fileid;
         nfstime3   atime;
         nfstime3   mtime;
         nfstime3   ctime;
      };

   This structure defines the attributes of a file system object.
   It is returned by most operations on an object; in the case of
   operations that affect two objects (for example, a MKDIR that
   modifies the target directory attributes and defines new
   attributes for the newly created directory), the attributes for
   both may be returned. In some cases, the attributes are returned
   in the structure, wcc_data, which is defined below; in other
   cases the attributes are returned alone.  The main changes from
   the NFS version 2 protocol are that many of the fields have been
   widened and the major/minor device information is now presented
   in a distinct structure rather than being packed into a word.

   The fattr3 structure contains the basic attributes of a file.
   All servers should support this set of attributes even if they
   have to simulate some of the fields. Type is the type of the
   file. Mode is the protection mode bits. Nlink is the number of
   hard links to the file - that is, the number of different names
   for the same file. Uid is the user ID of the owner of the file.
   Gid is the group ID of the group of the file. Size is the size
   of the file in bytes. Used is the number of bytes of disk space
   that the file actually uses (which can be smaller than the size
   because the file may have holes or it may be larger due to
   fragmentation). Rdev describes the device file if the file type
   is NF3CHR or NF3BLK (see specdata3, above). Fsid is the file
   system identifier for the file system. Fileid is a number which
   uniquely identifies the file within its file system (on UNIX
   this would be the inumber). Atime is the time when the file data
   was last accessed. Mtime is the time when the file data was last
   modified.  Ctime is the time when the attributes of the file
   were last changed.  Writing to the file changes the ctime in
   addition to the mtime.
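
   Since every field of fattr3 has a fixed size, the structure
   occupies 84 bytes under standard XDR encoding and can be
   decoded in one step. The Python sketch below assumes that
   encoding; the names Fattr3 and unpack_fattr3 are hypothetical.

```python
import struct
from typing import NamedTuple

# type, mode, nlink, uid, gid | size, used | rdev | fsid, fileid | 3 times
FATTR3 = struct.Struct(">5I2Q2I2Q6I")   # 84 bytes in total


class Fattr3(NamedTuple):
    type: int
    mode: int
    nlink: int
    uid: int
    gid: int
    size: int
    used: int
    rdev: tuple     # (specdata1, specdata2)
    fsid: int
    fileid: int
    atime: tuple    # (seconds, nseconds)
    mtime: tuple    # (seconds, nseconds)
    ctime: tuple    # (seconds, nseconds)


def unpack_fattr3(buf: bytes, offset: int = 0) -> Fattr3:
    """Decode one fattr3 structure starting at offset."""
    f = FATTR3.unpack_from(buf, offset)
    return Fattr3(f[0], f[1], f[2], f[3], f[4], f[5], f[6],
                  (f[7], f[8]), f[9], f[10],
                  (f[11], f[12]), (f[13], f[14]), (f[15], f[16]))
```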

   The mode bits are defined as follows:

      0x00800 Set user ID on execution.
      0x00400 Set group ID on execution.
      0x00200 Save swapped text (not defined in POSIX).
      0x00100 Read permission for owner.
      0x00080 Write permission for owner.
      0x00040 Execute permission for owner on a file. Or lookup
              (search) permission for owner in directory.
      0x00020 Read permission for group.
      0x00010 Write permission for group.
      0x00008 Execute permission for group on a file. Or lookup
              (search) permission for group in directory.
      0x00004 Read permission for others.
      0x00002 Write permission for others.
      0x00001 Execute permission for others on a file. Or lookup
              (search) permission for others in directory.
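
   As an illustration, the following Python sketch renders the
   mode bits listed above in the familiar "rwx" notation. The
   function name and the labels used for the three high bits are
   hypothetical.

```python
# Permission bits in the order owner, group, others (values from the list above).
PERMISSION_BITS = [
    (0x100, "r"), (0x080, "w"), (0x040, "x"),
    (0x020, "r"), (0x010, "w"), (0x008, "x"),
    (0x004, "r"), (0x002, "w"), (0x001, "x"),
]
SPECIAL_BITS = [
    (0x800, "setuid"), (0x400, "setgid"), (0x200, "save-text"),
]


def describe_mode(mode: int) -> str:
    """Return e.g. 'rwxr-xr-x' plus any special bits that are set."""
    perms = "".join(ch if mode & bit else "-" for bit, ch in PERMISSION_BITS)
    special = [name for bit, name in SPECIAL_BITS if mode & bit]
    return perms + (" " + ",".join(special) if special else "")
```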

   post_op_attr

      union post_op_attr switch (bool attributes_follow) {
      case TRUE:
         fattr3   attributes;
      case FALSE:
         void;
      };

   This structure is used for returning attributes in those
   operations that are not directly involved with manipulating
   attributes. One of the principles of this revision of the NFS
   protocol is to return the real value from the indicated
   operation and not an error from an incidental operation. The
   post_op_attr structure was designed to allow the server to
   recover from errors encountered while getting attributes.

   This appears to make returning attributes optional. However,
   server implementors are strongly encouraged to make best effort
   to return attributes whenever possible, even when returning an
   error.
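
   A decoder for this union might look like the Python sketch
   below. It assumes standard XDR encoding, in which the boolean
   discriminant is a 4-byte integer (0 or 1) and fattr3 occupies
   84 bytes. To stay self-contained it returns the attribute
   bytes undecoded; the function name is hypothetical.

```python
import struct

FATTR3_SIZE = 84   # 5 uint32 + 2 uint64 + specdata3 + 2 uint64 + 3 nfstime3


def unpack_post_op_attr(buf: bytes, offset: int = 0):
    """Return (raw_fattr3_or_None, next_offset)."""
    (follows,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    if follows == 0:                      # case FALSE: void
        return None, offset
    if follows != 1:
        raise ValueError("invalid boolean discriminant")
    if len(buf) < offset + FATTR3_SIZE:
        raise ValueError("truncated fattr3")
    return bytes(buf[offset:offset + FATTR3_SIZE]), offset + FATTR3_SIZE
```

   Because the union is self-describing, a client can continue
   decoding the rest of a reply whether or not the server was able
   to supply attributes.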

   wcc_attr

      struct wcc_attr {
         size3       size;
         nfstime3    mtime;
         nfstime3    ctime;
      };

   This is the subset of pre-operation attributes needed to better
   support the weak cache consistency semantics. Size is the file
   size in bytes of the object before the operation. Mtime is the
   time of last modification of the object before the operation.
   Ctime is the time of last change to the attributes of the object
   before the operation. See the discussion of wcc_data below.

   The use of mtime by clients to detect changes to file system
   objects residing on a server is dependent on the granularity of
   the time base on the server.

   pre_op_attr

      union pre_op_attr switch (bool attributes_follow) {
      case TRUE:
           wcc_attr  attributes;
      case FALSE:
           void;
      };

   wcc_data

      struct wcc_data {
         pre_op_attr    before;
         post_op_attr   after;
      };

   When a client performs an operation that modifies the state of a
   file or directory on the server, it cannot immediately determine
   from the post-operation attributes whether the operation just
   performed was the only operation on the object since the last
   time the client received the attributes for the object. This is
   important, since if an intervening operation has changed the
   object, the client will need to invalidate any cached data for
   the object (except for the data that it just wrote).

   To deal with this, the notion of weak cache consistency data or
   wcc_data is introduced. A wcc_data structure consists of certain
   key fields from the object attributes before the operation,
   together with the object attributes after the operation. This
   information allows the client to manage its cache more
   accurately than in NFS version 2 protocol implementations. The
   term, weak cache consistency, emphasizes the fact that this
   mechanism does not provide the strict server-client consistency
   that a cache consistency protocol would provide.

   In order to support the weak cache consistency model, the server
   will need to be able to get the pre-operation attributes of the
   object, perform the intended modify operation, and then get the
   post-operation attributes atomically. If there is a window for
   the object to get modified between the operation and either of
   the get attributes operations, then the client will not be able
   to determine whether it was the only entity to modify the
   object. Some information will have been lost, thus weakening the
   weak cache consistency guarantees.
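
   One way a client could use wcc_data is sketched below in
   Python. The policy shown (keep cached data only when the
   pre-operation attributes equal what the client had cached) is
   an illustration, not a requirement of this specification. For
   brevity the post-operation attributes are reduced to the same
   three fields as wcc_attr, and all names are hypothetical.

```python
from typing import NamedTuple, Optional


class WccAttr(NamedTuple):
    size: int
    mtime: tuple    # (seconds, nseconds)
    ctime: tuple    # (seconds, nseconds)


def apply_wcc(cached: Optional[WccAttr],
              before: Optional[WccAttr],
              after: Optional[WccAttr]):
    """Return (new_cached_attrs, cached_data_still_valid).

    cached: attributes the client held before issuing the operation.
    before/after: the two halves of wcc_data; None when the server
    did not return them.
    """
    # Cached data survives only if nobody else changed the object
    # between the client's last look and this operation.
    unchanged_by_others = (cached is not None
                           and before is not None
                           and cached == before)
    return after, unchanged_by_others
```

   When the flag is False, the client would discard cached data
   for the object, except for the data it just wrote.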

   post_op_fh3

      union post_op_fh3 switch (bool handle_follows) {
      case TRUE:
           nfs_fh3  handle;
      case FALSE:
           void;
      };

   One of the principles of this revision of the NFS protocol is to
   return the real value from the indicated operation and not an
   error from an incidental operation. The post_op_fh3 structure
   was designed to allow the server to recover from errors
   encountered while constructing a file handle.

   This is the structure used to return a file handle from the
   CREATE, MKDIR, SYMLINK, MKNOD, and READDIRPLUS requests. In each
   case, the client can get the file handle by issuing a LOOKUP
   request after a successful return from one of the listed
   operations. Returning the file handle is an optimization so that
   the client is not forced to immediately issue a LOOKUP request
   to get the file handle.

   sattr3

      enum time_how {
         DONT_CHANGE        = 0,
         SET_TO_SERVER_TIME = 1,
         SET_TO_CLIENT_TIME = 2
      };

      union set_mode3 switch (bool set_it) {
      case TRUE:
         mode3    mode;
      default:
         void;
      };

      union set_uid3 switch (bool set_it) {
      case TRUE:
         uid3     uid;
      default:
         void;
      };

      union set_gid3 switch (bool set_it) {
      case TRUE:
         gid3     gid;
      default:
         void;
      };

      union set_size3 switch (bool set_it) {
      case TRUE:
         size3    size;
      default:
         void;
      };

      union set_atime switch (time_how set_it) {
      case SET_TO_CLIENT_TIME:
         nfstime3  atime;
      default:
         void;
      };

      union set_mtime switch (time_how set_it) {
      case SET_TO_CLIENT_TIME:
         nfstime3  mtime;
      default:
         void;
      };

      struct sattr3 {
         set_mode3   mode;
         set_uid3    uid;
         set_gid3    gid;
         set_size3   size;
         set_atime   atime;
         set_mtime   mtime;
      };

   The sattr3 structure contains the file attributes that can be
   set from the client. The fields are the same as the similarly
   named fields in the fattr3 structure. In the NFS version 3
   protocol, the settable attributes are described by a structure
   containing a set of discriminated unions. Each union indicates
   whether the corresponding attribute is to be updated, and if so,
   how.

   There are two forms of discriminated unions used. In setting the
   mode, uid, gid, or size, the discriminated union is switched on
   a boolean, set_it; if it is TRUE, a value of the appropriate
   type is then encoded.

   In setting the atime or mtime, the union is switched on an
   enumeration type, set_it. If set_it has the value DONT_CHANGE,
   the corresponding attribute is unchanged. If it has the value,
   SET_TO_SERVER_TIME, the corresponding attribute is set by the
   server to its local time; no data is provided by the client.
   Finally, if set_it has the value, SET_TO_CLIENT_TIME, the
   attribute is set to the time passed by the client in an nfstime3
   structure. (See FSINFO on page 86, which addresses the issue of
   time granularity).
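
   The two forms of discriminated union can be sketched in Python
   as follows, assuming standard XDR encoding (4-byte
   discriminants, big-endian values). In this sketch None stands
   for "do not set this attribute", and the time arguments are
   (time_how, (seconds, nseconds)) pairs; those conventions and
   the function names are hypothetical.

```python
import struct

DONT_CHANGE, SET_TO_SERVER_TIME, SET_TO_CLIENT_TIME = 0, 1, 2


def _pack_optional(value, fmt: str) -> bytes:
    """Boolean-switched union: FALSE alone, or TRUE followed by the value."""
    if value is None:
        return struct.pack(">I", 0)
    return struct.pack(">I", 1) + struct.pack(fmt, value)


def _pack_time(how: int, when=None) -> bytes:
    """time_how-switched union; only SET_TO_CLIENT_TIME carries data."""
    if how == SET_TO_CLIENT_TIME:
        if when is None:
            raise ValueError("SET_TO_CLIENT_TIME needs (seconds, nseconds)")
        return struct.pack(">3I", how, when[0], when[1])
    if how not in (DONT_CHANGE, SET_TO_SERVER_TIME):
        raise ValueError("unknown time_how value")
    return struct.pack(">I", how)


def pack_sattr3(mode=None, uid=None, gid=None, size=None,
                atime=(DONT_CHANGE, None), mtime=(DONT_CHANGE, None)) -> bytes:
    """Encode sattr3 in field order: mode, uid, gid, size, atime, mtime."""
    return (_pack_optional(mode, ">I") + _pack_optional(uid, ">I")
            + _pack_optional(gid, ">I") + _pack_optional(size, ">Q")
            + _pack_time(*atime) + _pack_time(*mtime))
```

   For example, pack_sattr3(size=0, mtime=(SET_TO_SERVER_TIME,
   None)) would describe truncating a file and letting the server
   stamp the modification time, leaving every other attribute
   untouched.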

   diropargs3

      struct diropargs3 {
         nfs_fh3     dir;
         filename3   name;
      };

   The diropargs3 structure is used in directory operations. The
   file handle, dir, identifies the directory in which to
   manipulate or access the file, name. See additional comments in
   File name component handling on page 101.
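
   For illustration, a diropargs3 value could be encoded as shown
   in the Python sketch below: the directory file handle followed
   by the name, each as a length-prefixed item padded to a
   multiple of four bytes. Standard XDR encoding is assumed, as is
   UTF-8 for the name; the function names are hypothetical.

```python
import struct

NFS3_FHSIZE = 64


def _pack_opaque(data: bytes) -> bytes:
    """4-byte length, the bytes, then zero padding to a multiple of 4."""
    return struct.pack(">I", len(data)) + data + b"\x00" * ((-len(data)) % 4)


def pack_diropargs3(dir_handle: bytes, name: str) -> bytes:
    """Encode (dir, name) as used by directory operations."""
    if len(dir_handle) > NFS3_FHSIZE:
        raise ValueError("file handle longer than NFS3_FHSIZE")
    return _pack_opaque(dir_handle) + _pack_opaque(name.encode("utf-8"))
```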


