1. Introduction
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119], except where "REQUIRED" and "RECOMMENDED" are used as qualifiers to distinguish classes of attributes as described in Sections 1.4.3.2 and 5 of this document.
1.2. NFS Version 4 Goals
The Network File System version 4 (NFSv4) protocol is a further revision of the NFS protocol already defined by versions 2 [RFC1094] and 3 [RFC1813]. It retains the essential characteristics of previous versions: design for easy recovery; independence from transport protocols, operating systems, and file systems; simplicity; and good performance. The NFSv4 revision has the following goals:
o Improved access and good performance on the Internet. The protocol is designed to transit firewalls easily, perform well where latency is high and bandwidth is low, and scale to very large numbers of clients per server.
o Strong security with negotiation built into the protocol. The protocol builds on the work of the Open Network Computing (ONC) Remote Procedure Call (RPC) working group in supporting the RPCSEC_GSS protocol (see both [RFC2203] and [RFC5403]). Additionally, the NFSv4 protocol provides a mechanism that allows clients and servers to negotiate security and requires clients and servers to support a minimal set of security schemes.
o Good cross-platform interoperability. The protocol features a file system model that provides a useful, common set of features that does not unduly favor one file system or operating system over another.
o Designed for protocol extensions. The protocol is designed to accept standard extensions that do not compromise backward compatibility.
This document, together with the companion External Data Representation (XDR) description document [RFC7531], obsoletes [RFC3530] as the authoritative document describing NFSv4. It does not introduce any over-the-wire protocol changes, in the sense that previously valid requests remain valid.
1.3. Definitions in the Companion Document RFC 7531 Are Authoritative
The "Network File System (NFS) Version 4 External Data Representation Standard (XDR) Description" [RFC7531] contains the definitions in XDR description language of the constructs used by the protocol. Inside this document, several of the constructs are reproduced for purposes of explanation. The reader is warned of the possibility of errors in the reproduced constructs outside of [RFC7531]. For any part of the document that is inconsistent with [RFC7531], [RFC7531] is to be considered authoritative.
1.4. Overview of NFSv4 Features
To provide context, this section briefly reviews the major features of the NFSv4 protocol, both for the reader who is familiar with the previous versions of the NFS protocol and for the reader who is new to the NFS protocols. The reader new to the NFS protocols is still expected to have some fundamental knowledge: familiarity with the XDR and RPC protocols as described in [RFC4506] and [RFC5531], and a basic knowledge of file systems and distributed file systems.
1.4.1. RPC and Security
As with previous versions of NFS, the XDR and RPC mechanisms used for the NFSv4 protocol are those defined in [RFC4506] and [RFC5531]. To meet end-to-end security requirements, the RPCSEC_GSS framework (both version 1 in [RFC2203] and version 2 in [RFC5403]) will be used to extend the basic RPC security. With the use of RPCSEC_GSS, various mechanisms can be provided to offer authentication, integrity, and privacy to the NFSv4 protocol. Kerberos V5 will be used as described in [RFC4121] to provide one security framework. With the use of RPCSEC_GSS, other mechanisms may also be specified and used for NFSv4 security.
To enable in-band security negotiation, the NFSv4 protocol has added a new operation that provides the client with a method of querying the server about its policies regarding which security mechanisms must be used for access to the server's file system resources. With this, the client can securely match the security mechanism that meets the policies specified at both the client and server.
1.4.2. Procedure and Operation Structure
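As a minimal sketch of the evaluation model this section describes (operations evaluated in order, evaluation stopping at the first failure, and a current and a saved filehandle carried between operations), the following Python fragment may help. The operation functions, the in-memory TREE, and the string filehandles are hypothetical stand-ins for illustration and are not part of the protocol; only the status values NFS4_OK and NFS4ERR_NOENT are taken from nfsstat4.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

NFS4_OK = 0
NFS4ERR_NOENT = 2

# Hypothetical namespace: directory "filehandle" -> names it contains.
TREE = {"/": {"export"}, "/export": {"data.txt"}}


@dataclass
class CompoundState:
    """State shared by the operations of one COMPOUND request."""
    current_fh: Optional[str] = None
    saved_fh: Optional[str] = None


# An operation reads or updates the state and returns a status code.
Operation = Callable[[CompoundState], int]


def putrootfh(state: CompoundState) -> int:
    state.current_fh = "/"
    return NFS4_OK


def savefh(state: CompoundState) -> int:
    state.saved_fh = state.current_fh
    return NFS4_OK


def lookup(name: str) -> Operation:
    def op(state: CompoundState) -> int:
        parent = state.current_fh
        # Toy simplification: a missing current filehandle is also NOENT.
        if parent is None or name not in TREE.get(parent, set()):
            return NFS4ERR_NOENT
        state.current_fh = parent.rstrip("/") + "/" + name
        return NFS4_OK
    return op


def evaluate_compound(ops: List[Tuple[str, Operation]]) -> List[Tuple[str, int]]:
    """Evaluate operations in order; stop after the first failing one."""
    state = CompoundState()
    results = []
    for name, op in ops:
        status = op(state)
        results.append((name, status))
        if status != NFS4_OK:
            break  # no OR/AND logic: the first failure ends evaluation
    return results
```

In this sketch, a request of PUTROOTFH, LOOKUP "export", LOOKUP "missing", SAVEFH returns three results: the failing LOOKUP is reported and SAVEFH is never evaluated.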
A significant departure from the previous versions of the NFS protocol is the introduction of the COMPOUND procedure. For the NFSv4 protocol, there are two RPC procedures: NULL and COMPOUND. The COMPOUND procedure is defined in terms of operations, and these operations correspond more closely to the traditional NFS procedures. With the use of the COMPOUND procedure, the client is able to build simple or complex requests. These COMPOUND requests allow for a reduction in the number of RPCs needed for logical file system operations. For example, without previous contact with a server, a client will be able to read data from a file in one request by combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. With previous versions of the NFS protocol, this type of single request was not possible.
The model used for COMPOUND is very simple. There is no logical ORing or ANDing of operations. The operations combined within a COMPOUND request are evaluated in order by the server. Once an operation returns a failing result, the evaluation ends and the results of all evaluated operations are returned to the client.
The NFSv4 protocol continues to have the client refer to a file or directory at the server by a "filehandle". The COMPOUND procedure has a method of passing a filehandle from one operation to another within the sequence of operations. There is a concept of a current filehandle and a saved filehandle. Most operations use the current filehandle as the file system object to operate upon. The saved filehandle is used as temporary filehandle storage within a COMPOUND procedure as well as an additional operand for certain operations.
1.4.3. File System Model
The general file system model used for the NFSv4 protocol is the same as previous versions. The server file system is hierarchical, with the regular files contained within being treated as opaque byte streams. In a slight departure, file and directory names are encoded with UTF-8 to deal with the basics of internationalization. The NFSv4 protocol does not require a separate protocol to provide for the initial mapping between pathname and filehandle. Instead of using the older MOUNT protocol for this mapping, the server provides a root filehandle that represents the logical root or top of the file system tree provided by the server. The server provides multiple file systems by gluing them together with pseudo-file systems. These pseudo-file systems provide for potential gaps in the pathnames between real file systems.
1.4.3.1. Filehandle Types
In previous versions of the NFS protocol, the filehandle provided by the server was guaranteed to be valid or persistent for the lifetime of the file system object to which it referred. For some server implementations, this persistence requirement has been difficult to meet. For the NFSv4 protocol, this requirement has been relaxed by introducing another type of filehandle -- volatile. With persistent and volatile filehandle types, the server implementation can match the abilities of the file system at the server along with the operating environment. The client will have knowledge of the type of filehandle being provided by the server and can be prepared to deal with the semantics of each.
1.4.3.2. Attribute Types
The NFSv4 protocol has a rich and extensible file object attribute structure, which is divided into REQUIRED, RECOMMENDED, and named attributes (see Section 5). Several (but not all) of the REQUIRED attributes are derived from the attributes of NFSv3 (see the definition of the fattr3 data type in [RFC1813]). An example of a REQUIRED attribute is the file object's type (Section 5.8.1.2) so that regular files can be distinguished from directories (also known as folders in some operating environments) and other types of objects. REQUIRED attributes are discussed in Section 5.1. An example of the RECOMMENDED attributes is an acl (Section 6.2.1). This attribute defines an Access Control List (ACL) on a file object. An ACL provides file access control beyond the model used in NFSv3. The ACL definition allows for specification of specific sets of permissions for individual users and groups. In addition, ACL inheritance allows propagation of access permissions and restriction down a directory tree as file system objects are created. RECOMMENDED attributes are discussed in Section 5.2. A named attribute is an opaque byte stream that is associated with a directory or file and referred to by a string name. Named attributes are meant to be used by client applications as a method to associate application-specific data with a regular file or directory. NFSv4.1 modifies named attributes relative to NFSv4.0 by tightening the allowed operations in order to prevent the development of non-interoperable implementations. Named attributes are discussed in Section 5.3.
1.4.3.3. Multi-Server Namespace
A single-server namespace is the file system hierarchy that the server presents for remote access. It is a proper subset of all the file systems available locally.
NFSv4 contains a number of features to allow implementation of namespaces that cross server boundaries and that allow and facilitate a non-disruptive transfer of support for individual file systems between servers. They are all based upon attributes that allow one file system to specify alternative or new locations for that file system. That is, just as a client might traverse across local file systems on a single server, it can now traverse to a remote file system on a different server.
These attributes may be used together with the concept of absent file systems, which provide specifications for additional locations but no actual file system content. This allows a number of important facilities:
o Location attributes may be used with absent file systems to implement referrals whereby one server may direct the client to a file system provided by another server. This allows extensive multi-server namespaces to be constructed.
o Location attributes may be provided for present file systems to provide the locations of alternative file system instances or replicas to be used in the event that the current file system instance becomes unavailable.
o Location attributes may be provided when a previously present file system becomes absent. This allows non-disruptive migration of file systems to alternative servers.
1.4.4. OPEN and CLOSE
The NFSv4 protocol introduces OPEN and CLOSE operations. The OPEN operation provides a single point where file lookup, creation, and share semantics (see Section 9.9) can be combined. The CLOSE operation also provides for the release of state accumulated by OPEN.
1.4.5. File Locking
With the NFSv4 protocol, the support for byte-range file locking is part of the NFS protocol. The file locking support is structured so that an RPC callback mechanism is not required. This is a departure from the previous versions of the NFS file locking protocol, Network Lock Manager (NLM) [RFC1813]. The state associated with file locks is maintained at the server under a lease-based model. The server defines a single lease period for all state held by an NFS client.
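The lease-based model can be illustrated with a small sketch. The LeaseTable class, its method names, and the use of plain numbers for time are assumptions made for illustration; the protocol itself only specifies that one lease period applies to all state held by a client and that the server may release that state once the lease has not been renewed in time.

```python
from typing import Dict, List


class LeaseTable:
    """Hypothetical server-side bookkeeping for client leases."""

    def __init__(self, lease_period: float) -> None:
        # A single lease period applies to every client.
        self.lease_period = lease_period
        self._last_renewal: Dict[int, float] = {}

    def renew(self, client_id: int, now: float) -> None:
        """Record a renewal (explicit, or implied by another operation)."""
        self._last_renewal[client_id] = now

    def expired(self, client_id: int, now: float) -> bool:
        last = self._last_renewal.get(client_id)
        return last is None or now - last > self.lease_period

    def reap(self, now: float) -> List[int]:
        """Drop the state of clients whose lease has lapsed.

        The server MAY do this; a real server could choose to keep
        the state longer.
        """
        lapsed = [c for c in self._last_renewal if self.expired(c, now)]
        for client_id in lapsed:
            del self._last_renewal[client_id]
        return lapsed
```

With a 90-second lease, a client that last renewed at t=0 is eligible for release at t=100, while a client that renewed at t=60 is not.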
If the client does not renew its lease within the defined period, all state associated with the client's lease may be released by the server. The client may renew its lease by use of the RENEW operation or implicitly by use of other operations (primarily READ).
1.4.6. Client Caching and Delegation
The file, attribute, and directory caching for the NFSv4 protocol is similar to previous versions. Attributes and directory information are cached for a duration determined by the client. At the end of a predefined timeout, the client will query the server to see if the related file system object has been updated.
For file data, the client checks its cache validity when the file is opened. A query is sent to the server to determine if the file has been changed. Based on this information, the client determines if the data cache for the file should be kept or released. Also, when the file is closed, any modified data is written to the server.
If an application wants to serialize access to file data, file locking of the file data ranges in question should be used.
The major addition to NFSv4 in the area of caching is the ability of the server to delegate certain responsibilities to the client. When the server grants a delegation for a file to a client, the client is guaranteed certain semantics with respect to the sharing of that file with other clients. At OPEN, the server may provide the client either a read (OPEN_DELEGATE_READ) or a write (OPEN_DELEGATE_WRITE) delegation for the file (see Section 10.4). If the client is granted an OPEN_DELEGATE_READ delegation, it is assured that no other client has the ability to write to the file for the duration of the delegation. If the client is granted an OPEN_DELEGATE_WRITE delegation, the client is assured that no other client has read or write access to the file.
Delegations can be recalled by the server. If another client requests access to the file in such a way that the access conflicts with the granted delegation, the server is able to notify the initial client and recall the delegation. This requires that a callback path exist between the server and client. If this callback path does not exist, then delegations cannot be granted.
The essence of a delegation is that it allows the client to locally service operations such as OPEN, CLOSE, LOCK, LOCKU, READ, or WRITE without immediate interaction with the server.
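The delegation guarantees described in this section can be summarized as a small decision function. This is a sketch under stated assumptions: the function names and the integer client identifiers are hypothetical, and a real server considers more state than shown here. The enumeration values follow open_delegation_type4 in [RFC7531].

```python
from enum import Enum


class DelegationType(Enum):
    OPEN_DELEGATE_NONE = 0
    OPEN_DELEGATE_READ = 1
    OPEN_DELEGATE_WRITE = 2


def must_recall(delegation: DelegationType, holder: int,
                requester: int, wants_write: bool) -> bool:
    """Return True if the requester's access conflicts with the delegation."""
    if delegation is DelegationType.OPEN_DELEGATE_NONE or requester == holder:
        return False
    if delegation is DelegationType.OPEN_DELEGATE_WRITE:
        # No other client may have read or write access.
        return True
    # Read delegation: only a write by another client conflicts.
    return wants_write


def grantable(requested: DelegationType,
              callback_path_exists: bool) -> DelegationType:
    """Without a callback path, no delegation can be granted."""
    if not callback_path_exists:
        return DelegationType.OPEN_DELEGATE_NONE
    return requested
```

For example, a second client opening a file for reading does not force the recall of a read delegation held by the first client, but it does force the recall of a write delegation.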
1.5. General Definitions
The following definitions are provided for the purpose of providing an appropriate context for the reader.
Absent File System: A file system is "absent" when a namespace component does not have a backing file system.
Anonymous Stateid: The Anonymous Stateid is a special locking object and is defined in Section 9.1.4.3.
Byte: In this document, a byte is an octet, i.e., a datum exactly 8 bits in length.
Client: The client is the entity that accesses the NFS server's resources. The client may be an application that contains the logic to access the NFS server directly. The client may also be the traditional operating system client that provides remote file system services for a set of applications. With reference to byte-range locking, the client is also the entity that maintains a set of locks on behalf of one or more applications. This client is responsible for crash or failure recovery for those locks it manages. Note that multiple clients may share the same transport and connection, and multiple clients may exist on the same network node.
Client ID: The client ID is a 64-bit quantity used as a unique, shorthand reference to a client-supplied verifier and ID. The server is responsible for supplying the client ID.
File System: The file system is the collection of objects on a server that share the same fsid attribute (see Section 5.8.1.9).
Lease: A lease is an interval of time defined by the server for which the client is irrevocably granted a lock. At the end of a lease period the lock may be revoked if the lease has not been extended. The lock must be revoked if a conflicting lock has been granted after the lease interval. All leases granted by a server have the same fixed duration. Note that the fixed interval duration was chosen to alleviate the expense a server would have in maintaining state about variable-length leases across server failures.
Lock: The term "lock" is used to refer to record (byte-range) locks as well as share reservations unless specifically stated otherwise.
Lock-Owner: Each byte-range lock is associated with a specific lock-owner and an open-owner. The lock-owner consists of a client ID and an opaque owner string. The client presents this to the server to establish the ownership of the byte-range lock as needed.
Open-Owner: Each open file is associated with a specific open-owner, which consists of a client ID and an opaque owner string. The client presents this to the server to establish the ownership of the open as needed.
READ Bypass Stateid: The READ Bypass Stateid is a special locking object and is defined in Section 9.1.4.3.
Server: The "server" is the entity responsible for coordinating client access to a set of file systems.
Stable Storage: NFSv4 servers must be able to recover without data loss from multiple power failures (including cascading power failures, that is, several power failures in quick succession), operating system failures, and hardware failure of components other than the storage medium itself (for example, disk, non-volatile RAM). Some examples of stable storage that are allowable for an NFS server include:
(1) Media commit of data. That is, the modified data has been successfully written to the disk media -- for example, the disk platter.
(2) An immediate reply disk drive with battery-backed on-drive intermediate storage or uninterruptible power system (UPS).
(3) Server commit of data with battery-backed intermediate storage and recovery software.
(4) Cache commit with UPS and recovery software.
Stateid: A stateid is a 128-bit quantity returned by a server that uniquely identifies the open and locking states provided by the server for a specific open-owner or lock-owner/open-owner pair for a specific file and type of lock.
Verifier: A verifier is a 64-bit quantity generated by the client that the server can use to determine if the client has restarted and lost all previous lock state.
1.6. Changes since RFC 3530
The main changes from RFC 3530 [RFC3530] are:
o The XDR definition has been moved to a companion document [RFC7531].
o The IETF intellectual property statements were updated to the latest version.
o There is a restructured and more complete explanation of multi-server namespace features.
o The handling of domain names was updated to reflect Internationalized Domain Names in Applications (IDNA) [RFC5891].
o The previously required LIPKEY and SPKM-3 security mechanisms have been removed.
o Some clarification was provided regarding a client re-establishing callback information to the new server if state has been migrated.
o A third edge case was added for courtesy locks and network partitions.
o The definition of stateid was strengthened.
1.7. Changes between RFC 3010 and RFC 3530
The definition of the NFSv4 protocol in [RFC3530] replaced and obsoleted the definition present in [RFC3010]. While portions of the two documents remained the same, there were substantive changes in others. The changes made between [RFC3010] and [RFC3530] reflect implementation experience and further review of the protocol.
The following list is not inclusive of all changes but presents some of the most notable changes or additions made:
o The state model has added an open_owner4 identifier. This was done to accommodate POSIX-based clients and the model they use for file locking. For POSIX clients, an open_owner4 would correspond to a file descriptor potentially shared amongst a set of processes and the lock_owner4 identifier would correspond to a process that is locking a file.
o Added clarifications and error conditions for the handling of the owner and group attributes. Since these attributes are string based (as opposed to the numeric uid/gid of previous versions of NFS), translations may not be available and hence the changes made.
o Added clarifications for the ACL and mode attributes to address evaluation and partial support.
o For identifiers that are defined as XDR opaque, set limits on their size.
o Added the mounted_on_fileid attribute to allow POSIX clients to correctly construct local mounts.
o Modified the SETCLIENTID/SETCLIENTID_CONFIRM operations to deal correctly with confirmation details along with adding the ability to specify new client callback information. Also added clarification of the callback information itself.
o Added a new operation RELEASE_LOCKOWNER to enable notifying the server that a lock_owner4 will no longer be used by the client.
o Added RENEW operation changes to identify the client correctly and allow for additional error returns.
o Verified error return possibilities for all operations.
o Removed use of the pathname4 data type from LOOKUP and OPEN in favor of having the client construct a sequence of LOOKUP operations to achieve the same effect.
2. Protocol Data Types
The syntax and semantics to describe the data types of the NFSv4 protocol are defined in the XDR [RFC4506] and RPC [RFC5531] documents. The next sections build upon the XDR data types to define types and structures specific to this protocol. As a reminder, the size constants and authoritative definitions can be found in [RFC7531].
2.1. Basic Data Types
Table 1 lists the base NFSv4 data types.
+-----------------+-------------------------------------------------+
| Data Type | Definition |
+-----------------+-------------------------------------------------+
| int32_t | typedef int int32_t; |
| | |
| uint32_t | typedef unsigned int uint32_t; |
| | |
| int64_t | typedef hyper int64_t; |
| | |
| uint64_t | typedef unsigned hyper uint64_t; |
| | |
| attrlist4 | typedef opaque attrlist4<>; |
| | |
| | Used for file/directory attributes. |
| | |
| bitmap4 | typedef uint32_t bitmap4<>; |
| | |
| | Used in attribute array encoding. |
| | |
| changeid4 | typedef uint64_t changeid4; |
| | |
| | Used in the definition of change_info4. |
| | |
| clientid4 | typedef uint64_t clientid4; |
| | |
| | Shorthand reference to client identification. |
| | |
| count4 | typedef uint32_t count4; |
| | |
| | Various count parameters (READ, WRITE, COMMIT). |
| | |
| length4 | typedef uint64_t length4; |
| | |
| | Describes LOCK lengths. |
| | |
| mode4 | typedef uint32_t mode4; |
| | |
| | Mode attribute data type. |
| | |
| nfs_cookie4 | typedef uint64_t nfs_cookie4; |
| | |
| | Opaque cookie value for READDIR. |
| | |
| nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; |
| | |
| | Filehandle definition. |
| | |
| nfs_ftype4 | enum nfs_ftype4; |
| | |
| | Various defined file types. |
| | |
| nfsstat4 | enum nfsstat4; |
| | |
| | Return value for operations. |
| | |
| nfs_lease4 | typedef uint32_t nfs_lease4; |
| | |
| | Duration of a lease in seconds. |
| | |
| offset4 | typedef uint64_t offset4; |
| | |
| | Various offset designations (READ, WRITE, LOCK, |
| | COMMIT). |
| | |
| qop4 | typedef uint32_t qop4; |
| | |
| | Quality of protection designation in SECINFO. |
| | |
| sec_oid4 | typedef opaque sec_oid4<>; |
| | |
| | Security Object Identifier. The sec_oid4 data |
| | type is not really opaque. Instead, it |
| | contains an ASN.1 OBJECT IDENTIFIER as used by |
| | GSS-API in the mech_type argument to |
| | GSS_Init_sec_context. See [RFC2743] for |
| | details. |
| | |
| seqid4 | typedef uint32_t seqid4; |
| | |
| | Sequence identifier used for file locking. |
| | |
| utf8string | typedef opaque utf8string<>; |
| | |
| | UTF-8 encoding for strings. |
| | |
| utf8str_cis | typedef utf8string utf8str_cis; |
| | |
| | Case-insensitive UTF-8 string. |
| | |
| utf8str_cs | typedef utf8string utf8str_cs; |
| | |
| | Case-sensitive UTF-8 string. |
| | |
| utf8str_mixed | typedef utf8string utf8str_mixed; |
| | |
| | UTF-8 strings with a case-sensitive prefix and |
| | a case-insensitive suffix. |
| | |
| component4 | typedef utf8str_cs component4; |
| | |
| | Represents pathname components. |
| | |
| linktext4 | typedef opaque linktext4<>; |
| | |
| | Symbolic link contents ("symbolic link" is |
| | defined in an Open Group [openg_symlink] |
| | standard). |
| | |
| ascii_REQUIRED4 | typedef utf8string ascii_REQUIRED4; |
| | |
| | String is sent as ASCII and thus is |
| | automatically UTF-8. |
| | |
| pathname4 | typedef component4 pathname4<>; |
| | |
| | Represents pathname for fs_locations. |
| | |
| nfs_lockid4 | typedef uint64_t nfs_lockid4; |
| | |
| verifier4 | typedef opaque verifier4[NFS4_VERIFIER_SIZE]; |
| | |
| | Verifier used for various operations (COMMIT, |
| | CREATE, OPEN, READDIR, WRITE) |
| | NFS4_VERIFIER_SIZE is defined as 8. |
+-----------------+-------------------------------------------------+
Table 1: Base NFSv4 Data Types
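To make the typedefs in Table 1 concrete, the following sketch shows how a few of them would be laid out on the wire under the XDR rules of [RFC4506]: integers are big-endian, variable-length opaque data carries a 4-byte length and is zero-padded to a multiple of 4 bytes, and fixed-length opaque data has no length prefix. The helper names are hypothetical and the fragment is an illustration, not a complete XDR encoder.

```python
import struct


def xdr_uint32(value: int) -> bytes:
    """uint32_t (e.g. count4, seqid4): 4 bytes, big-endian."""
    return struct.pack(">I", value)


def xdr_uint64(value: int) -> bytes:
    """uint64_t (e.g. clientid4, offset4): 8 bytes, big-endian."""
    return struct.pack(">Q", value)


def xdr_opaque_var(data: bytes) -> bytes:
    """opaque<> (e.g. nfs_fh4, utf8string): length, data, zero padding."""
    pad = (4 - len(data) % 4) % 4
    return xdr_uint32(len(data)) + data + b"\x00" * pad


def xdr_opaque_fixed(data: bytes, size: int) -> bytes:
    """opaque[size] (e.g. verifier4 with size 8): data and padding only."""
    if len(data) != size:
        raise ValueError("fixed-length opaque must be exactly %d bytes" % size)
    pad = (4 - size % 4) % 4
    return data + b"\x00" * pad


def xdr_utf8string(text: str) -> bytes:
    """utf8string: the UTF-8 bytes carried as variable-length opaque."""
    return xdr_opaque_var(text.encode("utf-8"))
```

For instance, the component name "abc" occupies eight bytes: a length of 3, the three UTF-8 bytes, and one byte of padding.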
2.2. Structured Data Types
2.2.1. nfstime4
   struct nfstime4 {
           int64_t         seconds;
           uint32_t        nseconds;
   };
The nfstime4 structure gives the number of seconds and nanoseconds since midnight or 0 hour January 1, 1970 Coordinated Universal Time (UTC). Values greater than zero for the seconds field denote dates after the 0 hour January 1, 1970. Values less than zero for the seconds field denote dates before the 0 hour January 1, 1970. In both cases, the nseconds field is to be added to the seconds field for the final time representation. For example, if the time to be represented is one-half second before 0 hour January 1, 1970, the seconds field would have a value of negative one (-1) and the nseconds field would have a value of one-half second (500000000). Values greater than 999,999,999 for nseconds are considered invalid.
This data type is used to pass time and date information. A server converts to and from its local representation of time when processing time values, preserving as much accuracy as possible. If the precision of timestamps stored for a file system object is less than defined, loss of precision can occur. An adjunct time maintenance protocol is recommended to reduce client and server time skew.
2.2.2. time_how4
   enum time_how4 {
           SET_TO_SERVER_TIME4 = 0,
           SET_TO_CLIENT_TIME4 = 1
   };
2.2.3. settime4
   union settime4 switch (time_how4 set_it) {
    case SET_TO_CLIENT_TIME4:
           nfstime4       time;
    default:
           void;
   };
The above definitions are used as the attribute definitions to set time values. If set_it is SET_TO_SERVER_TIME4, then the server uses its local representation of time for the time value.
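The nfstime4 convention from Section 2.2.1 (nseconds is always added to seconds, so times before 1970 use a negative seconds value with a non-negative nseconds) and the settime4 union can be sketched as follows. The function names are hypothetical; the byte layout assumes the standard XDR encoding of [RFC4506], with the enum discriminant as a 4-byte integer followed by the arm's fields.

```python
import struct
from typing import Optional, Tuple

SET_TO_SERVER_TIME4 = 0
SET_TO_CLIENT_TIME4 = 1

NSEC_PER_SEC = 1_000_000_000


def nfstime4_from_ns(total_ns: int) -> Tuple[int, int]:
    """Split nanoseconds since the epoch into (seconds, nseconds).

    Floor division keeps nseconds in 0..999999999 even for negative
    times, matching the rule that nseconds is added to seconds.
    """
    return divmod(total_ns, NSEC_PER_SEC)


def nfstime4_valid(seconds: int, nseconds: int) -> bool:
    """nseconds values above 999,999,999 are invalid."""
    return 0 <= nseconds <= NSEC_PER_SEC - 1


def encode_settime4(how: int, time: Optional[Tuple[int, int]] = None) -> bytes:
    """Encode the discriminant and, for client time, the nfstime4 arm."""
    if how == SET_TO_CLIENT_TIME4:
        if time is None or not nfstime4_valid(*time):
            raise ValueError("SET_TO_CLIENT_TIME4 needs a valid nfstime4")
        seconds, nseconds = time
        return struct.pack(">iqI", how, seconds, nseconds)
    # Default arm is void: only the discriminant is sent.
    return struct.pack(">i", how)
```

One-half second before the epoch is -500000000 ns, which this sketch splits into seconds = -1 and nseconds = 500000000, as in the example in Section 2.2.1.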
2.2.4. specdata4
   struct specdata4 {
           uint32_t specdata1; /* major device number */
           uint32_t specdata2; /* minor device number */
   };
This data type represents additional information for the device file types NF4CHR and NF4BLK.
2.2.5. fsid4
   struct fsid4 {
           uint64_t        major;
           uint64_t        minor;
   };
This type is the file system identifier that is used as a REQUIRED attribute.
2.2.6. fs_location4
   struct fs_location4 {
           utf8str_cis     server<>;
           pathname4       rootpath;
   };
2.2.7. fs_locations4
   struct fs_locations4 {
           pathname4       fs_root;
           fs_location4    locations<>;
   };
The fs_location4 and fs_locations4 data types are used for the fs_locations RECOMMENDED attribute, which is used for migration and replication support.
2.2.8. fattr4
   struct fattr4 {
           bitmap4         attrmask;
           attrlist4       attr_vals;
   };
The fattr4 structure is used to represent file and directory attributes.
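The attrmask field is a bitmap4, a counted array of 32-bit words in which bit n is found in word (n / 32) at bit position (n mod 32). A minimal sketch of building and querying such a bitmap is shown here; the helper names are hypothetical, and the bit numbers in the usage note are arbitrary rather than specific attribute numbers.

```python
from typing import Iterable, List


def bitmap4_from_bits(bits: Iterable[int]) -> List[int]:
    """Return the array of 32-bit words with the given bit numbers set."""
    words: List[int] = []
    for n in bits:
        index, position = divmod(n, 32)
        # Grow the array so that the word holding bit n exists.
        while len(words) <= index:
            words.append(0)
        words[index] |= 1 << position
    return words


def bitmap4_is_set(words: List[int], n: int) -> bool:
    """Test bit n; bits beyond the end of the array are not set."""
    index, position = divmod(n, 32)
    return index < len(words) and bool((words[index] >> position) & 1)
```

Setting bits 1, 4, and 33 yields a two-word array: bits 1 and 4 land in word 0, and bit 33 lands in word 1 at position 1.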
The bitmap is a counted array of 32-bit integers used to contain bit values. The position of the integer in the array that contains bit n can be computed from the expression (n / 32), and its bit within that integer is (n mod 32).
                     0            1
   +-----------+-----------+-----------+--
   |  count    | 31  ..  0 | 63  .. 32 |
   +-----------+-----------+-----------+--
2.2.9. change_info4
   struct change_info4 {
           bool            atomic;
           changeid4       before;
           changeid4       after;
   };
This structure is used with the CREATE, LINK, REMOVE, and RENAME operations to let the client know the value of the change attribute for the directory in which the target file system object resides.
2.2.10. clientaddr4
   struct clientaddr4 {
           /* see struct rpcb in RFC 1833 */
           string r_netid<>;    /* network id */
           string r_addr<>;     /* universal address */
   };
The clientaddr4 structure is used as part of the SETCLIENTID operation, either (1) to specify the address of the client that is using a client ID or (2) as part of the callback registration. The r_netid and r_addr fields respectively contain a network id and universal address. The network id and universal address concepts, together with formats for TCP over IPv4 and TCP over IPv6, are defined in [RFC5665], specifically Tables 2 and 3 and Sections 5.2.3.3 and 5.2.3.4.
2.2.11. cb_client4
   struct cb_client4 {
           unsigned int    cb_program;
           clientaddr4     cb_location;
   };
This structure is used by the client to inform the server of its callback address; it includes the program number and client address.
2.2.12. nfs_client_id4
   struct nfs_client_id4 {
           verifier4       verifier;
           opaque          id<NFS4_OPAQUE_LIMIT>;
   };
This structure is part of the arguments to the SETCLIENTID operation.
2.2.13. open_owner4
   struct open_owner4 {
           clientid4       clientid;
           opaque          owner<NFS4_OPAQUE_LIMIT>;
   };
This structure is used to identify the owner of open state.
2.2.14. lock_owner4
   struct lock_owner4 {
           clientid4       clientid;
           opaque          owner<NFS4_OPAQUE_LIMIT>;
   };
This structure is used to identify the owner of file locking state.
2.2.15. open_to_lock_owner4
   struct open_to_lock_owner4 {
           seqid4          open_seqid;
           stateid4        open_stateid;
           seqid4          lock_seqid;
           lock_owner4     lock_owner;
   };
This structure is used for the first LOCK operation done for an open_owner4. It provides both the open_stateid and lock_owner such that the transition is made from a valid open_stateid sequence to that of the new lock_stateid sequence. Using this mechanism avoids the confirmation of the lock_owner/lock_seqid pair since it is tied to established state in the form of the open_stateid/open_seqid.
2.2.16. stateid4
   struct stateid4 {
           uint32_t        seqid;
           opaque          other[NFS4_OTHER_SIZE];
   };
This structure is used for the various state-sharing mechanisms between the client and server. For the client, this data structure is read-only. The server is required to increment the seqid field monotonically at each transition of the stateid. This is important since the client will inspect the seqid in OPEN stateids to determine the order of OPEN processing done by the server.
3. RPC and Security Flavor
The NFSv4 protocol is an RPC application that uses RPC version 2 and the XDR as defined in [RFC5531] and [RFC4506]. The RPCSEC_GSS security flavors as defined in version 1 ([RFC2203]) and version 2 ([RFC5403]) MUST be implemented as the mechanism to deliver stronger security for the NFSv4 protocol. However, deployment of RPCSEC_GSS is optional.
3.1. Ports and Transports
Historically, NFSv2 and NFSv3 servers have resided on port 2049. The registered port 2049 [RFC3232] for the NFS protocol SHOULD be the default configuration. Using the registered port for NFS services means the NFS client will not need to use the RPC binding protocols as described in [RFC1833]; this will allow NFS to transit firewalls.
Where an NFSv4 implementation supports operation over the IP network protocol, the supported transport layer between NFS and IP MUST be an IETF standardized transport protocol that is specified to avoid network congestion; such transports include TCP and the Stream Control Transmission Protocol (SCTP). To enhance the possibilities for interoperability, an NFSv4 implementation MUST support operation over the TCP transport protocol.
If TCP is used as the transport, the client and server SHOULD use persistent connections. This will prevent the weakening of TCP's congestion control via short-lived connections and will improve performance for the Wide Area Network (WAN) environment by eliminating the need for SYN handshakes.
As noted in Section 19, the authentication model for NFSv4 has moved from machine-based to principal-based. However, this modification of the authentication model does not imply a technical requirement to move the TCP connection management model from whole machine-based to one based on a per-user model. In particular, NFS over TCP client implementations have traditionally multiplexed traffic for multiple users over a common TCP connection between an NFS client and server. This has been true, regardless of whether the NFS client is using AUTH_SYS, AUTH_DH, RPCSEC_GSS, or any other flavor. Similarly, NFS over TCP server implementations have assumed such a model and thus scale the implementation of TCP connection management in proportion to the number of expected client machines. It is intended that NFSv4 will not modify this connection management model. NFSv4 clients that violate this assumption can expect scaling issues on the server and hence reduced service.
3.1.1. Client Retransmission Behavior
When processing an NFSv4 request received over a reliable transport such as TCP, the NFSv4 server MUST NOT silently drop the request, except if the established transport connection has been broken. Given such a contract between NFSv4 clients and servers, clients MUST NOT retry a request unless one or both of the following are true:
o The transport connection has been broken
o The procedure being retried is the NULL procedure
Since reliable transports, such as TCP, do not always synchronously inform a peer when the other peer has broken the connection (for example, when an NFS server reboots), the NFSv4 client may want to actively "probe" the connection to see if it has been broken. Use of the NULL procedure is one recommended way to do so. So, when a client experiences a remote procedure call timeout (of some arbitrary implementation-specific amount), rather than retrying the remote procedure call, it could instead issue a NULL procedure call to the server. If the server has died, the transport connection break will eventually be indicated to the NFSv4 client. The client can then reconnect, and then retry the original request. If the NULL procedure call gets a response, the connection has not broken. The client can decide to wait longer for the original request's response, or it can break the transport connection and reconnect before re-sending the original request.
For callbacks from the server to the client, the same rules apply, but the server doing the callback becomes the client, and the client receiving the callback becomes the server.
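The retry rule and the NULL-procedure probe described in this section can be illustrated with a small decision function. This is only a sketch under stated assumptions: the names Action, may_retry, and after_timeout, and the give_up_on_slow_server policy flag, are hypothetical and are not defined by this protocol. It models the client's choices after an RPC timeout and performs no network I/O.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    """Hypothetical labels for what the client does next."""
    SEND_NULL_PROBE = "send NULL probe"
    WAIT = "keep waiting"
    RECONNECT_AND_RETRY = "reconnect, then retry"


def may_retry(connection_broken: bool, is_null_procedure: bool) -> bool:
    """A request may be retried only if the transport connection has
    been broken or the procedure being retried is NULL."""
    return connection_broken or is_null_procedure


def after_timeout(connection_broken: bool,
                  probe_answered: Optional[bool],
                  give_up_on_slow_server: bool = False) -> Action:
    """Pick the next step after an RPC timeout.

    probe_answered is None if no NULL probe has been sent yet, False if
    a probe is outstanding, and True if the server replied to it.
    give_up_on_slow_server is an assumed local policy knob.
    """
    if connection_broken:
        # The break has been indicated: reconnect and retry the request.
        return Action.RECONNECT_AND_RETRY
    if probe_answered is None:
        # Do not retry the original call; probe the connection instead.
        return Action.SEND_NULL_PROBE
    if probe_answered and give_up_on_slow_server:
        # The connection is alive, but the client chooses to break it
        # itself, after which re-sending the request is permitted.
        return Action.RECONNECT_AND_RETRY
    # Either the probe is still outstanding or the client waits longer.
    return Action.WAIT
```

For example, after_timeout(False, None) yields Action.SEND_NULL_PROBE, while may_retry(False, False) is False because neither permitted condition holds.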
3.2. Security Flavors
Traditional RPC implementations have included AUTH_NONE, AUTH_SYS, AUTH_DH, and AUTH_KRB4 as security flavors. With [RFC2203], an additional security flavor of RPCSEC_GSS has been introduced, which uses the functionality of GSS-API [RFC2743]. This allows for the use of various security mechanisms by the RPC layer without the additional implementation overhead of adding RPC security flavors. For NFSv4, the RPCSEC_GSS security flavor MUST be used to enable the mandatory-to-implement security mechanism. Other flavors, such as AUTH_NONE, AUTH_SYS, and AUTH_DH, MAY be implemented as well.
3.2.1. Security Mechanisms for NFSv4
RPCSEC_GSS, via GSS-API, supports multiple mechanisms that provide security services. For interoperability, NFSv4 clients and servers MUST support the Kerberos V5 security mechanism.
The use of RPCSEC_GSS requires selection of mechanism, quality of protection (QOP), and service (authentication, integrity, privacy). For the mandated security mechanisms, NFSv4 specifies that a QOP of zero is used, leaving it up to the mechanism or the mechanism's configuration to map QOP zero to an appropriate level of protection. Each mandated mechanism specifies a minimum set of cryptographic algorithms for implementing integrity and privacy. NFSv4 clients and servers MUST be implemented on operating environments that comply with the required cryptographic algorithms of each required mechanism.
3.2.1.1. Kerberos V5 as a Security Triple
The Kerberos V5 GSS-API mechanism as described in [RFC4121] MUST be implemented with the RPCSEC_GSS services as specified in Table 2. Both client and server MUST support each of the pseudo-flavors.
+--------+-------+----------------------+-----------------------+
| Number | Name  | Mechanism's OID      | RPCSEC_GSS service    |
+--------+-------+----------------------+-----------------------+
| 390003 | krb5  | 1.2.840.113554.1.2.2 | rpc_gss_svc_none      |
| 390004 | krb5i | 1.2.840.113554.1.2.2 | rpc_gss_svc_integrity |
| 390005 | krb5p | 1.2.840.113554.1.2.2 | rpc_gss_svc_privacy   |
+--------+-------+----------------------+-----------------------+
Table 2: Mapping Pseudo-Flavor to Service
Note that the pseudo-flavor is presented here as a mapping aid to the implementer. Because this NFS protocol includes a method to negotiate security and it understands the GSS-API mechanism, the pseudo-flavor is not needed. The pseudo-flavor is needed for NFSv3 since the security negotiation is done via the MOUNT protocol as described in [RFC2623].
At the time this document was specified, the Advanced Encryption Standard (AES) with HMAC-SHA1 was a required algorithm set for Kerberos V5. In contrast, when NFSv4.0 was first specified in [RFC3530], weaker algorithm sets were REQUIRED for Kerberos V5, and were REQUIRED in the NFSv4.0 specification, because the Kerberos V5 specification at the time did not specify stronger algorithms. The NFSv4 specification does not specify required algorithms for Kerberos V5, and instead, the implementer is expected to track the evolution of the Kerberos V5 standard if and when stronger algorithms are specified.
3.2.1.1.1. Security Considerations for Cryptographic Algorithms in Kerberos V5
When deploying NFSv4, the strength of the security achieved depends on the existing Kerberos V5 infrastructure. The algorithms of Kerberos V5 are not directly exposed to or selectable by the client or server, so there is some due diligence required by the user of NFSv4 to ensure that security is acceptable where needed. Guidance is provided in [RFC6649] as to why weak algorithms should be disabled by default.
3.3. Security Negotiation
With the NFSv4 server potentially offering multiple security mechanisms, the client needs a method to determine or negotiate which mechanism is to be used for its communication with the server. The NFS server can have multiple points within its file system namespace that are available for use by NFS clients. In turn, the NFS server can be configured such that each of these entry points can have different or multiple security mechanisms in use. The security negotiation between client and server SHOULD be done with a secure channel to eliminate the possibility of a third party intercepting the negotiation sequence and forcing the client and server to choose a lower level of security than required or desired. See Section 19 for further discussion.
3.3.1. SECINFO
The SECINFO operation will allow the client to determine, on a per-filehandle basis, what security triple (see [RFC2743] and Section 16.31.4) is to be used for server access. In general, the client will not have to use the SECINFO operation, except during initial communication with the server or when the client encounters a new security policy as the client navigates the namespace. Either condition will force the client to negotiate a new security triple.
3.3.2. Security Error
Based on the assumption that each NFSv4 client and server MUST support a minimum set of security (i.e., Kerberos V5 under RPCSEC_GSS), the NFS client will start its communication with the server with one of the minimal security triples. During communication with the server, the client can receive an NFS error of NFS4ERR_WRONGSEC. This error allows the server to notify the client that the security triple currently being used is not appropriate for access to the server's file system resources. The client is then responsible for determining what security triples are available at the server and choosing one that is appropriate for the client. See Section 16.31 for further discussion of how the client will respond to the NFS4ERR_WRONGSEC error and use SECINFO.
3.3.3. Callback RPC Authentication
Except as noted elsewhere in this section, the callback RPC (described later) MUST mutually authenticate the NFS server to the principal that acquired the client ID (also described later), using the security flavor of the original SETCLIENTID operation used.
For AUTH_NONE, there are no principals, so this is a non-issue.
AUTH_SYS has no notions of mutual authentication or a server principal, so the callback from the server simply uses the AUTH_SYS credential that the user used when he set up the delegation.
For AUTH_DH, one commonly used convention is that the server uses the credential corresponding to this AUTH_DH principal:
   unix.host@domain
where host and domain are variables corresponding to the name of the server host and directory services domain in which it lives, such as a Network Information System domain or a DNS domain.
Regardless of what security mechanism under RPCSEC_GSS is being used, the NFS server MUST identify itself in GSS-API via a GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE names are of the form:
   service@hostname
For NFS, the "service" element is:
   nfs
Implementations of security mechanisms will convert nfs@hostname to various different forms. For Kerberos V5, the following form is RECOMMENDED:
   nfs/hostname
For Kerberos V5, nfs/hostname would be a server principal in the Kerberos Key Distribution Center database. This is the same principal the client acquired a GSS-API context for when it issued the SETCLIENTID operation; therefore, the realm name for the server principal must be the same for the callback as it was for the SETCLIENTID.
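The name forms above can be illustrated with a short string-level sketch. The helper names hostbased_service_name and krb5_principal_from_hostbased, and the host server.example.com, are hypothetical. A real GSS-API or Kerberos implementation performs this conversion internally and may also canonicalize the hostname and append a realm, none of which is modeled here.

```python
def hostbased_service_name(hostname: str, service: str = "nfs") -> str:
    """Build a name of the form service@hostname (string form only)."""
    if not hostname or "@" in hostname:
        raise ValueError("hostname must be non-empty and contain no '@'")
    return f"{service}@{hostname}"


def krb5_principal_from_hostbased(name: str) -> str:
    """Convert service@hostname to the RECOMMENDED service/hostname form.

    Assumption: the realm is omitted; it must be the same for the
    callback as it was for SETCLIENTID, but it is handled elsewhere.
    """
    service, sep, hostname = name.partition("@")
    if not sep or not service or not hostname:
        raise ValueError(f"not a service@hostname name: {name!r}")
    return f"{service}/{hostname}"


if __name__ == "__main__":
    gss_name = hostbased_service_name("server.example.com")
    print(gss_name)                                 # nfs@server.example.com
    print(krb5_principal_from_hostbased(gss_name))  # nfs/server.example.com
```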