18.19. Operation 22: PUTFH - Set Current Filehandle
18.19.1. ARGUMENTS
struct PUTFH4args { nfs_fh4 object; };18.19.2. RESULTS
struct PUTFH4res { /* * If status is NFS4_OK, * new CURRENT_FH: argument to PUTFH */ nfsstat4 status; };18.19.3. DESCRIPTION
This operation replaces the current filehandle with the filehandle provided as an argument. It clears the current stateid. If the security mechanism used by the requester does not meet the requirements of the filehandle provided to this operation, the server MUST return NFS4ERR_WRONGSEC. See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.2 for more details on the current stateid.18.19.4. IMPLEMENTATION
This operation is used in an NFS request to set the context for file accessing operations that follow in the same COMPOUND request.18.20. Operation 23: PUTPUBFH - Set Public Filehandle
18.20.1. ARGUMENT
void;
18.20.2. RESULT
struct PUTPUBFH4res { /* * If status is NFS4_OK, * new CURRENT_FH: public fh */ nfsstat4 status; };18.20.3. DESCRIPTION
This operation replaces the current filehandle with the filehandle that represents the public filehandle of the server's namespace. This filehandle may be different from the "root" filehandle that may be associated with some other directory on the server. PUTPUBFH also clears the current stateid. The public filehandle represents the concepts embodied in RFC 2054 [42], RFC 2055 [43], and RFC 2224 [53]. The intent for NFSv4.1 is that the public filehandle (represented by the PUTPUBFH operation) be used as a method of providing WebNFS server compatibility with NFSv3. The public filehandle and the root filehandle (represented by the PUTROOTFH operation) SHOULD be equivalent. If the public and root filehandles are not equivalent, then the directory corresponding to the public filehandle MUST be a descendant of the directory corresponding to the root filehandle. See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.2 for more details on the current stateid.18.20.4. IMPLEMENTATION
This operation is used in an NFS request to set the context for file accessing operations that follow in the same COMPOUND request. With the NFSv3 public filehandle, the client is able to specify whether the pathname provided in the LOOKUP should be evaluated as either an absolute path relative to the server's root or relative to the public filehandle. RFC 2224 [53] contains further discussion of the functionality. With NFSv4.1, that type of specification is not directly available in the LOOKUP operation. The reason for this is because the component separators needed to specify absolute vs. relative are not allowed in NFSv4. Therefore, the client is
responsible for constructing its request such that the use of either PUTROOTFH or PUTPUBFH signifies absolute or relative evaluation of an NFS URL, respectively. Note that there are warnings mentioned in RFC 2224 [53] with respect to the use of absolute evaluation and the restrictions the server may place on that evaluation with respect to how much of its namespace has been made available. These same warnings apply to NFSv4.1. It is likely, therefore, that because of server implementation details, an NFSv3 absolute public filehandle look up may behave differently than an NFSv4.1 absolute resolution. There is a form of security negotiation as described in RFC 2755 [54] that uses the public filehandle and an overloading of the pathname. This method is not available with NFSv4.1 as filehandles are not overloaded with special meaning and therefore do not provide the same framework as NFSv3. Clients should therefore use the security negotiation mechanisms described in Section 2.6.18.21. Operation 24: PUTROOTFH - Set Root Filehandle
18.21.1. ARGUMENTS
void;18.21.2. RESULTS
struct PUTROOTFH4res { /* * If status is NFS4_OK, * new CURRENT_FH: root fh */ nfsstat4 status; };18.21.3. DESCRIPTION
This operation replaces the current filehandle with the filehandle that represents the root of the server's namespace. From this filehandle, a LOOKUP operation can locate any other filehandle on the server. This filehandle may be different from the "public" filehandle that may be associated with some other directory on the server. PUTROOTFH also clears the current stateid. See Section 16.2.3.1.1 for more details on the current filehandle.
See Section 16.2.3.1.2 for more details on the current stateid.18.21.4. IMPLEMENTATION
This operation is used in an NFS request to set the context for file accessing operations that follow in the same COMPOUND request.18.22. Operation 25: READ - Read from File
18.22.1. ARGUMENTS
struct READ4args { /* CURRENT_FH: file */ stateid4 stateid; offset4 offset; count4 count; };18.22.2. RESULTS
struct READ4resok { bool eof; opaque data<>; }; union READ4res switch (nfsstat4 status) { case NFS4_OK: READ4resok resok4; default: void; };18.22.3. DESCRIPTION
The READ operation reads data from the regular file identified by the current filehandle. The client provides an offset of where the READ is to start and a count of how many bytes are to be read. An offset of zero means to read data starting at the beginning of the file. If offset is greater than or equal to the size of the file, the status NFS4_OK is returned with a data length set to zero and eof is set to TRUE. The READ is subject to access permissions checking.
If the client specifies a count value of zero, the READ succeeds and returns zero bytes of data again subject to access permissions checking. The server may choose to return fewer bytes than specified by the client. The client needs to check for this condition and handle the condition appropriately. Except when special stateids are used, the stateid value for a READ request represents a value returned from a previous byte-range lock or share reservation request or the stateid associated with a delegation. The stateid identifies the associated owners if any and is used by the server to verify that the associated locks are still valid (e.g., have not been revoked). If the read ended at the end-of-file (formally, in a correctly formed READ operation, if offset + count is equal to the size of the file), or the READ operation extends beyond the size of the file (if offset + count is greater than the size of the file), eof is returned as TRUE; otherwise, it is FALSE. A successful READ of an empty file will always return eof as TRUE. If the current filehandle is not an ordinary file, an error will be returned to the client. In the case that the current filehandle represents an object of type NF4DIR, NFS4ERR_ISDIR is returned. If the current filehandle designates a symbolic link, NFS4ERR_SYMLINK is returned. In all other cases, NFS4ERR_WRONG_TYPE is returned. For a READ with a stateid value of all bits equal to zero, the server MAY allow the READ to be serviced subject to mandatory byte-range locks or the current share deny modes for the file. For a READ with a stateid value of all bits equal to one, the server MAY allow READ operations to bypass locking checks at the server. On success, the current filehandle retains its value.18.22.4. IMPLEMENTATION
If the server returns a "short read" (i.e., fewer data than requested and eof is set to FALSE), the client should send another READ to get the remaining data. A server may return less data than requested under several circumstances. The file may have been truncated by another client or perhaps on the server itself, changing the file size from what the requesting client believes to be the case. This would reduce the actual amount of data available to the client. It is possible that the server reduce the transfer size and so return a short read result. Server resource exhaustion may also occur in a short read.
If mandatory byte-range locking is in effect for the file, and if the byte-range corresponding to the data to be read from the file is WRITE_LT locked by an owner not associated with the stateid, the server will return the NFS4ERR_LOCKED error. The client should try to get the appropriate READ_LT via the LOCK operation before re- attempting the READ. When the READ completes, the client should release the byte-range lock via LOCKU. If another client has an OPEN_DELEGATE_WRITE delegation for the file being read, the delegation must be recalled, and the operation cannot proceed until that delegation is returned or revoked. Except where this happens very quickly, one or more NFS4ERR_DELAY errors will be returned to requests made while the delegation remains outstanding. Normally, delegations will not be recalled as a result of a READ operation since the recall will occur as a result of an earlier OPEN. However, since it is possible for a READ to be done with a special stateid, the server needs to check for this case even though the client should have done an OPEN previously.18.23. Operation 26: READDIR - Read Directory
18.23.1. ARGUMENTS
struct READDIR4args { /* CURRENT_FH: directory */ nfs_cookie4 cookie; verifier4 cookieverf; count4 dircount; count4 maxcount; bitmap4 attr_request; };18.23.2. RESULTS
struct entry4 { nfs_cookie4 cookie; component4 name; fattr4 attrs; entry4 *nextentry; }; struct dirlist4 { entry4 *entries; bool eof; };
struct READDIR4resok { verifier4 cookieverf; dirlist4 reply; }; union READDIR4res switch (nfsstat4 status) { case NFS4_OK: READDIR4resok resok4; default: void; };18.23.3. DESCRIPTION
The READDIR operation retrieves a variable number of entries from a file system directory and returns client-requested attributes for each entry along with information to allow the client to request additional directory entries in a subsequent READDIR. The arguments contain a cookie value that represents where the READDIR should start within the directory. A value of zero for the cookie is used to start reading at the beginning of the directory. For subsequent READDIR requests, the client specifies a cookie value that is provided by the server on a previous READDIR request. The request's cookieverf field should be set to 0 zero) when the request's cookie field is zero (first read of the directory). On subsequent requests, the cookieverf field must match the cookieverf returned by the READDIR in which the cookie was acquired. If the server determines that the cookieverf is no longer valid for the directory, the error NFS4ERR_NOT_SAME must be returned. The dircount field of the request is a hint of the maximum number of bytes of directory information that should be returned. This value represents the total length of the names of the directory entries and the cookie value for these entries. This length represents the XDR encoding of the data (names and cookies) and not the length in the native format of the server. The maxcount field of the request represents the maximum total size of all of the data being returned within the READDIR4resok structure and includes the XDR overhead. The server MAY return less data. If the server is unable to return a single directory entry within the maxcount limit, the error NFS4ERR_TOOSMALL MUST be returned to the client.
Finally, the request's attr_request field represents the list of attributes to be returned for each directory entry supplied by the server. A successful reply consists of a list of directory entries. Each of these entries contains the name of the directory entry, a cookie value for that entry, and the associated attributes as requested. The "eof" flag has a value of TRUE if there are no more entries in the directory. The cookie value is only meaningful to the server and is used as a cursor for the directory entry. As mentioned, this cookie is used by the client for subsequent READDIR operations so that it may continue reading a directory. The cookie is similar in concept to a READ offset but MUST NOT be interpreted as such by the client. Ideally, the cookie value SHOULD NOT change if the directory is modified since the client may be caching these values. In some cases, the server may encounter an error while obtaining the attributes for a directory entry. Instead of returning an error for the entire READDIR operation, the server can instead return the attribute rdattr_error (Section 5.8.1.12). With this, the server is able to communicate the failure to the client and not fail the entire operation in the instance of what might be a transient failure. Obviously, the client must request the fattr4_rdattr_error attribute for this method to work properly. If the client does not request the attribute, the server has no choice but to return failure for the entire READDIR operation. For some file system environments, the directory entries "." and ".." have special meaning, and in other environments, they do not. If the server supports these special entries within a directory, they SHOULD NOT be returned to the client as part of the READDIR response. To enable some client environments, the cookie values of zero, 1, and 2 are to be considered reserved. Note that the UNIX client will use these values when combining the server's response and local representations to enable a fully formed UNIX directory presentation to the application. For READDIR arguments, cookie values of one and two SHOULD NOT be used, and for READDIR results, cookie values of zero, one, and two SHOULD NOT be returned. On success, the current filehandle retains its value.
18.23.4. IMPLEMENTATION
The server's file system directory representations can differ greatly. A client's programming interfaces may also be bound to the local operating environment in a way that does not translate well into the NFS protocol. Therefore, the use of the dircount and maxcount fields are provided to enable the client to provide hints to the server. If the client is aggressive about attribute collection during a READDIR, the server has an idea of how to limit the encoded response. If dircount is zero, the server bounds the reply's size based on the request's maxcount field. The cookieverf may be used by the server to help manage cookie values that may become stale. It should be a rare occurrence that a server is unable to continue properly reading a directory with the provided cookie/cookieverf pair. The server SHOULD make every effort to avoid this condition since the application at the client might be unable to properly handle this type of failure. The use of the cookieverf will also protect the client from using READDIR cookie values that might be stale. For example, if the file system has been migrated, the server might or might not be able to use the same cookie values to service READDIR as the previous server used. With the client providing the cookieverf, the server is able to provide the appropriate response to the client. This prevents the case where the server accepts a cookie value but the underlying directory has changed and the response is invalid from the client's context of its previous READDIR. Since some servers will not be returning "." and ".." entries as has been done with previous versions of the NFS protocol, the client that requires these entries be present in READDIR responses must fabricate them.18.24. Operation 27: READLINK - Read Symbolic Link
18.24.1. ARGUMENTS
/* CURRENT_FH: symlink */ void;18.24.2. RESULTS
struct READLINK4resok { linktext4 link; };
union READLINK4res switch (nfsstat4 status) { case NFS4_OK: READLINK4resok resok4; default: void; };18.24.3. DESCRIPTION
READLINK reads the data associated with a symbolic link. Depending on the value of the UTF-8 capability attribute (Section 14.4), the data is encoded in UTF-8. Whether created by an NFS client or created locally on the server, the data in a symbolic link is not interpreted (except possibly to check for proper UTF-8 encoding) when created, but is simply stored. On success, the current filehandle retains its value.18.24.4. IMPLEMENTATION
A symbolic link is nominally a pointer to another file. The data is not necessarily interpreted by the server, just stored in the file. It is possible for a client implementation to store a pathname that is not meaningful to the server operating system in a symbolic link. A READLINK operation returns the data to the client for interpretation. If different implementations want to share access to symbolic links, then they must agree on the interpretation of the data in the symbolic link. The READLINK operation is only allowed on objects of type NF4LNK. The server should return the error NFS4ERR_WRONG_TYPE if the object is not of type NF4LNK.18.25. Operation 28: REMOVE - Remove File System Object
18.25.1. ARGUMENTS
struct REMOVE4args { /* CURRENT_FH: directory */ component4 target; };18.25.2. RESULTS
struct REMOVE4resok { change_info4 cinfo; };
union REMOVE4res switch (nfsstat4 status) { case NFS4_OK: REMOVE4resok resok4; default: void; };18.25.3. DESCRIPTION
The REMOVE operation removes (deletes) a directory entry named by filename from the directory corresponding to the current filehandle. If the entry in the directory was the last reference to the corresponding file system object, the object may be destroyed. The directory may be either of type NF4DIR or NF4ATTRDIR. For the directory where the filename was removed, the server returns change_info4 information in cinfo. With the atomic field of the change_info4 data type, the server will indicate if the before and after change attributes were obtained atomically with respect to the removal. If the target has a length of zero, or if the target does not obey the UTF-8 definition (and the server is enforcing UTF-8 encoding; see Section 14.4), the error NFS4ERR_INVAL will be returned. On success, the current filehandle retains its value.18.25.4. IMPLEMENTATION
NFSv3 required a different operator RMDIR for directory removal and REMOVE for non-directory removal. This allowed clients to skip checking the file type when being passed a non-directory delete system call (e.g., unlink() [27] in POSIX) to remove a directory, as well as the converse (e.g., a rmdir() on a non-directory) because they knew the server would check the file type. NFSv4.1 REMOVE can be used to delete any directory entry independent of its file type. The implementor of an NFSv4.1 client's entry points from the unlink() and rmdir() system calls should first check the file type against the types the system call is allowed to remove before sending a REMOVE operation. Alternatively, the implementor can produce a COMPOUND call that includes a LOOKUP/VERIFY sequence of operations to verify the file type before a REMOVE operation in the same COMPOUND call. The concept of last reference is server specific. However, if the numlinks field in the previous attributes of the object had the value 1, the client should not rely on referring to the object via a filehandle. Likewise, the client should not rely on the resources (disk space, directory entry, and so on) formerly associated with the
object becoming immediately available. Thus, if a client needs to be able to continue to access a file after using REMOVE to remove it, the client should take steps to make sure that the file will still be accessible. While the traditional mechanism used is to RENAME the file from its old name to a new hidden name, the NFSv4.1 OPEN operation MAY return a result flag, OPEN4_RESULT_PRESERVE_UNLINKED, which indicates to the client that the file will be preserved if the file has an outstanding open (see Section 18.16). If the server finds that the file is still open when the REMOVE arrives: o The server SHOULD NOT delete the file's directory entry if the file was opened with OPEN4_SHARE_DENY_WRITE or OPEN4_SHARE_DENY_BOTH. o If the file was not opened with OPEN4_SHARE_DENY_WRITE or OPEN4_SHARE_DENY_BOTH, the server SHOULD delete the file's directory entry. However, until last CLOSE of the file, the server MAY continue to allow access to the file via its filehandle. o The server MUST NOT delete the directory entry if the reply from OPEN had the flag OPEN4_RESULT_PRESERVE_UNLINKED set. The server MAY implement its own restrictions on removal of a file while it is open. The server might disallow such a REMOVE (or a removal that occurs as part of RENAME). The conditions that influence the restrictions on removal of a file while it is still open include: o Whether certain access protocols (i.e., not just NFS) are holding the file open. o Whether particular options, access modes, or policies on the server are enabled. If a file has an outstanding OPEN and this prevents the removal of the file's directory entry, the error NFS4ERR_FILE_OPEN is returned. Where the determination above cannot be made definitively because delegations are being held, they MUST be recalled to allow processing of the REMOVE to continue. When a delegation is held, the server has no reliable knowledge of the status of OPENs for that client, so unless there are files opened with the particular deny modes by clients without delegations, the determination cannot be made until
delegations are recalled, and the operation cannot proceed until each sufficient delegation has been returned or revoked to allow the server to make a correct determination. In all cases in which delegations are recalled, the server is likely to return one or more NFS4ERR_DELAY errors while delegations remain outstanding. If the current filehandle designates a directory for which another client holds a directory delegation, then, unless the situation can be resolved by sending a notification, the directory delegation MUST be recalled, and the operation MUST NOT proceed until the delegation is returned or revoked. Except where this happens very quickly, one or more NFS4ERR_DELAY errors will be returned to requests made while delegation remains outstanding. When the current filehandle designates a directory for which one or more directory delegations exist, then, when those delegations request such notifications, NOTIFY4_REMOVE_ENTRY will be generated as a result of this operation. Note that when a remove occurs as a result of a RENAME, NOTIFY4_REMOVE_ENTRY will only be generated if the removal happens as a separate operation. In the case in which the removal is integrated and atomic with RENAME, the notification of the removal is integrated with notification for the RENAME. See the discussion of the NOTIFY4_RENAME_ENTRY notification in Section 20.4.18.26. Operation 29: RENAME - Rename Directory Entry
18.26.1. ARGUMENTS
struct RENAME4args { /* SAVED_FH: source directory */ component4 oldname; /* CURRENT_FH: target directory */ component4 newname; };18.26.2. RESULTS
struct RENAME4resok { change_info4 source_cinfo; change_info4 target_cinfo; };
union RENAME4res switch (nfsstat4 status) { case NFS4_OK: RENAME4resok resok4; default: void; };18.26.3. DESCRIPTION
The RENAME operation renames the object identified by oldname in the source directory corresponding to the saved filehandle, as set by the SAVEFH operation, to newname in the target directory corresponding to the current filehandle. The operation is required to be atomic to the client. Source and target directories MUST reside on the same file system on the server. On success, the current filehandle will continue to be the target directory. If the target directory already contains an entry with the name newname, the source object MUST be compatible with the target: either both are non-directories or both are directories and the target MUST be empty. If compatible, the existing target is removed before the rename occurs or, preferably, the target is removed atomically as part of the rename. See Section 18.25.4 for client and server actions whenever a target is removed. Note however that when the removal is performed atomically with the rename, certain parts of the removal described there are integrated with the rename. For example, notification of the removal will not be via a NOTIFY4_REMOVE_ENTRY but will be indicated as part of the NOTIFY4_ADD_ENTRY or NOTIFY4_RENAME_ENTRY generated by the rename. If the source object and the target are not compatible or if the target is a directory but not empty, the server will return the error NFS4ERR_EXIST. If oldname and newname both refer to the same file (e.g., they might be hard links of each other), then unless the file is open (see Section 18.26.4), RENAME MUST perform no action and return NFS4_OK. For both directories involved in the RENAME, the server returns change_info4 information. With the atomic field of the change_info4 data type, the server will indicate if the before and after change attributes were obtained atomically with respect to the rename. If oldname refers to a named attribute and the saved and current filehandles refer to different file system objects, the server will return NFS4ERR_XDEV just as if the saved and current filehandles represented directories on different file systems.
If oldname or newname has a length of zero, or if oldname or newname does not obey the UTF-8 definition, the error NFS4ERR_INVAL will be returned.18.26.4. IMPLEMENTATION
The server MAY impose restrictions on the RENAME operation such that RENAME may not be done when the file being renamed is open or when that open is done by particular protocols, or with particular options or access modes. Similar restrictions may be applied when a file exists with the target name and is open. When RENAME is rejected because of such restrictions, the error NFS4ERR_FILE_OPEN is returned. When oldname and rename refer to the same file and that file is open in a fashion such that RENAME would normally be rejected with NFS4ERR_FILE_OPEN if oldname and newname were different files, then RENAME SHOULD be rejected with NFS4ERR_FILE_OPEN. If a server does implement such restrictions and those restrictions include cases of NFSv4 opens preventing successful execution of a rename, the server needs to recall any delegations that could hide the existence of opens relevant to that decision. This is because when a client holds a delegation, the server might not have an accurate account of the opens for that client, since the client may execute OPENs and CLOSEs locally. The RENAME operation need only be delayed until a definitive result can be obtained. For example, if there are multiple delegations and one of them establishes an open whose presence would prevent the rename, given the server's semantics, NFS4ERR_FILE_OPEN may be returned to the caller as soon as that delegation is returned without waiting for other delegations to be returned. Similarly, if such opens are not associated with delegations, NFS4ERR_FILE_OPEN can be returned immediately with no delegation recall being done. If the current filehandle or the saved filehandle designates a directory for which another client holds a directory delegation, then, unless the situation can be resolved by sending a notification, the delegation MUST be recalled, and the operation cannot proceed until the delegation is returned or revoked. Except where this happens very quickly, one or more NFS4ERR_DELAY errors will be returned to requests made while delegation remains outstanding. When the current and saved filehandles are the same and they designate a directory for which one or more directory delegations exist, then, when those delegations request such notifications, a notification of type NOTIFY4_RENAME_ENTRY will be generated as a result of this operation. When oldname and rename refer to the same
file, no notification is generated (because, as Section 18.26.3 states, the server MUST take no action). When a file is removed because it has the same name as the target, if that removal is done atomically with the rename, a NOTIFY4_REMOVE_ENTRY notification will not be generated. Instead, the deletion of the file will be reported as part of the NOTIFY4_RENAME_ENTRY notification. When the current and saved filehandles are not the same: o If the current filehandle designates a directory for which one or more directory delegations exist, then, when those delegations request such notifications, NOTIFY4_ADD_ENTRY will be generated as a result of this operation. When a file is removed because it has the same name as the target, if that removal is done atomically with the rename, a NOTIFY4_REMOVE_ENTRY notification will not be generated. Instead, the deletion of the file will be reported as part of the NOTIFY4_ADD_ENTRY notification. o If the saved filehandle designates a directory for which one or more directory delegations exist, then, when those delegations request such notifications, NOTIFY4_REMOVE_ENTRY will be generated as a result of this operation. If the object being renamed has file delegations held by clients other than the one doing the RENAME, the delegations MUST be recalled, and the operation cannot proceed until each such delegation is returned or revoked. Note that in the case of multiply linked files, the delegation recall requirement applies even if the delegation was obtained through a different name than the one being renamed. In all cases in which delegations are recalled, the server is likely to return one or more NFS4ERR_DELAY errors while the delegation(s) remains outstanding, although it might not do that if the delegations are returned quickly. The RENAME operation must be atomic to the client. The statement "source and target directories MUST reside on the same file system on the server" means that the fsid fields in the attributes for the directories are the same. If they reside on different file systems, the error NFS4ERR_XDEV is returned. Based on the value of the fh_expire_type attribute for the object, the filehandle may or may not expire on a RENAME. However, server implementors are strongly encouraged to attempt to keep filehandles from expiring in this fashion.
On some servers, the file names "." and ".." are illegal as either oldname or newname, and will result in the error NFS4ERR_BADNAME. In addition, on many servers the case of oldname or newname being an alias for the source directory will be checked for. Such servers will return the error NFS4ERR_INVAL in these cases. If either of the source or target filehandles are not directories, the server will return NFS4ERR_NOTDIR.18.27. Operation 31: RESTOREFH - Restore Saved Filehandle
18.27.1. ARGUMENTS
/* SAVED_FH: */ void;18.27.2. RESULTS
struct RESTOREFH4res { /* * If status is NFS4_OK, * new CURRENT_FH: value of saved fh */ nfsstat4 status; };18.27.3. DESCRIPTION
The RESTOREFH operation sets the current filehandle and stateid to the values in the saved filehandle and stateid. If there is no saved filehandle, then the server will return the error NFS4ERR_NOFILEHANDLE. See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.2 for more details on the current stateid.
18.27.4. IMPLEMENTATION
Operations like OPEN and LOOKUP use the current filehandle to represent a directory and replace it with a new filehandle. Assuming that the previous filehandle was saved with a SAVEFH operator, the previous filehandle can be restored as the current filehandle. This is commonly used to obtain post-operation attributes for the directory, e.g., PUTFH (directory filehandle) SAVEFH GETATTR attrbits (pre-op dir attrs) CREATE optbits "foo" attrs GETATTR attrbits (file attributes) RESTOREFH GETATTR attrbits (post-op dir attrs)18.28. Operation 32: SAVEFH - Save Current Filehandle
18.28.1. ARGUMENTS
/* CURRENT_FH: */ void;18.28.2. RESULTS
struct SAVEFH4res { /* * If status is NFS4_OK, * new SAVED_FH: value of current fh */ nfsstat4 status; };18.28.3. DESCRIPTION
The SAVEFH operation saves the current filehandle and stateid. If a previous filehandle was saved, then it is no longer accessible. The saved filehandle can be restored as the current filehandle with the RESTOREFH operator. On success, the current filehandle retains its value. See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.2 for more details on the current stateid.
18.28.4. IMPLEMENTATION
18.29. Operation 33: SECINFO - Obtain Available Security
18.29.1. ARGUMENTS
struct SECINFO4args { /* CURRENT_FH: directory */ component4 name; };18.29.2. RESULTS
/* * From RFC 2203 */ enum rpc_gss_svc_t { RPC_GSS_SVC_NONE = 1, RPC_GSS_SVC_INTEGRITY = 2, RPC_GSS_SVC_PRIVACY = 3 }; struct rpcsec_gss_info { sec_oid4 oid; qop4 qop; rpc_gss_svc_t service; }; /* RPCSEC_GSS has a value of '6' - See RFC 2203 */ union secinfo4 switch (uint32_t flavor) { case RPCSEC_GSS: rpcsec_gss_info flavor_info; default: void; }; typedef secinfo4 SECINFO4resok<>; union SECINFO4res switch (nfsstat4 status) { case NFS4_OK: /* CURRENTFH: consumed */ SECINFO4resok resok4; default: void; };
18.29.3. DESCRIPTION
The SECINFO operation is used by the client to obtain a list of valid RPC authentication flavors for a specific directory filehandle, file name pair. SECINFO should apply the same access methodology used for LOOKUP when evaluating the name. Therefore, if the requester does not have the appropriate access to LOOKUP the name, then SECINFO MUST behave the same way and return NFS4ERR_ACCESS. The result will contain an array that represents the security mechanisms available, with an order corresponding to the server's preferences, the most preferred being first in the array. The client is free to pick whatever security mechanism it both desires and supports, or to pick in the server's preference order the first one it supports. The array entries are represented by the secinfo4 structure. The field 'flavor' will contain a value of AUTH_NONE, AUTH_SYS (as defined in RFC 5531 [3]), or RPCSEC_GSS (as defined in RFC 2203 [4]). The field flavor can also be any other security flavor registered with IANA. For the flavors AUTH_NONE and AUTH_SYS, no additional security information is returned. The same is true of many (if not most) other security flavors, including AUTH_DH. For a return value of RPCSEC_GSS, a security triple is returned that contains the mechanism object identifier (OID, as defined in RFC 2743 [7]), the quality of protection (as defined in RFC 2743 [7]), and the service type (as defined in RFC 2203 [4]). It is possible for SECINFO to return multiple entries with flavor equal to RPCSEC_GSS with different security triple values. On success, the current filehandle is consumed (see Section 2.6.3.1.1.8), and if the next operation after SECINFO tries to use the current filehandle, that operation will fail with the status NFS4ERR_NOFILEHANDLE. If the name has a length of zero, or if the name does not obey the UTF-8 definition (assuming UTF-8 capabilities are enabled; see Section 14.4), the error NFS4ERR_INVAL will be returned. See Section 2.6 for additional information on the use of SECINFO.18.29.4. IMPLEMENTATION
The SECINFO operation is expected to be used by the NFS client when the error value of NFS4ERR_WRONGSEC is returned from another NFS operation. This signifies to the client that the server's security
policy is different from what the client is currently using. At this point, the client is expected to obtain a list of possible security flavors and choose what best suits its policies. As mentioned, the server's security policies will determine when a client request receives NFS4ERR_WRONGSEC. See Table 8 for a list of operations that can return NFS4ERR_WRONGSEC. In addition, when READDIR returns attributes, the rdattr_error (Section 5.8.1.12) can contain NFS4ERR_WRONGSEC. Note that CREATE and REMOVE MUST NOT return NFS4ERR_WRONGSEC. The rationale for CREATE is that unless the target name exists, it cannot have a separate security policy from the parent directory, and the security policy of the parent was checked when its filehandle was injected into the COMPOUND request's operations stream (for similar reasons, an OPEN operation that creates the target MUST NOT return NFS4ERR_WRONGSEC). If the target name exists, while it might have a separate security policy, that is irrelevant because CREATE MUST return NFS4ERR_EXIST. The rationale for REMOVE is that while that target might have a separate security policy, the target is going to be removed, and so the security policy of the parent trumps that of the object being removed. RENAME and LINK MAY return NFS4ERR_WRONGSEC, but the NFS4ERR_WRONGSEC error applies only to the saved filehandle (see Section 2.6.3.1.2). Any NFS4ERR_WRONGSEC error on the current filehandle used by LINK and RENAME MUST be returned by the PUTFH, PUTPUBFH, PUTROOTFH, or RESTOREFH operation that injected the current filehandle. With the exception of LINK and RENAME, the set of operations that can return NFS4ERR_WRONGSEC represents the point at which the client can inject a filehandle into the "current filehandle" at the server. The filehandle is either provided by the client (PUTFH, PUTPUBFH, PUTROOTFH), generated as a result of a name-to-filehandle translation (LOOKUP and OPEN), or generated from the saved filehandle via RESTOREFH. As Section 2.6.3.1.1.1 states, a put filehandle operation followed by SAVEFH MUST NOT return NFS4ERR_WRONGSEC. Thus, the RESTOREFH operation, under certain conditions (see Section 2.6.3.1.1), is permitted to return NFS4ERR_WRONGSEC so that security policies can be honored. The READDIR operation will not directly return the NFS4ERR_WRONGSEC error. However, if the READDIR request included a request for attributes, it is possible that the READDIR request's security triple did not match that of a directory entry. If this is the case and the client has requested the rdattr_error attribute, the server will return the NFS4ERR_WRONGSEC error in rdattr_error for the entry. To resolve an error return of NFS4ERR_WRONGSEC, the client does the following:
o For LOOKUP and OPEN, the client will use SECINFO with the same current filehandle and name as provided in the original LOOKUP or OPEN to enumerate the available security triples. o For the rdattr_error, the client will use SECINFO with the same current filehandle as provided in the original READDIR. The name passed to SECINFO will be that of the directory entry (as returned from READDIR) that had the NFS4ERR_WRONGSEC error in the rdattr_error attribute. o For PUTFH, PUTROOTFH, PUTPUBFH, RESTOREFH, LINK, and RENAME, the client will use SECINFO_NO_NAME { style = SECINFO_STYLE4_CURRENT_FH }. The client will prefix the SECINFO_NO_NAME operation with the appropriate PUTFH, PUTPUBFH, or PUTROOTFH operation that provides the filehandle originally provided by the PUTFH, PUTPUBFH, PUTROOTFH, or RESTOREFH operation. NOTE: In NFSv4.0, the client was required to use SECINFO, and had to reconstruct the parent of the original filehandle and the component name of the original filehandle. The introduction in NFSv4.1 of SECINFO_NO_NAME obviates the need for reconstruction. o For LOOKUPP, the client will use SECINFO_NO_NAME { style = SECINFO_STYLE4_PARENT } and provide the filehandle that equals the filehandle originally provided to LOOKUPP. See Section 21 for a discussion on the recommendations for the security flavor used by SECINFO and SECINFO_NO_NAME.18.30. Operation 34: SETATTR - Set Attributes
18.30.1. ARGUMENTS
struct SETATTR4args { /* CURRENT_FH: target object */ stateid4 stateid; fattr4 obj_attributes; };18.30.2. RESULTS
struct SETATTR4res { nfsstat4 status; bitmap4 attrsset; };
18.30.3. DESCRIPTION
The SETATTR operation changes one or more of the attributes of a file system object. The new attributes are specified with a bitmap and the attributes that follow the bitmap in bit order. The stateid argument for SETATTR is used to provide byte-range locking context that is necessary for SETATTR requests that set the size attribute. Since setting the size attribute modifies the file's data, it has the same locking requirements as a corresponding WRITE. Any SETATTR that sets the size attribute is incompatible with a share reservation that specifies OPEN4_SHARE_DENY_WRITE. The area between the old end-of-file and the new end-of-file is considered to be modified just as would have been the case had the area in question been specified as the target of WRITE, for the purpose of checking conflicts with byte-range locks, for those cases in which a server is implementing mandatory byte-range locking behavior. A valid stateid SHOULD always be specified. When the file size attribute is not set, the special stateid consisting of all bits equal to zero MAY be passed. On either success or failure of the operation, the server will return the attrsset bitmask to represent what (if any) attributes were successfully set. The attrsset in the response is a subset of the attrmask field of the obj_attributes field in the argument. On success, the current filehandle retains its value.18.30.4. IMPLEMENTATION
If the request specifies the owner attribute to be set, the server SHOULD allow the operation to succeed if the current owner of the object matches the value specified in the request. Some servers may be implemented in a way as to prohibit the setting of the owner attribute unless the requester has privilege to do so. If the server is lenient in this one case of matching owner values, the client implementation may be simplified in cases of creation of an object (e.g., an exclusive create via OPEN) followed by a SETATTR. The file size attribute is used to request changes to the size of a file. A value of zero causes the file to be truncated, a value less than the current size of the file causes data from new size to the end of the file to be discarded, and a size greater than the current size of the file causes logically zeroed data bytes to be added to the end of the file. Servers are free to implement this using unallocated bytes (holes) or allocated data bytes set to zero. Clients should not make any assumptions regarding a server's
implementation of this feature, beyond that the bytes in the affected byte-range returned by READ will be zeroed. Servers MUST support extending the file size via SETATTR. SETATTR is not guaranteed to be atomic. A failed SETATTR may partially change a file's attributes, hence the reason why the reply always includes the status and the list of attributes that were set. If the object whose attributes are being changed has a file delegation that is held by a client other than the one doing the SETATTR, the delegation(s) must be recalled, and the operation cannot proceed to actually change an attribute until each such delegation is returned or revoked. In all cases in which delegations are recalled, the server is likely to return one or more NFS4ERR_DELAY errors while the delegation(s) remains outstanding, although it might not do that if the delegations are returned quickly. If the object whose attributes are being set is a directory and another client holds a directory delegation for that directory, then if enabled, asynchronous notifications will be generated when the set of attributes changed has a non-null intersection with the set of attributes for which notification is requested. Notifications of type NOTIFY4_CHANGE_DIR_ATTRS will be sent to the appropriate client(s), but the SETATTR is not delayed by waiting for these notifications to be sent. If the object whose attributes are being set is a member of the directory for which another client holds a directory delegation, then asynchronous notifications will be generated when the set of attributes changed has a non-null intersection with the set of attributes for which notification is requested. Notifications of type NOTIFY4_CHANGE_CHILD_ATTRS will be sent to the appropriate clients, but the SETATTR is not delayed by waiting for these notifications to be sent. Changing the size of a file with SETATTR indirectly changes the time_modify and change attributes. A client must account for this as size changes can result in data deletion. The attributes time_access_set and time_modify_set are write-only attributes constructed as a switched union so the client can direct the server in setting the time values. If the switched union specifies SET_TO_CLIENT_TIME4, the client has provided an nfstime4 to be used for the operation. If the switch union does not specify SET_TO_CLIENT_TIME4, the server is to use its current time for the SETATTR operation.
If server and client times differ, programs that compare client time to file times can break. A time synchronization protocol should be used to limit client/server time skew. Use of a COMPOUND containing a VERIFY operation specifying only the change attribute, immediately followed by a SETATTR, provides a means whereby a client may specify a request that emulates the functionality of the SETATTR guard mechanism of NFSv3. Since the function of the guard mechanism is to avoid changes to the file attributes based on stale information, delays between checking of the guard condition and the setting of the attributes have the potential to compromise this function, as would the corresponding delay in the NFSv4 emulation. Therefore, NFSv4.1 servers SHOULD take care to avoid such delays, to the degree possible, when executing such a request. If the server does not support an attribute as requested by the client, the server SHOULD return NFS4ERR_ATTRNOTSUPP. A mask of the attributes actually set is returned by SETATTR in all cases. That mask MUST NOT include attribute bits not requested to be set by the client. If the attribute masks in the request and reply are equal, the status field in the reply MUST be NFS4_OK.18.31. Operation 37: VERIFY - Verify Same Attributes
18.31.1. ARGUMENTS
struct VERIFY4args { /* CURRENT_FH: object */ fattr4 obj_attributes; };18.31.2. RESULTS
struct VERIFY4res { nfsstat4 status; };18.31.3. DESCRIPTION
The VERIFY operation is used to verify that attributes have the value assumed by the client before proceeding with the following operations in the COMPOUND request. If any of the attributes do not match, then the error NFS4ERR_NOT_SAME must be returned. The current filehandle retains its value after successful completion of the operation.
18.31.4. IMPLEMENTATION
One possible use of the VERIFY operation is the following series of operations. With this, the client is attempting to verify that the file being removed will match what the client expects to be removed. This series can help prevent the unintended deletion of a file. PUTFH (directory filehandle) LOOKUP (file name) VERIFY (filehandle == fh) PUTFH (directory filehandle) REMOVE (file name) This series does not prevent a second client from removing and creating a new file in the middle of this sequence, but it does help avoid the unintended result. In the case that a RECOMMENDED attribute is specified in the VERIFY operation and the server does not support that attribute for the file system object, the error NFS4ERR_ATTRNOTSUPP is returned to the client. When the attribute rdattr_error or any set-only attribute (e.g., time_modify_set) is specified, the error NFS4ERR_INVAL is returned to the client.18.32. Operation 38: WRITE - Write to File
18.32.1. ARGUMENTS
enum stable_how4 { UNSTABLE4 = 0, DATA_SYNC4 = 1, FILE_SYNC4 = 2 }; struct WRITE4args { /* CURRENT_FH: file */ stateid4 stateid; offset4 offset; stable_how4 stable; opaque data<>; };
18.32.2. RESULTS
struct WRITE4resok { count4 count; stable_how4 committed; verifier4 writeverf; }; union WRITE4res switch (nfsstat4 status) { case NFS4_OK: WRITE4resok resok4; default: void; };18.32.3. DESCRIPTION
The WRITE operation is used to write data to a regular file. The target file is specified by the current filehandle. The offset specifies the offset where the data should be written. An offset of zero specifies that the write should start at the beginning of the file. The count, as encoded as part of the opaque data parameter, represents the number of bytes of data that are to be written. If the count is zero, the WRITE will succeed and return a count of zero subject to permissions checking. The server MAY write fewer bytes than requested by the client. The client specifies with the stable parameter the method of how the data is to be processed by the server. If stable is FILE_SYNC4, the server MUST commit the data written plus all file system metadata to stable storage before returning results. This corresponds to the NFSv2 protocol semantics. Any other behavior constitutes a protocol violation. If stable is DATA_SYNC4, then the server MUST commit all of the data to stable storage and enough of the metadata to retrieve the data before returning. The server implementor is free to implement DATA_SYNC4 in the same fashion as FILE_SYNC4, but with a possible performance drop. If stable is UNSTABLE4, the server is free to commit any part of the data and the metadata to stable storage, including all or none, before returning a reply to the client. There is no guarantee whether or when any uncommitted data will subsequently be committed to stable storage. The only guarantees made by the server are that it will not destroy any data without changing the value of writeverf and that it will not commit the data and metadata at a level less than that requested by the client.
Except when special stateids are used, the stateid value for a WRITE request represents a value returned from a previous byte-range LOCK or OPEN request or the stateid associated with a delegation. The stateid identifies the associated owners if any and is used by the server to verify that the associated locks are still valid (e.g., have not been revoked). Upon successful completion, the following results are returned. The count result is the number of bytes of data written to the file. The server may write fewer bytes than requested. If so, the actual number of bytes written starting at location, offset, is returned. The server also returns an indication of the level of commitment of the data and metadata via committed. Per Table 11, o The server MAY commit the data at a stronger level than requested. o The server MUST commit the data at a level at least as high as that committed. Valid combinations of the fields stable in the request and committed in the reply. +------------+-----------------------------------+ | stable | committed | +------------+-----------------------------------+ | UNSTABLE4 | FILE_SYNC4, DATA_SYNC4, UNSTABLE4 | | DATA_SYNC4 | FILE_SYNC4, DATA_SYNC4 | | FILE_SYNC4 | FILE_SYNC4 | +------------+-----------------------------------+ Table 11 The final portion of the result is the field writeverf. This field is the write verifier and is a cookie that the client can use to determine whether a server has changed instance state (e.g., server restart) between a call to WRITE and a subsequent call to either WRITE or COMMIT. This cookie MUST be unchanged during a single instance of the NFSv4.1 server and MUST be unique between instances of the NFSv4.1 server. If the cookie changes, then the client MUST assume that any data written with an UNSTABLE4 value for committed and an old writeverf in the reply has been lost and will need to be recovered. If a client writes data to the server with the stable argument set to UNSTABLE4 and the reply yields a committed response of DATA_SYNC4 or UNSTABLE4, the client will follow up some time in the future with a COMMIT operation to synchronize outstanding asynchronous data and
metadata with the server's stable storage, barring client error. It is possible that due to client crash or other error that a subsequent COMMIT will not be received by the server. For a WRITE with a stateid value of all bits equal to zero, the server MAY allow the WRITE to be serviced subject to mandatory byte- range locks or the current share deny modes for the file. For a WRITE with a stateid value of all bits equal to 1, the server MUST NOT allow the WRITE operation to bypass locking checks at the server and otherwise is treated as if a stateid of all bits equal to zero were used. On success, the current filehandle retains its value.18.32.4. IMPLEMENTATION
It is possible for the server to write fewer bytes of data than requested by the client. In this case, the server SHOULD NOT return an error unless no data was written at all. If the server writes less than the number of bytes specified, the client will need to send another WRITE to write the remaining data. It is assumed that the act of writing data to a file will cause the time_modified and change attributes of the file to be updated. However, these attributes SHOULD NOT be changed unless the contents of the file are changed. Thus, a WRITE request with count set to zero SHOULD NOT cause the time_modified and change attributes of the file to be updated. Stable storage is persistent storage that survives: 1. Repeated power failures. 2. Hardware failures (of any board, power supply, etc.). 3. Repeated software crashes and restarts. This definition does not address failure of the stable storage module itself. The verifier is defined to allow a client to detect different instances of an NFSv4.1 protocol server over which cached, uncommitted data may be lost. In the most likely case, the verifier allows the client to detect server restarts. This information is required so that the client can safely determine whether the server could have lost cached data. If the server fails unexpectedly and the client has uncommitted data from previous WRITE requests (done with the stable argument set to UNSTABLE4 and in which the result
committed was returned as UNSTABLE4 as well), the server might not have flushed cached data to stable storage. The burden of recovery is on the client, and the client will need to retransmit the data to the server. A suggested verifier would be to use the time that the server was last started (if restarting the server results in lost buffers). The reply's committed field allows the client to do more effective caching. If the server is committing all WRITE requests to stable storage, then it SHOULD return with committed set to FILE_SYNC4, regardless of the value of the stable field in the arguments. A server that uses an NVRAM accelerator may choose to implement this policy. The client can use this to increase the effectiveness of the cache by discarding cached data that has already been committed on the server. Some implementations may return NFS4ERR_NOSPC instead of NFS4ERR_DQUOT when a user's quota is exceeded. In the case that the current filehandle is of type NF4DIR, the server will return NFS4ERR_ISDIR. If the current file is a symbolic link, the error NFS4ERR_SYMLINK will be returned. Otherwise, if the current filehandle does not designate an ordinary file, the server will return NFS4ERR_WRONG_TYPE. If mandatory byte-range locking is in effect for the file, and the corresponding byte-range of the data to be written to the file is READ_LT or WRITE_LT locked by an owner that is not associated with the stateid, the server MUST return NFS4ERR_LOCKED. If so, the client MUST check if the owner corresponding to the stateid used with the WRITE operation has a conflicting READ_LT lock that overlaps with the byte-range that was to be written. If the stateid's owner has no conflicting READ_LT lock, then the client SHOULD try to get the appropriate write byte-range lock via the LOCK operation before re- attempting the WRITE. When the WRITE completes, the client SHOULD release the byte-range lock via LOCKU. If the stateid's owner had a conflicting READ_LT lock, then the client has no choice but to return an error to the application that attempted the WRITE. The reason is that since the stateid's owner had a READ_LT lock, either the server attempted to temporarily effectively upgrade this READ_LT lock to a WRITE_LT lock or the server has no upgrade capability. If the server attempted to upgrade the READ_LT lock and failed, it is pointless for the client to re- attempt the upgrade via the LOCK operation, because there might be another client also trying to upgrade. If two clients are blocked
trying to upgrade the same lock, the clients deadlock. If the server has no upgrade capability, then it is pointless to try a LOCK operation to upgrade. If one or more other clients have delegations for the file being written, those delegations MUST be recalled, and the operation cannot proceed until those delegations are returned or revoked. Except where this happens very quickly, one or more NFS4ERR_DELAY errors will be returned to requests made while the delegation remains outstanding. Normally, delegations will not be recalled as a result of a WRITE operation since the recall will occur as a result of an earlier OPEN. However, since it is possible for a WRITE to be done with a special stateid, the server needs to check for this case even though the client should have done an OPEN previously.