
RFC 5661

Network File System (NFS) Version 4 Minor Version 1 Protocol

Pages: 617
Obsoleted by:  8881
Updated by:  8178, 8434
Part 19 of 20 – Pages 552 to 587

18.45. Operation 52: SECINFO_NO_NAME - Get Security on Unnamed Object

18.45.1. ARGUMENT

   enum secinfo_style4 {
           SECINFO_STYLE4_CURRENT_FH       = 0,
           SECINFO_STYLE4_PARENT           = 1
   };

   /* CURRENT_FH: object or child directory */
   typedef secinfo_style4 SECINFO_NO_NAME4args;

18.45.2. RESULT

   /* CURRENTFH: consumed if status is NFS4_OK */
   typedef SECINFO4res SECINFO_NO_NAME4res;

18.45.3. DESCRIPTION

   Like the SECINFO operation, SECINFO_NO_NAME is used by the client to
   obtain a list of valid RPC authentication flavors for a specific file
   object.  Unlike SECINFO, SECINFO_NO_NAME only works with objects that
   are accessed by filehandle.

   There are two styles of SECINFO_NO_NAME, as determined by the value
   of the secinfo_style4 enumeration.  If SECINFO_STYLE4_CURRENT_FH is
   passed, then SECINFO_NO_NAME is querying for the required security
   for the current filehandle.  If SECINFO_STYLE4_PARENT is passed, then
   SECINFO_NO_NAME is querying for the required security of the current
   filehandle's parent.  If the style selected is SECINFO_STYLE4_PARENT,
   then SECINFO_NO_NAME should apply the same access methodology used
   for LOOKUPP when evaluating the traversal to the parent directory.
   Therefore, if the requester does not have the appropriate access to
   LOOKUPP the parent, then SECINFO_NO_NAME must behave the same way and
   return NFS4ERR_ACCESS.

   If PUTFH, PUTPUBFH, PUTROOTFH, or RESTOREFH returns NFS4ERR_WRONGSEC,
   then the client resolves the situation by sending a COMPOUND request
   that consists of PUTFH, PUTPUBFH, or PUTROOTFH immediately followed
   by SECINFO_NO_NAME, style SECINFO_STYLE4_CURRENT_FH.  See Section 2.6
   for instructions on dealing with NFS4ERR_WRONGSEC error returns from
   PUTFH, PUTROOTFH, PUTPUBFH, or RESTOREFH.

   If SECINFO_STYLE4_PARENT is specified and there is no parent
   directory, SECINFO_NO_NAME MUST return NFS4ERR_NOENT.

   On success, the current filehandle is consumed (see
   Section 2.6.3.1.1.8), and if the next operation after SECINFO_NO_NAME
   tries to use the current filehandle, that operation will fail with
   the status NFS4ERR_NOFILEHANDLE.

   Everything else about SECINFO_NO_NAME is the same as SECINFO.  See
   the discussion on SECINFO (Section 18.29.3).
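
   The style selection and the consumption of the current filehandle
   described above can be sketched as a small server-side model.  This
   is an illustration only: the FileObject and CompoundState classes,
   the string status codes, and the NFS4ERR_INVAL result for an unknown
   style are assumptions made for this example and are not taken from
   the protocol definition.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SECINFO_STYLE4_CURRENT_FH = 0
SECINFO_STYLE4_PARENT = 1


@dataclass
class FileObject:
    """Hypothetical stand-in for the object named by a filehandle."""
    name: str
    flavors: List[str]
    parent: Optional["FileObject"] = None
    parent_lookup_allowed: bool = True  # models the LOOKUPP access check


@dataclass
class CompoundState:
    """Hypothetical per-COMPOUND state holding the current filehandle."""
    current_fh: Optional[FileObject] = None


def secinfo_no_name(state: CompoundState, style: int) -> Tuple[str, List[str]]:
    """Return (status, flavors); consume the current filehandle on success."""
    fh = state.current_fh
    if fh is None:
        return "NFS4ERR_NOFILEHANDLE", []
    if style == SECINFO_STYLE4_PARENT:
        if fh.parent is None:
            return "NFS4ERR_NOENT", []
        if not fh.parent_lookup_allowed:
            return "NFS4ERR_ACCESS", []
        target = fh.parent
    elif style == SECINFO_STYLE4_CURRENT_FH:
        target = fh
    else:
        return "NFS4ERR_INVAL", []  # assumption: not stated in this section
    state.current_fh = None  # consumed: a following operation has no filehandle
    return "NFS4_OK", list(target.flavors)
```

   On an error return the model leaves the current filehandle in place,
   matching the rule that it is consumed only when the status is
   NFS4_OK.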

18.45.4. IMPLEMENTATION

See the discussion on SECINFO (Section 18.29.4).

18.46. Operation 53: SEQUENCE - Supply Per-Procedure Sequencing and Control

18.46.1. ARGUMENT

   struct SEQUENCE4args {
           sessionid4     sa_sessionid;
           sequenceid4    sa_sequenceid;
           slotid4        sa_slotid;
           slotid4        sa_highest_slotid;
           bool           sa_cachethis;
   };

18.46.2. RESULT

   const SEQ4_STATUS_CB_PATH_DOWN                  = 0x00000001;
   const SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING      = 0x00000002;
   const SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED       = 0x00000004;
   const SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED     = 0x00000008;
   const SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED    = 0x00000010;
   const SEQ4_STATUS_ADMIN_STATE_REVOKED           = 0x00000020;
   const SEQ4_STATUS_RECALLABLE_STATE_REVOKED      = 0x00000040;
   const SEQ4_STATUS_LEASE_MOVED                   = 0x00000080;
   const SEQ4_STATUS_RESTART_RECLAIM_NEEDED        = 0x00000100;
   const SEQ4_STATUS_CB_PATH_DOWN_SESSION          = 0x00000200;
   const SEQ4_STATUS_BACKCHANNEL_FAULT             = 0x00000400;
   const SEQ4_STATUS_DEVID_CHANGED                 = 0x00000800;
   const SEQ4_STATUS_DEVID_DELETED                 = 0x00001000;

   struct SEQUENCE4resok {
           sessionid4      sr_sessionid;
           sequenceid4     sr_sequenceid;
           slotid4         sr_slotid;
           slotid4         sr_highest_slotid;
           slotid4         sr_target_highest_slotid;
           uint32_t        sr_status_flags;
   };

   union SEQUENCE4res switch (nfsstat4 sr_status) {
   case NFS4_OK:
           SEQUENCE4resok  sr_resok4;
   default:
           void;
   };

18.46.3. DESCRIPTION

   The SEQUENCE operation is used by the server to implement session
   request control and the reply cache semantics.

   SEQUENCE MUST appear as the first operation of any COMPOUND in which
   it appears.  The error NFS4ERR_SEQUENCE_POS will be returned when it
   is found in any position in a COMPOUND beyond the first.  Operations
   other than SEQUENCE, BIND_CONN_TO_SESSION, EXCHANGE_ID,
   CREATE_SESSION, and DESTROY_SESSION, MUST NOT appear as the first
   operation in a COMPOUND.  Such operations MUST yield the error
   NFS4ERR_OP_NOT_IN_SESSION if they do appear at the start of a
   COMPOUND.

   If SEQUENCE is received on a connection not associated with the
   session via CREATE_SESSION or BIND_CONN_TO_SESSION, and connection
   association enforcement is enabled (see Section 18.35), then the
   server returns NFS4ERR_CONN_NOT_BOUND_TO_SESSION.

   The sa_sessionid argument identifies the session to which this
   request applies.  The sr_sessionid result MUST equal sa_sessionid.

   The sa_slotid argument is the index in the reply cache for the
   request.  The sa_sequenceid field is the sequence number of the
   request for the reply cache entry (slot).  The sr_slotid result MUST
   equal sa_slotid.  The sr_sequenceid result MUST equal sa_sequenceid.

   The sa_highest_slotid argument is the highest slot ID for which the
   client has a request outstanding; it could be equal to sa_slotid.
   The server returns two "highest_slotid" values: sr_highest_slotid and
   sr_target_highest_slotid.  The former is the highest slot ID the
   server will accept in a future SEQUENCE operation, and SHOULD NOT be
   less than the value of sa_highest_slotid (but see Section 2.10.6.1
   for an exception).  The latter is the highest slot ID the server
   would prefer the client use on a future SEQUENCE operation.
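
   The rules in this description about which operations may start a
   COMPOUND can be sketched as a simple check over the list of
   operation names.  The function name and the representation of a
   COMPOUND as a list of strings are assumptions made for illustration;
   the set of permitted first operations is the one listed in this
   description only.

```python
from typing import List, Optional

# Operations this description allows as the first operation of a COMPOUND.
# Other sections may describe further cases; this sketch does not cover them.
ALLOWED_FIRST_OPS = {
    "SEQUENCE",
    "BIND_CONN_TO_SESSION",
    "EXCHANGE_ID",
    "CREATE_SESSION",
    "DESTROY_SESSION",
}


def check_compound_positions(ops: List[str]) -> Optional[str]:
    """Return the error name the position rules would produce, or None."""
    if not ops:
        return None
    if ops[0] not in ALLOWED_FIRST_OPS:
        return "NFS4ERR_OP_NOT_IN_SESSION"
    if "SEQUENCE" in ops[1:]:
        # SEQUENCE found in a position beyond the first.
        return "NFS4ERR_SEQUENCE_POS"
    return None
```

   For example, ["SEQUENCE", "PUTFH", "GETATTR"] passes the check, while
   a COMPOUND that begins with PUTFH does not.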

   If sa_cachethis is TRUE, then the client is requesting that the
   server cache the entire reply in the server's reply cache; therefore,
   the server MUST cache the reply (see Section 2.10.6.1.3).  The server
   MAY cache the reply if sa_cachethis is FALSE.  If the server does not
   cache the entire reply, it MUST still record that it executed the
   request at the specified slot and sequence ID.

   The response to the SEQUENCE operation contains a word of status
   flags (sr_status_flags) that can provide to the client information
   related to the status of the client's lock state and communications
   paths.  Note that any status bits relating to lock state MAY be reset
   when lock state is lost due to a server restart (even if the session
   is persistent across restarts; session persistence does not imply
   lock state persistence) or the establishment of a new client
   instance.

   SEQ4_STATUS_CB_PATH_DOWN
      When set, indicates that the client has no operational backchannel
      path for any session associated with the client ID, making it
      necessary for the client to re-establish one.  This bit remains
      set on all SEQUENCE responses on all sessions associated with the
      client ID until at least one backchannel is available on any
      session associated with the client ID.  If the client fails to re-
      establish a backchannel for the client ID, it is subject to having
      recallable state revoked.

   SEQ4_STATUS_CB_PATH_DOWN_SESSION
      When set, indicates that the session has no operational
      backchannel.  There are two reasons why
      SEQ4_STATUS_CB_PATH_DOWN_SESSION may be set and not
      SEQ4_STATUS_CB_PATH_DOWN.  First is that a callback operation that
      applies specifically to the session (e.g., CB_RECALL_SLOT, see
      Section 20.8) needs to be sent.  Second is that the server did
      send a callback operation, but the connection was lost before the
      reply.  The server cannot be sure whether or not the client
      received the callback operation, and so, per rules on request
      retry, the server MUST retry the callback operation over the same
      session.  The SEQ4_STATUS_CB_PATH_DOWN_SESSION bit is the
      indication to the client that it needs to associate a connection
      to the session's backchannel.  This bit remains set on all
      SEQUENCE responses of the session until a connection is associated
      with the session's backchannel.  If the client fails to
      re-establish a backchannel for the session, it is subject to
      having recallable state revoked.

   SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING
      When set, indicates that all GSS contexts or RPCSEC_GSS handles
      assigned to the session's backchannel will expire within a period
      equal to the lease time.  This bit remains set on all SEQUENCE
      replies until at least one of the following is true:

      *  All SSV RPCSEC_GSS handles on the session's backchannel have
         been destroyed and all non-SSV GSS contexts have expired.

      *  At least one more SSV RPCSEC_GSS handle has been added to the
         backchannel.

      *  The expiration time of at least one non-SSV GSS context of an
         RPCSEC_GSS handle is beyond the lease period from the current
         time (relative to the time when a SEQUENCE response was
         sent).

   SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED
      When set, indicates all non-SSV GSS contexts and all SSV
      RPCSEC_GSS handles assigned to the session's backchannel have
      expired or have been destroyed.  This bit remains set on all
      SEQUENCE replies until at least one non-expired non-SSV GSS
      context for the session's backchannel has been established or at
      least one SSV RPCSEC_GSS handle has been assigned to the
      backchannel.

   SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED
      When set, indicates that the lease has expired and as a result the
      server released all of the client's locking state.  This status
      bit remains set on all SEQUENCE replies until the loss of all such
      locks has been acknowledged by use of FREE_STATEID (see
      Section 18.38), or by establishing a new client instance by
      destroying all sessions (via DESTROY_SESSION), the client ID (via
      DESTROY_CLIENTID), and then invoking EXCHANGE_ID and
      CREATE_SESSION to establish a new client ID.

   SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED
      When set, indicates that some subset of the client's locks have
      been revoked due to expiration of the lease period followed by
      another client's conflicting LOCK operation.  This status bit
      remains set on all SEQUENCE replies until the loss of all such
      locks has been acknowledged by use of FREE_STATEID.

   SEQ4_STATUS_ADMIN_STATE_REVOKED
      When set, indicates that one or more locks have been revoked
      without expiration of the lease period, due to administrative
      action.  This status bit remains set on all SEQUENCE replies until
      the loss of all such locks has been acknowledged by use of
      FREE_STATEID.

   SEQ4_STATUS_RECALLABLE_STATE_REVOKED
      When set, indicates that one or more recallable objects have been
      revoked without expiration of the lease period, due to the
      client's failure to return them when recalled, which may be a
      consequence of there being no working backchannel and the client
      failing to re-establish a backchannel per the
      SEQ4_STATUS_CB_PATH_DOWN, SEQ4_STATUS_CB_PATH_DOWN_SESSION, or
      SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED status flags.  This status bit
      remains set on all SEQUENCE replies until the loss of all such
      locks has been acknowledged by use of FREE_STATEID.

   SEQ4_STATUS_LEASE_MOVED
      When set, indicates that responsibility for lease renewal has been
      transferred to one or more new servers.  This condition will
      continue until the client receives an NFS4ERR_MOVED error and the
      server receives the subsequent GETATTR for the fs_locations or
      fs_locations_info attribute for an access to each file system for
      which a lease has been moved to a new server.  See
      Section 11.7.7.1.

   SEQ4_STATUS_RESTART_RECLAIM_NEEDED
      When set, indicates that due to server restart, the client must
      reclaim locking state.  Until the client sends a global
      RECLAIM_COMPLETE (Section 18.51), every SEQUENCE operation will
      return SEQ4_STATUS_RESTART_RECLAIM_NEEDED.

   SEQ4_STATUS_BACKCHANNEL_FAULT
      The server has encountered an unrecoverable fault with the
      backchannel (e.g., it has lost track of the sequence ID for a slot
      in the backchannel).  The client MUST stop sending more requests
      on the session's fore channel, wait for all outstanding requests
      to complete on the fore and back channel, and then destroy the
      session.

   SEQ4_STATUS_DEVID_CHANGED
      The client is using device ID notifications and the server has
      changed a device ID mapping held by the client.  This flag will
      stay present until the client has obtained the new mapping with
      GETDEVICEINFO.

   SEQ4_STATUS_DEVID_DELETED
      The client is using device ID notifications and the server has
      deleted a device ID mapping held by the client.  This flag will
      stay in effect until the client sends a GETDEVICEINFO on the
      device ID with a null value in the argument gdia_notify_types.
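
   A client typically needs to turn the sr_status_flags word into the
   set of conditions listed above.  The following sketch does so using
   the constant values from the RESULT definition; the function name and
   the choice to report unrecognized bits separately are assumptions of
   this example.

```python
from typing import List, Tuple

SEQ4_STATUS_FLAGS = {
    0x00000001: "SEQ4_STATUS_CB_PATH_DOWN",
    0x00000002: "SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING",
    0x00000004: "SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED",
    0x00000008: "SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED",
    0x00000010: "SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED",
    0x00000020: "SEQ4_STATUS_ADMIN_STATE_REVOKED",
    0x00000040: "SEQ4_STATUS_RECALLABLE_STATE_REVOKED",
    0x00000080: "SEQ4_STATUS_LEASE_MOVED",
    0x00000100: "SEQ4_STATUS_RESTART_RECLAIM_NEEDED",
    0x00000200: "SEQ4_STATUS_CB_PATH_DOWN_SESSION",
    0x00000400: "SEQ4_STATUS_BACKCHANNEL_FAULT",
    0x00000800: "SEQ4_STATUS_DEVID_CHANGED",
    0x00001000: "SEQ4_STATUS_DEVID_DELETED",
}


def decode_status_flags(sr_status_flags: int) -> Tuple[List[str], int]:
    """Return (names of set flags in bit order, remaining unknown bits)."""
    names = []
    remaining = sr_status_flags
    for bit in sorted(SEQ4_STATUS_FLAGS):
        if sr_status_flags & bit:
            names.append(SEQ4_STATUS_FLAGS[bit])
            remaining &= ~bit
    return names, remaining
```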

   The value of the sa_sequenceid argument relative to the cached
   sequence ID on the slot falls into one of three cases.

   o  If the difference between sa_sequenceid and the server's cached
      sequence ID at the slot ID is two (2) or more, or if sa_sequenceid
      is less than the cached sequence ID (accounting for wraparound of
      the unsigned sequence ID value), then the server MUST return
      NFS4ERR_SEQ_MISORDERED.

   o  If sa_sequenceid and the cached sequence ID are the same, this is
      a retry, and the server replies with what is recorded in the reply
      cache.  The lease is possibly renewed as described below.

   o  If sa_sequenceid is one greater (accounting for wraparound) than
      the cached sequence ID, then this is a new request, and the slot's
      sequence ID is incremented.  The operations subsequent to
      SEQUENCE, if any, are processed.  If there are no other
      operations, the only other effects are to cache the SEQUENCE reply
      in the slot, maintain the session's activity, and possibly renew
      the lease.

   If the client reuses a slot ID and sequence ID for a completely
   different request, the server MAY treat the request as if it is a
   retry of what it has already executed.  The server MAY however detect
   the client's illegal reuse and return NFS4ERR_SEQ_FALSE_RETRY.

   If SEQUENCE returns an error, then the state of the slot (sequence
   ID, cached reply) MUST NOT change, and the associated lease MUST NOT
   be renewed.

   If SEQUENCE returns NFS4_OK, then the associated lease MUST be
   renewed (see Section 8.3), except if
   SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED is returned in sr_status_flags.
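
   The three sequence ID cases and the rule that an error leaves the
   slot unchanged can be sketched as follows.  This is a minimal model
   under stated assumptions: the Slot class and the execute callback are
   hypothetical, the sketch always caches the reply (which the text
   permits regardless of sa_cachethis), and lease renewal and false
   retry detection are left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

SEQ_MODULUS = 1 << 32  # sequence IDs are unsigned 32-bit values that wrap


@dataclass
class Slot:
    """Hypothetical reply-cache slot."""
    cached_seqid: int = 0
    cached_reply: Optional[bytes] = None


def classify_sequenceid(sa_sequenceid: int, cached_seqid: int) -> str:
    """Classify a request as 'retry', 'new', or 'misordered'."""
    delta = (sa_sequenceid - cached_seqid) % SEQ_MODULUS
    if delta == 0:
        return "retry"
    if delta == 1:
        return "new"
    # Two or more ahead, or behind the cached value once wraparound
    # is taken into account.
    return "misordered"


def handle_sequence(slot: Slot, sa_sequenceid: int,
                    execute: Callable[[], bytes]) -> Tuple[str, Optional[bytes]]:
    """Apply the slot rules; 'execute' runs the rest of the COMPOUND."""
    kind = classify_sequenceid(sa_sequenceid, slot.cached_seqid)
    if kind == "misordered":
        return "NFS4ERR_SEQ_MISORDERED", None  # slot state is not changed
    if kind == "retry":
        return "NFS4_OK", slot.cached_reply    # replay, do not re-execute
    reply = execute()
    slot.cached_seqid = sa_sequenceid
    slot.cached_reply = reply
    return "NFS4_OK", reply
```

   Computing the difference modulo 2^32 handles wraparound in one step:
   a cached value of 0xFFFFFFFF followed by a request with sequence ID 0
   is classified as a new request.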

18.46.4. IMPLEMENTATION

   The server MUST maintain a mapping of session ID to client ID in
   order to validate any operations that follow SEQUENCE that take a
   stateid as an argument and/or result.

   If the client establishes a persistent session, then a SEQUENCE
   received after a server restart might encounter requests performed
   and recorded in a persistent reply cache before the server restart.
   In this case, SEQUENCE will be processed successfully, while requests
   that were not previously performed and recorded are rejected with
   NFS4ERR_DEADSESSION.

   Depending on which of the operations within the COMPOUND were
   successfully performed before the server restart, these operations
   will also have replies sent from the server reply cache.  Note that
   when these operations establish locking state, it is locking state
   that applies to the previous server instance and to the previous
   client ID, even though the server restart, which logically happened
   after these operations, eliminated that state.  In the case of a
   partially executed COMPOUND, processing may reach an operation not
   processed during the earlier server instance, making this operation a
   new one and not performable on the existing session.  In this case,
   NFS4ERR_DEADSESSION will be returned from that operation.

18.47. Operation 54: SET_SSV - Update SSV for a Client ID

18.47.1. ARGUMENT

   struct ssa_digest_input4 {
           SEQUENCE4args sdi_seqargs;
   };

   struct SET_SSV4args {
           opaque          ssa_ssv<>;
           opaque          ssa_digest<>;
   };

18.47.2. RESULT

   struct ssr_digest_input4 {
           SEQUENCE4res sdi_seqres;
   };

   struct SET_SSV4resok {
           opaque          ssr_digest<>;
   };

   union SET_SSV4res switch (nfsstat4 ssr_status) {
   case NFS4_OK:
           SET_SSV4resok   ssr_resok4;
   default:
           void;
   };

18.47.3. DESCRIPTION

   This operation is used to update the SSV for a client ID.  Before
   SET_SSV is called the first time on a client ID, the SSV is zero.
   The SSV is the key used for the SSV GSS mechanism (Section 2.10.9).

   SET_SSV MUST be preceded by a SEQUENCE operation in the same
   COMPOUND.  It MUST NOT be used if the client did not opt for SP4_SSV
   state protection when the client ID was created (see Section 18.35);
   the server returns NFS4ERR_INVAL in that case.

   The field ssa_digest is computed as the output of the HMAC (RFC 2104
   [11]) using the subkey derived from the SSV4_SUBKEY_MIC_I2T and
   current SSV as the key (see Section 2.10.9 for a description of
   subkeys), and an XDR encoded value of data type ssa_digest_input4.
   The field sdi_seqargs is equal to the arguments of the SEQUENCE
   operation for the COMPOUND procedure that SET_SSV is within.

   The argument ssa_ssv is XORed with the current SSV to produce the new
   SSV.  The argument ssa_ssv SHOULD be generated randomly.

   In the response, ssr_digest is the output of the HMAC using the
   subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and
   an XDR encoded value of data type ssr_digest_input4.  The field
   sdi_seqres is equal to the results of the SEQUENCE operation for the
   COMPOUND procedure that SET_SSV is within.

   As noted in Section 18.35, the client and server can maintain
   multiple concurrent versions of the SSV.  The client and server each
   MUST maintain an internal SSV version number, which is set to one the
   first time SET_SSV executes on the server and the client receives the
   first SET_SSV reply.  Each subsequent SET_SSV increases the internal
   SSV version number by one.  The value of this version number
   corresponds to the smpt_ssv_seq, smt_ssv_seq, sspt_ssv_seq, and
   ssct_ssv_seq fields of the SSV GSS mechanism tokens (see
   Section 2.10.9).
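
   The client-side computation described above, the HMAC over the XDR
   encoded digest input and the XOR that yields the new SSV, can be
   sketched as follows.  Several things are assumptions of this example:
   SHA-256 stands in for whatever hash algorithm was negotiated when the
   client ID was created, the subkey derivation of Section 2.10.9 is not
   reproduced (the caller passes in an already derived subkey), the XDR
   encoding is supplied by the caller as bytes, and ssa_ssv is taken to
   have the same length as the current SSV.

```python
import hashlib
import hmac


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-length values")
    return bytes(x ^ y for x, y in zip(a, b))


def compute_digest(subkey: bytes, xdr_encoded_input: bytes) -> bytes:
    """HMAC over an XDR-encoded digest input (SHA-256 is an assumption)."""
    return hmac.new(subkey, xdr_encoded_input, hashlib.sha256).digest()


def next_ssv(current_ssv: bytes, ssa_ssv: bytes) -> bytes:
    """The new SSV is the current SSV XORed with ssa_ssv."""
    return xor_bytes(current_ssv, ssa_ssv)
```

   A client would call compute_digest with the subkey derived from the
   current SSV to fill in ssa_digest, and later check ssr_digest with
   the subkey derived from the value returned by next_ssv.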

18.47.4. IMPLEMENTATION

   When the server receives ssa_digest, it MUST verify the digest by
   computing the digest the same way the client did and comparing it
   with ssa_digest.  If the server gets a different result, this is an
   error, NFS4ERR_BAD_SESSION_DIGEST.  This error might be the result of
   another SET_SSV from the same client ID changing the SSV.  If so, the
   client recovers by sending a SET_SSV operation again with a
   recomputed digest based on the subkey of the new SSV.

   If the transport connection is dropped after the SET_SSV request is
   sent, but before the SET_SSV reply is received, then there are special
   considerations for recovery if the client has no more connections
   associated with sessions associated with the client ID of the SSV.
   See Section 18.34.4.
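
   The server-side verification step, together with the SSV version
   counter from the DESCRIPTION, might look like the sketch below.  The
   SsvState class, the use of SHA-256, and the treatment of the subkey
   and XDR encoded input as opaque byte strings supplied by the caller
   are assumptions of this example, not details given in the text.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class SsvState:
    """Hypothetical per-client-ID SSV state kept by the server."""
    ssv: bytes
    version: int = 0  # becomes 1 after the first successful SET_SSV


def server_set_ssv(state: SsvState, i2t_subkey: bytes,
                   xdr_digest_input: bytes, ssa_ssv: bytes,
                   ssa_digest: bytes) -> str:
    """Verify ssa_digest, then fold ssa_ssv into the SSV."""
    if len(ssa_ssv) != len(state.ssv):
        raise ValueError("this sketch assumes equal-length SSV values")
    expected = hmac.new(i2t_subkey, xdr_digest_input, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(expected, ssa_digest):
        return "NFS4ERR_BAD_SESSION_DIGEST"
    state.ssv = bytes(a ^ b for a, b in zip(state.ssv, ssa_ssv))
    state.version += 1
    return "NFS4_OK"
```

   A failed verification leaves both the SSV and the version number
   untouched, so a client that lost a race with another SET_SSV can
   recompute its digest and try again.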

   Clients SHOULD NOT send an ssa_ssv that is equal to a previous
   ssa_ssv, nor equal to a previous or current SSV (including an ssa_ssv
   equal to zero since the SSV is initialized to zero when the client ID
   is created).

   Clients SHOULD send SET_SSV with RPCSEC_GSS privacy.  Servers MUST
   support RPCSEC_GSS with privacy for any COMPOUND that has { SEQUENCE,
   SET_SSV }.

   A client SHOULD NOT send SET_SSV with the SSV GSS mechanism's
   credential because the purpose of SET_SSV is to seed the SSV from
   non-SSV credentials.  Instead, SET_SSV SHOULD be sent with the
   credential of a user that is accessing the client ID for the first
   time (Section 2.10.8.3).  However, if the client does send SET_SSV
   with SSV credentials, the digest protecting the arguments uses the
   value of the SSV before ssa_ssv is XORed in, and the digest
   protecting the results uses the value of the SSV after the ssa_ssv is
   XORed in.

18.48. Operation 55: TEST_STATEID - Test Stateids for Validity

18.48.1. ARGUMENT

   struct TEST_STATEID4args {
           stateid4        ts_stateids<>;
   };

18.48.2. RESULT

   struct TEST_STATEID4resok {
           nfsstat4        tsr_status_codes<>;
   };

   union TEST_STATEID4res switch (nfsstat4 tsr_status) {
   case NFS4_OK:
           TEST_STATEID4resok tsr_resok4;
   default:
           void;
   };

18.48.3. DESCRIPTION

   The TEST_STATEID operation is used to check the validity of a set of
   stateids.  It can be used at any time, but the client should
   definitely use it when it receives an indication that one or more of
   its stateids have been invalidated due to lock revocation.  This
   occurs when the SEQUENCE operation returns with one of the following
   sr_status_flags set:

   o  SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED

   o  SEQ4_STATUS_ADMIN_STATE_REVOKED

   o  SEQ4_STATUS_RECALLABLE_STATE_REVOKED

   The client can use TEST_STATEID one or more times to test the
   validity of its stateids.  Each use of TEST_STATEID allows a large
   set of such stateids to be tested and avoids problems with earlier
   stateids in a COMPOUND request from interfering with the checking of
   subsequent stateids, as would happen if individual stateids were
   tested by a series of corresponding operations in a COMPOUND request.

   For each stateid, the server returns the status code that would be
   returned if that stateid were to be used in normal operation.
   Returning such a status indication is not an error and does not cause
   COMPOUND processing to terminate.  Checks for the validity of the
   stateid proceed as they would for normal operations with a number of
   exceptions:

   o  There is no check for the type of stateid object, as would be the
      case for normal use of a stateid.

   o  There is no reference to the current filehandle.

   o  Special stateids are always considered invalid (they result in the
      error code NFS4ERR_BAD_STATEID).

   All stateids are interpreted as being associated with the client for
   the current session.  Any possible association with a previous
   instance of the client (as stale stateids) is not considered.

   The valid status values in the returned tsr_status_codes array are
   NFS4_OK, NFS4ERR_BAD_STATEID, NFS4ERR_OLD_STATEID, NFS4ERR_EXPIRED,
   NFS4ERR_ADMIN_REVOKED, and NFS4ERR_DELEG_REVOKED.

18.48.4. IMPLEMENTATION

See Sections 8.2.2 and 8.2.4 for a discussion of stateid structure, lifetime, and validation.
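
   A client that has received one of the revocation flags might sort
   the per-stateid results of TEST_STATEID into groups before deciding
   what to do next.  The sketch below is one possible arrangement: the
   grouping into "valid", "revoked", and "other", the function name, and
   the use of strings for stateids and status codes are all assumptions
   of this example.  The "revoked" group holds the stateids whose loss a
   client would go on to acknowledge with FREE_STATEID.

```python
from typing import Dict, List, Sequence

# Status codes this sketch treats as "state was revoked or expired".
REVOKED_CODES = {
    "NFS4ERR_EXPIRED",
    "NFS4ERR_ADMIN_REVOKED",
    "NFS4ERR_DELEG_REVOKED",
}


def triage_stateids(ts_stateids: Sequence[str],
                    tsr_status_codes: Sequence[str]) -> Dict[str, List[str]]:
    """Pair each stateid with its status code and group the results."""
    if len(ts_stateids) != len(tsr_status_codes):
        raise ValueError("one status code is expected per stateid")
    groups: Dict[str, List[str]] = {"valid": [], "revoked": [], "other": []}
    for stateid, status in zip(ts_stateids, tsr_status_codes):
        if status == "NFS4_OK":
            groups["valid"].append(stateid)
        elif status in REVOKED_CODES:
            groups["revoked"].append(stateid)
        else:
            # NFS4ERR_BAD_STATEID or NFS4ERR_OLD_STATEID
            groups["other"].append(stateid)
    return groups
```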

18.49. Operation 56: WANT_DELEGATION - Request Delegation

18.49.1. ARGUMENT

   union deleg_claim4 switch (open_claim_type4 dc_claim) {
   /*
    * No special rights to object.  Ordinary delegation
    * request of the specified object.  Object identified
    * by filehandle.
    */
   case CLAIM_FH: /* new to v4.1 */
           /* CURRENT_FH: object being delegated */
           void;

   /*
    * Right to file based on a delegation granted
    * to a previous boot instance of the client.
    * File is specified by filehandle.
    */
   case CLAIM_DELEG_PREV_FH: /* new to v4.1 */
           /* CURRENT_FH: object being delegated */
           void;

   /*
    * Right to the file established by an open previous
    * to server reboot.  File identified by filehandle.
    * Used during server reclaim grace period.
    */
   case CLAIM_PREVIOUS:
           /* CURRENT_FH: object being reclaimed */
           open_delegation_type4   dc_delegate_type;
   };

   struct WANT_DELEGATION4args {
           uint32_t        wda_want;
           deleg_claim4    wda_claim;
   };

18.49.2. RESULT

   union WANT_DELEGATION4res switch (nfsstat4 wdr_status) {
   case NFS4_OK:
           open_delegation4 wdr_resok4;
   default:
           void;
   };

18.49.3. DESCRIPTION

   Where this description mandates the return of a specific error code
   for a specific condition, and where multiple conditions apply, the
   server MAY return any of the mandated error codes.

   This operation allows a client to:

   o  Get a delegation on all types of files except directories.

   o  Register a "want" for a delegation for the specified file object,
      and be notified via a callback when the delegation is available.
      The server MAY support notifications of availability via
      callbacks.  If the server does not support registration of wants,
      it MUST NOT return an error to indicate that, and instead MUST
      return with ond_why set to WND4_CONTENTION or WND4_RESOURCE and
      ond_server_will_push_deleg or ond_server_will_signal_avail set to
      FALSE.  When the server indicates that it will notify the client
      by means of a callback, it will either provide the delegation
      using a CB_PUSH_DELEG operation or cancel its promise by sending a
      CB_WANTS_CANCELLED operation.

   o  Cancel a want for a delegation.

   The client SHOULD NOT set OPEN4_SHARE_ACCESS_READ and SHOULD NOT set
   OPEN4_SHARE_ACCESS_WRITE in wda_want.  If it does, the server MUST
   ignore them.

   The meanings of the following flags in wda_want are the same as they
   are in OPEN, except as noted below.

   o  OPEN4_SHARE_ACCESS_WANT_READ_DELEG

   o  OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG

   o  OPEN4_SHARE_ACCESS_WANT_ANY_DELEG

   o  OPEN4_SHARE_ACCESS_WANT_NO_DELEG.  Unlike the OPEN operation, this
      flag SHOULD NOT be set by the client in the arguments to
      WANT_DELEGATION, and MUST be ignored by the server.

   o  OPEN4_SHARE_ACCESS_WANT_CANCEL

   o  OPEN4_SHARE_ACCESS_WANT_SIGNAL_DELEG_WHEN_RESRC_AVAIL

   o  OPEN4_SHARE_ACCESS_WANT_PUSH_DELEG_WHEN_UNCONTENDED

   The handling of the above flags in WANT_DELEGATION is the same as in
   OPEN.  Information about the delegation and/or the promises the
   server is making regarding future callbacks are the same as those
   described in the open_delegation4 structure.

   The successful results of WANT_DELEGATION are of data type
   open_delegation4, which is the same data type as the "delegation"
   field in the results of the OPEN operation (see Section 18.16.3).
   The server constructs wdr_resok4 the same way it constructs OPEN's
   "delegation" with one difference: WANT_DELEGATION MUST NOT return a
   delegation type of OPEN_DELEGATE_NONE.

   If ((wda_want & OPEN4_SHARE_ACCESS_WANT_DELEG_MASK) &
   ~OPEN4_SHARE_ACCESS_WANT_NO_DELEG) is zero, then the client is
   indicating no explicit desire or non-desire for a delegation and the
   server MUST return NFS4ERR_INVAL.
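
   The mask test in the preceding paragraph can be tried out directly.
   The constant values below are those defined with the OPEN arguments
   (Section 18.16.1) and are reproduced here only for illustration; they
   should be checked against that section.  The function name is an
   assumption of this example.

```python
OPEN4_SHARE_ACCESS_WANT_DELEG_MASK = 0xFF00
OPEN4_SHARE_ACCESS_WANT_NO_PREFERENCE = 0x0000
OPEN4_SHARE_ACCESS_WANT_READ_DELEG = 0x0100
OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG = 0x0200
OPEN4_SHARE_ACCESS_WANT_ANY_DELEG = 0x0300
OPEN4_SHARE_ACCESS_WANT_NO_DELEG = 0x0400
OPEN4_SHARE_ACCESS_WANT_CANCEL = 0x0500


def want_is_invalid(wda_want: int) -> bool:
    """True when the server would answer NFS4ERR_INVAL for this wda_want."""
    wanted = wda_want & OPEN4_SHARE_ACCESS_WANT_DELEG_MASK
    return (wanted & ~OPEN4_SHARE_ACCESS_WANT_NO_DELEG) == 0
```

   With these values, a wda_want carrying no preference or only
   OPEN4_SHARE_ACCESS_WANT_NO_DELEG is rejected, while the read, write,
   any, and cancel values pass the test.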

   The client uses the OPEN4_SHARE_ACCESS_WANT_CANCEL flag in the
   WANT_DELEGATION operation to cancel a previously requested want for a
   delegation.  Note that if the server is in the process of sending the
   delegation (via CB_PUSH_DELEG) at the time the client sends a
   cancellation of the want, the delegation might still be pushed to the
   client.

   If WANT_DELEGATION fails to return a delegation, and the server
   returns NFS4_OK, the server MUST set the delegation type to
   OPEN4_DELEGATE_NONE_EXT, and set od_whynone, as described in
   Section 18.16.  Write delegations are not available for file types
   that are not writable.  This includes file objects of types NF4BLK,
   NF4CHR, NF4LNK, NF4SOCK, and NF4FIFO.  If the client requests
   OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG without
   OPEN4_SHARE_ACCESS_WANT_READ_DELEG on an object with one of the
   aforementioned file types, the server must set
   wdr_resok4.od_whynone.ond_why to WND4_WRITE_DELEG_NOT_SUPP_FTYPE.

18.49.4. IMPLEMENTATION

   A request for a conflicting delegation is not normally intended to
   trigger the recall of the existing delegation.  Servers may choose to
   treat some clients as having higher priority such that their wants
   will trigger recall of an existing delegation, although that is
   expected to be an unusual situation.

   Servers will generally recall delegations assigned by WANT_DELEGATION
   on the same basis as those assigned by OPEN.  CB_RECALL will
   generally be done only when other clients perform operations
   inconsistent with the delegation.  The normal response to aging of
   delegations is to use CB_RECALL_ANY, in order to give the client the
   opportunity to keep the delegations most useful from its point of
   view.

18.50. Operation 57: DESTROY_CLIENTID - Destroy a Client ID

18.50.1. ARGUMENT

   struct DESTROY_CLIENTID4args {
           clientid4       dca_clientid;
   };

18.50.2. RESULT

   struct DESTROY_CLIENTID4res {
           nfsstat4        dcr_status;
   };

18.50.3. DESCRIPTION

   The DESTROY_CLIENTID operation destroys the client ID.  If there are
   sessions (both idle and non-idle), opens, locks, delegations,
   layouts, and/or wants (Section 18.49) associated with the unexpired
   lease of the client ID, the server MUST return NFS4ERR_CLIENTID_BUSY.
   DESTROY_CLIENTID MAY be preceded with a SEQUENCE operation as long as
   the client ID derived from the session ID of SEQUENCE is not the same
   as the client ID to be destroyed.  If the client IDs are the same,
   then the server MUST return NFS4ERR_CLIENTID_BUSY.

   If DESTROY_CLIENTID is not prefixed by SEQUENCE, it MUST be the only
   operation in the COMPOUND request (otherwise, the server MUST return
   NFS4ERR_NOT_ONLY_OP).  If the operation is sent without a SEQUENCE
   preceding it, a client that retransmits the request may receive an
   error in response, because the original request might have been
   successfully executed.

18.50.4. IMPLEMENTATION

DESTROY_CLIENTID allows a server to immediately reclaim the resources consumed by an unused client ID, and also to forget that it ever generated the client ID. By forgetting that it ever generated the client ID, the server can safely reuse the client ID on a future EXCHANGE_ID operation.
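
   The checks given in the DESCRIPTION can be collected into a single
   decision function.  This is a sketch under stated assumptions: the
   COMPOUND is represented as a list of operation names, has_state is a
   hypothetical flag standing for "sessions, opens, locks, delegations,
   layouts, or wants exist under an unexpired lease", and the order in
   which the checks are applied is a choice made for this example.

```python
from typing import List, Optional


def destroy_clientid_status(ops: List[str],
                            target_clientid: int,
                            session_clientid: Optional[int],
                            has_state: bool) -> str:
    """Return the status a server would give for DESTROY_CLIENTID."""
    if ops and ops[0] == "SEQUENCE":
        # The session's own client ID cannot be the one destroyed.
        if session_clientid == target_clientid:
            return "NFS4ERR_CLIENTID_BUSY"
    elif ops != ["DESTROY_CLIENTID"]:
        # Without SEQUENCE, it must be the only operation.
        return "NFS4ERR_NOT_ONLY_OP"
    if has_state:
        return "NFS4ERR_CLIENTID_BUSY"
    return "NFS4_OK"
```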

18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished

18.51.1. ARGUMENT

   struct RECLAIM_COMPLETE4args {
           /*
            * If rca_one_fs TRUE,
            *
            *    CURRENT_FH: object in
            *    file system reclaim is
            *    complete for.
            */
           bool            rca_one_fs;
   };

18.51.2. RESULTS

   struct RECLAIM_COMPLETE4res {
           nfsstat4        rcr_status;
   };

18.51.3. DESCRIPTION

   A RECLAIM_COMPLETE operation is used to indicate that the client has
   reclaimed all of the locking state that it will recover, when it is
   recovering state due to either a server restart or the transfer of a
   file system to another server.  There are two types of
   RECLAIM_COMPLETE operations:

   o  When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done.
      This indicates that recovery of all locks that the client held on
      the previous server instance has been completed.

   o  When rca_one_fs is TRUE, a file system-specific RECLAIM_COMPLETE
      is being done.  This indicates that recovery of locks for a single
      fs (the one designated by the current filehandle) due to a file
      system transition has been completed.  Presence of a current
      filehandle is only required when rca_one_fs is set to TRUE.

   Once a RECLAIM_COMPLETE is done, there can be no further reclaim
   operations for locks whose scope is defined as having completed
   recovery.  Once the client sends RECLAIM_COMPLETE, the server will
   not allow the client to do subsequent reclaims of locking state for
   that scope and, if these are attempted, will return NFS4ERR_NO_GRACE.

   Whenever a client establishes a new client ID and before it does the
   first non-reclaim operation that obtains a lock, it MUST send a
   RECLAIM_COMPLETE with rca_one_fs set to FALSE, even if there are no
   locks to reclaim.  If non-reclaim locking operations are done before
   the RECLAIM_COMPLETE, an NFS4ERR_GRACE error will be returned.

   Similarly, when the client accesses a file system on a new server,
   before it sends the first non-reclaim operation that obtains a lock
   on this new server, it MUST send a RECLAIM_COMPLETE with rca_one_fs
   set to TRUE and current filehandle within that file system, even if
   there are no locks to reclaim.  If non-reclaim locking operations are
   done on that file system before the RECLAIM_COMPLETE, an
   NFS4ERR_GRACE error will be returned.

   Any locks not reclaimed at the point at which RECLAIM_COMPLETE is
   done become non-reclaimable.  The client MUST NOT attempt to reclaim
   them, either during the current server instance or in any subsequent
   server instance, or on another server to which responsibility for
   that file system is transferred.  If the client were to do so, it
   would be violating the protocol by representing itself as owning
   locks that it does not own, and so has no right to reclaim.  See
   Section 8.4.3 for a discussion of edge conditions related to lock
   reclaim.

   By sending a RECLAIM_COMPLETE, the client indicates readiness to
   proceed to do normal non-reclaim locking operations.  The client
   should be aware that such operations may temporarily result in
   NFS4ERR_GRACE errors until the server is ready to terminate its grace
   period.

18.51.4. IMPLEMENTATION

Servers will typically use the information as to when reclaim activity is complete to reduce the length of the grace period. When the server maintains in persistent storage a list of clients that might have had locks, it is in a position to use the fact that all such clients have done a RECLAIM_COMPLETE to terminate the grace period and begin normal operations (i.e., grant requests for new locks) sooner than it might otherwise.

   Latency can be minimized by doing a RECLAIM_COMPLETE as part of the
   COMPOUND request in which the last lock-reclaiming operation is done.
   When there are no reclaims to be done, RECLAIM_COMPLETE should be
   done immediately in order to allow the grace period to end as soon as
   possible.

   RECLAIM_COMPLETE should only be done once for each server instance or
   occasion of the transition of a file system.  If it is done a second
   time, the error NFS4ERR_COMPLETE_ALREADY will result.  Note that
   because of the session feature's retry protection, retries of
   COMPOUND requests containing a RECLAIM_COMPLETE operation will not
   result in this error.

   When a RECLAIM_COMPLETE is sent, the client effectively acknowledges
   any locks not yet reclaimed as lost.  This allows the server to re-
   enable the client to recover locks if the occurrence of edge
   conditions, as described in Section 8.4.3, had caused the server to
   disable the client from recovering locks.
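
The rules in Sections 18.51.3 and 18.51.4 can be sketched as a small per-client state tracker. This is an illustrative model only, not part of the protocol definition: the class name ReclaimTracker, its method names, and the use of strings for status codes are assumptions made for this example. It models only the global case (rca_one_fs FALSE) and ignores both the server's own grace period, which may still cause NFS4ERR_GRACE after RECLAIM_COMPLETE, and session replay handling.

```python
class ReclaimTracker:
    """Hypothetical per-client model of the global RECLAIM_COMPLETE rules."""

    def __init__(self):
        self.reclaim_complete_done = False

    def reclaim_lock(self):
        # Reclaims are refused once RECLAIM_COMPLETE has been sent.
        if self.reclaim_complete_done:
            return "NFS4ERR_NO_GRACE"
        return "NFS4_OK"

    def new_lock(self):
        # Non-reclaim locking before RECLAIM_COMPLETE is refused.
        # (Assumption: the server's grace period itself is not modeled.)
        if not self.reclaim_complete_done:
            return "NFS4ERR_GRACE"
        return "NFS4_OK"

    def reclaim_complete(self):
        # A second RECLAIM_COMPLETE for the same scope is an error.
        if self.reclaim_complete_done:
            return "NFS4ERR_COMPLETE_ALREADY"
        self.reclaim_complete_done = True
        return "NFS4_OK"
```

A client with nothing to reclaim would call reclaim_complete() first and only then new_lock().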

18.52. Operation 10044: ILLEGAL - Illegal Operation

18.52.1. ARGUMENTS

void;

18.52.2. RESULTS

   struct ILLEGAL4res {
           nfsstat4        status;
   };

18.52.3. DESCRIPTION

This operation is a placeholder for encoding a result to handle the case of the client sending an operation code within COMPOUND that is not supported. See the COMPOUND procedure description for more details. The status field of ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL.

18.52.4. IMPLEMENTATION

A client will probably not send an operation with code OP_ILLEGAL, but if it does, the response will be ILLEGAL4res just as it would be with any other invalid operation code. Note that if the server gets an illegal operation code that is not OP_ILLEGAL, and if the server checks for legal operation codes during the XDR decode phase, then the ILLEGAL4res would not be returned.
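
A minimal sketch of the behavior described above, assuming a server that dispatches on the operation code after XDR decoding. The dispatch function, the handlers mapping, and the dictionary used as a result are hypothetical; only the opcode value 10044 and the status name NFS4ERR_OP_ILLEGAL come from this section.

```python
OP_ILLEGAL = 10044  # from the section title: "Operation 10044: ILLEGAL"


def dispatch(opcode, handlers):
    """Hypothetical dispatcher: unsupported opcodes yield an ILLEGAL result.

    `handlers` maps supported operation codes to zero-argument callables.
    """
    handler = handlers.get(opcode)
    if handler is None:
        # Any unsupported code, including OP_ILLEGAL itself, gets the
        # placeholder result with status NFS4ERR_OP_ILLEGAL.
        return {"resop": OP_ILLEGAL, "status": "NFS4ERR_OP_ILLEGAL"}
    return handler()
```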

19. NFSv4.1 Callback Procedures

The procedures used for callbacks are defined in the following sections. In the interest of clarity, the terms "client" and "server" refer to NFS clients and servers, despite the fact that for an individual callback RPC, the sense of these terms would be precisely the opposite. Both procedures, CB_NULL and CB_COMPOUND, MUST be implemented.

19.1. Procedure 0: CB_NULL - No Operation

19.1.1. ARGUMENTS

void;

19.1.2. RESULTS

void;

19.1.3. DESCRIPTION

CB_NULL is the standard ONC RPC NULL procedure, with the standard void argument and void response. Even though there is no direct functionality associated with this procedure, the server will use CB_NULL to confirm the existence of a path for RPCs from the server to client.

19.1.4. ERRORS

None.

19.2. Procedure 1: CB_COMPOUND - Compound Operations

19.2.1. ARGUMENTS

   enum nfs_cb_opnum4 {
           OP_CB_GETATTR               = 3,
           OP_CB_RECALL                = 4,
   /* Callback operations new to NFSv4.1 */
           OP_CB_LAYOUTRECALL          = 5,
           OP_CB_NOTIFY                = 6,
           OP_CB_PUSH_DELEG            = 7,
           OP_CB_RECALL_ANY            = 8,
           OP_CB_RECALLABLE_OBJ_AVAIL  = 9,
           OP_CB_RECALL_SLOT           = 10,
           OP_CB_SEQUENCE              = 11,
           OP_CB_WANTS_CANCELLED       = 12,
           OP_CB_NOTIFY_LOCK           = 13,
           OP_CB_NOTIFY_DEVICEID       = 14,

           OP_CB_ILLEGAL               = 10044
   };

   union nfs_cb_argop4 switch (unsigned argop) {
    case OP_CB_GETATTR:
         CB_GETATTR4args           opcbgetattr;
    case OP_CB_RECALL:
         CB_RECALL4args            opcbrecall;
    case OP_CB_LAYOUTRECALL:
         CB_LAYOUTRECALL4args      opcblayoutrecall;
    case OP_CB_NOTIFY:
         CB_NOTIFY4args            opcbnotify;
    case OP_CB_PUSH_DELEG:
         CB_PUSH_DELEG4args        opcbpush_deleg;
    case OP_CB_RECALL_ANY:
         CB_RECALL_ANY4args        opcbrecall_any;
    case OP_CB_RECALLABLE_OBJ_AVAIL:
         CB_RECALLABLE_OBJ_AVAIL4args opcbrecallable_obj_avail;
    case OP_CB_RECALL_SLOT:
         CB_RECALL_SLOT4args       opcbrecall_slot;
    case OP_CB_SEQUENCE:
         CB_SEQUENCE4args          opcbsequence;
    case OP_CB_WANTS_CANCELLED:
         CB_WANTS_CANCELLED4args   opcbwants_cancelled;
    case OP_CB_NOTIFY_LOCK:
         CB_NOTIFY_LOCK4args       opcbnotify_lock;
    case OP_CB_NOTIFY_DEVICEID:
         CB_NOTIFY_DEVICEID4args   opcbnotify_deviceid;
    case OP_CB_ILLEGAL:            void;
   };

   struct CB_COMPOUND4args {
           utf8str_cs      tag;
           uint32_t        minorversion;
           uint32_t        callback_ident;
           nfs_cb_argop4   argarray<>;
   };

19.2.2. RESULTS

   union nfs_cb_resop4 switch (unsigned resop) {
    case OP_CB_GETATTR:    CB_GETATTR4res  opcbgetattr;
    case OP_CB_RECALL:     CB_RECALL4res   opcbrecall;

    /* new NFSv4.1 operations */
    case OP_CB_LAYOUTRECALL:
                           CB_LAYOUTRECALL4res
                                           opcblayoutrecall;

    case OP_CB_NOTIFY:     CB_NOTIFY4res   opcbnotify;

    case OP_CB_PUSH_DELEG: CB_PUSH_DELEG4res
                                           opcbpush_deleg;

    case OP_CB_RECALL_ANY: CB_RECALL_ANY4res
                                           opcbrecall_any;

    case OP_CB_RECALLABLE_OBJ_AVAIL:
                           CB_RECALLABLE_OBJ_AVAIL4res
                                   opcbrecallable_obj_avail;

    case OP_CB_RECALL_SLOT:
                           CB_RECALL_SLOT4res
                                           opcbrecall_slot;

    case OP_CB_SEQUENCE:   CB_SEQUENCE4res opcbsequence;

    case OP_CB_WANTS_CANCELLED:
                           CB_WANTS_CANCELLED4res
                                   opcbwants_cancelled;

    case OP_CB_NOTIFY_LOCK:
                           CB_NOTIFY_LOCK4res
                                           opcbnotify_lock;

    case OP_CB_NOTIFY_DEVICEID:
                           CB_NOTIFY_DEVICEID4res
                                   opcbnotify_deviceid;

    /* Not new operation */
    case OP_CB_ILLEGAL:    CB_ILLEGAL4res  opcbillegal;
   };

   struct CB_COMPOUND4res {
           nfsstat4 status;
           utf8str_cs      tag;
           nfs_cb_resop4   resarray<>;
   };

19.2.3. DESCRIPTION

The CB_COMPOUND procedure is used to combine one or more of the callback procedures into a single RPC request. The main callback RPC program has two main procedures: CB_NULL and CB_COMPOUND. All other operations use the CB_COMPOUND procedure as a wrapper.

During the processing of the CB_COMPOUND procedure, the client may find that it does not have the available resources to execute any or all of the operations within the CB_COMPOUND sequence. Refer to Section 2.10.6.4 for details.

The minorversion field of the arguments MUST be the same as the minorversion of the COMPOUND procedure used to create the client ID and session. For NFSv4.1, minorversion MUST be set to 1.

Contained within the CB_COMPOUND results is a "status" field. This status MUST be equal to the status of the last operation that was executed within the CB_COMPOUND procedure. Therefore, if an operation incurred an error, then the "status" value will be the same error value as is being returned for the operation that failed.

The "tag" field is handled the same way as that of the COMPOUND procedure (see Section 16.2.3).

Illegal operation codes are handled in the same way as they are handled for the COMPOUND procedure.

19.2.4. IMPLEMENTATION

The CB_COMPOUND procedure is used to combine individual operations into a single RPC request. The client interprets each of the operations in turn. If an operation is executed by the client and the status of that operation is NFS4_OK, then the next operation in the CB_COMPOUND procedure is executed. The client continues this process until there are no more operations to be executed or one of the operations has a status value other than NFS4_OK.
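
The evaluation order described above can be sketched as a simple loop. This is an illustration under stated assumptions rather than a definitive implementation: process_cb_compound is a hypothetical name, each operation is modeled as a zero-argument callable returning a status string, and argument decoding, CB_SEQUENCE handling, and resource limits are left out.

```python
def process_cb_compound(operations):
    """Run operations in order; stop at the first status other than NFS4_OK.

    Each element of `operations` is a hypothetical zero-argument callable
    that returns a status string. Returns (overall_status, per_op_statuses),
    where overall_status is the status of the last operation executed.
    """
    results = []
    status = "NFS4_OK"
    for op in operations:
        status = op()
        results.append(status)
        if status != "NFS4_OK":
            break  # later operations are not executed
    return status, results
```

In this model the overall status equals the status of the last executed operation, matching the requirement on the "status" field of CB_COMPOUND4res.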

19.2.5. ERRORS

CB_COMPOUND can return any error that an operation on the backchannel can return (see Table 7). However, if CB_COMPOUND returns zero operations, the error returned by CB_COMPOUND is unrelated to any error returned by an operation. The errors CB_COMPOUND can return when it processes zero operations include:

                        CB_COMPOUND error returns

   +------------------------------+------------------------------------+
   | Error                        | Notes                              |
   +------------------------------+------------------------------------+
   | NFS4ERR_BADCHAR              | The tag argument has a character   |
   |                              | the replier does not support.      |
   | NFS4ERR_BADXDR               |                                    |
   | NFS4ERR_DELAY                |                                    |
   | NFS4ERR_INVAL                | The tag argument is not in UTF-8   |
   |                              | encoding.                          |
   | NFS4ERR_MINOR_VERS_MISMATCH  |                                    |
   | NFS4ERR_SERVERFAULT          |                                    |
   | NFS4ERR_TOO_MANY_OPS         |                                    |
   | NFS4ERR_REP_TOO_BIG          |                                    |
   | NFS4ERR_REP_TOO_BIG_TO_CACHE |                                    |
   | NFS4ERR_REQ_TOO_BIG          |                                    |
   +------------------------------+------------------------------------+

                                 Table 15

20. NFSv4.1 Callback Operations

20.1. Operation 3: CB_GETATTR - Get Attributes

20.1.1. ARGUMENT

   struct CB_GETATTR4args {
           nfs_fh4 fh;
           bitmap4 attr_request;
   };

20.1.2. RESULT

   struct CB_GETATTR4resok {
           fattr4  obj_attributes;
   };

   union CB_GETATTR4res switch (nfsstat4 status) {
    case NFS4_OK:
            CB_GETATTR4resok       resok4;
    default:
            void;
   };

20.1.3. DESCRIPTION

The CB_GETATTR operation is used by the server to obtain the current modified state of a file that has been OPEN_DELEGATE_WRITE delegated. The size and change attributes are the only ones guaranteed to be serviced by the client. See Section 10.4.3 for a full description of how the client and server are to interact with the use of CB_GETATTR. If the filehandle specified is not one for which the client holds an OPEN_DELEGATE_WRITE delegation, an NFS4ERR_BADHANDLE error is returned.

20.1.4. IMPLEMENTATION

The client returns attrmask bits and the associated attribute values only for the change attribute and for attributes that it may change (time_modify and size).
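
The filtering described above can be sketched as a bitmap intersection. The helper names are hypothetical. The attribute numbers (3 for change, 4 for size, 53 for time_modify) are taken from the attribute definitions elsewhere in this specification, not from this section. A bitmap4 is modeled as a list of 32-bit words in which bit n lives in word n / 32 at position n mod 32.

```python
FATTR4_CHANGE = 3
FATTR4_SIZE = 4
FATTR4_TIME_MODIFY = 53
SERVICEABLE = {FATTR4_CHANGE, FATTR4_SIZE, FATTR4_TIME_MODIFY}


def bitmap_to_set(bitmap):
    """Decode a bitmap4 (list of 32-bit words) into a set of bit numbers."""
    return {32 * i + b
            for i, word in enumerate(bitmap)
            for b in range(32)
            if (word >> b) & 1}


def set_to_bitmap(bits):
    """Encode bit numbers as a list of 32-bit words (empty list if no bits)."""
    words = [0] * (max(bits) // 32 + 1) if bits else []
    for n in bits:
        words[n // 32] |= 1 << (n % 32)
    return words


def cb_getattr_reply_mask(attr_request):
    """Keep only the requested attributes the client will service."""
    return set_to_bitmap(bitmap_to_set(attr_request) & SERVICEABLE)
```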

20.2. Operation 4: CB_RECALL - Recall a Delegation

20.2.1. ARGUMENT

   struct CB_RECALL4args {
           stateid4        stateid;
           bool            truncate;
           nfs_fh4         fh;
   };

20.2.2. RESULT

   struct CB_RECALL4res {
           nfsstat4        status;
   };

20.2.3. DESCRIPTION

The CB_RECALL operation is used to begin the process of recalling a delegation and returning it to the server. The truncate flag is used to optimize recall for a file object that is a regular file and is about to be truncated to zero. When it is TRUE, the client is freed of the obligation to propagate modified data for the file to the server, since this data is irrelevant. If the handle specified is not one for which the client holds a delegation, an NFS4ERR_BADHANDLE error is returned. If the stateid specified is not one corresponding to an OPEN delegation for the file specified by the filehandle, an NFS4ERR_BAD_STATEID is returned.

20.2.4. IMPLEMENTATION

The client SHOULD reply to the callback immediately. Replying does not complete the recall except when the value of the reply's status field is neither NFS4ERR_DELAY nor NFS4_OK. The recall is not complete until the delegation is returned using a DELEGRETURN operation.

20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from Client

20.3.1. ARGUMENT

   /*
    * NFSv4.1 callback arguments and results
    */

   enum layoutrecall_type4 {
           LAYOUTRECALL4_FILE = LAYOUT4_RET_REC_FILE,
           LAYOUTRECALL4_FSID = LAYOUT4_RET_REC_FSID,
           LAYOUTRECALL4_ALL  = LAYOUT4_RET_REC_ALL
   };

   struct layoutrecall_file4 {
           nfs_fh4         lor_fh;
           offset4         lor_offset;
           length4         lor_length;
           stateid4        lor_stateid;
   };

   union layoutrecall4 switch(layoutrecall_type4 lor_recalltype) {
   case LAYOUTRECALL4_FILE:
           layoutrecall_file4 lor_layout;
   case LAYOUTRECALL4_FSID:
           fsid4              lor_fsid;
   case LAYOUTRECALL4_ALL:
           void;
   };

   struct CB_LAYOUTRECALL4args {
           layouttype4             clora_type;
           layoutiomode4           clora_iomode;
           bool                    clora_changed;
           layoutrecall4           clora_recall;
   };

20.3.2. RESULT

   struct CB_LAYOUTRECALL4res {
           nfsstat4        clorr_status;
   };

20.3.3. DESCRIPTION

The CB_LAYOUTRECALL operation is used by the server to recall layouts from the client; as a result, the client will begin the process of returning layouts via LAYOUTRETURN. The CB_LAYOUTRECALL operation specifies one of three forms of recall processing with the value of layoutrecall_type4. The recall is for one of the following: a specific layout of a specific file (LAYOUTRECALL4_FILE), an entire file system ID (LAYOUTRECALL4_FSID), or all file systems (LAYOUTRECALL4_ALL).

The behavior of the operation varies based on the value of the layoutrecall_type4. The value and behaviors are:

   LAYOUTRECALL4_FILE

      For a layout to match the recall request, the values of the
      following fields must match those of the layout: clora_type,
      clora_iomode, lor_fh, and the byte-range specified by lor_offset
      and lor_length.  The clora_iomode field may have a special value
      of LAYOUTIOMODE4_ANY.  The special value LAYOUTIOMODE4_ANY will
      match any iomode originally returned in a layout; therefore, it
      acts as a wild card.  The other special value used is for
      lor_length.  If lor_length has a value of NFS4_UINT64_MAX, the
      lor_length field means the maximum possible file size.  If a
      matching layout is found, it MUST be returned using the
      LAYOUTRETURN operation (see Section 18.44).  An example of the
      field's special value use is if clora_iomode is LAYOUTIOMODE4_ANY,
      lor_offset is zero, and lor_length is NFS4_UINT64_MAX, then the
      entire layout is to be returned.

      The NFS4ERR_NOMATCHING_LAYOUT error is only returned when the
      client does not hold layouts for the file or if the client does
      not have any overlapping layouts for the specification in the
      layout recall.

   LAYOUTRECALL4_FSID and LAYOUTRECALL4_ALL

      If LAYOUTRECALL4_FSID is specified, the fsid specifies the file
      system for which any outstanding layouts MUST be returned.  If
      LAYOUTRECALL4_ALL is specified, all outstanding layouts MUST be
      returned.  In addition, LAYOUTRECALL4_FSID and LAYOUTRECALL4_ALL
      specify that all the storage device ID to storage device address
      mappings in the affected file system(s) are also recalled.  The
      respective LAYOUTRETURN with either LAYOUTRETURN4_FSID or
      LAYOUTRETURN4_ALL acknowledges to the server that the client
      invalidated the said device mappings.  See Section 12.5.5.2.1.5
      for considerations with "bulk" recall of layouts.

      The NFS4ERR_NOMATCHING_LAYOUT error is only returned when the
      client does not hold layouts and does not have valid deviceid
      mappings.
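
The LAYOUTRECALL4_FILE matching rule can be sketched as follows. This is an illustration under stated assumptions, covering only the per-file form: HeldLayout and matches_file_recall are hypothetical names for a client-side record and predicate, the byte-range test is modeled as an overlap check, and the value 3 for LAYOUTIOMODE4_ANY is taken from the layoutiomode4 definition elsewhere in this specification.

```python
from dataclasses import dataclass

NFS4_UINT64_MAX = 0xFFFFFFFFFFFFFFFF
LAYOUTIOMODE4_ANY = 3  # wildcard iomode, defined outside this section


@dataclass
class HeldLayout:
    """Hypothetical client-side record of one held layout segment."""
    layout_type: int
    iomode: int
    fh: bytes
    offset: int
    length: int


def _end(offset, length):
    # A length of NFS4_UINT64_MAX means "up to the maximum file size".
    if length == NFS4_UINT64_MAX:
        return NFS4_UINT64_MAX
    return offset + length


def matches_file_recall(layout, clora_type, clora_iomode,
                        lor_fh, lor_offset, lor_length):
    """Return True if `layout` is covered by a LAYOUTRECALL4_FILE recall."""
    if layout.layout_type != clora_type or layout.fh != lor_fh:
        return False
    if clora_iomode != LAYOUTIOMODE4_ANY and layout.iomode != clora_iomode:
        return False
    # The two half-open byte ranges must overlap.
    return (layout.offset < _end(lor_offset, lor_length)
            and lor_offset < _end(layout.offset, layout.length))
```

If no held layout satisfies the predicate, the client would answer with NFS4ERR_NOMATCHING_LAYOUT.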

   In processing the layout recall request, the client also varies its
   behavior based on the value of the clora_changed field.  This field
   is used by the server to provide additional context for the reason
   why the layout is being recalled.  A FALSE value for clora_changed
   indicates that no change in the layout is expected and the client may
   write modified data to the storage devices involved; this must be
   done prior to returning the layout via LAYOUTRETURN.  A TRUE value
   for clora_changed indicates that the server is changing the layout.
   Examples of layout changes and reasons for a TRUE indication are the
   following: the metadata server is restriping the file or a permanent
   error has occurred on a storage device and the metadata server would
   like to provide a new layout for the file.  Therefore, a
   clora_changed value of TRUE indicates some level of change for the
   layout and the client SHOULD NOT write and commit modified data to
   the storage devices.  In this case, the client writes and commits
   data through the metadata server.

   See Section 12.5.3 for a description of how the lor_stateid field in
   the arguments is to be constructed.  Note that the "seqid" field of
   lor_stateid MUST NOT be zero.  See Sections 8.2, 12.5.3, and 12.5.5.2
   for a further discussion and requirements.

20.3.4. IMPLEMENTATION

The client's processing for CB_LAYOUTRECALL is similar to CB_RECALL (recall of file delegations) in that the client responds to the request before actually returning layouts via the LAYOUTRETURN operation. While the client responds to the CB_LAYOUTRECALL immediately, the operation is not considered complete (i.e., considered pending) until all affected layouts are returned to the server via the LAYOUTRETURN operation.

Before returning the layout to the server via LAYOUTRETURN, the client should wait for the response from in-process or in-flight READ, WRITE, or COMMIT operations that use the recalled layout.

If the client is holding modified data that is affected by a recalled layout, the client has various options for writing the data to the server. As always, the client may write the data through the metadata server. In fact, the client may not have a choice other than writing to the metadata server when the clora_changed argument is TRUE and a new layout is unavailable from the server. However, the client may be able to write the modified data to the storage device if the clora_changed argument is FALSE; this needs to be done before returning the layout via LAYOUTRETURN. If the client were to obtain a new layout covering the modified data's byte-range, then writing to the storage devices is an available alternative. Note that before obtaining a new layout, the client must first return the original layout.

In the case of modified data being written while the layout is held, the client must use LAYOUTCOMMIT operations at the appropriate time; as required LAYOUTCOMMIT must be done before the LAYOUTRETURN. If a large amount of modified data is outstanding, the client may send LAYOUTRETURNs for portions of the recalled layout; this allows the server to monitor the client's progress and adherence to the original recall request. However, the last LAYOUTRETURN in a sequence of returns MUST specify the full range being recalled (see Section 12.5.5.1 for details).

If a server needs to delete a device ID and there are layouts referring to the device ID, CB_LAYOUTRECALL MUST be invoked to cause the client to return all layouts referring to the device ID before the server can delete the device ID. If the client does not return the affected layouts, the server MAY revoke the layouts.
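
The rule that partial LAYOUTRETURNs must end with one covering the full recalled range can be sketched as below. This is a simplified illustration: plan_layout_returns is a hypothetical helper, it assumes a finite lor_length (not NFS4_UINT64_MAX) and a fixed chunk size chosen by the client, and it ignores the flushing and LAYOUTCOMMIT steps that must precede each return.

```python
def plan_layout_returns(offset, length, chunk):
    """Return (offset, length) pairs for a sequence of LAYOUTRETURNs.

    Partial returns of `chunk` bytes are issued while more than one chunk
    remains; the last entry always names the full recalled range.
    """
    returns = []
    pos = offset
    end = offset + length
    while pos + chunk < end:
        returns.append((pos, chunk))
        pos += chunk
    returns.append((offset, length))  # final return: full recalled range
    return returns
```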

20.4. Operation 6: CB_NOTIFY - Notify Client of Directory Changes

20.4.1. ARGUMENT

   /*
    * Directory notification types.
    */
   enum notify_type4 {
           NOTIFY4_CHANGE_CHILD_ATTRS = 0,
           NOTIFY4_CHANGE_DIR_ATTRS = 1,
           NOTIFY4_REMOVE_ENTRY = 2,
           NOTIFY4_ADD_ENTRY = 3,
           NOTIFY4_RENAME_ENTRY = 4,
           NOTIFY4_CHANGE_COOKIE_VERIFIER = 5
   };

   /* Changed entry information.  */
   struct notify_entry4 {
           component4      ne_file;
           fattr4          ne_attrs;
   };

   /* Previous entry information */
   struct prev_entry4 {
           notify_entry4   pe_prev_entry;
           /* what READDIR returned for this entry */
           nfs_cookie4     pe_prev_entry_cookie;
   };

   struct notify_remove4 {
           notify_entry4   nrm_old_entry;
           nfs_cookie4     nrm_old_entry_cookie;
   };

   struct notify_add4 {
           /*
            * Information on object
            * possibly renamed over.
            */
           notify_remove4      nad_old_entry<1>;
           notify_entry4       nad_new_entry;
           /* what READDIR would have returned for this entry */
           nfs_cookie4         nad_new_entry_cookie<1>;
           prev_entry4         nad_prev_entry<1>;
           bool                nad_last_entry;
   };

   struct notify_attr4 {
           notify_entry4   na_changed_entry;
   };

   struct notify_rename4 {
           notify_remove4  nrn_old_entry;
           notify_add4     nrn_new_entry;
   };

   struct notify_verifier4 {
           verifier4       nv_old_cookieverf;
           verifier4       nv_new_cookieverf;
   };

   /*
    * Objects of type notify_<>4 and
    * notify_device_<>4 are encoded in this.
    */
   typedef opaque notifylist4<>;

   struct notify4 {
           /* composed from notify_type4 or notify_deviceid_type4 */
           bitmap4         notify_mask;
           notifylist4     notify_vals;
   };

   struct CB_NOTIFY4args {
           stateid4    cna_stateid;
           nfs_fh4     cna_fh;
           notify4     cna_changes<>;
   };

20.4.2. RESULT

   struct CB_NOTIFY4res {
           nfsstat4    cnr_status;
   };

20.4.3. DESCRIPTION

The CB_NOTIFY operation is used by the server to send notifications to clients about changes to delegated directories. The registration of notifications for the directories occurs when the delegation is established using GET_DIR_DELEGATION. These notifications are sent over the backchannel. The notification is sent once the original request has been processed on the server. The server will send an array of notifications for changes that might have occurred in the directory. The notifications are sent as a list of pairs of bitmaps and values. See Section 3.3.7 for a description of how NFSv4.1 bitmaps work.

   If the server has more notifications than can fit in the CB_COMPOUND
   request, it SHOULD send a sequence of serial CB_COMPOUND requests so
   that the client's view of the directory does not become confused.
   For example, if the server indicates that a file named "foo" is added
   and that the file "foo" is removed, the order in which the client
   receives these notifications needs to be the same as the order in
   which the corresponding operations occurred on the server.
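
The ordering requirement can be illustrated with a toy model of a client's cached name set. This is a sketch only: apply_notifications is a hypothetical function, notifications are reduced to (type name, file name) pairs, and attributes and cookies are ignored.

```python
def apply_notifications(names, notifications):
    """Apply add/remove notifications, in arrival order, to a set of names."""
    for kind, name in notifications:
        if kind == "NOTIFY4_ADD_ENTRY":
            names.add(name)
        elif kind == "NOTIFY4_REMOVE_ENTRY":
            names.discard(name)
    return names


# The same two notifications give different results in different orders.
in_order = apply_notifications(
    set(), [("NOTIFY4_ADD_ENTRY", "foo"), ("NOTIFY4_REMOVE_ENTRY", "foo")])
reordered = apply_notifications(
    set(), [("NOTIFY4_REMOVE_ENTRY", "foo"), ("NOTIFY4_ADD_ENTRY", "foo")])
```

Here in_order ends up without "foo" while reordered still contains it, which is why the client must see notifications in the order the changes occurred on the server.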

   If the client holding the delegation makes any changes in the
   directory that cause files or sub-directories to be added or removed,
   the server will notify that client of the resulting change(s).  If
   the client holding the delegation is making attribute or cookie
   verifier changes only, the server does not need to send notifications
   to that client.  The server will send the following information for
   each operation:

   NOTIFY4_ADD_ENTRY
      The server will send information about the new directory entry
      being created along with the cookie for that entry.  The entry
      information (data type notify_add4) includes the component name of
      the entry and attributes.  The server will send this type of entry
      when a file is actually being created, when an entry is being
      added to a directory as a result of a rename across directories
      (see below), and when a hard link is being created to an existing
      file.  If this entry is added to the end of the directory, the
      server will set the nad_last_entry flag to TRUE.  If the file is
      added such that there is at least one entry before it, the server
      will also return the previous entry information (nad_prev_entry, a
      variable-length array of up to one element; if the array is of
      zero length, there is no previous entry), along with its cookie.
      This is to help clients find the right location in their file name
      caches and directory caches where this entry should be cached.  If
      the new entry's cookie is available, it will be in the
      nad_new_entry_cookie (another variable-length array of up to one
      element) field.  If the addition of the entry causes another entry
      to be deleted (which can only happen in the rename case)
      atomically with the addition, then information on this entry is
      reported in nad_old_entry.

   NOTIFY4_REMOVE_ENTRY
      The server will send information about the directory entry being
      deleted.  The server will also send the cookie value for the
      deleted entry so that clients can get to the cached information
      for this entry.

   NOTIFY4_RENAME_ENTRY
      The server will send information about both the old entry and the
      new entry.  This includes the name and attributes for each entry.
      In addition, if the rename causes the deletion of an entry (i.e.,
      the case of a file renamed over), then this is reported in
      nrn_new_entry.nad_old_entry.  This notification is only sent
      if both entries are in the same directory.  If the rename is
      across directories, the server will send a remove notification to
      one directory and an add notification to the other directory,
      assuming both have a directory delegation.

   NOTIFY4_CHANGE_CHILD_ATTRS/NOTIFY4_CHANGE_DIR_ATTRS
      The client will use the attribute mask to inform the server of
      attributes for which it wants to receive notifications.  This
      change notification can be requested for changes to the attributes
      of the directory as well as changes to any file's attributes in
      the directory by using two separate attribute masks.  The client
      cannot ask for change attribute notification for a specific file.
      One attribute mask covers all the files in the directory.  Upon
      any attribute change, the server will send back the values of
      changed attributes.  Notifications might not make sense for some
      file system-wide attributes, and it is up to the server to decide
      which subset it wants to support.  The client can negotiate the
      frequency of attribute notifications by letting the server know
      how often it wants to be notified of an attribute change.  The
      server will return supported notification frequencies or an
      indication that no notification is permitted for directory or
      child attributes by setting the dir_notif_delay and
      dir_entry_notif_delay attributes, respectively.

   NOTIFY4_CHANGE_COOKIE_VERIFIER
      If the cookie verifier changes while a client is holding a
      delegation, the server will notify the client so that it can
      invalidate its cookies and re-send a READDIR to get the new set of
      cookies.
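
How a client might place a new entry in its directory cache using the NOTIFY4_ADD_ENTRY fields can be sketched as follows. This is a hedged illustration: insert_entry and the list-of-names cache are hypothetical, cookies and attributes are omitted, and the sketch assumes that a zero-length nad_prev_entry with nad_last_entry FALSE means the entry goes first.

```python
def insert_entry(entries, new_name, prev_name=None, last_entry=False):
    """Insert `new_name` into `entries`, a list of names in READDIR order.

    `prev_name` models nad_prev_entry (None for a zero-length array) and
    `last_entry` models nad_last_entry. Raises ValueError if `prev_name`
    is not in the cache, in which case the cache should be refreshed.
    """
    if last_entry:
        entries.append(new_name)
    elif prev_name is None:
        entries.insert(0, new_name)
    else:
        entries.insert(entries.index(prev_name) + 1, new_name)
    return entries
```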

20.5. Operation 7: CB_PUSH_DELEG - Offer Previously Requested Delegation to Client

20.5.1. ARGUMENT

   struct CB_PUSH_DELEG4args {
           nfs_fh4          cpda_fh;
           open_delegation4 cpda_delegation;
   };

20.5.2. RESULT

   struct CB_PUSH_DELEG4res {
           nfsstat4 cpdr_status;
   };

20.5.3. DESCRIPTION

CB_PUSH_DELEG is used by the server both to signal to the client that the delegation it wants (previously indicated via a want established from an OPEN or WANT_DELEGATION operation) is available and to simultaneously offer the delegation to the client. The client has the choice of accepting the delegation by returning NFS4_OK to the server, delaying the decision to accept the offered delegation by returning NFS4ERR_DELAY, or permanently rejecting the offer of the delegation by returning NFS4ERR_REJECT_DELEG. When a delegation is rejected in this fashion, the want previously established is permanently deleted and the delegation is subject to acquisition by another client.

20.5.4. IMPLEMENTATION

If the client does return NFS4ERR_DELAY and there is a conflicting delegation request, the server MAY process it at the expense of the client that returned NFS4ERR_DELAY. The client's want will not be cancelled, but MAY be processed behind other delegation requests or registered wants.

When a client returns a status other than NFS4_OK, NFS4ERR_DELAY, or NFS4ERR_REJECT_DELEG, the want remains pending, although servers may decide to cancel the want by sending a CB_WANTS_CANCELLED.
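
The server's view of the possible replies can be summarized in a small mapping. This is a sketch, not a definitive implementation: want_after_push_reply and the state labels "satisfied", "deleted", and "pending" are hypothetical names chosen for this example.

```python
def want_after_push_reply(status):
    """Map the client's CB_PUSH_DELEG reply status to the fate of the want."""
    if status == "NFS4_OK":
        return "satisfied"   # the client accepted the offered delegation
    if status == "NFS4ERR_REJECT_DELEG":
        return "deleted"     # the want is permanently deleted
    # NFS4ERR_DELAY and any other status leave the want in place; the
    # server may still cancel it later with CB_WANTS_CANCELLED.
    return "pending"
```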

20.6. Operation 8: CB_RECALL_ANY - Keep Any N Recallable Objects

20.6.1. ARGUMENT

   const RCA4_TYPE_MASK_RDATA_DLG          = 0;
   const RCA4_TYPE_MASK_WDATA_DLG          = 1;
   const RCA4_TYPE_MASK_DIR_DLG            = 2;
   const RCA4_TYPE_MASK_FILE_LAYOUT        = 3;
   const RCA4_TYPE_MASK_BLK_LAYOUT         = 4;
   const RCA4_TYPE_MASK_OBJ_LAYOUT_MIN     = 8;
   const RCA4_TYPE_MASK_OBJ_LAYOUT_MAX     = 9;
   const RCA4_TYPE_MASK_OTHER_LAYOUT_MIN   = 12;
   const RCA4_TYPE_MASK_OTHER_LAYOUT_MAX   = 15;

   struct  CB_RECALL_ANY4args      {
           uint32_t        craa_objects_to_keep;
           bitmap4         craa_type_mask;
   };

20.6.2. RESULT

   struct CB_RECALL_ANY4res {
           nfsstat4        crar_status;
   };

20.6.3. DESCRIPTION

The server may decide that it cannot hold all of the state for recallable objects, such as delegations and layouts, without running out of resources. In such a case, while not optimal, the server is free to recall individual objects to reduce the load.

Because the general purpose of such recallable objects as delegations is to eliminate client interaction with the server, the server cannot interpret lack of recent use as indicating that the object is no longer useful. The absence of visible use is consistent with a delegation keeping potential operations from being sent to the server. In the case of layouts, while it is true that the usefulness of a layout is indicated by the use of the layout when storage devices receive I/O requests, because there is no mandate that a storage device indicate to the metadata server any past or present use of a layout, the metadata server is not likely to know which layouts are good candidates to recall in response to low resources.

In order to implement an effective reclaim scheme for such objects, the server's knowledge of available resources must be used to determine when objects must be recalled with the clients selecting the actual objects to be returned.

Server implementations may differ in their resource allocation requirements. For example, one server may share resources among all classes of recallable objects, whereas another may use separate resource pools for layouts and for delegations, or further separate resources by types of delegations.

When a given resource pool is over-utilized, the server can send a CB_RECALL_ANY to clients holding recallable objects of the types involved, allowing it to keep a certain number of such objects and return any excess. A mask specifies which types of objects are to be limited. The client chooses, based on its own knowledge of current usefulness, which of the objects in that class should be returned.

   A number of bits are defined.  For some of these, ranges are defined
   and it is up to the definition of the storage protocol to specify how
   these are to be used.  There are ranges reserved for object-based
   storage protocols and for other experimental storage protocols.  An
   RFC defining such a storage protocol needs to specify how particular
   bits within its range are to be used.  For example, it may specify a
   mapping between attributes of the layout (read vs. write, size of
   area) and the bit to be used, or it may define a field in the layout
   where the associated bit position is made available by the server to
   the client.

   RCA4_TYPE_MASK_RDATA_DLG

      The client is to return OPEN_DELEGATE_READ delegations on non-
      directory file objects.

   RCA4_TYPE_MASK_WDATA_DLG

      The client is to return OPEN_DELEGATE_WRITE delegations on regular
      file objects.

   RCA4_TYPE_MASK_DIR_DLG

      The client is to return directory delegations.

   RCA4_TYPE_MASK_FILE_LAYOUT

      The client is to return layouts of type LAYOUT4_NFSV4_1_FILES.

   RCA4_TYPE_MASK_BLK_LAYOUT

      See [41] for a description.

   RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX

      See [40] for a description.

   RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX

      This range is reserved for telling the client to recall layouts of
      experimental or site-specific layout types (see Section 3.3.13).

   When a bit is set in the type mask that corresponds to an undefined
   type of recallable object, NFS4ERR_INVAL MUST be returned.  When a
   bit is set that corresponds to a defined type of object but the
   client does not support an object of the type, NFS4ERR_INVAL MUST NOT
   be returned.  Future minor versions of NFSv4 may expand the set of
   valid type mask bits.
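   The two type mask rules above can be sketched as a client-side check.
   The bit numbers and error values below are assumed from the
   operation's ARGUMENT definition and the protocol's error table,
   neither of which appears in this passage, so they should be verified
   there.  The function names are hypothetical, and returning NFS4_OK
   while ignoring unsupported bits is one plausible reading: the text
   only rules out NFS4ERR_INVAL in that case.

```python
# Status codes (assumed values from the protocol's error definitions).
NFS4_OK = 0
NFS4ERR_INVAL = 22

# Bit numbers within the type mask (assumed from the ARGUMENT definition).
RCA4_TYPE_MASK_RDATA_DLG = 0
RCA4_TYPE_MASK_WDATA_DLG = 1
RCA4_TYPE_MASK_DIR_DLG = 2
RCA4_TYPE_MASK_FILE_LAYOUT = 3
RCA4_TYPE_MASK_BLK_LAYOUT = 4
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN = 8
RCA4_TYPE_MASK_OBJ_LAYOUT_MAX = 9
RCA4_TYPE_MASK_OTHER_LAYOUT_MIN = 12
RCA4_TYPE_MASK_OTHER_LAYOUT_MAX = 15

DEFINED_BITS = (
    {
        RCA4_TYPE_MASK_RDATA_DLG,
        RCA4_TYPE_MASK_WDATA_DLG,
        RCA4_TYPE_MASK_DIR_DLG,
        RCA4_TYPE_MASK_FILE_LAYOUT,
        RCA4_TYPE_MASK_BLK_LAYOUT,
    }
    | set(range(RCA4_TYPE_MASK_OBJ_LAYOUT_MIN, RCA4_TYPE_MASK_OBJ_LAYOUT_MAX + 1))
    | set(range(RCA4_TYPE_MASK_OTHER_LAYOUT_MIN, RCA4_TYPE_MASK_OTHER_LAYOUT_MAX + 1))
)


def bits_set(mask):
    """Return the set of bit numbers that are set in an integer bitmap."""
    return {n for n in range(mask.bit_length()) if (mask >> n) & 1}


def check_type_mask(mask, supported_bits):
    """Return (status, bits the client will act on).

    Any undefined bit yields NFS4ERR_INVAL.  Defined bits for object
    types the client does not support are dropped without an error.
    """
    requested = bits_set(mask)
    if not requested <= DEFINED_BITS:
        return NFS4ERR_INVAL, set()
    return NFS4_OK, requested & set(supported_bits)
```

   For instance, a client that supports only data delegations and
   receives a mask with the read-delegation and file-layout bits set
   would act on the read-delegation bit alone and still report success.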

   CB_RECALL_ANY specifies a count of objects that the client may keep
   as opposed to a count that the client must return.  This is to avoid
   a potential race between a CB_RECALL_ANY that had a count of objects
   to free and a set of client-originated operations to return layouts
   or delegations.  As a result of the race, the client and server would
   have differing ideas as to how many objects to return.  Hence, the
   client could mistakenly free too many.

   If resource demands prompt it, the server may send another
   CB_RECALL_ANY with a lower count, even if it has not yet received an
   acknowledgment from the client for a previous CB_RECALL_ANY with the
   same type mask.  Although the possibility exists that these will be
   received by the client in an order different from the order in which
   they were sent, any such permutation of the callback stream is
   harmless.  It is the job of the client to bring down the size of the
   recallable object set in line with each CB_RECALL_ANY received, and
   until that obligation is met, it cannot be cancelled or modified by
   any subsequent CB_RECALL_ANY for the same type mask.  Thus, if the
   server sends two CB_RECALL_ANYs, the effect will be the same as if
   the lower count was sent, whatever the order of recall receipt.  Note
   that this means that a server may not cancel the effect of a
   CB_RECALL_ANY by sending another recall with a higher count.  When a
   CB_RECALL_ANY is received and the count is already within the limit
   set or is above a limit that the client is working to get down to,
   that callback has no effect.
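   The order-independence argument above can be illustrated with a small
   client-side tracker.  This is a sketch under stated assumptions, not
   a prescribed implementation: the class and method names are
   hypothetical, one tracker instance is assumed per type mask, and
   objects are modeled as a bare count.

```python
class RecallAnyTracker:
    """Tracks the outstanding CB_RECALL_ANY obligation for one type mask."""

    def __init__(self, held):
        self.held = held      # recallable objects currently held
        self.target = None    # lowest keep-count not yet reached, if any

    def on_recall_any(self, objects_to_keep):
        # Already within the limit this callback sets: no effect.
        if self.held <= objects_to_keep:
            return
        # Not below a limit the client is already working toward: no
        # effect, so a higher count cannot cancel an earlier lower one.
        if self.target is not None and objects_to_keep >= self.target:
            return
        self.target = objects_to_keep

    def on_object_returned(self):
        self.held -= 1
        # The obligation is met once the held count reaches the target.
        if self.target is not None and self.held <= self.target:
            self.target = None

    def excess(self):
        """Number of objects that still have to be returned."""
        return 0 if self.target is None else self.held - self.target


# Two recalls with counts 6 and 4, delivered in either order.
in_order = RecallAnyTracker(held=10)
in_order.on_recall_any(6)
in_order.on_recall_any(4)

reordered = RecallAnyTracker(held=10)
reordered.on_recall_any(4)
reordered.on_recall_any(6)
```

   Both trackers end up with the same obligation, the one set by the
   lower count, which is why reordering of the callback stream is
   harmless.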

   Servers are generally free to deny recallable objects when
   insufficient resources are available.  Note that the effect of such a
   policy is implicitly to give precedence to existing objects relative
   to requested ones, with the result that resources might not be
   optimally used.  To prevent this, servers are well advised to make
   the point at which they start sending CB_RECALL_ANY callbacks
   somewhat below that at which they cease to give out new delegations
   and layouts.  This allows the client to purge its less-used objects
   whenever appropriate and so continue to have its subsequent requests
   given new resources freed up by object returns.
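   The advice above amounts to keeping two thresholds per resource pool,
   with the recall threshold strictly below the denial threshold.  A
   minimal sketch follows; the class name, field names, and all numbers
   are hypothetical, and a real server would likely measure resources in
   something other than a simple object count.

```python
from dataclasses import dataclass


@dataclass
class PoolPolicy:
    capacity: int          # total objects the pool can hold
    recall_threshold: int  # at or above this, start sending CB_RECALL_ANY
    deny_threshold: int    # at or above this, stop granting new objects

    def __post_init__(self):
        # Recalls must begin before new grants are refused.
        if not (0 < self.recall_threshold < self.deny_threshold <= self.capacity):
            raise ValueError("need 0 < recall_threshold < deny_threshold <= capacity")

    def should_send_recall_any(self, in_use):
        return in_use >= self.recall_threshold

    def may_grant(self, in_use):
        return in_use < self.deny_threshold


policy = PoolPolicy(capacity=1000, recall_threshold=800, deny_threshold=950)
```

   Between the two thresholds the server both grants new objects and
   asks clients to shed less useful ones, so existing objects are not
   implicitly favored over requested ones.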

20.6.4. IMPLEMENTATION

The client can choose to return any type of object specified by the mask. If a server wishes to limit the use of objects of a specific type, it should only specify that type in the mask it sends. Should the client fail to return requested objects, it is up to the server to handle this situation, typically by sending specific recalls (i.e., sending CB_RECALL operations) to properly limit resource usage. The server should give the client enough time to return objects before proceeding to specific recalls. This time should not be less than the lease period.
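The server-side escalation described above can be sketched as a small decision function. The function name, the string results, and the use of plain numbers for timestamps are all hypothetical; the only rule taken from the text is that the wait before specific recalls should not be less than the lease period.

```python
def next_server_step(now, recall_any_sent_at, lease_period, held, objects_to_keep):
    """Decide what to do about one client after a CB_RECALL_ANY was sent.

    Times are in seconds on any monotonic scale.  `held` is the number of
    objects of the masked types the client still holds.
    """
    if held <= objects_to_keep:
        return "done"            # the client complied; nothing more to do
    if now - recall_any_sent_at < lease_period:
        return "wait"            # still inside the minimum waiting time
    return "send_cb_recall"      # fall back to recalling specific objects


# With a 90-second lease, a client still over the limit is left alone
# until at least 90 seconds have passed since the CB_RECALL_ANY.
step_early = next_server_step(now=30, recall_any_sent_at=0, lease_period=90,
                              held=12, objects_to_keep=8)
step_late = next_server_step(now=90, recall_any_sent_at=0, lease_period=90,
                             held=12, objects_to_keep=8)
```

A server may of course wait longer than one lease period; the sketch uses the minimum only to keep the example short.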

