RFC 7862

Network File System (NFS) Version 4 Minor Version 2 Protocol

Pages: 104
Proposed Standard
Updated by: 8178

4.  Server-Side Copy

   The server-side copy features provide mechanisms that allow an NFS
   client to copy file data on a server or between two servers without
   the data being transmitted back and forth over the network through
   the NFS client.  Without these features, an NFS client would copy
   data from one location to another by reading the data from the source
   server over the network and then writing the data back over the
   network to the destination server.

   If the source object and destination object are on different file
   servers, the file servers will communicate with one another to
   perform the COPY operation.  The server-to-server protocol by which
   this is accomplished is not defined in this document.

   The copy feature allows the server to perform the copying either
   synchronously or asynchronously.  The client can request synchronous
   copying, but the server may not be able to honor this request.  If
   the server intends to perform asynchronous copying, it supplies the
   client with a request identifier that the client can use to monitor
   the progress of the copying and, if appropriate, cancel a request in
   progress.  The request identifier is a stateid representing the
   internal state held by the server while the copying is performed.
   Multiple asynchronous copies of all or part of a file may be in
   progress in parallel on a server; the stateid request identifier
   allows monitoring and canceling to be applied to the correct request.

4.1.  Protocol Overview

   The server-side copy offload operations support both intra-server and
   inter-server file copies.  An intra-server copy is a copy in which
   the source file and destination file reside on the same server.  In
   an inter-server copy, the source file and destination file are on
   different servers.  In both cases, the copy may be performed
   synchronously or asynchronously.

   In addition, the CLONE operation provides COPY-like functionality in
   the intra-server case.  CLONE is both synchronous and atomic: other
   operations cannot observe the target file in any state between the
   state before the CLONE operation and the state after it.

   Throughout the rest of this document, the NFS server containing the
   source file is referred to as the "source server" and the NFS server
   to which the file is transferred as the "destination server".  In the
   case of an intra-server copy, the source server and destination
   server are the same server.  Therefore, in the context of an
   intra-server copy, the terms "source server" and "destination server"
   refer to the single server performing the copy.

   The new operations are designed to copy files or regions within them.
   Other file system objects can be copied by building on these
   operations or using other techniques.  For example, if a user wishes
   to copy a directory, the client can synthesize a directory COPY
   operation by first creating the destination directory and the
   individual (empty) files within it and then copying the contents of
   the source directory's files to files in the new destination
   directory.
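
   The directory example above can be sketched as follows.  This is a
   minimal illustration under stated assumptions, not part of the
   protocol: local directories stand in for the exported file systems,
   only the regular files directly inside the source directory are
   handled, and the copy_file parameter is a hypothetical stand-in for
   whatever issues the per-file COPY request (it defaults to a local
   copy so that the sketch runs on its own).

```python
import os
import shutil
import tempfile


def synthesize_directory_copy(src_dir, dst_dir, copy_file=shutil.copyfile):
    """Copy the regular files directly inside src_dir into a new dst_dir.

    copy_file(src_path, dst_path) is a hypothetical stand-in for a
    per-file COPY request; it is not an NFS client API.
    """
    # Step 1: create the destination directory.
    os.mkdir(dst_dir)
    names = sorted(
        name for name in os.listdir(src_dir)
        if os.path.isfile(os.path.join(src_dir, name))
    )
    # Step 2: create the individual (empty) files within it.
    for name in names:
        open(os.path.join(dst_dir, name), "wb").close()
    # Step 3: copy each source file's contents to its new counterpart.
    for name in names:
        copy_file(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return names
```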

   For the inter-server copy, the operations are defined to be
   compatible with the traditional copy authorization approach.  The
   client and user are authorized at the source for reading.  Then, they
   are authorized at the destination for writing.

4.1.1.  COPY Operations

   CLONE:  Used by the client to request a synchronous atomic COPY-like
      operation.  (Section 15.13)

   COPY_NOTIFY:  Used by the client to request the source server to
      authorize a future file copy that will be made by a given
      destination server on behalf of the given user.  (Section 15.3)

   COPY:  Used by the client to request a file copy.  (Section 15.2)

   OFFLOAD_CANCEL:  Used by the client to terminate an asynchronous file
      copy.  (Section 15.8)

   OFFLOAD_STATUS:  Used by the client to poll the status of an
      asynchronous file copy.  (Section 15.9)

   CB_OFFLOAD:  Used by the destination server to report the results of
      an asynchronous file copy to the client.  (Section 16.1)

4.1.2.  Requirements for Operations

   Inter-server copy, intra-server copy, and intra-server clone are each
   OPTIONAL features in the context of server-side copy.  A server may
   choose independently to implement any of them.  A server implementing
   any of these features may be REQUIRED to implement certain
   operations.  Other operations are OPTIONAL in the context of a
   particular feature (see Table 5 in Section 13) but may become
   REQUIRED, depending on server behavior.  Clients need to use these
   operations to successfully copy a file.

   For a client to do an intra-server file copy, it needs to use either
   the COPY or the CLONE operation.  If COPY is used, the client MUST
   support the CB_OFFLOAD operation.  If COPY is used and it returns a
   stateid, then the client MAY use the OFFLOAD_CANCEL and
   OFFLOAD_STATUS operations.

   For a client to do an inter-server file copy, it needs to use the
   COPY and COPY_NOTIFY operations and MUST support the CB_OFFLOAD
   operation.  If COPY returns a stateid, then the client MAY use the
   OFFLOAD_CANCEL and OFFLOAD_STATUS operations.

   If a server supports the intra-server COPY feature, then the server
   MUST support the COPY operation.  If a server's COPY operation
   returns a stateid, then the server MUST also support these
   operations: CB_OFFLOAD, OFFLOAD_CANCEL, and OFFLOAD_STATUS.

   If a server supports the CLONE feature, then it MUST support the
   CLONE operation and the clone_blksize attribute on any file system on
   which CLONE is supported (as either source or destination file).

   If a source server supports the inter-server COPY feature, then it
   MUST support the COPY_NOTIFY and OFFLOAD_CANCEL operations.  If a
   destination server supports the inter-server COPY feature, then it
   MUST support the COPY operation.  If a destination server's COPY
   operation returns a stateid, then the destination server MUST also
   support these operations: CB_OFFLOAD, OFFLOAD_CANCEL, COPY_NOTIFY,
   and OFFLOAD_STATUS.
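
   The server-side requirements in the preceding paragraphs can be
   summarized as a lookup.  The sketch below is only a restatement of
   that text; the feature and role labels ("intra_copy", "clone",
   "inter_copy", "source", "destination") and the function name are
   invented for this illustration and do not appear in the protocol.

```python
def server_required_ops(feature, role=None, copy_returns_stateid=False):
    """Return the set of operations a server MUST support.

    feature: "intra_copy", "clone", or "inter_copy" (labels for this sketch)
    role:    for "inter_copy" only, "source" or "destination"
    copy_returns_stateid: True if the server's COPY can return a stateid,
                          i.e., the server may copy asynchronously
    """
    async_ops = {"CB_OFFLOAD", "OFFLOAD_CANCEL", "OFFLOAD_STATUS"}
    if feature == "intra_copy":
        ops = {"COPY"}
        if copy_returns_stateid:
            ops |= async_ops
    elif feature == "clone":
        # The clone_blksize attribute is also required; it is an
        # attribute rather than an operation, so it is not listed here.
        ops = {"CLONE"}
    elif feature == "inter_copy" and role == "source":
        ops = {"COPY_NOTIFY", "OFFLOAD_CANCEL"}
    elif feature == "inter_copy" and role == "destination":
        ops = {"COPY"}
        if copy_returns_stateid:
            ops |= async_ops | {"COPY_NOTIFY"}
    else:
        raise ValueError("unknown feature/role combination")
    return ops
```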

   Each operation is performed in the context of the user identified by
   the Open Network Computing (ONC) RPC credential in the RPC request
   containing the COMPOUND or CB_COMPOUND request.  For example, an
   OFFLOAD_CANCEL operation issued by a given user indicates that a
   specified COPY operation initiated by the same user is to be
   canceled.  Therefore, an OFFLOAD_CANCEL MUST NOT interfere with a
   copy of the same file initiated by another user.

   An NFS server MAY allow an administrative user to monitor or cancel
   COPY operations using an implementation-specific interface.

4.2.  Requirements for Inter-Server Copy

   The specification of the inter-server copy is driven by several
   requirements:

   o  The specification MUST NOT mandate the server-to-server protocol.

   o  The specification MUST provide guidance for using NFSv4.x as a
      copy protocol.  For those source and destination servers willing
      to use NFSv4.x, there are specific security considerations that
      the specification MUST address.

   o  The specification MUST NOT mandate preconfiguration between the
      source and destination servers.  Requiring that the source and
      destination servers first have a "copying relationship" increases
      the administrative burden.  However, the specification MUST NOT
      preclude implementations that require preconfiguration.

   o  The specification MUST NOT mandate a trust relationship between
      the source and destination servers.  The NFSv4 security model
      requires mutual authentication between a principal on an NFS
      client and a principal on an NFS server.  This model MUST continue
      with the introduction of COPY.

4.3.  Implementation Considerations

4.3.1.  Locking the Files

   Both the source file and the destination file may need to be locked
   to protect the content during the COPY operations.  A client can
   achieve this by a combination of OPEN and LOCK operations; that is,
   it can use share locks (established by OPEN), byte-range locks
   (established by LOCK), or both.

   Note that when the client establishes a lock stateid on the source,
   the context of that stateid is for the client and not the
   destination.  As such, there might already be an outstanding stateid,
   issued to the destination as the client of the source, with the same
   value as that provided for the lock stateid.  The source MUST
   interpret the lock stateid as that of the client, i.e., when the
   destination presents it in the context of an inter-server copy, it is
   on behalf of the client.

4.3.2.  Client Caches

   In a traditional copy, if the client is in the process of writing to
   the file before the copy (perhaps under a write delegation), the
   client itself holds the modified data, so it is straightforward for
   it to update the destination server.  With an inter-server copy, the
   source has no insight into the changes cached on the client.  The
   client SHOULD therefore write the cached data back to the source
   before the copy.  If it does not do so, it is possible that the
   destination will receive a corrupt copy of the file.

4.4.  Intra-Server Copy

   To copy a file on a single server, the client uses a COPY operation.
   The server may respond to the COPY operation with the final results
   of the copy, or it may perform the copy asynchronously and deliver
   the results using a CB_OFFLOAD callback operation.  If the copy is
   performed asynchronously, the client may poll the status of the copy
   using OFFLOAD_STATUS or cancel the copy using OFFLOAD_CANCEL.

   A synchronous intra-server copy is shown in Figure 1.  In this
   example, the NFS server chooses to perform the copy synchronously.
   The COPY operation is completed, either successfully or
   unsuccessfully, before the server replies to the client's request.
   The server's reply contains the final result of the operation.

     Client                                  Server
        +                                      +
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the source file
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the destination file
        |                                      |
        |--- COPY ---------------------------->| Client requests
        |<------------------------------------/| a file copy
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the destination file
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the source file
        |                                      |
        |                                      |

                 Figure 1: A Synchronous Intra-Server Copy

   An asynchronous intra-server copy is shown in Figure 2.  In this
   example, the NFS server performs the copy asynchronously.  The
   server's reply to the copy request indicates that the COPY operation
   was initiated and the final result will be delivered at a later time.
   The server's reply also contains a copy stateid.  The client may use
   this copy stateid to poll for status information (as shown) or to
   cancel the copy using an OFFLOAD_CANCEL.  When the server completes
   the copy, the server performs a callback to the client and reports
   the results.

     Client                                  Server
        +                                      +
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the source file
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the destination file
        |                                      |
        |--- COPY ---------------------------->| Client requests
        |<------------------------------------/| a file copy
        |                                      |
        |                                      |
        |--- OFFLOAD_STATUS ------------------>| Client may poll
        |<------------------------------------/| for status
        |                                      |
        |                  .                   | Multiple OFFLOAD_STATUS
        |                  .                   | operations may be sent
        |                  .                   |
        |                                      |
        |<-- CB_OFFLOAD -----------------------| Server reports results
        |\------------------------------------>|
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the destination file
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the source file
        |                                      |
        |                                      |

                Figure 2: An Asynchronous Intra-Server Copy
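
   The difference between Figures 1 and 2, as seen by the client, can
   be sketched as follows.  The CopyReply class is a deliberately
   simplified, hypothetical view of a COPY reply (the real result
   structure is defined in Section 15.2), and the "pending" dictionary
   is an assumed client-side bookkeeping structure, not something the
   protocol defines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CopyReply:
    """Hypothetical, simplified view of a COPY reply."""
    bytes_written: int
    stateid: Optional[bytes] = None  # present only for an asynchronous copy


def handle_copy_reply(reply, pending):
    """Return the final byte count, or None if the result arrives later.

    pending is a set of copy stateids whose results have not yet been
    reported by CB_OFFLOAD.
    """
    if reply.stateid is None:
        # Synchronous (Figure 1): the reply carries the final result.
        return reply.bytes_written
    # Asynchronous (Figure 2): keep the stateid so that the client can
    # later send OFFLOAD_STATUS or OFFLOAD_CANCEL for this copy.
    pending.add(reply.stateid)
    return None


def handle_cb_offload(stateid, bytes_written, pending):
    """Process the server's callback reporting the final result."""
    if stateid not in pending:
        raise KeyError("CB_OFFLOAD for an unknown copy stateid")
    pending.remove(stateid)
    return bytes_written
```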

4.5.  Inter-Server Copy

   A copy may also be performed between two servers.  The copy protocol
   is designed to accommodate a variety of network topologies.  As shown
   in Figure 3, the client and servers may be connected by multiple
   networks.  In particular, the servers may be connected by a
   specialized, high-speed network (network 192.0.2.0/24 in the diagram)
   that does not include the client.  The protocol allows the client to
   set up the copy between the servers (over network 203.0.113.0/24 in
   the diagram) and for the servers to communicate on the high-speed
   network if they choose to do so.

                             192.0.2.0/24
                 +-------------------------------------+
                 |                                     |
                 |                                     |
                 | 192.0.2.18                          | 192.0.2.56
         +-------+------+                       +------+------+
         |     Source   |                       | Destination |
         +-------+------+                       +------+------+
                 | 203.0.113.18                        | 203.0.113.56
                 |                                     |
                 |                                     |
                 |             203.0.113.0/24          |
                 +------------------+------------------+
                                    |
                                    |
                                    | 203.0.113.243
                              +-----+-----+
                              |   Client  |
                              +-----------+

            Figure 3: An Example Inter-Server Network Topology

   For an inter-server copy, the client notifies the source server that
   a file will be copied by the destination server using a COPY_NOTIFY
   operation.  The client then initiates the copy by sending the COPY
   operation to the destination server.  The destination server may
   perform the copy synchronously or asynchronously.

   A synchronous inter-server copy is shown in Figure 4.  In this case,
   the destination server chooses to perform the copy before responding
   to the client's COPY request.

     Client                Source         Destination
        +                    +                 +
        |                    |                 |
        |--- OPEN        --->|                 | Returns
        |<------------------/|                 | open state os1
        |                    |                 |
        |--- COPY_NOTIFY --->|                 |
        |<------------------/|                 |
        |                    |                 |
        |--- OPEN ---------------------------->| Returns
        |<------------------------------------/| open state os2
        |                    |                 |
        |--- COPY ---------------------------->|
        |                    |                 |
        |                    |                 |
        |                    |<----- READ -----|
        |                    |\--------------->|
        |                    |                 |
        |                    |        .        | Multiple READs may
        |                    |        .        | be necessary
        |                    |        .        |
        |                    |                 |
        |                    |                 |
        |<------------------------------------/| Destination replies
        |                    |                 | to COPY
        |                    |                 |
        |--- CLOSE --------------------------->| Release os2
        |<------------------------------------/|
        |                    |                 |
        |--- CLOSE       --->|                 | Release os1
        |<------------------/|                 |

                 Figure 4: A Synchronous Inter-Server Copy

   An asynchronous inter-server copy is shown in Figure 5.  In this
   case, the destination server chooses to respond to the client's COPY
   request immediately and then perform the copy asynchronously.

     Client                Source         Destination
       +                    +                 +
       |                    |                 |
       |--- OPEN        --->|                 | Returns
       |<------------------/|                 | open state os1
       |                    |                 |
       |--- LOCK        --->|                 | Optional; could be done
       |<------------------/|                 | with a share lock
       |                    |                 |
       |--- COPY_NOTIFY --->|                 | Need to pass in
       |<------------------/|                 | os1 or lock state
       |                    |                 |
       |                    |                 |
       |                    |                 |
       |--- OPEN ---------------------------->| Returns
       |<------------------------------------/| open state os2
       |                    |                 |
       |--- LOCK ---------------------------->| Optional ...
       |<------------------------------------/|
       |                    |                 |
       |--- COPY ---------------------------->| Need to pass in
       |<------------------------------------/| os2 or lock state
       |                    |                 |
       |                    |                 |
       |                    |<----- READ -----|
       |                    |\--------------->|
       |                    |                 |
       |                    |        .        | Multiple READs may
       |                    |        .        | be necessary
       |                    |        .        |
       |                    |                 |
       |                    |                 |
       |--- OFFLOAD_STATUS ------------------>| Client may poll
       |<------------------------------------/| for status
       |                    |                 |
       |                    |        .        | Multiple OFFLOAD_STATUS
       |                    |        .        | operations may be sent
       |                    |        .        |
       |                    |                 |
       |                    |                 |
       |                    |                 |
       |<-- CB_OFFLOAD -----------------------| Destination reports
       |\------------------------------------>| results
       |                    |                 |
       |--- LOCKU --------------------------->| Only if LOCK was done
       |<------------------------------------/|
       |                    |                 |
       |--- CLOSE --------------------------->| Release os2
       |<------------------------------------/|
       |                    |                 |
       |--- LOCKU       --->|                 | Only if LOCK was done
       |<------------------/|                 |
       |                    |                 |
       |--- CLOSE       --->|                 | Release os1
       |<------------------/|                 |
       |                    |                 |

                Figure 5: An Asynchronous Inter-Server Copy
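
   The order of the client-issued operations in Figure 5 can be
   restated as a small function.  This is only a reading aid for the
   figure: the function name and the ("source" / "destination") labels
   are invented for this sketch, and the server-to-server READs, the
   optional OFFLOAD_STATUS polling, and the CB_OFFLOAD callback are
   deliberately left out because the client does not issue them in a
   fixed order.

```python
def inter_server_copy_sequence(lock_source=False, lock_destination=False):
    """List the (target, operation) pairs the client sends, in order."""
    seq = [("source", "OPEN")]
    if lock_source:
        seq.append(("source", "LOCK"))
    # COPY_NOTIFY carries the source open stateid or lock stateid.
    seq.append(("source", "COPY_NOTIFY"))
    seq.append(("destination", "OPEN"))
    if lock_destination:
        seq.append(("destination", "LOCK"))
    # COPY carries the destination open stateid or lock stateid.
    seq.append(("destination", "COPY"))
    # After the copy completes, state is released on the destination
    # first and then on the source, as in the figure.
    if lock_destination:
        seq.append(("destination", "LOCKU"))
    seq.append(("destination", "CLOSE"))
    if lock_source:
        seq.append(("source", "LOCKU"))
    seq.append(("source", "CLOSE"))
    return seq
```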

4.6.  Server-to-Server Copy Protocol

   The choice of what protocol to use in an inter-server copy is
   ultimately the destination server's decision.  However, the
   destination server has to be cognizant that it is working on behalf
   of the client.

4.6.1.  Considerations on Selecting a Copy Protocol

   The client may have requirements regarding both the size of
   transactions and error recovery semantics.  It may want to split the
   copy up such that each chunk is synchronously transferred.  It may
   want the copy protocol to copy the bytes in consecutive order so
   that, upon an error, the client can restart the copy at the last
   known good offset.  If the destination server cannot meet these
   requirements, the client may prefer the traditional copy mechanism,
   with which it can meet those requirements itself.
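
   A chunked copy that can be restarted at the last known good offset
   might look like the sketch below.  The copy_chunk callable is a
   hypothetical stand-in for one synchronous transfer of a byte range;
   how such a transfer is actually requested is outside the scope of
   this illustration.

```python
def copy_in_chunks(copy_chunk, total_len, chunk_size, start_offset=0):
    """Copy bytes [start_offset, total_len) in consecutive chunks.

    copy_chunk(offset, length) is a hypothetical callable that performs
    one synchronous transfer and returns the number of bytes copied; it
    may raise OSError.  Returns (last_good_offset, error), where error
    is None if the whole range was copied.
    """
    offset = start_offset
    while offset < total_len:
        length = min(chunk_size, total_len - offset)
        try:
            copied = copy_chunk(offset, length)
        except OSError as exc:
            # Everything before 'offset' is known to be good.
            return offset, exc
        if copied <= 0:
            return offset, OSError("no progress at offset %d" % offset)
        offset += copied
    return offset, None
```

   After a failure, the caller can pass the returned offset back in as
   start_offset to resume rather than recopying the whole range.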

4.6.2.  Using NFSv4.x as the Copy Protocol

   The destination server MAY use standard NFSv4.x (where x >= 1)
   operations to read the data from the source server.  If NFSv4.x is
   used for the server-to-server copy protocol, the destination server
   can use the source filehandle and ca_src_stateid provided in the COPY
   request with standard NFSv4.x operations to read data from the source
   server.  Note that the ca_src_stateid MUST be the cnr_stateid
   returned from the source via the COPY_NOTIFY (Section 15.3).

4.6.3.  Using an Alternative Copy Protocol

   In a homogeneous environment, the source and destination servers
   might be able to perform the file copy extremely efficiently using
   specialized protocols.  For example, the source and destination
   servers might be two nodes sharing a common file system format for
   the source and destination file systems.  Thus, the source and
   destination are in an ideal position to efficiently render the image
   of the source file to the destination file by replicating the file
   system formats at the block level.  Another possibility is that the
   source and destination might be two nodes sharing a common storage
   area network, and thus there is no need to copy any data at all;
   instead, ownership of the file and its contents might simply be
   reassigned to the destination.  To allow for these possibilities, the
   destination server is allowed to use a server-to-server copy protocol
   of its choice.

   In a heterogeneous environment, using a protocol other than NFSv4.x
   (e.g., HTTP [RFC7230] or FTP [RFC959]) presents some challenges.  In
   particular, the destination server is presented with the challenge of
   accessing the source file given only an NFSv4.x filehandle.

   One option for protocols that identify source files with pathnames is
   to use an ASCII hexadecimal representation of the source filehandle
   as the filename.

   Another option for the source server is to use URLs to direct the
   destination server to a specialized service.  For example, the
   response to COPY_NOTIFY could include the URL
   <ftp://s1.example.com:9999/_FH/0x12345>, where 0x12345 is the ASCII
   hexadecimal representation of the source filehandle.  When the
   destination server receives the source server's URL, it would use
   "_FH/0x12345" as the filename to pass to the FTP server listening on
   port 9999 of s1.example.com.  On port 9999 there would be a special
   instance of the FTP service that understands how to convert NFS
   filehandles to an open file descriptor (in many operating systems,
   this would require a new system call, one that is the inverse of the
   makefh() function that the pre-NFSv4 MOUNT service needs).
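
   The filehandle-to-URL mapping described above can be sketched as
   follows.  The function names are invented for this illustration, and
   the "_FH" prefix and "0x" notation simply mirror the example URL.
   Note one assumption: because a filehandle is a sequence of whole
   bytes, this sketch always emits an even number of hexadecimal digits
   (e.g., "0x012345"), unlike the abbreviated "0x12345" in the example.

```python
def filehandle_to_name(fh: bytes) -> str:
    """Return an ASCII hexadecimal representation of an opaque filehandle."""
    return "0x" + fh.hex()


def name_to_filehandle(name: str) -> bytes:
    """Inverse of filehandle_to_name(); raises ValueError on bad input."""
    if not name.startswith("0x"):
        raise ValueError("not a hexadecimal filehandle name")
    return bytes.fromhex(name[2:])


def copy_url(host: str, port: int, fh: bytes, prefix: str = "_FH") -> str:
    """Build a URL that a source server could return from COPY_NOTIFY."""
    return "ftp://%s:%d/%s/%s" % (host, port, prefix, filehandle_to_name(fh))
```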

   Authenticating and identifying the destination server to the source
   server is also a challenge.  One solution would be to construct
   unique URLs for each destination server.

4.7.  netloc4 - Network Locations

   The server-side COPY operations specify network locations using the
   netloc4 data type shown below (see [RFC7863]):

   <CODE BEGINS>

   enum netloc_type4 {
           NL4_NAME        = 1,
           NL4_URL         = 2,
           NL4_NETADDR     = 3
   };

   union netloc4 switch (netloc_type4 nl_type) {
           case NL4_NAME:          utf8str_cis nl_name;
           case NL4_URL:           utf8str_cis nl_url;
           case NL4_NETADDR:       netaddr4    nl_addr;
   };

   <CODE ENDS>

   If the netloc4 is of type NL4_NAME, the nl_name field MUST be
   specified as a UTF-8 string.  The nl_name is expected to be resolved
   to a network address via DNS, the Lightweight Directory Access
   Protocol (LDAP), the Network Information Service (NIS), /etc/hosts,
   or some other means.  If the netloc4 is of type NL4_URL, a server URL
   [RFC3986] appropriate for the server-to-server COPY operation is
   specified as a UTF-8 string.  If the netloc4 is of type NL4_NETADDR,
   the nl_addr field MUST contain a valid netaddr4 as defined in
   Section 3.3.9 of [RFC5661].

   When netloc4 values are used for an inter-server copy as shown in
   Figure 3, their values may be evaluated on the source server,
   destination server, and client.  The network environment in which
   these systems operate should be configured so that the netloc4 values
   are interpreted as intended on each system.
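
   As an illustration of how a netloc4 value appears on the wire, the
   sketch below encodes one by hand using the standard XDR rules: a
   4-byte big-endian discriminant, followed by variable-length data
   that is prefixed by its length and zero-padded to a multiple of four
   bytes.  It assumes that netaddr4 consists of two strings (a network
   identifier followed by a universal address), as defined in
   [RFC5661]; the function names are invented for this sketch, and no
   validation of the name, URL, or address is attempted.

```python
import struct

NL4_NAME, NL4_URL, NL4_NETADDR = 1, 2, 3


def xdr_opaque(data: bytes) -> bytes:
    """XDR variable-length opaque/string: length, data, zero padding."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_netloc4(nl_type: int, value) -> bytes:
    """Encode a netloc4 union.

    value is a str for NL4_NAME and NL4_URL, or a (netid, addr) pair of
    str for NL4_NETADDR.
    """
    head = struct.pack(">i", nl_type)  # enum discriminant
    if nl_type in (NL4_NAME, NL4_URL):
        return head + xdr_opaque(value.encode("utf-8"))
    if nl_type == NL4_NETADDR:
        netid, addr = value
        return (head
                + xdr_opaque(netid.encode("ascii"))
                + xdr_opaque(addr.encode("ascii")))
    raise ValueError("unknown netloc_type4 value")
```

   For example, encode_netloc4(NL4_NETADDR, ("tcp", "192.0.2.18.8.1"))
   encodes the source server's address on the high-speed network in
   Figure 3, with ".8.1" being the universal-address form of port 2049.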

4.8.  Copy Offload Stateids

   A server may perform a copy offload operation asynchronously.  An
   asynchronous copy is tracked using a copy offload stateid.  Copy
   offload stateids are included in the COPY, OFFLOAD_CANCEL,
   OFFLOAD_STATUS, and CB_OFFLOAD operations.

   A copy offload stateid will be valid until either (A) the client or
   server restarts or (B) the client returns the resource by issuing an
   OFFLOAD_CANCEL operation or the client replies to a CB_OFFLOAD
   operation.

   A copy offload stateid's seqid MUST NOT be zero.  In the context of a
   copy offload operation, it is inappropriate to indicate "the most
   recent copy offload operation" using a stateid with a seqid of zero
   (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
   stateid refers to internal state in the server and there may be
   several asynchronous COPY operations being performed in parallel on
   the same file by the server.  Therefore, a copy offload stateid with
   a seqid of zero MUST be considered invalid.
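
   The lifetime and validity rules for copy offload stateids can be
   sketched as a small table kept by the server.  The class and method
   names below are hypothetical and describe no particular
   implementation; the per-user check reflects the rule in
   Section 4.1.2 that an OFFLOAD_CANCEL must not interfere with a copy
   initiated by another user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stateid:
    seqid: int
    other: bytes  # opaque portion of the stateid


class CopyOffloadTable:
    """Hypothetical server-side table of active copy offload stateids."""

    def __init__(self):
        self._owner = {}  # Stateid -> user that initiated the copy

    @staticmethod
    def _check(stateid):
        if stateid.seqid == 0:
            raise ValueError("a copy offload stateid's seqid MUST NOT be zero")

    def register(self, stateid, user):
        """Record a newly started asynchronous copy."""
        self._check(stateid)
        self._owner[stateid] = user

    def is_valid(self, stateid):
        return stateid.seqid != 0 and stateid in self._owner

    def cancel(self, stateid, user):
        """OFFLOAD_CANCEL: release the stateid only for its own user."""
        self._check(stateid)
        if self._owner.get(stateid) != user:
            return False
        del self._owner[stateid]
        return True

    def callback_replied(self, stateid):
        """The client replied to CB_OFFLOAD; the stateid is released."""
        self._owner.pop(stateid, None)

    def restart(self):
        """A client or server restart invalidates the stateids."""
        self._owner.clear()
```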

4.9.  Security Considerations for Server-Side Copy

   All security considerations pertaining to NFSv4.1 [RFC5661] apply to
   this section; as such, the standard security mechanisms used by the
   protocol can be used to secure the server-to-server operations.

   NFSv4 clients and servers supporting the inter-server COPY operations
   described in this section are REQUIRED to implement the mechanism
   described in Section 4.9.1.1 and to support rejecting COPY_NOTIFY
   requests that do not use the RPC security protocol (RPCSEC_GSS)
   [RFC7861] with privacy.  If the server-to-server copy protocol is
   based on ONC RPC, the servers are also REQUIRED to implement
   [RFC7861], including the RPCSEC_GSSv3 "copy_to_auth",
   "copy_from_auth", and "copy_confirm_auth" structured privileges.
   This requirement to implement is not a requirement to use; for
   example, a server may, depending on configuration, also allow
   COPY_NOTIFY requests that use only AUTH_SYS.

   If a server requires the use of an RPCSEC_GSSv3 copy_to_auth,
   copy_from_auth, or copy_confirm_auth privilege and it is not used,
   the server will reject the request with NFS4ERR_PARTNER_NO_AUTH.

4.9.1.  Inter-Server Copy Security

4.9.1.1.  Inter-Server Copy via ONC RPC with RPCSEC_GSSv3

   When the client sends a COPY_NOTIFY to the source server, telling it
   to expect the destination to attempt to copy data from the source
   server, it is expected that this copy is being done on behalf of the
   principal (called the "user principal") that sent the RPC request
   that encloses the COMPOUND procedure that contains the COPY_NOTIFY
   operation.  The user principal is identified by the RPC credentials.
   What is needed is a mechanism that allows the user principal to
   authorize the destination server to perform the copy, lets the source
   server properly authenticate the destination's copy, and does not
   allow the destination server to exceed this authorization.

   An approach that sends delegated credentials of the client's user
   principal to the destination server is not used for the following
   reason.  If the client's user delegated its credentials, the
   destination would authenticate as the user principal.  If the
   destination were using the NFSv4 protocol to perform the copy, then
   the source server would authenticate the destination server as the
   user principal, and the file copy would securely proceed.  However,
   this approach would allow the destination server to copy other files.
   The user principal would have to trust the destination server to not
   do so.  This is counter to the requirements and therefore is not
   considered.

   Instead, a feature of the RPCSEC_GSSv3 protocol [RFC7861] can be
   used: RPC-application-defined structured privilege assertion.  This
   feature allows the destination server to authenticate to the source
   server as acting on behalf of the user principal and to authorize the
   destination server to perform READs of the file to be copied from the
   source on behalf of the user principal.  Once the copy is complete,
   the client can destroy the RPCSEC_GSSv3 handles to end the
   authorization of both the source and destination servers to copy.

   For each structured privilege assertion defined by an RPC
   application, RPCSEC_GSSv3 requires the application to define a name
   string and a data structure that will be encoded and passed between
   client and server as opaque data.  For NFSv4, the data structures
   specified below MUST be serialized using XDR.

   Three RPCSEC_GSSv3 structured privilege assertions that work together
   to authorize the copy are defined here.  For each of the assertions,
   the description starts with the name string passed in the rp_name
   field of the rgss3_privs structure defined in Section 2.7.1.4 of
   [RFC7861] and specifies the XDR encoding of the associated structured
   data passed via the rp_privilege field of the structure.

   copy_from_auth:  A user principal is authorizing a source principal
      ("nfs@<source>") to allow a destination principal
      ("nfs@<destination>") to set up the copy_confirm_auth privilege
      required to copy a file from the source to the destination on
      behalf of the user principal.  This privilege is established on
      the source server before the user principal sends a COPY_NOTIFY
      operation to the source server, and the resultant RPCSEC_GSSv3
      context is used to secure the COPY_NOTIFY operation.

      <CODE BEGINS>

   struct copy_from_auth_priv {
           secret4             cfap_shared_secret;
           netloc4             cfap_destination;
           /* the NFSv4 user name that the user principal maps to */
           utf8str_mixed       cfap_username;
   };

      <CODE ENDS>

      cfap_shared_secret is an automatically generated random secret
      value.

   copy_to_auth:  A user principal is authorizing a destination
      principal ("nfs@<destination>") to set up a copy_confirm_auth
      privilege with a source principal ("nfs@<source>") to allow it to
      copy a file from the source to the destination on behalf of the
      user principal.  This privilege is established on the destination
      server before the user principal sends a COPY operation to the
      destination server, and the resultant RPCSEC_GSSv3 context is used
      to secure the COPY operation.

      <CODE BEGINS>

   struct copy_to_auth_priv {
           /* equal to cfap_shared_secret */
           secret4              ctap_shared_secret;
           netloc4              ctap_source<>;
           /* the NFSv4 user name that the user principal maps to */
           utf8str_mixed        ctap_username;
   };

      <CODE ENDS>

      ctap_shared_secret is the automatically generated secret value
      used to establish the copy_from_auth privilege with the source
      principal.  See Section 4.9.1.1.1.

   copy_confirm_auth:  A destination principal ("nfs@<destination>") is
      confirming with the source principal ("nfs@<source>") that it is
      authorized to copy data from the source.  This privilege is
      established on the destination server before the file is copied
      from the source to the destination.  The resultant RPCSEC_GSSv3
      context is used to secure the READ operations from the source to
      the destination server.

      <CODE BEGINS>

   struct copy_confirm_auth_priv {
           /* equal to GSS_GetMIC() of cfap_shared_secret */
           opaque              ccap_shared_secret_mic<>;
           /* the NFSv4 user name that the user principal maps to */
           utf8str_mixed       ccap_username;
   };

      <CODE ENDS>
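
   As an illustration of the XDR serialization required for these
   structures, the following Python sketch encodes a
   copy_from_auth_priv instance.  It is a minimal sketch and not a
   reference encoder.  It assumes that secret4 and utf8str_mixed are
   variable-length opaque values and that the name arm of netloc4 uses
   the discriminant value 1; these assumptions should be checked
   against the XDR description of the protocol.  The helper names
   (xdr_opaque, xdr_netloc4_name, encode_copy_from_auth_priv) are
   hypothetical.

```python
import struct

NL4_NAME = 1  # assumed discriminant for the "name" arm of netloc4


def xdr_opaque(data: bytes) -> bytes:
    """Encode variable-length opaque data: length, bytes, zero padding."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def xdr_string(text: str) -> bytes:
    """Encode a UTF-8 string as variable-length opaque data."""
    return xdr_opaque(text.encode("utf-8"))


def xdr_netloc4_name(name: str) -> bytes:
    """Encode a netloc4 union whose selected arm is a server name."""
    return struct.pack(">I", NL4_NAME) + xdr_string(name)


def encode_copy_from_auth_priv(secret: bytes, destination: str,
                               username: str) -> bytes:
    """Serialize the three fields in the order the struct declares them."""
    return (xdr_opaque(secret)
            + xdr_netloc4_name(destination)
            + xdr_string(username))


encoded = encode_copy_from_auth_priv(
    b"\x01\x02\x03", "dst.example.net", "alice@example.net")
```

   Every XDR item is padded to a multiple of four bytes, so the
   encoded length is always a multiple of four regardless of the
   lengths of the secret and the strings.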

4.9.1.1.1.  Establishing a Security Context

   When the user principal wants to copy a file between two servers, if
   it has not established copy_from_auth and copy_to_auth privileges on
   the servers, it establishes them as follows:

   o  As noted in [RFC7861], the client uses an existing RPCSEC_GSSv3
      context termed the "parent" handle to establish and protect
      RPCSEC_GSSv3 structured privilege assertion exchanges.  The
      copy_from_auth privilege will use the context established between
      the user principal and the source server used to OPEN the source
      file as the RPCSEC_GSSv3 parent handle.  The copy_to_auth
      privilege will use the context established between the user
      principal and the destination server used to OPEN the destination
      file as the RPCSEC_GSSv3 parent handle.

   o  A random number is generated to use as a secret to be shared
      between the two servers.  Note that the random number SHOULD NOT
      be reused between establishing different security contexts.  The
      resulting shared secret will be placed in the copy_from_auth_priv
      cfap_shared_secret field and the copy_to_auth_priv
      ctap_shared_secret field.  Because of this shared_secret, the
      RPCSEC_GSS3_CREATE control messages for copy_from_auth and
      copy_to_auth MUST use a Quality of Protection (QoP) of
      rpc_gss_svc_privacy.

   o  An instance of copy_from_auth_priv is filled in with the shared
      secret, the destination server, and the NFSv4 user id of the user
      principal and is placed in rpc_gss3_create_args
      assertions[0].privs.privilege.  The string "copy_from_auth" is
      placed in assertions[0].privs.name.  The source server unwraps the
      rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and verifies that
      the NFSv4 user id being asserted matches the source server's
      mapping of the user principal.  If it does, the privilege is
      established on the source server as <copy_from_auth, user id,
      destination>.  The field "handle" in a successful reply is the
      RPCSEC_GSSv3 copy_from_auth "child" handle that the client will
      use in COPY_NOTIFY requests to the source server.

   o  An instance of copy_to_auth_priv is filled in with the shared
      secret, the cnr_source_server list returned by COPY_NOTIFY, and
      the NFSv4 user id of the user principal.  The copy_to_auth_priv
      instance is placed in rpc_gss3_create_args
      assertions[0].privs.privilege.  The string "copy_to_auth" is
      placed in assertions[0].privs.name.  The destination server
      unwraps the rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and
      verifies that the NFSv4 user id being asserted matches the
      destination server's mapping of the user principal.  If it does,
      the privilege is established on the destination server as
      <copy_to_auth, user id, source list>.  The field "handle" in a
      successful reply is the RPCSEC_GSSv3 copy_to_auth child handle
      that the client will use in COPY requests to the destination
      server involving the source server.

   As noted in Section 2.7.1 of [RFC7861] ("New Control Procedure -
   RPCSEC_GSS_CREATE"), both the client and the source server should
   associate the RPCSEC_GSSv3 child handle with the parent RPCSEC_GSSv3
   handle used to create the RPCSEC_GSSv3 child handle.
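
   The steps above can be sketched from the client's point of view as
   follows.  This is an illustrative sketch only: the function names
   are hypothetical, the privileges are modeled as plain dictionaries
   rather than XDR-encoded structures, and the 32-byte secret length is
   an assumption, since this document does not specify a length.  The
   sketch shows that a fresh secret is generated for each
   copy_from_auth privilege and that the matching copy_to_auth
   privilege reuses that same secret together with the source list
   returned by COPY_NOTIFY.

```python
import secrets


def make_copy_from_auth(destination, username, secret_len=32):
    """Build the copy_from_auth assertion with a freshly generated secret."""
    priv = {
        "cfap_shared_secret": secrets.token_bytes(secret_len),
        "cfap_destination": destination,
        "cfap_username": username,
    }
    return "copy_from_auth", priv


def make_copy_to_auth(copy_from_auth_priv, cnr_source_server):
    """Build the matching copy_to_auth assertion after COPY_NOTIFY."""
    priv = {
        # Must equal cfap_shared_secret so that the servers can match up.
        "ctap_shared_secret": copy_from_auth_priv["cfap_shared_secret"],
        "ctap_source": list(cnr_source_server),
        "ctap_username": copy_from_auth_priv["cfap_username"],
    }
    return "copy_to_auth", priv


from_name, from_priv = make_copy_from_auth(
    "dst.example.net", "alice@example.net")
to_name, to_priv = make_copy_to_auth(from_priv, ["src.example.net"])
```

   Both assertions carry the secret itself, which is why the
   corresponding RPCSEC_GSS3_CREATE messages need privacy protection.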

4.9.1.1.2.  Starting a Secure Inter-Server Copy

   When the client sends a COPY_NOTIFY request to the source server, it
   uses the privileged copy_from_auth RPCSEC_GSSv3 handle.
   cna_destination_server in the COPY_NOTIFY MUST be the same as
   cfap_destination specified in copy_from_auth_priv.  Otherwise, the
   COPY_NOTIFY will fail with NFS4ERR_ACCESS.  The source server
   verifies that the privilege <copy_from_auth, user id, destination>
   exists and annotates it with the source filehandle, if the user
   principal has read access to the source file and if administrative
   policies give the user principal and the NFS client read access to
   the source file (i.e., if the ACCESS operation would grant read
   access).  Otherwise, the COPY_NOTIFY will fail with NFS4ERR_ACCESS.

   When the client sends a COPY request to the destination server, it
   uses the privileged copy_to_auth RPCSEC_GSSv3 handle.
   ca_source_server list in the COPY MUST be the same as ctap_source
   list specified in copy_to_auth_priv.  Otherwise, the COPY will fail
   with NFS4ERR_ACCESS.  The destination server verifies that the
   privilege <copy_to_auth, user id, source list> exists and annotates
   it with the source and destination filehandles.  If the COPY returns
   a wr_callback_id, then this is an asynchronous copy and the
   wr_callback_id must also be annotated to the copy_to_auth privilege.
   If the client has failed to establish the copy_to_auth privilege,
   the destination server will reject the request with
   NFS4ERR_PARTNER_NO_AUTH.

   If either the COPY_NOTIFY operation or the COPY operations fail, the
   associated copy_from_auth and copy_to_auth RPCSEC_GSSv3 handles MUST
   be destroyed.
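
   The checks described in this section can be summarized with the
   following sketch.  It is not a server implementation: the privilege
   tables are modeled as dictionaries keyed by privilege name and user
   id, error codes are represented by their names rather than their
   numeric values, and the function names are hypothetical.  Returning
   NFS4ERR_PARTNER_NO_AUTH when the source server finds no
   copy_from_auth privilege follows the statement to that effect in
   Section 4.9.1.1.3.

```python
def source_check_copy_notify(privs, user_id, cna_destination_server,
                             read_access):
    """Source-side checks for a COPY_NOTIFY on a copy_from_auth handle."""
    destination = privs.get(("copy_from_auth", user_id))
    if destination is None:
        return "NFS4ERR_PARTNER_NO_AUTH"
    # The destination named in the request must match the privilege,
    # and the user principal must be allowed to read the source file.
    if destination != cna_destination_server or not read_access:
        return "NFS4ERR_ACCESS"
    return "NFS4_OK"


def destination_check_copy(privs, user_id, ca_source_server):
    """Destination-side checks for a COPY on a copy_to_auth handle."""
    sources = privs.get(("copy_to_auth", user_id))
    if sources is None:
        return "NFS4ERR_PARTNER_NO_AUTH"
    # The source list in the request must match ctap_source exactly.
    if list(sources) != list(ca_source_server):
        return "NFS4ERR_ACCESS"
    return "NFS4_OK"


source_privs = {("copy_from_auth", "alice"): "dst.example.net"}
dest_privs = {("copy_to_auth", "alice"): ["src.example.net"]}
```

   If either check returns anything other than NFS4_OK, the client is
   expected to destroy both copy-related handles, as stated above.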

4.9.1.1.3.  Securing ONC RPC Server-to-Server Copy Protocols

   After a destination server has a copy_to_auth privilege established
   on it and it receives a COPY request, if it knows it will use an ONC
   RPC protocol to copy data, it will establish a copy_confirm_auth
   privilege on the source server prior to responding to the COPY
   operation, as follows:

   o  Before establishing an RPCSEC_GSSv3 context, a parent context
      needs to exist between nfs@<destination> as the initiator
      principal and nfs@<source> as the target principal.  If NFS is to
      be used as the copy protocol, this means that the destination
      server must mount the source server using RPCSEC_GSSv3.

   o  An instance of copy_confirm_auth_priv is filled in with
      information from the established copy_to_auth privilege.  The
      value of the ccap_shared_secret_mic field is a GSS_GetMIC() of the
      ctap_shared_secret in the copy_to_auth privilege using the parent
      handle context.  The ccap_username field is the mapping of the
      user principal to an NFSv4 user name ("user"@"domain" form) and
      MUST be the same as the ctap_username in the copy_to_auth
      privilege.  The copy_confirm_auth_priv instance is placed in
      rpc_gss3_create_args assertions[0].privs.privilege.  The string
      "copy_confirm_auth" is placed in assertions[0].privs.name.

   o  The RPCSEC_GSS3_CREATE copy_confirm_auth message is sent to the
      source server with a QoP of rpc_gss_svc_privacy.  The source
      server unwraps the rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload
      and verifies the ccap_shared_secret_mic by calling GSS_VerifyMIC()
      using the parent context on the cfap_shared_secret from the
      established copy_from_auth privilege, and verifies that the
      ccap_username equals the cfap_username.

   o  If all verifications succeed, the copy_confirm_auth privilege is
      established on the source server as <copy_confirm_auth,
      shared_secret_mic, user id>.  Because the shared secret has been
      verified, the resultant copy_confirm_auth RPCSEC_GSSv3 child
      handle is noted to be acting on behalf of the user principal.

   o  If the source server fails to verify the copy_from_auth privilege,
      the COPY_NOTIFY operation will be rejected with
      NFS4ERR_PARTNER_NO_AUTH.

   o  If the destination server fails to verify the copy_to_auth or
      copy_confirm_auth privilege, the COPY will be rejected with
      NFS4ERR_PARTNER_NO_AUTH, causing the client to destroy the
      associated copy_from_auth and copy_to_auth RPCSEC_GSSv3 structured
      privilege assertion handles.

   o  All subsequent ONC RPC READ requests sent from the destination to
      copy data from the source to the destination will use the
      RPCSEC_GSSv3 copy_confirm_auth child handle.
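
   The shared-secret confirmation in the steps above can be illustrated
   with the following sketch.  Note the substitution: HMAC-SHA256 keyed
   with a hypothetical per-context key is used here purely as a
   stand-in for GSS_GetMIC() and GSS_VerifyMIC(), because the actual
   MIC depends on the GSS-API mechanism of the parent context.  All
   function names are hypothetical, and the privileges are modeled as
   dictionaries.

```python
import hashlib
import hmac


def get_mic(context_key: bytes, message: bytes) -> bytes:
    """Stand-in for GSS_GetMIC(); NOT the real GSS-API computation."""
    return hmac.new(context_key, message, hashlib.sha256).digest()


def verify_mic(context_key: bytes, message: bytes, mic: bytes) -> bool:
    """Stand-in for GSS_VerifyMIC(), using a constant-time comparison."""
    return hmac.compare_digest(get_mic(context_key, message), mic)


def build_copy_confirm_auth(parent_key, copy_to_auth_priv):
    """Destination: derive copy_confirm_auth_priv from copy_to_auth."""
    return {
        "ccap_shared_secret_mic": get_mic(
            parent_key, copy_to_auth_priv["ctap_shared_secret"]),
        "ccap_username": copy_to_auth_priv["ctap_username"],
    }


def source_accepts(parent_key, copy_from_auth_priv, confirm_priv):
    """Source: check the MIC against its own copy of the secret."""
    return (
        verify_mic(parent_key,
                   copy_from_auth_priv["cfap_shared_secret"],
                   confirm_priv["ccap_shared_secret_mic"])
        and confirm_priv["ccap_username"]
        == copy_from_auth_priv["cfap_username"]
    )


parent_key = b"k" * 32  # hypothetical key of the server-to-server context
from_priv = {"cfap_shared_secret": b"s3cret",
             "cfap_username": "alice@example.net"}
to_priv = {"ctap_shared_secret": b"s3cret",
           "ctap_username": "alice@example.net"}
confirm = build_copy_confirm_auth(parent_key, to_priv)
```

   The secret itself never crosses the server-to-server connection;
   only a MIC over it does, and the source can verify that MIC only
   because the client gave it the same secret in copy_from_auth.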

   Note that the use of the copy_confirm_auth privilege accomplishes the
   following:

   o  If a protocol like NFS is being used with export policies, the
      export policies can be overridden if the destination server is not
      authorized to act as an NFS client.

   o  Manual configuration to allow a copy relationship between the
      source and destination is not needed.

4.9.1.1.4.  Maintaining a Secure Inter-Server Copy

   If the client determines that either the copy_from_auth or the
   copy_to_auth handle becomes invalid during a copy, then the copy MUST
   be aborted by the client sending an OFFLOAD_CANCEL to both the source
   and destination servers and destroying the respective copy-related
   context handles as described in Section 4.9.1.1.5.

4.9.1.1.5.  Finishing or Stopping a Secure Inter-Server Copy

   Under normal operation, the client MUST destroy the copy_from_auth
   and the copy_to_auth RPCSEC_GSSv3 handles once the COPY operation
   returns for a synchronous inter-server copy or a CB_OFFLOAD reports
   the result of an asynchronous copy.

   The copy_confirm_auth privilege is constructed from information held
   by the copy_to_auth privilege and MUST be destroyed by the
   destination server (via an RPCSEC_GSS3_DESTROY call) when the
   copy_to_auth RPCSEC_GSSv3 handle is destroyed.

   The copy_confirm_auth RPCSEC_GSSv3 handle is associated with a
   copy_from_auth RPCSEC_GSSv3 handle on the source server via the
   shared secret and MUST be locally destroyed (there is no
   RPCSEC_GSS3_DESTROY, as the source server is not the initiator) when
   the copy_from_auth RPCSEC_GSSv3 handle is destroyed.

   If the client sends an OFFLOAD_CANCEL to the source server to rescind
   the destination server's synchronous copy privilege, it uses the
   privileged copy_from_auth RPCSEC_GSSv3 handle, and the
   cra_destination_server in the OFFLOAD_CANCEL MUST be the same as the
   name of the destination server specified in copy_from_auth_priv.  The
   source server will then delete the <copy_from_auth, user id,
   destination> privilege and fail any subsequent copy requests sent
   under the auspices of this privilege from the destination server.
   The client MUST destroy both the copy_from_auth and the copy_to_auth
   RPCSEC_GSSv3 handles.

   If the client sends an OFFLOAD_STATUS to the destination server to
   check on the status of an asynchronous copy, it uses the privileged
   copy_to_auth RPCSEC_GSSv3 handle, and the osa_stateid in the
   OFFLOAD_STATUS MUST be the same as the wr_callback_id specified in
   the copy_to_auth privilege stored on the destination server.

   If the client sends an OFFLOAD_CANCEL to the destination server to
   cancel an asynchronous copy, it uses the privileged copy_to_auth
   RPCSEC_GSSv3 handle, and the oca_stateid in the OFFLOAD_CANCEL MUST
   be the same as the wr_callback_id specified in the copy_to_auth
   privilege stored on the destination server.  The destination server
   will then delete the <copy_to_auth, user id, source list> privilege
   and the associated copy_confirm_auth RPCSEC_GSSv3 handle.  The client
   MUST destroy both the copy_to_auth and copy_from_auth RPCSEC_GSSv3
   handles.
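
   The teardown dependencies described in this section can be modeled
   with the following sketch.  It is a simplified illustration, not a
   protocol implementation: each server's state is modeled as a set of
   privilege names, the function names are hypothetical, and the
   returned strings merely describe which party acts.

```python
def destroy_copy_to_auth(state):
    """Client destroys copy_to_auth; the destination, as initiator,
    then destroys copy_confirm_auth via RPCSEC_GSS3_DESTROY."""
    actions = []
    if "copy_to_auth" in state["destination"]:
        state["destination"].remove("copy_to_auth")
        actions.append("client destroys copy_to_auth")
        if "copy_confirm_auth" in state["destination"]:
            state["destination"].remove("copy_confirm_auth")
            # The destroy call also removes the handle at the source.
            state["source"].discard("copy_confirm_auth")
            actions.append("destination sends RPCSEC_GSS3_DESTROY")
    return actions


def destroy_copy_from_auth(state):
    """Client destroys copy_from_auth; the source, which is not the
    initiator, destroys any remaining copy_confirm_auth locally."""
    actions = []
    if "copy_from_auth" in state["source"]:
        state["source"].remove("copy_from_auth")
        actions.append("client destroys copy_from_auth")
        if "copy_confirm_auth" in state["source"]:
            state["source"].remove("copy_confirm_auth")
            actions.append("source destroys copy_confirm_auth locally")
    return actions


def new_state():
    """Privileges held on each server while a copy is in progress."""
    return {
        "source": {"copy_from_auth", "copy_confirm_auth"},
        "destination": {"copy_to_auth", "copy_confirm_auth"},
    }
```

   Whichever handle the client destroys first, no copy_confirm_auth
   handle remains on either server once both client handles are gone.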

4.9.1.2.  Inter-Server Copy via ONC RPC without RPCSEC_GSS

   ONC RPC security flavors other than RPCSEC_GSS MAY be used with the
   server-side copy offload operations described in this section.  In
   particular, host-based ONC RPC security flavors such as AUTH_NONE and
   AUTH_SYS MAY be used.  If a host-based security flavor is used, a
   minimal level of protection for the server-to-server copy protocol is
   possible.

   The biggest issue is the lack of a strong security method that
   allows the source server and destination server to identify
   themselves to each other.  A further complication is that in a
   multihomed environment the destination server might not contact the
   source server from the same network address specified by the client
   in the COPY_NOTIFY.  The cnr_stateid returned from the COPY_NOTIFY
   can be used to uniquely identify the destination server to the source
   server.  The use of the cnr_stateid provides initial authentication
   of the destination server but cannot defend against man-in-the-middle
   attacks after authentication or against an eavesdropper that observes
   the opaque stateid on the wire.  Other secure communication
   techniques (e.g., IPsec) are necessary to block these attacks.

   Servers SHOULD reject COPY_NOTIFY requests that do not use RPCSEC_GSS
   with privacy, thus ensuring that the cnr_stateid in the COPY_NOTIFY
   reply is encrypted.  For the same reason, clients SHOULD send COPY
   requests to the destination using RPCSEC_GSS with privacy.

5.  Support for Application I/O Hints

   Applications can issue client I/O hints via posix_fadvise()
   [posix_fadvise] to the NFS client.  While this can help the NFS
   client optimize I/O and caching for a file, it does not allow the NFS
   server and its exported file system to do likewise.  The IO_ADVISE
   procedure (Section 15.5) is used to communicate the client file
   access patterns to the NFS server.  The NFS server, upon receiving an
   IO_ADVISE operation, MAY choose to alter its I/O and caching behavior
   but is under no obligation to do so.

   Application-specific NFS clients such as those used by hypervisors
   and databases can also leverage application hints to communicate
   their specialized requirements.

6.  Sparse Files

   A sparse file is a common way of representing a large file without
   having to utilize all of the disk space for it.  Consequently, a
   sparse file uses less physical space than its size indicates.  This
   means the file contains "holes", byte ranges within the file that
   contain no data.  Most modern file systems support sparse files,
   including most UNIX file systems and Microsoft's New Technology File
   System (NTFS); however, it should be noted that Apple's Hierarchical
   File System Plus (HFS+) does not.  Common examples of sparse files
   include Virtual Machine (VM) OS/disk images, database files, log
   files, and even checkpoint recovery files most commonly used by the
   High-Performance Computing (HPC) community.

   In addition, many modern file systems support the concept of
   "unwritten" or "uninitialized" blocks, which have uninitialized space
   allocated to them on disk but will return zeros until data is written
   to them.  Such functionality is already present in the data model of
   the pNFS block/volume layout (see [RFC5663]).  Uninitialized blocks
   can be thought of as holes inside a space reservation window.

   If an application reads a hole in a sparse file, the file system must
   return all zeros to the application.  For local data access there is
   little penalty, but with NFS these zeros must be transferred back to
   the client.  If an application uses the NFS client to read data into
   memory, this wastes time and bandwidth as the application waits for
   the zeros to be transferred.

   A sparse file is typically created by initializing the file to be all
   zeros.  Nothing is written to the data in the file; instead, the hole
   is recorded in the metadata for the file.  So, an 8G disk image might
   be represented initially by a few hundred bits in the metadata (on
   UNIX file systems, the inode) and nothing on the disk.  If the VM
   then writes 100M to a file in the middle of the image, there would
   now be two holes represented in the metadata and 100M in the data.
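
   The arithmetic of this example can be checked with a short sketch.
   It assumes that "8G" and "100M" denote 8 GiB and 100 MiB and that
   the write starts exactly at the midpoint of the image; both are
   assumptions made for illustration.  The function name holes is
   hypothetical.

```python
GIB = 1 << 30
MIB = 1 << 20


def holes(file_size, data_extents):
    """Return the (offset, length) holes left between data extents."""
    result = []
    cursor = 0
    for offset, length in sorted(data_extents):
        if offset > cursor:
            # Gap between the previous extent and this one is a hole.
            result.append((cursor, offset - cursor))
        cursor = max(cursor, offset + length)
    if cursor < file_size:
        # Trailing hole up to the end of the file.
        result.append((cursor, file_size - cursor))
    return result


image_size = 8 * GIB
before_write = holes(image_size, [])
after_write = holes(image_size, [(4 * GIB, 100 * MIB)])
```

   Before the write, the whole image is a single hole.  After the
   write, one hole precedes the data and one follows it, and together
   they cover everything except the 100 MiB that was written.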

   No new operation is needed to allow the creation of a sparsely
   populated file; when a file is created and a write occurs past the
   current size of the file, the non-allocated region will either be a
   hole or be filled with zeros.  The choice of behavior is dictated by
   the underlying file system and is transparent to the application.
   However, the abilities to read sparse files and to punch holes to
   reinitialize the contents of a file are needed.
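
   The behavior described above can be observed with ordinary file
   operations, as in the following sketch.  It runs against a local
   temporary file rather than an NFS mount, and the function name is
   hypothetical.  Whether the skipped region is stored as a hole or as
   zero-filled blocks depends on the underlying file system, so the
   sketch checks only what the application can observe: the file size
   and the bytes read back.

```python
import os
import tempfile


def write_past_eof(offset, payload):
    """Create a file, write payload at offset, and read the gap back."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "sparse.img")
        with open(path, "wb") as f:
            f.seek(offset)      # move past the current end of file
            f.write(payload)    # the skipped region is never written
        size = os.path.getsize(path)
        with open(path, "rb") as f:
            gap = f.read(offset)
    return size, gap


size, gap = write_past_eof(1 << 20, b"data")
```

   The region before the write reads back as zeros in either case,
   which is why the choice is transparent to the application.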

   Two new operations -- DEALLOCATE (Section 15.4) and READ_PLUS
   (Section 15.10) -- are introduced.  DEALLOCATE allows for the hole
   punching, where an application might want to reset the allocation and
   reservation status of a range of the file.  READ_PLUS supports all
   the features of READ but includes an extension to support sparse
   files.  READ_PLUS is guaranteed to perform no worse than READ and can
   dramatically improve performance with sparse files.  READ_PLUS does
   not depend on pNFS protocol features but can be used by pNFS to
   support sparse files.

6.1.  Terminology

   Regular file:  An object of file type NF4REG or NF4NAMEDATTR.

   Sparse file:  A regular file that contains one or more holes.

   Hole:  A byte range within a sparse file that contains all zeros.  A
      hole might or might not have space allocated or reserved to it.

6.2.  New Operations

6.2.1.  READ_PLUS

   READ_PLUS is a new variant of the NFSv4.1 READ operation [RFC5661].
   Besides being able to support all of the data semantics of the READ
   operation, it can also be used by the client and server to
   efficiently transfer holes.  Because the client does not know in
   advance whether a hole is present or not, if the client supports
   READ_PLUS and so does the server, then it should always use the
   READ_PLUS operation in preference to the READ operation.

   READ_PLUS extends the response with a new arm representing holes to
   avoid returning data for portions of the file that are initialized to
   zero and may or may not contain a backing store.  Returning actual
   data blocks corresponding to holes wastes computational and network
   resources, thus reducing performance.

   When a client sends a READ operation, it is not prepared to accept a
   READ_PLUS-style response providing a compact encoding of the scope of
   holes.  If a READ occurs on a sparse file, then the server must
   expand such data to be raw bytes.  If a READ occurs in the middle of
   a hole, the server can only send back bytes starting from that
   offset.  By contrast, if a READ_PLUS occurs in the middle of a hole,
   the server can send back a range that starts before the offset and
   extends past the requested length.
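
   The difference can be illustrated with the following sketch, which
   models a request that falls entirely inside a single hole.  The
   function names and the dictionary used for the reply are
   hypothetical and do not reproduce the XDR of either operation.  A
   server is permitted, not required, to describe the whole hole in
   its READ_PLUS reply.

```python
def read_reply(offset, count):
    """READ inside a hole: the hole is expanded to raw zero bytes,
    starting at the requested offset."""
    return bytes(count)


def read_plus_reply(hole, offset, count):
    """READ_PLUS inside a hole: describe the hole instead of sending
    zeros.  The described range may begin before the requested offset
    and end after the requested range."""
    hole_offset, hole_length = hole
    if not (hole_offset <= offset < hole_offset + hole_length):
        raise ValueError("offset is not inside the hole")
    return {"content": "hole", "offset": hole_offset,
            "length": hole_length}


hole = (0, 1 << 20)             # a 1 MiB hole at the start of the file
zeros = read_reply(4096, 8192)
described = read_plus_reply(hole, 4096, 8192)
```

   In this example, READ transfers 8192 zero bytes, whereas READ_PLUS
   transfers only an offset and a length that together cover the
   entire hole.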

6.2.2.  DEALLOCATE

   The client can use the DEALLOCATE operation on a range of a file as a
   hole punch, which allows the client to avoid the transfer of a
   repetitive pattern of zeros across the network.  This hole punch is a
   result of the unreserved space returning all zeros until overwritten.