Tech-invite3GPPspaceIETFspace
9796959493929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 8166

Remote Direct Memory Access Transport for Remote Procedure Call Version 1

Pages: 55
Proposed Standard
Errata
Obsoletes:  5666
Updated by:  8797
Part 3 of 3 – Pages 44 to 55
First   Prev   None

Top   ToC   RFC8166 - Page 44   prevText

7. Protocol Extensibility

The RPC-over-RDMA header format is specified using XDR, unlike the message header used with RPC-over-TCP. To maintain a high degree of interoperability among implementations of RPC-over-RDMA, any change to this XDR requires a protocol version number change. New versions of RPC-over-RDMA may be published as separate protocol specifications without updating this document. The first four fields in every RPC-over-RDMA header must remain aligned at the same fixed offsets for all versions of the RPC-over- RDMA protocol. The version number must be in a fixed place to enable implementations to detect protocol version mismatches. For version mismatches to be reported in a fashion that all future version implementations can reliably decode, the rdma_proc field must remain in a fixed place, the value of ERR_VERS must always remain the same, and the field placement in struct rpc_rdma_errvers must always remain the same.

7.1. Conventional Extensions

Introducing new capabilities to RPC-over-RDMA version 1 is limited to the adoption of conventions that make use of existing XDR (defined in this document) and allowed abstract RDMA operations. Because no mechanism for detecting optional features exists in RPC-over-RDMA version 1, implementations must rely on ULPs to communicate the existence of such extensions. Such extensions must be specified in a Standards Track RFC with appropriate review by the NFSv4 Working Group and the IESG. An example of a conventional extension to RPC-over-RDMA version 1 is the specification of backward direction message support to enable NFSv4.1 callback operations, described in [RFC8167].

8. Security Considerations

8.1. Memory Protection

A primary consideration is the protection of the integrity and confidentiality of local memory by an RPC-over-RDMA transport. The use of an RPC-over-RDMA transport protocol MUST NOT introduce vulnerabilities to system memory contents nor to memory owned by user processes.
Top   ToC   RFC8166 - Page 45
   It is REQUIRED that any RDMA provider used for RPC transport be
   conformant to the requirements of [RFC5042] in order to satisfy these
   protections.  These protections are provided by the RDMA layer
   specifications, and in particular, their security models.

8.1.1. Protection Domains

The use of Protection Domains to limit the exposure of memory regions to a single connection is critical. Any attempt by an endpoint not participating in that connection to reuse memory handles needs to result in immediate failure of that connection. Because ULP security mechanisms rely on this aspect of Reliable Connection behavior, strong authentication of remote endpoints is recommended.

8.1.2. Handle Predictability

Unpredictable memory handles should be used for any operation requiring advertised memory regions. Advertising a continuously registered memory region allows a remote host to read or write to that region even when an RPC involving that memory is not under way. Therefore, implementations should avoid advertising persistently registered memory.

8.1.3. Memory Protection

Requesters should register memory regions for remote access only when they are about to be the target of an RPC operation that involves an RDMA Read or Write. Registered memory regions should be invalidated as soon as related RPC operations are complete. Invalidation and DMA unmapping of memory regions should be complete before message integrity checking is done and before the RPC consumer is allowed to continue execution and use or alter the contents of a memory region. An RPC transaction on a Requester might be terminated before a reply arrives if the RPC consumer exits unexpectedly (for example, it is signaled or a segmentation fault occurs). When an RPC terminates abnormally, memory regions associated with that RPC should be invalidated appropriately before the regions are released to be reused for other purposes on the Requester.

8.1.4. Denial of Service

A detailed discussion of denial-of-service exposures that can result from the use of an RDMA transport is found in Section 6.4 of [RFC5042].
Top   ToC   RFC8166 - Page 46
   A Responder is not obliged to pull Read chunks that are unreasonably
   large.  The Responder can use an RDMA_ERROR response to terminate
   RPCs with unreadable Read chunks.  If a Responder transmits more data
   than a Requester is prepared to receive in a Write or Reply chunk,
   the RDMA Network Interface Cards (RNICs) typically terminate the
   connection.  For further discussion, see Section 4.5.  Such repeated
   chunk errors can deny service to other users sharing the connection
   from the errant Requester.

   An RPC-over-RDMA transport implementation is not responsible for
   throttling the RPC request rate, other than to keep the number of
   concurrent RPC transactions at or under the number of credits granted
   per connection.  This is explained in Section 3.3.1.  A sender can
   trigger a self denial of service by exceeding the credit grant
   repeatedly.

   When an RPC has been canceled due to a signal or premature exit of an
   application process, a Requester may invalidate the RPC's Write and
   Reply chunks.  Invalidation prevents the subsequent arrival of the
   Responder's reply from altering the memory regions associated with
   those chunks after the memory has been reused.

   On the Requester, a malfunctioning application or a malicious user
   can create a situation where RPCs are continuously initiated and then
   aborted, resulting in Responder replies that terminate the underlying
   RPC-over-RDMA connection repeatedly.  Such situations can deny
   service to other users sharing the connection from that Requester.

8.2. RPC Message Security

ONC RPC provides cryptographic security via the RPCSEC_GSS framework [RFC7861]. RPCSEC_GSS implements message authentication (rpc_gss_svc_none), per-message integrity checking (rpc_gss_svc_integrity), and per-message confidentiality (rpc_gss_svc_privacy) in the layer above RPC-over-RDMA. The latter two services require significant computation and movement of data on each endpoint host. Some performance benefits enabled by RDMA transports can be lost.

8.2.1. RPC-over-RDMA Protection at Lower Layers

For any RPC transport, utilizing RPCSEC_GSS integrity or privacy services has performance implications. Protection below the RPC transport is often more appropriate in performance-sensitive deployments, especially if it, too, can be offloaded. Certain configurations of IPsec can be co-located in RDMA hardware, for example, without change to RDMA consumers and little loss of data
Top   ToC   RFC8166 - Page 47
   movement efficiency.  Such arrangements can also provide a higher
   degree of privacy by hiding endpoint identity or altering the
   frequency at which messages are exchanged, at a performance cost.

   The use of protection in a lower layer MAY be negotiated through the
   use of an RPCSEC_GSS security flavor defined in [RFC7861] in
   conjunction with the Channel Binding mechanism [RFC5056] and IPsec
   Channel Connection Latching [RFC5660].  Use of such mechanisms is
   REQUIRED where integrity or confidentiality is desired and where
   efficiency is required.

8.2.2. RPCSEC_GSS on RPC-over-RDMA Transports

Not all RDMA devices and fabrics support the above protection mechanisms. Also, per-message authentication is still required on NFS clients where multiple users access NFS files. In these cases, RPCSEC_GSS can protect NFS traffic conveyed on RPC-over-RDMA connections. RPCSEC_GSS extends the ONC RPC protocol [RFC5531] without changing the format of RPC messages. By observing the conventions described in this section, an RPC-over-RDMA transport can convey RPCSEC_GSS- protected RPC messages interoperably. As part of the ONC RPC protocol, protocol elements of RPCSEC_GSS that appear in the Payload stream of an RPC-over-RDMA message (such as control messages exchanged as part of establishing or destroying a security context or data items that are part of RPCSEC_GSS authentication material) MUST NOT be reduced.
8.2.2.1. RPCSEC_GSS Context Negotiation
Some NFS client implementations use a separate connection to establish a Generic Security Service (GSS) context for NFS operation. These clients use TCP and the standard NFS port (2049) for context establishment. To enable the use of RPCSEC_GSS with NFS/RDMA, an NFS server MUST also provide a TCP-based NFS service on port 2049.
8.2.2.2. RPC-over-RDMA with RPCSEC_GSS Authentication
The RPCSEC_GSS authentication service has no impact on the DDP- eligibility of data items in a ULP. However, RPCSEC_GSS authentication material appearing in an RPC message header can be larger than, say, an AUTH_SYS authenticator. In particular, when an RPCSEC_GSS pseudoflavor is in use, a Requester
Top   ToC   RFC8166 - Page 48
   needs to accommodate a larger RPC credential when marshaling RPC Call
   messages and needs to provide for a maximum size RPCSEC_GSS verifier
   when allocating reply buffers and Reply chunks.

   RPC messages, and thus Payload streams, are made larger as a result.
   ULP operations that fit in a Short Message when a simpler form of
   authentication is in use might need to be reduced, or conveyed via a
   Long Message, when RPCSEC_GSS authentication is in use.  It is more
   likely that a Requester provides both a Read list and a Reply chunk
   in the same RPC-over-RDMA header to convey a Long Call and provision
   a receptacle for a Long Reply.  More frequent use of Long Messages
   can impact transport efficiency.

8.2.2.3. RPC-over-RDMA with RPCSEC_GSS Integrity or Privacy
The RPCSEC_GSS integrity service enables endpoints to detect modification of RPC messages in flight. The RPCSEC_GSS privacy service prevents all but the intended recipient from viewing the cleartext content of RPC arguments and results. RPCSEC_GSS integrity and privacy services are end-to-end. They protect RPC arguments and results from application to server endpoint, and back. The RPCSEC_GSS integrity and encryption services operate on whole RPC messages after they have been XDR encoded for transmit, and before they have been XDR decoded after receipt. Both sender and receiver endpoints use intermediate buffers to prevent exposure of encrypted data or unverified cleartext data to RPC consumers. After verification, encryption, and message wrapping has been performed, the transport layer MAY use RDMA data transfer between these intermediate buffers. The process of reducing a DDP-eligible data item removes the data item and its XDR padding from the encoded XDR stream. XDR padding of a reduced data item is not transferred in an RPC-over-RDMA message. After reduction, the Payload stream contains fewer octets than the whole XDR stream did beforehand. XDR padding octets are often zero bytes, but they don't have to be. Thus, reducing DDP-eligible items affects the result of message integrity verification or encryption. Therefore, a sender MUST NOT reduce a Payload stream when RPCSEC_GSS integrity or encryption services are in use. Effectively, no data item is DDP-eligible in this situation, and Chunked Messages cannot be used. In this mode, an RPC-over-RDMA transport operates in the same manner as a transport that does not support DDP.
Top   ToC   RFC8166 - Page 49
   When an RPCSEC_GSS integrity or privacy service is in use, a
   Requester provides both a Read list and a Reply chunk in the same
   RPC-over-RDMA header to convey a Long Call and provision a receptacle
   for a Long Reply.

8.2.2.4. Protecting RPC-over-RDMA Transport Headers
Like the base fields in an ONC RPC message (XID, call direction, and so on), the contents of an RPC-over-RDMA message's Transport stream are not protected by RPCSEC_GSS. This exposes XIDs, connection credit limits, and chunk lists (but not the content of the data items they refer to) to malicious behavior, which could redirect data that is transferred by the RPC-over-RDMA message, result in spurious retransmits, or trigger connection loss. In particular, if an attacker alters the information contained in the chunk lists of an RPC-over-RDMA header, data contained in those chunks can be redirected to other registered memory regions on Requesters. An attacker might alter the arguments of RDMA Read and RDMA Write operations on the wire to similar effect. If such alterations occur, the use of RPCSEC_GSS integrity or privacy services enable a Requester to detect unexpected material in a received RPC message. Encryption at lower layers, as described in Section 8.2.1, protects the content of the Transport stream. To address attacks on RDMA protocols themselves, RDMA transport implementations should conform to [RFC5042].

9. IANA Considerations

A set of RPC netids for resolving RPC-over-RDMA services is specified by this document. This is unchanged from [RFC5666]. The RPC-over-RDMA transport has been assigned an RPC netid, which is an rpcbind [RFC1833] string used to describe the underlying protocol in order for RPC to select the appropriate transport framing, as well as the format of the service addresses and ports. The following netid registry strings are defined for this purpose: NC_RDMA "rdma" NC_RDMA6 "rdma6" The "rdma" netid is to be used when IPv4 addressing is employed by the underlying transport, and "rdma6" for IPv6 addressing. The netid assignment policy and registry are defined in [RFC5665].
Top   ToC   RFC8166 - Page 50
   These netids MAY be used for any RDMA network that satisfies the
   requirements of Section 2.3.2 and that is able to identify service
   endpoints using IP port addressing, possibly through use of a
   translation service as described in Section 5.

   The use of the RPC-over-RDMA protocol has no effect on RPC Program
   numbers or existing registered port numbers.  However, new port
   numbers MAY be registered for use by RPC-over-RDMA-enabled services,
   as appropriate to the new networks over which the services will
   operate.

   For example, the NFS/RDMA service defined in [RFC5667] has been
   assigned the port 20049 in the "Service Name and Transport Protocol
   Port Number Registry".  This is distinct from the port number defined
   for NFS on TCP, which is assigned the port 2049 in the same registry.
   NFS clients use the same RPC Program number for NFS (100003) when
   using either transport [RFC5531] (see the "Remote Procedure Call
   (RPC) Program Numbers" registry).

10. References

10.1. Normative References

[RFC1833] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", RFC 1833, DOI 10.17487/RFC1833, August 1995, <http://www.rfc-editor.org/info/rfc1833>. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>. [RFC4506] Eisler, M., Ed., "XDR: External Data Representation Standard", STD 67, RFC 4506, DOI 10.17487/RFC4506, May 2006, <http://www.rfc-editor.org/info/rfc4506>. [RFC5042] Pinkerton, J. and E. Deleganes, "Direct Data Placement Protocol (DDP) / Remote Direct Memory Access Protocol (RDMAP) Security", RFC 5042, DOI 10.17487/RFC5042, October 2007, <http://www.rfc-editor.org/info/rfc5042>. [RFC5056] Williams, N., "On the Use of Channel Bindings to Secure Channels", RFC 5056, DOI 10.17487/RFC5056, November 2007, <http://www.rfc-editor.org/info/rfc5056>. [RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol Specification Version 2", RFC 5531, DOI 10.17487/RFC5531, May 2009, <http://www.rfc-editor.org/info/rfc5531>.
Top   ToC   RFC8166 - Page 51
   [RFC5660]  Williams, N., "IPsec Channels: Connection Latching",
              RFC 5660, DOI 10.17487/RFC5660, October 2009,
              <http://www.rfc-editor.org/info/rfc5660>.

   [RFC5665]  Eisler, M., "IANA Considerations for Remote Procedure Call
              (RPC) Network Identifiers and Universal Address Formats",
              RFC 5665, DOI 10.17487/RFC5665, January 2010,
              <http://www.rfc-editor.org/info/rfc5665>.

   [RFC7861]  Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
              Security Version 3", RFC 7861, DOI 10.17487/RFC7861,
              November 2016, <http://www.rfc-editor.org/info/rfc7861>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <http://www.rfc-editor.org/info/rfc8174>.

10.2. Informative References

[IBARCH] InfiniBand Trade Association, "InfiniBand Architecture Specification Volume 1", Release 1.3, March 2015, <http://www.infinibandta.org/content/ pages.php?pg=technology_download>. [RFC768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 10.17487/RFC0768, August 1980, <http://www.rfc-editor.org/info/rfc768>. [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, DOI 10.17487/RFC0793, September 1981, <http://www.rfc-editor.org/info/rfc793>. [RFC1094] Nowicki, B., "NFS: Network File System Protocol specification", RFC 1094, DOI 10.17487/RFC1094, March 1989, <http://www.rfc-editor.org/info/rfc1094>. [RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3 Protocol Specification", RFC 1813, DOI 10.17487/RFC1813, June 1995, <http://www.rfc-editor.org/info/rfc1813>. [RFC5040] Recio, R., Metzler, B., Culley, P., Hilland, J., and D. Garcia, "A Remote Direct Memory Access Protocol Specification", RFC 5040, DOI 10.17487/RFC5040, October 2007, <http://www.rfc-editor.org/info/rfc5040>.
Top   ToC   RFC8166 - Page 52
   [RFC5041]  Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct
              Data Placement over Reliable Transports", RFC 5041,
              DOI 10.17487/RFC5041, October 2007,
              <http://www.rfc-editor.org/info/rfc5041>.

   [RFC5532]  Talpey, T. and C. Juszczak, "Network File System (NFS)
              Remote Direct Memory Access (RDMA) Problem Statement",
              RFC 5532, DOI 10.17487/RFC5532, May 2009,
              <http://www.rfc-editor.org/info/rfc5532>.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010,
              <http://www.rfc-editor.org/info/rfc5661>.

   [RFC5662]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              External Data Representation Standard (XDR) Description",
              RFC 5662, DOI 10.17487/RFC5662, January 2010,
              <http://www.rfc-editor.org/info/rfc5662>.

   [RFC5666]  Talpey, T. and B. Callaghan, "Remote Direct Memory Access
              Transport for Remote Procedure Call", RFC 5666,
              DOI 10.17487/RFC5666, January 2010,
              <http://www.rfc-editor.org/info/rfc5666>.

   [RFC5667]  Talpey, T. and B. Callaghan, "Network File System (NFS)
              Direct Data Placement", RFC 5667, DOI 10.17487/RFC5667,
              January 2010, <http://www.rfc-editor.org/info/rfc5667>.

   [RFC7530]  Haynes, T., Ed. and D. Noveck, Ed., "Network File System
              (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
              March 2015, <http://www.rfc-editor.org/info/rfc7530>.

   [RFC8167]  Lever, C., "Bidirectional Remote Procedure Call on RPC-
              over-RDMA Transports", RFC 8167, DOI 10.17487/RFC8167,
              June 2017, <http://www.rfc-editor.org/info/rfc8167>.
Top   ToC   RFC8166 - Page 53

Appendix A. Changes from RFC 5666

A.1. Changes to the Specification

The following alterations have been made to the RPC-over-RDMA version 1 specification. The section numbers below refer to [RFC5666]. o Section 2 has been expanded to introduce and explain key RPC [RFC5531], XDR [RFC4506], and RDMA [RFC5040] terminology. These terms are now used consistently throughout the specification. o Section 3 has been reorganized and split into subsections to help readers locate specific requirements and definitions. o Sections 4 and 5 have been combined to improve the organization of this information. o The optional Connection Configuration Protocol has never been implemented. The specification of CCP has been deleted from this specification. o A section consolidating requirements for ULBs has been added. o An XDR extraction mechanism is provided, along with full copyright, matching the approach used in [RFC5662]. o The "Security Considerations" section has been expanded to include a discussion of how RPC-over-RDMA security depends on features of the underlying RDMA transport. o A subsection describing the use of RPCSEC_GSS [RFC7861] with RPC- over-RDMA version 1 has been added.

A.2. Changes to the Protocol

Although the protocol described herein interoperates with existing implementations of [RFC5666], the following changes have been made relative to the protocol described in that document: o Support for the Read-Read transfer model has been removed. Read- Read is a slower transfer model than Read-Write. As a result, implementers have chosen not to support it. Removal of Read-Read simplifies explanatory text, and the RDMA_DONE procedure is no longer part of the protocol.
Top   ToC   RFC8166 - Page 54
   o  The specification of RDMA_MSGP in [RFC5666] is not adequate,
      although some incomplete implementations exist.  Even if an
      adequate specification were provided and an implementation were
      produced, benefit for protocols such as NFSv4.0 [RFC7530] is
      doubtful.  Therefore, the RDMA_MSGP message type is no longer
      supported.

   o  Technical issues with regard to handling RPC-over-RDMA header
      errors have been corrected.

   o  Specific requirements related to implicit XDR roundup and complex
      XDR data types have been added.

   o  Explicit guidance is provided related to sizing Write chunks,
      managing multiple chunks in the Write list, and handling unused
      Write chunks.

   o  Clear guidance about Send and Receive buffer sizes has been
      introduced.  This enables better decisions about when a Reply
      chunk must be provided.

Acknowledgments

The editor gratefully acknowledges the work of Brent Callaghan and Tom Talpey on the original RPC-over-RDMA Version 1 specification [RFC5666]. Dave Noveck provided excellent review, constructive suggestions, and consistent navigational guidance throughout the process of drafting this document. Dave also contributed much of the organization and content of Section 7 and helped the authors understand the complexities of XDR extensibility. The comments and contributions of Karen Deitke, Dai Ngo, Chunli Zhang, Dominique Martinet, and Mahesh Siddheshwar are accepted with great thanks. The editor also wishes to thank Bill Baker, Greg Marsden, and Matt Benjamin for their support of this work. The extract.sh shell script and formatting conventions were first described by the authors of the NFSv4.1 XDR specification [RFC5662]. Special thanks go to Transport Area Director Spencer Dawkins, NFSV4 Working Group Chair and Document Shepherd Spencer Shepler, and NFSV4 Working Group Secretary Thomas Haynes for their support.
Top   ToC   RFC8166 - Page 55

Authors' Addresses

Charles Lever (editor) Oracle Corporation 1015 Granger Avenue Ann Arbor, MI 48104 United States of America Phone: +1 248 816 6463 Email: chuck.lever@oracle.com William Allen Simpson Red Hat 1384 Fontaine Madison Heights, MI 48071 United States of America Email: william.allen.simpson@gmail.com Tom Talpey Microsoft Corp. One Microsoft Way Redmond, WA 98052 United States of America Phone: +1 425 704-9945 Email: ttalpey@microsoft.com