Section 3.3.2 of RFC 8166 defines the term "inline threshold". An inline threshold is the maximum number of bytes that can be transmitted using one RDMA Send and one RDMA Receive. Each connection has a pair of inline thresholds: a client-to-server threshold and a server-to-client threshold.
If an incoming RDMA message exceeds the size of a receiver's inline threshold, the Receive operation fails and the RDMA provider typically terminates the connection. To convey an RPC message larger than the receiver's inline threshold without risking receive failure, a sender must use explicit RDMA data transfer operations, which are more expensive than an RDMA Send. See Sections 3.3 and 3.5 of [RFC 8166] for a complete discussion.
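The sender-side choice described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an RDMA API: `InlineThresholds`, `Transfer`, and `choose_transfer` are hypothetical names, and messages are represented only by their length in bytes.

```python
from dataclasses import dataclass
from enum import Enum

# Default inline threshold for RPC-over-RDMA version 1 (RFC 8166, Section 3.3.3).
DEFAULT_INLINE_THRESHOLD = 1024


class Transfer(Enum):
    INLINE_SEND = "one RDMA Send"
    EXPLICIT_RDMA = "explicit RDMA data transfer operations"


@dataclass(frozen=True)
class InlineThresholds:
    """Hypothetical per-connection record: one threshold per direction."""
    client_to_server: int = DEFAULT_INLINE_THRESHOLD
    server_to_client: int = DEFAULT_INLINE_THRESHOLD


def choose_transfer(message_len: int, receiver_threshold: int) -> Transfer:
    """Pick how to convey an RPC message of message_len bytes.

    A message that exceeds the receiver's inline threshold must not be
    sent inline, because the peer's Receive operation would fail.
    """
    if message_len <= receiver_threshold:
        return Transfer.INLINE_SEND
    return Transfer.EXPLICIT_RDMA


# Illustrative sizes only: a client sending a 1400-byte call and a server
# sending a 200-byte reply on a connection that uses the default thresholds.
conn = InlineThresholds()
call_method = choose_transfer(1400, conn.client_to_server)
reply_method = choose_transfer(200, conn.server_to_client)
```

In this sketch the 1400-byte call needs explicit RDMA operations while the 200-byte reply fits in a single RDMA Send. The sizes are made up for illustration.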
The default value of inline thresholds for RPC-over-RDMA version 1 connections is 1024 bytes (as defined in Section 3.3.3 of RFC 8166). This value is adequate for nearly all NFS version 3 procedures.
NFS version 4 COMPOUND operations [RFC 7530] are larger on average than NFS version 3 procedures [RFC 1813], forcing clients to use explicit RDMA operations for frequently issued requests such as LOOKUP and GETATTR. The use of RPCSEC_GSS security also increases the average size of RPC messages, due to the larger size of RPCSEC_GSS credential material included in RPC headers [RFC 7861].
If a sender and receiver could agree on larger inline thresholds, frequently used RPC transactions would avoid the cost of explicit RDMA operations.
After an RDMA data transfer operation completes, an RDMA consumer can request that its peer's RDMA Network Interface Card (RNIC) invalidate the Steering Tag (STag) associated with the data transfer [RFC 5042].
An RDMA consumer requests remote invalidation by posting an RDMA Send with Invalidate operation in place of an RDMA Send operation. Each RDMA Send with Invalidate carries one STag to invalidate. The receiver of an RDMA Send with Invalidate performs the requested invalidation and then reports that invalidation as part of the completion of a waiting Receive operation.
If both peers support remote invalidation, an RPC-over-RDMA responder might use remote invalidation when replying to an RPC request that provided chunks. Because the STag of one of the chunks has already been invalidated when the reply arrives, finalizing the results of the RPC is simpler and faster for the requester.
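The saving on the requester side can be sketched as follows. This is an illustrative model only: `stags_needing_local_invalidation` is a hypothetical helper, STags are represented as plain integers, and the remotely invalidated STag is assumed to have been read from the completion of the Receive operation that delivered the reply.

```python
from typing import Iterable, List, Optional


def stags_needing_local_invalidation(
    registered_stags: Iterable[int],
    remotely_invalidated: Optional[int] = None,
) -> List[int]:
    """Return the STags the requester still has to invalidate itself.

    Each RDMA Send with Invalidate carries one STag, so at most one STag
    is removed from the requester's own invalidation work per reply.
    """
    return [s for s in registered_stags if s != remotely_invalidated]


# An RPC that registered three chunks; the reply invalidated STag 0x20.
remaining = stags_needing_local_invalidation([0x10, 0x20, 0x30], 0x20)

# The same RPC answered with a plain RDMA Send: nothing was invalidated.
all_local = stags_needing_local_invalidation([0x10, 0x20, 0x30])
```

An RPC that registers a single chunk has no local invalidation work left after a reply sent with RDMA Send with Invalidate, which is the case where the saving is largest.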
However, there are some important caveats that contraindicate the blanket use of remote invalidation:
- Remote invalidation is not supported by all RNICs.
- Not all RPC-over-RDMA responder implementations can generate RDMA Send with Invalidate operations.
- Not all RPC-over-RDMA requester implementations can recognize when remote invalidation has occurred.
- On one connection in different RPC-over-RDMA transactions, or in a single RPC-over-RDMA transaction, an RPC-over-RDMA requester can expose a mixture of STags that may be invalidated remotely and some that must not be. No indication is provided at the RDMA layer as to which is which.
A responder therefore must not employ remote invalidation unless it is aware of support for it both in its own RDMA stack and on the requester. Moreover, without altering the XDR structure of RPC-over-RDMA version 1 messages, it is not possible to support remote invalidation with requesters that include an STag that must not be invalidated remotely in an RPC with STags that may be invalidated. Likewise, it is not possible to support remote invalidation with requesters that mix RPCs with STags that may be invalidated with RPCs with STags that must not be invalidated on the same connection.
There are some NFS/RDMA client implementations whose STags are always safe to invalidate remotely. For such clients, indicating to the responder that remote invalidation is always safe can enable such invalidation without the need for additional protocol elements to be defined.
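The responder's resulting decision can be sketched as a simple predicate. This is a minimal sketch under assumptions: `RequesterInfo` and `may_use_remote_invalidation` are hypothetical names, and how the requester's indication reaches the responder is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequesterInfo:
    """Hypothetical record of what the requester has indicated."""
    # True only if the requester has stated that every STag it exposes
    # on this connection is safe to invalidate remotely.
    remote_invalidation_always_safe: bool


def may_use_remote_invalidation(
    local_stack_supports_it: bool,
    requester: Optional[RequesterInfo],
) -> bool:
    """Decide whether a responder may reply with RDMA Send with Invalidate.

    With no indication from the requester (None), the responder has to
    assume that some STags must not be invalidated and use plain RDMA Send.
    """
    if not local_stack_supports_it or requester is None:
        return False
    return requester.remote_invalidation_always_safe


safe_peer = RequesterInfo(remote_invalidation_always_safe=True)
use_invalidate = may_use_remote_invalidation(True, safe_peer)
fallback = may_use_remote_invalidation(True, None)
```

The predicate is deliberately conservative: any missing piece of knowledge yields `False`, which corresponds to falling back to an ordinary RDMA Send.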