Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-03 (part one of three)

Chuck Lever <> Wed, 21 December 2016 16:50 UTC

To: David Noveck <>

Hi Dave-

Thanks for the review. Responses to your General Comments are below.

> On Dec 19, 2016, at 5:56 PM, David Noveck <> wrote:
> Review Structure
> This is the first part of a multi-part review.  It is split into multiple emails to avoid running into mailing list size limits.
> This part consists of:
> 	• General Comments
> 	• Comments by Section (through Section 3)
> Later emails will contain the rest of the per-section comments.
> General Comments
> Overall Evaluation
> I think this document is fairly far along in meeting the original requirements: to replace RFC5667 and supplement it with a reasonable treatment of NFSv4, which was not really dealt with in RFC5667.
> However, in a number of respects, the current text has issues that need to be addressed before going forward and I've made some suggestions as to how those issues could be addressed.  These issues are:
> 	• Adaptation to the possibility of multiple RPC-over-RDMA versions, whether that is done as outlined in draft-cel-nfsv4-rpcrdma-version-two or not.  This issue is discussed in Handling of RPC-over-RDMA Transport Versions.
> 	• The need to address some fundamental issues regarding the ability to bound reply size.  I have suggested a way to do that in the proposed new section 2.x, "Difficulties in Reply Size Estimation".

I predict some resistance to section titles in a specification that include
the word "Difficulties".

The point has been made, however, that without proper Reply Size Estimation,
successful interoperation is at risk. This is why we now call out Reply Size
Estimation in RFC 5666bis as a necessary part of an Upper Layer Binding.

I agree that there are plenty of details that need to be sorted here. Your
comments in this area are most helpful, and I plan to integrate many of them
into this document. I hope we can resolve most RSE issues so that the use of
the word "Difficulties" will be rare or altogether unnecessary.

> Handling of RPC-over-RDMA Transport Versions
> Even though the I-D for Version Two has not yet been adopted as a working group document and there may be some dispute as to the proper direction such a new version might take, it is now clear that further development of RPC-over-RDMA is likely to take place.  In light of the fact that we can no longer assume Version One will be the only version, we have a number of choices:
> 	• Make the current document the ULB for NFS applying to RPC-over-RDMA Version One only or to Version One plus a very small Version Two.  Note that later changes to an extensible transport version might make a lot of the current statements about how DDP will be done inappropriate. For example, a version two based on draft-cel-nfsv4-rpcrdma-version-two might not require much change, but the later addition of features for message continuation and send-based DDP, as set out in draft-noveck-nfsv4-rpcrma-rtrext, would essentially force a rewriting of the document, even if those features were OPTIONAL.
> 	• Cut back the scope of the current document to eliminate most of the explanatory material tied to Version One of the transport and leave it only containing the material that rfc5666bis requires for ULBs.
> 	• Restructure the document to distinguish material required by rfc5666bis and applicable to all transport versions from explanatory/illustrative material tied to Version One.
> In my revision suggestions, I have adopted the last of these approaches.

I prefer to keep the NFS ULB document as agnostic to transport protocol
version as possible. That would increase its useful lifespan, rather 
than requiring a fresh ULB later to handle subsequent versions of

I assume you object to the use of terminology that mentions a particular
RDMA mechanism (e.g., RDMA Read, or Read chunk), since that is incompatible
with potential extensions to RPC-over-RDMA which might use some other
mechanism (e.g., structured Send buffers).

Replacing that language with abstract terminology such as "DDP-eligible",
however, could make the document more difficult for implementers
to interpret. It would certainly bear little resemblance to the original RFC

Likely all of Section 2 would need to be rewritten or removed. I believe
removal of this section would be a disservice to implementers.

IMO it's treading a fine line to consider non-RDMA Direct Data Placement
mechanisms in a transport protocol called "RPC-over-RDMA".

Be that as it may, I will see what can be done to address your concerns in
this document. Because I'm not taking "the last of these approaches" (see
above), at least for now, I will probably not merge many of your proposed
changes in this area, but will use the body of your remarks as a guide to
rework the text to be agnostic to data placement mechanism.

Chuck Lever