Re: [nfsv4] Review of draft-cel-nfsv4-rpcrdma-cm-pvt-msg-00

Chuck Lever <chuck.lever@oracle.com> Mon, 06 March 2017 15:46 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
From: Chuck Lever <chuck.lever@oracle.com>
In-Reply-To: <CADaq8jdMkau_t-RW4d5VQz29tBmxqEONi1Rf_tkhOeuH6mVJ9w@mail.gmail.com>
Date: Mon, 06 Mar 2017 10:46:27 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <4DF3EB83-66F2-4F7B-8570-B003D7EA27D2@oracle.com>
References: <CADaq8jdMkau_t-RW4d5VQz29tBmxqEONi1Rf_tkhOeuH6mVJ9w@mail.gmail.com>
To: David Noveck <davenoveck@gmail.com>
X-Mailer: Apple Mail (2.3124)
X-Source-IP: userv0021.oracle.com [156.151.31.71]
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/LIcplZip-K3Owa0Ss1lVaoCCrh8>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] Review of draft-cel-nfsv4-rpcrdma-cm-pvt-msg-00

> On Mar 6, 2017, at 6:47 AM, David Noveck <davenoveck@gmail.com> wrote:
> 
> General Comments
> 
> Purpose of Review
> 
> The purpose of this review is to help the working group decide on the appropriate next steps to take regarding this document and more particularly:
> 	• Whether it is appropriate to convert this document to be a working group document.
> 	• Whether the working group needs any changes in this document to allow it to become a working group document.
> 	• What other changes should be considered as important to make in this document to allow it to help the working group meet its needs with regard to RPC-over-RDMA performance which are summarized in the first few paragraphs of Section 1 of the document.
> 
> Context of the Review
> 
> Previously I had been assuming that addressing our immediate performance issues required faster work on Version Two, necessitating a prompt advance of draft-cel-nfsv4-rpcrdma-version-two to working-group document status.
> 
> However, the simpler nature of a transition to using the mechanism laid out in this document suggests the possibility of addressing our performance issues in the context of Version One.  This would enable a more measured approach to Version Two, given that the prime performance concerns can be addressed in a Version One context.
> 
> Overall Evaluation of Document
> 
> This document is in good shape and the issues within it should pose no difficulty to it becoming a working group document.  It is relatively far along so that the path to getting it to be ready for an eventual WGLC should be relatively short.
> 
> However there is a complex of issues that would benefit from a period of working group discussion.
> 
> These issues include:
> 	• Whether its current status of Experimental is appropriate given its expected use or whether it needs to be converted, at some point, to be standards-track.
> 	• Whether there needs to be some adjustment in the declared purpose on this document, to address the issues raised below in 1.  Introduction.
> 
> Suggestions Regarding Procedure
> 
> I am assuming that Chuck will respond with his views regarding the issues raised in this review including his view regarding the basic issue of converting this document to working-group.  At that point, I would expect Chuck to make some suggestion about how to proceed.
> 
> With regard to my own expectations about an appropriate procedure:
> 	• Chuck could at that point propose that the document be made into a working group document.
> 	• Chuck could, as part of that request, suggest an appropriate period for working group comments, and indicate his intentions regarding issues related to the purpose of that document and the document's current Experimental status.
> 	• After completion of the comment period, we could quickly convert the document to working-group status as-is, without addressing any of the issues noted in the review.  I am anticipating that this sort of deferral might be required since Chuck might be occupied with issues relating to rfc5666bis, bidirection, and rfc5667bis.

Wrt document purpose: I'll need time to digest those comments and
refresh my memory of the current document text. I don't foresee
any particular issue with adjusting the purpose of this document.

Wrt the "Experimental" status: IMO that is appropriate for a
document that describes a protocol that may change or is still
under some development, and has not had any WG review. I'm OK
making document status changes as part of its promotion to WG
document, depending on WG discussion.

Wrt the reference to reminv-design: I understand your comments. I
would also add that we, as a WG, should not load the IESG and RFC
Editor with more documents than we have to. I will try, therefore,
to pull the relevant text from reminv-design into cm-pvt-msg, and
remove the troublesome citation. That might not work, but let's
see where it goes.

I can merge editorial comments now. I'll submit a fresh revision
in a few weeks that perhaps could be considered for promotion.

(You are correct that documents that are currently further along
in the pipeline are a higher priority for me at the moment).


> Comments by Section
> 
> Note that most of these comments are not directly connected with the original purpose of the review, determining whether this document is in an appropriate state to be converted to a working group document.  During the ensuing complete review, I noted everything that I found, irrespective of whether the issue had to be addressed as part of the change.
> 
> As a result, many of the comments below relate only indirectly to the original purpose of the review and serve mainly to point to issues that will need to be addressed as the document moves forward toward WGLC.
> 
> Abstract
> 
> In the second sentence, suggest replacing "can" by "could".
> 
> With regard to the last sentence, I'm not sure what it means and why it is there.
> 
> Requirements Language
> 
> I know that the author didn't really write this sentence and that I'm probably the first person to actually read it.  Nevertheless, I feel the fact that it actually appears in the document and seems to make a relevant statement makes it fair game.
> 
> I'm aware that I may be trying people's patience in noting this, but ask the readers' indulgence as I point out that RFC2119 concerns the use of these terms in standards-track documents and it isn't clear exactly what they are to mean in this document, if the current status is maintained.
> 
> 1.  Introduction
> 
> Normally, in per-section comments I proceed in order but in this case, the nature of the issue forces me to start at the end of the section.
> 
> The nature of the last sentence of the last paragraph is such that it needs to be changed along with a serious rethink of the purpose of this document.  In particular,
> 	• If this message format is only to be used in future transport versions, then what has been said about how challenging it is to extend Version One becomes beside the point.
> 	• Future versions of the protocol might well have their own ways of addressing the issues cited at the beginning of this section.
> In my view, it is appropriate to rewrite the last two sentences of the last paragraph as follows:
> The purpose of this message format is to allow Version One implementations to exchange information allowing them to resolve the shortcomings discussed above, while remaining compatible with existing implementations. Future versions of RPC-over-RDMA may use the same mechanism or may choose to address these issues in different ways.
> The document probably needs to be changed from Experimental to standards-track.  However:
> 	• I'm not sure if this is necessary since I don't understand how an Experimental RFC differs from a Proposed Standard (supposedly "just a proposal").
> 	• I am pretty clear that if this change is necessary, it need not be done immediately and should pose no obstacle to this becoming a working group  document.
> The fundamental issue that needs to be addressed is the purpose of the document.  If this is not addressed:
> 	• It doesn't seem to me that the document has any point since the relevant experimentation has already been done.
> 	• Version One would have no defined means of addressing the performance issues we have been discussing, leaving the task to Version Two.
> We now return to the start of the section.
> 
> I suggest rewriting the last sentence of the second (unbulleted) paragraph as follows:
> However, [I-D.ietf-nfsv4-rfc5666bis] eliminated support for this protocol making it unavailable for this purpose.
> With regard to the third (unbulleted) paragraph, I have problems with the word "challenging".  The issue is that nobody knows what this means (are some people wondering why this is a problem? :-).  I think the intention is to say it is impossible, which it is given the restrictions that have been placed on the Version One XDR.  In any case, I don't think we have to prove that this is infeasible; we can simply state that it will not be done.  So, I suggest rewriting this paragraph as follows:
> Version One has no means of providing an extension mechanism that allows interoperability with existing implementations.  As a result, another out-of-band mechanism is required to help relieve these limitations for RPC-over-RDMA Version One implementations.
> 
> 2.1.  Inline Threshold Size
> 
> There are a couple of minor issues in the last paragraph:
> 	• In the first sentence, suggest replacing "Thus" by "To enable the proper size to be determined,"
> 	• The use of the terms "requester" and "responder" suggests that there might be separate limits for each direction of use.  Suggest replacing "requester" by "client" and "responder" by "server".
> The other issue is more subtle.  At first, it appeared that the word "MUST", while natural, was overkill.  For example, consider a server that was prepared to allocate buffers of a maximum of 4K in size but was prepared to split them up into 2K or 1K buffers. In that situation, it isn't clear why, for example, the fact that the client could send a 3K request should essentially force the server to allocate 3K buffers wasting 25% of the buffer space, or why choosing 2K buffers might raise interoperability issues.
> Also, it isn't clear:
> 	• Why the fact that a client might send a 3K request in rare circumstances is relevant to the choice of the server's inline buffer threshold.
> 	• How the client could possibly know how big its requests might be since the client, per se, is not in charge of that.
> 	• How, given the reply size estimation issues we have been wrestling with in rfc5667bis, the server can determine the maximum size of replies that might be sent.
> The following suggested revision aims:
> 	• To give the server a meaningful role in determining the receiver's buffer size, without promising more than can be delivered.
> 	• To make it clearer why "MUST" is the appropriate choice here.
> 	• To address the minor issues noted above.
> In any case, I suggest replacing the current last paragraph by the following:
> To enable the proper size to be determined, each peer advertises:
> 	• a target message size that it could use effectively in sending messages.  Note that this size does not include space for data items placed directly although it would include space for the associated chunk headers and the rest of the RPC-over-RDMA headers.
> 	• the largest size buffer it is prepared to allocate to receive messages.
> In order for each peer to determine its inline threshold in a manner consistent with the value assumed by the other peer, each needs to use a common algorithm known to the other peer, and based only on the private data known to both.  In order to arrive at these consistent values:
> 	• The client MUST use the smaller of its target send size and the server's maximum receive buffer size as its value for the server's inline threshold, while the server uses the same procedure to arrive at the same value.  
> 	• The server MUST use the smaller of its target send size and the client's maximum receive buffer size as its value for the client's inline threshold, while the client uses the same procedure to arrive at the same value.
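
As a minimal sketch of the two rules suggested above (Python; the class
name, field names, and the sizes in octets are hypothetical, chosen only
for illustration and not taken from the draft), each peer can derive both
thresholds from the two advertised values alone:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertised:
    """Values one peer advertises in its private data (hypothetical names)."""
    target_send_size: int   # message size the peer could use effectively when sending
    max_receive_size: int   # largest receive buffer the peer is prepared to allocate


def inline_threshold(sender: Advertised, receiver: Advertised) -> int:
    """Inline threshold for messages flowing from `sender` to `receiver`.

    Both peers evaluate the same expression on the same advertised
    values, so they arrive at the same result independently.
    """
    return min(sender.target_send_size, receiver.max_receive_size)


# Illustrative values only.
client = Advertised(target_send_size=3072, max_receive_size=4096)
server = Advertised(target_send_size=8192, max_receive_size=2048)

# Server's inline threshold: bounds what the client sends to the server.
server_threshold = inline_threshold(sender=client, receiver=server)
# Client's inline threshold: bounds what the server sends to the client.
client_threshold = inline_threshold(sender=server, receiver=client)

print(server_threshold, client_threshold)
```

With these made-up values the server's threshold is capped by its own
receive buffer limit and the client's threshold by the client's receive
buffer limit, even though each sender's target is larger.
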
> 
> 2.2.  Support for Remote Invalidation
> 
> My only concern in this section concerns the status of draft-cel-nfsv4-reminv-design, which is currently a private informational draft.
> If draft-cel-nfsv4-rpcrdma-cm-pvt-msg is to become an RFC, the only point of the exercise in my opinion, we have two options with regard to draft-cel-nfsv4-reminv-design:
> 	• Making it first a working group document and eventually an Informational RFC.  Note that this is the type of exploratory document that, typically, we do not bother to publish as an RFC.
> 	• Eliminating the reference by providing a brief targeted introduction to the subject, focused on defining the terms used, taking into account all the references in this document.
> 
> 3.  Private Data Message Format
> 
> In the last sentence, suggest replacing "requesters and responders"  by "clients and servers".
> I recall hearing that there are some implementations with rather tight restrictions on the size of this private data.  If my recollection is correct, something should be said about this issue.
> 
> 3.1.  Fixed Mandatory Fields
> 
> In light of RFC2119's statement that these terms are to be used "sparingly", suggest replacing "MUST" in the first sentence by "is to".
> Under Version,  suggest rewriting the second sentence as follows:
> The value "1" in this field indicates that exactly eight octets are present, that they appear in the order described in this section, and that each has the meaning defined in this section.
> Under Flags, suggest revising "boolean flags" as it appears redundant.  Possible replacements are "boolean values" and "flag bits".
> Under Send Size,  suggest rewriting the first sentence as follows:
> This 8-bit field contains an encoded value specifying a target message size that the peer could use effectively in sending messages using RDMA Send.
> Under Receive Size,  suggest rewriting the first sentence as follows:
> This 8-bit field contains an encoded value specifying the largest receive buffer size this peer is prepared to allocate.  Such buffers are used to receive messages sent using RDMA sends.
> 
> 3.1.2.  Inline Threshold Encoding
> 
> Suggest replacing the section title by "Message Size Encoding".
> In the first sentence, suggest replacing "Inline threshold sizes" by "Message and buffer sizes".
> In the last sentence, suggest replacing "complementary operations" by "a complementary set of operations".
> 
> 3.2.  Extending The Private Message Format
> 
> In the first sentence, suggest:
> 	• Replacing "to add" by "by adding"
> 	• Replacing "to make use of one of" by "making use of"
> In the second sentence, suggest:
> 	• Replacing "allocated" by "to be allocated"
> 	• Replacing "are defined" by "defined".
> In the third sentence, suggest replacing "must also be provided" by "is to be provided as well".
> With regard to the last sentence, since it appears that the document is unlikely to still be "a personal draft in the Experimental category", a replacement needs to be provided.  Suggest the following sentence:
> Such situations are best addressed by specifying the new format in a document updating this one.
> 
> 4.  Interoperability Considerations
> 
> In the second sentence, suggest replacing "assume" by "act as if".
> In the third sentence, suggest replacing "behaves" by "is to behave".
> 
> 7.2.  Informative References
> 
> With regard to the reference to draft-cel-nfsv4-rpcrdma-cm-pvt-msg-00, it is probably the case that if this reference is to remain in the document, it should be converted to be normative, since the terms it defines are central to understanding the function of this document.
> 
> With regard to the IB-IBTA reference, the following issues need to be addressed:
> 	• The URL currently there gets you to the appropriate web site but not to the referenced document.
> 	• The version mentioned is no longer current and does not appear to be present on the web site.

--
Chuck Lever