Re: [nfsv4] Proposed changes to draft-cel-nfsv4-rpcrdma-version-two.

Chuck Lever <chuck.lever@oracle.com> Tue, 10 May 2016 14:41 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
From: Chuck Lever <chuck.lever@oracle.com>
In-Reply-To: <CADaq8jfst1vuBeM=PL0nErngk2jktKb2dimqWhOmn2Jow9zGbw@mail.gmail.com>
Date: Tue, 10 May 2016 10:41:48 -0400
Content-Transfer-Encoding: quoted-printable
Message-Id: <3DA09A88-EB03-45E5-A915-500101DFFF9C@oracle.com>
References: <CADaq8jfst1vuBeM=PL0nErngk2jktKb2dimqWhOmn2Jow9zGbw@mail.gmail.com>
To: David Noveck <davenoveck@gmail.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/nfsv4/vFVRCDzEjG14AfCcQF0CFzzJpDs>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] Proposed changes to draft-cel-nfsv4-rpcrdma-version-two.
Precedence: list

> On May 10, 2016, at 8:57 AM, David Noveck <davenoveck@gmail.com> wrote:
> 
> On April 20, when I sent the message credit-related issues in draft-ietf-nfsv4-{rfc5666bis,rfc5666-implementation-experience,rpcrdma-bidirection}, I indicated that there would be corresponding Version Two changes.  Even though some of these changes are, in some sense, motivated by considerations related to bidirectional operation, I think we can discuss these without waiting for the WGLC of draft-ietf-nfsv4-rpcrdma-bidirection to complete.  
> 
> This mail includes those credit-related changes and proposals for a bunch of others that I think are needed, based on my work on draft-dnoveck-rpcrdma-xcharext and draft-dnoveck-rpcrdma-rtrissues.
> 
> In any case, there isn't a big rush and there would not be a problem waiting until late May, if you are busy with wrapping up Version One.  My main concern is that we get an up-to-date Version Two out in time for the draft cutoff for Berlin, on July 8.  Since I am hoping to be making corresponding changes in draft-dnoveck-nfsv4-rpcrdma-xcharext and the forthcoming draft-dnoveck-nfsv4-rpcrdma-rtrext, I'd hope we can get agreement on the substance of the current round of Version Two changes by the end of June.

I should be able to submit a revision of rpcrdma-version-two
before the end of May, depending on subsequent discussion of
the items below, and other workload (implementation and bugs).


> So far it looks like all of the necessary changes would go into 3.2.  Documentation Requirements.
> 
> Credit field ambiguity
> 
> To address the fact that, in the context of extensions, it needs to be clear whether the credit field is a request or a grant, I propose that we add a new bullet reading as follows:
> 
> For each new opttype code, it must be made clear how it is to be determined whether the existing credit field in the header is to be interpreted as a request or as a grant.   This can include specifying that, for particular codes, the field is to be interpreted as one or the other, that particular fields in the header definition are to direct the interpretation, or that fields in the associated RPC Payload Stream are to direct the interpretation.  In some cases, there may be no valid interpretation and the credit field is to be ignored.  In other cases, the credit field is to be interpreted either as a request or a grant and the complementary information is defined as available elsewhere.

I'd like to discuss this item now. The only other area
I might have some thoughts is "Feature Structure" below,
but I need some time to digest that. I agree that the
language of rpcrdma-version-two should attempt to be
consistent with other documents.

I recognize there is an issue with the credit field. I've
dealt with that in other documents by stating that the
field should be ignored in cases where it is not possible
to tell whether an incoming message contains a request or
grant.

Instead of the language you've proposed here, I'd like to
make an exception to my guideline of "no more changes in
V2 than extensibility plus an increase in inline threshold."

I propose adding a direction field to the RPC-over-RDMA V2
header. For messages that carry an RPC payload, this field
would contain a copy of the direction field from the RPC
header.

For messages that do not have such a payload, this field
would distinguish both the XID space (forward or backward)
and the credit value (request or grant).

It could be easily introduced for new rdma_proc's. It is
probably not necessary for RDMA_MSG, but RDMA_NOMSG should
have the new field to allow responders to utilize the XID
and credit fields before pulling the RPC payload. It would
need to be added to struct rpc_rdma_error in a backward-
compatible way.


> After the header
> 
> In the context of many desirable features (e.g. message continuation, send-based DDP), the existing fourth bullet is too restrictive.  I propose the following replacement:
> For each new opttype code, it must be made clear whether an RPC Payload Stream or a part thereof follows the Transport Header Stream.  It must be made clear whether any material in addition to the transport header is allowed and how such material is to be processed.  In cases in which a partial RPC Payload stream is allowed, it needs to be clearly specified how such partial Payload Streams are to be assembled into a complete Payload Stream.  Also, when the Payload Stream portion begins at any point other than the byte immediately following the end of the Transport Header, it needs to be stated how the start of the Payload Stream portion is to be found, using the information in the optional header.
> This bullet could be broken up using <vspace> if you like.  
> 
> That's my preference but I realize that some people aren't comfortable with that.
> 
> XDR Model
> 
> The current last paragraph of the section seems to be more appropriate to the extension model originally proposed for Version One, than it is to the extension model that is actually described in the current Version Two draft.  In the current model, we have a situation like that for pNFS, in which a nominally opaque protocol field is overlaid by an XDR description of that opaque field's contents.  As a result, the XDR is not extended as it is in NFSv4.x, by making previously invalid messages valid and providing an XDR description of the newly valid messages.  Instead the interpretation of a message which was previously valid according to the existing XDR is refined by the addition of the new overlay.  
> 
> I propose some rewrites below, in order to bring the treatment in line with the current extension model in the document.
> 
> I propose replacing the current first paragraph by the following:
> RPC-over-RDMA Version Two may be extended by defining a new rdma_opttype value and providing an XDR description of the corresponding rdma_optinfo content.  When this is done, the XDR description provided overlays, for that rdma_opttype value, the nominally opaque field rdma_optinfo.  As a result a new header type is effectively created.
> I propose replacing the current last paragraph by the following:
> Implementers will generally combine the XDR descriptions of the new features they intend to use with the XDR description of the base protocol, extracted from this document.  This may be necessary to create a valid XDR input file because:
> 	• Extension definitions are free to use types (e.g. chunk definitions) defined in the base protocol.
> 	• Later extensions may use types defined by earlier extensions on which they depend. 
> Because RPC-over-RDMA is not an RPC program and because we will often have two definitions of the same field (i.e. opaque and non-opaque versions of rdma_optinfo), the way in which XDR tools are normally used will not be available to those using the XDR definition.  For example, the base XDR definition may be used to parse the base transport definition while the extension definition may be used provide a more complete interpretation of what would otherwise be treated as opaque data. It may be that either, both, or neither of the levels would use XDR-generated code. 
> In any case, the combination of the XDR description for the RPC-over-RDMA Version Two protocol combined with that for any selected extensions will provide an adequate human-readable description of the extended protocol.
> 
> Feature Structure
> 
> As explained in draft-dnoveck-nfsv4-rpcrdma-rtrissues-00, there are some important synergies between message continuation and send-based DDP, making it desirable to describe both together while allowing implementation to independently choose whether to implement each particular feature.  I'm not sure if this was the intention, but it seems like the current document is written assuming a one feature-per-extension model. It seems that one feature-per-document is too restrictive in general, 
> 
> I'm not sure whether this adds clarity or not but it appears to me that the way the word "feature" is used in this part of the document matches the concept of "feature set" in draft-ietf-nfsv4-versioning., which is the unit of XDR extension there. I don't propose, however, to change the terminology.  Still, I think that this use of "feature" conflicts with the colloquial use of that word and things need to be made clearer.  BTW, although there are thirteen uses of the word "feature" in the document, I am going to leave the terminology alone for the most part and use "facility" for the more colloquial use of "feature" in this context.  YMMV.
> 
> In any case, I propose replacing the existing second paragraph with the following:
> A set of such new protocol elements may be introduced by a Standards Track document and so be together considered an OPTIONAL feature in that each implementation can be presumed to be either aware of all the protocol elements introduced by that feature or be aware of none of them.
> Despite this requirement, particular features may introduce a set of facilities, often colloquially referred to as "features," without requiring that each implementation provide support for either all or none of these. In such cases, it needs to be made clear how the peer implementations determine a common set of facilities supported and interoperate properly when different sets of facilities are supported by the communicating implementations.
> Nfsv4 Working Group and IESG review, together with appropriate testing of prototype implementations, should ensure continued interoperation with existing implementations.  Similar review and testing should ensure interoperation of those implementing the new feature.  In cases in which it is possible that not all implementations implement the same set of facilities, those same practices should ensure the correct interoperation of implementations supporting different subsets of available facilities.
> I also propose replacing the existing last bullet by the following two bullets:
> 	• description of interactions with existing extensions (e.g., requirements that another OPTIONAL feature needs to be present for newly added features to work).
> 	• description of interactions with facilities introduced by other extensions (e.g. requirements that a particular level of support be provided for some particular facility in the new extension to work).

--
Chuck Lever

[nfsv4] Proposed changes to draft-cel-nfsv4-rpcrd… David Noveck
Re: [nfsv4] Proposed changes to draft-cel-nfsv4-r… Chuck Lever
Re: [nfsv4] Proposed changes to draft-cel-nfsv4-r… David Noveck