Re: [nfsv4] Proposed changes to draft-cel-nfsv4-rpcrdma-version-two.

David Noveck <davenoveck@gmail.com> Tue, 10 May 2016 15:23 UTC

Return-Path: <davenoveck@gmail.com>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 86E1D12D70F for <nfsv4@ietfa.amsl.com>; Tue, 10 May 2016 08:23:44 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.699
X-Spam-Level:
X-Spam-Status: No, score=-2.699 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VL6nAc4Z_Kry for <nfsv4@ietfa.amsl.com>; Tue, 10 May 2016 08:23:34 -0700 (PDT)
Received: from mail-lf0-x230.google.com (mail-lf0-x230.google.com [IPv6:2a00:1450:4010:c07::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 50F3212D702 for <nfsv4@ietf.org>; Tue, 10 May 2016 08:23:17 -0700 (PDT)
Received: by mail-lf0-x230.google.com with SMTP id y84so19021149lfc.0 for <nfsv4@ietf.org>; Tue, 10 May 2016 08:23:17 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc; bh=mWqMTxBK3OZ7YxGAqBLNh8kv3rr87ZsqD0i4HQtgfzw=; b=vHEwTyvGPEvdVVprMFMgzxSwWoD4+cVYIh15aHTrWseBU3eEGWki2TE40bw4Wz0ecw 4quTdnu85NQExgrmXiq24YVVZSlMROz2lafRs7tKFyELF7uUT5oH7qsOrtblx2YQIxx7 9IXkbxYLRQdqWI3DtQqzbhhuC3lqX4lCFJ4E5fQvvbM1Z2y8by+bpOE7IO5J/tI2XNxT O0xPw+tRh4j3yZ3NQdZZpQmfJ0k6h4vmPPb0OpmR7LHEmTFYiLCzszdTcWp0n0uaKBGG 1PNeFY858d9GJIaF0medgV6PQDEOIqJFXMh9WetCoaiUpr+dQ1jalEWLBMQWVTLGSLCO view==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=mWqMTxBK3OZ7YxGAqBLNh8kv3rr87ZsqD0i4HQtgfzw=; b=QQxlJcvteiKCuiqtjL+mp2UlpqrSKZTMC9piJM98mFPzkZTj/jhgMBSCBXJoRhIcqU N+MvEoycV9OT7A+NBr96x9/o6rEJAr0IC+exFGCEsnXLQ04nAiA3woUPReHN61aWRZcu ukJZuiQvEXJKqmr7VKdThcd9QlwUaSxFdXlS3YhlrstSJde5Jc7u/pxRp9+Q4xJ0lW5d +2h/l4riMiabmJ0WaUrBfUma8AirRinSApipJdDGNe6y0s0+JKwC9cEOTi++wtw85GN+ pEGvFdnTmBqh+kD7mqjDIpShY7QBlTZCVEIEb3DuEarKhhfnC494OuZsP1M5Cbo+QS0N OOJQ==
X-Gm-Message-State: AOPr4FWEVK7HizIMl49N8VttNwg848ATkFiXBV9Mh7wH+P07RwwRnMT0HImLSXyNhFON1mFq3bo2YJQO9O4yKw==
MIME-Version: 1.0
X-Received: by 10.25.145.198 with SMTP id t189mr13087761lfd.157.1462893795500; Tue, 10 May 2016 08:23:15 -0700 (PDT)
Received: by 10.112.61.162 with HTTP; Tue, 10 May 2016 08:23:15 -0700 (PDT)
In-Reply-To: <3DA09A88-EB03-45E5-A915-500101DFFF9C@oracle.com>
References: <CADaq8jfst1vuBeM=PL0nErngk2jktKb2dimqWhOmn2Jow9zGbw@mail.gmail.com> <3DA09A88-EB03-45E5-A915-500101DFFF9C@oracle.com>
Date: Tue, 10 May 2016 11:23:15 -0400
Message-ID: <CADaq8jfDNbSvMX2r9pCPGfLp4wJgRv+oqwhfgchMpbmJWUiP-g@mail.gmail.com>
From: David Noveck <davenoveck@gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Content-Type: multipart/alternative; boundary="001a11402946066cab05327e810c"
Archived-At: <http://mailarchive.ietf.org/arch/msg/nfsv4/ypnnRj3eYrT9rwJFmVBOOfZGGOU>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] Proposed changes to draft-cel-nfsv4-rpcrdma-version-two.
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nfsv4/>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 10 May 2016 15:23:44 -0000

> It could be easily introduced for new rdma_proc's.

I presume that adding it to rrpcrdma2_optional is the main thing here.

> I'd like to
> make an exception to my guideline of "no more changes in
> V2 than extensibility plus an increase in inline threshold."

I don't think that you are making such a big exception, in that the change
you are talking about is fundamentally motivated by extensibility.
Addressing the NOMSG case, while desirable, is not big enough to make this
sort of change.

> It would need to be added to struct rpc_rdma_error in a backward-compatible
way.

It sounds like that could be done by adding the new field to the
RDMA_ERROR_{BAD_HEADER, INVAL_OPTION} cases of rpcrdma2_error.  I think
credit would still need to be ignored in wrong-version case.

Your proposed change would handle the issue of message continuation in
which the start of the payload stream might not be available.

My only concern is with regard to a possible future extension in which
multiple RPC messages are carried in a single SEND (i.e., the precise
opposite of message continuation).  When the language for the new field is
drafted, we should make sure that it doesn't assume all the messages in a
SEND are all related to a single direction of operation.
--

On Tue, May 10, 2016 at 10:41 AM, Chuck Lever <chuck.lever@oracle.com>
wrote:

>
> > On May 10, 2016, at 8:57 AM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > On April 20, when I sent the message credit-related issues in
> draft-ietf-nfsv4-{rfc5666bis,rfc5666-implementation-experience,rpcrdma-bidirection},
> I indicated that there would be corresponding Version Two changes.  Even
> though some of these changes are, in some sense, motivated by
> considerations related to bidirectional operation, I think we can discuss
> these without waiting for the WGLC of draft-ietf-nfsv4-rpcrdma-bidirection
> to complete.
> >
> > This mail includes those credit-related changes and proposals for a
> bunch of others that I think are needed, based on my work on
> draft-dnoveck-rpcrdma-xcharext and draft-dnoveck-rpcrdma-rtrissues.
> >
> > In any case, there isn't a big rush and there would not be a problem
> waiting until late May, if you are busy with wrapping up Version One.  My
> main concern is that we get an up-to-date Version Two out in time for the
> draft cutoff for Berlin, on July 8.  Since I am hoping to be making
> corresponding changes in draft-dnoveck-nfsv4-rpcrdma-xcharext and the
> forthcoming draft-dnoveck-nfsv4-rpcrdma-rtrext, I'd hope we can get
> agreement on the substance of the current round of Version Two changes by
> the end of June.
>
> I should be able to submit a revision of rpcrdma-version-two
> before the end of May, depending on subsequent discussion of
> the items below, and other workload (implementation and bugs).
>
>
> > So far it looks like all of the necessary changes would go into 3.2.
> Documentation Requirements.
> >
> > Credit field ambiguity
> >
> > To address the fact that, in the context of extensions, it needs to be
> clear whether the credit field is a request or a grant, I propose that we
> add a new bullet reading as follows:
> >
> > For each new opttype code, it must be made clear how it is to be
> determined whether the existing credit field in the header is to be
> interpreted as a request or as a grant.   This can include specifying that,
> for particular codes, the field is to be interpreted as one or the other,
> that particular fields in the header definition are to direct the
> interpretation, or that fields in the associated RPC Payload Stream are to
> direct the interpretation.  In some cases, there may be no valid
> interpretation and the credit field is to be ignored.  In other cases, the
> credit field is to be interpreted either as a request or a grant and the
> complementary information is defined as available elsewhere.
>
> I'd like to discuss this item now. The only other area
> I might have some thoughts is "Feature Structure" below,
> but I need some time to digest that. I agree that the
> language of rpcrdma-version-two should attempt to be
> consistent with other documents.
>
> I recognize there is an issue with the credit field. I've
> dealt with that in other documents by stating that the
> field should be ignored in cases where it is not possible
> to tell whether an incoming message contains a request or
> grant.
>
> Instead of the language you've proposed here, I'd like to
> make an exception to my guideline of "no more changes in
> V2 than extensibility plus an increase in inline threshold."
>
> I propose adding a direction field to the RPC-over-RDMA V2
> header. For messages that carry an RPC payload, this field
> would contain a copy of the direction field from the RPC
> header.
>
> For messages that do not have such a payload, this field
> would distinguish both the XID space (forward or backward)
> and the credit value (request or grant).
>
> It could be easily introduced for new rdma_proc's. It is
> probably not necessary for RDMA_MSG, but RDMA_NOMSG should
> have the new field to allow responders to utilize the XID
> and credit fields before pulling the RPC payload. It would
> need to be added to struct rpc_rdma_error in a backward-
> compatible way.
>
>
> > After the header
> >
> > In the context of many desirable features (e.g. message continuation,
> send-based DDP), the existing fourth bullet is too restrictive.  I propose
> the following replacement:
> > For each new opttype code, it must be made clear whether an RPC Payload
> Stream or a part thereof follows the Transport Header Stream.  It must be
> made clear whether any material in addition to the transport header is
> allowed and how such material is to be processed.  In cases in which a
> partial RPC Payload stream is allowed, it needs to be clearly specified how
> such partial Payload Streams are to be assembled into a complete Payload
> Stream.  Also, when the Payload Stream portion begins at any point other
> than the byte immediately following the end of the Transport Header, it
> needs to be stated how the start of the Payload Stream portion is to be
> found, using the information in the optional header.
> > This bullet could be broken up using <vspace> if you like.
> >
> > That's my preference but I realize that some people aren't comfortable
> with that.
> >
> > XDR Model
> >
> > The current last paragraph of the section seems to be more appropriate
> to the extension model originally proposed for Version One, than it is to
> the extension model that is actually described in the current Version Two
> draft.  In the current model, we have a situation like that for pNFS, in
> which a nominally opaque protocol field is overlaid by an XDR description
> of that opaque field's contents.  As a result, the XDR is not extended as
> it is in NFSv4.x, by making previously invalid messages valid and providing
> an XDR description of the newly valid messages.  Instead the interpretation
> of a message which was previously valid according to the existing XDR is
> refined by the addition of the new overlay.
> >
> > I propose some rewrites below, in order to bring the treatment in line
> with the current extension model in the document.
> >
> > I propose replacing the current first paragraph by the following:
> > RPC-over-RDMA Version Two may be extended by defining a new rdma_opttype
> value and providing an XDR description of the corresponding rdma_optinfo
> content.  When this is done, the XDR description provided overlays, for
> that rdma_opttype value, the nominally opaque field rdma_optinfo.  As a
> result a new header type is effectively created.
> > I propose replacing the current last paragraph by the following:
> > Implementers will generally combine the XDR descriptions of the new
> features they intend to use with the XDR description of the base protocol,
> extracted from this document.  This may be necessary to create a valid XDR
> input file because:
> >       • Extension definitions are free to use types (e.g. chunk
> definitions) defined in the base protocol.
> >       • Later extensions may use types defined by earlier extensions on
> which they depend.
> > Because RPC-over-RDMA is not an RPC program and because we will often
> have two definitions of the same field (i.e. opaque and non-opaque versions
> of rdma_optinfo), the way in which XDR tools are normally used will not be
> available to those using the XDR definition.  For example, the base XDR
> definition may be used to parse the base transport definition while the
> extension definition may be used provide a more complete interpretation of
> what would otherwise be treated as opaque data. It may be that either,
> both, or neither of the levels would use XDR-generated code.
> > In any case, the combination of the XDR description for the
> RPC-over-RDMA Version Two protocol combined with that for any selected
> extensions will provide an adequate human-readable description of the
> extended protocol.
> >
> > Feature Structure
> >
> > As explained in draft-dnoveck-nfsv4-rpcrdma-rtrissues-00, there are some
> important synergies between message continuation and send-based DDP, making
> it desirable to describe both together while allowing implementation to
> independently choose whether to implement each particular feature.  I'm not
> sure if this was the intention, but it seems like the current document is
> written assuming a one feature-per-extension model. It seems that one
> feature-per-document is too restrictive in general,
> >
> > I'm not sure whether this adds clarity or not but it appears to me that
> the way the word "feature" is used in this part of the document matches the
> concept of "feature set" in draft-ietf-nfsv4-versioning., which is the unit
> of XDR extension there. I don't propose, however, to change the
> terminology.  Still, I think that this use of "feature" conflicts with the
> colloquial use of that word and things need to be made clearer.  BTW,
> although there are thirteen uses of the word "feature" in the document, I
> am going to leave the terminology alone for the most part and use
> "facility" for the more colloquial use of "feature" in this context.  YMMV.
> >
> > In any case, I propose replacing the existing second paragraph with the
> following:
> > A set of such new protocol elements may be introduced by a Standards
> Track document and so be together considered an OPTIONAL feature in that
> each implementation can be presumed to be either aware of all the protocol
> elements introduced by that feature or be aware of none of them.
> > Despite this requirement, particular features may introduce a set of
> facilities, often colloquially referred to as "features," without requiring
> that each implementation provide support for either all or none of these.
> In such cases, it needs to be made clear how the peer implementations
> determine a common set of facilities supported and interoperate properly
> when different sets of facilities are supported by the communicating
> implementations.
> > Nfsv4 Working Group and IESG review, together with appropriate testing
> of prototype implementations, should ensure continued interoperation with
> existing implementations.  Similar review and testing should ensure
> interoperation of those implementing the new feature.  In cases in which it
> is possible that not all implementations implement the same set of
> facilities, those same practices should ensure the correct interoperation
> of implementations supporting different subsets of available facilities.
> > I also propose replacing the existing last bullet by the following two
> bullets:
> >       • description of interactions with existing extensions (e.g.,
> requirements that another OPTIONAL feature needs to be present for newly
> added features to work).
> >       • description of interactions with facilities introduced by other
> extensions (e.g. requirements that a particular level of support be
> provided for some particular facility in the new extension to work).
>
> --
> Chuck Lever
>
>
>
>