Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-06

Chuck Lever <chuck.lever@oracle.com> Sun, 05 March 2017 17:51 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
From: Chuck Lever <chuck.lever@oracle.com>
In-Reply-To: <CADaq8jdgdO1k3iW9yo7n2N1Yo6cAjvXznaWk-tN3ChftmzMJfQ@mail.gmail.com>
Date: Sun, 05 Mar 2017 09:51:47 -0800
Content-Transfer-Encoding: quoted-printable
Message-Id: <4D6DCECB-BDF1-48E6-B59E-0A98D1252C8A@oracle.com>
References: <CADaq8je8zfRN5R11LxJw=0st-u-XOoKosGbZDBajOTiChzpS5Q@mail.gmail.com> <93F476D6-57F8-44AB-94C9-545608396F51@oracle.com> <CADaq8jcJ3WkpmPJVVec5aJc0ekKgdHPUok=S5_ofGVJnbqrrjA@mail.gmail.com> <5538FD5E-A71B-4F91-AC3A-CBD2F54AF9E3@oracle.com> <de109940-7de1-1a09-51f3-d3be44d98c60@talpey.com> <CADaq8jf5zU0y=v4gaUxVd4scQQwyAEcgWtp11Ddcn=U4jB17pA@mail.gmail.com> <CADaq8jea99i8L=tYKM=6T-Mu78n_qzmMwrKGSsWhmgpBytZMiQ@mail.gmail.com> <D2083198-E667-4B71-AAC5-D26318BE52D6@oracle.com> <CADaq8jeegoga-kB+a4e6QQEdLSCrTOmpbkSTk+4SmbqzCAfXgw@mail.gmail.com> <ACE665A3-0859-47E8-BBD6-E98A401B7656@oracle.com> <CADaq8jdgdO1k3iW9yo7n2N1Yo6cAjvXznaWk-tN3ChftmzMJfQ@mail.gmail.com>
To: David Noveck <davenoveck@gmail.com>
X-Mailer: Apple Mail (2.3124)
X-Source-IP: aserv0021.oracle.com [141.146.126.233]
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/07OvhZqh0K42uHloGgYdZJcN9HM>
Cc: Tom Talpey <tom@talpey.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-06

> On Mar 1, 2017, at 11:57 AM, David Noveck <davenoveck@gmail.com> wrote:
> 
> > It's not a question of whether or not I liked a particular
> > proposed mechanism. My job as editor of rfc5667bis is to
> > keep this document on track and limit creeping scope.
> 
> Fair enough.  Instead of saying:
> 
> I made some suggestions in that regard which you didn't like
> 
> I should have said:
> 
> I made some suggestions in that regard that you felt would
> lead the document off track and might result in an undesirable
> expansion of scope.
> 
> > I'm having trouble considering more work and more special
> > casing in rfc5667bis to detect support for features that
> > do not yet have a real world application, and for which
> > there is little ability to prototype and test.
> 
> I hear you.
> 
> > Given your publicly-stated desire to see this document
> > published soon, I'm surprised you would consider
> > introducing new mechanisms at this point.
> 
> I wanted to explore the ways this could be done.   I guess
> the question regarding "at this point" is exactly where we are.
> A few days ago, I thought the document was quite close to
> WGLC.  Now it appears it is not and I'm not sure exactly where
> this document is. 
> 
> > Further, I do not like the implication that we should do
> > something only because no-one has thought of a better
> > approach. 
> 
> When did I say/imply that?
> 
> > We always have the option of not doing
> > something that adds complexity with little or no actual
> > gain.
> 
> Right, but right now we don't have the option of doing nothing.
> Either we do a fairly major surgery on an existing document,
> or we do something else.  In any case, you are not comfortable
> with any of these alternatives and so we might as well drop
> discussion of them.

To be clear, it's a general discomfort, not something
specific to any of the particular alternatives. I don't
have an appetite for a lot of work on something that is
not immediately useful here, especially because our
mission is writing down how this stuff works right now.


> However, given how ingrained the idea of list of chunks is, I 
> don't think that this can be thought of as a minimally invasive 
> featurectomy.

I'd like to understand this more. What makes this harder
than just setting a limit?


> That's why I thought of alternatives that you
> have trouble with.  If you and the rest of the working group
> think the surgery to remove support for multi-chunk operation
> is the most expeditious way to proceed, I don't have a problem 
> with it.

IMO, a ULB is allowed to set some limits on the use of
the facilities in the underlying transport. The approach
I'm thinking of is defining those limits in rfc5667bis,
possibly replacing the description of how to use multiple
chunks.
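
A minimal sketch of what "setting a limit" could look like on the receiving side, assuming a single-chunk limit per list. The names (`RpcRdmaRequest`, `MAX_READ_CHUNKS`, `MAX_WRITE_CHUNKS`, `within_ulb_limits`) are hypothetical and come from no draft or implementation; chunks are reduced to plain byte lengths for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical per-RPC limits a ULB might declare. The value 1 mirrors the
# "single chunks only" idea in this thread; it is an assumption, not text
# from rfc5667bis.
MAX_READ_CHUNKS = 1
MAX_WRITE_CHUNKS = 1


@dataclass
class RpcRdmaRequest:
    # Each entry stands in for one chunk; only its length in bytes is kept.
    read_chunks: List[int] = field(default_factory=list)
    write_chunks: List[int] = field(default_factory=list)


def within_ulb_limits(req: RpcRdmaRequest) -> bool:
    """Return True if the request stays within the assumed ULB chunk limits."""
    return (len(req.read_chunks) <= MAX_READ_CHUNKS
            and len(req.write_chunks) <= MAX_WRITE_CHUNKS)
```

A receiver using such a check would accept a request carrying one Read chunk and treat one carrying two Write chunks as exceeding the ULB's limit; how the excess is reported is left open here.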


> > I'm interested right now in hearing other people's opinions
> > on whether it is worth completing rfc5667bis as strictly a
> > document of existing implementations, or whether it should
> > continue to include what amounts to a speculative feature.
> 
> I'm also interested in hearing other people's opinions.
> 
> I'm having trouble with the idea that a major part of the
> current rfc5667bis, which I thought was pretty close to
> WGLC, has suddenly become "speculative".  
> 
> That does not mean that I think we need to keep things 
> as they are, but I think it has to be understood that to 
> remove this, we would have to do some pretty substantial 
> surgery on the current document.
> 
> > I could make do with permitting only single chunks in
> > Version One, and explore support for multiple chunks
> > (and/or more complex COMPOUNDs) in Version Two.
> 
> If that's what you want to do, and the rest of the working group is
> OK with it, I don't have a problem.  
> 
> But note that the following documents are written to support multiple
> chunks in a request:
> 	• RFC5666
> 	• RFC5667
> 	• rfc5666bis
> 	• rfc5667bis (at least up until -06)
> 	• draft-cel-rpcrdma-version-two
> so the exploration involved is going to require prototype implementation, 
> if anyone is interested in doing that.
> 
> > > > and Version Two is years away from
> > > > appearing in storage products in a robust form.
> >>
> >> Probably so, but to me, the fact that something is going to take a
> >> while to do makes it more appropriate to push forward, rather than
> >> less.  And whatever you think about Version One, it lacks:
> >>       • General remote invalidation support
> >>       • A default inline threshold size that is appropriate to a protocol that has neither remote invalidation support nor message continuation.
> >>       • The ability to decide on and use a threshold bigger than the default.
> 
> > Right now Linux has Remote Invalidation support, a
> > 4KB default inline threshold (when interoperating
> > with another Linux system), and the ability to decide
> > on and use a larger threshold (up to 64KB). In other
> > words, everything you've named here, minus "general
> > Remote Invalidation".
> 
> I hadn't known that.

You might be remembering that I was leery of a CM-private-
data-based approach, originally.

Christoph and others in the Linux community convinced me
this mechanism would be an appropriate platform for
experimenting with the features you listed above.


> > The mechanism it uses to do these things is documented
> > in a published personal draft:
> >
> >   draft-cel-nfsv4-rpcrdma-cm-pvt-msg
> 
> If, as I believe, you are suggesting that this is an alternative to
> accelerating work on Version Two, then I would be OK with that,
> provided that we push forward on this work instead.  I will be looking
> at this document with a view to seeing what barriers exist to making 
> it a working group document.

Please remember this was conceived of as an enabler for
experimentation. It is a naive design. However, as simple
as the CM-private-data approach is, it does have a limited
ability to be extended. It would not be difficult to add
fields that report how many of each chunk type an
implementation supports per RPC, for example.
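
A rough sketch of that kind of extension, assuming a made-up fixed layout. This is NOT the format defined in draft-cel-nfsv4-rpcrdma-cm-pvt-msg; the field order, field widths, and the names `pack_private_data` and `unpack_private_data` are all hypothetical.

```python
import struct

# Hypothetical layout, network byte order, no padding:
#   version (1 byte), flags (1 byte), inline threshold in KB (2 bytes),
#   max Read chunks per RPC (1 byte), max Write chunks per RPC (1 byte)
_FMT = "!BBHBB"
_SIZE = struct.calcsize(_FMT)


def pack_private_data(version, flags, inline_kb, max_read_chunks, max_write_chunks):
    """Encode the assumed fields into a private-data byte string."""
    return struct.pack(_FMT, version, flags, inline_kb,
                       max_read_chunks, max_write_chunks)


def unpack_private_data(buf):
    """Decode the assumed fields; trailing bytes are ignored so that later
    revisions could append fields without breaking this parser."""
    if len(buf) < _SIZE:
        raise ValueError("private data too short")
    return struct.unpack(_FMT, buf[:_SIZE])
```

Ignoring trailing bytes on receive is one way such a format could stay extensible: a peer that knows only the shorter layout still parses the fields it understands.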


> This shouldn't be too hard, given that
> prototypes exist and that it provides some of the performance 
> help we need.

> > Linux client happens to support responder's choice
> > Remote Invalidation. Proper generic support for Remote
> > Invalidation will be needed to include other clients,
> > though the only other current client implementation
> > would need significant internal re-architecture to use
> > Remote Invalidation of any kind.
> 
> OK.  If expanding the client set is blocked, perhaps it would
> be best if this document were made a working group item with a 
> view toward encouraging other servers to support this mechanism
> interoperably.

I have heard a rumor that the Solaris team is interested
in this idea, at least to allow somewhat larger inline
thresholds, both on the Solaris client and server.


> > Given the existence of the CM private data mechanism,
> > IMO we are safe focusing on adding real value to
> > Version Two rather than rushing it forward.
> 
> I don't think that's what I was proposing, but there isn't
> much point in arguing about it.  I'm OK with putting
> additional focus on the CM private data mechanism
> instead. I don't think I'm proposing "rushing" that forward
> either, but you will have a chance to object to any particular 
> steps you feel are imprudent.
> On Wed, Mar 1, 2017 at 11:40 AM, Chuck Lever <chuck.lever@oracle.com> wrote:

--
Chuck Lever