Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-06

karen deitke <karen.deitke@oracle.com> Mon, 06 March 2017 15:37 UTC

To: nfsv4@ietf.org
References: <CADaq8je8zfRN5R11LxJw=0st-u-XOoKosGbZDBajOTiChzpS5Q@mail.gmail.com> <93F476D6-57F8-44AB-94C9-545608396F51@oracle.com> <CADaq8jcJ3WkpmPJVVec5aJc0ekKgdHPUok=S5_ofGVJnbqrrjA@mail.gmail.com> <5538FD5E-A71B-4F91-AC3A-CBD2F54AF9E3@oracle.com> <de109940-7de1-1a09-51f3-d3be44d98c60@talpey.com> <CADaq8jf5zU0y=v4gaUxVd4scQQwyAEcgWtp11Ddcn=U4jB17pA@mail.gmail.com> <CADaq8jea99i8L=tYKM=6T-Mu78n_qzmMwrKGSsWhmgpBytZMiQ@mail.gmail.com> <D2083198-E667-4B71-AAC5-D26318BE52D6@oracle.com> <CADaq8jeegoga-kB+a4e6QQEdLSCrTOmpbkSTk+4SmbqzCAfXgw@mail.gmail.com> <ACE665A3-0859-47E8-BBD6-E98A401B7656@oracle.com> <CADaq8jdgdO1k3iW9yo7n2N1Yo6cAjvXznaWk-tN3ChftmzMJfQ@mail.gmail.com> <4D6DCECB-BDF1-48E6-B59E-0A98D1252C8A@oracle.com> <CADaq8jeheLhSHhn9w+fiPXMQARGJAc0665NWpcwWC2QP-NQgJQ@mail.gmail.com>
From: karen deitke <karen.deitke@oracle.com>
Organization: Oracle Corporation
Message-ID: <bdd75daf-8d01-e0ab-d4aa-a759676e250b@oracle.com>
Date: Mon, 06 Mar 2017 08:37:25 -0700
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1
MIME-Version: 1.0
In-Reply-To: <CADaq8jeheLhSHhn9w+fiPXMQARGJAc0665NWpcwWC2QP-NQgJQ@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------68F99D70B7913B8D27DEE9F7"
X-Source-IP: userv0022.oracle.com [156.151.31.74]
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/TojEHiQUOYcZsP7Zn9BFzWwNcI0>
Subject: Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-06


On 3/6/2017 4:22 AM, David Noveck wrote:
> >> However, given how ingrained the idea of list of chunks is, I
> >> don't think that this can be thought of as a minimally invasive
> >> featurectomy.
>
> > I'd like to understand this more. What makes this harder
> > than just setting a limit?
>
> Perhaps I'm carrying the surgical analogy too far, but I want
> to make it clear that I'm not saying this would be anything like
> Ben Carson separating conjoined twins.
>
> What I'm basically saying is that this is a lot more than wart removal
> and is probably harder than you think.
>
> > The problem with "just setting a limit" is that it results in a
> > document that doesn't make sense.
>
> > IMO, a ULB is allowed to set some limits on the use of
> > the facilities in the underlying transport.
>
> I agree.
>
> > The approach I'm thinking of is defining those limits in rfc5667bis,
> > possibly replacing the description of how to use multiple
> > chunks.
>
> You've just described multiple approaches and need to
> decide on one before you scrub up.
>
> If you just define those limits, you will wind up with a
> document that doesn't make sense.  rfc5666bis defines
> sensible rules for matching write chunks with data items
> and rfc5667bis-06 sensibly modifies those to account for
> the peculiarities of READ_PLUS.
>
> As a result, simply adding those limits will result in a
> confusing document.
>
> If you remove those matching rule modifications, you will
> have to deal with the restrictions on what READ_PLUS
> can return.  Those are still valid, but would be stated
> differently in an environment in which there are no requests
> with multiple chunks.
>
> I had an easier task with nfsulb.  In that case, the existing text
> makes sense as the generic NFS ULB description, and adding
> a version-one-specific restriction is not incongruous.
>
> For rfc5667bis, this is surgery and the patient is likely to survive.
> Unfortunately, there is no way to properly compensate the
> surgeon :-(
>
> > You might be remembering that I was leery of a CM-private-
> > data-based approach, originally.
>
> I didn't remember that.
>
> > Christoph and others in the Linux community convinced me
>
> Sounds like one or more acknowledgements are in order.
>
> > this mechanism would be an appropriate platform for
> > experimenting with the features you listed above.
>
> It appears that the experiment has been successful :-)
>
> > Please remember this was conceived of as an enabler for
> > experimentation.
>
> I'm not sure exactly what that means.  Any proposal is an
> experiment in that it needs to be implemented.  If it can't,
> the experiment has failed.  In this case, the experiment has
> succeeded.
>
> There was certainly nothing slapdash about this document.
>
> > It is a naive design.
>
> I would not use the word "naive" regarding the design.
> It is simple and accomplishes what needs to be done,
> which is enough for me and I hope it will be enough for
> the working group.
>
> There is one small element that does seem naive to me
> and I cover it in my review, which should be out soon.
>
> > However, as simple
> > as the CM-private-data approach is, it does have a limited
> > ability to be extended.
>
> I noticed that but I wasn't particularly concerned about
> extensibility.
>
> > It would not be difficult to add
> > fields that report how many of each chunk type an
> > implementation supports per RPC, for example.
>
> I thought about adding a flag indicating that the server
> could support more than one read chunk or write
> chunk, but I didn't think you wanted to make rfc5667bis
> dependent on a CM private data RFC.
>
> > I have heard a rumor that the Solaris team is interested
> > in this idea, at least to allow somewhat larger inline
> > thresholds, both on the Solaris client and server.
>
> Now the whole working group has heard that rumour.
>
> If we go forward with this, as I expect we will, then it
> would be good if we could hear from the Solaris team directly.
>
>
>
> On Sun, Mar 5, 2017 at 12:51 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
>
>     > On Mar 1, 2017, at 11:57 AM, David Noveck <davenoveck@gmail.com> wrote:
>     >
>     > > It's not a question of whether or not I liked a particular
>     > > proposed mechanism. My job as editor of rfc5667bis is to
>     > > keep this document on track and limit creeping scope.
>     >
>     > Fair enough.  Instead of saying:
>     >
>     > I made some suggestions in that regard which you didn't like
>     >
>     > I should have said:
>     >
>     > I made some suggestions in that regard that you felt would
>     > lead the document off track and might result in an undesirable
>     > expansion of scope.
>     >
>     > > I'm having trouble considering more work and more special
>     > > casing in rfc5667bis to detect support for features that
>     > > do not yet have a real world application, and for which
>     > > there is little ability to prototype and test.
>     >
>     > I hear you.
>     >
>     > > Given your publicly-stated desire to see this document
>     > > published soon, I'm surprised you would consider
>     > > introducing new mechanisms at this point.
>     >
>     > I wanted to explore the ways this could be done.  I guess
>     > the question regarding "at this point" is exactly where we are.
>     > A few days ago, I thought the document was quite close to
>     > WGLC.  Now it appears it is not and I'm not sure exactly where
>     > this document is.
>     >
>     > > Further, I do not like the implication that we should do
>     > > something only because no-one has thought of a better
>     > > approach.
>     >
>     > When did I say/imply that?
>     >
>     > > We always have the option of not doing
>     > > something that adds complexity with little or no actual
>     > > gain.
>     >
>     > Right, but right now we don't have the option of doing nothing.
>     > Either we do a fairly major surgery on an existing document,
>     > or we do something else.  In any case, you are not comfortable
>     > with any of these alternatives and so we might as well drop
>     > discussion of them.
>
>     To be clear, it's a general discomfort, not something
>     specific to any of the particular alternatives. I don't
>     have an appetite for a lot of work on something that is
>     not immediately useful here, especially because our
>     mission is writing down how this stuff works right now.
>
>
>     > However, given how ingrained the idea of list of chunks is, I
>     > don't think that this can be thought of as a minimally invasive
>     > featurectomy.
>
>     I'd like to understand this more. What makes this harder
>     than just setting a limit?
>
>
>     > That's why I thought of alternatives that you
>     > have trouble with.  If you and the rest of the working group
>     > think the surgery to remove support for multi-chunk operation
>     > is the most expeditious way to proceed, I don't have a problem
>     > with it.
>
>     IMO, a ULB is allowed to set some limits on the use of
>     the facilities in the underlying transport. The approach
>     I'm thinking of is defining those limits in rfc5667bis,
>     possibly replacing the description of how to use multiple
>     chunks.
>
>
>     > > I'm interested right now in hearing other people's opinions
>     > > on whether it is worth completing rfc5667bis as strictly a
>     > > document of existing implementations, or whether it should
>     > > continue to include what amounts to a speculative feature.
>     >
>     > I'm also interested in hearing other people's opinions.
>     >
>     > I'm having trouble with the idea that a major part of the
>     > current rfc5667bis, which I thought was pretty close to
>     > WGLC, has suddenly become "speculative".
>     >
>     > That does not mean that I think we need to keep things
>     > as they are, but I think it has to be understood that to
>     > remove this, we would have to do some pretty substantial
>     > surgery on the current document.
>     >
>     > > I could make do with permitting only single chunks in
>     > > Version One, and explore support for multiple chunks
>     > > (and/or more complex COMPOUNDs) in Version Two.
>     >
>     > If that's what you want to do, and the rest of the working group is
>     > OK with it, I don't have a problem.
>     >
>     > But note that the following documents are written to support multiple
>     > chunks in a request:
>     >       • RFC5666
>     >       • RFC5667
>     >       • rfc5666bis
>     >       • rfc5667bis (at least up until -06)
>     >       • draft-cel-rpcrdma-version-two
>     > so the exploration involved is going to require prototype implementation,
>     > if anyone is interested in doing that.
>     >
>     > > > > and Version Two is years away from
>     > > > > appearing in storage products in a robust form.
>     > >>
>     > >> Probably so, but to me, the fact that something is going to take a
>     > >> while to do makes it more appropriate to push forward, rather than
>     > >> less.  And whatever you think about Version One, it lacks:
>     > >>       • General remote invalidation support
>     > >>       • A default inline threshold size that is appropriate to a
>     > >>         protocol that has neither remote invalidation support nor
>     > >>         message continuation.
>     > >>       • The ability to decide on and use a threshold bigger than
>     > >>         the default.
>     >
>     > > Right now Linux has Remote Invalidation support, a
>     > > 4KB default inline threshold (when interoperating
>     > > with another Linux system), and the ability to decide
>     > > on and use a larger threshold (up to 64KB). In other
>     > > words, everything you've named here, minus "general
>     > > Remote Invalidation".
>     >
>     > I hadn't known that.
>
>     You might be remembering that I was leery of a CM-private-
>     data-based approach, originally.
>
>     Christoph and others in the Linux community convinced me
>     this mechanism would be an appropriate platform for
>     experimenting with the features you listed above.
>
>
>     > > The mechanism it uses to do these things is documented
>     > > in a published personal draft:
>     > >
>     > >   draft-cel-nfsv4-rpcrdma-cm-pvt-msg
>     >
>     > If, as I believe, you are suggesting that this is an alternative to
>     > accelerating work on Version Two, then I would be OK with that,
>     > provided that we push forward on this work instead. I will be looking
>     > at this document with a view to seeing what barriers exist to making
>     > it a working group document.
>
>     Please remember this was conceived of as an enabler for
>     experimentation. It is a naive design. However, as simple
>     as the CM-private-data approach is, it does have a limited
>     ability to be extended. It would not be difficult to add
>     fields that report how many of each chunk type an
>     implementation supports per RPC, for example.
>
>
>     > This shouldn't be too hard, given that
>     > prototypes exist and that it provides some of the performance
>     > help we need.
>
>     > > Linux client happens to support responder's choice
>     > > Remote Invalidation. Proper generic support for Remote
>     > > Invalidation will be needed to include other clients,
>     > > though the only other current client implementation
>     > > would need significant internal re-architecture to use
>     > > Remote Invalidation of any kind.
>     >
>     > OK.  If expanding the client set is blocked, perhaps it would
>     > be best if this document were made a working group item with a
>     > view toward encouraging other servers to support this mechanism
>     > interoperably.
>
>     I have heard a rumor that the Solaris team is interested
>     in this idea, at least to allow somewhat larger inline
>     thresholds, both on the Solaris client and server.
>
Yes, the Solaris team is definitely interested in this implementation
for larger inline thresholds.

Karen
>
>
>
>     > > Given the existence of the CM private data mechanism,
>     > > IMO we are safe focusing on adding real value to
>     > > Version Two rather than rushing it forward.
>     >
>     > I don't think that's what I was proposing, but there isn't
>     > much point in arguing about it.  I'm OK with putting
>     > additional focus on the CM private data mechanism
>     > instead. I don't think I'm proposing "rushing" that forward
>     > either, but you will have a chance to object to any particular
>     > steps you feel are imprudent.
>     > On Wed, Mar 1, 2017 at 11:40 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
>     --
>     Chuck Lever
>