Re: [nfsv4] New draft for working group charter

Chuck Lever <chuck.lever@oracle.com> Fri, 12 May 2017 16:43 UTC

From: Chuck Lever <chuck.lever@oracle.com>
In-Reply-To: <CADaq8jeZwhPFp5MwQ5+0Cj5K6YygZVj6bQ+JRPkjdoo9p5RC9w@mail.gmail.com>
Date: Fri, 12 May 2017 12:38:33 -0400
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>, "nfsv4-chairs@ietf.org" <nfsv4-chairs@ietf.org>, Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <B7E5EBC0-388E-44E7-936E-B3D3D3F8F742@oracle.com>
References: <CADaq8jfiL4F4OmSMXOQRv-MYuQPFWc1Yo_U=KVphmr2KYc3mjw@mail.gmail.com> <237296BF-A24B-4AA6-A0B0-0E9F5B9F638C@oracle.com> <CADaq8jeg-EMPPu9dK3SzrOPqAD58i2tVFxkC9e=H+BEXR9LfkA@mail.gmail.com> <8C4B7F74-A336-4CAD-A1F4-568122312E43@oracle.com> <CADaq8jf3ee5q8BSmnVdNYRne0rAGTk3DdOKK7ba=Q9WxEyx8Gw@mail.gmail.com> <CADaq8jeQOadid_LQo9e3gXTBzB1daTAPXTD23X5jSra9K_v+Gg@mail.gmail.com> <CADaq8jeZwhPFp5MwQ5+0Cj5K6YygZVj6bQ+JRPkjdoo9p5RC9w@mail.gmail.com>
To: David Noveck <davenoveck@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/l1DXJITeNRZkaPZeZZCJ3RKCBkM>
Subject: Re: [nfsv4] New draft for working group charter

> On May 12, 2017, at 9:54 AM, David Noveck <davenoveck@gmail.com> wrote:
> 
> 
> > I'm wondering why there is now a special charter focus on
> > performance, but not on virtualization, which has been a
> > de facto focus of WG effort for the past five years.
> 
> Let's be clear on the context and stress that what I posted is my draft.  I posted
> it to facilitate discussion.  It is not posted on a this-is-the-answer basis or a 
> take-it-or-leave-it basis.  It is posted on a suggest-a-change-or-an-
> alternative basis.
> 
> > Why isn't it enough to say "The WG is responsible for
> > extending these protocols as needed."
> 
> I think we could get by with that but let me explain why I
> had a specific section in my draft.  
> 
> I think that a draft without this section would not give the IESG
> a fair picture of the work we would be doing.

In that case, how about instead of a section dedicated to
just performance, there is a section that summarizes all
areas where WG members are focusing, which includes
enabling the performance attributes of advanced network
fabrics and new storage technologies, enabling use of NFS
in warehouse virtualization environments, helping NFS adapt
to increasing network security challenges, and extending
the life of legacy storage by enabling pNFS access to it?


> Let's consider 
> the likely agenda for our meeting at IETF99.
> 
> If you exclude the charter discussion, over half the minutes will
> be taken up by performance-related sessions: 20 minutes for
> you to "drone" :-), 40 minutes for Christoph's pNFS mapping types,
> and probably 15 minutes devoted to trunking-related stuff.   This
> is over 60% devoted to performance-related work.  Right now, it
> is what we are doing, for good reasons.
> 
> A lot of these things will not have milestones in the draft we submit.
> In that case we will have to add milestones afterwards.  It is likely
> that Spencer D would approve those (as extensions) even if there 
> were not a specific performance section.  However, I am worried
> that if there is no performance section in the charter, the IESG
> might wonder about us straying from the agreed charter in devoting
> so much effort to performance.
> 
> We might get pushback on the performance section from the IESG, 
> and if we get that we can drop it.  However I'd prefer us to explain now 
> the reasons we expect a considerable focus on performance over the
> next few years.
> 	• For a long while, due to some fortunate circumstances, we have been behind the performance curve and have struggled to catch up.
> 	• Just as our efforts to do so have started to bear fruit, there is another likely burst of technological innovation, in the form of storage class memory, that will up the performance ante, requiring the WG to spend further efforts on the performance front.
> 
> > If there is a constraint on what may be included in our
> > charter, it seems like there will be a problem including
> > performance in particular, as there are no WG documents
> > specifically about performance.
> 
> In fact, you just spent a while working on three of them.  Clearly,
> the basic reason for RPC-over-RDMA is performance, even if the
> title is not anything like the Trumpian "Really Huge NFS 
> Performance" :-)

Yes, NFS/RDMA is a performance play, but IMO performance
is orthogonal here.

The WG's role in the RPC-over-RDMA Version One work was
clarification to produce refreshed standards. The WG is
responsible for ensuring correct interoperation. The WG
did not do any implementation or measurement, and did not
ask for a performance-related justification before I
started this work.

We'll have to agree to disagree on this.


> > If protocol innovation is indeed driven by the individuals
> > on the WG, 
> 
> It is.
> 
> > then it seems to me that "performance" should
> > not be part of our charter. 
> 
> I don't see the logic here.  Extensions are also driven by
> individuals in/on the WG, yet that is in the charter.  Putting
> something in the charter does not imply that this will be
> driven by someone other than the individuals in the WG.

Extension is a means to an end that is mediated by WG
actions. There can't be a protocol extension without
interaction with the WG.

Performance work is as much about implementation as it
is about protocol. Performance needs are discovered
outside the WG, and that is also where new protocol is
frequently invented. The WG's role in performance work
is IMO much more tenuous.


> > Improving performance should
> > be facilitated by the WG / IETF partnership, but driven by
> > the diaspora.
> 
> It should be driven by people who have ideas and are willing to 
> explain and implement them, wherever they happen to live.  I 
> hope the IETF will facilitate that.
> 
> > I'm comfortable with the Maintenance and Extension sections,
> 
> Good.  It appears that Spencer S. is not comfortable 
> with the latter although he has not proposed an alternative.
> 
> > and might even add -- for completeness -- that the WG is
> > responsible for approving changes to certain sections of
> > the IANA registries.
> 
> I don't see the point of doing that. I don't expect the IESG to 
> care very much.  This is the kind of thing I would add if it
> seemed that the charter was too short.

It's not filler, IMO. It's almost less than a single
sentence. Who would approve change requests on those
registries if the WG were dissolved?

I don't see any harm in adding this, considering there
has been activity in this area since the charter was
last updated.


> > I am in philosophical agreement with the technical goal of
> > better performance, and will continue working on it. 
> 
> Good.  Your work so far has put us in a much better position
> and I hope we can maintain the momentum.
> 
> 
> > Still not sure whether that is something that needs to appear in
> > the WG charter.
> 
> Is it fair to characterize your current position on this as "No Objection"?

I think I'd like you to consider the expanded summary
of WG member activity I proposed above, instead of a
section that focuses solely on performance work.


> If so, I'll leave the performance section in for now and we can resolve 
> this question at IETF99.
> 
> 
> 
> On Fri, May 12, 2017 at 8:56 AM, David Noveck <davenoveck@gmail.com> wrote:
> > cm-pvt-msg could move forward, but there has been a palpable
> > objection to making this work an official standard. Thus I
> > don't feel it is ready to promote without further discussion.
> 
> I wasn't suggesting that it move forward without further discussion.
> I was suggesting that we could well have the necessary discussion
> by IETF-99.  
> 
> My understanding is that the objection is to it being a standards-track 
> document, so that it could move forward as a WG Experimental 
> document and so get a milestone.
> 
> I think this is right as a Proposed Standard but it isn't really worth 
> spending a lot of time arguing about the matter.  The
> important facts are that there is an implementation of this
> in the Linux client, and that it could help address two important 
> performance issues in RPC-over-RDMA Version One: invalidation 
> overhead and excessive use of reply chunks.   As a result, server
> implementers that care about performance will implement it, 
> whether it is a Proposed Standard, an Experimental RFC, an 
> Informational RFC, a work-in-progress, or an expired individual
> submission. 
> 
> I think it is best for us to have the discussion and move the
> document  forward with whatever status the working group 
> can agree on.
> 
> > RPC-over-RDMA Version Two could become a large project.
> 
> It doesn't have to be.  As I recall, you proposed, with good
> reason, a very small Version Two.
> 
> As far as I'm concerned Version Two, as defined in draft-cel-rpcrdma-version-two
> is essentially Version One Done Right.   It is designed so that it is easy to
> create an implementation that interoperates with Version One peers.  The
> changes are small and limited to:
> 	• A larger default inline threshold.
> 	• A simple mechanism to exchange information about transport properties
> 	• The ability to define OPTIONAL extensions
> 	• A more flexible remote invalidation scheme

- There are significant improvements to error reporting.

- We need to figure out how to deal with one-way messages
in a protocol that uses request-grant credit management,
or drop the use of one-way messages.

- There need to be new properties that describe transport
header size limits, chunk list length limits, segment
size limits, and so on.

- If V2 can take more time to mature, I would like to
consider ways to eliminate the need for reply size
estimation. I have a mechanism in mind that also can
reduce the frequency of Reply chunks.

- There are currently no V2 prototypes.

In other words, the scope of V2 has already stretched,
and what's more, it feels like more oak-barrel aging is
needed.
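
The one-way message item above can be illustrated with a toy model.
This is only a sketch under simplifying assumptions, not the
RPC-over-RDMA wire protocol: the class name CreditedSender and its
methods are hypothetical, and each reply is assumed to hand back
exactly one credit. It shows why a message that never gets a reply
is awkward when credits are replenished only by replies.

```python
class CreditedSender:
    """Toy request-grant credit accounting (hypothetical, simplified)."""

    def __init__(self, initial_credits):
        self.credits = initial_credits  # sends currently permitted
        self.in_flight = 0              # requests still awaiting a reply

    def send(self, expects_reply):
        """Try to send one message; return False if no credit is left."""
        if self.credits == 0:
            return False
        self.credits -= 1
        if expects_reply:
            self.in_flight += 1
        return True

    def receive_reply(self):
        """A reply retires one request and (by assumption) returns one credit."""
        if self.in_flight == 0:
            raise RuntimeError("reply received with no request outstanding")
        self.in_flight -= 1
        self.credits += 1
```

With two credits, two one-way sends (expects_reply=False) leave the
sender stalled: nothing is in flight, so no reply will ever arrive
to restore a credit. A real design would need some other way to
return those credits, or would avoid one-way messages.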


> > There are some issues that the WG needs to consider
> > carefully before enabling such an effort, most especially
> > what is the appetite by vendors/implementers for a new
> > version of the protocol, but also, who has the resources
> > to construct prototypes? 
> 
> I think the only way to find out is to ask people and the best way
> to do that is to discuss the issue of moving this forward.  If  we had
> that discussion and the working group was supportive, I would 
> probably look, when I implemented a Version One server, at how 
> much work it would take to make it compatible with Version Two
> as well.   I think that information might be helpful to people.
> 
> I think we should just let the working group decide this issue.

Yes, I'd like more WG discussion. I'm simply not putting
a time limit on it because there seems to be a lot more
work to be done. Like you said above, if the WG reaches
some consensus about this later, it can be added to the
charter.

Or we can change the admittance criteria for the charter,
and let this in now, without a WG document. It seems to
me if the WG agrees to change the charter to include
some item, that is enough consensus to assume that there
will be some work done in that area. Perhaps having an
extant WG document is too high a bar?


> if we 
> can't do that by IETF99, we might be able to do so by the Fall
> Bakeathon.  I don't see what kind of advance consideration you 
> are anticipating having before proceeding to have the working 
> group discussion.
> 
> > Do we understand and agree on
> > what problems we are trying to address in the longer term?
> 
> No we don't but that is the reason for an extensible protocol.
> We need that sort of agreement when we decide on what
> extensions to develop.  I agree that that decision is a ways off.
> 
> > Have we reached the end of RoRV1's ability to deliver
> > performance?
> 
> Probably not.  There is a lot of implementation work to be done, but if 
> people do find bottlenecks that require an extension, it is better to be
> prepared.  The reason for Version Two is to be able to make limited
> changes when necessary.  It will not make performance improvements
> on its own, except perhaps in environments that could benefit from the
> more general remote invalidation support.

My point is that there is a limited pool of resources to
work on both implementations and protocol development.
Any work done on V2 means there is something that falls
off of some part of the V1 plate.

Now that the V1 documents are reaching maturity and
stability, many of us would like to focus on improving
our V1 implementations, and move ahead with the new
pNFS layout type ideas.

That doesn't mean V2 work can't happen, but I don't like
the idea of treating it like there's a giant vacuum that
has to be filled right now with new protocol work.


> > There have been questions about the scope of RoRV2 for a
> > while. 
> 
> I recall hearing claims that a big Version Two is needed but I
> haven't heard them for a while.  If there are still people 
> advocating a large/massive Version Two, we can have the
> necessary discussion as part of proposing advancing our
> minimal Version Two.
> 
> > The answer shifts depending on whether we think
> > there is or is not a suitable out-of-band mechanism for
> > enabling Remote Invalidation and large inline thresholds.
> 
> I don't think that's the case.  I think the question of
> the out-of-band mechanism affects the urgency of 
> proceeding with a minimal Version Two.

I think I said that.


> Put another way, if cm-pvt-msg had never existed, then 
> Version Two would be an urgent matter to be addressed
> immediately.  Given that it does exist, I think we can take 
> a more leisurely approach and look to make this decision
> in the next six months.

A more leisurely approach means we have room to consider
a wider variety of approaches for V2. Unless we explicitly
decide not to.

The tl;dr is there needs to be an explicit discussion
revisiting the scope of V2, after the disposition of
cm-pvt-msg is decided, if only to confirm that the
scope remains the same.


> > To promote this document I would like to have clarification
> > of the scope of RoRV2; and that in turn requires clarity on
> > the disposition of cm-pvt-msg.
> 
> I don't see the need but I think the clarity is available.  The formalities
> don't really matter.  What matters is running code and it appears that
> the running code provides the mechanism you are looking for, and it
> has been and will be implemented.  I don't see how that affects the 
> scope of Version Two.  What is in rpcrdma-version-two seems a 
> good fit for our current needs in any case.

--
Chuck Lever