Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-06

Chuck Lever <chuck.lever@oracle.com> Tue, 28 February 2017 17:47 UTC

From: Chuck Lever <chuck.lever@oracle.com>
In-Reply-To: <CADaq8jea99i8L=tYKM=6T-Mu78n_qzmMwrKGSsWhmgpBytZMiQ@mail.gmail.com>
Date: Tue, 28 Feb 2017 09:46:58 -0800
Content-Transfer-Encoding: quoted-printable
Message-Id: <D2083198-E667-4B71-AAC5-D26318BE52D6@oracle.com>
References: <CADaq8je8zfRN5R11LxJw=0st-u-XOoKosGbZDBajOTiChzpS5Q@mail.gmail.com> <93F476D6-57F8-44AB-94C9-545608396F51@oracle.com> <CADaq8jcJ3WkpmPJVVec5aJc0ekKgdHPUok=S5_ofGVJnbqrrjA@mail.gmail.com> <5538FD5E-A71B-4F91-AC3A-CBD2F54AF9E3@oracle.com> <de109940-7de1-1a09-51f3-d3be44d98c60@talpey.com> <CADaq8jf5zU0y=v4gaUxVd4scQQwyAEcgWtp11Ddcn=U4jB17pA@mail.gmail.com> <CADaq8jea99i8L=tYKM=6T-Mu78n_qzmMwrKGSsWhmgpBytZMiQ@mail.gmail.com>
To: David Noveck <davenoveck@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/EHelnuxqYdEpFbIggt7bThXBVfQ>
Cc: Tom Talpey <tom@talpey.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-06

> On Feb 27, 2017, at 6:09 PM, David Noveck <davenoveck@gmail.com> wrote:
> 
> > Earlier I had believed we had some historical reasons
> > for matching the intent of RFC 5667. 
> 
> I think we do, where that intent is clear.
> 
> > But if we are
> > interested only in documenting the behavior of
> > existing implementations, 
> 
> For some of the cases you mention, there is no possible 
> ambiguity.  The clear intent of RFC5666 and RFC5667
> is that multiple READs and WRITEs were supposed to be
> allowed and considerable effort was expended in explaining
> how they would work.

If you're referring to handling multiple Read or
Write chunks per RPC, there is no clear intent.
That text in RFC 5667 is clearly unfinished, and
its state is a major reason the WG embarked on
this update.


> So it appears that some of the things on your list are bugs.  We
> may be forced, for practical reasons, to turn a blind eye to these
> bugs for some period of time, but this is not the same sort of
> situation as the PZRC-read-chunk issue, which is a fairly minor nit
> based on text within RFC5667 that defies clear interpretation.  Sigh!

Again, the WG still has one author of that document
who is available to us to confirm the intent of
that text.

The problem is existing implementations do not
match the capabilities described in RFC 5667.


> > note that:
> 
> interesting/depressing list deleted

Not depressing. Basically this list means that the
world has survived adequately with incomplete
implementations of RPC-over-RDMA and NFSv4 for quite
some time. There has been no need to support multiple
NFSv4 READ and WRITE operations in a COMPOUND on RDMA
or on any other transport.

The issue is whether the WG believes rectifying that
situation is an immediate need, or something that
would be nice to allow for the future, or something
that is not necessary for the remaining future of
RPC-over-RDMA Version One.


> > This could be an argument to, in addition, remove a
> > large piece of text that discusses multiple Write chunks.
> 
> Perhaps it could.
> 
> > However, it might be interesting to some to leave
> > some flexibility for future use.
> 
> Do you mean future use within Version One?  Or we
> could limit Version One to what it is now, in the expectation
> that it will not exist for very long.

Again I am at odds with your view. Version One
has plenty of capability to last for quite a
while, and Version Two is years away from
appearing in storage products in a robust form.


> Then we 
> could use base Version Two to implement what
> Version One was supposed to be and get a small set
> of performance goodies at the same time.
> 
> If we don't want to write off Version One in that way,
> we need a reliable means of distinguishing servers
> with full COMPOUND support from those that you
> describe.

> For example, when you say a server only supports a single READ
> operation, what does it do if it gets a COMPOUND with more than
> one?  We could have something to work with if:
> 	• All such servers did the same thing.
> 	• It didn't result in disconnection or memory corruption.
> I thought about an extension adding a full_rdma_compound_support
> attribute but that doesn't work for V4.0.
> 
> BTW, do any of these old-fashioned servers with these bugs recognize
> and report DDP-eligibility violations?  If not, this could be a foolproof way
> to distinguish servers that may have these implementation gaps from newer
> ones that should have full COMPOUND support for RDMA.

Given the experience we had trying to detect
the completeness of protocol features with
RPC-over-RDMA, I really don't relish going
down that path.

As far as I am aware, no client sends such
COMPOUNDs. There is some support in the Linux
NFS server to handle such COMPOUNDs, but as
you might guess, there's been no testing of
this facility with real clients.


> On Mon, Feb 27, 2017 at 11:38 AM, David Noveck <davenoveck@gmail.com> wrote:
> One of the issues that Chuck has to deal with is the need
> to make new implementations interoperable with existing
> implementations.  That has been the source of some new
> MUSTs.
> 
> Regarding the use of MUSTs, I think these terms are overused
> in general and that RFC2119's suggestion that these be used
> sparingly (which ironically uses a "MUST") is too often ignored,
> including by  the IESG. 
> 
> Regarding your suggestion that 5667bis is using "MUST" too
> much, a comparison with RFC5667 is instructive.  Including
> "MUST NOT"s, the RFC2119 term "MUST" is used:
> 	• 22 times in RFC5667, a 10-page spec.
> 	• 14 times in rfc5667bis-06, an 18-page spec.
> Part of Chuck's advantage here is that he deleted a lot of the
> duplication in which RFC5667 either repeated what was in
> RFC5666 specialized to NFS (useless) or contradicted it (which
> really had to go).
> 
> 
> On Mon, Feb 27, 2017 at 8:16 AM, Tom Talpey <tom@talpey.com> wrote:
> On 2/26/2017 3:29 PM, Chuck Lever wrote:
> On Feb 25, 2017, at 3:54 PM, David Noveck <davenoveck@gmail.com> wrote:
> 
> RFC 5667 Section 4 says:
> 
> Similarly, a single RDMA Read list entry MAY be posted by the client
> to supply the opaque file data for a WRITE request or the pathname
> for a SYMLINK request.
> 
> Part of the problem here is that, as you discuss later, this statement is
> ambiguous, as the meaning of "read list entry" is not clear.
> 
> The server MUST ignore any Read list for
> other NFS procedures,
> 
> As I understand it, this statement cannot apply to PZRCs, and rfc5666bis
> has already dealt with that issue.  So, if one tried to maintain this paragraph
> in something like the RFC5667 form, some modification would have been
> necessary to avoid essentially preventing any use of PZRCs.
> 
> as well as additional Read list entries beyond
> the first in the list.
> 
> I take "Read list entry" to mean Read chunk, composed of
> multiple list entries that share the same XDR position.
> This comports with similar language describing Write
> chunks where a single list entry is indeed allowed to
> have multiple segments.
> 
> Makes sense to me.
> 
> However, the original intent might have been "single
> Read segment".
> 
> It might have been but there is no way to be sure.
> 
> We can ask Tom Talpey. If he does not recall, then
> we have no way to be sure.
> 
> I agree the paragraph in question could have been more clear. I'll
> hazard a guess that it should have been written as "Read list" instead
> of "Read list entry", meaning, an entire scatter list is provided.
> This would certainly match the semantic for the result of an ordinary
> NFS Read.
> 
> I will also observe that the statement is a MAY. That is, it prescribes
> no behavior, and offers a choice to the implementer. It does not rule
> out the option of posting a list.
> 
> I think you guys need to stop worrying about writing these "rules"
> down so literally. The only goal of RFC5667 was to isolate the tidbits
> of NFS behaviors separate from the core rpcrdma transport. The
> document makes relatively few MUST requirements.
> 
> Tom.
> 
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www.ietf.org/mailman/listinfo/nfsv4
> 

--
Chuck Lever
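
[Illustrative note, not part of the original message.] The quoted
thread reads "Read list entry" as a Read chunk, that is, the set of
Read list entries that share the same XDR position, with a chunk at
position zero being a PZRC. The Python sketch below shows that
grouping under stated assumptions only: the ReadSegment type, its
field names, and the group_read_chunks and classify helpers are
hypothetical names chosen for illustration, and none of this is taken
from RFC 5667, rfc5667bis, or any existing client or server.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadSegment:
    """One Read list entry (hypothetical model): XDR position plus one segment."""
    position: int  # XDR offset in the RPC message where the data belongs
    handle: int    # handle identifying the registered memory region
    length: int    # number of bytes in this segment
    offset: int    # offset or address within the registered memory


def group_read_chunks(read_list):
    """Group Read list entries into Read chunks by shared XDR position.

    Returns a mapping of position -> list of segments, in first-seen order.
    """
    chunks = OrderedDict()
    for seg in read_list:
        chunks.setdefault(seg.position, []).append(seg)
    return chunks


def classify(read_list):
    """Return (pzrc, other_chunks); pzrc is None when no position-zero chunk exists."""
    chunks = group_read_chunks(read_list)
    pzrc = chunks.pop(0, None)
    return pzrc, chunks


if __name__ == "__main__":
    # Made-up example values: two segments at one position, one at another.
    read_list = [
        ReadSegment(position=148, handle=0x10, length=4096, offset=0x1000),
        ReadSegment(position=148, handle=0x11, length=4096, offset=0x2000),
        ReadSegment(position=8360, handle=0x12, length=512, offset=0x3000),
    ]
    pzrc, chunks = classify(read_list)
    print("PZRC present:", pzrc is not None)
    for position, segments in chunks.items():
        total = sum(s.length for s in segments)
        print(f"chunk at XDR position {position}: "
              f"{len(segments)} segment(s), {total} bytes")
```

Under this reading, a server that accepts only "a single Read list
entry" would accept the first non-PZRC chunk returned by classify,
however many segments it contains. Under the alternative "single Read
segment" reading mentioned in the thread, it would accept only the
first element of that chunk.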