Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-03 (part one of three)

David Noveck <davenoveck@gmail.com> Wed, 21 December 2016 18:16 UTC

In-Reply-To: <5D4C5F04-CEB9-479E-BCAF-B1A40E48C6D9@oracle.com>
References: <CADaq8jdPR4+iodgRxhMsuhQDK2ufPo_s8PYMde3sjtZyc0B-9Q@mail.gmail.com> <5D4C5F04-CEB9-479E-BCAF-B1A40E48C6D9@oracle.com>
From: David Noveck <davenoveck@gmail.com>
Date: Wed, 21 Dec 2016 13:16:08 -0500
Message-ID: <CADaq8jfOdyeL0x7uoSh5ctWYzGphrNCdnFUPjm8oGO0ajDBRVQ@mail.gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Content-Type: multipart/alternative; boundary="001a113e04b49c79ef05442f2541"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/NmdsJ31rsuMVxUn08u-jROGAGrQ>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-03 (part one of three)

>I predict some resistance to section titles in a specification that include
> the word "Difficulties".

I see your point.  How about "Issues with"?

> I prefer to keep the NFS ULB document as agnostic to transport protocol
> version as possible.

That's in line with the requirements for ULBs presented in rfc5666bis.

> That would increase its useful lifespan, rather
> than requiring a fresh ULB later to handle subsequent versions of
> RPC-over-RDMA.

Which, I presume, is why you wrote the ULB requirements in rfc5666bis
the way you did.

> I assume you object to the use of terminology that mentions a particular
> RDMA mechanism (eg. RDMA Read, or Read chunk), since that is incompatible
> with potential extensions to RPC-over-RDMA which might use some other
> mechanism (eg. structured Send buffers).

That's the specific issue that I'm most concerned about, but I think you
have summarized the general issue above.  We want to make this
a ULB document that applies to RPC-over-RDMA in general and not tie
it to the specifics of Version One.

Another area of concern is that much of the current text implicitly assumes
that message continuation is not (and will never be) available.  That also
is Version-One-specific.

> On considering replacing that language with abstract terminology such as
> "DDP-eligible", that could make the document more difficult for
> implementers to interpret.

I don't think it will.  Consider what happened in rfc5666bis, where that
terminology was introduced.  While RFC5666 mentioned only the specifics of
the implementation, rfc5666bis introduced the abstract concept (i.e.,
DDP-eligibility) and was able to describe how that abstract concept was
implemented.  I found the result easier to interpret because it presented
both the abstraction and the implementation.

> It would certainly bear little resemblance to the original RFC 5667.

I think that is overstating it.  In the end I expect the difference between
RFC5667 and rfc5667bis to be about the same as between RFC5666 and
rfc5666bis.

> Likely all of Section 2 would need to be rewritten or removed. I believe
> removal of this section would be a disservice to implementers.

I don't believe that either of those should be done, and I didn't propose
major changes to Section 2.  Instead, I think that the material which is
Version-One-specific should be identified as Version-One-specific.  I think
it can serve as a helpful bridge, supplementing rfc5666bis, between the
level of abstract concepts and that of Chunks and explicit RDMA operations.

> IMO it's treading a fine line to consider non-RDMA Direct Data Placement
> mechanisms in a transport protocol called "RPC-over-RDMA"

I don't think the phrase "non-RDMA Direct Data Placement mechanisms" is
correct.  If we adopt the terminology of rfc5666bis, the use of SEND-based
DDP (using structured receive buffers) does use RDMA since RDMA SEND is
defined as an RDMA operation.  Such operations as RDMA READ and RDMA WRITE
are referred to as "explicit RDMA operations".

In my view, if this is a line we cannot tread (or consider approaching), we
are forced into a cramped subset of the possible RDMA operations.
RPC-over-RDMA should be free to use the most efficient operation to do what
it needs to do, including to implement DDP.  Using explicit RDMA operations
is one possible implementation but it isn't a goal.

> Because I'm not taking "the last of these approaches" (see
> above),

This third approach was:

Restructure the document to distinguish material required by rfc5666bis and
applicable to all transport versions from explanatory/illustrative
material tied to Version One.

I'm not clear why you object to this and what you propose to do instead.
You are very clear why you don't want to adopt either of the first two
approaches listed above.

> at least for now, I will probably not merge many of your proposed
> changes in this area, but will use the body of your remarks as a guide to
> rework the text to be agnostic to data placement mechanism.

OK, but your point above is that this might result in a document that is
unclear to implementers, so I think it will need to be supplemented by
material which is not transport-version-agnostic.  If you have such
material, it needs to be clearly labeled as transport-version-specific.

On Wed, Dec 21, 2016 at 11:50 AM, Chuck Lever <chuck.lever@oracle.com>
wrote:

> Hi Dave-
>
> Thanks for the review. Responses to your General Comments are below.
>
>
> > On Dec 19, 2016, at 5:56 PM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > Review Structure
> >
> > This is the first part of a multi-part review.  It is split into
> > multiple emails to avoid running into mailing list size limits.
> >
> > This part consists of:
> >       • General Comments
> >       • Comments by Section (through Section 3)
> > Later emails will contain the rest of the per-section comments.
> >
> > General Comments
> >
> > Overall Evaluation
> >
> > I think this document is fairly far along in meeting the original
> > requirements, to replace RFC5667 and supplement it with a reasonable
> > treatment of NFSv4, not really dealt with in RFC5667.
> >
> > However, in a number of respects, the current text has issues that need
> > to be addressed before going forward and I've made some suggestions as to
> > how those issues could be addressed.  These issues are:
> >       • Adaptation to the possibility of multiple RPC-over-RDMA
> > versions, whether that is done as outlined in
> > draft-cel-nfsv4-rpcrdma-version-two or not.  This issue is discussed in
> > Handling of RPC-over-RDMA Transport Versions.
> >       • The need to address some fundamental issues regarding the
> > ability to bound reply size.  I have suggested a way to do that in the
> > proposed new section 2.x.  Difficulties in Reply Size Estimation
>
> I predict some resistance to section titles in a specification that include
> the word "Difficulties".
>
> The point has been made, however, that without proper Reply Size
> Estimation, successful interoperation is at risk. This is why we now call
> out Reply Size Estimation in RFC 5666bis as a necessary part of an Upper
> Layer Binding.
>
> I agree that there are plenty of details that need to be sorted here. Your
> comments in this area are most helpful, and I plan to integrate many of
> them into this document. I hope we can resolve most RSE issues so that the
> use of the word "Difficulties" will be rare or altogether unnecessary.
>
>
> > Handling of RPC-over-RDMA Transport Versions
> >
> > Even though the I-D for Version Two has not yet been adopted as a
> > working group document and there may be some dispute as to the proper
> > direction such a new version might take, it is now clear that further
> > development of RPC-over-RDMA is likely to take place.  In light of the
> > fact that we can no longer assume Version One will be the only version,
> > we have a number of choices:
> >       • Make the current document the ULB for NFS applying to
> > RPC-over-RDMA Version One only or to Version One plus a very small
> > Version Two.  Note that later changes to an extensible transport version
> > might make a lot of the current statements about how DDP will be done
> > inappropriate.  For example, a version two based on
> > draft-cel-nfsv4-rpcrdma-version-two might not require much change, but
> > the later addition of features for message continuation and send-based
> > DDP, as set out in draft-noveck-nfsv4-rpcrma-rtrext, would essentially
> > force a rewriting of the document, even if those features were OPTIONAL.
> >       • Cut back the scope of the current document to eliminate most of
> > the explanatory material tied to Version One of the transport and leave
> > it only containing the material that rfc5666bis requires for ULBs.
> >       • Restructure the document to distinguish material required by
> > rfc5666bis and applicable to all transport versions from
> > explanatory/illustrative material tied to Version One.
> > In my revision suggestions, I have adopted the last of these approaches.
>
> I prefer to keep the NFS ULB document as agnostic to transport protocol
> version as possible. That would increase its useful lifespan, rather
> than requiring a fresh ULB later to handle subsequent versions of
> RPC-over-RDMA.
>
> I assume you object to the use of terminology that mentions a particular
> RDMA mechanism (eg. RDMA Read, or Read chunk), since that is incompatible
> with potential extensions to RPC-over-RDMA which might use some other
> mechanism (eg. structured Send buffers).
>
> On considering replacing that language with abstract terminology such as
> "DDP-eligible", that could make the document more difficult for
> implementers to interpret. It would certainly bear little resemblance to
> the original RFC 5667.
>
> Likely all of Section 2 would need to be rewritten or removed. I believe
> removal of this section would be a disservice to implementers.
>
> IMO it's treading a fine line to consider non-RDMA Direct Data Placement
> mechanisms in a transport protocol called "RPC-over-RDMA".
>
> Be that as it may, I will see what can be done to address your concerns in
> this document. Because I'm not taking "the last of these approaches" (see
> above), at least for now, I will probably not merge many of your proposed
> changes in this area, but will use the body of your remarks as a guide to
> rework the text to be agnostic to data placement mechanism.
>
>
> --
> Chuck Lever
>
>
>
>