[nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-03 (part two of three)

David Noveck <davenoveck@gmail.com> Mon, 19 December 2016 22:57 UTC

Return-Path: <davenoveck@gmail.com>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7E6BD1295B8 for <nfsv4@ietfa.amsl.com>; Mon, 19 Dec 2016 14:57:14 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.999
X-Spam-Level:
X-Spam-Status: No, score=-1.999 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mBgHetyAmENR for <nfsv4@ietfa.amsl.com>; Mon, 19 Dec 2016 14:57:12 -0800 (PST)
Received: from mail-oi0-x229.google.com (mail-oi0-x229.google.com [IPv6:2607:f8b0:4003:c06::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id B01771293F5 for <nfsv4@ietf.org>; Mon, 19 Dec 2016 14:57:12 -0800 (PST)
Received: by mail-oi0-x229.google.com with SMTP id v84so160367514oie.3 for <nfsv4@ietf.org>; Mon, 19 Dec 2016 14:57:12 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to:cc; bh=9Lz/vwPqZJGZUh0ldSUnggRIDINECIMRyvv5GZ2mNFk=; b=rpNJQDUZFaR06LidPyYUsYpsBb7i8iCOtrHMJjivc7iwdjHO5em+knBIJ9zaByVwPS LdQwIVZJ0Yk+UssFdaWyRj9/AOttNfygFhmn2yr9Qlc7ge2jRZgWxS0vbtg72N707QA2 2+XcpoUznV9XrOf1zKCz7FEs/gsE+S1nUaWqZ+DSpwweAlT4Q+szXQO/njJO2U8/8M2P siopz/BGGcl/oNbw7zqAcneleO4OGcX1BHv7I4YtIK4uiPX9lpbVPUuD04YjqpkccQQW Lj0Qi075J9ahgVUKlFuheLD3jh29hMH0tj8i/IHT66DEn8D+LTvA01F2O1DlxcGgS5pz rNkQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to:cc; bh=9Lz/vwPqZJGZUh0ldSUnggRIDINECIMRyvv5GZ2mNFk=; b=rUb9hwHdGk3Mvvu0s3TL+qAhK90pmsnTJz27GJS7xMfYQSMNoUn/VBGifIfyEJPMgo BBAA2nL7UR5cuTHsLC/xBIhZhQhpvXWFNlNIKNg/tUXsdD/8xesb37s+gVIkgDsnjy71 tyuRDXiwNXvCyF2lR73fi9GwGL7BsauUD+pENLYHOA5H6VLCTJDhZGwRhZxgB25eE+YE muKoXmgevpG7RvKjnLa1KySLNgf1bbR401oiFc6D9Mv6ZxCFQKeVKLwbPQBvo8jc1IKB JkU9oggdXdNndbaSNtsX+PHb/kKyGmJZyRR0w3Fj/fWWMtjxJwC9TDBAiYsomy5R5YUc J/VQ==
X-Gm-Message-State: AIkVDXKpROUWzOTk+y9gDfYWYV35joawWtICiufeKcmGTsK5kb8BUug8P7hTh35Wo2S+b4t9sADTuoPfHE9AOg==
X-Received: by 10.202.212.200 with SMTP id l191mr10015033oig.153.1482188231909; Mon, 19 Dec 2016 14:57:11 -0800 (PST)
MIME-Version: 1.0
Received: by 10.182.137.202 with HTTP; Mon, 19 Dec 2016 14:57:11 -0800 (PST)
From: David Noveck <davenoveck@gmail.com>
Date: Mon, 19 Dec 2016 17:57:11 -0500
Message-ID: <CADaq8jetsJ4mx9hPp0xy-zFdhmZ2z_uSAmy2Ap1LgkTBvO536g@mail.gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Content-Type: multipart/alternative; boundary="001a113accac0da9ca05440ad780"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/BjpkLZvwDwj_-rRSxiqMS3voLCs>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: [nfsv4] Review of draft-ietf-nfsv4-rfc5667bis-03 (part two of three)
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nfsv4/>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 19 Dec 2016 22:57:14 -0000

*Review Structure*

This is the second part of a multi-part review.  It is split into multiple
emails to avoid running into mailing list size limits.

This part consists has per-section comments for sections 3.1 through 4.1.1.

Note that all general comments are in the first section

Other emails contain the rest of the per-section comments.

*Comments by Section (from Section 3.1 through Section 4.1.1)*

*3.1.  Auxiliary Protocols*

I anticipate that this will become section 3.3

In the second sentence of the first paragraph, suggest deleting the word "main".

Suggest replacing the last sentence of the first paragraph by the following:

NFSACL is treated here as a de facto standard as there are several
interoperating implementations.

Suggest replacing the last sentence of the second paragraph by the following:

In order to allow use of each of these on an RDMA transport, an Upper
Layer Binding must be provided.

The third paragraph is not clear. While the first sentence is true, the
fact that something is typically not done does not imply that it should not
be done or that the issues that it raises nee not be addressed. The
remaining sentences seem to summarize the reasons why providing the ULB is
quite easy to o without actually providing one. I think you need to more
clearly explain why a ULB is not provided. A possible replacement paragraph
might be:

In the past, MOUNT, NLM, and NSM have been conveyed using TCP rather than
RPC-over-RDMA transport. Because the volume of messages is low and because
there is no opportunity to use Direct Data placement, there appears to be
little need for RPC-over-RDMA support for these protocols and ULBs for them
will not be provided in this document. Given that these protocol do not
appear to have any data items that could be made DDP-eligible,it shoul be
easy to provide the necessary ULB in an additional document, when
necessary. The size of RPC messages for these protocols is uniformly small
and he maximum size of replies cold be easily determined by examining the
XDR definitions of these protocols.

Suggest replacing the fourth and fifth paragraphs by the following:

Implementations that support the NFSACL protocol typically send NFSACL
procedures on the same connection as the NFS protoco.l.  In order to
allow NFS implementations that include NFSACL support to use
RPC-over-RDMA, a ULB for this protocol will be provided in this
document.

No data item in this protocol is DDP-eligible.  As, there is no
protocol-specified size limit for NFS version 3 ACL objects., the
approach described in Section 2.x is to be followed to deal with
replies that can contain ACL objects. For more detail on the specifics
as they apply to NFSACL, see the next section.

*3.1.1. Reply Size Estimation for NFSACL*

Suggesting the following new subsection:


For procedures whose responses do not include an ACL object, the reply size
is determinable from the XDR definition and replies can be expected to fit
within the inline buffer size.

For procedures whose responses do include an ACL object, a reply size limit
of at least 4K SHOULD be initially used. When the inline buffer size is
smaller than this size a reply chunk MUST be provided.

When a transport error indicates that this reply size was inadequate, the
operation MUST be retried with a larger reply chunk. This larger reply
chunk SHOULD be at least twice the size previously used.

If a retry is needed, the requester SHOULD issue future request for
procedures whose replies to include an ACL object, providing a reply chunk
at least as large as the largest one already used.

*4.  NFS Version 4 Upper Layer Binding*

Suggest replacing title by "Upper Layer Binding for NFS Version 4
Minor Versions".

I'm having trouble with the treatment of the callback program as if it
were a separate protocol, as it is not described as such by RFCs 7530,
7531, 5661, 5662, 7872, 7863.  I know rfc566bis has its own
idiosyncratic treatment of this issue and I'm not proposing a change
in that.  Nevertheless, this document need to be understood by those
not steeped in rfc566bis terminology and I think it is best if this
document follows common practice.  Also , if the callback program were
a separate protocol, it would need its own ULB, which it doesn't have.
I suggest replacing the second sentence of the first paragraph by the
following.

It also applies to the RPC messages used to send callbacks as part of
the callback programs defined  as part of each of these minor
versions, as described in the same documents.

*4.1.  DDP-Eligibility*

I'm suggesting that Section 4.1 be re-organized, for the same reasons
and along the same lines as suggested above in *3.  NFS Versions 2 And
3 Upper Layer Binding*.  As part of that reorganization;


   - Section 4.1, would be limited to issues related to
DDP-eligibility, and would be entitled *DDP-Eligibility of Forward
Direction RPC Requests*.
   - A new section 4.2, would get most of the remaining material in
the current section 3.1 and would be entitled *Handling
Forward-direction RPC Requests in RPC-over-RDMA Version One.*

I suggest that the new section 4.1 be written along the following lines:

The only RPC procedure containing DDP-eligible items is the COMPOUND
procedure.  The NULL proceure does not contain any DDP-eligible items.

For COMPOUND requests, the followng arguments are DDP-elgible:


   - The file dara argument of the WRITE operation.
   - The pathname argument of the CREATE operation, when the create is
for an object of type NF4LNK.

For COMPOUND replies, the result items listed below are DDP-eligible.
While each operation response is embedded within a switch, some of
whose arms contain DDP-eligible items, it is not necessary to treat
each COMPOUND response as potentially containing DDp-eligible items.
esponse should only be treated as potentially containing DDP-eligible
items if the response for the operations actually peformed in a
particular COMPOUND request includes DDP-eligible items.


   - The opaque file data returned by a READ operation.
   - The opaque file data with in NFS4_CONENT_DATA case of the reply
to a READ_PLUS operation.  Note that (see Section 4.2.1 [numbering
subject to change]) only the first such item for a given READ_PLUS
operation is considre DDP-eligible.
   - The pathname returned by a READLINK operation.

I suggest that the current third paragraph be eleted. In part this because
it repeats stuff more approprite to rfc5666bis but the main reason to elete
this is that it contradicts rfc5666bis. rfc566bis iscusses how the
responder is to deal with DDP-elgibility violations while this tells
requester that "MUSTNOT" commit them. This would give the responder license
to not follow rfc5666bis. (Note that Section 3 tells the responder to
ignore them which also contrdicts rfc5666bis.

The second sentence of the current fourth paragraph has a number of issues:

   - You can have DDP-eligible item in both request and response and have
   neither a long call or a long reply, if the removal of the DDP_eligible ata
   is sufficient to make the call/reply small enough.
   - When the reply is too long, it results in a reply chunk rather than an
   element in the write list.

Suggest rewriting this sentence as follows:

An NFS version 4 client MAY provide a Read list and a Write list in
the same transaction if it is performing operations which have
DDP-eligible arguments as well as those which have DDP-eligible
results.  This situation can also arise when a long Call occurs
together with operations which have DDP-eligible results.  Reply
chunks can also appear together a Read list on requests that perform
operations which have DDP-eligible arguments.

In order to accommodate the approach to reply estimation outlined in *2.x.
Difficulties in Reply Size Estimation*, I am suggesting that the last
paragraph be rewritten as follows:

Such error replies are permanent errors an should not be retried
without addressing the any underlying resource issue.  Except where
the requests are idempotent and can be used to allocate an adequately
size reply chunk (see Section 2.x), the error constitutes completion
of the RPC transaction and a valid server response.  The responder
SHOULD NOT drop the transport connection in this case.

*4.1.1.  Session-Related Considerations*

I suggest that this section be numbered 4.3 or 4.2.1 (more likely).
It does not relate to DDP-eligibility.

In the first sentence of the current (unbulleted) third paragraph,
suggest replacing "which arise only due to incorrect implementations"
by "which arise only due to incorrect implementations or an
unavoidable mis-estimation of the maximum reply size"

Suggest replacing the first sentence of the current (unbulleted)
fourth paragraph by the following:

In both instances, the client should not retry the operation without
addressing any possible resource inadequacy (e.g no reply chunk or one
of insufficient size).

In the second sentence of the current (unbulleted) fourth paragraph,
suggest replacing "A" by "Such a"