Re: [nfsv4] rfc5667bis open issues
David Noveck <davenoveck@gmail.com> Tue, 27 September 2016 02:39 UTC
In-Reply-To: <11DB3812-B605-4426-A316-176CE31910B2@oracle.com>
References: <15F62327-B73F-45CF-B4A5-8535955E954F@oracle.com> <65E80EDE-6031-4A83-9B73-3A88C91F8E6A@oracle.com> <CADaq8jc50Ca6eDZ3D6zRvfG+Q2DngNN6+mN9WKXj9AS=d1iQVg@mail.gmail.com> <D0ECCDF7-F785-4419-AA93-33B2054C4737@oracle.com> <CADaq8jcSxc6BQKJ1SZ=OrpRcEGpgpfdLDcPpBp=GfGQJwkbLEw@mail.gmail.com> <11DB3812-B605-4426-A316-176CE31910B2@oracle.com>
From: David Noveck <davenoveck@gmail.com>
Date: Mon, 26 Sep 2016 22:39:01 -0400
Message-ID: <CADaq8jfVDVWcqu2tHBG7dvFDHRo7HGqUANthP4hQp9UwiyZBVw@mail.gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Content-Type: multipart/alternative; boundary="001a113c37babb1297053d7425b0"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/7PBseg4jd_gukCOU26aV40CUJaI>
Cc: NFSv4 <nfsv4@ietf.org>
Subject: Re: [nfsv4] rfc5667bis open issues
> > As far as the session case, the server will consider the request
> > executed but the client does not have a reply containing the slot
> > and sequence. To deal with that case, he would have to use rdma_xid
> > in the message with the ERR_CHUNK to get the needed context and so
> > conclude that the slot was available for reuse.
>
> This feels like something that belongs in 5667bis. I'm no real
> expert on session behavior. Can you sketch some text that can
> be added?

I'll write it assuming it will constitute a new subsection 4.x dealing
with session-related issues.

On Mon, Sep 26, 2016 at 1:20 PM, Chuck Lever <chuck.lever@oracle.com>
wrote:

> > On Sep 24, 2016, at 8:11 AM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > > The issue is that the language allows ERR_CHUNK to be
> > > returned after a server has processed the RPC request when
> > > a client has not provided adequate Write list or Reply chunk
> > > resources to convey the reply.
> >
> > I forgot that we made ERR_CHUNK ambiguous, apparently because of
> > reluctance to add things to the XDR. There is no pressing need to
> > do this since the responder is aware of the difference and his
> > management of the DRC can follow based on his knowledge.
> >
> > However, in the context of Version Two it would be better if we
> > avoided those ambiguities, given that we have lots of space for
> > distinct error codes.
> >
> > > In that case, it makes sense for the server to have added
> > > the request to its DRC.
> >
> > Agree that in the case in which the server has executed the
> > request, it should add the request to the DRC. In practical terms,
> > there are not likely to be cases in which there is a non-idempotent
> > request with a reply longer than 1K.
>
> For NFSv3, that is largely true.
>
> For NFSv4, I believe a large reply to a non-idempotent request is
> possible, and may even be common. Any time a client does something
> like this:
>
>    { SEQUENCE, PUTFH, SETATTR, GETATTR }
>
> Where the GETATTR requests an ACL or security label, is problematic
> if the client does not estimate the reply buffer size correctly.
>
> However, this case is for when the client has a bug; it's not a
> case where we expect one or the other side to perform heroic
> recovery. ERR_CHUNK would terminate the RPC on the client, which
> would very likely return EIO to the application. I think that's
> about the sanest outcome we can expect.
>
> > As far as the session case, the server will consider the request
> > executed but the client does not have a reply containing the slot
> > and sequence. To deal with that case, he would have to use rdma_xid
> > in the message with the ERR_CHUNK to get the needed context and so
> > conclude that the slot was available for reuse.
>
> This feels like something that belongs in 5667bis. I'm no real
> expert on session behavior. Can you sketch some text that can be
> added?
>
> > > What I propose is that if the first READ_PLUS returns
> > > NFS4_CONTENT_HOLE, the server would return an empty first
> > > Write chunk. Then the second READ_PLUS result always
> > > lines up with the second Write chunk, which IMO is much better
> > > for clients.
> >
> > I'm OK with this but I think you will need to adjust the text to
> > reflect the fact that READ_PLUS can return an array of
> > read_plus_content's, although, in practice, those that return more
> > than one are extremely rare.
>
> I hadn't realized READ_PLUS returned an array.
>
> If an NFS server is allowed to structure its reply in a way that
> the client cannot predict, then I think we'll have to limit the
> way READ_PLUS uses DDP. I propose these rules:
>
> - The client can provide no more than one Write chunk if it expects
>   NFS4_CONTENT_DATA. (No Write chunk or an empty Write chunk, following
>   the previous rules, would be for when the client predicts that the
>   reply can go inline).
>
> - If that Write chunk is non-empty, it MUST be large enough to
>   receive all expected payload bytes in a single NFS4_CONTENT_DATA
>   element.
>
> - The server uses that Write chunk for the first array element that
>   has an NFS4_CONTENT_DATA arm.
>
> Then we have a choice, depending on whether it is more desirable
> to return data in a single round-trip, or more desirable to preserve
> holes. Either:
>
> - If the server finds that the array has grown larger than can be
>   returned inline or via the supplied Reply chunk, it MUST return
>   the payload data in a single NFS4_CONTENT_DATA element via the
>   provided Write chunk.
>
> Or:
>
> - The server MUST return as much payload as it can fit within the
>   resources provided by the client, and return it as a short READ
>   result. The client is responsible for retrying the READ_PLUS to
>   read the remaining payload.
>
> Somehow we have to deal with the case where the server cannot fit
> any of the payload in the client-provided resources.
>
> READ_PLUS is actually a poor fit for offloaded DDP anyway. The whole
> point of offload is that the client has to do no work; the payload
> arrives in its memory without any effort on its part.
>
> I would just as soon require that, on RDMA transports, READ_PLUS
> returns only a hole or exactly one contiguous piece of content.
>
> --
> Chuck Lever
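
The three single-Write-chunk rules quoted above could be sketched
roughly as follows. This is only an illustration of the proposal as
worded in this thread, not text from 5667bis: the names Segment and
place_segments are hypothetical, and raising an error for an undersized
chunk is a placeholder, since the thread leaves open what a server
should do when the payload does not fit.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Content(Enum):
    # Names follow the thread; the numeric values are illustrative only.
    NFS4_CONTENT_DATA = 0
    NFS4_CONTENT_HOLE = 1


@dataclass
class Segment:
    """One element of a READ_PLUS result array (hypothetical type)."""
    kind: Content
    length: int  # payload bytes for DATA, hole length for HOLE


def place_segments(segments: List[Segment],
                   write_chunk_len: Optional[int]) -> List[str]:
    """Decide where each result element is conveyed.

    write_chunk_len is None when the client supplied no Write chunk and
    0 when it supplied an empty one; in both cases everything is
    treated as inline. Otherwise the single Write chunk is used for the
    first NFS4_CONTENT_DATA element only.
    """
    placements = []
    chunk_available = bool(write_chunk_len)  # None or 0 means "no chunk"
    for seg in segments:
        if chunk_available and seg.kind is Content.NFS4_CONTENT_DATA:
            if seg.length > write_chunk_len:
                # Placeholder: the thread does not settle this case.
                raise ValueError("Write chunk too small for payload")
            placements.append("write-chunk")
            chunk_available = False  # at most one element uses the chunk
        else:
            placements.append("inline")
    return placements
```

For a result array of a hole, a 4096-byte data element, and a 100-byte
data element with an 8192-byte Write chunk, only the first data element
is placed in the chunk; the hole and the second data element stay
inline.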