Re: [nfsv4] preliminary review of draft-cel-nfsv4-reminv-design
David Noveck <davenoveck@gmail.com> Thu, 04 August 2016 20:49 UTC
In-Reply-To: <7E1F0B4A-EE1D-4686-BC9B-358AF8FCF095@oracle.com>
References: <CADaq8jdor+Ju+F=ZBcV6zY3PWJerpM_sDtuTPy9EZTo6hymFPQ@mail.gmail.com> <7E1F0B4A-EE1D-4686-BC9B-358AF8FCF095@oracle.com>
From: David Noveck <davenoveck@gmail.com>
Date: Thu, 04 Aug 2016 16:49:40 -0400
Message-ID: <CADaq8jecc4dqgcgOZ7yH+n-0tMGZ7XmamM_us7fjAn4+H=C=pQ@mail.gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Content-Type: multipart/alternative; boundary="001a113d069cc5b86c05394516a0"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/M_pg6FRFHK9IkJX8XeehL330vbM>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] preliminary review of draft-cel-nfsv4-reminv-design
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
> In this case, however, I think the benefits outweigh the costs
> of altering the base XDR.

Whether this is the case depends on the proportion of implementations that are adequately supported by a simpler approach that does not provide the extended functionality of allowing per-request selection of invalidation approach. If 80% or 90% or 95% of implementations are OK with the simpler implementation, the balance of costs and benefits is different than it would be if 30%, 50%, or 60% need the more flexible approach. I think we need to hear from the rest of the working group about what they think is important here.

> A client can't use RI at
> all if it can't tolerate an arbitrary choice of which handle
> in an RPC is invalidated remotely.

The question is how common such clients are.

> In Linux, it's simply a matter of adding an "if vers == 2,
> insert (or expect) two u32 fields here."

In lots of places.

> With an rpcgen-based
> implementation, the change is also fairly trivial (in fact most
> of it is handled by machine-generated code).

Essentially you have two implementations rather than one extensible one.

> Speaking as an implementer, having to support two new RDMA
> message types (in addition to RDMA_OPT_INIT_XCHAR) is much more
> effort than having to deal with one or two extra fields in
> rpcrdma2_chunk_lists.

That would be the case if every implementer had to support those OPTIONAL message types. If only a few would, the situation would be different. Those who did not need the more extensive support would not have to deal with any XDR change at all.

> Thus to support Remote Invalidation
> in full, an implementation would need to support RDMA_MSG,
> RDMA_NOMSG, RDMA_OPT_INIT_XCHAR, RDMA_MESSAGEX, and
> RDMA_NOMSGX.

So this all boils down to the question of how many implementations want or need to support this extended form of remote invalidation.
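For illustration, the version-conditional approach quoted above ("if vers == 2, insert (or expect) two u32 fields here") might look roughly like the following C sketch. It is not code from Linux or from either draft. The first four header words follow the RPC-over-RDMA Version One fixed header, but the names `direction` and `inv_handle`, and their placement directly after the procedure word, are assumptions made only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative header. The last two members are hypothetical stand-ins for
 * the "two u32 fields" under discussion; their position is an assumption. */
struct hdr {
    uint32_t xid;
    uint32_t vers;
    uint32_t credit;
    uint32_t proc;
    uint32_t direction;   /* hypothetical Version Two field */
    uint32_t inv_handle;  /* hypothetical Version Two field */
};

/* Store one 32-bit value in XDR (big-endian) byte order. */
static uint8_t *put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
    return p + 4;
}

/* Encode the fixed part of the header into buf (at least 24 bytes).
 * Returns the number of bytes written. */
static size_t encode_fixed_hdr(uint8_t *buf, const struct hdr *h)
{
    uint8_t *p = buf;

    p = put_u32(p, h->xid);
    p = put_u32(p, h->vers);
    p = put_u32(p, h->credit);
    p = put_u32(p, h->proc);

    /* The version check in question: one of these is needed on every
     * encode and decode path that touches the header. */
    if (h->vers == 2) {
        p = put_u32(p, h->direction);
        p = put_u32(p, h->inv_handle);
    }
    return (size_t)(p - buf);
}
```

With this layout a Version One header encodes to 16 bytes and a Version Two header to 24, so the two are not interchangeable on the wire. A matching branch would also be needed in the decoder, which is the sense of "in lots of places".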
On Thu, Aug 4, 2016 at 12:35 PM, Chuck Lever <chuck.lever@oracle.com> wrote:

> Hi Dave-
>
> Thanks for your review comments.
>
>
> > On Aug 4, 2016, at 10:48 AM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > The first issue concerns the structure rpcrdma2_chunk_lists. Although
> > this structure is defined in draft-cel-nfsv4-rpcrdma-version-two-01,
> > there is nothing that uses it. The structure that RDMA_MSG and
> > RDMA_NOMSG include is rpcrdma2_chunks, which is not defined in the
> > XDR. It appears that this issue has existed since
> > draft-cel-nfsv4-rpcrdma-version-two-00.
> >
> > Although that issue can and should be fixed, I think we need to have
> > a more complete discussion of the extension model for version two and
> > for RPC-over-RDMA as a whole in the near term.
> >
> > As I understand things, the original intention was that Version Two
> > be a compatible extension of Version One. Now, with the inclusion of
> > the direction indication added in -01, and now the R-key proposed in
> > reminv-design, Version One and Version Two messages will be
> > OTW-incompatible, and I think we are better off with a model in which
> > Version Two consists of a subset which is equal to Version One and a
> > set of additions, primarily optional. I think we should diverge from
> > that model only if the benefits are sufficiently important to justify
> > this discontinuity.
>
> Yes, the original intention was to be as compatible as possible,
> and any XDR divergence with Version One would not be undertaken
> lightly. That preference is why we have elected to defer changes
> like this one to extensions when it makes sense.
>
> However, with a version number bump, we are no longer entirely
> shackled by existing implementations or XDR definitions. It's
> very easy to take that too far, of course, which is why the
> default choice is to extend rather than alter the base XDR.
>
> In this case, however, I think the benefits outweigh the costs
> of altering the base XDR.
>
> > With these changes, the implementation barrier to convert a Version
> > One implementation to be compatible with Version Two becomes
> > significant. I think we would be better off if most Version One
> > implementations could be made compatible with it very easily, making
> > the decision to do so more or less automatic.
>
> In Linux, it's simply a matter of adding an "if vers == 2,
> insert (or expect) two u32 fields here." With an rpcgen-based
> implementation, the change is also fairly trivial (in fact most
> of it is handled by machine-generated code).
>
> Speaking as an implementer, having to support two new RDMA
> message types (in addition to RDMA_OPT_INIT_XCHAR) is much more
> effort than having to deal with one or two extra fields in
> rpcrdma2_chunk_lists.
>
> > To get back to remote invalidation, I would prefer your section 3.3
> > as a baseline to be made accessible with no XDR changes in the base
> > Version Two. The additional functionality provided by 3.4, while
> > desirable, is not of sufficient benefit to justify a non-compatible
> > XDR change. I feel that this functionality should be available as an
> > OPTIONAL extension.
>
> The benefit of the new field can be described this way:
>
> RPC-over-RDMA allows multiple handles per RPC.
>
> The burden of selecting a handle to invalidate remotely should
> be the client's. I believe one existing client implementation
> does mix persistently registered handles with dynamically
> registered handles in the same RPC. A client can't use RI at
> all if it can't tolerate an arbitrary choice of which handle
> in an RPC is invalidated remotely.
>
> The "big switch" approach is simply not generic enough when
> multiple handles are in play and we don't have control over
> client implementation choices. It would exclude a portion of
> implementations, limiting the appeal of RPC-over-RDMA Version
> Two.
>
> In the case of SMB Direct, all implementers decided that they
> would go with all FRWR registration. A big switch works in
> that scenario. In fact, the switch is always "on".
>
> Providing rdma_inv_handle in each RPC-over-RDMA header goes
> along with the design of having lists of segments, each with
> their own handle.
>
> Should the protocol be designed to discourage implementations
> that need to communicate handles on a per-RPC basis in favor
> of ones that can work with just an exchange of a transport
> characteristic?
>
> > When I say that it should be an "OPTIONAL Extension", I don't mean to
> > imply:
> >
> > • That it needs to be implemented as a subcase of RDMA_OPTIONAL.
> > • That it should be documented in a separate document, as opposed
> >   to being documented (eventually) in
> >   draft-ietf-nfsv4-rpcrdma-version-two.
> >
> > What I do mean is that we should define new message types for
> > extensions (so that we maintain Version One as a subset of Version
> > Two) and that we should (not "SHOULD" :-) make these new message
> > types "OPTIONAL" (in the RFC2119 sense). I can see, if there is
> > sufficient reason, making an extension REQUIRED, but I don't see a
> > reason to change an existing message type in an incompatible way.
> >
> > One way to do this is to define new message types RDMA_MESSAGEX and
> > RDMA_NOMSGX which include direction and rdma_handle, but there are
> > other ways to do this. To make it easier to determine whether support
> > for OPTIONAL message types is present, we could define a transport
> > characteristic/attribute that provides a bit mask of supported
> > message types.
>
> (An xchar that carries a bitmap of supported message types seems
> appropriate in the initial set of supported characteristics.)
>
> IIRC, earlier we had decided that message types RDMA_MSG and
> RDMA_NOMSG would be REQUIRED. Thus to support Remote Invalidation
> in full, an implementation would need to support RDMA_MSG,
> RDMA_NOMSG, RDMA_OPT_INIT_XCHAR, RDMA_MESSAGEX, and RDMA_NOMSGX.
>
> But now we are talking about significant XDR changes, and a
> significant implementation effort.
>
> --
> Chuck Lever
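The idea raised in the quoted exchange, a transport characteristic carrying a bit mask of supported message types, could be sketched in C along the following lines. This is only an illustration under stated assumptions: the values for RDMA_MSG and RDMA_NOMSG are the Version One values, while the `HYP_`-prefixed constants and their numeric values are invented for this example and are not taken from any draft.

```c
#include <stdbool.h>
#include <stdint.h>

/* RDMA_MSG and RDMA_NOMSG use their RPC-over-RDMA Version One values.
 * The HYP_ entries are hypothetical numbers chosen for this sketch. */
enum msg_type {
    RDMA_MSG          = 0,
    RDMA_NOMSG        = 1,
    HYP_RDMA_MESSAGEX = 6,  /* assumed value */
    HYP_RDMA_NOMSGX   = 7   /* assumed value */
};

/* One bit per message type in the advertised mask. */
static inline uint32_t msg_bit(enum msg_type t)
{
    return UINT32_C(1) << t;
}

/* True if the peer's advertised mask includes message type t. */
static bool peer_supports(uint32_t peer_mask, enum msg_type t)
{
    return (peer_mask & msg_bit(t)) != 0;
}

/* Use the extended message type only when the peer advertised it;
 * otherwise fall back to the REQUIRED base type. */
static enum msg_type choose_msg_type(uint32_t peer_mask)
{
    return peer_supports(peer_mask, HYP_RDMA_MESSAGEX)
               ? HYP_RDMA_MESSAGEX
               : RDMA_MSG;
}
```

Under this scheme an implementation that never sets the extended bits needs no XDR change at all, while one that wants per-RPC handle selection tests the mask once per connection and picks its message type accordingly.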