Re: [nfsv4] preliminary review of draft-cel-nfsv4-reminv-design
karen deitke <karen.deitke@oracle.com> Thu, 11 August 2016 15:10 UTC
Return-Path: <karen.deitke@oracle.com>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 833C112D778 for <nfsv4@ietfa.amsl.com>; Thu, 11 Aug 2016 08:10:43 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -5.467
X-Spam-Level:
X-Spam-Status: No, score=-5.467 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-1.247, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id a-lfYGwU5pFH for <nfsv4@ietfa.amsl.com>; Thu, 11 Aug 2016 08:10:39 -0700 (PDT)
Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 587CF12D750 for <nfsv4@ietf.org>; Thu, 11 Aug 2016 08:10:39 -0700 (PDT)
Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id u7BFAbi5026476 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Aug 2016 15:10:37 GMT
Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id u7BFAaNp014884 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Aug 2016 15:10:37 GMT
Received: from abhmp0012.oracle.com (abhmp0012.oracle.com [141.146.116.18]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id u7BFAa8T018573; Thu, 11 Aug 2016 15:10:36 GMT
Received: from [10.159.93.194] (/10.159.93.194) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 11 Aug 2016 08:10:35 -0700
To: David Noveck <davenoveck@gmail.com>, nfsv4@ietf.org
References: <CADaq8jdor+Ju+F=ZBcV6zY3PWJerpM_sDtuTPy9EZTo6hymFPQ@mail.gmail.com> <7E1F0B4A-EE1D-4686-BC9B-358AF8FCF095@oracle.com> <CADaq8jecc4dqgcgOZ7yH+n-0tMGZ7XmamM_us7fjAn4+H=C=pQ@mail.gmail.com> <57AB7CB3.1020901@oracle.com> <CADaq8je23qLBbQ-SSrmwOrDv_CSA_hJ7zymLBsKw2ZV4uvLQ+g@mail.gmail.com>
From: karen deitke <karen.deitke@oracle.com>
Organization: Oracle Corporation
Message-ID: <57AC9570.9000905@oracle.com>
Date: Thu, 11 Aug 2016 09:10:40 -0600
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <CADaq8je23qLBbQ-SSrmwOrDv_CSA_hJ7zymLBsKw2ZV4uvLQ+g@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------070506010909050307080602"
X-Source-IP: aserv0022.oracle.com [141.146.126.234]
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/XNWI57TLGwcUZrxrrJipAND0Bb4>
Subject: Re: [nfsv4] preliminary review of draft-cel-nfsv4-reminv-design
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nfsv4/>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Aug 2016 15:10:43 -0000
Thanks. The ietf email was dropped, adding it back so others can also reply. Karen On 8/11/2016 4:44 AM, David Noveck wrote: > > This is a need for solaris. > > Msg received but I think we need to close the door at some point on > incompatible XDR changes in Version Two. We already have two. I > would have preferred zero but I can live with two that can be tested > for together, as described in Chuck's I-D. > > If there is anybody else who has a need for such an XDR change in > version two, I think it is time to raise the issue now and get issues > relating to potential XDR changes resolved before we get Version Two > ready to be a working group document, which I hope will be soon. > > On Wed, Aug 10, 2016 at 3:12 PM, karen deitke <karen.deitke@oracle.com > <mailto:karen.deitke@oracle.com>> wrote: > > > > On 8/4/2016 2:49 PM, David Noveck wrote: >> > In this case, however, I think the benefits outweigh the costs >> > of altering the base XDR. >> >> Whether this is the case depends on the proportion of >> implementations that are >> adequately supported by a simpler approach that does not provide >> the extended >> functionality of allowing per-request selection of invalidation >> approach. >> >> If 80% or 90% or 95% of implementations are OK with the simpler >> implementation, >> the balance of costs and benefits are different than they would >> be in 30%, 50%, or >> 60% need the'more flexible approach. I think we need to hear >> from the rest of the >> working group about what they think is important here. >> >> > A client can't use RI at >> > all if it can't tolerate an arbitrary choice of which handle >> > in an RPC is invalidated remotely. >> >> The question is how common such clients are. > Solaris can represent one such client. Currently we do not > implement frwr which is needed for RI, however when we do, we > would not be able to utilize RI if the client can not identify > which chunks should be invalidate. Random invalidation by the > server would be detrimental on solaris as it would potentially > also invalidate other memory buffers sharing the same stag. > Additionally it could invalidate other memory we intend to keep > registered. > >> >> > In Linux, it's simply a matter of adding an "if vers == 2, >> > insert (or expect) two u32 fields here." >> >> In lots of places. >> >> > With an rpcgen-based >> > implementation, the change is also fairly trivial (in fact most >> > of it is handled by machine-generated code). >> >> Essentially you have two implementation rather than one >> extensible one. >> >> > Speaking as an implementer, having to support two new RDMA >> > message types (in addition to RDMA_OPT_INIT_XCHAR) is much more >> > effort than having to deal with one or two extra fields in >> > rpcrdma2_chunk_lists.e problems with the simpler approach. >> >> That would be the case if every implementer had to support those >> OPTIONAL >> messages types. If only a few would, the situation would be >> different. >> >> Those who did not need the more extensive support would not have >> to deal with any >> XDR change at all. >> >> > Thus to support Remote Invalidation >> > in full, an implementation would need to support RDMA_MSG, >> > RDMA_NOMSG, RDMA_OPT_INIT_XCHAR, RDMA_MESSAGEX, and >> > RDMA_NOMSGX. >> >> so this all boils down to the question of how many >> implementations want or need to support >> this extended form of remote invalidation. > This is a need for solaris. > > Karen >> >> >> >> >> On Thu, Aug 4, 2016 at 12:35 PM, Chuck Lever >> <chuck.lever@oracle.com <mailto:chuck.lever@oracle.com>> wrote: >> >> Hi Dave- >> >> Thanks for your review comments. >> >> >> > On Aug 4, 2016, at 10:48 AM, David Noveck >> <davenoveck@gmail.com <mailto:davenoveck@gmail.com>> wrote: >> > >> > The first issue concerns the structure >> rpcrdma2_chunk_lists. Although this structure is defined in >> draft-cel-nfsv4-rpcrdma-version-two-01, there is nothing that >> uses it. The structure that RDMA_MSG and RDMA_NOMSG include >> is rpcrdma2_chunks, which is not defined in the XDR. It >> appear that this issue has existed since >> draft-cel-nfsv4-rpcrdma-version-two-00. >> > >> > Although that issue can and should be fixed, I think we >> need to have a more complete discussion of the extension >> model for version two and for RPC-over-RDMA as a whole in the >> near term. >> > >> > As I understand things, the original intention was that >> Version Two be a compatible extension of Version One. Now, >> with the inclusion on the direction indication added in -01, >> and now the R-key proposed in reminv-design, Version One and >> Version Two messages will be OTW-incompatible, and I think >> are better off with a model in which Version Two consists of >> subset which equal to Version One and a set of additions, >> primarily optional. I think we should diverge from that >> model only if the benefits are sufficiently important to >> justify this discontinuity. >> >> Yes, the original intention was to be as compatible as possible, >> and any XDR divergence with Version One would not be undertaken >> lightly. That preference is why we have elected to defer changes >> like this one to extensions when it makes sense. >> >> However, with a version number bump, we are no longer entirely >> shackled by existing implementations or XDR definitions. It's >> very easy to take that too far, of course, which is why the >> default choice is to extend rather than alter the base XDR. >> >> In this case, however, I think the benefits outweigh the costs >> of altering the base XDR. >> >> >> > With these changes, the implementation barrier to convert a >> Version One implementation to be compatible Version Two >> becomes significant. I think we would be better off, if most >> Version One implementations could be made compatible with >> very easily, making the decision to do so more or less automatic. >> >> In Linux, it's simply a matter of adding an "if vers == 2, >> insert (or expect) two u32 fields here." With an rpcgen-based >> implementation, the change is also fairly trivial (in fact most >> of it is handled by machine-generated code). >> >> Speaking as an implementer, having to support two new RDMA >> message types (in addition to RDMA_OPT_INIT_XCHAR) is much more >> effort than having to deal with one or two extra fields in >> rpcrdma2_chunk_lists. >> >> >> > To get back to remote invalidation, I would prefer your >> section 3.3 as a baseline to be made accessible with no XDR >> changes in the base Version Two. The additional >> functionality provided by 3.4, while desirable, is not of >> sufficient benefit to justify a non-compatible XDR change. I >> feel that this functionality should be available as an >> OPTIONAL extension. >> >> The benefit of the new field can be described this way: >> >> RPC-over-RDMA allows multiple handles per RPC. >> >> The burden of selecting a handle to invalidate remotely should >> be the client's. I believe one existing client implementation >> does mix persistently registered handles with dynamically >> registered handles in the same RPC. A client can't use RI at >> all if it can't tolerate an arbitrary choice of which handle >> in an RPC is invalidated remotely. >> >> The "big switch" approach is simply not generic enough when >> multiple handles are in play and we don't have control over >> client implementation choices. It would exclude a portion of >> implementations, limiting the appeal of RPC-over-RDMA Version >> Two. >> >> In the case of SMB Direct, all implementers decided that they >> would go with all FRWR registration. A big switch works in >> that scenario. In fact, the switch is always "on". >> >> Providing rdma_inv_handle in each RPC-over-RDMA header goes >> along with the design of having lists of segments, each with >> their own handle. >> >> Should the protocol be designed to discourage implementations >> that need to communicate handles on a per-RPC basis in favor >> of ones that can work with just an exchange of a transport >> characteristic? >> >> >> > When I say that it should be an "OPTIONAL Extension", I >> don't mean to imply: >> > • That it needs to be implemented as a subcase of >> RDMA_OPTIONAL. >> > • That it should be documented in a separate >> document, as opposed to being documented (eventually) in >> draft-ietf-nfsv4-rpcrdma-version-two. >> > What I do mean is that we should define new message type >> for extensions (so that we maintain Version One as a subset >> of Version Two) and that we should (not "SHOULD" :-) make >> these new message types "OPTIONAL" (in the RFC2119 sense). I >> can see, if there is sufficient reason, making an extension >> REQUIRED, but I don't see a reason to change an existing >> message type in an incompatible way. >> > >> > One way to do this is to define new message types >> RDMA_MESSAGEX and RDMA_NOMSGX which include direction and >> rdma_handle but there are other ways to do this. To make it >> easier to determine whether support for OPTIONAL message >> types is present, we could define a transport >> characteristic/attribute that provides a bit mask of >> supported message types. >> >> (An xchar that carries a bitmap of supported message types seems >> appropriate in the initial set of supported characteristics.) >> >> IIRC, earlier we had decided that message types RDMA_MSG and >> RDMA_NOMSG would be REQUIRED. Thus to support Remote Invalidation >> in full, an implementation would need to support RDMA_MSG, >> RDMA_NOMSG, RDMA_OPT_INIT_XCHAR, RDMA_MESSAGEX, and RDMA_NOMSGX. >> >> But now we are talking about significant XDR changes, and a >> significant implementation effort. >> >> >> -- >> Chuck Lever >> >> >> >> >> >> >> _______________________________________________ >> nfsv4 mailing list >> nfsv4@ietf.org <mailto:nfsv4@ietf.org> >> https://www.ietf.org/mailman/listinfo/nfsv4 >> <https://www.ietf.org/mailman/listinfo/nfsv4> > _______________________________________________ nfsv4 mailing list > nfsv4@ietf.org <mailto:nfsv4@ietf.org> > https://www.ietf.org/mailman/listinfo/nfsv4 > <https://www.ietf.org/mailman/listinfo/nfsv4> >
- Re: [nfsv4] preliminary review of draft-cel-nfsv4… karen deitke
- Re: [nfsv4] preliminary review of draft-cel-nfsv4… karen deitke
- Re: [nfsv4] preliminary review of draft-cel-nfsv4… Chuck Lever
- Re: [nfsv4] preliminary review of draft-cel-nfsv4… David Noveck
- Re: [nfsv4] preliminary review of draft-cel-nfsv4… Chuck Lever
- [nfsv4] preliminary review of draft-cel-nfsv4-rem… David Noveck