Re: [nfsv4] Benjamin Kaduk's Discuss on draft-ietf-nfsv4-rfc5661sesqui-msns-03: (with DISCUSS and COMMENT)
David Noveck <davenoveck@gmail.com> Wed, 22 January 2020 09:11 UTC
In-Reply-To: <20200122031650.GE80030@kduck.mit.edu>
From: David Noveck <davenoveck@gmail.com>
Date: Wed, 22 Jan 2020 04:10:50 -0500
Message-ID: <CADaq8jdZg0SCD1eTK9koN46_SGcCmLmJLqWD6RciU_zkcRX2Lw@mail.gmail.com>
To: Benjamin Kaduk <kaduk@mit.edu>
Cc: The IESG <iesg@ietf.org>, draft-ietf-nfsv4-rfc5661sesqui-msns@ietf.org, "nfsv4-chairs@ietf.org" <nfsv4-chairs@ietf.org>, Magnus Westerlund <magnus.westerlund@ericsson.com>, NFSv4 <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/yq4WYlojFIFHmwLKaWdy1NgDKAU>
I should be able to submit a -04 soon, including any last-minute nits. On Tue, Jan 21, 2020, 10:16 PM Benjamin Kaduk <kaduk@mit.edu> wrote: > Not attempting to trim anything, so sparsely inline... > > On Sat, Jan 18, 2020 at 08:30:18AM -0500, David Noveck wrote: > > On Mon, Jan 13, 2020 at 5:54 PM Benjamin Kaduk <kaduk@mit.edu> wrote: > > > > > Hi David, > > > > > > Trimming lots of good stuff here as well... > > > > > > On Thu, Jan 02, 2020 at 10:09:02AM -0500, David Noveck wrote: > > > > On Wed, Dec 18, 2019 at 3:32 AM Benjamin Kaduk via Datatracker < > > > > noreply@ietf.org> wrote: > > > > > > > > > Benjamin Kaduk has entered the following ballot position for > > > > > draft-ietf-nfsv4-rfc5661sesqui-msns-03: Discuss > > > > > > > > > > > ---------------------------------------------------------------------- > > > > > DISCUSS: > > > > > > ---------------------------------------------------------------------- > > > > > > > > > > Responded to these on 12/20. > > > > > > > > > > > > > > > ---------------------------------------------------------------------- > > > > > COMMENT: > > > > > > ---------------------------------------------------------------------- > > > > > > > > > > I think I may have mistakenly commented on some sections that are > > > > > actually just moved text, since my lookahead window in the diff > was too > > > > > small. > > > > > > > > > > > > No harm, no foul. > > > > > > > > > > > > > > > > > > Since the "Updates:" header is part of the immutable RFC text > (though > > > > > "Updated by:" is mutable), we should probably explicitly state that > > > "the > > > > > updates that RFCs 8178 and 8434 made to RFC 5661 apply equally to > this > > > > > document". > > > > > > > > > > > > > I think we could update the last paragraph of Section 1.1 to be more > > > > explicit about > > > > this. 
Perhaps it could read: > > > > > > > > Until the above work is done, there will not be a consistent set > of > > > > documents providing a description of the NFSv4.1 protocol and any > > > > full description would involve documents updating other documents > > > > within the specification. The updates applied by RFC 8434 [66] > and > > > > RFC 8178 [63] to RFC5661 also apply to this specification, and > > > > will apply to any subsequent v4.1 specification until that work is > > > done. > > > > > > Sounds good. > > > > > > > > > > > > > I note inline (in what is probably too many places; please don't > reply > > > > > at all of them!) some question about how clear the text is that a > file > > > > > system migration is something done at a per-file-system > granularity, > > > and > > > > > that migrating a client at a time is not possible. > > > > > > > > > > > > It might be possible but doing so is not a goal of this specification. > > > > > > > > I'm not sure how to address your concern. I don't know why anyone > would > > > > assume that migrating entire clients is a goal of this specification. > > > As > > > > far as > > > > I can see, when the word "migration" is used it is always in > connection > > > with > > > > migrating a file system. Is there some specific place where you > think > > > > this > > > > issue is likely to arise? > > > > > > I think I garbled my point; my apologies. > > > To give a semi-concrete example, suppose I have clients A and B that > are > > > accessing filesystem F on server X, and filesystem F is also available > on > > > server Y. If X decides that it needs to migrate access to F away from > X > > > (e.g., for maintenance), then the "file system migration event" > involves > > > telling both A and B to look to Y for access to F, at basically the > same > > > time. > > > > > > This clarifies things for me. 
When you were speaking of "migrating a > > client" > > I assumed you worried about consistency of fs's F, G, H for a particular > > client. > > Now it appears the issue is consistency among clients A, B, C, all > > accessing a > > common F. > > > > If X tries to tell only A but not B to access F via Y but lets B > > > continue to access F at X, then I think there can be some subtle > > > consistency issues. > > > > > > > Or worse, some decidedly unsubtle ones :-( > > > > > > > > In some sense, this is easy to consider as a dichotomy between > "migration > > > is for server maintenance" vs. "migration is for load balancing". > > > > > > That categorization helps. > > > > > > > Assuming > > > I understand correctly (not a trivial assumption!), there was never any > > > intent to use these mechanisms for load balancing, > > > > > > Well "Never" covers a lot. There are cases in which you do want to do > > load balancing. For example, if you are dealing with multiple network > > access paths to the same replica, there is no issue with the load > balancing > > approach. In the case of multiple replicas where data consistency > applies > > between them, then you might load balance but it is the server's > > responsibility to > > provide the consistency, meaning that he needs to be warned of the > > possibility > > of issues that might arise if clients modifying the same data are > placed > > on > > different replicas. In the case in which you don't guarantee data > > consistency among > > replicas, you might as well say about doing load balancing that "there be > > dragons". > > > > and if we can explicitly > > > disclaim such usage, then we don't have to try to reason through any > > > potential subtle consistency issues. > > > > > > > I think we can disclaim the really problematic part. I think the new > text > > will be needed in the migration section. Issues with replication are > > different and do not involve any server choice. 
> > > > I anticipate revising section 11.5.5 to read as follows: > > > > When a file system is present and becomes inaccessible using the > > current access path, the NFSv4.1 protocol provides a means by which > > clients can be given the opportunity to have continued access to > > their data. This may involve use of a different access path to the > > existing replica or by providing a path to a different replica. The > > new access path or the location of the new replica is specified by a > > file system location attribute. The ensuing migration of access > > includes the ability to retain locks across the transition. > > Depending on circumstances, this can involve: > > > > o The continued use of the existing clientid when accessing the > > current replica using a new access path. > > > > o Use of lock reclaim, taking advantage of a per-fs grace period. > > > > o Use of Transparent State Migration. > > > > Typically, a client will be accessing the file system in question, > > get an NFS4ERR_MOVED error, and then use a file system location > > attribute to determine the new access path for the data. When > > fs_locations_info is used, additional information will be available > > that will define the nature of the client's handling of the > > transition to a new server. > > > > In most instances clients will choose to migrate all clients using a > > (I assume s/clients/servers/ (just the first time)) > > > particular file system to a successor replica at the same time to > > avoid cases in which different clients are updating different > > replicas. However, migration of individual clients can be helpful in > > providing load balancing, as long as the replicas in question are > > such that they represent the same data as described in > > Section 11.11.8. > > > > o In the case in which there is no transition between replicas > > (i.e., only a change in access path), there are no special > > difficulties in using this mechanism to effect load balancing. 
> > > > o In the case in which the two replicas are sufficiently > > co-ordinated as to allow coherent simultaneous access to both by a > > single client, there is, in general, no obstacle to use of > > migration of particular clients to effect load balancing. > > Generally, such simultaneous use involves co-operation between > > servers to ensure that locks granted on two co-ordinated replicas > > cannot conflict and can remain effective when transferred to a > > common replica. > > > > o In the case in which a large set of clients are accessing a file > > system in a read-only fashion, it can be helpful to migrate all > > clients with writable access simultaneously, while using load > > balancing on the set of read-only copies, as long as the rules > > appearing in Section 11.11.8, designed to prevent data reversion > > are adhered to. > > > > In other cases, the client might not have sufficient guarantees of > > data similarity/coherence to function properly (e.g. the data in the > > two replicas is similar but not identical), and the possibility that > > different clients are updating different replicas can exacerbate the > > difficulties, making use of load balancing in such situations a > > perilous enterprise. > > > > The protocol does not specify how the file system will be moved > > between servers or how updates to multiple replicas will be > > co-ordinated. It is anticipated that a number of different > > server-to-server co-ordination mechanisms might be used with the choice left to > > the server implementer. The NFSv4.1 protocol specifies the method > > used to communicate the migration event between client and server. > > > > The new location may be, in the case of various forms of server > > clustering, another server providing access to the same physical file > > system. 
The client's responsibilities in dealing with this > > transition will depend on whether a switch between replicas has > > occurred and the means the server has chosen to provide continuity of > > locking state. These issues will be discussed in detail below. > > > > Although a single successor location is typical, multiple locations > > may be provided. When multiple locations are provided, the client > > will typically use the first one provided. If that is inaccessible > > for some reason, later ones can be used. In such cases the client > > might consider the transition to the new replica to be a migration > > event, even though some of the servers involved might not be aware of > > the use of the server which was inaccessible. In such a case, a > > client might lose access to locking state as a result of the access > > transfer. > > > > When an alternate location is designated as the target for migration, > > it must designate the same data (with metadata being the same to the > > degree indicated by the fs_locations_info attribute). Where file > > systems are writable, a change made on the original file system must > > be visible on all migration targets. Where a file system is not > > writable but represents a read-only copy (possibly periodically > > updated) of a writable file system, similar requirements apply to the > > propagation of updates. Any change visible in the original file > > system must already be effected on all migration targets, to avoid > > any possibility that a client, in effecting a transition to the > > migration target, will see any reversion in file system state. > > > > > > > > As was the case for > > > > > my Discuss point about addresses/port-numbers, I'm missing the > context > > > > > of the rest of the document, so perhaps this is a non-issue, but > the > > > > > consequences of getting it wrong seem severe enough that I wanted > to > > > > > check. > > > > > > > > > > > > > I'm not seeing any severe consequences. 
Am I missing something? > > > > > > > > > > > > > > > This is clearer now. I think we can avoid any severe consequences. > > > > > > > > > > > > Section 1.1 > > > > > > > > > > The revised description of the NFS version 4 minor version 1 > > > > > (NFSv4.1) protocol presented in this update is necessary to > enable > > > > > full use of trunking in connection with multi-server namespace > > > > > features and to enable the use of transparent state migration in > > > > > connection with NFSv4.1. [...] > > > > > > > > > > nit: do we expect all readers to know what is meant by "trunking" > with > > > > > no other lead-in? > > > > > > > > > > > > > Good point. Perhaps it could be addressed by rewriting the material > in > > > the > > > > first paragraph of Section 1.1 to read as follows: > > > > > > > > Two important features previously defined in minor version 0 but > > > > never fully addressed in minor version 1 are trunking, the use of > > > > multiple connections between a client and server potentially to > > > > different network addresses, and transparent state migration, > which > > > > allows a file system to be transferred betwwen servers in a way > that > > > > provides for the client to maintain its existing locking state > across > > > > the transfer. > > > > > > Maybe "the simultaneous use of multiple connections"? > > > > > > > Will add. > > > > > > > nit: s/betwwen/between/ > > > > > > > Fixed. > > > > > > > > > > The revised description of the NFS version 4 minor version 1 > > > > (NFSv4.1) protocol presented in this update is necessary to enable > > > > full use of these features with other multi-server namespace > features. > > > > This document is in the form of an updated description of the NFS > 4.1 > > > > protocol previously defined in RFC5661 [62]. RFC5661 is > obsoleted by > > > > this document. However, the update has a limited scope and is > focused > > > > on enabling full use of trunking and transparent state migration. 
> > > The > > > > need for these changes is discussed in Appendix A. Appendix B > > > describes > > > > the specific changes made to arrive at the current text. > > > > > > This looks good, thanks. > > > > > > > :-) > > > > > > > > > > [...] > > > > > > > > > > o Work would have to be done to address many erratas relevant > to > > > RFC > > > > > 5661, other than errata 2006 [60], which is addressed in this > > > > > document. That errata was not deferrable because of the > > > > > interaction of the changes suggested in that errata and > handling > > > > > of state and session migration. The erratas that have been > > > > > deferred include changes originally suggested by a particular > > > > > errata, which change consensus decisions made in RFC 5661, > which > > > > > need to be changed to ensure compatibility with existing > > > > > implementations that do not follow the handling delineated > in RFC > > > > > 5661. Note that it is expected that such erratas will remain > > > > > > > > > > This sentence is pretty long and hard to follow; maybe it could be > > > split > > > > > after "change consensus decisions made in RFC 5661" and the second > half > > > > > start with a more declarative statement about existing > implementations? > > > > > (E.g., "Existing implementations did not perform handling as > > > delineated in > > > > > RFC > > > > > 5661 since the procedures therein were not workable, and in order > to > > > > > have the specification accurately reflect the existing deployment > base, > > > > > changes are needed [...]") > > > > > > > > > > > > > I will clean this bullet up. See below for a proposed replacement. > > > > > > > > > > > > > > > > > > relevant to implementers and the authors of an eventual > > > > > rfc5661bis, despite the fact that this document, when > approved, > > > > > will obsolete RFC 5661. 
> > > > > > > > > > (I assume the RFC Editor can tweak this line to reflect what > actually > > > > > happens; my understanding is that the errata reports will get > cloned to > > > > > this-RFC.) > > > > > > > > > > > > > I understand that Magnus has already got that issue addressed. I'll > > > > discuss the appropriate text with him. > > > > > > > > > > > > > [rant about "errata" vs. "erratum" elided] > > > > > > > > > > > > > This is annoying but there is no way we are going to get people to > use > > > > "erratum". What I've tried to do in my proposed replacement text > > > > is to refer to "errata report(s)", which is more accurate and allows > > > > people who speak English to use English singulars and plurals, > without > > > > having to worry about Latin grammar. > > > > > > That's what I try to do as well :) > > > > > > > Here's my proposed replacement for the troubled bullet: > > > > > > > > o Work needs to be done to address many errata reports relevant > to > > > > RFC 5661, other than errata report 2006 [60], which is > addressed > > > > in this document. Addressing of that report was not deferrable > > > > because of the interaction of the changes suggested there and > the > > > > newly described handling of state and session migration. > > > > > > > > The errata reports that have been deferred and that will need > to > > > > be addressed in a later document include reports currently > > > > assigned a range of statuses in the errata reporting system > > > > including reports marked Accepted and those marked Held Over > > > > > > nit: it's "Hold For Document Update" > > > > > > Fixed > > > > > > because the change was too minor to address immediately. > > > > > > > > In addition, there is a set of other reports, including at > least > > > > one in state Rejected, which will need to be addressed in a > later > > > > document. 
This will involve making changes to consensus > decisions > > > > reflected in RFC 5661, in situations in which the working > group has > > > > already decided that the treatment in RFC 5661 is incorrect, > and > > > > needs > > > > to be revised to reflect the working group's new consensus and > > > ensure > > > > compatibility with existing implementations that do not follow > the > > > > handling described in RFC 5661. > > > > > > > > Note that it is expected that such all errata reports will > remain > > > > > > nit: s/such all/all such/ > > > > > > Fixed. > > > > > > relevant to implementers and the authors of an eventual > > > > rfc5661bis, despite the fact that this document, when approved, > > > > will obsolete RFC 5661 [62]. > > > > > > This looks really good! > > > > > > > > > > > > Section 2.10.4 > > > > > > > > > > Servers each specify a server scope value in the form of an > opaque > > > > > string eir_server_scope returned as part of the results of an > > > > > EXCHANGE_ID operation. The purpose of the server scope is to > allow > > > a > > > > > group of servers to indicate to clients that a set of servers > > > sharing > > > > > the same server scope value has arranged to use compatible > values of > > > > > otherwise opaque identifiers. Thus, the identifiers generated > by > > > two > > > > > servers within that set can be assumed compatible so that, in > some > > > > > cases, identifiers generated by one server in that set may be > > > > > presented to another server of the same scope. > > > > > > > > > > Is there more that we can say than "in some cases"? > > > > > > > > > > > > Not really. In general, when a server sends you an id, it comes > with an > > > > implied promise to recognize it when you present it subsequently to > the > > > > same server. > > > > > > > > The fact that two servers have decided to co-operate in their Id > > > assignment > > > > does not change that. 
> > > > > > > > The previous text > > > > > implies a higher level of reliability than just "some cases", to > me. > > > > > > > > > > > > > I think I need to change the text, perhaps by replacing "use > compatible > > > > values of otherwise > > > > opaque identifiers" by "use distinct values of otherwise opaque > > > identifiers > > > > so that the two > > > > servers never assign the same value to two distinct objects". > > > > > > > > I anticipate the following replacement for the first two paragraphs > of > > > > Section 2.10.4: > > > > > > > > Servers each specify a server scope value in the form of an opaque > > > > string eir_server_scope returned as part of the results of an > > > > EXCHANGE_ID operation. The purpose of the server scope is to > allow a > > > > group of servers to indicate to clients that a set of servers > sharing > > > > the same server scope value has arranged to use distinct values of > > > > opaque identifiers so that the two servers never assign the same > > > > value to two distinct objects. Thus, the identifiers generated by > two > > > > servers within that set can be assumed compatible so that, in > certain > > > > important cases, identifiers generated by one server in that set > may > > > > be presented to another server of the same scope. > > > > > > > > The use of such compatible values does not imply that a value > > > > generated by one server will always be accepted by another. In > most > > > > cases, it will not. However, a server will not accept a value > > > > generated by another inadvertently. When it does accept it, it > will > > > > > > nit: I think it flows better to put "inadvertently" as "will not > > > inadvertently accept". > > > > > > > OK. Fixed. 
> > > > > > > > > > > > As an illustration of the (limited) value of this information, > consider > > > the > > > > case of client recovery from a server reboot. The client has to > reclaim > > > > his locks using file handles returned by the previous server > instance. > > > If > > > > the server scopes are the same (they almost always are), the client > is > > > not > > > > sure he will get his locks back (e.g. the file might have been > deleted), > > > > but he does know that, if the lock reclaim succeeds, it is for the > same > > > > file. If the server scopes are not the same, he has no such > assurance. > > > > > > Thanks, the new text (and explanation here) is very clear about what's > > > going on. > > > > > > [...] > > > > > Section 11.5.5 > > > > > > > > > > will typically use the first one provided. If that is > inaccessible > > > > > for some reason, later ones can be used. In such cases the > client > > > > > might consider that the transition to the new replica as a > migration > > > > > event, even though some of the servers involved might not be > aware > > > of > > > > > the use of the server which was inaccessible. In such a case, a > > > > > > > > > > nit: the grammar here got wonky; maybe s/as a/is a/? > > > > > > > > > > > > > > How about s/as a/to be a/ ? > > > > > > That works if you drop the earlier "that", for "the client might > consider > > > the transition to the new replica to be a migration event". > > > > > > Did that. > > > > > [...] > > > > > > > > > > o The "local" representation of all owners and groups must be > the > > > > > same on all servers. The word "local" is used here since > that is > > > > > the way that numeric user and group ids are described in > > > > > Section 5.9. However, when AUTH_SYS or stringified owners or > > > > > group are used, these identifiers are not truly local, since > they > > > > > are known to the clients as well as the server. 
> > > > > > > > > > I am trying to find a way to note that the AUTH_SYS case mentioned > here > > > > > is precisely because of the requirement being imposed by this > bullet > > > > > point, > > > > > > > > > > > > Not sure what you mean by that. I think the requirement is to allow > the > > > > client > > > > to be able to use AUTH_SYS, without the contortions that would be > > > required > > > > if > > > > different fs's had the same uid's meaning different things. > > > > > > > > while acknowledging that the "stringified owners or group" case > > > > > is separate, but not having much luck. > > > > > > > > > > > > > My attempt to revise this area is below: > > > > > > > > Note that there is no requirement in general that the users > > > > corresponding to particular security principals have the same > local > > > > representation on each server, even though it is most often the > case > > > > that this is so. > > > > > > > > When AUTH_SYS is used, the following additional requirements must > be > > > > met: > > > > > > > > o Only a single NFSv4 domain can be supported through use of > > > > AUTH_SYS. > > > > > > > > o The "local" representation of all owners and groups must be the > > > > same on all servers. The word "local" is used here since that > is > > > > the way that numeric user and group ids are described in > > > > Section 5.9. However, when AUTH_SYS or stringified numeric > owners > > > > or groups are used, these identifiers are not truly local, > since > > > > they are known to the clients as well as the server. > > > > > > > > Similarly, when stringified numeric user and group ids are used, > the > > > > "local" representation of all owners and groups must be the same > on > > > > all servers, even when AUTH_SYS is not used. > > > > > > I really like this rewriting; thank you for undertaking it. 
> > > I think that what I was trying to say here is roughly that we need > > > scare-quotes for "local" because of things like AUTH_SYS (or > stringified > > > user/group ids) that involve sending local representations over the > > > network. So your rewrite did in fact address my concern, even though I > > > didn't manage to say it very well the first time :) > > > > > > [...] > > > > > > > > > > o When there are no potential replacement addresses in use but > > > there > > > > > > > > > > What is a "replacement address"? > > > > > > > > > > > > > I've explained that in some new text added before these bullets, as > a new > > > > second > > > > paragraph of this section: > > > > > > > > The appropriate action depends on the set of replacement addresses > > > > (i.e. server endpoints which are server-trunkable with one > previously > > > > being used) which are available for use. > > > > > > > > > > > > > > are valid addresses session-trunkable with the one whose use > is > > > to > > > > > be discontinued, the client can use BIND_CONN_TO_SESSION to > > > access > > > > > the existing session using the new address. Although the > target > > > > > session will generally be accessible, there may be cases in > which > > > > > that session is no longer accessible. In this case, the > client > > > > > can create a new session to enable continued access to the > > > > > existing instance and provide for use of existing > filehandles, > > > > > stateids, and client ids while providing continuity of > locking > > > > > state. > > > > > > > > > > I'm not sure I understand this last sentence. On its own, the "new > > > > > session to enable continued access to the existing instance" sounds > > > like > > > > > the continued access would be on the address whose use is to > cease, and > > > > > thus the new session would be there. > > > > > > > > > > > > That is not the intention. Will need to clarify. 
> > > > > > But why make a new session when > > > > > the old one is still good, > > > > > > > > > > It isn't usable on the new connection. > > > > > > > > > > > especially when we just said in the previous > > > > > sentence that the old session can't be moved to the new > > > > > connection/address? > > > > > > > > > > > > > Because we can't use it on the new connection, we have to create a > > > > new session to access the client. > > > > > > > > Perhaps a forward reference down to Section 11.12.{4,5} for this and > the > > > > > next bullet point would help as well as rewording? > > > > > > > > > > > > > It turns out these would add confusion since they deal with migration > > > > situations > > > > and deciding whether transparent state migration has occurred in the > > > switch > > > > between > > > > replicas. In the cases we are dealing with, there is only a single > > > > replica/fs and no > > > > migration. > > > > > > > > Here is my proposed replacement text for the two bullets in question: > > > > > > > > o When there are no potential replacement addresses in use but > there > > > > are valid addresses session-trunkable with the one whose use > is to > > > > be discontinued, the client can use BIND_CONN_TO_SESSION to > access > > > > the existing session using the new address. Although the > target > > > > session will generally be accessible, there may be rare > situations > > > > in which that session is no longer accessible, when an attempt > is > > > > made tto bind the new conntectin to it. In this case, the > client > > > > > > nits: s/tto/to/, s/conntectin/connection/ > > > > > > > Fixed. > > > > > > > > > can create a new session to enable continued access to the > > > > existing instance and provide for use of existing filehandles, > > > > stateids, and client ids while providing continuity of locking > > > > state. 
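[Editor's aside: the fallback sequence in the bullet above can be sketched as client-side logic. BIND_CONN_TO_SESSION and CREATE_SESSION are the NFSv4.1 operations under discussion (RFC 5661); the MockClient class and all helper names below are invented purely for illustration.]

```python
# Hypothetical sketch of the recovery steps described above; not a real
# NFSv4.1 client.  Only the operation names come from RFC 5661.

class MockClient:
    def __init__(self, session_ok=True):
        self.session_id = "sess-1"
        self.client_id = "client-1"
        self.session_ok = session_ok

    def bind_conn_to_session(self, new_addr):
        # BIND_CONN_TO_SESSION succeeds only while the existing session
        # is still reachable through the new address.
        return self.session_ok

    def create_session(self, new_addr):
        # CREATE_SESSION against the same client ID: filehandles,
        # stateids, and locking state carry over, so no reclaim needed.
        self.session_id = "sess-2"
        return self.session_id

def switch_access_path(client, new_addr):
    """Prefer rebinding the existing session; fall back to a new one."""
    if client.bind_conn_to_session(new_addr):
        return client.session_id        # same session, state intact
    return client.create_session(new_addr)

print(switch_access_path(MockClient(session_ok=True), "10.0.0.2"))   # sess-1
print(switch_access_path(MockClient(session_ok=False), "10.0.0.2"))  # sess-2
```

The point of the ordering is that BIND_CONN_TO_SESSION is the cheap path (no new session state on the server); CREATE_SESSION is attempted only when the bind fails.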
> > > > > > Just to check: this sounds like even in the case where the client > creates > > > a new session, the filehandle, stateid, clientid, and locking state > > > (values) are in effect "transparently preserved" by the server, so the > > > client has no need to do any reclamation of locking state. I think > that's > > > what's intended, but holler if I'm wrong about that. > > > > > Ok. > > I'll holler that *you're right about that.* > > > > > > > > > o When there is no potential replacement address in use and there > > > > are no valid addresses session-trunkable with the one whose > use is > > > > to be discontinued, other server-trunkable addresses may be > used > > > > to provide continued access. Although use of CREATE_SESSION is > > > > available to provide continued access to the existing instance, > > > > servers have the option of providing continued access to the > > > > existing session through the new network access path in a > fashion > > > > similar to that provided by session migration (see Section > 11.12). > > > > To take advantage of this possibility, clients can perform an > > > > initial BIND_CONN_TO_SESSION, as in the previous case, and use > > > > CREATE_SESSION only if that fails. > > > > > > > > > > > > > Section 11.10.6 > > > > > > > > > > In a file system transition, the two file systems might be > clustered > > > > > in the handling of unstably written data. When this is the > case, > > > and > > > > > > > > > > What does "clustered in the handling of unstably written data" > mean? > > > > > > > > > > the two file systems belong to the same write-verifier class, > write > > > > > > > > > > How is the client supposed to determine "when this is the case"? > > > > > > > > > > > > > Here's a proposed replacement for this paragraph: > > > > > > > > In a file system transition, the two file systems might be > > > > cooperating in the handling of unstably written data. 
Clients can > > > > determine if this is the case, by seeing if the two file systems > > > > belong to the same write-verifier class. When this is the case, > > > > write verifiers returned from one system may be compared to those > > > > returned by the other and superfluous writes avoided. > > > > > > > > > > > > > Section 11.10.7 > > > > > > > > > > In a file system transition, the two file systems might be > > > consistent > > > > > in their handling of READDIR cookies and verifiers. When this > is > > > the > > > > > case, and the two file systems belong to the same readdir class, > > > > > > > > > > As above, how is the client supposed to determine "when this is the > > > > > case"? > > > > > > > > > > > > > READDIR cookies and verifiers from one system may be recognized > by > > > > > the other and READDIR operations started on one server may be > > > validly > > > > > continued on the other, simply by presenting the cookie and > verifier > > > > > returned by a READDIR operation done on the first file system > to the > > > > > second. > > > > > > > > > > Are these "may be"s supposed to admit the possibility that the > > > > > destination server can just decide to not honor them arbitrarily? > > > > > > > > > > > > > No. They are intended to indicate that the client might or might not > use > > > > the capability. > > > > > > > > Here is proposed replacement text for the paragraph: > > > > > > > > In a file system transition, the two file systems might be > consistent > > > > in their handling of READDIR cookies and verifiers. Clients can > > > > determine if this is the case, by seeing if the two file systems > > > > belong to the same readdit class. When this is the case, readdir > > > > > > nit: s/readdit/readdir/ > > > > > > > Fixed. 
> > > > > > > > > > > > > class, READDIR cookies and verifiers from one system will be > > > > recognized by the other and READDIR operations started on one > server > > > > can be validly continued on the other, simply by presenting the > > > > cookie and verifier returned > > > > > > Ah, this formulation (for both write-verifier and readdir) is very > helpful. > > > > > > [...] > > > > > Section 11.16.1 > > > > > > > > > > With the exception of the transport-flag field (at offset > > > > > FSLI4BX_TFLAGS with the fls_info array), all of this data > applies to > > > > > the replica specified by the entry, rather that the specific > network > > > > > path used to access it. > > > > > > > > > > Is it clear that this applies only to the fields defined by this > > > > > specification (since, as mentioned later, future extensions must > > > specify > > > > > whether they apply to the replica or the entry)? > > > > > > > > > > > > > Intend to use the following replacement text: > > > > > > > > With the exception of the transport-flag field (at offset > > > > FSLI4BX_TFLAGS with the fls_info array), all of this data > defuined in > > > > > > nit: s/defuined/defined/ > > > > > > > Fixed. > > > > > > > > this specification applies to the replica specified by the entry, > > > > rather than the specific network path used to access it. The > > > > classification of data in extensions to this data is discussed > > > > below > > > > > > [...] > > > > > Section 18.35.3 > > > > > > > > > > I a little bit wonder if we want to reaffirm that co_verifier > remains > > > > > fixed when the client is establishing multiple connections for > trunking > > > > > usage -- the "incarnation of the client" language here could make a > > > > > reader wonder, though I think the discussion of its use elsewhere > as > > > > > relating to "client restart" is sufficiently clear. > > > > > > > > > > > > > This should be made clearer but the clarification needs to be done > > > multiple > > > > places. 
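As an aside on the write-verifier and readdir class replacements discussed above, the client-side check they imply can be sketched as follows. This is a toy model under stated assumptions: the dictionary keys standing in for fs_locations_info class values are invented for illustration and are not the XDR field names.

```python
# Sketch: a client compares class values advertised for the source and
# destination file systems; the cross-fs guarantee (comparable write
# verifiers, reusable READDIR cookies) applies only when the class
# values match. Field names here are hypothetical.

def same_class(info_src, info_dst, class_field):
    """True when both file systems advertise the same non-zero class."""
    a = info_src.get(class_field, 0)
    b = info_dst.get(class_field, 0)
    return a != 0 and a == b

def write_needed(info_src, info_dst, old_verifier, new_verifier):
    """After a transition, an unstable write may be skipped only if the
    file systems share a write-verifier class and the verifiers match;
    otherwise the client must rewrite the unstably written data."""
    if same_class(info_src, info_dst, "write_verifier_class"):
        return old_verifier != new_verifier
    return True
```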
> > > > > > > > Possible replacement text for eighth non-code paragraph of section > 2.4: > > > > > > > > The first field, co_verifier, is a client incarnation verifier, > > > > allowing the server to distinguish successive incarnations (e.g. > > > > reboots) of the same client. The server will start the process of > > > > canceling the client's leased state if co_verifier is different > than > > > > what the server has previously recorded for the identified client > (as > > > > specified in the co_ownerid field). > > > > > > > > Likely replacement text for the seventh paragraph of this section: > > > > > > > > The eia_clientowner field is composed of a co_verifier field and a > > > > co_ownerid string. As noted in Section 2.4, the co_ownerid > describes > > > > the client, and the co_verifier is the incarnation of the > client. An > > > > EXCHANGE_ID sent with a new incarnation of the client will lead to > > > > the server removing lock state of the old incarnation. Whereas an > > > > EXCHANGE_ID sent with the current incarnation and co_ownerid will > > > > result in an error, an update of the client ID's properties, > > > > depending on the arguments to EXCHANGE_ID, or the return of > > > > information about the existing client_id as might happen when this > > > > operation is done to the same server using different network > > > > addresses as part of creating trunked connections. > > > > > > > Not sure what error that error text was referring to above. Think it > > added to the > > confusion. > > > > > > > > I think I get the general sense of what is going on here (i.e., the > last > > > sentence) but am still uncertain on the specifics. Namely, "most of > the > > > time" (TM), sending EXCHANGE_ID with current incarnation/ownerid will > be an > > > error, since it's a client bug to try to register the same way twice > in a > > > row. > > > > > > No it isn't. This is case 2 on page 508, " Non-Update on Existing > Client > > ID". 
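The case distinction being argued here (new incarnation vs. the "Non-Update on Existing Client ID" case) can be sketched server-side as follows. This is a simplification for illustration only: real EXCHANGE_ID processing also covers state protection, confirmation, and property updates, and the record layout and return labels below are invented, not from the XDR.

```python
# Server-side sketch of how EXCHANGE_ID treats eia_clientowner:
# a changed co_verifier signals a new incarnation (e.g. a reboot) and
# old lock state is discarded; an unchanged co_verifier/co_ownerid is
# NOT an error -- it matches "Non-Update on Existing Client ID", as
# happens when trunked connections are being created.

def exchange_id(clients, co_ownerid, co_verifier):
    rec = clients.get(co_ownerid)
    if rec is None:
        # Previously unknown client: record this incarnation.
        clients[co_ownerid] = {"verifier": co_verifier, "locks": []}
        return "new-client-id"
    if rec["verifier"] != co_verifier:
        # New incarnation: cancel the old incarnation's leased state.
        rec["verifier"] = co_verifier
        rec["locks"].clear()
        return "new-incarnation"
    # Same incarnation and owner: return/update existing client ID.
    return "non-update-on-existing-client-id"
```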
> > Given retries and possible communication difficulties, it is just too > hard > > to > > make this case an error. > > > > However, sometimes we might have to do that in order to update > > > properties of the client or get some new information that a server has > > > associated to a given client ID. I *think* (but am not sure) that the > > > error case is exactly when the (same-incarnation/ownerid) EXCHANGE_ID > is > > > done to the same *server and address* as the original EXCHANGE_ID, and > that > > > the "update properties or get new information back" case is exactly > when > > > the EXCHANGE_ID is done to a different server/address combination. > > > > > > If I'm right about that, then I'd suggest: > > > > > > % the server removing lock state of the old incarnation. Whereas an > > > % EXCHANGE_ID sent with the current incarnation and co_ownerid will > > > % result in an error when sent to a given server at a given address > for > > > % a second time, it is not an error to send EXCHANGE_ID with current > > > % incarnation and co_ownerid to a different server (e.g., as part > of a > > > % migration event). In such cases, the EXCHANGE_ID can allow for an > > > % update of the client ID's properties, depending on the arguments > to > > > % EXCHANGE_ID, or the return of (potentially updated) information > about > > > % the existing client_id, as might happen when this operation is > done to > > > % the same server using different network addresses as part of > creating > > > % trunked connections. > > > > > > > I think I have to revise the paragraph above to be clearer. I > anticipate > > replacing the seventh paragraph of section 18.35.3 by the following > > replacement: > > > > The eia_clientowner field is composed of a co_verifier field and a > > co_ownerid string. As noted in Section 2.4, the co_ownerid > > identifies the client, and the co_verifier specifies a particular > > incarnation of that client. 
An EXCHANGE_ID sent with a new > > incarnation of the client will lead to the server removing lock state > > of the old incarnation. On the other hand, an EXCHANGE_ID sent with > > the current incarnation and co_ownerid will, when it does not result > > in an unrelated error, potentially update an existing client ID's > > properties, or simply return information about the existing > > client_id. The latter would happen when this operation is done to > > the same server using different network addresses as part of creating > > trunked connections. > > Ah, I think I get it now. Thanks. > > > > > > > > Section 21 > > > > > > > > > > Some other topics at least somewhat related to trunking and > migration > > > > > that we could potentially justify including in the current, > > > > > limited-scope, update (as opposed to deferring for a full -bis) > > > include: > > > > > > > > > > > > > Some of these are related to multi-server namespace but not related > to > > > > security, as far as I can see. > > > > > > It does look like it; in some sense I was going through a brainstorming > > > exercise to make this list, and appreciate the sanity checks. (To be > > > clear, I am not insisting that any of them get covered in specifically > the > > > sesqui update, just mentioning topics for potential consideration.) > > > > > > > > > > > > > > - clients that lie about reclaimed locks during a post-migration > grace > > > > > period > > > > > > > > > > > > > Will address in a number of places: > > > > > > > > First of all, I intend to add a new paragraph to Section 21, to be > placed > > > as > > > > the > > > > sixth non-bulleted paragraph and to read as follows: > > > > > > > > Security considerations for lock reclaim differ between the state > > > > reclaim done after server failure (discussed in Section 8.4.2.1.1) > and > > > > the per-fs state reclaim done in support of migration/replication > > > > (discussed in Section 11.11.9.1). 
> > > > > > > > Next is a proposed new section to appear as Section 11.11.9.1: > > > > > > > > 11.11.9.1. Security Consideration Related to Reclaiming Lock State > > > > after File System Transitions > > > > > > > > Although it is possible for a client reclaiming state to > misrepresent > > > > its state, in the same fashion as described in Section 8.4.2.1.1, > > > > most implementations providing for such reclamation in the case of > > > > file system transitions will have the ability to detect such > > > > misreprsentations. this limits the ability of unauthenicatd > clients > > > > > > typos: "misrepresentations", "This", "unauthenticated" > > > > > > > Fixed. > > > > > > > > > > > to execute denial-of-service attacks in these cirsumstances. > > > > > > "circumstances" > > > > > > > > Fixed. > > > > > > > > Nevertheless, the rules stated in Section 8.4.2.1.1, regarding > > > > principal verification for reclaim requests, apply in this > situation > > > > as well. > > > > > > > > Typically,implementations support file system transitions will > have > > > > > > nits: space after comma, and "that" for "that support" > > > > > > Fixed. > > > > > > > > extensive information about the locks to be transferred. This is > > > > because: > > > > > > > > o Since failure is not involved, there is no need to store > locking > > > > information in persistent storage. > > > > > > > > o There is no need, as there is in the failure case, to update > > > > multiple repositories containing locking state to keep them in > sync. > > > > Instead, there is a one-time communication of locking state > from > > > > the source to the destination server. > > > > > > > > o Providing this information avoids potential interference with > > > > existing clients using the destination file system, by denying > > > > them the ability to obtain new locks during the grace period. 
> > > > > > > > When such detailed locking infornation, not necessarily including > the > > > > associated stateid,s is available, > > > > > > nits: "information", s/stateid,s/stateids,/ > > > > > > > Fixed. > > > > > > > > > > > > o It is possible to detect reclaim requests that attempt to > reclsim > > > > > > nit: s/reclsim/reclaim/ > > > > > > > Fixed. > > > > > > > > > locks that did not exist before the transfer, rejecting them > with > > > > NFS4ERR_RECLAIM_BAD (Section 15.1.9.4). > > > > > > > > o It is possible, when dealing with non-reclaim requests, to > > > > determine whether they conflict with existing locks, > eliminating > > > > the need to return NFS4ERR_GRACE (Section 15.1.9.2) on non- > > > > reclaim requests. > > > > > > > > It is possible for implementations of grace periods in connection > > > > with file system transitions not to have detailed locking > information > > > > available at the destination server, in which case the security > > > > situation is exactly as described in Section 8.4.2.1.1. > > > > > > > > I think I should also draw your attention to a revised Section > 15.1.9. > > > > These > > > > include some revisions originally done for > > > > draft-ietf-nfsv4-rfc5661-msns-update, > > > > which somehow got dropped, as well as a few that turned up as necessary in > writing > > > > 11.11.9.1: > > > > > > > > 15.1.9. Reclaim Errors > > > > > > > > These errors relate to the process of reclaiming locks after a > server > > > > restart. > > > > > > > > 15.1.9.1. NFS4ERR_COMPLETE_ALREADY (Error Code 10054) > > > > > > > > The client previously sent a successful RECLAIM_COMPLETE operation > > > > specifying the same scope, whether that scope is global or for the > > > > same file system in the case of a per-fs RECLAIM_COMPLETE. An > > > > additional RECLAIM_COMPLETE operation is not necessary and > results in > > > > this error. > > > > > > > > 15.1.9.2. 
NFS4ERR_GRACE (Error Code 10013) > > > > > > > > This error is returned when the server was in its recovery or > grace > > > > period. with regard to the file system object for which the lock > was > > > > > > (no full stop) > > > > > > > Fixed. > > > > > > > > requested resulting in a situation in which a non-reclaim locking > > > > request could not be granted. This can occur because either > > > > > > > > o The server does not have sufficient information about locks > that > > > > might be potentially reclaimed to determine whether the lock > could > > > > validly be granted. > > > > > > > > o The request is made by a client responsible for reclaiming its > > > > locks that has not yet done the appropriate RECLAIM_COMPLETE > > > > operation, allowing it to proceed to obtain new locks. > > > > > > > > It should be noted that, in the case of a per-fs grace period, > there > > > > may be clients, i.e. those currently using the destination file > > > > system who might be unaware of the circumstances resulting in the > > > > > > nit: comma after "file system" > > > > > > > This phrase is now within parentheses. > > > > > > > > > initiation of the grace period. Such clients need to periodically > > > > retry the request until the grace period is over, just as other > > > > clients do. > > > > > > > > 15.1.9.3. NFS4ERR_NO_GRACE (Error Code 10033) > > > > > > > > A reclaim of client state was attempted in circumstances in which > the > > > > server cannot guarantee that conflicting state has not been > provided > > > > to another client. This occurs if there is no active grace period > > > > applying to the file system object for which the request was > made, if > > > > the client making the request has no current role in reclaiming > > > > locks, or because previous operations have created a situation in > > > > which the server is not able to determine that a > reclaim-interfering > > > > edge condition does not exist. > > > > > > > > 15.1.9.4. 
NFS4ERR_RECLAIM_BAD (Error Code 10034) > > > > > > > > The server has determined that a reclaim attempted by the client > is > > > > not valid, i.e. the lock specified as being reclaimed could not > > > > possibly have existed before the server restart or file system > > > > migration event. A server is not obliged to make this > determination > > > > and will typically rely on the client to only reclaim locks that > the > > > > client was granted prior to restart. However, when a server does > > > > have reliable information to enable it to make this determination, > this > > > > error indicates that the reclaim has been rejected as invalid. > This > > > > is as opposed to the error NFS4ERR_RECLAIM_CONFLICT (see > > > > Section 15.1.9.5) where the server can only determine that there > has > > > > been an invalid reclaim, but cannot determine which request is > > > > invalid. > > > > > > > > 15.1.9.5. NFS4ERR_RECLAIM_CONFLICT (Error Code 10035) > > > > > > > > The reclaim attempted by the client has encountered a conflict and > > > > cannot be satisfied. This potentially indicates a misbehaving > > > > client, although not necessarily the one receiving the error. The > > > > misbehavior might be on the part of the client that established > the > > > > lock with which this client conflicted. See also Section 15.1.9.4 > > > > for the related error, NFS4ERR_RECLAIM_BAD. > > > > > > Thanks for remembering to fetch these updates from the full bis WIP! > > > > > > > > > > > > - how attacker capabilities compare by using a compromised server > to > > > > > give bogus referrals/etc. as opposed to just giving bogus > data/etc. > > > > > > > > > > > > > Will address. See the paragraphs to be added to the end of Section > 21. > > > > > > > > > > > > > - an attacker in the network trying to shift client traffic (in > terms > > > of > > > > > what endpoints/connections they use) to overload a server > > > > > > > > > > > > > Will address. 
See the paragraphs to be added to the end of Section > 21. > > > > > > > > > > > > > - how asynchronous replication can cause clients to repeat > > > > > non-idempotent actions > > > > > > > > > > > > > Not sure what you are referring to. > > > > > > I don't have something fully fleshed out here, but it's in the general > > > space when there are multiple replicas that get updates at (varying) > > > delays from the underlying write. A contrived situation would be if > you > > > have a pool of worker machines that use NFS for state management (I > know, a > > > pretty questionable idea), and try to do compare-and-set on a state > file. > > > If one worker tries to assert that it owns the state but other NFS > replicas > > > see delayed updates, additional worker machines could also try to > claim the > > > state and perform whatever operation the state file is controlling. > > > > > > Basically, the point here is that if you as an NFS consumer are using > NFS > > > with relaxed replication semantics, you have to think through how your > > > workflow will behave in the presence of such relaxed updates. Which > ought > > > to be obvious, when I say it like that, but perhaps is not always > actually > > > obvious. > > > > > > > > > > > > - the potential for state skew and/or data loss if migration events > > > > > happen in close succession and the client "misses a notification" > > > > > > > > > > > > > Is there a specific problem that needs to be addressed? > > > > > > I don't have a concrete scenario that's specific to NFS, no; this is > just a > > > generic possibility for any scheme that involves discrete updates > (e.g., > > > file-modifying RPCs) and the potential for asynchronous replication. > > > > > > > I think that the necessary discussion can be folded into some > clarification > > of replication discussed in > > another thread. > > Sure. 
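The contrived compare-and-set scenario described above can be made concrete with a toy model. This is not NFS code; it just models two workers doing compare-and-set through replicas with asynchronous update propagation, showing how both can conclude they own the state.

```python
# Toy model of relaxed replication semantics: a write lands on the
# primary replica, but a second (stale) replica has not yet seen it,
# so a compare-and-set against the stale view also "succeeds".

def compare_and_set(replica, expected, new):
    """CAS against one replica's (possibly stale) view of the state."""
    if replica["state"] == expected:
        replica["state"] = new
        return True
    return False

primary = {"state": "free"}
stale = {"state": "free"}  # replication of worker1's claim is delayed

worker1_claimed = compare_and_set(primary, "free", "owned-by-1")
worker2_claimed = compare_and_set(stale, "free", "owned-by-2")
# Both CAS calls succeed, so both workers believe they own the state:
# the double-claim hazard the thread is pointing at.
```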
> > > > > > > > > > > > - cases where a filesystem moves and there's no longer anything > running > > > > > at the old network endpoint to return NFS4ERR_MOVED > > > > > > > > > > > > > This seems to me just a recognition that sometimes systems fail. Not > sure > > > > specifically what to address. > > > > > > Okay. > > > > > > > > - what can happen when non-idempotent requests are in a COMPOUND > before > > > > > a request that gets NFS4ERR_MOVED > > > > > > > > > > > > > Intend to address in Section 15.1.2.4: > > > > > > > > The file system that contains the current filehandle object is not > > > > present at the server, or is not accessible using the network > > > > address used. It may have been made accessible on a different set of > > > > network addresses, relocated or migrated to another server, or it > may > > > > have never been present. The client may obtain the new file > system > > > > location by obtaining the "fs_locations" or "fs_locations_info" > > > > attribute for the current filehandle. For further discussion, > refer > > > > to Section 11.3. > > > > > > > > As with the case of NFS4ERR_DELAY, it is possible that one or more > > > > non-idempotent operations may have been successfully executed > within a > > > > COMPOUND before NFS4ERR_MOVED is returned. Because of this, once > the > > > > new location is determined, the original request which received > the > > > > NFS4ERR_MOVED should not be re-executed in full. Instead, a new > > > > COMPOUND, with any successfully executed non-idempotent operations > > > > removed, should be executed. This new request should have a > different > > > > slot id or sequence in those cases in which the same session is > used > > > > for the new request (i.e. transparent session migration or an > > > > > > nit: comma after "i.e.". > > > > > > > Fixed. > > > > > > > > > > > endpoint transition to a new address session-trunkable with the > > > > original one). 
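The rebuilding rule in the proposed 15.1.2.4 text can be sketched as follows. This is a minimal illustration under assumed data shapes (operations as name/idempotent pairs, per-operation status strings); none of these names come from the protocol XDR, and the slot/sequence handling mentioned in the text is noted only in a comment.

```python
# Sketch: after NFS4ERR_MOVED, rebuild the COMPOUND for the new
# location, dropping only those non-idempotent operations that already
# executed successfully; idempotent ones (e.g. PUTFH) are kept since
# they are needed to re-establish context and are safe to repeat.
# If the same session is reused, the retry must also use a different
# slot id or sequence (not modeled here).

def rebuild_compound(ops, results):
    """ops: list of (name, idempotent); results: per-op status for the
    partially executed request (ops beyond len(results) never ran)."""
    retry = []
    for i, (name, idempotent) in enumerate(ops):
        done_ok = i < len(results) and results[i] == "OK"
        if done_ok and not idempotent:
            continue  # already took effect; must not be repeated
        retry.append((name, idempotent))
    return retry
```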
> > > > > > > > > > > > > - how bad it is if the client messes up at Transparent State > Migration > > > > > discovery, most notably in the case when some lock state is lost > > > > > > > > > > > > > Propose to address this by adding the following paragraph to the end > of > > > > Section 11.13.2: > > > > > > > > Lease discovery needs to be provided as described above, in order > to > > > > ensure that migrations are discovered soon enough to ensure that > > > > leases moved to new servers are discovered in time to make sure > that > > > > leases are renewed early enough to avoid lease expiration, > leading to > > > > loss of locking state. While the consequences of such loss can be > > > > > > nit: the double "are discovered {soon enough,in time}" is a little > awkward > > > of a construction; how about "Lease discovery needs to be provided as > > > described above, in order to ensure that migrations are discovered soon > > > enough that leases moved to new servers can successfully be renewed > before > > > they expire, avoiding loss of locking state"? > > > > > > > Went with the following: > > > > Lease discovery needs to be provided as described above, in > order to > > ensure that migrations are discovered soon enough to enable > > leases moved to new servers to be appropriately renewed in > order to > > avoid lease expiration, leading to loss of locking state. > > I think maybe we should leave any further tweaks to the RFC Editor; my only > concern is whether it can be misread as saying that the renewal leads to > loss of locking state (which is, admittedly, a nonsensical interpretation). > > > > > > > > ameliorated through implementations of courtesy locks, servers are > > > > under no obligation to do so, and a conflicting lock request may > means > > > > > > nit: s/means/mean/ > > > > > > > Fixed. > > > > > > > > > > > that a lock is revoked unexpectedly. Clients should be aware of > this > > > > possibility. 
> > > > > > > > > > > > > > > > > - the interactions between cached replies and migration(-like) > events, > > > > > though a lot of this is discussed in section 11.13.X and 15.1.1.3 > > > > > already > > > > > > > > > > > > > Will address any specifics that you feel aren't adequately addressed. > > > > > > I don't remember any particular specifics, so we should probably just > let > > > this go for now. > > > > > > > > > > > > > > > > > but I defer to the WG as to what to cover now vs. later. > > > > > > > > > > In light of the ongoing work on draft-ietf-nfsv4-rpc-tls, it might > be > > > > > reasonable to just talk about "integrity protection" as an abstract > > > > > thing without the specific focus on RPCSEC_GSS's integrity > protection > > > > > (or authentication) > > > > > > > > > > > > > > I was initially leery of this, but when I looked at the text, I was > able > > > to > > > > avoid referring to RPCSEC_GSS in most cases in which integrity was > > > > mentioned :-). The same does not seem possible for authentication :-( > > > > > > We'll take the easy wins and try to not fret too much about the other > > > stuff. > > > > > > > > RPCSEC_GSS does not > > > > > % protect the binding from one server to another as part of a > > > referral > > > > > % or migration event. The source server must be trusted to > provide > > > > > % the correct information, based on whatever factors are > available to > > > > > % the client. > > > > > > > > > > > > > These are both situations for which RPCSEC_GSS has no solution, but > > > neither > > > > is there another one. It is probably best to just say that without > > > > reference > > > > to integrity protection. > > > > > > True. > > > > > > > I have added new paragraphs after these bullets that may address > some of > > > > the > > > > issues you were concerned about. 
> > > > > > > > Even if such requests are not interfered with in flight, it is > > > > possible for a compromised server to direct the client to use > > > > inappropriate servers, such as those under the control of the > > > > attacker. It is not clear that being directed to such servers > > > > represents a greater threat to the client than the damage that > could > > > > be done by the compromised server itself. However, it is > possible > > > > that some sorts of transient server compromises might be taken > > > > advantage of to direct a client to a server capable of doing > greater > > > > damage over a longer time. One useful step to guard against this > > > > possibility is to issue requests to fetch location data using > > > > RPCSEC_GSS, even if no mapping to an RPCSEC_GSS principal is > > > > available. In this case, RPCSEC_GSS would not be used, as it > > > > typically is, to identify the client principal to the server, but > > > > rather to make sure (via RPCSEC_GSS mutual authentication) that > the > > > > server being contacted is the one intended. > > > > > > > > Similar considerations apply if the threat to be avoided is the > > > > direction of client traffic to inappropriate (i.e. poorly > performing) > > > > servers. In both cases, there is no reason for the information > > > > returned to depend on the identity of the client principal > requesting > > > > it, while the validity of the server information, which has the > > > > capability to affect all client principals, is of considerable > > > > importance. > > > > > > These do address some of the issues I mentioned; thank you. I do have > a > > > couple further comments: > > > > > > - I'm not sure what "no mapping to an RPCSEC_GSS principal is > available" > > > means (but maybe that's just because I've not read the RPCSEC_GSS > RFCs > > > recently enough) > > > > > > > This is partly because " even if no mapping to an RPCSEC_GSS principal is > > available" > > is misleading. 
It would have been better to say "even if no mapping > > to an > > RPCSEC_GSS principal is available for the user currently obtaining the > > information". > > The issue is not within RPCSEC_GSS itself but relates to how it is used > by > > NFSv4. > > Servers are required to support RPCSEC_GSS but to use it, you need to > > translate > > a uid to an RPCSEC_GSS principal. Where that mapping is not available, > as > > it often > > is not, you cannot use RPCSEC_GSS for that user. > > Ah, that was enough to jostle the neurons into place (and no change to the > text needed). > > Thanks again, > > Ben > > > > available > > > > > > - w.r.t. "there is no reason for the information returned to depend on > the > > > identity of the client principal", I could perhaps imagine some setup > > > that uses information from a corporate contacts database to > determine the > > > current office/location of a given user and provide a referral to a > > > spatially-local replica. So "no reason" may be too absolute (but I > don't > > > have a proposed alternative and don't object to using this text). > > > > > > > > > [...] > > > > > Section B.4 > > > > > > > > > > o The discussion of trunking which appeared in Section 2.10.5 > of > > > > > RFC5661 [62] needed to be revised, to more clearly explain > the > > > > > multiple types of trunking supporting and how the client can > be > > > > > made aware of the existing trunking configuration. In > addition > > > > > the last paragraph (exclusive of sub-sections) of that > section, > > > > > dealing with server_owner changes, is literally true, it has > been > > > > > a source of confusion. [...] > > > > > > > > > > nit: the grammar here is weird; I think there's a missing "while" > or > > > > > similar. 
> > > > > > > > > > > > > > > > > > > Anticipate using the following replacement text: > > > > > > > > o The discussion of trunking which appeared in Section 2.10.5 of > > > > RFC5661 [62] needed to be revised, to more clearly explain the > > > > multiple types of trunking supporting and how the client can be > > > > > > nit: just "trunking support" (not "-ing")? > > > > > > > Fixed. > > > > > > > > > > > made aware of the existing trunking configuration. In > addition, > > > > while the last paragraph (exclusive of sub-sections) of that > > > > section, dealing with server_owner changes, is literally true, > it > > > > has been a source of confusion. Since the existing paragraph > can > > > > be read as suggesting that such changes be dealt with non- > > > > disruptively, the issue needs to be clarified in the revised > > > > section, which appears in Section 2.10.5. > > > > > > Thanks again for going through my giant pile of comments; I hope that > you > > > think the improvements to the document are worth the time spent. > > > > > > > I do. > > > > > > > -Ben > > > >