Re: [nfsv4] Benjamin Kaduk's Discuss on draft-ietf-nfsv4-rfc5661sesqui-msns-03: (with DISCUSS and COMMENT)
Magnus Westerlund <magnus.westerlund@ericsson.com> Tue, 25 February 2020 09:40 UTC
From: Magnus Westerlund <magnus.westerlund@ericsson.com>
To: "davenoveck@gmail.com" <davenoveck@gmail.com>, "kaduk@mit.edu" <kaduk@mit.edu>
CC: "draft-ietf-nfsv4-rfc5661sesqui-msns@ietf.org" <draft-ietf-nfsv4-rfc5661sesqui-msns@ietf.org>, "iesg@ietf.org" <iesg@ietf.org>, "nfsv4-chairs@ietf.org" <nfsv4-chairs@ietf.org>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Date: Tue, 25 Feb 2020 09:40:05 +0000
Message-ID: <2fab76c0c810795835862d5197d503066f51b40e.camel@ericsson.com>
References: <157665795217.30033.16985899397047966102.idtracker@ietfa.amsl.com> <CADaq8jegizL79V4yJf8=itMVUYDuHf=-pZgZEh-yqdT30ZdJ5w@mail.gmail.com> <CADaq8jcURAKZsNvs17MhNFT7eBNtkvOdrur5hHY2J1gXH7QdsA@mail.gmail.com> <20200113225411.GI66991@kduck.mit.edu> <CADaq8jcUWHo9KANDavHER0CA0AMW4t88t+Hg8PykV4S=hXF_HA@mail.gmail.com> <20200122031650.GE80030@kduck.mit.edu>
In-Reply-To: <20200122031650.GE80030@kduck.mit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/3y6KaB1mh5IxqKUQmdIQcprNUwU>

Hi Ben,

-04 has been available for a while now. Can you please check if it
addresses your issues or not?

Thanks

Magnus

On Tue, 2020-01-21 at 19:16 -0800, Benjamin Kaduk wrote:
> Not attempting to trim anything, so sparsely inline...
>
> On Sat, Jan 18, 2020 at 08:30:18AM -0500, David Noveck wrote:
> > On Mon, Jan 13, 2020 at 5:54 PM Benjamin Kaduk <kaduk@mit.edu> wrote:
> >
> > > Hi David,
> > >
> > > Trimming lots of good stuff here as well...
> > >
> > > On Thu, Jan 02, 2020 at 10:09:02AM -0500, David Noveck wrote:
> > > > On Wed, Dec 18, 2019 at 3:32 AM Benjamin Kaduk via Datatracker <noreply@ietf.org> wrote:
> > > >
> > > > > Benjamin Kaduk has entered the following ballot position for
> > > > > draft-ietf-nfsv4-rfc5661sesqui-msns-03: Discuss
> > > > >
> > > > > ----------------------------------------------------------------------
> > > > > DISCUSS:
> > > > > ----------------------------------------------------------------------
> > > > >
> > > > > Responded to these on 12/20.
> > > > >
> > > > > ----------------------------------------------------------------------
> > > > > COMMENT:
> > > > > ----------------------------------------------------------------------
> > > > >
> > > > > I think I may have mistakenly commented on some sections that are
> > > > > actually just moved text, since my lookahead window in the diff was
> > > > > too small.
> > > >
> > > > No harm, no foul.
> > > >
> > > > > Since the "Updates:" header is part of the immutable RFC text (though
> > > > > "Updated by:" is mutable), we should probably explicitly state that
> > > > > "the updates that RFCs 8178 and 8434 made to RFC 5661 apply equally
> > > > > to this document".
> > > >
> > > > I think we could update the last paragraph of Section 1.1 to be more
> > > > explicit about this.
> > > > Perhaps it could read:
> > > >
> > > >    Until the above work is done, there will not be a consistent set of
> > > >    documents providing a description of the NFSv4.1 protocol and any
> > > >    full description would involve documents updating other documents
> > > >    within the specification.  The updates applied by RFC 8434 [66] and
> > > >    RFC 8178 [63] to RFC5661 also apply to this specification, and
> > > >    will apply to any subsequent v4.1 specification until that work is
> > > >    done.
> > >
> > > Sounds good.
> > >
> > > > > I note inline (in what is probably too many places; please don't reply
> > > > > at all of them!) some question about how clear the text is that a file
> > > > > system migration is something done at a per-file-system granularity,
> > > > > and that migrating a client at a time is not possible.
> > > >
> > > > It might be possible but doing so is not a goal of this specification.
> > > >
> > > > I'm not sure how to address your concern.  I don't know why anyone
> > > > would assume that migrating entire clients is a goal of this
> > > > specification.  As far as I can see, when the word "migration" is used
> > > > it is always in connection with migrating a file system.  Is there some
> > > > specific place where you think this issue is likely to arise?
> > >
> > > I think I garbled my point; my apologies.
> > > To give a semi-concrete example, suppose I have clients A and B that are
> > > accessing filesystem F on server X, and filesystem F is also available on
> > > server Y.  If X decides that it needs to migrate access to F away from X
> > > (e.g., for maintenance), then the "file system migration event" involves
> > > telling both A and B to look to Y for access to F, at basically the same
> > > time.
> >
> > This clarifies things for me.
> > When you were speaking of "migrating a client" I assumed you were
> > worried about consistency of fs's F, G, H for a particular client.
> > Now it appears the issue is consistency among clients A, B, C, all
> > accessing a common F.
> >
> > > If X tries to tell only A but not B to access F via Y but lets B
> > > continue to access F at X, then I think there can be some subtle
> > > consistency issues.
> >
> > Or worse, some decidedly unsubtle ones :-(
> >
> > > In some sense, this is easy to consider as a dichotomy between "migration
> > > is for server maintenance" vs. "migration is for load balancing".
> >
> > That categorization helps.
> >
> > > Assuming
> > > I understand correctly (not a trivial assumption!), there was never any
> > > intent to use these mechanisms for load balancing,
> >
> > Well "Never" covers a lot.  There are cases in which you do want to do
> > load balancing.  For example, if you are dealing with multiple network
> > access paths to the same replica, there is no issue with the load balancing
> > approach.  In the case of multiple replicas where data consistency applies
> > between them, then you might load balance but it is the server's
> > responsibility to provide the consistency, meaning that he needs to be
> > warned of the possibility of issues that might arise if clients modifying
> > the same data are placed on different replicas.  In the case in which you
> > don't guarantee data consistency among replicas, you might as well say
> > about doing load balancing that "there be dragons".
> >
> > > and if we can explicitly
> > > disclaim such usage, then we don't have to try to reason through any
> > > potential subtle consistency issues.
> >
> > I think we can disclaim the really problematic part.  I think the new text
> > will be needed in the migration section.  Issues with replication are
> > different and do not involve any server choice.
> >
> > I anticipate revising section 11.5.5 to read as follows:
> >
> >    When a file system is present and becomes inaccessible using the
> >    current access path, the NFSv4.1 protocol provides a means by which
> >    clients can be given the opportunity to have continued access to
> >    their data.  This may involve use of a different access path to the
> >    existing replica or by providing a path to a different replica.  The
> >    new access path or the location of the new replica is specified by a
> >    file system location attribute.  The ensuing migration of access
> >    includes the ability to retain locks across the transition.
> >    Depending on circumstances, this can involve:
> >
> >    o  The continued use of the existing clientid when accessing the
> >       current replica using a new access path.
> >
> >    o  Use of lock reclaim, taking advantage of a per-fs grace period.
> >
> >    o  Use of Transparent State Migration.
> >
> >    Typically, a client will be accessing the file system in question,
> >    get an NFS4ERR_MOVED error, and then use a file system location
> >    attribute to determine the new access path for the data.  When
> >    fs_locations_info is used, additional information will be available
> >    that will define the nature of the client's handling of the
> >    transition to a new server.
> >
> >    In most instances clients will choose to migrate all clients using a
>
> (I assume s/clients/servers/ (just the first time))
>
> >    particular file system to a successor replica at the same time to
> >    avoid cases in which different clients are updating different
> >    replicas.  However migration of individual clients can be helpful in
> >    providing load balancing, as long as the replicas in question are
> >    such that they represent the same data as described in
> >    Section 11.11.8.
> >
> >    o  In the case in which there is no transition between replicas
> >       (i.e., only a change in access path), there are no special
> >       difficulties in use of this mechanism to effect load balancing.
> >
> >    o  In the case in which the two replicas are sufficiently
> >       co-ordinated as to allow coherent simultaneous access to both by a
> >       single client, there is, in general, no obstacle to use of
> >       migration of particular clients to effect load balancing.
> >       Generally, such simultaneous use involves co-operation between
> >       servers to ensure that locks granted on two co-ordinated replicas
> >       cannot conflict and can remain effective when transferred to a
> >       common replica.
> >
> >    o  In the case in which a large set of clients are accessing a file
> >       system in a read-only fashion, it can be helpful to migrate all
> >       clients with writable access simultaneously, while using load
> >       balancing on the set of read-only copies, as long as the rules
> >       appearing in Section 11.11.8, designed to prevent data reversion,
> >       are adhered to.
> >
> >    In other cases, the client might not have sufficient guarantees of
> >    data similarity/coherence to function properly (e.g. the data in the
> >    two replicas is similar but not identical), and the possibility that
> >    different clients are updating different replicas can exacerbate the
> >    difficulties, making use of load balancing in such situations a
> >    perilous enterprise.
> >
> >    The protocol does not specify how the file system will be moved
> >    between servers or how updates to multiple replicas will be
> >    co-ordinated.  It is anticipated that a number of different
> >    server-to-server co-ordination mechanisms might be used with the
> >    choice left to the server implementer.  The NFSv4.1 protocol
> >    specifies the method used to communicate the migration event between
> >    client and server.
> >
> >    The new location may be, in the case of various forms of server
> >    clustering, another server providing access to the same physical file
> >    system.
> >    The client's responsibilities in dealing with this
> >    transition will depend on whether a switch between replicas has
> >    occurred and the means the server has chosen to provide continuity of
> >    locking state.  These issues will be discussed in detail below.
> >
> >    Although a single successor location is typical, multiple locations
> >    may be provided.  When multiple locations are provided, the client
> >    will typically use the first one provided.  If that is inaccessible
> >    for some reason, later ones can be used.  In such cases the client
> >    might consider the transition to the new replica to be a migration
> >    event, even though some of the servers involved might not be aware of
> >    the use of the server which was inaccessible.  In such a case, a
> >    client might lose access to locking state as a result of the access
> >    transfer.
> >
> >    When an alternate location is designated as the target for migration,
> >    it must designate the same data (with metadata being the same to the
> >    degree indicated by the fs_locations_info attribute).  Where file
> >    systems are writable, a change made on the original file system must
> >    be visible on all migration targets.  Where a file system is not
> >    writable but represents a read-only copy (possibly periodically
> >    updated) of a writable file system, similar requirements apply to the
> >    propagation of updates.  Any change visible in the original file
> >    system must already be effected on all migration targets, to avoid
> >    any possibility that a client, in effecting a transition to the
> >    migration target, will see any reversion in file system state.
> >
> > > > > As was the case for
> > > > > my Discuss point about addresses/port-numbers, I'm missing the context
> > > > > of the rest of the document, so perhaps this is a non-issue, but the
> > > > > consequences of getting it wrong seem severe enough that I wanted to
> > > > > check.
> > > >
> > > > I'm not seeing any severe consequences.
> > > > Am I missing something?
> >
> > This is clearer now.  I think we can avoid any severe consequences.
> >
> > > > > Section 1.1
> > > > >
> > > > >    The revised description of the NFS version 4 minor version 1
> > > > >    (NFSv4.1) protocol presented in this update is necessary to enable
> > > > >    full use of trunking in connection with multi-server namespace
> > > > >    features and to enable the use of transparent state migration in
> > > > >    connection with NFSv4.1. [...]
> > > > >
> > > > > nit: do we expect all readers to know what is meant by "trunking" with
> > > > > no other lead-in?
> > > >
> > > > Good point.  Perhaps it could be addressed by rewriting the material in
> > > > the first paragraph of Section 1.1 to read as follows:
> > > >
> > > >    Two important features previously defined in minor version 0 but
> > > >    never fully addressed in minor version 1 are trunking, the use of
> > > >    multiple connections between a client and server potentially to
> > > >    different network addresses, and transparent state migration, which
> > > >    allows a file system to be transferred betwwen servers in a way that
> > > >    provides for the client to maintain its existing locking state
> > > >    across the transfer.
> > >
> > > Maybe "the simultaneous use of multiple connections"?
> >
> > Will add.
> >
> > > nit: s/betwwen/between/
> >
> > Fixed.
> >
> > > >    The revised description of the NFS version 4 minor version 1
> > > >    (NFSv4.1) protocol presented in this update is necessary to enable
> > > >    full use of these features with other multi-server namespace features.
> > > >    This document is in the form of an updated description of the NFS 4.1
> > > >    protocol previously defined in RFC5661 [62].  RFC5661 is obsoleted by
> > > >    this document.  However, the update has a limited scope and is
> > > >    focused on enabling full use of trunking and transparent state
> > > >    migration.
> > > >    The need for these changes is discussed in Appendix A.  Appendix B
> > > >    describes the specific changes made to arrive at the current text.
> > >
> > > This looks good, thanks.
> >
> > :-)
> >
> > > > > [...]
> > > > >
> > > > >    o  Work would have to be done to address many erratas relevant to
> > > > >       RFC 5661, other than errata 2006 [60], which is addressed in this
> > > > >       document.  That errata was not deferrable because of the
> > > > >       interaction of the changes suggested in that errata and handling
> > > > >       of state and session migration.  The erratas that have been
> > > > >       deferred include changes originally suggested by a particular
> > > > >       errata, which change consensus decisions made in RFC 5661, which
> > > > >       need to be changed to ensure compatibility with existing
> > > > >       implementations that do not follow the handling delineated in
> > > > >       RFC 5661.  Note that it is expected that such erratas will remain
> > > > >
> > > > > This sentence is pretty long and hard to follow; maybe it could be
> > > > > split after "change consensus decisions made in RFC 5661" and the
> > > > > second half start with a more declarative statement about existing
> > > > > implementations?
> > > > > (E.g., "Existing implementations did not perform handling as
> > > > > delineated in RFC 5661 since the procedures therein were not workable,
> > > > > and in order to have the specification accurately reflect the existing
> > > > > deployment base, changes are needed [...]")
> > > >
> > > > I will clean this bullet up.  See below for a proposed replacement.
> > > >
> > > > >       relevant to implementers and the authors of an eventual
> > > > >       rfc5661bis, despite the fact that this document, when approved,
> > > > >       will obsolete RFC 5661.
> > > > >
> > > > > (I assume the RFC Editor can tweak this line to reflect what actually
> > > > > happens; my understanding is that the errata reports will get cloned
> > > > > to this-RFC.)
> > > >
> > > > I understand that Magnus has already got that issue addressed.  I'll
> > > > discuss the appropriate text with him.
> > > >
> > > > > [rant about "errata" vs. "erratum" elided]
> > > >
> > > > This is annoying but there is no way we are going to get people to use
> > > > "erratum".  What I've tried to do in my proposed replacement text
> > > > is to refer to "errata report(s)", which is more accurate and allows
> > > > people who speak English to use English singulars and plurals, without
> > > > having to worry about Latin grammar.
> > >
> > > That's what I try to do as well :)
> > >
> > > > Here's my proposed replacement for the troubled bullet:
> > > >
> > > >    o  Work needs to be done to address many errata reports relevant to
> > > >       RFC 5661, other than errata report 2006 [60], which is addressed
> > > >       in this document.  Addressing of that report was not deferrable
> > > >       because of the interaction of the changes suggested there and the
> > > >       newly described handling of state and session migration.
> > > >
> > > >       The errata reports that have been deferred and that will need to
> > > >       be addressed in a later document include reports currently
> > > >       assigned a range of statuses in the errata reporting system
> > > >       including reports marked Accepted and those marked Held Over
> > >
> > > nit: it's "Hold For Document Update"
> >
> > Fixed
> >
> > > >       because the change was too minor to address immediately.
> > > >
> > > >       In addition, there is a set of other reports, including at least
> > > >       one in state Rejected, which will need to be addressed in a later
> > > >       document.
> > > >       This will involve making changes to consensus decisions
> > > >       reflected in RFC 5661, in situations in which the working group
> > > >       has already decided that the treatment in RFC 5661 is incorrect,
> > > >       and needs to be revised to reflect the working group's new
> > > >       consensus and ensure compatibility with existing implementations
> > > >       that do not follow the handling described in RFC 5661.
> > > >
> > > >       Note that it is expected that such all errata reports will remain
> > >
> > > nit: s/such all/all such/
> >
> > Fixed.
> >
> > > >       relevant to implementers and the authors of an eventual
> > > >       rfc5661bis, despite the fact that this document, when approved,
> > > >       will obsolete RFC 5661 [62].
> > >
> > > This looks really good!
> > >
> > > > > Section 2.10.4
> > > > >
> > > > >    Servers each specify a server scope value in the form of an opaque
> > > > >    string eir_server_scope returned as part of the results of an
> > > > >    EXCHANGE_ID operation.  The purpose of the server scope is to allow
> > > > >    a group of servers to indicate to clients that a set of servers
> > > > >    sharing the same server scope value has arranged to use compatible
> > > > >    values of otherwise opaque identifiers.  Thus, the identifiers
> > > > >    generated by two servers within that set can be assumed compatible
> > > > >    so that, in some cases, identifiers generated by one server in that
> > > > >    set may be presented to another server of the same scope.
> > > > >
> > > > > Is there more that we can say than "in some cases"?
> > > >
> > > > Not really.  In general, when a server sends you an id, it comes with an
> > > > implied promise to recognize it when you present it subsequently to the
> > > > same server.
> > > >
> > > > The fact that two servers have decided to co-operate in their Id
> > > > assignment does not change that.
> > > >
> > > > > The previous text
> > > > > implies a higher level of reliability than just "some cases", to me.
> > > >
> > > > I think I need to change the text, perhaps by replacing "use compatible
> > > > values of otherwise opaque identifiers" by "use distinct values of
> > > > otherwise opaque identifiers so that the two servers never assign the
> > > > same value to two distinct objects".
> > > >
> > > > I anticipate the following replacement for the first two paragraphs of
> > > > Section 2.10.4:
> > > >
> > > >    Servers each specify a server scope value in the form of an opaque
> > > >    string eir_server_scope returned as part of the results of an
> > > >    EXCHANGE_ID operation.  The purpose of the server scope is to allow a
> > > >    group of servers to indicate to clients that a set of servers sharing
> > > >    the same server scope value has arranged to use distinct values of
> > > >    opaque identifiers so that the two servers never assign the same
> > > >    value to two distinct objects.  Thus, the identifiers generated by two
> > > >    servers within that set can be assumed compatible so that, in certain
> > > >    important cases, identifiers generated by one server in that set may
> > > >    be presented to another server of the same scope.
> > > >
> > > >    The use of such compatible values does not imply that a value
> > > >    generated by one server will always be accepted by another.  In most
> > > >    cases, it will not.  However, a server will not accept a value
> > > >    generated by another inadvertently.  When it does accept it, it will
> > >
> > > nit: I think it flows better to put "inadvertently" as "will not
> > > inadvertently accept".
> >
> > OK.  Fixed.
> >
> > > >    be because it is recognized as valid and carrying the same meaning as
> > > >    on another server of the same scope.
> > > >
> > > > As an illustration of the (limited) value of this information, consider
> > > > the case of client recovery from a server reboot.  The client has to
> > > > reclaim his locks using file handles returned by the previous server
> > > > instance.  If the server scopes are the same (they almost always are),
> > > > the client is not sure he will get his locks back (e.g. the file might
> > > > have been deleted), but he does know that, if the lock reclaim succeeds,
> > > > it is for the same file.  If the server scopes are not the same, he has
> > > > no such assurance.
> > >
> > > Thanks, the new text (and explanation here) is very clear about what's
> > > going on.
> > >
> > > [...]
> > > > > Section 11.5.5
> > > > >
> > > > >    will typically use the first one provided.  If that is inaccessible
> > > > >    for some reason, later ones can be used.  In such cases the client
> > > > >    might consider that the transition to the new replica as a
> > > > >    migration event, even though some of the servers involved might not
> > > > >    be aware of the use of the server which was inaccessible.  In such
> > > > >    a case, a
> > > > >
> > > > > nit: the grammar here got wonky; maybe s/as a/is a/?
> > > >
> > > > How about s/as a/to be a/ ?
> > >
> > > That works if you drop the earlier "that", for "the client might consider
> > > the transition to the new replica to be a migration event".
> >
> > Did that.
> >
> > > [...]
> > > > >
> > > > >    o  The "local" representation of all owners and groups must be the
> > > > >       same on all servers.  The word "local" is used here since that
> > > > >       is the way that numeric user and group ids are described in
> > > > >       Section 5.9.  However, when AUTH_SYS or stringified owners or
> > > > >       group are used, these identifiers are not truly local, since
> > > > >       they are known to the clients as well as the server.
> > > > >
> > > > > I am trying to find a way to note that the AUTH_SYS case mentioned
> > > > > here is precisely because of the requirement being imposed by this
> > > > > bullet point,
> > > >
> > > > Not sure what you mean by that.  I think the requirement is to allow the
> > > > client to be able to use AUTH_SYS, without the contortions that would be
> > > > required if different fs's had the same uid's meaning different things.
> > > >
> > > > > while acknowledging that the "stringified owners or group" case
> > > > > is separate, but not having much luck.
> > > >
> > > > My attempt to revise this area is below:
> > > >
> > > >    Note that there is no requirement in general that the users
> > > >    corresponding to particular security principals have the same local
> > > >    representation on each server, even though it is most often the case
> > > >    that this is so.
> > > >
> > > >    When AUTH_SYS is used, the following additional requirements must be
> > > >    met:
> > > >
> > > >    o  Only a single NFSv4 domain can be supported through use of
> > > >       AUTH_SYS.
> > > >
> > > >    o  The "local" representation of all owners and groups must be the
> > > >       same on all servers.  The word "local" is used here since that is
> > > >       the way that numeric user and group ids are described in
> > > >       Section 5.9.  However, when AUTH_SYS or stringified numeric owners
> > > >       or groups are used, these identifiers are not truly local, since
> > > >       they are known to the clients as well as the server.
> > > >
> > > >    Similarly, when stringified numeric user and group ids are used, the
> > > >    "local" representation of all owners and groups must be the same on
> > > >    all servers, even when AUTH_SYS is not used.
> > >
> > > I really like this rewriting; thank you for undertaking it.
> > > I think that what I was trying to say here is roughly that we need > > > scare-quotes for "local" because of things like AUTH_SYS (or stringified > > > user/group ids) that involve sending local representations over the > > > network. So your rewrite did in fact address my concern, even though I > > > didn't manage to say it very well the first time :) > > > > > > [...] > > > > > > > > > > o When there are no potential replacement addresses in use but > > > > > > there > > > > > > > > > > What is a "replacement address"? > > > > > > > > > > > > > I've explained that in some new text added before these bullets, as a > > > > new > > > > second > > > > paragraph of this section: > > > > > > > > The appropriate action depends on the set of replacement addresses > > > > (i.e. server endpoints which are server-trunkable with one previously > > > > being used) which are available for use. > > > > > > > > > > > > > > are valid addresses session-trunkable with the one whose use is > > > > > > to > > > > > be discontinued, the client can use BIND_CONN_TO_SESSION to > > > > > > access > > > > > the existing session using the new address. Although the target > > > > > session will generally be accessible, there may be cases in > > > > > which > > > > > that session is no longer accessible. In this case, the client > > > > > can create a new session to enable continued access to the > > > > > existing instance and provide for use of existing filehandles, > > > > > stateids, and client ids while providing continuity of locking > > > > > state. > > > > > > > > > > I'm not sure I understand this last sentence. On its own, the "new > > > > > session to enable continued access to the existing instance" sounds > > > > > > like > > > > > the continued access would be on the address whose use is to cease, > > > > > and > > > > > thus the new session would be there. > > > > > > > > > > > > That is not the intention. Will need to clarify. 
> > > > > > > > > > > > > But why make a new session when > > > > > the old one is still good, > > > > > > > > > > > > It isn't usable on the new connection. > > > > > > > > > > > > > especially when we just said in the previous > > > > > sentence that the old session can't be moved to the new > > > > > connection/address? > > > > > > > > > > > > > Because we can't use it on the new connection, we have to create a > > > > new session to access the client. > > > > > > > > Perhaps a forward reference down to Section 11.12.{4,5} for this and the > > > > > next bullet point would help as well as rewording? > > > > > > > > > > > > > It turns out these would add confusion since they deal with migration > > > > situations > > > > and deciding whether transparent state migration has occurred in the > > > > > > switch > > > > between > > > > replicas. In the cases we are dealing with, there is only a single > > > > replica/fs and no > > > > migration. > > > > > > > > Here is my proposed replacement text for the two bullets in question: > > > > > > > > o When there are no potential replacement addresses in use but there > > > > are valid addresses session-trunkable with the one whose use is to > > > > be discontinued, the client can use BIND_CONN_TO_SESSION to access > > > > the existing session using the new address. Although the target > > > > session will generally be accessible, there may be rare situations > > > > in which that session is no longer accessible, when an attempt is > > > > made tto bind the new conntectin to it. In this case, the client > > > > > > nits: s/tto/to/, s/conntectin/connection/ > > > > > > > Fixed. > > > > > > > > > can create a new session to enable continued access to the > > > > existing instance and provide for use of existing filehandles, > > > > stateids, and client ids while providing continuity of locking > > > > state.
> > > > > > Just to check: this sounds like even in the case where the client creates > > > a new session, the filehandle, stateid, clientid, and locking state > > > (values) are in effect "transparently preserved" by the server, so the > > > client has no need to do any reclamation of locking state. I think that's > > > what's intended, but holler if I'm wrong about that. > > > > > > > Ok. > > I'll holler that *you're right about that.* > > > > > > > > > o When there is no potential replacement address in use and there > > > > are no valid addresses session-trunkable with the one whose use is > > > > to be discontinued, other server-trunkable addresses may be used > > > > to provide continued access. Although use of CREATE_SESSION is > > > > available to provide continued access to the existing instance, > > > > servers have the option of providing continued access to the > > > > existing session through the new network access path in a fashion > > > > similar to that provided by session migration (see Section 11.12). > > > > To take advantage of this possibility, clients can perform an > > > > initial BIND_CONN_TO_SESSION, as in the previous case, and use > > > > CREATE_SESSION only if that fails. > > > > > > > > > > > > > Section 11.10.6 > > > > > > > > > > In a file system transition, the two file systems might be > > > > > clustered > > > > > in the handling of unstably written data. When this is the case, > > > > > > and > > > > > > > > > > What does "clustered in the handling of unstably written data" mean? > > > > > > > > > > the two file systems belong to the same write-verifier class, write > > > > > > > > > > How is the client supposed to determine "when this is the case"? > > > > > > > > > > > > > Here's a proposed replacement for this paragraph: > > > > > > > > In a file system transition, the two file systems might be > > > > cooperating in the handling of unstably written data.
Clients can > > > > determine if this is the case, by seeing if the two file systems > > > > belong to the same write-verifier class. When this is the case, > > > > write verifiers returned from one system may be compared to those > > > > returned by the other and superfluous writes avoided. > > > > > > > > > > > > > Section 11.10.7 > > > > > > > > > > In a file system transition, the two file systems might be > > > > > > consistent > > > > > in their handling of READDIR cookies and verifiers. When this is > > > > > > the > > > > > case, and the two file systems belong to the same readdir class, > > > > > > > > > > As above, how is the client supposed to determine "when this is the > > > > > case"? > > > > > > > > > > > > > READDIR cookies and verifiers from one system may be recognized by > > > > > the other and READDIR operations started on one server may be > > > > > > validly > > > > > continued on the other, simply by presenting the cookie and > > > > > verifier > > > > > returned by a READDIR operation done on the first file system to > > > > > the > > > > > second. > > > > > > > > > > Are these "may be"s supposed to admit the possibility that the > > > > > destination server can just decide to not honor them arbitrarily? > > > > > > > > > > > > > No. They are intended to indicate that the client might or might not use > > > > the capability. > > > > > > > > Here is proposed replacement text for the paragraph: > > > > > > > > In a file system transition, the two file systems might be consistent > > > > in their handling of READDIR cookies and verifiers. Clients can > > > > determine if this is the case, by seeing if the two file systems > > > > belong to the same readdit class. When this is the case, readdir > > > > > > nit: s/readdit/readdir/ > > > > > > > Fixed.
> > > > > > > > > > > class, READDIR cookies and verifiers from one system will be > > > > recognized by the other and READDIR operations started on one server > > > > can be validly continued on the other, simply by presenting the > > > > cookie and verifier returned > > > > > > Ah, this formulation (for both write-verifier and readdir) is very > > > helpful. > > > > > > [...] > > > > > Section 11.16.1 > > > > > > > > > > With the exception of the transport-flag field (at offset > > > > > FSLI4BX_TFLAGS with the fls_info array), all of this data applies > > > > > to > > > > > the replica specified by the entry, rather that the specific > > > > > network > > > > > path used to access it. > > > > > > > > > > Is it clear that this applies only to the fields defined by this > > > > > specification (since, as mentioned later, future extensions must > > > > > > specify > > > > > whether they apply to the replica or the entry)? > > > > > > > > > > > > > Intend to use the following replacement text: > > > > > > > > With the exception of the transport-flag field (at offset > > > > FSLI4BX_TFLAGS with the fls_info array), all of this data defuined in > > > > > > nit: s/defuined/defined/ > > > > > > > Fixed. > > > > > > > > this specification applies to the replica specified by the entry, > > > > rather that the specific network path used to access it. The > > > > classification of data in extensions to this data is discussed > > > > below > > > > > > [...] > > > > > Section 18.35.3 > > > > > > > > > > I a little bit wonder if we want to reaffirm that co_verifier remains > > > > > fixed when the client is establishing multiple connections for > > > > > trunking > > > > > usage -- the "incarnation of the client" language here could make a > > > > > reader wonder, though I think the discussion of its use elsewhere as > > > > > relating to "client restart" is sufficiently clear. 
> > > > > > > > > > > > > This should be made clearer but the clarification needs to be done in > > > > > > multiple > > > > places. > > > > > > > > Possible replacement text for eighth non-code paragraph of section 2.4: > > > > > > > > The first field, co_verifier, is a client incarnation verifier, > > > > allowing the server to distinguish successive incarnations (e.g. > > > > reboots) of the same client. The server will start the process of > > > > canceling the client's leased state if co_verifier is different than > > > > what the server has previously recorded for the identified client (as > > > > specified in the co_ownerid field). > > > > > > > > Likely replacement text for the seventh paragraph of this section: > > > > > > > > The eia_clientowner field is composed of a co_verifier field and a > > > > co_ownerid string. As noted in Section 2.4, the co_ownerid describes > > > > the client, and the co_verifier is the incarnation of the client. An > > > > EXCHANGE_ID sent with a new incarnation of the client will lead to > > > > the server removing lock state of the old incarnation. Whereas an > > > > EXCHANGE_ID sent with the current incarnation and co_ownerid will > > > > result in an error, an update of the client ID's properties, > > > > depending on the arguments to EXCHANGE_ID, or the return of > > > > information about the existing client_id as might happen when this > > > > operation is done to the same server using different network > > > > addresses as part of creating trunked connections. > > > > Not sure what error that error text was referring to above. Think it > > added to the > > confusion. > > > > > > > > I think I get the general sense of what is going on here (i.e., the last > > > sentence) but am still uncertain on the specifics. Namely, "most of the > > > time" (TM), sending EXCHANGE_ID with current incarnation/ownerid will be > > > an > > > error, since it's a client bug to try to register the same way twice in a > > > row.
> > > > > > No it isn't. This is case 2 on page 508, " Non-Update on Existing Client > > ID". > > Given retries and possible communication difficulties, it is just too hard > > to > > make this case an error. > > > > However, some times we might have to do that in order to update > > > properties of the client or get some new information that a server has > > > associated to a given client ID. I *think* (but am not sure) that the > > > error case is exactly when the (same-incarnation/ownerid) EXCHANGE_ID is > > > done to the same *server and address* as the original EXCHANGE_ID, and > > > that > > > the "update properties or get new information back" case is exactly when > > > the EXCHANGE_ID is done to a different server/address combination. > > > > > > If I'm right about that, then I'd suggest: > > > > > > % the server removing lock state of the old incarnation. Whereas an > > > % EXCHANGE_ID sent with the current incarnation and co_ownerid will > > > % result in an error when sent to a given server at a given address for > > > % a second time, it is not an error to send EXCHANGE_ID with current > > > % incarnation and co_ownerid to a different server (e.g., as part of a > > > % migration event). In such cases, the EXCHANGE_ID can allow for an > > > % update of the client ID's properties, depending on the arguments to > > > % EXCHANGE_ID, or the return of (potentially updated) information about > > > % the existing client_id, as might happen when this opration is done to > > > % the same server using different network addresses as part of creating > > > % trunked connections. > > > > > > > I think I have to revise the paragraph above to be clearer. I anticipate > > replacing the seventh paragraph of section 18.35.3 by the following > > replacement: > > > > The eia_clientowner field is composed of a co_verifier field and a > > co_ownerid string. 
As noted in Section 2.4, the co_ownerid > > identifies the client, and the co_verifier specifies a particular > > incarnation of that client. An EXCHANGE_ID sent with a new > > incarnation of the client will lead to the server removing lock state > > of the old incarnation. On the other hand, an EXCHANGE_ID sent with > > the current incarnation and co_ownerid will, when it does not result > > in an unrelated error, potentially update an existing client ID's > > properties, or simply return information about the existing > > client_id. The latter would happen when this operation is done to > > the same server using different network addresses as part of creating > > trunked connections. > > Ah, I think I get it now. Thanks. > > > > > > > > Section 21 > > > > > > > > > > Some other topics at least somewhat related to trunking and migration > > > > > that we could potentially justify including in the current, > > > > > limited-scope, update (as opposed to deferring for a full -bis) > > > > > > include: > > > > > > > > > > > > > Some of these are related to multi-server namespace but not related to > > > > security, as far as I can see. > > > > > > It does look like it; in some sense I was going through a brainstorming > > > exercise to make this list, and appreciate the sanity checks. (To be > > > clear, I am not insisting that any of them get covered in specifically the > > > sesqui update, just mentioning topics for potential consideration.)
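For readers following the EXCHANGE_ID discussion above, the three outcomes described in the proposed Section 18.35.3 paragraph (new client, new incarnation, same incarnation) can be sketched roughly as below. This is only an illustrative model, not text from the draft and not any server's implementation: the names ClientRecord, Server, exchange_id and the boolean "update" argument are hypothetical, and the sketch deliberately ignores confirmed versus unconfirmed records, principal checks, and the actual EXCHANGE_ID flag bits.

```python
from dataclasses import dataclass, field


@dataclass
class ClientRecord:
    """Hypothetical per-client record kept by the server."""
    co_verifier: bytes
    client_id: int
    locks: list = field(default_factory=list)


class Server:
    """Toy model of EXCHANGE_ID handling keyed by co_ownerid (an assumption)."""

    def __init__(self):
        self.records = {}   # co_ownerid -> ClientRecord
        self.next_id = 1

    def _new_record(self, co_ownerid, co_verifier):
        rec = ClientRecord(co_verifier, self.next_id)
        self.next_id += 1
        self.records[co_ownerid] = rec
        return rec

    def exchange_id(self, co_ownerid, co_verifier, update=False):
        rec = self.records.get(co_ownerid)
        if rec is None:
            # First time this co_ownerid is seen.
            return ("new_client", self._new_record(co_ownerid, co_verifier).client_id)
        if rec.co_verifier != co_verifier:
            # New incarnation of a known client: the old record, and with it
            # the old incarnation's lock state, is replaced.
            return ("new_incarnation", self._new_record(co_ownerid, co_verifier).client_id)
        if update:
            # Same incarnation, caller asked to update client ID properties.
            return ("updated", rec.client_id)
        # Same incarnation, no update: return information about the existing
        # client ID, e.g. when probing another address for trunking.
        return ("existing", rec.client_id)
```

In this model a repeated EXCHANGE_ID with an unchanged co_verifier leaves the recorded locks alone, while a changed co_verifier discards them, which is the distinction the revised paragraph is trying to draw.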
> > > > > > > > > > > > > > - clients that lie about reclaimed locks during a post-migration grace > > > > > period > > > > > > > > > > > > > Will address in a number of places: > > > > > > > > First of all, I inted to add a new paragraph to Section 21, to be placed > > > > > > as > > > > the > > > > sixth non-bulleted paragraph and to read as follows: > > > > > > > > Security consideration for lock reclaim differ between the state > > > > reclaim done after server failure (discussed in Section 8.4.2.1.1 and > > > > the per-fs state reclaim done in support of migration/replication > > > > (discussed in Section 11.11.9.1). > > > > > > > > Next is a new proposed new section to appear as Section 11.11.9.1: > > > > > > > > 11.11.9.1. Security Consideration Related to Reclaiming Lock State > > > > after File System Transitions > > > > > > > > Although it is possible for a client reclaiming state to misrepresent > > > > its state, in the same fashion as described in Section 8.4.2.1.1, > > > > most implementations providing for such reclamation in the case of > > > > file system transitions will have the ability to detect such > > > > misreprsentations. this limits the ability of unauthenicatd clients > > > > > > typos: "misrepresentations", "This", "unauthenticated" > > > > > > > Fixed. > > > > > > > > > > > to execute denial-of-service attacks in these cirsumstances. > > > > > > "circumstances" > > > > > > > > > > Fixed. > > > > > > > > Nevertheless, the rules stated in Section 8.4.2.1.1, regarding > > > > prinipal verification for reclaim requests, apply in this situation > > > > as well. > > > > > > > > Typically,implementations support file system transitions will have > > > > > > nits: space after comma, and "that" for "that support" > > > > > > Fixed. > > > > > > > > extensive information about the locks to be transferred. 
This is > > > > because: > > > > > > > > o Since failure is not involved, there is no need store to locking > > > > information in persistent storage. > > > > > > > > o There is no need, as there is in the failure case, to update > > > > multiple repositories containg locking state to keep them in sync. > > > > Instead, there is a one-time communication of locking state from > > > > the source to the destination server. > > > > > > > > o Providing this information avoids potential interference with > > > > existing clients using the destination file system, by denying > > > > them the ability to obtain new locks during the grace period. > > > > > > > > When such detailed locking infornation, not necessarily including the > > > > associated stateid,s is available, > > > > > > nits: "information", s/stateid,s/stateids,/ > > > > > > > Fixed. > > > > > > > > > > > > o It is possible to detect reclaim requests that attempt to reclsim > > > > > > nit: s/reclsim/reclaim/ > > > > > > > Fixed. > > > > > > > > > locks that did not exist before the transfer, rejecting them with > > > > NFS4ERR_RECLAIM_BAD (Section 15.1.9.4). > > > > > > > > o It is possible when dealing with non-reclaim requests, to > > > > determine whether they conflict with existing locks, eliminating > > > > the need to return NFS4ERR_GRACE ((Section 15.1.9.2) on non- > > > > reclaim requests. > > > > > > > > It is possible for implementations of grace periods in connection > > > > with file system transitions not to have detailed locking information > > > > available at the destination server, in which case the security > > > > situation is exactly as described in Section 8.4.2.1.1. > > > > > > > > I think I should also draw your attention to a revised Section 15.1.9. 
> > > > These > > > > includes some revisions originally done for > > > > draft-ietf-nfsv4-rfc5661-msns-update, > > > > which somehow got dropped as a few that turned up as necessary in > > > > writing > > > > 11.11.9.1: > > > > > > > > 15.1.9. Reclaim Errors > > > > > > > > These errors relate to the process of reclaiming locks after a server > > > > restart. > > > > > > > > 15.1.9.1. NFS4ERR_COMPLETE_ALREADY (Error Code 10054) > > > > > > > > The client previously sent a successful RECLAIM_COMPLETE operation > > > > specifying the same scope, whether that scope is global or for the > > > > same file system in the case of a per-fs RECLAIM_COMPLETE. An > > > > additional RECLAIM_COMPLETE operation is not necessary and results in > > > > this error. > > > > > > > > 15.1.9.2. NFS4ERR_GRACE (Error Code 10013) > > > > > > > > This error is returned when the server was in its recovery or grace > > > > period. with regard to the file system object for which the lock was > > > > > > (no full stop) > > > > > > > Fixed. > > > > > > > > requested resulting in a situation in which a non-reclaim locking > > > > request could not be granted. This can occur because either > > > > > > > > o The server does not have sufficiuent information about locks that > > > > might be poentially reclaimed to determine whether the lock could > > > > validly be granted. > > > > > > > > o The request is made by a client responsible for reclaiming its > > > > locks that has not yet done the appropriate RECLAIM_COMPLETE > > > > operation, allowing it to proceed to obtain new locks. > > > > > > > > It should be noted that, in the case of a per-fs grace period, there > > > > may be clients, i.e. those currently using the destination file > > > > system who might be unaware of the circumstances resulting in the > > > > > > nit: comma after "file system" > > > > > > > This phrase is now within parentheses. > > > > > > > > > intiation of the grace period. 
Such clients need to periodically > > > > retry the request until the grace period is over, just as other > > > > clients do. > > > > > > > > 15.1.9.3. NFS4ERR_NO_GRACE (Error Code 10033) > > > > > > > > A reclaim of client state was attempted in circumstances in which the > > > > server cannot guarantee that conflicting state has not been provided > > > > to another client. This occurs if there is no active grace period > > > > applying to the file system object for which the request was made, if > > > > the client making the request has no current role in reclaiming > > > > locks, or because previous operations have created a situation in > > > > which the server is not able to determine that a reclaim-interfering > > > > edge condition does not exist. > > > > > > > > 15.1.9.4. NFS4ERR_RECLAIM_BAD (Error Code 10034) > > > > > > > > The server has determined that a reclaim attempted by the client is > > > > not valid, i.e. the lock specified as being reclaimed could not > > > > possibly have existed before the server restart or file system > > > > migration event. A server is not obliged to make this determination > > > > and will typically rely on the client to only reclaim locks that the > > > > client was granted prior to restart. However, when a server does > > > > have reliable information to enable it to make this determination, this > > > > error indicates that the reclaim has been rejected as invalid. This > > > > is as opposed to the error NFS4ERR_RECLAIM_CONFLICT (see > > > > Section 15.1.9.5) where the server can only determine that there has > > > > been an invalid reclaim, but cannot determine which request is > > > > invalid. > > > > > > > > 15.1.9.5. NFS4ERR_RECLAIM_CONFLICT (Error Code 10035) > > > > > > > > The reclaim attempted by the client has encountered a conflict and > > > > cannot be satisfied. This potentially indicates a misbehaving > > > > client, although not necessarily the one receiving the error.
The > > > > misbehavior might be on the part of the client that established the > > > > lock with which this client conflicted. See also Section 15.1.9.4 > > > > for the related error, NFS4ERR_RECLAIM_BAD > > > > > > Thanks for remembering to fetch these updates from the full bis WIP! > > > > > > > > > > > > - how attacker capabilities compare by using a compromised server to > > > > > give bogus referrals/etc. as opposed to just giving bogus data/etc. > > > > > > > > > > > > > Will address. See the paragraphs to be added to the end of Section 21. > > > > > > > > > > > > > - an attacker in the network trying to shift client traffic (in terms > > > > > > of > > > > > what endpoints/connections they use) to overload a server > > > > > > > > > > > > > Will address. See the paragraphs to be added to the end of Section 21. > > > > > > > > > > > > > - how asynchronous replication can cause clients to repeat > > > > > non-idempotent actions > > > > > > > > > > > > > Not sure what you are referring to. > > > > > > I don't have something fully fleshed out here, but it's in the general > > > space when there are multiple replicas that get updates at (varying) > > > delays from the underlying write. A contrived situation would be if you > > > have a pool of worker machines that use NFS for state management (I know, > > > a > > > pretty questionable idea), and try to do compare-and-set on a state file. > > > If one worker tries to assert that it owns the state but other NFS > > > replicas > > > see delayed updates, additional worker machines could also try to claim > > > the > > > state and perform whatever operation the state file is controlling. > > > > > > Basically, the point here is that if you as an NFS consumer are using NFS > > > with relaxed replication semantics, you have to think through how your > > > workflow will behave in the presence of such relaxed updates. 
Which ought > > > to be obvious, when I say it like that, but perhaps is not always actually > > > obvious. > > > > > > > > > > > > - the potential for state skew and/or data loss if migration events > > > > > happen in close succession and the client "misses a notification" > > > > > > > > > > > > > Is there a specific problem that needs to be addressed? > > > > > > I don't have a concrete scenario that's specific to NFS, no; this is just > > > a > > > generic possibility for any scheme that involves discrete updates (e.g., > > > file-modifying RPCs) and the potential for asynchronous replication. > > > > > > > I think that the necessary discussion can be folded into some clarification > > of replication discussed in > > another thread. > > Sure. > > > > > > > > > > > > - cases where a filesystem moves and there's no longer anything > > > > > running > > > > > at the old network endpoint to return NFS4ERR_MOVED > > > > > > > > > > > > > This seems to me just a recognition that sometimes systems fail. Not > > > > sure > > > > specifically what to address. > > > > > > Okay. > > > > > > > > - what can happen when non-idempotent requests are in a COMPOUND > > > > > before > > > > > a request that gets NFS4ERR_MOVED > > > > > > > > > > > > > Intend to address in Section 15.1.2.4: > > > > > > > > The file system that contains the current filehandle object is not > > > > present at the server, or is not accessible using the network > > > > addressed. It may have been made accessible on a different set of > > > > network addresses, relocated or migrated to another server, or it may > > > > have never been present. The client may obtain the new file system > > > > location by obtaining the "fs_locations" or "fs_locations_info" > > > > attribute for the current filehandle. For further discussion, refer > > > > to Section 11.3.
> > > > > > > > As with the case of NFS4ERR_DELAY, it is possible that one or more > > > > non-idempotent operation may have been successfully executed within a > > > > COMPOUND before NFS4ERR_MOVED is returned. Because of this, once the > > > > new location is determined, the original request which received the > > > > NFS4ERR_MOVED should not be re-executed in full. Instead, a new > > > > COMPOUND, with any successfully executed non-idempotent operation > > > > removed should be executed. This new request should have different > > > > slot id or sequence in those cases in which the same session is used > > > > for the new request (i.e. transparent session migration or an > > > > > > nit: comma after "i.e.". > > > > > > > Fixed. > > > > > > > > > > > endpoint transition to a new address session-trunkable with the > > > > original one). > > > > > > > > > > > > > - how bad it is if the client messes up at Transparent State Migration > > > > > discovery, most notably in the case when some lock state is lost > > > > > > > > > > > > > Propose to address this by adding the following paragraph to the end of > > > > Section 11.13.2: > > > > > > > > Lease discovery needs to be provided as described above, in order to > > > > ensure that migrations are discovered soon enough to ensure that > > > > leases moved to new servers are discovred in time to make sure that > > > > leases are renewed early enough to avoid lease expiration, leading to > > > > loss of locking state. While the consequences of such loss can be > > > > > > nit: the double "are discovered {soon enough,in time}" is a little awkward > > > of a construction; how about "Lease discovery needs to be provided as > > > described above, in order to ensure that migrations are discovered soon > > > enough that leases moved to new servers can successfully be renewed before > > > they expire, avoiding loss of locking state"? 
> > > > > > > Went with the following: > > > > Lease discovery needs to be provided as described above, in order to > > ensure that migrations are discovered soon enough to enable > > leases moved to new servers to be appropriately renewed in order to > > avoid lease expiration, leading to loss of locking state. > > I think maybe we should leave any further tweaks to the RFC Editor; my only > concern is whether it can be misread as saying that the renewal leads to > loss of locking state (which is, admittedly, a nonsensical interpretation). > > > > > > > > ameliorated through implementations of courtesy locks, servers are > > > > under no obligation to do, and a conflicting lock request may means > > > > > > nit: s/means/mean/ > > > > > > > Fixed. > > > > > > > > > > > that a lock is revoked unexpectedly. Clients should be aware of this > > > > possibility. > > > > > > > > > > > > > > > > > - the interactions between cached replies and migration(-like) events, > > > > > though a lot of this is discussed in section 11.13.X and 15.1.1.3 > > > > > already > > > > > > > > > > > > > Will address any specfics that you feel aren't adequately addressed. > > > > > > I don't remember any particular specifics, so we should probably just let > > > this go for now. > > > > > > > > > > > > > > > > > but I defer to the WG as to what to cover now vs. later. > > > > > > > > > > In light of the ongoing work on draft-ietf-nfsv4-rpc-tls, it might be > > > > > reasonable to just talk about "integrity protection" as an abstract > > > > > thing without the specific focus on RPCSEC_GSS's integrity protection > > > > > (or authentication) > > > > > > > > > > > > > > > > > > I was initially leery of this, but when I looked at the text, I was able > > > > > > to > > > > avoid referring to RPCSEC_GSS in most cases in which integrity was > > > > mentioned:-). 
The same does not seem possible for authentication :-( > > > > > > We'll take the easy wins and try to not fret too much about the other > > > stuff. > > > > > > > > RPCSEC_GSS does not > > > > > % protect the binding from one server to another as part of a > > > > > > referral > > > > > % or migration event. The source server must be trusted to provide > > > > > % the correct information, based on whatever factors are available > > > > > to > > > > > % the client. > > > > > > > > > > > > > These are both situations for which RPCSEC_GSS has no solution, but > > > > > > neither > > > > is there another one. It is probably best to just say that without > > > > reference > > > > to integrity protection. > > > > > > True. > > > > > > > I have added new paragraphs after these bullets that may address some of > > > > the > > > > issues you were concerned about. > > > > > > > > Even if such requests are not interfered with in flight, it is > > > > possible for a compromised server to direct the client to use > > > > inappropriate servers, such as those under the control of the > > > > attacker. It is not clear that being directed to such servers > > > > represents a greater threat to the client than the damage that could > > > > be done by the compromised server itself. However, it is possible > > > > that some sorts of transient server compromises might be taken > > > > advantage of to direct a client to a server capable of doing greater > > > > damage over a longer time. One useful step to guard against this > > > > possibility is to issue requests to fetch location data using > > > > RPCSEC_GSS, even if no mapping to an RPCSEC_GSS principal is > > > > available. In this case, RPCSEC_GSS would not be used, as it > > > > typically is, to identify the client principal to the server, but > > > > rather to make sure (via RPCSEC_GSS mutual authentication) that the > > > > server being contacted is the one intended.
> > > > > > > > Similar considerations apply if the threat to be avoided is the > > > > direction of client traffic to inappropriate (i.e. poorly performing) > > > > servers. In both cases, there is no reason for the information > > > > returned to depend on the identity of the client principal requesting > > > > it, while the validity of the server information, which has the > > > > capability to affect all client principals, is of considerable > > > > importance. > > > > > > These do address some of the issues I mentioned; thank you. I do have a > > > couple further comments: > > > > > > - I'm not sure what "no mapping to an RPCSEC_GSS principal is available" > > > means (but maybe that's just because I've not read the RPCSEC_GSS RFCs > > > recently enough) > > > > > > > This is partly because "even if no mapping to an RPCSEC_GSS principal is > > available" > > is misleading. It would have been better to say "even if no mapping > > to an > > RPCSEC_GSS principal is available for the user currently obtaining the > > information". > > The issue is not within RPCSEC_GSS itself but relates to how it is used by > > NFSv4. > > Servers are required to support RPCSEC_GSS but to use it, you need to > > translate > > a uid to an RPCSEC_GSS principal. Where that mapping is not available, as > > it often > > is not, you cannot use RPCSEC_GSS for that user. > > Ah, that was enough to jostle the neurons into place (and no change to the > text needed). > > Thanks again, > > Ben > > > > available > > > > > > - w.r.t. "there is no reason for the information returned to depend on the > > > identity of the client principal", I could perhaps imagine some setup > > > that uses information from a corporate contacts database to determine > > > the > > > current office/location of a given user and provide a referral to a > > > spatially-local replica. So "no reason" may be too absolute (but I > > > don't > > > have a proposed alternative and don't object to using this text).
> > >
> > > [...]
> > > > > Section B.4
> > > > >
> > > > > o The discussion of trunking which appeared in Section 2.10.5 of
> > > > > RFC5661 [62] needed to be revised, to more clearly explain the
> > > > > multiple types of trunking supporting and how the client can be
> > > > > made aware of the existing trunking configuration. In addition
> > > > > the last paragraph (exclusive of sub-sections) of that section,
> > > > > dealing with server_owner changes, is literally true, it has been
> > > > > a source of confusion. [...]
> > > > >
> > > > > nit: the grammar here is weird; I think there's a missing "while" or
> > > > > similar.
> > > > >
> > > >
> > > > Anticipate using the following replacement text:
> > > >
> > > > o The discussion of trunking which appeared in Section 2.10.5 of
> > > > RFC5661 [62] needed to be revised, to more clearly explain the
> > > > multiple types of trunking supporting and how the client can be
> > >
> > > nit: just "trunking support" (not "-ing")?
> > >
> >
> > Fixed.
> >
> > > > made aware of the existing trunking configuration. In addition,
> > > > while the last paragraph (exclusive of sub-sections) of that
> > > > section, dealing with server_owner changes, is literally true, it
> > > > has been a source of confusion. Since the existing paragraph can
> > > > be read as suggesting that such changes be dealt with
> > > > non-disruptively, the issue needs to be clarified in the revised
> > > > section, which appears in Section 2.10.5.
> > >
> > > Thanks again for going through my giant pile of comments; I hope that you
> > > think the improvements to the document are worth the time spent.
> > >
> >
> > I do.
> >
> > > -Ben
> > >
--
Cheers

Magnus Westerlund

----------------------------------------------------------------------
Networks, Ericsson Research
----------------------------------------------------------------------
Ericsson AB                 | Phone  +46 10 7148287
Torshamnsgatan 23           | Mobile +46 73 0949079
SE-164 80 Stockholm, Sweden | mailto: magnus.westerlund@ericsson.com
----------------------------------------------------------------------