Re: [nfsv4] Benjamin Kaduk's Discuss on draft-ietf-nfsv4-rfc5661sesqui-msns-03: (with DISCUSS and COMMENT)

Magnus Westerlund <magnus.westerlund@ericsson.com> Tue, 25 February 2020 09:40 UTC

From: Magnus Westerlund <magnus.westerlund@ericsson.com>
To: "davenoveck@gmail.com" <davenoveck@gmail.com>, "kaduk@mit.edu" <kaduk@mit.edu>
CC: "draft-ietf-nfsv4-rfc5661sesqui-msns@ietf.org" <draft-ietf-nfsv4-rfc5661sesqui-msns@ietf.org>, "iesg@ietf.org" <iesg@ietf.org>, "nfsv4-chairs@ietf.org" <nfsv4-chairs@ietf.org>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Date: Tue, 25 Feb 2020 09:40:05 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/3y6KaB1mh5IxqKUQmdIQcprNUwU>

Hi Ben,

-04 has been available for a while now. Can you please check whether it
addresses your issues?

Thanks

Magnus

On Tue, 2020-01-21 at 19:16 -0800, Benjamin Kaduk wrote:
> Not attempting to trim anything, so sparsely inline...
> 
> On Sat, Jan 18, 2020 at 08:30:18AM -0500, David Noveck wrote:
> > On Mon, Jan 13, 2020 at 5:54 PM Benjamin Kaduk <kaduk@mit.edu> wrote:
> > 
> > > Hi David,
> > > 
> > > Trimming lots of good stuff here as well...
> > > 
> > > On Thu, Jan 02, 2020 at 10:09:02AM -0500, David Noveck wrote:
> > > > On Wed, Dec 18, 2019 at 3:32 AM Benjamin Kaduk via Datatracker <
> > > > noreply@ietf.org> wrote:
> > > > 
> > > > > Benjamin Kaduk has entered the following ballot position for
> > > > > draft-ietf-nfsv4-rfc5661sesqui-msns-03: Discuss
> > > > > 
> > > > > ----------------------------------------------------------------------
> > > > > DISCUSS:
> > > > > ----------------------------------------------------------------------
> > > > > 
> > > > > Responded to these on 12/20.
> > > > > 
> > > > > ----------------------------------------------------------------------
> > > > > COMMENT:
> > > > > ----------------------------------------------------------------------
> > > > > 
> > > > > I think I may have mistakenly commented on some sections that are
> > > > > actually just moved text, since my lookahead window in the diff was
> > > > > too
> > > > > small.
> > > > 
> > > > 
> > > > No harm, no foul.
> > > > 
> > > > 
> > > > > 
> > > > > Since the "Updates:" header is part of the immutable RFC text (though
> > > > > "Updated by:" is mutable), we should probably explicitly state that
> > > 
> > > "the
> > > > > updates that RFCs 8178 and 8434 made to RFC 5661 apply equally to this
> > > > > document".
> > > > > 
> > > > 
> > > > I think we could update the last paragraph of Section 1.1 to be more
> > > > explicit about
> > > > this.  Perhaps it could read:
> > > > 
> > > >    Until the above work is done, there will not be a consistent set of
> > > >    documents providing a description of the NFSv4.1 protocol and any
> > > >    full description would involve documents updating other documents
> > > >    within the specification.   The updates applied by RFC 8434 [66] and
> > > >    RFC 8178 [63] to RFC5661 also apply to this specification, and
> > > >    will apply to any subsequent v4.1 specification until that work is
> > > 
> > > done.
> > > 
> > > Sounds good.
> > > 
> > > > > 
> > > > > I note inline (in what is probably too many places; please don't reply
> > > > > at all of them!) some question about how clear the text is that a file
> > > > > system migration is something done at a per-file-system granularity,
> > > 
> > > and
> > > > > that migrating a client at a time is not possible.
> > > > 
> > > > 
> > > > It might be possible but doing so is not a goal of this specification.
> > > > 
> > > > I'm not sure how to address your concern.   I don't know why anyone
> > > > would
> > > > assume that migrating entire clients is a goal of this specification.
> > > 
> > >  As
> > > > far as
> > > > I can see, when the word "migration" is used it is always in connection
> > > 
> > > with
> > > > migrating a file system.   Is there some specific place where you think
> > > > this
> > > > issue is likely to arise?
> > > 
> > > I think I garbled my point; my apologies.
> > > To give a semi-concrete example, suppose I have clients A and B that are
> > > accessing filesystem F on server X, and filesystem F is also available on
> > > server Y.  If X decides that it needs to migrate access to F away from X
> > > (e.g., for maintenance), then the "file system migration event" involves
> > > telling both A and B to look to Y for access to F, at basically the same
> > > time.
> > 
> > 
> > This clarifies things for me.   When you were speaking of "migrating a
> > client"
> > I assumed you were worried about consistency of fs's F, G, H for a particular
> > client.
> > Now it appears the issue is consistency among clients A, B, C, all
> > accessing a
> > common F.
> > 
> > If X tries to tell only A but not B to access F via Y but lets B
> > > continue to access F at X, then I think there can be some subtle
> > > consistency issues.
> > > 
> > 
> > Or worse, some decidedly unsubtle ones :-(
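
A toy model of the example above may help make the concern concrete. It is only a sketch: the mapping from (client, file system) to server and the name `split_file_systems` are illustrative assumptions, not anything defined by NFSv4.1. It flags file systems that different clients currently reach through different servers, which is the situation where consistency depends entirely on server-side co-ordination.

```python
from collections import defaultdict


def split_file_systems(access):
    """Return the file systems that are reached via more than one server.

    `access` maps (client, file_system) -> server name. This is a
    hypothetical bookkeeping structure used only for illustration.
    """
    servers_by_fs = defaultdict(set)
    for (_client, fs), server in access.items():
        servers_by_fs[fs].add(server)
    # A file system listed here is not necessarily inconsistent: the
    # servers may be different access paths to one replica, or replicas
    # that co-ordinate. It only marks where that question has to be asked.
    return sorted(fs for fs, servers in servers_by_fs.items() if len(servers) > 1)


# X tells only A to move to Y, while B keeps using X.
partial = {("A", "F"): "Y", ("B", "F"): "X"}
# X tells both A and B to move to Y at the same time.
complete = {("A", "F"): "Y", ("B", "F"): "Y"}

print(split_file_systems(partial))
print(split_file_systems(complete))
```

In this model, moving both clients together leaves nothing flagged, while moving only A flags F.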
> > 
> > > 
> > > In some sense, this is easy to consider as a dichotomy between "migration
> > > is for server maintenance" vs. "migration is for load balancing".
> > 
> > 
> > That categorization helps.
> > 
> > 
> > > Assuming
> > > I understand correctly (not a trivial assumption!), there was never any
> > > intent to use these mechanisms for load balancing,
> > 
> > 
> > Well "Never" covers a lot.    There are cases in which you do want to do
> > load balancing.   For example, if you are dealing with multiple network
> > access paths to the same replica, there is no issue with the load balancing
> > approach.   In the case of multiple replicas where data consistency applies
> > between them, then you might load balance but it is the server's
> > responsibility to
> > provide the consistency, meaning that he needs to be warned of the
> > possibility
> > of issues that might arise if clients modifying the same data are placed
> > on
> > different replicas. In the case in which you don't guarantee data
> > consistency among
> > replicas, you might as well say about doing load balancing that "there be
> > dragons".
> > 
> > and if we can explicitly
> > > disclaim such usage, then we don't have to try to reason through any
> > > potential subtle consistency issues.
> > > 
> > 
> > I think we can disclaim the really problematic part.  I think the new text
> > will be needed in the migration section.  Issues with replication are
> > different and do not involve any server choice.
> > 
> > I anticipate revising section 11.5.5 to read as follows:
> > 
> >    When a file system is present and becomes inaccessible using the
> >    current access path, the NFSv4.1 protocol provides a means by which
> >    clients can be given the opportunity to have continued access to
> >    their data.  This may involve use of a different access path to the
> >    existing replica or providing a path to a different replica.  The
> >    new access path or the location of the new replica is specified by a
> >    file system location attribute.  The ensuing migration of access
> >    includes the ability to retain locks across the transition.
> >    Depending on circumstances, this can involve:
> > 
> >    o  The continued use of the existing clientid when accessing the
> >       current replica using a new access path.
> > 
> >    o  Use of lock reclaim, taking advantage of a per-fs grace period.
> > 
> >    o  Use of Transparent State Migration.
> > 
> >    Typically, a client will be accessing the file system in question,
> >    get an NFS4ERR_MOVED error, and then use a file system location
> >    attribute to determine the new access path for the data.  When
> >    fs_locations_info is used, additional information will be available
> >    that will define the nature of the client's handling of the
> >    transition to a new server.
> > 
> >    In most instances clients will choose to migrate all clients using a
> 
> (I assume s/clients/servers/ (just the first time))
> 
> >    particular file system to a successor replica at the same time to
> >    avoid cases in which different clients are updating different
> >    replicas.  However, migration of an individual client can be helpful in
> >    providing load balancing, as long as the replicas in question are
> >    such that they represent the same data as described in
> >    Section 11.11.8.
> > 
> >    o  In the case in which there is no transition between replicas
> >       (i.e., only a change in access path), there are no special
> >       difficulties in using this mechanism to effect load balancing.
> > 
> >    o  In the case in which the two replicas are sufficiently co-
> >       ordinated as to allow coherent simultaneous access to both by a
> >       single client, there is, in general, no obstacle to use of
> >       migration of particular clients to effect load balancing.
> >       Generally, such simultaneous use involves co-operation between
> >       servers to ensure that locks granted on two co-ordinated replicas
> >       cannot conflict and can remain effective when transferred to a
> >       common replica.
> > 
> >    o  In the case in which a large set of clients are accessing a file
> >       system in a read-only fashion, it can be helpful to migrate all
> >       clients with writable access simultaneously, while using load
> >       balancing on the set of read-only copies, as long as the rules
> >       appearing in Section 11.11.8, designed to prevent data reversion
> >       are adhered to.
> > 
> >    In other cases, the client might not have sufficient guarantees of
> >    data similarity/coherence to function properly (e.g. the data in the
> >    two replicas is similar but not identical), and the possibility that
> >    different clients are updating different replicas can exacerbate the
> >    difficulties, making use of load balancing in such situations a
> >    perilous enterprise.
> > 
> >    The protocol does not specify how the file system will be moved
> >    between servers or how updates to multiple replicas will be co-
> >    ordinated.  It is anticipated that a number of different server-to-
> >    server co-ordination mechanisms might be used with the choice left to
> >    the server implementer.  The NFSv4.1 protocol specifies the method
> >    used to communicate the migration event between client and server.
> > 
> >    The new location may be, in the case of various forms of server
> >    clustering, another server providing access to the same physical file
> >    system.  The client's responsibilities in dealing with this
> >    transition will depend on whether a switch between replicas has
> >    occurred and the means the server has chosen to provide continuity of
> >    locking state.  These issues will be discussed in detail below.
> > 
> >    Although a single successor location is typical, multiple locations
> >    may be provided.  When multiple locations are provided, the client
> >    will typically use the first one provided.  If that is inaccessible
> >    for some reason, later ones can be used.  In such cases the client
> >    might consider the transition to the new replica to be a migration
> >    event, even though some of the servers involved might not be aware of
> >    the use of the server which was inaccessible.  In such a case, a
> >    client might lose access to locking state as a result of the access
> >    transfer.
> > 
> >    When an alternate location is designated as the target for migration,
> >    it must designate the same data (with metadata being the same to the
> >    degree indicated by the fs_locations_info attribute).  Where file
> >    systems are writable, a change made on the original file system must
> >    be visible on all migration targets.  Where a file system is not
> >    writable but represents a read-only copy (possibly periodically
> >    updated) of a writable file system, similar requirements apply to the
> >    propagation of updates.  Any change visible in the original file
> >    system must already be effected on all migration targets, to avoid
> >    any possibility that a client, in effecting a transition to the
> >    migration target, will see any reversion in file system state.
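
The client-side choice among multiple successor locations described in the proposed text can be sketched as follows. This is a minimal illustration under stated assumptions: `Location`, `pick_successor`, and the `is_reachable` callback are hypothetical names, and a real client would obtain the ordered list from a file system location attribute after receiving NFS4ERR_MOVED rather than from a literal list.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Location:
    """One entry from a file system location attribute (simplified)."""
    server: str
    path: str


def pick_successor(
    locations: Sequence[Location],
    is_reachable: Callable[[Location], bool],
) -> Optional[Location]:
    """Return the first reachable location, preserving the server's order.

    The first entry is preferred; later entries are tried only when the
    earlier ones are inaccessible. None means no listed location works.
    """
    for loc in locations:
        if is_reachable(loc):
            return loc
    return None


candidates = [Location("y.example", "/export/f"), Location("z.example", "/export/f")]
down = {"y.example"}  # pretend the first successor is unreachable

chosen = pick_successor(candidates, lambda loc: loc.server not in down)
print(chosen)
```

When the client falls through to a later entry, as it does here, the proposed text warns that the chosen server may be unaware of the transition, so locking state might be lost.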
> > 
> > 
> > > > As was the case for
> > > > > my Discuss point about addresses/port-numbers, I'm missing the context
> > > > > of the rest of the document, so perhaps this is a non-issue, but the
> > > > > consequences of getting it wrong seem severe enough that I wanted to
> > > > > check.
> > > > > 
> > > > 
> > > > I'm not seeing any severe consequences.   Am I missing something?
> > > > 
> > > > 
> > 
> > This is clearer now. I think we can avoid any severe consequences.
> > 
> > 
> > > > 
> > > > Section 1.1
> > > > > 
> > > > >    The revised description of the NFS version 4 minor version 1
> > > > >    (NFSv4.1) protocol presented in this update is necessary to enable
> > > > >    full use of trunking in connection with multi-server namespace
> > > > >    features and to enable the use of transparent state migration in
> > > > >    connection with NFSv4.1.  [...]
> > > > > 
> > > > > nit: do we expect all readers to know what is meant by "trunking" with
> > > > > no other lead-in?
> > > > > 
> > > > 
> > > > Good point.  Perhaps it could be addressed by rewriting the material in
> > > 
> > > the
> > > > first paragraph of  Section 1.1 to read as follows:
> > > > 
> > > >    Two important features previously defined in minor version 0 but
> > > >    never fully addressed in minor version 1 are trunking, the use of
> > > >    multiple connections between a client and server potentially to
> > > >    different network addresses, and transparent state migration, which
> > > >    allows a file system to be transferred betwwen servers in a way that
> > > >    provides for the client to maintain its existing locking state
> > > > across
> > > >    the transfer.
> > > 
> > > Maybe "the simultaneous use of multiple connections"?
> > > 
> > 
> > Will add.
> > 
> > 
> > > nit: s/betwwen/between/
> > > 
> > 
> > Fixed.
> > 
> > > 
> > >    The revised description of the NFS version 4 minor version 1
> > > >    (NFSv4.1) protocol presented in this update is necessary to enable
> > > >    full use of these features with other multi-server namespace features.
> > > >    This document is in the form of an updated description of the NFS 4.1
> > > >    protocol previously defined in RFC5661 [62].  RFC5661 is obsoleted by
> > > >    this document.  However, the update has a limited scope and is
> > > > focused
> > > >    on enabling full use of trunking and transparent state migration.
> > > 
> > > The
> > > >    need for these changes is discussed in Appendix A.  Appendix B
> > > 
> > > describes
> > > >    the specific changes made to arrive at the current text.
> > > 
> > > This looks good, thanks.
> > > 
> > 
> > :-)
> > 
> > 
> > > 
> > > [...]
> > > > > 
> > > > >    o  Work would have to be done to address many erratas relevant to
> > > 
> > > RFC
> > > > >       5661, other than errata 2006 [60], which is addressed in this
> > > > >       document.  That errata was not deferrable because of the
> > > > >       interaction of the changes suggested in that errata and handling
> > > > >       of state and session migration.  The erratas that have been
> > > > >       deferred include changes originally suggested by a particular
> > > > >       errata, which change consensus decisions made in RFC 5661, which
> > > > >       need to be changed to ensure compatibility with existing
> > > > >       implementations that do not follow the handling delineated in
> > > > > RFC
> > > > >       5661.  Note that it is expected that such erratas will remain
> > > > > 
> > > > > This sentence is pretty long and hard to follow; maybe it could be
> > > 
> > > split
> > > > > after "change consensus decisions made in RFC 5661" and the second
> > > > > half
> > > > > start with a more declarative statement about existing
> > > > > implementations?
> > > > > (E.g., "Existing implementations did not perform handling as
> > > 
> > > delineated in
> > > > > RFC
> > > > > 5661 since the procedures therein were not workable, and in order to
> > > > > have the specification accurately reflect the existing deployment
> > > > > base,
> > > > > changes are needed [...]")
> > > > > 
> > > > 
> > > > I will clean this bullet up.  See below for a proposed replacement.
> > > > 
> > > > 
> > > > > 
> > > > >       relevant to implementers and the authors of an eventual
> > > > >       rfc5661bis, despite the fact that this document, when approved,
> > > > >       will obsolete RFC 5661.
> > > > > 
> > > > > (I assume the RFC Editor can tweak this line to reflect what actually
> > > > > happens; my understanding is that the errata reports will get cloned
> > > > > to
> > > > > this-RFC.)
> > > > > 
> > > > 
> > > > I understand that Magnus has already got that issue addressed.  I'll
> > > > discuss the appropriate text with him.
> > > > 
> > > > 
> > > > > [rant about "errata" vs. "erratum" elided]
> > > > > 
> > > > 
> > > > This is annoying but there is no way we are going to get people to use
> > > > "erratum".   What I've tried to do in my proposed replacement text
> > > > is to refer to "errata report(s)", which is more accurate and allows
> > > > people who speak English to use English singulars and plurals, without
> > > > having to worry about Latin grammar.
> > > 
> > > That's what I try to do as well :)
> > > 
> > > > Here's my proposed replacement for the troubled bullet:
> > > > 
> > > >    o  Work needs to be done to address many errata reports relevant to
> > > >       RFC 5661, other than errata report 2006 [60], which is addressed
> > > >       in this document.  Addressing of that report was not deferrable
> > > >       because of the interaction of the changes suggested there and the
> > > >       newly described handling of state and session migration.
> > > > 
> > > >       The errata reports that have been deferred and that will need to
> > > >       be addressed in a later document include reports currently
> > > >       assigned a range of statuses in the errata reporting system
> > > >       including reports marked Accepted and those marked Held Over
> > > 
> > > nit: it's "Hold For Document Update"
> > > 
> > > Fixed
> > > >       because the change was too minor to address immediately.
> > > > 
> > > >       In addition, there is a set of other reports, including at least
> > > >       one in state Rejected, which will need to be addressed in a later
> > > >       document.  This will involve making changes to consensus decisions
> > > >       reflected in RFC 5661, in situations in which the working group
> > > > has
> > > >       already decided that the treatment in RFC 5661 is incorrect, and
> > > > needs
> > > >       to be revised to reflect the working group's new consensus and
> > > 
> > > ensure
> > > >       compatibility with existing implementations that do not follow the
> > > >       handling described in RFC 5661.
> > > > 
> > > >       Note that it is expected that such all errata reports will remain
> > > 
> > > nit: s/such all/all such/
> > > 
> > > Fixed.
> > > >       relevant to implementers and the authors of an eventual
> > > >       rfc5661bis, despite the fact that this document, when approved,
> > > >       will obsolete RFC 5661 [62].
> > > 
> > > This looks really good!
> > > 
> > > > 
> > > > > Section 2.10.4
> > > > > 
> > > > >    Servers each specify a server scope value in the form of an opaque
> > > > >    string eir_server_scope returned as part of the results of an
> > > > >    EXCHANGE_ID operation.  The purpose of the server scope is to allow
> > > 
> > > a
> > > > >    group of servers to indicate to clients that a set of servers
> > > 
> > > sharing
> > > > >    the same server scope value has arranged to use compatible values
> > > > > of
> > > > >    otherwise opaque identifiers.  Thus, the identifiers generated by
> > > 
> > > two
> > > > >    servers within that set can be assumed compatible so that, in some
> > > > >    cases, identifiers generated by one server in that set may be
> > > > >    presented to another server of the same scope.
> > > > > 
> > > > > Is there more that we can say than "in some cases"?
> > > > 
> > > > 
> > > > Not really.  In general, when a server sends you an id, it comes with an
> > > > implied promise to recognize it when you present it subsequently to the
> > > > same server.
> > > > 
> > > > The fact that two servers have decided to co-operate in their Id
> > > 
> > > assignment
> > > > does not change that.
> > > > 
> > > > The previous text
> > > > > implies a higher level of reliability than just "some cases", to me.
> > > > > 
> > > > 
> > > > I think I need to change the text, perhaps by replacing "use compatible
> > > > values of otherwise
> > > > opaque identifiers" by "use distinct values of otherwise opaque
> > > 
> > > identifiers
> > > > so that the two
> > > > servers never assign the same value to two distinct objects".
> > > > 
> > > > I anticipate the following replacement for the first two paragraphs of
> > > > Section 2.10.4:
> > > > 
> > > >    Servers each specify a server scope value in the form of an opaque
> > > >    string eir_server_scope returned as part of the results of an
> > > >    EXCHANGE_ID operation.  The purpose of the server scope is to allow a
> > > >    group of servers to indicate to clients that a set of servers sharing
> > > >    the same server scope value has arranged to use distinct values of
> > > >    opaque identifiers so that the two servers never assign the same
> > > >    value to two distinct objects.  Thus, the identifiers generated by two
> > > >    servers within that set can be assumed compatible so that, in certain
> > > >    important cases, identifiers generated by one server in that set may
> > > >    be presented to another server of the same scope.
> > > > 
> > > >    The use of such compatible values does not imply that a value
> > > >    generated by one server will always be accepted by another.  In most
> > > >    cases, it will not.  However, a server will not accept a value
> > > >    generated by another inadvertently.  When it does accept it, it will
> > > 
> > > nit: I think it flows better to put "inadvertently" as "will not
> > > inadvertently accept".
> > > 
> > 
> > OK.  Fixed.
> > 
> > 
> > > 
> > > >    be because it is recognized as valid and carrying the same meaning as
> > > >    on another server of the same scope.
> > > > 
> > > > 
> > > > As an illustration of the (limited) value of this information, consider
> > > 
> > > the
> > > > case of client recovery from a server reboot.  The client has to reclaim
> > > > his locks using file handles returned by the previous server instance.
> > > 
> > > If
> > > > the server scopes are the same (they almost always are), the client is
> > > 
> > > not
> > > > sure he will get his locks back (e.g. the file might have been deleted),
> > > > but he does know that, if the lock reclaim succeeds, it is for the same
> > > > file.  If the server scopes are not the same, he has no such assurance.
> > > 
> > > Thanks, the new text (and explanation here) is very clear about what's
> > > going on.
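
The lock-reclaim illustration above can be restated as a small decision function. This is a hedged sketch rather than protocol text: `ReclaimMeaning` and `interpret_reclaim` are hypothetical names, and the only protocol fact relied on is that the server scope (eir_server_scope) is an opaque string that the client compares for equality.

```python
from enum import Enum


class ReclaimMeaning(Enum):
    SAME_FILE = "reclaim succeeded and refers to the same file as before"
    NO_ASSURANCE = "reclaim succeeded but the file identity is not assured"
    NOT_RECLAIMED = "reclaim failed; the lock was not recovered"


def interpret_reclaim(old_scope: bytes, new_scope: bytes, reclaim_ok: bool) -> ReclaimMeaning:
    """Classify what a lock reclaim result tells the client.

    Matching scopes do not promise that the reclaim will succeed (the
    file might have been deleted). They only promise that a success is
    not an accidental match against a different object.
    """
    if not reclaim_ok:
        return ReclaimMeaning.NOT_RECLAIMED
    if old_scope == new_scope:
        return ReclaimMeaning.SAME_FILE
    return ReclaimMeaning.NO_ASSURANCE


print(interpret_reclaim(b"scope-1", b"scope-1", True).name)
print(interpret_reclaim(b"scope-1", b"scope-2", True).name)
print(interpret_reclaim(b"scope-1", b"scope-1", False).name)
```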
> > > 
> > > [...]
> > > > > Section 11.5.5
> > > > > 
> > > > >    will typically use the first one provided.  If that is inaccessible
> > > > >    for some reason, later ones can be used.  In such cases the client
> > > > >    might consider that the transition to the new replica as a
> > > > > migration
> > > > >    event, even though some of the servers involved might not be aware
> > > 
> > > of
> > > > >    the use of the server which was inaccessible.  In such a case, a
> > > > > 
> > > > > nit: the grammar here got wonky; maybe s/as a/is a/?
> > > > 
> > > > 
> > > > How about s/as a/to be a/ ?
> > > 
> > > That works if you drop the earlier "that", for "the client might consider
> > > the transition to the new replica to be a migration event".
> > > 
> > > Did that.
> > > [...]
> > > > > 
> > > > >    o  The "local" representation of all owners and groups must be the
> > > > >       same on all servers.  The word "local" is used here since that
> > > > > is
> > > > >       the way that numeric user and group ids are described in
> > > > >       Section 5.9.  However, when AUTH_SYS or stringified owners or
> > > > >       group are used, these identifiers are not truly local, since
> > > > > they
> > > > >       are known to the clients as well as the server.
> > > > > 
> > > > > I am trying to find a way to note that the AUTH_SYS case mentioned
> > > > > here
> > > > > is precisely because of the requirement being imposed by this bullet
> > > > > point,
> > > > 
> > > > 
> > > > Not sure what you mean by that.  I think the requirement is to allow the
> > > > client
> > > > to be able to use AUTH_SYS, without the contortions that would be
> > > 
> > > required
> > > > if
> > > > different fs's had the same uid's meaning different things.
> > > > 
> > > > while acknowledging that the "stringified owners or group" case
> > > > > is separate, but not having much luck.
> > > > > 
> > > > 
> > > > My attempt to revise this area is below:
> > > > 
> > > >    Note that there is no requirement in general that the users
> > > >    corresponding to particular security principals have the same local
> > > >    representation on each server, even though it is most often the case
> > > >    that this is so.
> > > > 
> > > >    When AUTH_SYS is used, the following additional requirements must be
> > > >    met:
> > > > 
> > > >    o  Only a single NFSv4 domain can be supported through use of
> > > >       AUTH_SYS.
> > > > 
> > > >    o  The "local" representation of all owners and groups must be the
> > > >       same on all servers.  The word "local" is used here since that is
> > > >       the way that numeric user and group ids are described in
> > > >       Section 5.9.  However, when AUTH_SYS or stringified numeric owners
> > > >       or groups are used, these identifiers are not truly local, since
> > > >       they are known to the clients as well as the server.
> > > > 
> > > > >    Similarly, when stringified numeric user and group ids are used, the
> > > >    "local" representation of all owners and groups must be the same on
> > > >    all servers, even when AUTH_SYS is not used.
> > > 
> > > I really like this rewriting; thank you for undertaking it.
> > > I think that what I was trying to say here is roughly that we need
> > > scare-quotes for "local" because of things like AUTH_SYS (or stringified
> > > user/group ids) that involve sending local representations over the
> > > network.  So your rewrite did in fact address my concern, even though I
> > > didn't manage to say it very well the first time :)
> > > 
> > > [...]
> > > > > 
> > > > >    o  When there are no potential replacement addresses in use but
> > > 
> > > there
> > > > > 
> > > > > What is a "replacement address"?
> > > > > 
> > > > 
> > > > I've explained that in some new text added before these bullets, as a
> > > > new
> > > > second
> > > > paragraph of this section:
> > > > 
> > > >    The appropriate action depends on the set of replacement addresses
> > > >    (i.e. server endpoints which are server-trunkable with one previously
> > > >    being used) which are available for use.
> > > > 
> > > > > 
> > > > >       are valid addresses session-trunkable with the one whose use is
> > > 
> > > to
> > > > >       be discontinued, the client can use BIND_CONN_TO_SESSION to
> > > 
> > > access
> > > > >       the existing session using the new address.  Although the target
> > > > >       session will generally be accessible, there may be cases in
> > > > > which
> > > > >       that session is no longer accessible.  In this case, the client
> > > > >       can create a new session to enable continued access to the
> > > > >       existing instance and provide for use of existing filehandles,
> > > > >       stateids, and client ids while providing continuity of locking
> > > > >       state.
> > > > > 
> > > > > I'm not sure I understand this last sentence.  On its own, the "new
> > > > > session to enable continued access to the existing instance" sounds
> > > 
> > > like
> > > > > the continued access would be on the address whose use is to cease,
> > > > > and
> > > > > thus the new session would be there.
> > > > 
> > > > 
> > > > That is not the intention.  Will need to clarify.
> > > > 
> > > > 
> > > > > But why make a new session when
> > > > > the old one is still good,
> > > > 
> > > > 
> > > > It isn't usable on the new connection.
> > > > 
> > > > 
> > > > > especially when we just said in the previous
> > > > > sentence that the old session can't be moved to the new
> > > > > connection/address?
> > > > > 
> > > > 
> > > > Because we can't use it on the new connection, we have to create a
> > > > new session to access  the client.
> > > > 
> > > > Perhaps a forward reference down to Section 11.12.{4,5} for this and the
> > > > > next bullet point would help as well as rewording?
> > > > > 
> > > > 
> > > > > It turns out these would add confusion, since they deal with migration
> > > > > situations and with deciding whether transparent state migration has
> > > > > occurred in the switch between replicas.  In the cases we are dealing
> > > > > with, there is only a single replica/fs and no migration.
> > > > 
> > > > Here is my proposed replacement text for the two bullets in question:
> > > > 
> > > >    o  When there are no potential replacement addresses in use but there
> > > >       are valid addresses session-trunkable with the one whose use is to
> > > >       be discontinued, the client can use BIND_CONN_TO_SESSION to access
> > > >       the existing session using the new address.  Although the target
> > > >       session will generally be accessible, there may be rare situations
> > > >       in which that session is no longer accessible, when an attempt is
> > > >       made tto bind the new conntectin to it.  In this case, the client
> > > 
> > > nits: s/tto/to/, s/conntectin/connection/
> > > 
> > 
> > Fixed.
> > 
> > > 
> > > >       can create a new session to enable continued access to the
> > > >       existing instance and provide for use of existing filehandles,
> > > >       stateids, and client ids while providing continuity of locking
> > > >       state.
> > > 
> > > Just to check: this sounds like even in the case where the client creates
> > > a new session, the filehandle, stateid, clientid, and locking state
> > > (values) are in effect "transparently preserved" by the server, so the
> > > client has no need to do any reclamation of locking state.  I think that's
> > > what's intended, but holler if I'm wrong about that.
> > > 
> > 
> > Ok.
> > I'll holler that *you're right about that.*
> > 
> > > 
> > > >    o  When there is no potential replacement address in use and there
> > > >       are no valid addresses session-trunkable with the one whose use is
> > > >       to be discontinued, other server-trunkable addresses may be used
> > > >       to provide continued access.  Although use of CREATE_SESSION is
> > > >       available to provide continued access to the existing instance,
> > > >       servers have the option of providing continued access to the
> > > >       existing session through the new network access path in a fashion
> > > >       similar to that provided by session migration (see Section 11.12).
> > > >       To take advantage of this possibility, clients can perform an
> > > >       initial BIND_CONN_TO_SESSION, as in the previous case, and use
> > > >       CREATE_SESSION only if that fails.
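
The "try BIND_CONN_TO_SESSION first, fall back to CREATE_SESSION" order
described in the two bullets above can be sketched as follows.  This is only
an illustration: the BindRejected exception and the two callables passed in
are hypothetical stand-ins for whatever a real client uses to issue these
operations, not part of any actual NFS client API.

```python
class BindRejected(Exception):
    """Hypothetical error: the server refused BIND_CONN_TO_SESSION."""


def reattach(conn, session_id, bind_conn_to_session, create_session):
    """Return (session_id, reused) for the new connection `conn`.

    `bind_conn_to_session` and `create_session` are caller-supplied
    callables standing in for the real operations (assumed names).
    """
    try:
        # First choice: keep using the existing session over the new path.
        bind_conn_to_session(conn, session_id)
        return session_id, True
    except BindRejected:
        # Fallback: a new session on the same server instance.  Filehandles,
        # stateids, and the client ID are expected to remain usable.
        return create_session(conn), False
```

In both outcomes the client continues with its existing locking state; only
the session identifier may differ.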
> > > > 
> > > > 
> > > > > Section 11.10.6
> > > > > 
> > > > >    In a file system transition, the two file systems might be
> > > > > clustered
> > > > >    in the handling of unstably written data.  When this is the case,
> > > 
> > > and
> > > > > 
> > > > > What does "clustered in the handling of unstably written data" mean?
> > > > > 
> > > > >    the two file systems belong to the same write-verifier class, write
> > > > > 
> > > > > How is the client supposed to determine "when this is the case"?
> > > > > 
> > > > 
> > > > > Here's a proposed replacement for this paragraph:
> > > > 
> > > >    In a file system transition, the two file systems might be
> > > >    cooperating in the handling of unstably written data.  Clients can
> > > > >    determine if this is the case, by seeing if the two file systems
> > > >    belong to the same write-verifier class.  When this is the case,
> > > >    write verifiers returned from one system may be compared to those
> > > >    returned by the other and superfluous writes avoided.
> > > > 
> > > > 
> > > > > Section 11.10.7
> > > > > 
> > > > >    In a file system transition, the two file systems might be
> > > 
> > > consistent
> > > > >    in their handling of READDIR cookies and verifiers.  When this is
> > > 
> > > the
> > > > >    case, and the two file systems belong to the same readdir class,
> > > > > 
> > > > > As above, how is the client supposed to determine "when this is the
> > > > > case"?
> > > > 
> > > > 
> > > > >    READDIR cookies and verifiers from one system may be recognized by
> > > > >    the other and READDIR operations started on one server may be
> > > 
> > > validly
> > > > >    continued on the other, simply by presenting the cookie and
> > > > > verifier
> > > > >    returned by a READDIR operation done on the first file system to
> > > > > the
> > > > >    second.
> > > > > 
> > > > > Are these "may be"s supposed to admit the possibility that the
> > > > > destination server can just decide to not honor them arbitrarily?
> > > > > 
> > > > 
> > > > No. They are intended to indicate that the client might or might not use
> > > > the capability
> > > > 
> > > > Here is proposed replacement text for the paragraph:
> > > > 
> > > >    In a file system transition, the two file systems might be consistent
> > > >    in their handling of READDIR cookies and verifiers.  Clients can
> > > >    determine if this is the case, by seeing if the two file systems
> > > >    belong to the same readdit class.  When this is the case, readdir
> > > 
> > > > nit: s/readdit/readdir/
> > > 
> > 
> > Fixed.
> > 
> > 
> > > 
> > > >    class, READDIR cookies and verifiers from one system will be
> > > >    recognized by the other and READDIR operations started on one server
> > > >    can be validly continued on the other, simply by presenting the
> > > >    cookie and verifier returned
> > > 
> > > Ah, this formulation (for both write-verifier and readdir) is very
> > > helpful.
> > > 
> > > [...]
> > > > > Section 11.16.1
> > > > > 
> > > > >    With the exception of the transport-flag field (at offset
> > > > >    FSLI4BX_TFLAGS with the fls_info array), all of this data applies
> > > > > to
> > > > >    the replica specified by the entry, rather that the specific
> > > > > network
> > > > >    path used to access it.
> > > > > 
> > > > > Is it clear that this applies only to the fields defined by this
> > > > > specification (since, as mentioned later, future extensions must
> > > 
> > > specify
> > > > > whether they apply to the replica or the entry)?
> > > > > 
> > > > 
> > > > Intend to use the following replacement text:
> > > > 
> > > >    With the exception of the transport-flag field (at offset
> > > >    FSLI4BX_TFLAGS with the fls_info array), all of this data defuined in
> > > 
> > > nit: s/defuined/defined/
> > > 
> > 
> >  Fixed.
> > 
> > 
> > > >    this specification applies to the replica specified by the entry,
> > > > >    rather than the specific network path used to access it.  The
> > > >    classification of data in extensions to this data is discussed
> > > >    below
> > > 
> > > [...]
> > > > > Section 18.35.3
> > > > > 
> > > > > I a little bit wonder if we want to reaffirm that co_verifier remains
> > > > > fixed when the client is establishing multiple connections for
> > > > > trunking
> > > > > usage -- the "incarnation of the client" language here could make a
> > > > > reader wonder, though I think the discussion of its use elsewhere as
> > > > > relating to "client restart" is sufficiently clear.
> > > > > 
> > > > 
> > > > This should be made clearer but the clarification needs to be done
> > > 
> > > multiple
> > > > places.
> > > > 
> > > > Possible replacement text for eighth non-code paragraph of section 2.4:
> > > > 
> > > >    The first field, co_verifier, is a client incarnation verifier,
> > > > >    allowing the server to distinguish successive incarnations (e.g.
> > > >    reboots) of the same client.  The server will start the process of
> > > >    canceling the client's leased state if co_verifier is different than
> > > >    what the server has previously recorded for the identified client (as
> > > >    specified in the co_ownerid field).
> > > > 
> > > > Likely replacement text for the seventh paragraph of this section:
> > > > 
> > > >    The eia_clientowner field is composed of a co_verifier field and a
> > > >    co_ownerid string.  As noted in Section 2.4, the co_ownerid describes
> > > >    the client, and the co_verifier is the incarnation of the client.  An
> > > >    EXCHANGE_ID sent with a new incarnation of the client will lead to
> > > >    the server removing lock state of the old incarnation.  Whereas an
> > > >    EXCHANGE_ID sent with the current incarnation and co_ownerid will
> > > >    result in an error, an update of the client ID's properties,
> > > >    depending on the arguments to EXCHANGE_ID, or the return of
> > > >    information about the existing client_id as might happen when this
> > > > >    operation is done to the same server using different network
> > > >    addresses as part of creating trunked connections.
> > 
> > Not sure what error that text was referring to above.  I think it
> > added to the confusion.
> > 
> > > 
> > > I think I get the general sense of what is going on here (i.e., the last
> > > sentence) but am still uncertain on the specifics.  Namely, "most of the
> > > > time" (TM), sending EXCHANGE_ID with current incarnation/ownerid will be
> > > > an error, since it's a client bug to try to register the same way twice
> > > > in a row.
> > 
> > 
> > No, it isn't.  This is case 2 on page 508, "Non-Update on Existing Client
> > ID".  Given retries and possible communication difficulties, it is just
> > too hard to make this case an error.
> > 
> > However, some times we might have to do that in order to update
> > > properties of the client or get some new information that a server has
> > > associated to a given client ID.  I *think* (but am not sure) that the
> > > error case is exactly when the (same-incarnation/ownerid) EXCHANGE_ID is
> > > done to the same *server and address* as the original EXCHANGE_ID, and
> > > that
> > > the "update properties or get new information back" case is exactly when
> > > the EXCHANGE_ID is done to a different server/address combination.
> > > 
> > > If I'm right about that, then I'd suggest:
> > > 
> > > %    the server removing lock state of the old incarnation.  Whereas an
> > > %    EXCHANGE_ID sent with the current incarnation and co_ownerid will
> > > %    result in an error when sent to a given server at a given address for
> > > %    a second time, it is not an error to send EXCHANGE_ID with current
> > > %    incarnation and co_ownerid to a different server (e.g., as part of a
> > > %    migration event).  In such cases, the EXCHANGE_ID can allow for an
> > > %    update of the client ID's properties, depending on the arguments to
> > > %    EXCHANGE_ID, or the return of (potentially updated) information about
> > > > %    the existing client_id, as might happen when this operation is done to
> > > %    the same server using different network addresses as part of creating
> > > %    trunked connections.
> > > 
> > 
> > I think I have to revise the paragraph above to be clearer.   I anticipate
> > replacing the seventh paragraph of section 18.35.3 by the following
> > replacement:
> > 
> >    The eia_clientowner field is composed of a co_verifier field and a
> >    co_ownerid string.  As noted in Section 2.4, the co_ownerid
> >    identifies the client, and the co_verifier specifies a particular
> >    incarnation of that client.  An EXCHANGE_ID sent with a new
> >    incarnation of the client will lead to the server removing lock state
> >    of the old incarnation.  On the other hand, an EXCHANGE_ID sent with
> >    the current incarnation and co_ownerid will, when it does not result
> >    in an unrelated error, potentially update an existing client ID's
> >    properties, or simply return information about the existing
> >    client_id.  The latter would happen when this operation is done to
> >    the same server using different network addresses as part of creating
> >    trunked connections.
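
As a rough sketch of the distinction drawn in the paragraph above, the
server-side decision keyed on co_ownerid and co_verifier might look like
this.  The dictionary of recorded verifiers and the returned labels are
illustrative assumptions; a real server tracks far more state and also has
to handle confirmation and principal checks, which are omitted here.

```python
def classify_exchange_id(known_verifiers, co_ownerid, co_verifier):
    """Classify an EXCHANGE_ID by owner string and incarnation verifier.

    `known_verifiers` maps co_ownerid -> last recorded co_verifier
    (a hypothetical, simplified view of the server's client records).
    """
    recorded = known_verifiers.get(co_ownerid)
    if recorded is None:
        # Owner string never seen before: establish a new client ID.
        return "new client"
    if recorded != co_verifier:
        # Same owner, different verifier: the client restarted, so the
        # server begins removing the old incarnation's lock state.
        return "new incarnation"
    # Same owner and verifier: not an error in itself.  Depending on the
    # arguments, update properties or just return existing information,
    # e.g. when the request arrives via another address for trunking.
    return "same incarnation"
```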
> 
> Ah, I think I get it now.  Thanks.
> 
> > 
> > > > > Section 21
> > > > > 
> > > > > Some other topics at least somewhat related to trunking and migration
> > > > > that we could potentially justify including in the current,
> > > > > limited-scope, update (as opposed to deferring for a full -bis)
> > > 
> > > include:
> > > > > 
> > > > 
> > > > Some of these are related to multi-server namespace but not related to
> > > > security, as far as I can see.
> > > 
> > > It does look like it; in some sense I was going through a brainstorming
> > > exercise to make this list, and appreciate the sanity checks.  (To be
> > > clear, I am not insisting that any of them get covered in specifically the
> > > sesqui update, just mentioning topics for potential consideration.)
> > > > 
> > > > > 
> > > > > - clients that lie about reclaimed locks during a post-migration grace
> > > > >   period
> > > > > 
> > > > 
> > > > Will address in a number of places:
> > > > 
> > > > > First of all, I intend to add a new paragraph to Section 21, to be
> > > > > placed as the sixth non-bulleted paragraph and to read as follows:
> > > > 
> > > > >    Security considerations for lock reclaim differ between the state
> > > > >    reclaim done after server failure (discussed in Section 8.4.2.1.1) and
> > > >    the per-fs state reclaim done in support of migration/replication
> > > >    (discussed in Section 11.11.9.1).
> > > > 
> > > > > Next is a proposed new section to appear as Section 11.11.9.1:
> > > > 
> > > > 11.11.9.1.  Security Consideration Related to Reclaiming Lock State
> > > >             after File System Transitions
> > > > 
> > > >    Although it is possible for a client reclaiming state to misrepresent
> > > >    its state, in the same fashion as described in Section 8.4.2.1.1,
> > > >    most implementations providing for such reclamation in the case of
> > > >    file system transitions will have the ability to detect such
> > > >    misreprsentations.  this limits the ability of unauthenicatd clients
> > > 
> > > typos: "misrepresentations", "This", "unauthenticated"
> > > 
> > 
> > Fixed.
> > 
> > 
> > > 
> > > >    to execute denial-of-service attacks in these cirsumstances.
> > > 
> > > "circumstances"
> > > 
> > > 
> > 
> > Fixed.
> > 
> > 
> > > >    Nevertheless, the rules stated in Section 8.4.2.1.1, regarding
> > > > >    principal verification for reclaim requests, apply in this situation
> > > >    as well.
> > > > 
> > > >    Typically,implementations support file system transitions will have
> > > 
> > > nits: space after comma, and "that" for "that support"
> > > 
> > > Fixed.
> > 
> > 
> > > >    extensive information about the locks to be transferred.  This is
> > > >    because:
> > > > 
> > > > >    o  Since failure is not involved, there is no need to store locking
> > > >       information in persistent storage.
> > > > 
> > > >    o  There is no need, as there is in the failure case, to update
> > > > >       multiple repositories containing locking state to keep them in sync.
> > > >       Instead, there is a one-time communication of locking state from
> > > >       the source to the destination server.
> > > > 
> > > >    o  Providing this information avoids potential interference with
> > > >       existing clients using the destination file system, by denying
> > > >       them the ability to obtain new locks during the grace period.
> > > > 
> > > >    When such detailed locking infornation, not necessarily including the
> > > >    associated stateid,s is available,
> > > 
> > > nits: "information", s/stateid,s/stateids,/
> > > 
> > 
> > Fixed.
> > 
> > 
> > > > 
> > > >    o  It is possible to detect reclaim requests that attempt to reclsim
> > > 
> > > nit: s/reclsim/reclaim/
> > > 
> > 
> > Fixed.
> > 
> > > 
> > > >       locks that did not exist before the transfer, rejecting them with
> > > >       NFS4ERR_RECLAIM_BAD (Section 15.1.9.4).
> > > > 
> > > >    o  It is possible when dealing with non-reclaim requests, to
> > > >       determine whether they conflict with existing locks, eliminating
> > > > >       the need to return NFS4ERR_GRACE (Section 15.1.9.2) on non-
> > > >       reclaim requests.
> > > > 
> > > >    It is possible for implementations of grace periods in connection
> > > >    with file system transitions not to have detailed locking information
> > > >    available at the destination server, in which case the security
> > > >    situation is exactly as described in Section 8.4.2.1.1.
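
The two checks that detailed locking information makes possible can be
sketched as below.  This is a deliberately simplified model, not how any
particular server does it: locks are whole-file (owner, fileid, exclusive)
tuples, and NFS4ERR_DENIED is assumed as the result for a conflicting
non-reclaim request.

```python
def check_lock_request(transferred, owner, fileid, exclusive, is_reclaim):
    """Decide a lock request at the destination server during grace.

    `transferred` is a set of (owner, fileid, exclusive) tuples received
    from the source server (a simplified, hypothetical representation).
    """
    if is_reclaim:
        # A reclaim is valid only if the lock existed before the transfer.
        if (owner, fileid, exclusive) in transferred:
            return "NFS4_OK"
        return "NFS4ERR_RECLAIM_BAD"
    # Non-reclaim request: no need for NFS4ERR_GRACE, since conflicts with
    # locks that may still be reclaimed can be checked directly.
    for other_owner, other_file, other_excl in transferred:
        if (other_file == fileid and other_owner != owner
                and (exclusive or other_excl)):
            return "NFS4ERR_DENIED"
    return "NFS4_OK"
```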
> > > > 
> > > > > I think I should also draw your attention to a revised Section 15.1.9.
> > > > > This includes some revisions originally done for
> > > > > draft-ietf-nfsv4-rfc5661-msns-update, which somehow got dropped, as well
> > > > > as a few that turned up as necessary in writing 11.11.9.1:
> > > > 
> > > > 15.1.9.  Reclaim Errors
> > > > 
> > > >    These errors relate to the process of reclaiming locks after a server
> > > >    restart.
> > > > 
> > > > 15.1.9.1.  NFS4ERR_COMPLETE_ALREADY (Error Code 10054)
> > > > 
> > > >    The client previously sent a successful RECLAIM_COMPLETE operation
> > > >    specifying the same scope, whether that scope is global or for the
> > > >    same file system in the case of a per-fs RECLAIM_COMPLETE.  An
> > > >    additional RECLAIM_COMPLETE operation is not necessary and results in
> > > >    this error.
> > > > 
> > > > 15.1.9.2.  NFS4ERR_GRACE (Error Code 10013)
> > > > 
> > > >    This error is returned when the server was in its recovery or grace
> > > >    period. with regard to the file system object for which the lock was
> > > 
> > > (no full stop)
> > > 
> > 
> >  Fixed.
> > 
> > 
> > > >    requested resulting in a situation in which a non-reclaim locking
> > > >    request could not be granted.  This can occur because either
> > > > 
> > > > >    o  The server does not have sufficient information about locks that
> > > > >       might be potentially reclaimed to determine whether the lock could
> > > >       validly be granted.
> > > > 
> > > >    o  The request is made by a client responsible for reclaiming its
> > > >       locks that has not yet done the appropriate RECLAIM_COMPLETE
> > > >       operation, allowing it to proceed to obtain new locks.
> > > > 
> > > >    It should be noted that, in the case of a per-fs grace period, there
> > > >    may be clients, i.e. those currently using the destination file
> > > >    system who might be unaware of the circumstances resulting in the
> > > 
> > > nit: comma after "file system"
> > > 
> > 
> > This phrase is now within parentheses.
> > 
> > > 
> > > > >    initiation of the grace period.  Such clients need to periodically
> > > >    retry the request until the grace period is over, just as other
> > > >    clients do.
> > > > 
> > > > 15.1.9.3.  NFS4ERR_NO_GRACE (Error Code 10033)
> > > > 
> > > >    A reclaim of client state was attempted in circumstances in which the
> > > >    server cannot guarantee that conflicting state has not been provided
> > > >    to another client.  This occurs if there is no active grace period
> > > > >    applying to the file system object for which the request was made, if
> > > > >    the client making the request has no current role in reclaiming
> > > >    locks, or because previous operations have created a situation in
> > > >    which the server is not able to determine that a reclaim-interfering
> > > >    edge condition does not exist.
> > > > 
> > > > 15.1.9.4.  NFS4ERR_RECLAIM_BAD (Error Code 10034)
> > > > 
> > > >    The server has determined that a reclaim attempted by the client is
> > > >    not valid, i.e. the lock specified as being reclaimed could not
> > > >    possibly have existed before the server restart or file system
> > > >    migration event.  A server is not obliged to make this determination
> > > >    and will typically rely on the client to only reclaim locks that the
> > > >    client was granted prior to restart.  However, when a server does
> > > > >    have reliable information to enable it to make this determination, this
> > > >    error indicates that the reclaim has been rejected as invalid.  This
> > > >    is as opposed to the error NFS4ERR_RECLAIM_CONFLICT (see
> > > >    Section 15.1.9.5) where the server can only determine that there has
> > > >    been an invalid reclaim, but cannot determine which request is
> > > >    invalid.
> > > > 
> > > > 15.1.9.5.  NFS4ERR_RECLAIM_CONFLICT (Error Code 10035)
> > > > 
> > > >    The reclaim attempted by the client has encountered a conflict and
> > > >    cannot be satisfied.  This potentially indicates a misbehaving
> > > >    client, although not necessarily the one receiving the error.  The
> > > >    misbehavior might be on the part of the client that established the
> > > >    lock with which this client conflicted.  See also Section 15.1.9.4
> > > >    for the related error, NFS4ERR_RECLAIM_BAD
> > > 
> > > Thanks for remembering to fetch these updates from the full bis WIP!
> > > 
> > > > 
> > > > > - how attacker capabilities compare by using a compromised server to
> > > > >   give bogus referrals/etc. as opposed to just giving bogus data/etc.
> > > > > 
> > > > 
> > > > Will address. See the paragraphs to be added to the end of Section 21.
> > > > 
> > > > 
> > > > > - an attacker in the network trying to shift client traffic (in terms
> > > 
> > > of
> > > > >   what endpoints/connections they use) to overload a server
> > > > > 
> > > > 
> > > > Will address. See the paragraphs to be added to the end of Section 21.
> > > > 
> > > > 
> > > > > - how asynchronous replication can cause clients to repeat
> > > > >   non-idempotent actions
> > > > > 
> > > > 
> > > > Not sure what you are referring to.
> > > 
> > > I don't have something fully fleshed out here, but it's in the general
> > > space when there are multiple replicas that get updates at (varying)
> > > delays from the underlying write.  A contrived situation would be if you
> > > > have a pool of worker machines that use NFS for state management (I know,
> > > > a pretty questionable idea), and try to do compare-and-set on a state
> > > > file.  If one worker tries to assert that it owns the state but other NFS
> > > > replicas see delayed updates, additional worker machines could also try
> > > > to claim the state and perform whatever operation the state file is
> > > > controlling.
> > > 
> > > Basically, the point here is that if you as an NFS consumer are using NFS
> > > with relaxed replication semantics, you have to think through how your
> > > workflow will behave in the presence of such relaxed updates.  Which ought
> > > to be obvious, when I say it like that, but perhaps is not always actually
> > > obvious.
> > > 
> > > > 
> > > > > - the potential for state skew and/or data loss if migration events
> > > > >   happen in close succession and the client "misses a notification"
> > > > > 
> > > > 
> > > > > Is there a specific problem that needs to be addressed?
> > > 
> > > > I don't have a concrete scenario that's specific to NFS, no; this is just
> > > > a generic possibility for any scheme that involves discrete updates
> > > > (e.g., file-modifying RPCs) and the potential for asynchronous
> > > > replication.
> > > 
> > 
> > I think that the necessary discussion can be folded into some clarification
> > of replication discussed in
> > another thread.
> 
> Sure.
> 
> > 
> > > > 
> > > > > - cases where a filesystem moves and there's no longer anything
> > > > > running
> > > > >   at the old network endpoint to return NFS4ERR_MOVED
> > > > > 
> > > > 
> > > > > This seems to me just a recognition that sometimes systems fail.  Not
> > > > > sure specifically what to address.
> > > 
> > > Okay.
> > > 
> > > > > - what can happen when non-idempotent requests are in a COMPOUND
> > > > > before
> > > > >   a request that gets NFS4ERR_MOVED
> > > > > 
> > > > 
> > > > Intend to address in Section 15.1.2.4:
> > > > 
> > > >    The file system that contains the current filehandle object is not
> > > > >    present at the server, or is not accessible using the network
> > > > >    address used.  It may have been made accessible on a different set of
> > > >    network addresses, relocated or migrated to another server, or it may
> > > >    have never been present.  The client may obtain the new file system
> > > >    location by obtaining the "fs_locations" or "fs_locations_info"
> > > >    attribute for the current filehandle.  For further discussion, refer
> > > >    to Section 11.3.
> > > > 
> > > >    As with the case of NFS4ERR_DELAY, it is possible that one or more
> > > > >    non-idempotent operations may have been successfully executed within a
> > > > >    COMPOUND before NFS4ERR_MOVED is returned.  Because of this, once the
> > > > >    new location is determined, the original request which received the
> > > > >    NFS4ERR_MOVED should not be re-executed in full.  Instead, a new
> > > > >    COMPOUND, with any successfully executed non-idempotent operations
> > > > >    removed, should be executed.  This new request should have a different
> > > > >    slot id or sequence in those cases in which the same session is used
> > > >    for the new request (i.e. transparent session migration or an
> > > 
> > > nit: comma after "i.e.".
> > > 
> > 
> > Fixed.
> > 
> > 
> > > 
> > > >    endpoint transition to a new address session-trunkable with the
> > > >    original one).
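
A minimal sketch of building the follow-up COMPOUND described above, under
stated assumptions: operations are represented only by name, the set of
non-idempotent operations is an illustrative subset chosen for this example,
and choosing a new slot id or sequence value for SEQUENCE is left out.

```python
# Illustrative subset only; a real client needs the full classification.
NON_IDEMPOTENT = frozenset({"REMOVE", "RENAME", "LINK", "CREATE"})


def rebuild_compound(ops, statuses, non_idempotent=NON_IDEMPOTENT):
    """Return the operations to send in the retry after NFS4ERR_MOVED.

    `ops` lists the operation names of the original COMPOUND; `statuses`
    lists the per-operation results actually returned, which stops at the
    operation that failed.
    """
    retry = []
    for i, op in enumerate(ops):
        succeeded = i < len(statuses) and statuses[i] == "NFS4_OK"
        if succeeded and op in non_idempotent:
            # Already took effect on the server; repeating it would be wrong.
            continue
        retry.append(op)
    return retry
```

For example, if a REMOVE succeeded before a later PUTFH returned
NFS4ERR_MOVED, the retry keeps SEQUENCE and both PUTFH operations but omits
the REMOVE.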
> > > > 
> > > > 
> > > > > - how bad it is if the client messes up at Transparent State Migration
> > > > >   discovery, most notably in the case when some lock state is lost
> > > > > 
> > > > 
> > > > Propose to address this by adding the following paragraph to the end of
> > > > Section 11.13.2:
> > > > 
> > > >    Lease discovery needs to be provided as described above, in order to
> > > >    ensure that migrations are discovered soon enough to ensure that
> > > > >    leases moved to new servers are discovered in time to make sure that
> > > >    leases are renewed early enough to avoid lease expiration, leading to
> > > >    loss of locking state.  While the consequences of such loss can be
> > > 
> > > nit: the double "are discovered {soon enough,in time}" is a little awkward
> > > of a construction; how about "Lease discovery needs to be provided as
> > > described above, in order to ensure that migrations are discovered soon
> > > enough that leases moved to new servers can successfully be renewed before
> > > they expire, avoiding loss of locking state"?
> > > 
> > 
> > Went with the following:
> > 
> >         Lease discovery needs to be provided as described above, in order to
> >         ensure that migrations are discovered soon enough to enable
> >         leases moved to new servers to be  appropriately renewed in order to
> >         avoid lease expiration, leading to loss of locking state.
> 
> I think maybe we should leave any further tweaks to the RFC Editor; my only
> concern is whether it can be misread as saying that the renewal leads to
> loss of locking state (which is, admittedly, a nonsensical interpretation).
> 
> > > 
> > > >    ameliorated through implementations of courtesy locks, servers are
> > > >    under no obligation to do, and a conflicting lock request may means
> > > 
> > > nit: s/means/mean/
> > > 
> > 
> > Fixed.
> > 
> > 
> > > 
> > > >    that a lock is revoked unexpectedly.  Clients should be aware of this
> > > >    possibility.
> > > > 
> > > > 
> > > > 
> > > > > - the interactions between cached replies and migration(-like) events,
> > > > >   though a lot of this is discussed in section 11.13.X and 15.1.1.3
> > > > >   already
> > > > > 
> > > > 
> > > > Will address any specifics that you feel aren't adequately addressed.
> > > 
> > > I don't remember any particular specifics, so we should probably just let
> > > this go for now.
> > > 
> > > > 
> > > > > 
> > > > > but I defer to the WG as to what to cover now vs. later.
> > > > > 
> > > > > In light of the ongoing work on draft-ietf-nfsv4-rpc-tls, it might be
> > > > > reasonable to just talk about "integrity protection" as an abstract
> > > > > thing without the specific focus on RPCSEC_GSS's integrity protection
> > > > > (or authentication)
> > > > > 
> > > > > 
> > > > 
> > > > I was initially leery of this, but when I looked at the text, I was able
> > > > to avoid referring to RPCSEC_GSS in most cases in which integrity was
> > > > mentioned :-).  The same does not seem possible for authentication :-(
> > > 
> > > We'll take the easy wins and try to not fret too much about the other
> > > stuff.
> > > 
> > > > > RPCSEC_GSS does not
> > > > > %   protect the binding from one server to another as part of a
> > > > > %   referral or migration event.  The source server must be trusted to
> > > > > %   provide the correct information, based on whatever factors are
> > > > > %   available to the client.
> > > > > 
> > > > 
> > > > These are both situations for which RPCSEC_GSS has no solution, but
> > > > neither is there another one.  It is probably best to just say that
> > > > without reference to integrity protection.
> > > 
> > > True.
> > > 
> > > > I have added new paragraphs after these bullets that may address some of
> > > > the issues you were concerned about.
> > > > 
> > > >    Even if such requests are not interfered with in flight, it is
> > > >    possible for a compromised server to direct the client to use
> > > >    inappropriate servers, such as those under the control of the
> > > >    attacker.  It is not clear that being directed to such servers
> > > >    represents a greater threat to the client than the damage that could
> > > >    be done by the compromised server itself.  However, it is possible
> > > >    that some sorts of transient server compromises might be taken
> > > >    advantage of to direct a client to a server capable of doing greater
> > > >    damage over a longer time.  One useful step to guard against this
> > > >    possibility is to issue requests to fetch location data using
> > > >    RPCSEC_GSS, even if no mapping to an RPCSEC_GSS principal is
> > > >    available.  In this case, RPCSEC_GSS would not be used, as it
> > > >    typically is, to identify the client principal to the server, but
> > > >    rather to make sure (via RPCSEC_GSS mutual authentication) that the
> > > >    server being contacted is the one intended.
> > > > 
> > > >    Similar considerations apply if the threat to be avoided is the
> > > >    direction of client traffic to inappropriate (i.e. poorly performing)
> > > >    servers.  In both cases, there is no reason for the information
> > > >    returned to depend on the identity of the client principal requesting
> > > >    it, while the validity of the server information, which has the
> > > >    capability to affect all client principals, is of considerable
> > > >    importance.
> > > 
> > > These do address some of the issues I mentioned; thank you.  I do have a
> > > couple further comments:
> > > 
> > > - I'm not sure what "no mapping to an RPCSEC_GSS principal is available"
> > >   means (but maybe that's just because I've not read the RPCSEC_GSS RFCs
> > >   recently enough)
> > > 
> > 
> > This is partly because "even if no mapping to an RPCSEC_GSS principal is
> > available" is misleading.  It would have been better to say "even if no
> > mapping to an RPCSEC_GSS principal is available for the user currently
> > obtaining the information".  The issue is not within RPCSEC_GSS itself but
> > relates to how it is used by NFSv4.  Servers are required to support
> > RPCSEC_GSS, but to use it, you need to translate a uid to an RPCSEC_GSS
> > principal.  Where that mapping is not available, as it often is not, you
> > cannot use RPCSEC_GSS for that user.
> 
> Ah, that was enough to jostle the neurons into place (and no change to the
> text needed).
> 
> Thanks again,
> 
> Ben
> 
> > > 
> > > - w.r.t. "there is no reason for the information returned to depend on the
> > >   identity of the client principal", I could perhaps imagine some setup
> > >   that uses information from a corporate contacts database to determine
> > >   the current office/location of a given user and provide a referral to a
> > >   spatially-local replica.  So "no reason" may be too absolute (but I
> > >   don't have a proposed alternative and don't object to using this text).
> > > 
> > > 
> > > [...]
> > > > > Section B.4
> > > > > 
> > > > >    o  The discussion of trunking which appeared in Section 2.10.5 of
> > > > >       RFC5661 [62] needed to be revised, to more clearly explain the
> > > > >       multiple types of trunking supporting and how the client can be
> > > > >       made aware of the existing trunking configuration.  In addition
> > > > >       the last paragraph (exclusive of sub-sections) of that section,
> > > > > >       dealing with server_owner changes, is literally true, it has
> > > > > >       been a source of confusion.  [...]
> > > > > 
> > > > > nit: the grammar here is weird; I think there's a missing "while" or
> > > > > similar.
> > > > 
> > > > > 
> > > > 
> > > >  Anticipate using the following replacement text:
> > > > 
> > > >   o  The discussion of trunking which appeared in Section 2.10.5 of
> > > >       RFC5661 [62] needed to be revised, to more clearly explain the
> > > >       multiple types of trunking supporting and how the client can be
> > > 
> > > nit: just "trunking support" (not "-ing")?
> > > 
> > 
> > Fixed.
> > 
> > 
> > > 
> > > >       made aware of the existing trunking configuration.  In addition,
> > > >       while the last paragraph (exclusive of sub-sections) of that
> > > >       section, dealing with server_owner changes, is literally true, it
> > > >       has been a source of confusion.  Since the existing paragraph can
> > > >       be read as suggesting that such changes be dealt with non-
> > > >       disruptively, the issue needs to be clarified in the revised
> > > >       section, which appears in Section 2.10.5.
> > > 
> > > Thanks again for going through my giant pile of comments; I hope that you
> > > think the improvements to the document are worth the time spent.
> > > 
> > 
> > I do.
> > 
> > 
> > > -Ben
> > > 
-- 
Cheers

Magnus Westerlund 


----------------------------------------------------------------------
Networks, Ericsson Research
----------------------------------------------------------------------
Ericsson AB                 | Phone  +46 10 7148287
Torshamnsgatan 23           | Mobile +46 73 0949079
SE-164 80 Stockholm, Sweden | mailto: magnus.westerlund@ericsson.com
----------------------------------------------------------------------