Re: [nfsv4] Progressing RFC errata for RFC 5661

Rick Macklem <rmacklem@uoguelph.ca> Thu, 19 September 2019 16:51 UTC

From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
CC: NFSv4 <nfsv4@ietf.org>
Date: Thu, 19 Sep 2019 16:51:13 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/7DhWjtUEeNgiB8DN4L3MrA79txo>

Rick Macklem wrote:
>Mkrtchyan, Tigran wrote:
>>----- Original Message -----
>>> From: "Rick Macklem" <rmacklem@uoguelph.ca>
>>> To: "Trond Myklebust" <trondmy@gmail.com>
>>> Cc: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>, "Dave Noveck" <davenoveck@gmail.com>, "Magnus Westerlund"
>>> <magnus.westerlund@ericsson.com>, "NFSv4" <nfsv4@ietf.org>
>>> Sent: Thursday, September 19, 2019 6:07:32 AM
>>> Subject: Re: [nfsv4] Progressing RFC errata for RFC 5661
>>
>>> Trond Myklebust wrote:
>>>>Rick,
>>>>
>>>>That errata predates most of the Linux pNFS client implementation. We
>>>>wrote the implementation to conform to the errata.
>>>>
>>>>So no. It's not a bug. It's a deliberate design based on a decision
>>>>that was discussed in the IETF WG, on the mailing list
>>>>
>>>>https://mailarchive.ietf.org/arch/msg/nfsv4/_KTtO6uz-MvRoStbhPuOXWZr6yI
>>> This actually appears to be a discussion related to the offset and length
>>> arguments for LayoutCommit, but...
>>>>
>>>>and in a special session of the IETF:
>>>>
>>>>https://mailarchive.ietf.org/arch/msg/nfsv4/Rpw9XCwCARxfaU4ym5L2TauV6ao
>>> Ok. This was long before I got around to implementing it, so I wouldn't have
>>> understood the implications.
>>> --> I would have been interested in hearing the rationale behind not doing
>>>      LayoutCommit for FILE_SYNC4 writes, since it seems to me that RFC-5661
>>>      had gotten it right when it required them.
>>
>>In case of a cluster filesystem back-end, FILE_SYNC4 on write indicates that
>>MDS already has the correct file attributes. An extra LAYOUTCOMMIT will
>>introduce additional overhead.
>>I can imagine a HPC workload where client talks to DS over InfiniBand and
>>Ethernet to MDS. An extra LAYOUTCOMMIT will drop write throughput.
>Ok, I'll assume you have a server which needs LayoutCommit for UNSTABLE4
>writes, but doesn't need one for FILE_SYNC4 writes.
>(If the server never needs LayoutCommit, it can simply do what I believe
> the Netapp filer does, which is reply NFS4ERR_NOTSUPP.)
>
>I agree that this could result in extra overhead, but at least for some cases
>the client could put the LayoutCommit in the same compound as something
>like Close (or Commit if the server does commit-through-mds), which would
>avoid an extra RPC RTT.
>Presumably the server would know it didn't need to do anything and could
>just reply NFS4_OK for the operation in the FILE_SYNC4 case?
>
>But, yes, this is a case where using the Errata might improve performance.
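
To make the trade-off above concrete, here is a minimal sketch of how a client might build the compound it sends to the MDS at close time. It is an illustration only: the function name close_compound and the follow_errata switch are hypothetical, the operation list is reduced to names, and the COMMIT to the DS that UNSTABLE4 writes would need first is left out.

```python
from enum import IntEnum


class StableHow(IntEnum):
    # stable_how4 values as defined for WRITE in RFC 5661
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


def close_compound(ds_write_results, follow_errata):
    """Return the operation names for the close-time compound sent to the MDS.

    ds_write_results: stable_how4 values the DS returned for WRITEs done
                      since the last LAYOUTCOMMIT (hypothetical bookkeeping).
    follow_errata:    if True, skip LAYOUTCOMMIT when every WRITE came back
                      FILE_SYNC4 (the behaviour the errata describes).
    """
    ops = ["SEQUENCE", "PUTFH"]
    wrote = len(ds_write_results) > 0
    all_file_sync = all(r == StableHow.FILE_SYNC4 for r in ds_write_results)
    if wrote and not (follow_errata and all_file_sync):
        # Piggyback LAYOUTCOMMIT on the same compound as CLOSE, so it
        # costs no extra RPC round trip.
        ops.append("LAYOUTCOMMIT")
    ops.append("CLOSE")
    return ops
```

With follow_errata set to False the LAYOUTCOMMIT always rides along with CLOSE after any write; with it set to True the operation is dropped only when every DS reply was FILE_SYNC4.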
>
>For NFSv4.2, I think a new recommended attribute could be added, so that
>a server could indicate to a client when it wanted a LayoutCommit.
>(I think the NFSv4.2 versioning rules would allow this to be added?)
>I'm not sure how the client would "handshake" with the server, acknowledging
>that it understood the new attribute, though?
Duh. I realized that the "handshake" would simply be the client getting the
attribute.
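
A rough sketch of that idea, from the server side: the MDS could treat a GETATTR that asks for the new attribute as the client's acknowledgement. Everything here is hypothetical, including the attribute name layoutcommit_wanted and the class MdsSketch; no such attribute exists in NFSv4.2 today.

```python
class MdsSketch:
    """Toy model of an MDS that tracks which clients asked for the attribute."""

    # Hypothetical attribute name, not defined by any NFSv4 RFC.
    HYPOTHETICAL_ATTR = "layoutcommit_wanted"

    def __init__(self, wants_layoutcommit):
        self.wants_layoutcommit = wants_layoutcommit
        self.aware_clients = set()

    def getattr(self, client_id, attr_names):
        """Answer a GETATTR; requesting the attribute marks the client as aware."""
        reply = {}
        if self.HYPOTHETICAL_ATTR in attr_names:
            self.aware_clients.add(client_id)
            reply[self.HYPOTHETICAL_ATTR] = self.wants_layoutcommit
        return reply

    def client_understands(self, client_id):
        """True once the client has fetched the attribute at least once."""
        return client_id in self.aware_clients
```

A client that never requests the attribute stays out of aware_clients, so the server can fall back to whatever it does today for such clients.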

>It is tempting to add NFL4_UFLG_FILE_SYNC4_LAYOUT_COMMIT, but that
>would probably cause problems for extant implementations that don't
>expect the flag bit (returning an error instead of ignoring it, etc).

rick



>
> As I said, the FreeBSD server can handle this case, it just results in a lot of
> overhead synchronizing Size, Change, Time_Modify between MDS and DS
> whenever a RW layout is issued to a client for the file.
>
> Thanks for pointing this out, rick
>
> On Wed, 18 Sep 2019 at 12:39, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>>
>> Mkrtchyan, Tigran wrote:
>> [stuff snipped]
>> >Hi Rick,
>> >
>> >here is the public link to errata
>> >
>> >https://www.rfc-editor.org/errata/eid2751
>> >
>> >Tigran.
>> Thanks Tigran.
>>
>> Ok, so now that I've read it I have to admit I think it is rewriting the RFC
>> to conform with what the Linux client does.
>>
>> I think this para. from Sec. 13.10 of RFC-5661 is clear:
>>    The NFSv4.1 protocol only provides close-to-open file data cache
>>    semantics; meaning that when the file is closed, all modified data is
>>    written to the server.  When a subsequent OPEN of the file is done,
>>    the change attribute is inspected for a difference from a cached
>>    value for the change attribute.  For the case above, this means that
>>    a LAYOUTCOMMIT will be done at close (along with the data WRITEs) and
>>    will update the file's size and change attribute.  Access from
>>    another client after that point will result in the appropriate size
>>    being returned.
>>
>> It states "will be done". It doesn't say anything about UNSTABLE4 vs FILE_SYNC4.
>> (I think most POSIX-like clients would consider the fsync(2) syscall to require
>>  the same treatment as "close" above, but that is a POSIX-specific client issue.)
>> I can see the argument that, since there is no "must" in the statement, a
>> client can choose not to do this, but that would also imply that the client will
>> need to live with the consequences of it.
>>
>> I think the second sentence of the first para. of the errata is bogus:
>> For file layouts, WRITEs to a Data Server that return a stable_how4 value of
>> FILE_SYNC4 guarantee that data and file system metadata are on stable
>> storage.  This means that a LAYOUTCOMMIT is not needed in order to make the
>> data and metadata visible to the metadata server and other clients.
>>
>> Why?
>> The FILE_SYNC4 was returned by the DS. This would imply the DS
>> has committed data and metadata to stable storage on the DS.
>> However, I am not aware of anything in RFC-5661 that would imply that
>> the Size, Time_Modify and Change attributes or anything else must have
>> been updated or in stable storage on the MDS at this time.
>>
>> If a server does not require LayoutCommit operations for correct behaviour
>> then it can simply reply NFS4ERR_NOTSUPP (as I believe the Netapp filer
>> does) and the client then no longer needs to do them.
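
The client side of that NFS4ERR_NOTSUPP exchange could look roughly like the sketch below. The class name LayoutCommitPolicy and its methods are hypothetical; only the two status values come from the protocol.

```python
# Status codes as defined for NFSv4.
NFS4_OK = 0
NFS4ERR_NOTSUPP = 10004


class LayoutCommitPolicy:
    """Remembers, per server, whether LAYOUTCOMMIT is worth sending."""

    def __init__(self):
        # Assume the server wants LAYOUTCOMMIT until it says otherwise.
        self.server_supports = True

    def should_send(self):
        return self.server_supports

    def record_reply(self, status):
        """Stop sending LAYOUTCOMMIT once the server replies NFS4ERR_NOTSUPP."""
        if status == NFS4ERR_NOTSUPP:
            self.server_supports = False
```

After the first NFS4ERR_NOTSUPP reply the client pays no further cost for that server, while a server that does need LayoutCommit keeps receiving it.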
>>
>> If there is somewhere in RFC-5661 that it is stated that LayoutCommits are
>> not required when the DS replies FILE_SYNC4, then I missed it and there
>> is a problem with RFC-5661 that needs to be addressed.
>>
>> Otherwise, sorry, but it seems that the bug is in the Linux client
>> implementation and not RFC-5661.
>>
>> Is there a File Layout pNFS server implementation where the DSs return
>> FILE_SYNC4 that will break if the client does a LayoutCommit for this case?
>> (If so, then something may need to be done.)
>>
>> rick
>>
>>
