Re: [nfsv4] RFC 1813 - NFS questions on write verifier

"Mehta, Viral" <Viral.Mehta@dell.com> Sun, 24 June 2018 10:16 UTC

From: "Mehta, Viral" <Viral.Mehta@dell.com>
To: Trond Myklebust <trondmy@gmail.com>, Rick Macklem <rmacklem@uoguelph.ca>
CC: "Noveck, David" <David.Noveck@netapp.com>, "brent.callaghan@eng.sun.com" <brent.callaghan@eng.sun.com>, "Pawlowski, Brian" <Brian.Pawlowski@netapp.com>, "peter.staubach@eng.sun.com" <peter.staubach@eng.sun.com>, Michael R Eisler <mike@eisler.com>, IETF NFSv4 WG Mailing List <nfsv4@ietf.org>
Date: Sun, 24 Jun 2018 10:15:46 +0000
Message-ID: <8A984F97D7F55B489CC0E1B7A89286E00143A93A@MX301CL03.corp.emc.com>
References: <8A984F97D7F55B489CC0E1B7A89286E001439464@MX301CL03.corp.emc.com> <BN6PR06MB3059712FC9E2E2C92EF19525E1760@BN6PR06MB3059.namprd06.prod.outlook.com> <YTOPR0101MB095343177ADC5D308DFF3394DD760@YTOPR0101MB0953.CANPRD01.PROD.OUTLOOK.COM> <F9279925-8DD6-4D35-AF93-9863E3FBF9C1@gmail.com>
In-Reply-To: <F9279925-8DD6-4D35-AF93-9863E3FBF9C1@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/99Ow2muMylXKWd9lzi9_BX2LJDY>
Subject: Re: [nfsv4] RFC 1813 - NFS questions on write verifier

I agree, and Linux has definitely interpreted it sensibly here.

To some extent, even keeping the write verifier cookie on a per-mount basis is just one more interpretation of
"cookie must be consistent during a single boot session", given that the same NFS server can serve multiple exports.

I have opened errata against RFC 7530 and RFC 5661 to make this a bit clearer for server implementations.

Thanks,
Viral

-----Original Message-----
From: Trond Myklebust [mailto:trondmy@gmail.com] 
Sent: Thursday, June 21, 2018 9:35 PM
To: Rick Macklem
Cc: Noveck, David; Mehta, Viral; brent.callaghan@eng.sun.com; Pawlowski, Brian; peter.staubach@eng.sun.com; Michael R Eisler; IETF NFSv4 WG Mailing List
Subject: Re: [nfsv4] RFC 1813 - NFS questions on write verifier

I’m not aware of anything in the current specs that states that the client is allowed to assume that the writeverf is a global constant, which is why the Linux kernel tracks it on a per-file basis. However, it is clear from the semantics of COMMIT that the writeverf is required to be at least global to the file, since the client is required to match the cookies returned by each WRITE to those returned by other WRITEs and the final COMMIT.
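
To make the per-file matching concrete, here is a minimal sketch of a client that remembers the verifier returned with each UNSTABLE WRITE and compares it with the verifier returned by COMMIT. The class and method names are hypothetical and are not taken from any real client; the sketch only illustrates the comparison described above.

```python
from dataclasses import dataclass, field


@dataclass
class FileWriteState:
    """Uncommitted (UNSTABLE) writes for ONE file (hypothetical helper)."""

    # offset -> (data, verifier returned in the WRITE reply)
    pending: dict = field(default_factory=dict)

    def record_write(self, offset: int, data: bytes, verifier: bytes) -> None:
        # Keep the data until COMMIT confirms it reached stable storage.
        self.pending[offset] = (data, verifier)

    def on_commit(self, commit_verifier: bytes) -> list:
        """Return the offsets that must be written again.

        A write whose verifier equals the COMMIT verifier is considered
        safely committed and is dropped; any other write is kept for resend.
        """
        resend = sorted(
            off for off, (_, verf) in self.pending.items()
            if verf != commit_verifier
        )
        self.pending = {off: self.pending[off] for off in resend}
        return resend
```

With this state kept per file, a verifier that differs between two files never triggers a resend; only a mismatch within the same file does.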

That said, if there are clients such as FreeBSD out there that assume the verifier is global, then perhaps we should introduce errata to RFC5661 and RFC7530 that clarify that this is now a requirement upon server implementations.
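
The difference between the two client strategies can be sketched as follows. This is an illustration under stated assumptions, not the behaviour of any particular client: the function name, the `scope` parameter, and the reply format are all invented for the example. It counts how many times a client would conclude that the server restarted, given a sequence of WRITE replies.

```python
def apparent_restarts(replies, scope):
    """Count verifier changes a client would observe.

    replies: list of (file_id, verifier) tuples in arrival order.
    scope:   "mount" compares each verifier with the last one seen on the
             whole mount; any other value compares it with the last one
             seen for the same file.
    """
    last = {}
    changes = 0
    for file_id, verifier in replies:
        key = "mount" if scope == "mount" else file_id
        if key in last and last[key] != verifier:
            changes += 1  # the client would treat this as a server restart
        last[key] = verifier
    return changes


# A server that (hypothetically) returns a different verifier per file:
replies = [("f1", b"A"), ("f2", b"B"), ("f1", b"A"), ("f2", b"B")]
print(apparent_restarts(replies, "mount"))  # every switch between files counts
print(apparent_restarts(replies, "file"))   # verifiers are stable per file
```

A client that tracks the verifier per mount would redo writes on every switch between files here, even though no data was lost, which is why a server-side requirement may be needed.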

> On Jun 21, 2018, at 08:24, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> 
> Just a data point...The FreeBSD client keeps the write verifier on a 
> per-mount basis and assumes that it will be the same for all Writes 
> unless the server has rebooted and requires the Writes to be redone.
> 
> rick
> 
> ________________________________________
> From: nfsv4 <nfsv4-bounces@ietf.org> on behalf of Noveck, David 
> <David.Noveck@netapp.com>
> Sent: Thursday, June 21, 2018 7:37:47 AM
> To: Mehta, Viral; brent.callaghan@eng.sun.com; Pawlowski, Brian; 
> peter.staubach@eng.sun.com; mike@eisler.com
> Cc: nfsv4@ietf.org
> Subject: Re: [nfsv4] RFC 1813 - NFS questions on write verifier
> 
> First of all, I’ve cc’d the v4 working group, which is a good idea in general when you ask questions about these specs, especially since most of the email addresses on your to-list are probably no longer active (e.g. anything ending in eng.sun.com).
> 
> 
>  *   RFC 7530 is a bit more clear,
> 
> I think it is less clear.  RFC1813 says you can’t do what you want to do.  (It says “must be consistent” rather than “MUST be consistent”, but RFC1813 is not a standards-track document.)  In any case, the write verifier handling was not designed to support what you want to do, even though client implementations might well work as you would wish.
> 
> 
>  *   The server returns a write verifier upon successful completion of the
>  *   COMMIT.  The write verifier is used by the client to determine if the
>  *   server has restarted or rebooted between the initial WRITE(s) and the
>  *   COMMIT.
> 
> Note that it says “the server”, implying there is just one.
> 
>  *   The client does this by comparing the write verifier
>  *   returned from the initial writes and the verifier returned by the
>  *   COMMIT operation.
> 
> “The write verifier” (singular) “returned from the initial writes” 
> only makes sense if the restriction specified in RFC1813 is adhered to here as well.
> 
> 
>  *   So, the client would compare COMMIT verifier with Previous WRITE verifier.
> 
> Except that there are multiple previous WRITE verifiers.
> 
>  *   And that allows SERVER to send “different” VERIFIER on different WRITE Operations
> 
> It can send them, but it is not clear how clients are supposed to deal with them since the assumption (in both specs) is that a single server will return the same one on successive writes.
> 
> From: Mehta, Viral <Viral.Mehta@dell.com>
> Sent: Thursday, June 21, 2018 4:50 AM
> To: brent.callaghan@eng.sun.com; Pawlowski, Brian 
> <Brian.Pawlowski@netapp.com>; peter.staubach@eng.sun.com; Noveck, 
> David <David.Noveck@netapp.com>; mike@eisler.com
> Subject: RE: RFC 1813 - NFS questions on write verifier
> 
> RFC 7530 is a bit more clear,
> 
> The server returns a write verifier upon successful completion of the
>   COMMIT.  The write verifier is used by the client to determine if the
>   server has restarted or rebooted between the initial WRITE(s) and the
>   COMMIT.  The client does this by comparing the write verifier
>   returned from the initial writes and the verifier returned by the
>   COMMIT operation.
> 
> 
> So, the client would compare COMMIT verifier with Previous WRITE verifier.
> And that allows SERVER to send “different” VERIFIER on different WRITE Operations.
> 
> From: Mehta, Viral
> Sent: Tuesday, June 12, 2018 10:32 AM
> To: 'brent.callaghan@eng.sun.com'; 'beepy@netapp.com'; 'peter.staubach@eng.sun.com'
> Cc: 'David.Noveck@netapp.com'; 'mike@eisler.com'
> Subject: RFC 1813 - NFS questions on write verifier
> 
> Hi,
> 
> I have a question on NFSv3 server behaviour for unstable WRITEs in a clustered environment.
> 
> As we know,
> the NFS client sends a WRITE op with a stable or unstable flag. If the client 
> sends a WRITE op with the UNSTABLE flag and the NFS server responds with the 
> same flag, the server also sends a VERIFIER cookie. Later, when the NFS server receives the COMMIT operation, it sends the same VERIFIER cookie back, so the NFS client is happy.
> 
> If, in between the WRITE and COMMIT operations, the NFS server crashed and 
> rebooted, it sends a new VERIFIER (most implementations choose to send the boot time of the NFS server), so the client performs the WRITE again and is again happy.
> 
> It all works if there is a single node where the NFS server is running. What if the NFS server is clustered or running in a scale-out environment? I want to understand how this VERIFIER is maintained.
> 
> For example, suppose one node “N1” in the scale-out cluster receives an NFS WRITE request, but the UNSTABLE WRITE actually happens on another node “N2”,
> i.e., node “N1” forwards the WRITE request to node “N2” because the data lives on node “N2”.
> Now, if node “N2” crashes before it receives the COMMIT, which VERIFIER is 
> sent back to the client? RFC 1813 says that
> 
> 
> “This cookie must be
>         consistent during a single boot session and must be
>         unique between instances of the NFS version 3 protocol
>         server where uncommitted data may be lost.
> “
> 
> Since there may not be a single NFS server instance, and an NFS server process may be running on every scale-out node, what is the verifier cookie that is sent back?
> And when is that verifier cookie updated?
> 
> PS – It looks like real NFS client implementations track the cookie on a per-WRITE basis and not on a per-NFS-server-instance basis as described in the RFC.
>         But when implementing an NFS server, is it fair to rely on that behaviour in the above scenario?
> 
> 
> Thanks,
> Viral
> 
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www.ietf.org/mailman/listinfo/nfsv4
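
The N1/N2 scenario from the original question can be sketched in a few lines. This is only one possible answer and is not prescribed by RFC 1813, RFC 5661, or RFC 7530: the front-end node returns the verifier of whichever node holds the uncommitted data, and that verifier is derived from the owner's boot time. All class names, method names, and boot-time values are hypothetical.

```python
import struct


class Node:
    """One cluster node; its verifier is derived from an assumed boot time."""

    def __init__(self, name: str, boot_time: int):
        self.name = name
        self.boot_time = boot_time
        self.unstable = {}  # (file_id, offset) -> data not yet on stable storage

    @property
    def verifier(self) -> bytes:
        # NFSv3 write verifiers are 8 opaque bytes; packing the boot time
        # is the common choice mentioned in the thread.
        return struct.pack(">Q", self.boot_time)

    def write_unstable(self, file_id, offset, data) -> bytes:
        self.unstable[(file_id, offset)] = data
        return self.verifier

    def commit(self, file_id) -> bytes:
        # "Flush" this file's cached data, then report the current verifier.
        for key in [k for k in self.unstable if k[0] == file_id]:
            del self.unstable[key]
        return self.verifier

    def crash_and_reboot(self, new_boot_time: int) -> None:
        self.unstable.clear()            # uncommitted data is lost
        self.boot_time = new_boot_time   # so the verifier must change


class FrontEnd:
    """Node (like N1) that receives requests and forwards them to the owner."""

    def __init__(self, owners):
        self.owners = owners  # file_id -> Node that stores the file's data

    def write(self, file_id, offset, data) -> bytes:
        return self.owners[file_id].write_unstable(file_id, offset, data)

    def commit(self, file_id) -> bytes:
        return self.owners[file_id].commit(file_id)


n2 = Node("N2", boot_time=1000)
n1 = FrontEnd({"f": n2})
v_write = n1.write("f", 0, b"data")
n2.crash_and_reboot(new_boot_time=1500)
v_commit = n1.commit("f")
print(v_write != v_commit)  # the client sees a mismatch and rewrites the data
```

Note the trade-off: with more than one owner node, different files can carry different verifiers, which suits clients that track the verifier per file but not clients that assume one verifier per mount.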