Re: [nfsv4] RFC 7530: Filehandle of opened file after the REMOVE

Trond Myklebust <trondmy@gmail.com> Tue, 27 December 2016 21:00 UTC

From: Trond Myklebust <trondmy@gmail.com>
Date: Tue, 27 Dec 2016 22:00:18 +0100
In-Reply-To: <C496AE44-0F27-4B66-A1F6-A76AEAFD7A90@gmail.com>
To: Dave Noveck <davenoveck@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/TLqMqHVc0mfuHPbrejie-UxJpJ4>
Cc: Bruce James Fields <bfields@fieldses.org>, IETF NFSv4 WG Mailing List <nfsv4@ietf.org>
Subject: Re: [nfsv4] RFC 7530: Filehandle of opened file after the REMOVE

> On Dec 27, 2016, at 21:24, Trond Myklebust <trondmy@gmail.com> wrote:
> 
>> 
>> On Dec 27, 2016, at 19:13, David Noveck <davenoveck@gmail.com> wrote:
>> 
>> Bruce wrote:
>> 
>> > I think this is what OPEN4_RESULT_PRESERVE_UNLINKED in 5661 was meant to
>> > do.  
>> 
>> In fact it was.  See the sixth bullet in section 1.8 of RFC5661.
>> 
>> The problem I see is that it doesn't solve the problem that Marcel has pointed out.  If client A
>> opens the file and client B removes it, there is no way it can decide whether or not to do a 
>> silly rename.  Since he hasn't opened the file, he doesn't see the flag indicating it is not 
>> needed.  Also, since the file is not open he has no indication that it is needed either.  So client
>> B will do a REMOVE in all cases.  
> 
> As far as I can tell, Marcel is mainly interested in NFSv4.0. RFC5661 does not address any of the cases he was describing.
> 
>> 
>> Server implementations may or may not preserve the file until last close.  For those that do, 
>> everything will work out OK, while for those that don't, there isn't much that the client can 
>> do.  If he did find out the server did not have the support, he has no way to defer the remove 
>> until last close because he has no way of finding out when this happens.
>> 
>> Trond wrote:
>> 
>> > There are 4 main problems that are inadequately discussed in any of the existing RFCs, 
>> > and that you need to address before we can consider replacing sillyrename 
>> 
>> It appears it has already been considered and spec'd but the spec may not be
>> adequate.  Sigh!
>> 
>> > (which is well established today, and well understood by most users).
>> 
>> Most people understand why it works when it does but don't understand why it doesn't
>> work in some cases.  So they are happy until they aren’t.
> 
> Umm…. If the solution works with unlink-on-close then you all do realize that it MUST always also work with sillyrename, right? Having a third party client remove a file named ‘.nfsXXXXX’ is just a special case of having it remove a file with a generic name.
> 
> IOW: I could argue that the problem here can be considered to be purely a server problem and irrelevant to the actual delete strategy chosen by the client.

Correction: most of the problem is orthogonal to the client delete strategy. The one place where the client can put its foot in its mouth is ‘rm -rf’, where the sillyrename does prevent the removal of the directory from succeeding.
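
To make the ‘rm -rf’ case concrete, here is a minimal sketch in Python of a toy in-memory directory. The ToyDir class, its method names and the ‘.nfs0001’ naming are assumptions made up for illustration; they do not mirror any real client. The point is only that the sillyrenamed entry keeps the directory non-empty until the last close.

```python
import errno


class ToyDir:
    """Toy in-memory directory; a hypothetical model, not a real NFS client."""

    def __init__(self):
        self.entries = {}      # name -> file id
        self.open_ids = set()  # file ids this client holds open
        self._counter = 0

    def create(self, name, file_id):
        self.entries[name] = file_id

    def open(self, name):
        self.open_ids.add(self.entries[name])

    def unlink(self, name):
        file_id = self.entries.pop(name)
        if file_id in self.open_ids:
            # Sillyrename: keep the file reachable under a hidden name
            # instead of sending REMOVE while it is still open.
            self._counter += 1
            self.entries[".nfs%04d" % self._counter] = file_id

    def close(self, file_id):
        self.open_ids.discard(file_id)
        # On last close the client removes the sillyrenamed entry.
        for name, fid in list(self.entries.items()):
            if fid == file_id and name.startswith(".nfs"):
                del self.entries[name]

    def rmdir(self):
        if self.entries:
            raise OSError(errno.ENOTEMPTY, "Directory not empty")


def rm_rf(d):
    """Unlink every entry once, then try to remove the directory."""
    for name in list(d.entries):
        d.unlink(name)
    d.rmdir()
```

With one file still open, rm_rf() raises ENOTEMPTY because ‘.nfs0001’ is left behind; after close() the rmdir() goes through.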

> 
>> > 1) How does the client identify that the server supports this functionality?
>> 
>> The problem is that the client who doesn't open the file, the troublesome case, is exactly the one in which
>> he receives no information.  The client who does the open finds out silly rename is not needed, but, in
>> this case, silly rename seems to work OK.
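
A rough sketch of the decision described above, as I read it. The flag value is the one defined for OPEN in RFC 5661, section 18.16; the delete_strategy() helper and its string return values are hypothetical and exist only to show that a client with no OPEN of its own has no rflags to consult.

```python
# Value as defined for the OPEN result flags in RFC 5661, section 18.16.
OPEN4_RESULT_PRESERVE_UNLINKED = 0x00000008


def delete_strategy(open_rflags):
    """Pick a delete strategy for a file (hypothetical helper).

    open_rflags is the rflags word from this client's own OPEN reply,
    or None if this client never opened the file.
    """
    if open_rflags is None:
        # No OPEN of our own: no flag was ever seen and no open is known,
        # so there is nothing to base a decision on. Send REMOVE.
        return "REMOVE"
    if open_rflags & OPEN4_RESULT_PRESERVE_UNLINKED:
        # The server indicated it keeps the file until last close.
        return "REMOVE"
    # We hold the file open and the server made no such promise.
    return "SILLYRENAME"
```

Client B in the example is always in the first branch, which is why the flag does not help it.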
>> 
>> > 2) What functionality is needed on the underlying filesystems on the server?
>> 
>> Almost all filesystems have the ability to defer deletion until the last close.  If they don't, they
>> are not very usable.  Servers which don't support this are not suffering from a lack of filesystem
>> functionality.  Instead, the problem seems to be that some servers have an NFSv4 open which 
>> either doesn't connect to the underlying open functionality or there is no underlying open functionality.
> 
> Sure, but in order to be reboot safe, you need to go well beyond the functionality that you describe. As far as I can tell, you need to defer the garbage collection that would normally occur when the server comes back up again until after the state reclaim has occurred (as you describe below).
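
A minimal sketch of the reboot ordering I have in mind, in Python. ToyServer and all of its methods are assumptions for illustration, not a description of any existing server: the set of unlinked-but-open files survives the restart, and garbage collection only runs once the grace period has ended and reclaims are in.

```python
class ToyServer:
    """Hypothetical model of a server that preserves unlinked open files."""

    def __init__(self):
        self.unlinked = set()  # removed from the namespace; assumed persistent
        self.opens = {}        # file id -> set of client ids; lost on reboot
        self.in_grace = False

    def open(self, client, file_id):
        self.opens.setdefault(file_id, set()).add(client)

    def remove(self, file_id):
        if self.opens.get(file_id):
            # Still open somewhere: defer the actual deletion.
            self.unlinked.add(file_id)
            return "deferred"
        return "deleted"

    def reboot(self):
        # Open state is volatile. The unlinked set is deliberately NOT
        # garbage collected here, or reclaims would find nothing.
        self.opens = {}
        self.in_grace = True

    def reclaim(self, client, file_id):
        if not self.in_grace:
            raise RuntimeError("reclaim outside grace period")
        self.open(client, file_id)

    def end_grace(self):
        # Only now collect unlinked files that nobody reclaimed.
        self.in_grace = False
        collected = {f for f in self.unlinked if not self.opens.get(f)}
        self.unlinked -= collected
        return collected
```

If reboot() collected the unlinked set immediately, as a normal orphan cleanup would, the reclaim for a removed file would have nothing to attach to.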
> 
>> 
>> > 3) How does the server function in the case of a reboot? What can the client expect in terms of recoverability?
>> 
>> This is addressed by section 18.16.3 of RFC5661.  The third bullet on page 449 says:
>> 
>> Furthermore, the server promises to preserve the file through the grace period after server restart, thereby giving the client the opportunity to reclaim its open.
> 
> Again, this is specified for NFSv4.x (x>0) but not for x=0. It is not clear to me that older NFSv4 servers will have any of this functionality.
> 
>> 
>> > 4) How does the client in practice perform recovery?
>> >   a) In the case of server reboots.
>> 
>> I think he just recovers his opens within the grace period.
>> 
>> >   b) In the case of lease timeouts/network partitions
>> 
>> This case is not addressed in RFC5661, so far as I can see.  The typical handling, in which
>> the locks are all dropped, would cause the file to go away.  Courtesy locks could address the
>> problem, but the spec doesn't require that they be implemented.
>> 
>> > Note that the lack of open-by-filehandle in NFSv4.0 makes 4.b) more difficult than it should otherwise be.
>> 
>> True. In v4.1, a client can try an open-by-filehandle and take advantage of any courtesy locks that the server 
>> has retained.
>> 
>> On Tue, Dec 27, 2016 at 9:44 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
>> On Wed, Dec 14, 2016 at 09:28:41AM -0500, Trond Myklebust wrote:
>> > On Wed, Dec 14, 2016 at 6:21 AM, Marcel Telka <marcel@telka.sk> wrote:
>> >
>> > > Quoting David Noveck <davenoveck@gmail.com>:
>> > >
>> > >> It appears that you want an informational document saying, more or less:
>> > >>
>> > >>    - If the server does not want clients to be discomfited by open files
>> > >>    being removed, since such behavior is disallowed by typical OS
>> > >>    (e.g. Unix) semantics, the server can avoid this situation by delaying
>> > >>    the actual removal of the file until last close, as allowed by RFC7530.
>> > >>    - The use of rename by clients as a substitute for remove, normally
>> > >>    known as "silly rename", has significant problems, since removes can
>> > >>    happen on nodes that do not have the file open.
>> > >>
>> > >> If this is what you want, then you can write an I-D and submit it.
>> > >>
>> > >
>> > > Yes, this is exactly what I want to see.
>> > >
>> > >
>> > There are 4 main problems that are inadequately discussed in any of the
>> > existing RFCs, and that you need to address before we can consider
>> > replacing sillyrename (which is well established today, and well understood
>> > by most users).
>> >
>> > 1) How does the client identify that the server supports this functionality?
>> 
>> I think this is what OPEN4_RESULT_PRESERVE_UNLINKED in 5661 was meant to
>> do.  It'd be interesting to try an implementation and see if your other
>> points are addressed, but I haven't thought about it in a long time.
>> 
>> --b.
>> 
>> > 2) What functionality is needed on the underlying filesystems on the server?
>> > 3) How does the server function in the case of a reboot? What can the
>> > client expect in terms of recoverability?
>> > 4) How does the client in practice perform recovery?
>> >    a) In the case of server reboots.
>> >    b) In the case of lease timeouts/network partitions
>> >
>> > Note that the lack of open-by-filehandle in NFSv4.0 makes 4.b) more
>> > difficult than it should otherwise be.
>> 
>> > _______________________________________________
>> > nfsv4 mailing list
>> > nfsv4@ietf.org
>> > https://www.ietf.org/mailman/listinfo/nfsv4
>> 