Re: [nfsv4] RFC 7530: Filehandle of opened file after the REMOVE

Christoph Hellwig <> Sun, 01 January 2017 13:48 UTC

Date: Sun, 01 Jan 2017 14:48:46 +0100
From: Christoph Hellwig <>
To: David Noveck <>
Cc: Bruce James Fields <>, Christoph Hellwig <>, IETF NFSv4 WG Mailing List <>
Subject: Re: [nfsv4] RFC 7530: Filehandle of opened file after the REMOVE

On Sun, Jan 01, 2017 at 08:13:55AM -0500, David Noveck wrote:
> You need to delay the deletion until after the grace period is over.  There
> may be technical issues with this but the biggest hurdle may be
> architectural/political.  I expect that people involved in file system
> recovery may have trouble with the idea that the NFS server is telling them
> when they can and cannot do various pieces of file system recovery.

Wearing my local fs hat, the only issue I have is that I'd like to avoid
the case where people accidentally configure their fs to never reclaim
the unlinked but open inodes after a crash.

> It might be more possible to ask them to complete recovery but move the
> unlinked inodes to a directory like /unlinked-inodes and consider the
> recovery complete before NFS actually runs.  If that is OK, and I believe
> it is, the result would be almost the same as the NFS server moving the
> file to that directory in the first place, using a server-based silly
> rename.

For the classic textbook case of file systems with an intent log this
doesn't make sense.  Open but unlinked inodes are only tracked in the
unlinked inode list, and do not have a name attached to them any more.
So we'd have to put a lot of effort into moving them into a directory
using synthetic names while we already have a better data structure built
for exactly this use case.  We'll just need to delay the action performed
on it a little bit.  And that action is really trivial: basically the
inode is read into memory and then released again to execute the same
code we'd execute after dropping the last reference to an open but
unlinked inode.
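
To illustrate the point above, here is a minimal sketch in Python (a toy
model, not real file system code) of an unlinked inode list whose
post-crash processing can be deferred.  All names (`UnlinkedList`,
`recover`, `release_deferred`, the `hold` flag) are hypothetical; the only
idea taken from the text is that recovery loads each inode and drops it
again, which runs the same path as the last close.

```python
class Inode:
    def __init__(self, ino, nlink, blocks):
        self.ino = ino
        self.nlink = nlink      # 0 for an unlinked inode
        self.blocks = blocks    # stand-in for allocated data blocks
        self.refs = 0           # in-memory references
        self.freed = False


class UnlinkedList:
    """Toy model of the list of unlinked-but-open inodes."""

    def __init__(self):
        self.entries = {}   # ino -> Inode, what survived the crash
        self.held = []      # inodes loaded in recovery, reference kept

    def add(self, inode):
        self.entries[inode.ino] = inode

    def iget(self, ino):
        # Read the inode into memory and take a reference.
        inode = self.entries[ino]
        inode.refs += 1
        return inode

    def iput(self, inode):
        # Drop a reference; the last one on an unlinked inode frees it.
        inode.refs -= 1
        if inode.refs == 0 and inode.nlink == 0:
            inode.blocks = []
            inode.freed = True
            del self.entries[inode.ino]

    def recover(self, hold=False):
        # Usual recovery: iget followed directly by iput.
        # With hold=True (assumed knob), the iput is postponed.
        for ino in list(self.entries):
            inode = self.iget(ino)
            if hold:
                self.held.append(inode)
            else:
                self.iput(inode)

    def release_deferred(self):
        # Called once the delay is over, e.g. after the grace period.
        while self.held:
            self.iput(self.held.pop())
```

The deferred case reuses `iput` unchanged; only the time at which it is
called moves.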

> > It's a major pain for getting sensible semantics out of NFS.
> Please elaborate.  Would a server-based silly rename allow the sensible
> semantics you are looking for?

After delving so much into internals above I'll switch to my protocol
developer hat here and say: we should only specify the on-the-wire
semantics; how it's implemented on the server is not our business.

An NFS server that moves all files that are unlinked but open by a client
into a special directory (note that this might be a problem for local
opens or unlinks not going through the NFS server) could provide semantics
very close to the real thing, but it would have to be creative about
adjusting the link counts sent to the client.
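
As a rough illustration of the link count problem: if the server keeps an
unlinked file alive by renaming it into a hidden directory, the file
system still counts that hidden name as a link, so the server has to
subtract it before answering the client.  The sketch below is hypothetical
(the `SillyRenameServer` class, its method names, and the `/.unlinked`
path are made up here) and ignores local access that bypasses the server.

```python
class SillyRenameServer:
    """Toy model of a server-side silly rename with nlink adjustment."""

    def __init__(self):
        self.names = {}       # path -> fileid
        self.nlink = {}       # fileid -> link count as the fs sees it
        self.open_count = {}  # fileid -> number of client opens
        self.hidden = {}      # fileid -> hidden path keeping it alive

    def create(self, path, fileid):
        self.names[path] = fileid
        self.nlink[fileid] = self.nlink.get(fileid, 0) + 1

    def open(self, path):
        fileid = self.names[path]
        self.open_count[fileid] = self.open_count.get(fileid, 0) + 1
        return fileid

    def remove(self, path):
        fileid = self.names.pop(path)
        if self.nlink[fileid] == 1 and self.open_count.get(fileid, 0) > 0:
            # Last name of an open file: rename it instead of unlinking.
            hidden_path = "/.unlinked/%d" % fileid
            self.names[hidden_path] = fileid
            self.hidden[fileid] = hidden_path
        else:
            self.nlink[fileid] -= 1
            if self.nlink[fileid] == 0:
                del self.nlink[fileid]

    def getattr_nlink(self, fileid):
        # Hide the server's own hidden name from the client.
        return self.nlink[fileid] - (1 if fileid in self.hidden else 0)

    def close(self, fileid):
        self.open_count[fileid] -= 1
        if self.open_count[fileid] == 0 and fileid in self.hidden:
            # Last close: now really remove the hidden name.
            del self.names[self.hidden.pop(fileid)]
            del self.nlink[fileid]
```

After `remove()` on the last name of an open file, the file system still
sees one link while `getattr_nlink()` reports zero, which is the
adjustment referred to above.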

With my experience in local file systems and NFS servers, I'd say it's much
simpler to let the file system do the hard work and let the NFS server
delay the reclaiming of unlinked inodes after a crash in some way.
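
One possible shape for "in some way", sketched in Python under stated
assumptions: the server signals the end of the grace period, and a hard
deadline makes sure a misconfigured setup cannot postpone reclaim forever.
`DeferredReclaim`, `grace_period_ended`, `poll`, and `max_hold_seconds`
are hypothetical names, not an existing interface.

```python
import time


class DeferredReclaim:
    """Run a reclaim callback at grace-period end or at a hard deadline."""

    def __init__(self, reclaim, max_hold_seconds, clock=time.monotonic):
        self._reclaim = reclaim          # callback that reclaims inodes
        self._clock = clock              # injectable for testing
        self._deadline = clock() + max_hold_seconds
        self._done = False

    def grace_period_ended(self):
        # Called by the NFS server once clients can no longer reclaim opens.
        self._run()

    def poll(self):
        # Called periodically; forces reclaim if the server never reported.
        if not self._done and self._clock() >= self._deadline:
            self._run()
        return self._done

    def _run(self):
        # Reclaim exactly once, whichever trigger fires first.
        if not self._done:
            self._done = True
            self._reclaim()
```

Whichever trigger fires first wins, and the callback runs at most once.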