Re: [nfsv4] "Courtesy locks"

Trond Myklebust <Trond.Myklebust@netapp.com> Sun, 12 September 2010 23:14 UTC

From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: david.noveck@emc.com
Organization: NetApp
Date: Sun, 12 Sep 2010 19:13:43 -0400
Cc: nfsv4@ietf.org
Subject: Re: [nfsv4] "Courtesy locks"

On Sun, 2010-09-12 at 12:21 -0400, david.noveck@emc.com wrote:
> The term "courtesy locks" is Trond's.  It refers to the practice of
> allowing locks associated with an expired lease to remain around, "as a
> courtesy to the client or as an optimization" (see section 8.6.3 of RFC
> 3530).  The text that is associated with this situation is not as clear
> or complete as it should be and a number of RFC3530bis issues are
> related to it.
> 
> In deciding how to address this, we need to find out what exactly
> clients and servers do.  A long while back I promised to send mail and
> gather up the requisite information, and now my promise has caught up
> with me.  If you can answer the following questions via email that would
> be great.  If not, I hope people will at least find out what their
> implementations do on these points so we can discuss it at the upcoming
> bakeathon.
> 
> This looks pretty daunting but most people will probably say "Yes" to
> (S1) and/or easily deal with (C1), (C2), and (C3).
> 
> Make sure you use black pen and completely fill in the circles :-) 
> 
> 
> If you have a client and not a server, skip to (C1)
> 
> S1) When a lease expires, do you simply release all
>     the associated locks?
> 
>     If so, go to (C1), or if you don't have a client you 
>     are all done.
> 
> S2) If you do maintain locks for a longer time, when and if
>     there is a conflict before the lease being reactivated,
>     what do you do?
> 
>     A) Release all the locks associated with that client
>     B) Release all the locks held by that client for that 
>        same fh.
>     C) Something more limited than (B) that includes the lock
>        that had the conflict.
> 
>     If you answered (A) or (B), skip down to (S7).
> 
> S3) If the lock that had a conflict is a byte-range lock,
> 
>     S3.1) Is the associated open released as well?
>     S3.2) Are all byte-range locks associated with the
>           same open released as well?
>     S3.3) Are all byte-range locks associated with the same
>           client/lock-owner released as well?
>     S3.4) Are all byte-range locks associated with the 
>           same open and owned by the client/lock-owner
>           released as well?
>     S3.5) If the lock is associated with an open that is
>           subordinate to a delegation, is the delegation
>           released as well?
>     S3.6) Are there any other locks that are released?
> 
> S4) If the lock that had a conflict is an open,
> 
>     S4.1) Are byte-range locks associated with this open
>           released as well?
>     S4.2) If the open is associated with a delegation,
>           is the delegation released as well?
> 
> S5) If the lock that had the conflict is a delegation,
> 
>     S5.1) Are opens associated with that delegation
>           released as well?
>     S5.2) Are all opens for that client and for that fh
>           released as well?
>     S5.3) Are all byte-range locks associated with opens
>           that are associated with that delegation released
>           as well?
>     S5.4) Are all byte-range locks for that client and for that fh
>           released as well?
> 
> S6) Are there any other situations in which release of a 
>     lock due to a conflict will cause other locks to be
>     released?
> 
> S7) After releasing locks due to a conflict, what happens if
>     the associated stateid is referenced, assuming that 
>     happens relatively quickly?
> 
>     S7.1) The client gets NFS4ERR_EXPIRED?
>     S7.2) The client gets some other error?  What?
>     S7.3) Can it be different for different types of stateids?
> 
> S8) Assuming that the release does not cause immediate freeing
>     of the stateid (and returning NFS4ERR_BAD_STATEID), what 
>     provision is made for eventual deletion of such stateids?
> 
>     a) There is an LRU mechanism to eventually delete them.
>     b) They are kept around until the client or server 
>        reboots.
>     c) It depends on the type of lock.
> 
> If you have no client, you are done.
> 
> C1) When you get NFS4ERR_EXPIRED, what happens?
> 
>     a) Assume all locks associated with the lease have been released
>        and that the client needs to create a new clientid?
>     b) Assume that it means one or more locks have been lost and
>        the client needs to determine what has gone and which are
>        still valid.
>     c) Something else.  What?

C) We try to send a RENEW. If that fails, we try to send a SETCLIENTID
that uses the same clientid and verifier as when we originally mounted
the filesystem.
After that, we try to replay all OPEN and LOCK calls.
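
To make the ordering concrete, here is a rough sketch of that sequence in
Python. It is not the Linux client code: FakeServer, recover_after_expired
and the shape of the "opens" table are all made up for illustration, and
the SETCLIENTID_CONFIRM step is left out. The sketch replays OPEN and LOCK
whether or not the RENEW succeeded, which is one possible reading of the
description above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical stand-in for an NFSv4 server; it only logs what it is sent."""
    lease_valid: bool = False               # whether a RENEW would succeed
    log: list = field(default_factory=list)

    def renew(self):
        self.log.append("RENEW")
        return self.lease_valid

    def setclientid(self, client_id, verifier):
        self.log.append("SETCLIENTID")
        self.lease_valid = True
        return True

    def open(self, path):
        self.log.append(f"OPEN {path}")
        return True

    def lock(self, path, offset, length):
        self.log.append(f"LOCK {path} {offset}+{length}")
        return True


def recover_after_expired(server, client_id, verifier, opens):
    """Recovery attempt after NFS4ERR_EXPIRED.

    opens maps a path to the list of (offset, length) byte ranges locked on it.
    Returns True if every request that was sent succeeded.
    """
    if not server.renew():
        # RENEW failed: re-establish the lease with the same identity
        # and verifier that were used at mount time.
        if not server.setclientid(client_id, verifier):
            return False
    # Replay every OPEN, then the LOCKs that hang off it.
    for path, ranges in opens.items():
        if not server.open(path):
            return False
        for offset, length in ranges:
            if not server.lock(path, offset, length):
                return False
    return True
```

With a lapsed lease and one open holding one byte-range lock, the fake
server would see RENEW, SETCLIENTID, OPEN, LOCK, in that order.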

> C2) If you didn't answer (a) to (C1), are there any assumptions
>     (of the sort mentioned in (S3) to (S6)) whose violation would
>     cause problems? 

In principle not. We are supposed to be able to recover all locks on a
per-open_stateid basis.

IOW: if an open stateid, lock stateid or delegation stateid returns an
error, we should be able to replay just the OPEN+LOCK requests that are
required to allow the application to proceed.
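
As an illustration of what "per-open_stateid" recovery could look like,
the sketch below maps a failing stateid back to the open that owns it and
lists only the requests that would need replaying. The OpenState record
and the helper names are hypothetical, and stateids are plain strings here
purely to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OpenState:
    """Hypothetical client-side record of one open and the state under it."""
    path: str
    open_stateid: str
    # lock stateid -> (offset, length) of the byte range it covers
    lock_stateids: dict = field(default_factory=dict)
    delegation_stateid: Optional[str] = None


def find_open_for_stateid(opens, bad_stateid):
    """Return the OpenState that owns bad_stateid, or None if it is unknown."""
    for st in opens:
        if (bad_stateid == st.open_stateid
                or bad_stateid == st.delegation_stateid
                or bad_stateid in st.lock_stateids):
            return st
    return None


def requests_to_replay(opens, bad_stateid):
    """List the OPEN+LOCK requests needed for the one affected open."""
    st = find_open_for_stateid(opens, bad_stateid)
    if st is None:
        return []
    reqs = [("OPEN", st.path)]
    for offset, length in st.lock_stateids.values():
        reqs.append(("LOCK", st.path, offset, length))
    return reqs
```

The point of the design is that an error on any one stateid touches only
the open it belongs to; state held under other opens is left alone.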

Note that we do not implement a SIGLOST feature in the case where the
application has lost a lock (although we probably should). We never did
it for NFSv3, and since nobody has really requested it for NFSv4 either,
we have deferred implementing it there too.

> C3) How do you respond to getting the following errors?  Specifically,
>     would it be problematic if that error were returned when a
>     courtesy lock was released due to a conflict?
> 
>     C3.1) NFS4ERR_ADMIN_REVOKED
>     C3.2) NFS4ERR_BAD_STATEID

We replay the OPEN+LOCK requests, and if that fails, we try again using
the zero stateid.
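
A minimal sketch of that fallback, assuming a hypothetical callback that
replays OPEN+LOCK and returns the new stateid, or None if the replay
failed. The names are invented for the example; the only protocol detail
relied on is that an NFSv4.0 stateid is 16 bytes (a 4-byte seqid plus 12
opaque bytes) and that the all-zero value is a special stateid.

```python
# All-zero special stateid: 4-byte seqid + 12-byte "other" field.
ZERO_STATEID = bytes(16)


def stateid_after_error(replay_open_and_locks):
    """Pick the stateid to use after ADMIN_REVOKED or BAD_STATEID.

    replay_open_and_locks is a hypothetical callable that re-sends the
    OPEN+LOCK requests and returns the fresh stateid, or None on failure.
    """
    new_stateid = replay_open_and_locks()
    if new_stateid is not None:
        return new_stateid
    # Replay failed: fall back to the zero stateid for subsequent I/O.
    return ZERO_STATEID
```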

Cheers
  Trond