Re: [nfsv4] 3530bis Issue 39: Clarification on renewing sequence IDs

Trond Myklebust <Trond.Myklebust@netapp.com> Fri, 05 November 2010 14:00 UTC

From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Robert Thurlow <Robert.Thurlow@oracle.com>
In-Reply-To: <4CD32C31.3040909@oracle.com>
References: <4CACD9BF.2010809@oracle.com> <4CD32C31.3040909@oracle.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Organization: NetApp
Date: Fri, 05 Nov 2010 09:59:14 -0400
Message-ID: <1288965554.3975.27.camel@heimdal.trondhjem.org>
Cc: NFSv4 <nfsv4@ietf.org>
Subject: Re: [nfsv4] 3530bis Issue 39: Clarification on renewing sequence IDs

On Thu, 2010-11-04 at 15:57 -0600, Robert Thurlow wrote:
> Robert Thurlow wrote:
> > Hi folks,
> > 
> > This is issue 39 from 
> > http://github.com/loghyr/3530bis/blob/master/tasklist.txt.
> > 
> > In implementing NFSv4 migration support, we believe that
> > MOVED and LEASE_MOVED need to be added to the list of errors
> > in 8.1.5 which do NOT result in incrementing the open owner
> > or lock owner sequence ID.  The goal is to make the sequence
> > ID readily calculable for both the client and the destination
> > server after the migration has occurred.
> > 
> > On the 3530bis call, this appeared exactly backwards to some
> > others - that since a completely gross error had not occurred,
> > we should increment the sequence ID and the client and the
> > destination server should know to expect that when they interact
> > after a migration.  I do not know this issue well enough to
> > properly defend a position, so please reply with your reasoned
> > opinion :-)
> 
> I don't think this has had a response.  If you disagree with
> the wording change, now is the time to say so.

As stated on the confcall, I strongly disagree with this change w.r.t.
NFS4ERR_LEASE_MOVED. An operation that resulted in
NFS4ERR_LEASE_MOVED cannot be safely replayed if the sequence id has not
been bumped.

The point is that NFS4ERR_LEASE_MOVED is an error that depends on the
state of a _different_ filesystem. It does not even pertain to the
actual state you are trying to modify (and is a monstrous hack). Worse
yet, that error condition can be cleared at any time with no
consequences for the stateids held by the client, so unlike
NFS4ERR_BAD_STATEID or NFS4ERR_BAD_SEQID, there is no ordering w.r.t.
the operation that you are retrying.

IOW: if the error condition happens to get cleared between two replays
of the operation, the client may end up getting 2 conflicting replies
(one NFS4ERR_LEASE_MOVED, the other being a change of state on the
server). Which one does it choose?
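
To make the race concrete, here is a minimal sketch in Python. It is a toy
model, not the RFC 3530 state machine: ToyServer, bump_on_lease_moved and
replay_with_cleared_condition are hypothetical names invented for this
illustration. It assumes a single owner whose server remembers the last
accepted seqid and one cached reply, and it assumes the client retransmits
the same request with the same seqid.

```python
from dataclasses import dataclass
from typing import Optional

OK = "NFS4_OK"
LEASE_MOVED = "NFS4ERR_LEASE_MOVED"
BAD_SEQID = "NFS4ERR_BAD_SEQID"


@dataclass
class ToyServer:
    """Hypothetical one-owner server with a one-entry replay cache."""
    bump_on_lease_moved: bool          # the policy under discussion
    lease_moved: bool = True           # set because of a *different* filesystem
    last_seqid: int = 0
    cached_reply: Optional[str] = None
    state_changes: int = 0

    def handle(self, seqid: int) -> str:
        # Same seqid as the last accepted request: answer from the cache.
        if seqid == self.last_seqid and self.cached_reply is not None:
            return self.cached_reply
        if seqid != self.last_seqid + 1:
            return BAD_SEQID
        if self.lease_moved:
            reply = LEASE_MOVED
            if not self.bump_on_lease_moved:
                # Proposed behaviour: no bump, so nothing is cached either.
                return reply
        else:
            self.state_changes += 1    # the operation really executes
            reply = OK
        self.last_seqid = seqid
        self.cached_reply = reply
        return reply


def replay_with_cleared_condition(bump: bool):
    """Send seqid 1, clear the condition, then retransmit seqid 1."""
    server = ToyServer(bump_on_lease_moved=bump)
    first = server.handle(1)
    server.lease_moved = False         # condition clears between the two sends
    second = server.handle(1)
    return first, second
```

In this model, replay_with_cleared_condition(False) yields one
NFS4ERR_LEASE_MOVED reply and one NFS4_OK reply for the same request, which
is the conflict described above. With bump=True the retransmission is
answered from the replay cache and both replies agree.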

So how about a counter-proposal: we bury NFS4ERR_LEASE_MOVED at a
cross-roads with a stake through its heart, and promise never to mention
it again except when the kids need scaring to bed...

Trond