Re: [nfsv4] 3530bis Issue 39: Clarification on renewing sequence IDs

<david.noveck@emc.com> Sun, 07 November 2010 06:05 UTC

From: david.noveck@emc.com
To: Trond.Myklebust@netapp.com, Robert.Thurlow@oracle.com
Cc: nfsv4@ietf.org
Subject: Re: [nfsv4] 3530bis Issue 39: Clarification on renewing sequence IDs

I agree with Trond's argument as to seqid, i.e. that you should
increment the seqid in the case of NFS4ERR_LEASE_MOVED.
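
To make the rule concrete, here is a minimal sketch of the client-side
bookkeeping, assuming the exception list is the one in RFC 3530 section
8.1.5 as I recall it (worth checking against the RFC text). The helper
name next_seqid and the 32-bit wraparound are assumptions of the sketch,
not anything taken from the draft.

```python
# Errors after which the owner's seqid is NOT advanced, per the exception
# list in RFC 3530 section 8.1.5 (as recalled; verify against the RFC).
NO_INCREMENT_ERRORS = frozenset({
    "NFS4ERR_STALE_CLIENTID",
    "NFS4ERR_STALE_STATEID",
    "NFS4ERR_BAD_STATEID",
    "NFS4ERR_BAD_SEQID",
    "NFS4ERR_BADXDR",
    "NFS4ERR_RESOURCE",
    "NFS4ERR_NOFILEHANDLE",
})


def next_seqid(current, status):
    """Hypothetical helper: seqid the owner should use on its next request."""
    if status in NO_INCREMENT_ERRORS:
        return current
    # Any other status, including NFS4ERR_LEASE_MOVED under the position
    # argued here, advances the seqid. The 32-bit wrap is an assumption.
    return (current + 1) & 0xFFFFFFFF
```

With this rule, next_seqid(7, "NFS4ERR_LEASE_MOVED") advances to 8, while
next_seqid(7, "NFS4ERR_BAD_SEQID") stays at 7.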

NFS4ERR_LEASE_MOVED was buried at a cross-roads in RFC 5661, but there we
have SEQ4_STATUS_LEASE_MOVED, so we don't need it.

If you bury it in RFC3530bis, you still need some way to let the client
find out that a lease has migrated. Otherwise, there is an unbounded
period in which the new server will not hear from the client, and the
client's open files could be lost.

The alternative to the monstrous hack would be to require the server to
simulate a reboot.  The client would see a stale client ID or stateid
error (NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID) and would then
go through the reclaim sequence for both the migrated and the
non-migrated file systems.  That seems harder to make happen than
LEASE_MOVED, monstrous as it is.
 

-----Original Message-----
From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf
Of Trond Myklebust
Sent: Friday, November 05, 2010 9:59 AM
To: Robert Thurlow
Cc: NFSv4
Subject: Re: [nfsv4] 3530bis Issue 39: Clarification on renewing
sequence IDs

On Thu, 2010-11-04 at 15:57 -0600, Robert Thurlow wrote:
> Robert Thurlow wrote:
> > Hi folks,
> > 
> > This is issue 39 from 
> > http://github.com/loghyr/3530bis/blob/master/tasklist.txt.
> > 
> > In implementing NFSv4 migration support, we believe that
> > MOVED and LEASE_MOVED need to be added to the list of errors
> > in 8.1.5 which do NOT result in incrementing the open owner
> > or lock owner sequence ID.  The goal is to make the sequence
> > ID readily calculable for both the client and the destination
> > server after the migration has occurred.
> > 
> > On the 3530bis call, this appeared exactly backwards to some
> > others - that since a completely gross error had not occurred,
> > we should increment the sequence ID and the client and the
> > destination server should know to expect that when they interact
> > after a migration.  I do not know this issue well enough to
> > properly defend a position, so please reply with your reasoned
> > opinion :-)
> 
> I don't think this has had a response.  If you disagree with
> the wording change, now is the time to say so.

As stated on the confcall, I strongly disagree with this change w.r.t.
NFS4ERR_LEASE_MOVED. The operation that resulted in a
NFS4ERR_LEASE_MOVED cannot be safely replayed if the sequence id has not
been bumped.

The point is that NFS4ERR_LEASE_MOVED is an error that depends on the
state of a _different_ filesystem. It does not even pertain to the
actual state you are trying to modify (and is a monstrous hack). Worse
yet, that error condition can be cleared at any time with no
consequences for the stateids held by the client, so unlike
NFS4ERR_BAD_STATEID or NFS4ERR_BAD_SEQID, there is no ordering w.r.t.
the operation that you are retrying.

IOW: if the error condition happens to get cleared between two replays
of the operation, the client may end up getting 2 conflicting replies
(one NFS4ERR_LEASE_MOVED, the other being a change of state on the
server). Which one does it choose?
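
The hazard described above can be sketched with a toy model of a single
open-owner on the server. This is only an illustration under stated
assumptions: ToyServer, its fields, and replies_to_retransmission are
hypothetical names, and the replay cache is reduced to one cached reply
per owner.

```python
class ToyServer:
    """Toy model of one open-owner on a server; not real NFSv4 code."""

    def __init__(self, bump_on_lease_moved):
        self.bump_on_lease_moved = bump_on_lease_moved
        self.lease_moved = True    # condition caused by a *different* file system
        self.seqid = 0             # last seqid processed for this owner
        self.cached_reply = None   # reply replayed for a retransmission
        self.state_changes = 0     # how many times the operation really ran

    def handle(self, seqid):
        # A request carrying the last processed seqid is a retransmission:
        # answer it from the replay cache without running it again.
        if seqid == self.seqid and self.cached_reply is not None:
            return self.cached_reply
        if seqid != self.seqid + 1:
            return "NFS4ERR_BAD_SEQID"
        if self.lease_moved:
            if self.bump_on_lease_moved:
                self.seqid = seqid
                self.cached_reply = "NFS4ERR_LEASE_MOVED"
            return "NFS4ERR_LEASE_MOVED"
        self.seqid = seqid
        self.state_changes += 1
        self.cached_reply = "NFS4_OK"
        return "NFS4_OK"


def replies_to_retransmission(bump_on_lease_moved):
    """Send one request twice, with the condition clearing in between."""
    server = ToyServer(bump_on_lease_moved)
    first = server.handle(1)       # original request
    server.lease_moved = False     # LEASE_MOVED condition clears
    second = server.handle(1)      # retransmission of the same request
    return first, second
```

In this model, replies_to_retransmission(False) yields one
NFS4ERR_LEASE_MOVED and one NFS4_OK for the same request, which is the
conflict in question. With replies_to_retransmission(True) the second
copy is answered from the replay cache, so both replies agree.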

So how about a counter-proposal: we bury NFS4ERR_LEASE_MOVED at a
cross-roads with a stake through its heart, and promise never to mention
it again except when the kids need scaring to bed...

Trond