RE: [nfsv4] more re: admin revoke and edge conditions
"Noveck, Dave" <Dave.Noveck@netapp.com> Fri, 08 October 2004 21:34 UTC
Subject: RE: [nfsv4] more re: admin revoke and edge conditions
Date: Fri, 08 Oct 2004 16:47:36 -0400
Message-ID: <C98692FD98048C41885E0B0FACD9DFB803BD0C@exnane01.hq.netapp.com>
Thread-Topic: [nfsv4] more re: admin revoke and edge conditions
Thread-Index: AcSopeXY0LDluhCASZeTEoe0+CHeYwEz1/xg
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: rick@snowhite.cis.uoguelph.ca, nfsv4@ietf.org

> It seems to me that the server needs to use stable storage such that, after
> reboot, it would reply:
>   (6) - NFS4ERR_xxx
>   (7) - NFS4_OK
> which then leads to what NFS4ERR_xxx should be? I was thinking NFS4ERR_NO_GRACE,
> but will a client expect that error for only some of the reclaims and not all
> of them?

Clients might expect that. The spec is ambiguous on the issue. For example,
in section 8.6.3, it says:

   In the event, after a server reboot, the server determines that
   there is unrecoverable damage or corruption to the stable
   storage, then for all clients and/or locks affected, the server
   MUST return NFS4ERR_NO_GRACE.

You could read "locks affected" as a mandate to return NO_GRACE for the
locks affected and no others. If it were "or" then this would be stronger.
If it were "and" then you'd wonder why locks were even mentioned. As it is,
with "and/or", things aren't very clear.

On the other hand, the error definitions seem to reject this:

   NFS4ERR_NO_GRACE    A reclaim of client state has fallen outside of
                       the grace period of the server. As a result, the
                       server can not guarantee that conflicting state
                       has not been provided to another client.

which sounds like the client might logically assume that it is going to get
NO_GRACE for subsequent reclaims.

I don't see that sessions help this. On the other hand, Mike Eisler had a
proposal for more specific error codes, and my impression was that they
would allow you to determine whether a reclaim failure was due to something
that was restricted to the lock in question. I think that's one of those
little cleanup items for v4.1 that would be really helpful.

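As a rough illustration (not part of the thread itself), the two possible
outcomes for reclaims (6) and (7) in Rick's scenario can be modeled with a
toy server. Everything here is an assumption made for the sketch: the
`Server` and `StableStorage` classes, the `persist_revokes` switch, and the
choice of NFS4ERR_NO_GRACE as a stand-in for the still-undecided
NFS4ERR_xxx. It does not model grace periods, leases, or any real server's
on-disk format.

```python
from dataclasses import dataclass, field

NFS4_OK = "NFS4_OK"
NFS4ERR_DENIED = "NFS4ERR_DENIED"
NFS4ERR_NO_GRACE = "NFS4ERR_NO_GRACE"  # stand-in for the open "NFS4ERR_xxx"
NFS4ERR_RECLAIM_CONFLICT = "NFS4ERR_RECLAIM_CONFLICT"


@dataclass
class StableStorage:
    """Hypothetical state that survives a reboot: revoked (client, file) pairs."""
    revoked: set = field(default_factory=set)


class Server:
    def __init__(self, stable, persist_revokes):
        self.stable = stable
        self.persist_revokes = persist_revokes
        self.locks = {}  # file name -> holding client; lost on reboot

    def lock(self, client, name):
        holder = self.locks.get(name)
        if holder is not None and holder != client:
            return NFS4ERR_DENIED
        self.locks[name] = client
        return NFS4_OK

    def admin_revoke(self, client, name):
        if self.locks.get(name) == client:
            del self.locks[name]
        if self.persist_revokes:
            # Record the revocation per lock, not per client.
            self.stable.revoked.add((client, name))

    def reboot(self):
        self.locks = {}  # volatile lock state is gone; stable storage remains

    def reclaim(self, client, name):
        if (client, name) in self.stable.revoked:
            return NFS4ERR_NO_GRACE
        holder = self.locks.get(name)
        if holder is not None and holder != client:
            return NFS4ERR_RECLAIM_CONFLICT
        self.locks[name] = client
        return NFS4_OK


def run_scenario(persist_revokes):
    """Steps 1-7 of the scenario; returns the replies to (6) and (7)."""
    server = Server(StableStorage(), persist_revokes)
    server.lock("A", "xxx")          # 1
    server.admin_revoke("A", "xxx")  # 2
    server.lock("B", "xxx")          # 3
    server.reboot()                  # 4, 5
    return server.reclaim("A", "xxx"), server.reclaim("B", "xxx")  # 6, 7
```

Calling `run_scenario(False)` reproduces the outcome Rick objects to (A's
reclaim succeeds and B's is refused), while `run_scenario(True)` gives the
reverse. Because the revocation is recorded per lock in this sketch, client A
could still reclaim other locks it held, which is the "locks affected"
reading of section 8.6.3.
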
-----Original Message-----
From: rick@snowhite.cis.uoguelph.ca [mailto:rick@snowhite.cis.uoguelph.ca]
Sent: Saturday, October 02, 2004 1:30 PM
To: nfsv4@ietf.org
Subject: [nfsv4] more re: admin revoke and edge conditions

[Spencer's good stuff snipped, which helped clarify some of my confusion]

> In these cases, not all client state would be removed or need to be
> removed. However, the client should be told of the revocation and
> hence the NFS4ERR_ADMIN_REVOKED. The reason for returning this error
> on RENEW is similar to the NFS4ERR_CB_PATH_DOWN such that the client
> is active on other parts of the server's resources but not the part
> where the state has been revoked. The client could be told earlier of
> the ADMIN_REVOKED than later and it could then in turn notify the
> client environment as per its policies. The idea was that the client
> could "check" its state to determine what had been revoked; no clear
> guidance was obviously given but something like a zero length read on
> open files may suffice.

Ok, so Renew should return NFS4ERR_ADMIN_REVOKED whenever any of the state
(it doesn't have to be all of it) has been revoked? And it sounds like this
Renew Op that returns NFS4ERR_ADMIN_REVOKED does Renew the client's lease.
Is that correct?

> If the server has rebooted, it just needs to return things like
> NFS4ERR_STALE_CLIENTID/NFS4ERR_STALE_STATEID. The ADMIN_REVOKED is
> not required because all state has been removed for the client.

I did my usual crappy job of explaining what my concern was, so I think
Spencer didn't follow it. I'll try again with an example. (Dave might have
understood or he might be thinking of something else. I haven't looked at
it closely enough to see what effect sessions might have on this. I have a
hunch that they will help, but it's just a hunch.)

For example:
1 - client A has lock on xxx
2 - administrator revokes this lock on xxx for client A
3 - client B gets lock on xxx that would have conflicted with (1)
4 - server crashes/reboots
5 - both clients see STALECLIENTID or STALESTATEID and start reclaiming locks
6 - client A requests reclaim of lock on xxx
7 - client B requests reclaim of lock on xxx, that would conflict with (6)

If (1)->(4) occurs in less than a lease time (or client A is network
partitioned until (5)), client A never sees NFS4ERR_ADMIN_REVOKED.

Now, the question is, what is the server's correct response to (6) and (7)?
If the server doesn't keep anything about the revoke at (2) in stable
storage, it would be:
  (6) - NFS4_OK
  (7) - NFS4ERR_RECLAIM_CONFLICT
and this doesn't seem correct to me.

It seems to me that the server needs to use stable storage such that, after
reboot, it would reply:
  (6) - NFS4ERR_xxx
  (7) - NFS4_OK
which then leads to what NFS4ERR_xxx should be? I was thinking NFS4ERR_NO_GRACE,
but will a client expect that error for only some of the reclaims and not all
of them?

I was also thinking that I would handle admin revoke using the same stable
storage mechanism as for lease expiry (essentially as described in 8.6.3),
but that has a couple of negative implications:
a) - client A can't reclaim any of its state for the above scenario
b) - once I note in stable storage that admin revoke has happened to
     client A, I can't come up with a way, short of the client doing a new
     SetClientID, of recognizing that it is ok to let client A do reclaims
and this doesn't seem like a good thing. (This was what I was trying to ask
about in the last post, believe it or not:-)

For example:
1 - client A has lock on xxx
2 - administrator revokes that lock, which is noted in stable storage
3 - client A sees NFS4ERR_ADMIN_REVOKED, but doesn't do a SetClientID, since
    it doesn't see NFS4ERR_EXPIRED
    (this assumes that some Op is renewing the lease, even though the lock
     has been admin revoked. If no Op can renew the lease, then everything
     will expire and you might as well just revoke all the state for a
     clientid at once, because it will expire soon anyhow.)
4 - client A chugs merrily along for weeks and has lots of other valid state
5 - server crashes/reboots
6 - client A sees STALECLIENTID or STALESTATEID
7 - client A tries to reclaim state and gets NFS4ERR_NO_GRACE for all of it

The more I look at it, the more it seems that there are only two
alternatives (at least until sessions are in place, which I haven't looked
at closely enough yet to understand how they might help):
1 - forget about the cases where not all of the clientid's state is admin
    revoked and just revoke it all at once, then let the lease expire.
OR
2 - figure out a more sophisticated stable storage structure that handles
    the edge cases, so that the NFS4ERR_xxx replied to A's reclaim above
    can be ADMIN_REVOKED.

> In the case of server reboot, all of this is moot. The server doesn't
> need to worry about partial revocation before the reboot because all
> of it is gone now.

But, only if the client knows the state was revoked and doesn't try to
reclaim it, right?

Am I making sense or still out in left field? rick

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4