RE: [nfsv4] re: re: NFS4ERR_ADMIN_REVOKE
"Noveck, Dave" <Dave.Noveck@netapp.com> Fri, 07 January 2005 19:44 UTC
Subject: RE: [nfsv4] re: re: NFS4ERR_ADMIN_REVOKE
Date: Fri, 07 Jan 2005 14:37:54 -0500
Message-ID: <C98692FD98048C41885E0B0FACD9DFB840D218@exnane01.hq.netapp.com>
Thread-Topic: [nfsv4] re: re: NFS4ERR_ADMIN_REVOKE
Thread-Index: AcT0YVirKPVF5/BSS2evV3g+0vge3QAXnQ8A
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: spencer.shepler@sun.com, nfsv4@ietf.org
List-Id: NFSv4 Working Group <nfsv4.ietf.org>

Spencer Shepler wrote:

> It might help me if we split the question of returning NFS4ERR_EXPIRED
> into two pieces: with and without the use of SETCLIENTID...

That makes sense. I'll try my best to avoid splitting those into four
sub-cases :-)

> So in the case of a network partition between client and server in
> which the partition lasts longer than the lease period, the server
> presumably has to return some error to the client. It wouldn't allow
> the client to continue using state associated with the now-expired
> lease. NFS4ERR_EXPIRED is the most appropriate. Not sure what else
> the server could do in this case.

As you note below, BAD_STATEID would give the client the message that
his stateid is no longer usable. It doesn't give the reason, but I'm
not sure the reason is all that helpful to the client. The important
fact is that the stateid is no longer valid, and it isn't clear what
the client would do with the information that lease expiration was the
reason. In many cases, it is the only possible reason, though.

The problem with EXPIRED is the one Rick mentioned: there is no way to
delimit when it is no longer needed, and so you have a piece of state
that there is no way to deallocate, at least within a client instance.
It is troubling to have something where there is a resource leak by
design, even when the magnitude of leakage means that it is not a big
issue in practice. Note also that if the stateid's can't go away,
neither can the owner, and so there is another thing which leaks.

By the way, one interesting side question about delimiting the scope
of EXPIRED concerns RELEASE_OWNER. If I do a RELEASE_OWNER and all of
the associated stateid's are EXPIRED, then I would say that this would
go through, allowing deallocation of those stateid's and the lockowner.
This leaves open stateid's, and allowing CLOSE of those would give us a
way to avoid leakage if the client takes care to get rid of such stuff.

But, if you deallocate revoked state, then you don't face leakage
issues, and the client still gets the message ("That state you just
handed me is gone. Live with it.") via the BAD_STATEID and life goes
on.

Bias alert: My approval of returning BAD_STATEID may have something to
do with the fact that that is what our server currently does.

> In the other case, use of SETCLIENTID, the client may still have some
> inflight requests. For example, the client may have received a
> NFS4ERR_EXPIRED error on one request and started its recovery and sent
> the SETCLIENTID whilst other requests were still outstanding (which
> happen to use state from the previous client/server instantiation).

If you do setcl/setcl-cf with a new verifier and you have outstanding
requests that refer to states within the state corpus of the previous
client instance that this setcl/setcl-cf will trash, you need to deal
with the consequences. My inclination, if I had to write a client,
would be to simplify things and drain that stuff before proceeding to
what is, from the v4 state point of view, the moral equivalent of a
client reboot.

I want to leave aside expired stateid's for the moment. When you issue
the setcl/setcl-cf you don't know that all of the locks, and thus the
stateids, are in the expired state; some may be fine. For those
non-expired stateids that are trashed by the new setcl/setcl-cf, the
client has to be prepared for BAD_STATEID. You can't return EXPIRED
and they are not valid. They are bad stateid's and there is nothing
the server can return but BAD_STATEID.

> It seems that the server will again need to return some error based on
> checking the stateid and NFS4ERR_EXPIRED seems most appropriate. I
> suppose that NFS4ERR_BAD_STATEID may be appropriate but _EXPIRED is
> friendlier.

Despite my resolution to avoid four sub-cases, I appear forced to it:

If you have setcl racing with other state-referencing requests, then
you may have the request hit the server BEFORE or AFTER the setcl and
it may encounter a state that was OK or REVOKED (due to lease
expiration). So, we have four cases:

1) AFTER/OK

   It seems like the only thing that could be returned here is
   BAD_STATEID. So the client has to be prepared for that case.

2) AFTER/REVOKED

   I would return BAD_STATEID here, indicating that if there were any
   state corresponding to that stateid, it has been trashed, at the
   client's request. Returning EXPIRED to indicate that the state
   which was trashed was revoked at the time does not seem helpful to
   me. You might describe it as friendlier, but I would consider it
   obsessively friendly, in giving you dubiously helpful information
   you don't care about.

3) BEFORE/OK

   State is OK and the operation goes through.

4) BEFORE/REVOKED

   I agree that EXPIRED can be returned here, but I would argue that
   BAD_STATEID is just as good. The client knows that the requested
   operation did not happen and that when the setcl/setcl-cf
   completes, he has a clean slate, statewise.

So the client has to be prepared for OK, EXPIRED and BAD_STATEID and
it isn't clear what he would do differently in the two error cases.

> I agree that the RFC doesn't mandate the return of _EXPIRED but as
> mentioned above, it seems most appropriate.

Not to me. If the client issues a setcl/setcl-cf with a new verifier
then it is asking for the state corpus associated with the same id and
a different verifier to be trashed/eliminated. All the stateids,
locks, etc., become invalid. You cannot issue a new lock request and
have it conflict with a lock from the previous instance. Stateids from
that instance become invalid, and trying to maintain some across the
instance boundary seems wrong to me (and besides that I don't want to
do the work to do that unless it is *really* needed).
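
The four cases above can be sketched as a small server-side model. This
is only an illustration under stated assumptions, not any real server's
implementation: the names `Server`, `ClientInstance`, `check_stateid`,
`open_state` and `expire_lease` are hypothetical, stateids are plain
strings, and the BEFORE/REVOKED case is shown returning EXPIRED even
though BAD_STATEID is argued above to be just as good.

```python
from enum import Enum


class Status(Enum):
    OK = "NFS4_OK"
    EXPIRED = "NFS4ERR_EXPIRED"
    BAD_STATEID = "NFS4ERR_BAD_STATEID"


class ClientInstance:
    """Hypothetical state corpus for one (client id, verifier) pair."""

    def __init__(self, verifier):
        self.verifier = verifier
        self.stateids = {}  # stateid -> "OK" or "REVOKED"


class Server:
    """Toy model of stateid checking; not a real NFSv4 server."""

    def __init__(self):
        self.clients = {}  # client id string -> ClientInstance

    def setclientid_confirm(self, client_id, verifier):
        current = self.clients.get(client_id)
        if current is None or current.verifier != verifier:
            # New verifier: the previous state corpus is trashed outright,
            # including any revoked entries, so nothing leaks.
            self.clients[client_id] = ClientInstance(verifier)

    def open_state(self, client_id, stateid):
        self.clients[client_id].stateids[stateid] = "OK"

    def expire_lease(self, client_id):
        # Lease expiration revokes every stateid of the current instance.
        instance = self.clients[client_id]
        for stateid in list(instance.stateids):
            instance.stateids[stateid] = "REVOKED"

    def check_stateid(self, client_id, stateid):
        instance = self.clients.get(client_id)
        state = instance.stateids.get(stateid) if instance else None
        if state is None:
            # AFTER/OK and AFTER/REVOKED: the state no longer exists.
            return Status.BAD_STATEID
        if state == "REVOKED":
            # BEFORE/REVOKED: EXPIRED is possible here (assumed choice).
            return Status.EXPIRED
        # BEFORE/OK: the operation goes through.
        return Status.OK
```

In this model the server needs no memory of the previous instance after
the setcl/setcl-cf, which is the point of returning BAD_STATEID in both
AFTER cases.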

> Are you concerned about exhaustion of the stateid space and reuse such
> that the server will be unable to meet a MUST statement about _EXPIRED
> error returns?

That's certainly part of it. The other part, to be honest, is that our
server does not do this, and we've got enough work to do that we are
reluctant to make changes unless they are either clearly required by
the spec or are for other reasons something that clients truly need.

If clients would find EXPIRED helpful, I can see returning it on a
best-effort basis, and only within the scope of a single client setcl
instance. Keeping the EXPIRED state for a few lease times and then
freeing it, allowing any subsequent references to get BAD_STATEID,
seems a reasonable way of providing this more detailed error
information to clients who want it and are interested enough to obtain
it within a reasonable time.

On Thu, Noveck, Dave wrote:
> I was just reviewing the ADMIN_REVOKED mail in connection with dealing
> with the case of a delegation which is not returned when recalled
> while the client continues to renew its lease. This isn't exactly
> administrative action in the sense of an administrator deciding to do
> something but since we will be getting rid of the delegation while the
> client has a valid (non-expired) lease, it is the closest fit we could
> find.
>
> I'll probably be sending more mail on that subject when I get some of
> my thoughts/questions together but for now I noticed the following:
>
> rick@snowhite.cis.uoguelph.ca wrote:
> > > I also allow the SetClientID/confirm to lift the embargo, although
> > > I am still on the fence as to whether or not that should be the
> > > case.
>
> Spencer Shepler wrote:
> > The server has to return NFS4ERR_EXPIRED to old state from the client
> > regardless of how long the client keeps sending it (unless the server
> > eventually reboots). Even if the client does a
> > SETCLIENTID/SETCLIENTID_CONFIRM and is confused enough to send old
> > state requests, those old state requests still have to receive
> > NFS4ERR_EXPIRED.
>
> Why?
>
> This is two questions I guess:
>
>    What purpose is served in doing this?
>
>    Where does the spec say that this has to be done?
>
> I am looking for answers to either question (or both).
>
> I have a real problem with keeping state around forever that will
> never be used. If the client rebooting once will not clear it then no
> number of client reboots will do it and we will wind up keeping
> revoked state that the client lost interest in months ago. To what
> purpose?
>
> Also, do you mean this requirement to apply just to cases of
> NFS4ERR_EXPIRED that happen after admin revocation or to cases in
> which NFS4ERR_EXPIRED is the result of lease expiration as well? If
> the latter, it is very likely that a common sequence with a buggy
> client kernel (not necessarily the v4 client code having the bug):
>
>    Client comes up and does setclientid
>    Client gets a bunch of locks/opens
>    Client dies
>    Lease expires
>    Client reboots
>    Repeat until the damn bug gets fixed
>
> would result in a lot of state being saved to distinguish states
> expired N client invocations ago (EXPIRED) from just a random stateid
> (BAD_STATEID). Is the confused client using obsolete stateids going to
> be any better off if the ones that corresponded to expired obsolete
> stateids returned EXPIRED in order to justify the work of maintaining
> that sort of archival information?
>
>
> -----Original Message-----
> From: Spencer Shepler [mailto:spencer.shepler@sun.com]
> Sent: Friday, October 01, 2004 6:24 PM
> To: nfsv4@ietf.org
> Subject: Re: [nfsv4] re: re: NFS4ERR_ADMIN_REVOKE
>
>
> On Mon, rick@snowhite.cis.uoguelph.ca wrote:
> > > We use the NFS4ERR_ADMIN_REVOKED error for indicating when state
> > > [clientid, stateid (either open, lock, or delegation)] has been
> > > revoked and will no longer be accepted by the server.
> >
> > Yep. My main concern in this area was "how permanent" this has to
> > be. Which you've addressed later. (Put another way when/if the
> > NFS4ERR_ADMIN_REVOKED can be replaced by NFS4ERR_EXPIRED.)
>
> Clientid doesn't apply and I think that is understood in that if the
> clientid is revoked then NFS4ERR_EXPIRED should be returned, not
> NFS4ERR_ADMIN_REVOKED.
>
> > > Once the client as a whole has been marked as revoked, we do not
> > > do any renewing of it.
> > > Once the client's state structures are reclaimed (which may be
> > > some time after it has really "expired"), then the client would
> > > get back the NFS4ERR_EXPIRED error.
> >
> > This sounds exactly like what my current code will do. (I, also,
> > don't expire the client until sometime after the expiry.)
> >
> > > The only way for the client to get past this client wide revoke is
> > > to then issue a SETCLIENTID/SETCLIENTID_CONFIRM sequence. This
> > > will purge all revoked state from the client and it can begin
> > > anew. Do you do something like this OR do you make the revoke
> > > permanent? That is, require administrative intervention to lift
> > > the embargo on the bad client?
> >
> > I also allow the SetClientID/confirm to lift the embargo, although I
> > am still on the fence as to whether or not that should be the case.
>
> The server has to return NFS4ERR_EXPIRED to old state from the client
> regardless of how long the client keeps sending it (unless the server
> eventually reboots). Even if the client does a
> SETCLIENTID/SETCLIENTID_CONFIRM and is confused enough to send old
> state requests, those old state requests still have to receive
> NFS4ERR_EXPIRED.
>
> > > We do not tie the revokes to a specific open owner. The main
> > > reason being is that the server hands out clientids and stateids,
> > > but not "open owners". The server only revokes things it hands
> > > out.
> >
> > This sounds like the only place where our current code differs. I
> > used the open/lockowner since it was explicitly referred to in the
> > RFC. (My assumption was that the author was thinking that an
> > open/lockowner equates to a client process/task that went south
> > without releasing the state as it should.) btw, when I say "revoke
> > an openowner" I mean that all stateids for all opens and lockowners
> > associated with that openowner will be revoked and use of all those
> > stateids will get NFS4ERR_ADMIN_REVOKED.
> >
> > My current code could easily be changed to revoke opens instead of
> > openowners.
> >
> > Do others see this as an issue? (ie. Does it matter w.r.t the
> > protocol whether the openowner with all associated opens or the
> > individual opens, gets revoked?)
>
> Administratively, I would expect that files or shares/exports are the
> unit of revocation. The admin is unlikely to care about openowners
> unless there seems to be something broken with the NFSv4 client and in
> that case the entire client might as well be revoked.
>
> Spencer
>
>
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www1.ietf.org/mailman/listinfo/nfsv4
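
The best-effort proposal in the reply above (keep EXPIRED state for a
few lease times, then free it so later references get BAD_STATEID) can
be sketched as follows. This is a minimal illustration only: the class
`ExpiredStateCache`, its method names, and the default of three lease
periods are assumptions made for the example, and time is passed in
explicitly rather than read from a clock.

```python
class ExpiredStateCache:
    """Hypothetical bounded memory of expired stateids for one client
    setcl instance. Entries are dropped after a few lease periods."""

    def __init__(self, lease_seconds, retain_leases=3):
        # retain_leases=3 is an assumed value for "a few lease times".
        self.retain_seconds = lease_seconds * retain_leases
        self.expired_at = {}  # stateid -> time at which it expired

    def mark_expired(self, stateid, now):
        self.expired_at[stateid] = now

    def purge(self, now):
        # Free every entry older than the retention window.
        stale = [sid for sid, when in self.expired_at.items()
                 if now - when >= self.retain_seconds]
        for sid in stale:
            del self.expired_at[sid]

    def clear(self):
        # A setcl/setcl-cf with a new verifier ends the instance.
        self.expired_at.clear()

    def error_for(self, stateid, now):
        """Error for a stateid that is not currently valid."""
        self.purge(now)
        if stateid in self.expired_at:
            return "NFS4ERR_EXPIRED"
        return "NFS4ERR_BAD_STATEID"
```

With a bounded window the amount of retained state is limited by how
much can expire within a few lease periods, rather than growing for the
lifetime of the server.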