[nfsv4] more re: admin revoke and edge conditions

rick@snowhite.cis.uoguelph.ca Sat, 02 October 2004 17:32 UTC

From: rick@snowhite.cis.uoguelph.ca
Date: Sat, 02 Oct 2004 13:30:25 -0400
Message-Id: <200410021730.NAA96099@snowhite.cis.uoguelph.ca>
To: nfsv4@ietf.org
Subject: [nfsv4] more re: admin revoke and edge conditions

[Spencer's good stuff snipped, which helped clarify some of my confusion]

> In these cases, not all client state would be removed or need to be
> removed.  However, the client should be told of the revocation and
> hence the NFS4ERR_ADMIN_REVOKED.  The reason for returning this error
> on RENEW is similar to the NFS4ERR_CB_PATH_DOWN such that the client
> is active on other parts of the server's resources but not the part
> where the state has been revoked.  The client could be told earlier of
> the ADMIN_REVOKED than later and it could then in turn notify the
> client environment as per its policies.  The idea was that the client
> could "check" its state to determine what had been revoked; no clear
> guidance was obviously given but something like a zero length read on
> open files may suffice.

Ok, so Renew should return NFS4ERR_ADMIN_REVOKED whenever any of the state
(doesn't have to be all of it) has been revoked?

And, it sounds like this Renew Op that returns NFS4ERR_ADMIN_REVOKED does
Renew the client's lease. Is that correct?
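
To make the two questions concrete, here is a minimal sketch of a Renew handler, assuming the answer to both is "yes": the lease is renewed even when the reply is NFS4ERR_ADMIN_REVOKED, and a single revoked lock is enough to trigger that reply. The class and function names (ClientState, renew) are hypothetical and not taken from any server or from the spec.

```python
# Hypothetical sketch of one reading of Renew; not the spec's definition.
NFS4_OK = "NFS4_OK"
NFS4ERR_EXPIRED = "NFS4ERR_EXPIRED"
NFS4ERR_ADMIN_REVOKED = "NFS4ERR_ADMIN_REVOKED"


class ClientState:
    """Per-clientid state kept by an imaginary server."""

    def __init__(self, lease_time, now):
        self.lease_time = lease_time
        self.lease_expiry = now + lease_time
        self.revoked = set()  # names of locks an administrator revoked


def renew(state, now):
    """Renew the lease, then report whether any state was revoked."""
    if now > state.lease_expiry:
        return NFS4ERR_EXPIRED
    # Assumption: the lease is extended before the revoke check, so a
    # reply of ADMIN_REVOKED still renews the client's lease.
    state.lease_expiry = now + state.lease_time
    if state.revoked:
        return NFS4ERR_ADMIN_REVOKED
    return NFS4_OK
```

With this reading, a client that keeps calling Renew never sees NFS4ERR_EXPIRED, only NFS4ERR_ADMIN_REVOKED for as long as the server remembers the revoke.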

> If the server has rebooted, it just needs to return things like
> NFS4ERR_STALE_CLIENTID/NFS4ERR_STALE_STATEID.  The ADMIN_REVOKED is
> not required because all state has been removed for the client.

I did my usual crappy job of explaining what my concern was, so I think
Spencer didn't follow it. I'll try again with an example.
(Dave might have understood or he might be thinking
of something else. I haven't looked at it closely enough to see what
effect sessions might have on this. I have a hunch that they will help,
but it's just a hunch.)
For example:
1 - client A has lock on xxx
2 - administrator revokes this lock on xxx for client A
3 - client B gets lock on xxx that would have conflicted with (1)
4 - server crashes/reboots
5 - both clients see STALECLIENTID or STALESTATEID and start reclaiming locks
6 - client A requests reclaim of lock on xxx
7 - client B requests reclaim of lock on xxx, that would conflict with (6)

If (1)->(4) occurs in less than a lease time (or client A is network
partitioned until (5)), client A never sees NFS4ERR_ADMIN_REVOKED.

Now, the question is, what is the server's correct response to (6) and (7)?
If the server doesn't keep anything about the revoke at (2) in stable storage,
it would be:
(6) - NFS4_OK
(7) - NFS4ERR_RECLAIM_CONFLICT
and this doesn't seem correct to me.

It seems to me that the server needs to use stable storage such that, after
reboot, it would reply:
(6) - NFS4ERR_xxx
(7) - NFS4_OK
which then leads to what NFS4ERR_xxx should be? I was thinking NFS4ERR_NOGRACE,
but will a client expect that error for only some of the reclaims and not all
of them?
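
The scenario above can be replayed with a toy model. This is only a sketch under stated assumptions: Server, admin_revoke, reclaim and run_scenario are hypothetical names, the "stable storage" is just a set that survives reboot(), and NFS4ERR_NOGRACE stands in for the undecided NFS4ERR_xxx.

```python
# Toy model of steps (1)-(7); all names here are illustrative only.
NFS4_OK = "NFS4_OK"
NFS4ERR_DENIED = "NFS4ERR_DENIED"
NFS4ERR_RECLAIM_CONFLICT = "NFS4ERR_RECLAIM_CONFLICT"
NFS4ERR_NOGRACE = "NFS4ERR_NOGRACE"  # stand-in for NFS4ERR_xxx


class Server:
    def __init__(self, record_revokes, stable=None):
        self.record_revokes = record_revokes
        # "Stable storage": (client, file) pairs whose lock was revoked.
        self.stable = stable if stable is not None else set()
        self.locks = {}  # volatile: file -> client holding the lock

    def lock(self, client, name):
        holder = self.locks.get(name)
        if holder is not None and holder != client:
            return NFS4ERR_DENIED
        self.locks[name] = client
        return NFS4_OK

    def admin_revoke(self, client, name):
        if self.locks.get(name) == client:
            del self.locks[name]
            if self.record_revokes:
                self.stable.add((client, name))

    def reboot(self):
        # Volatile lock table is lost; stable storage is carried over.
        return Server(self.record_revokes, stable=self.stable)

    def reclaim(self, client, name):
        if (client, name) in self.stable:
            return NFS4ERR_NOGRACE
        holder = self.locks.get(name)
        if holder is not None and holder != client:
            return NFS4ERR_RECLAIM_CONFLICT
        self.locks[name] = client
        return NFS4_OK


def run_scenario(record_revokes):
    s = Server(record_revokes)
    s.lock("A", "xxx")            # (1)
    s.admin_revoke("A", "xxx")    # (2)
    s.lock("B", "xxx")            # (3)
    s = s.reboot()                # (4), (5)
    return s.reclaim("A", "xxx"), s.reclaim("B", "xxx")  # (6), (7)
```

Calling run_scenario(False) gives the first pair of replies (A's reclaim succeeds and B's conflicts); run_scenario(True) gives the second pair, with A refused and B's reclaim succeeding.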

I was also thinking that I would handle admin revoke using the same stable
storage mechanism as for lease expiry (essentially as described in 8.6.3), but
that has a couple of negative implications:
a) - client A can't reclaim any of its state for the above scenario
b) - once I note in stable storage that admin revoke has happened to
     client A, I can't come up with a way, short of the client doing a new
     SetClientID, of recognizing that it is ok to let client A do reclaims,
     and this doesn't seem like a good thing. (This was what I was trying
     to ask about in the last post, believe it or not:-)
     For example:
     1 - client A has lock on xxx
     2 - administrator revokes that lock, which is noted in stable storage
     3 - client A sees NFS4ERR_ADMIN_REVOKED, but doesn't do a SetClientID,
         since it doesn't see NFS4ERR_EXPIRED
	 (this assumes that some Op is renewing the lease, even though
	  the lock has been admin revoked. If no Op can renew the lease,
	  then everything will expire and you might as well just revoke all
	  the state for a clientid at once, because it will expire soon
	  anyhow.)
     4 - client A chugs merrily along for weeks and has lots of other valid
         state
     5 - server crashes/reboots
     6 - client A sees STALECLIENTID or STALESTATEID
     7 - client A tries to reclaim state and gets NFS4ERR_NOGRACE for all of
         it

The more I look at it, it seems that there are only two alternatives (at
least until sessions are in place, which I haven't looked at closely
enough yet, to understand how they might help).

1 - forget about the cases where not all of the clientid's state is admin
    revoked and just revoke it all at once, then let the lease expire.
OR
2 - figure out a more sophisticated stable storage structure that handles
    the edge cases, so that the NFS4ERR_xxx replied to A's reclaim above,
    can be ADMIN_REVOKED.
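
As a rough illustration of the difference between reusing the lease-expiry record (one entry per clientid) and alternative 2 (one entry per revoked lock), consider the sketch below. It is a minimal comparison only: the function names and the shape of the records are assumptions, and it ignores conflicts with other clients and the question of when a record can be discarded.

```python
# Hypothetical comparison of two stable-storage layouts for admin revoke.

def reclaim_per_client(revoked_clients, client, lock):
    """Lease-expiry style record: one entry per clientid."""
    if client in revoked_clients:
        return "NFS4ERR_NOGRACE"  # every reclaim by this client fails
    return "NFS4_OK"


def reclaim_per_lock(revoked_locks, client, lock):
    """Finer-grained record: one entry per (clientid, lock) pair."""
    if (client, lock) in revoked_locks:
        return "NFS4ERR_ADMIN_REVOKED"  # only the revoked lock fails
    return "NFS4_OK"


# Client A had its lock on xxx revoked but still holds locks on yyy and zzz.
held = ["xxx", "yyy", "zzz"]
coarse = [reclaim_per_client({"A"}, "A", name) for name in held]
fine = [reclaim_per_lock({("A", "xxx")}, "A", name) for name in held]
```

With the per-client record all three reclaims are refused, which is negative implication (a); with the per-lock record only the reclaim of xxx is refused and the rest of client A's state survives the reboot.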

> In the case of server reboot, all of this is moot.  The server doesn't
> need to worry about partial revocation before the reboot because all
> of it is gone now.

But, only if the client knows the state was revoked and doesn't try to reclaim
it, right?

Am I making sense or still out in left field? rick
