RE: [nfsv4] more re: client re-using lock_owner

"Noveck, Dave" <Dave.Noveck@netapp.com> Sat, 21 May 2005 14:19 UTC

Even supposing that a server could have complicated
Schrodinger logic that works, I don't think this idea
(client resetting seqids when there is no open state)
would work.

Suppose a client with a given lockowner opens a file,
does a little stuff, and closes, with the owner at the
close being at seqid 3.

Now suppose that later the client sends seqid 0 (same
owner string, sent as part of a "new" openowner), and the
server accepts it and asks for an open confirm, which
is duly sent with seqid 1.

Now you receive seqid 2 and the question arises whether it
is part of the "new" openowner sequence or is something that has
just been disgorged by a router.  The server can't tell, so
if he gets seqid 2 he will execute it, correctly or not.

The issue here is that the spec requires ascending seqid's
for a given owner for a reason.  If they can be reset to zero
then you can have different messages with the same seqid/
owner-string/clientid triple and that is *bad*.
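
As a rough illustration of the ambiguity, here is a minimal sketch of a
per-owner seqid check. The names (OwnerState, classify) are hypothetical
and not taken from the spec, and 32-bit seqid wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class OwnerState:
    # seqid of the last request the server executed for this owner
    last_seqid: int


def classify(state: OwnerState, seqid: int) -> str:
    """Classify an incoming seqid against the owner's recorded state."""
    if seqid == state.last_seqid + 1:
        return "execute"    # next in sequence: treated as a new request
    if seqid == state.last_seqid:
        return "replay"     # retransmission of the last request
    return "bad_seqid"      # anything else is out of sequence


# Owner closed at seqid 3: a stale seqid-2 request from the router is rejected.
before_reset = classify(OwnerState(last_seqid=3), 2)

# Reset accepted (OPEN at seqid 0, OPEN_CONFIRM at seqid 1): the same stale
# request now looks like the next operation in the "new" sequence.
after_reset = classify(OwnerState(last_seqid=1), 2)

print(before_reset, after_reset)
```

The second call is the bad case: the same owner/seqid pair now names two
different messages, and the check has no way to tell them apart.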

If a client tries to reset the seqid, the result should be
BADSEQID.  Under some circumstances, the server may be 
unable to detect this situation (dropping apparently unused
state), and so the spec cannot require this in all situations,
but it is something the server should try to do and it 
certainly should do that when it has the information available,
as in the case we are talking about.

If the client resets the seqid he has to expect BADSEQID.  If
the server is unable to make the check, then OPEN_CONFIRM is
used to determine whether the potentially bad seqid received is
OK or not.  If the client's response is to send OPEN_CONFIRM,
then he is saying that the seqid for the open is the correct one,
i.e. it is in the correct sequence.  If he does that when it 
isn't, then he is deliberately violating the protocol and the
server will not be able to reliably detect replays.
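
The two cases above (server still has the owner's state vs. server has
dropped it) might be sketched as follows. This is only an illustration
under simplifying assumptions: handle_open and the result strings are
made-up names, and the owner table is a plain dict.

```python
def handle_open(owners: dict, owner: str, seqid: int) -> str:
    """owners maps owner string -> last executed seqid.

    A missing entry means the server dropped the (apparently unused) state.
    """
    last = owners.get(owner)
    if last is None:
        # No information to check against: accept, but make the client
        # vouch for the seqid by requiring OPEN_CONFIRM.
        owners[owner] = seqid
        return "OK_CONFIRM_REQUIRED"
    if seqid == last + 1:
        owners[owner] = seqid
        return "OK"
    if seqid == last:
        return "REPLAY"
    # The server has the information, so a reset is detected.
    return "BADSEQID"


# Server remembers the owner at seqid 3; client tries to restart at 0.
with_state = handle_open({"owner-A": 3}, "owner-A", 0)

# Server has dropped the owner; the same OPEN can only be confirmed.
without_state = handle_open({}, "owner-A", 0)

print(with_state, without_state)
```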



Mike Eisler wrote:
> The other thing is that OPEN_CONFIRM is a kludge 
> (courtesy me), but it was put in to save a separate round 
> trip to establish an open_owner. Given the separation
> between open_owners and lock_owners a separate operation 
> for creating open_owners might not be a bad thing. But anyway, 
> draft-ietf-nfsv4-sess-01.txt kills OPEN_CONFIRM.

It's a brief for the prosecution.  The v4.1 spec will be the 
death sentence, but since sessions are optional, we will have
to wait at least until v4.2 to actually kill it.  Call me
bloodthirsty but I'd like to get on with it.


-----Original Message-----
From: Mike Eisler [mailto:email2mre-ietf@yahoo.com]
Sent: Friday, May 20, 2005 9:34 PM
To: nfsv4@ietf.org
Subject: RE: [nfsv4] more re: client re-using lock_owner


> I assume this is Mike's obfuscated reply address.
                   ^^^^
                   Eisler

This new address will self destruct once the spammers
find it. (My established correspondents should be able
to use my previous email address, unless they are 
asking me to buy mortgages from Nigerian banks backed by
shareholders who made it big in Viagra sales. Oops,
I've just tripped all your spamassassin filters and you
won't see this).

> email2mre-ietf@yahoo.com wrote:
>  > Here's what bothers me about #2. Let's say reason for
>  > the re-used open_owner is not because the client has
>  > forgotten, but because of a retry of an OPEN,
>  > but the client still remembers the open_owner state.
> 
> If the previous OPEN succeeded the server will have state

Not sure you understand what Trond and I are saying.

Let's say the sequence of events is:

time t: OPEN file 1 (seq 1) -->

   times out/held up somewhere

time t+1: OPEN file 1 [retry] (seq 1) -->

time t+2                     <-- OPEN resp from time t

time t+3: client requences response sent at t+1.

time t+4: OPEN_CONFIRM (seq 2) -->

time t+5:                    <-- OPEN_CONFIRM resp

time t+6: OPEN file 2 (seq 3) -->

...

time t+n ... time t+n+m (all files for the open_owner closed, all locks released)

sequence number for open_owner is now 1000

time t+n+m+1: OPEN retry from t+1 finally reaches server.

          at same time:

time t+n+m+2: OPEN file100 (seq 1000) -->

The client, at t+n+m+2 thinks his seq is 1000. If the
server implements door #2, then the open for file100 gets BADSEQ,
because the OPEN at t+n+m+1 is accepted without a BADSEQ.
I believe this will confuse most clients, and perhaps all of
the clients that show up at bakeathons.
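
The tail end of that timeline can be replayed in a few lines. This is a
toy model, not any real server's logic: run(), the labels and the result
strings are invented for illustration, and "door #2" is modelled simply as
"accept an out-of-sequence OPEN when the owner has no open state".

```python
def run(door2: bool) -> list:
    next_seqid = 1000        # server's expected seqid for the open_owner
    has_open_state = False   # all files closed, all locks released
    results = []
    # Arrival order at the server: the stale retry first, then the new OPEN.
    arrivals = [("stale OPEN retry", 1), ("OPEN file100", 1000)]
    for label, seqid in arrivals:
        if seqid == next_seqid:
            next_seqid += 1
            has_open_state = True
            results.append((label, "OK"))
        elif door2 and not has_open_state:
            # Door #2: assume the client forgot the owner and restart the
            # sequence from the seqid it sent.
            next_seqid = seqid + 1
            has_open_state = True   # an unconfirmed open is now pending
            results.append((label, "OK, OPEN_CONFIRM requested"))
        else:
            results.append((label, "BADSEQ"))
    return results


print(run(door2=True))
print(run(door2=False))
```

With door #2 the stale retry is accepted and the client's legitimate OPEN
at seq 1000 is the one that fails; without it, the outcomes are reversed.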

>  > If we return NFS4ERR_BADSEQID,
>  > then this doesn't perturb the existing sequence number state
>  > for the open_owner. If we request open confirmation, then
>  > the server has two choices:
> 
>  > 1. Perturb the state. So if the client in fact had not forgotten
>  >    about the open_owner's previous sequence number, the next time
>  >    client goes to use a sequence number from the previous use of
>  >    the open_owner, he gets NFS4ERR_BADSEQID. And gets very
>  >    confused.
> 
> This would only happen if we have a buggy client. If the server
> requests an open_confirm he is telling the client that they

I should have said if the server "requests open_confirm and resets the sequence
number to 2"

> are establishing new state. If a client holds onto
> its old state after getting an OPEN4_RESULT_CONFIRM it is
> just broken.

This was a retried OPEN that the client has long since
forgotten about, because he has long since achieved success
with that operation.

The client doesn't care if the server is requesting OPEN_CONFIRM;
he's going to drop the OPEN response. He's going to drop it because
the xid of that response does not correspond to any outstanding
request.

The problem is that, short of an infinite duplicate request
cache, the server cannot distinguish (1) retried OPENs from (2)
a reset of the sequence numbers by a new OPEN from a client that
has forgotten about his unused open_owner. So he has to maintain the multiple
quantum-mechanics-like states I mentioned until the next operation.
If the next operation is OPEN_CONFIRM with seq #2 and the matching stateid X,
then the other state associated with sequence #1000 is disposed of. If the
next operation is OPEN with sequence #1000, then the state with sequence #2
is disposed of. So implementing door #2 requires Schrodinger states.
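
To make the bookkeeping concrete, here is a small sketch of the two
coexisting histories and how the next operation collapses them. The
resolve() function and the "old"/"new" labels are hypothetical, and the
stateid match mentioned above is left out to keep it short.

```python
def resolve(candidates: dict, op: str, seqid: int) -> dict:
    """Return the surviving history after seeing the next operation.

    candidates maps a label to the next seqid expected in that history.
    """
    if op == "OPEN_CONFIRM" and seqid == candidates.get("new"):
        # Client confirmed the restarted sequence: drop the old history.
        return {"new": seqid + 1}
    if op == "OPEN" and seqid == candidates.get("old"):
        # Client carried on with the old sequence: drop the restarted one.
        return {"old": seqid + 1}
    # Nothing matched, so both histories stay alive in this sketch.
    return candidates


# After the stale OPEN (seq 1) is accepted, both histories coexist.
both = {"old": 1000, "new": 2}

print(resolve(both, "OPEN_CONFIRM", 2))
print(resolve(both, "OPEN", 1000))
```

Even in this reduced form, the server carries two expected seqids per
owner until the client's next request arrives.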

(The point of all this sequence number stuff we (you actually :-) added
to NFSv4 was to dispense with the dup request cache for nasty
non-idempotent operations like open and locks.)

What is missing here is a RELEASE_OPENOWNER operation. With that,
the issue of clients forgetting about unused OPEN_OWNERS would be moot.

The other thing is that OPEN_CONFIRM is a kludge 
(courtesy me), but it was put in to save a separate round 
trip to establish an open_owner. Given the separation
between open_owners and lock_owners a separate operation 
for creating open_owners might not be a bad thing. But anyway, 
draft-ietf-nfsv4-sess-01.txt kills OPEN_CONFIRM.


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4
