Re: [nfsv4] What error to return if destination server fails to READ within cnr_lease_time

David Noveck <davenoveck@gmail.com> Sat, 19 December 2015 00:02 UTC


> All of the above complexity goes away if a READ for the COPY is able to
> return NFS4ERR_PARTNER_NO_AUTH.

Not all of it.  Only the complexity associated with situation #3.
Situations #1 and #2 remain.

Still, I think this is a good change to make.
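
To make the three situations concrete, here is a rough sketch of how a source server might pick the status for a READ issued on behalf of a COPY if the proposed change were adopted. It is only an illustration under stated assumptions: the CopyReadState record, its field names, and both function names are hypothetical and do not come from the draft, and the choice to check the two leases before cnr_lease_time is arbitrary.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NFS4_OK = "NFS4_OK"
    NFS4ERR_EXPIRED = "NFS4ERR_EXPIRED"
    NFS4ERR_PARTNER_NO_AUTH = "NFS4ERR_PARTNER_NO_AUTH"


@dataclass
class CopyReadState:
    """Hypothetical source-server view of one inter-server copy."""
    client_lease_valid: bool       # False corresponds to situation #1
    dst_client_lease_valid: bool   # False corresponds to situation #2
    notify_reply_time: float       # when the COPY_NOTIFY reply was sent
    cnr_lease_time: float          # seconds granted in that reply
    read_started: bool = False     # has the destination begun reading?


def source_read_status(state: CopyReadState, now: float) -> Status:
    """Status the source server returns for a READ done for a COPY."""
    # Situations #1 and #2 are genuine lease expirations and are still
    # reported as NFS4ERR_EXPIRED (ordering of checks is an assumption).
    if not state.client_lease_valid or not state.dst_client_lease_valid:
        return Status.NFS4ERR_EXPIRED
    # Situation #3: the destination did not begin reading in time.
    if not state.read_started:
        if now - state.notify_reply_time > state.cnr_lease_time:
            return Status.NFS4ERR_PARTNER_NO_AUTH
        state.read_started = True
    return Status.NFS4_OK


def destination_must_fail_copy(read_status: Status) -> bool:
    """In all three situations the destination server fails the COPY."""
    return read_status is not Status.NFS4_OK
```

The only point of the sketch is that situation #3 gets its own status, so the destination server no longer has to guess at it, while situations #1 and #2 still share NFS4ERR_EXPIRED and the COPY is failed in every case.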

On Fri, Dec 18, 2015 at 2:09 PM, Adamson, Andy <William.Adamson@netapp.com>
wrote:

>
> > On Dec 18, 2015, at 1:14 PM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > > ?? The destination server is a client (referred to as dst-client)
> >
> > Yes it is a client but there is another client as well.  I was referring
> to that as "the client" which didn't help clarify things.
> >
> > > it loads the client module, and the dst-client mounts and reads the
> data from the source server with _no_ change to the client code.
> >
> > I can see why one might want that, but the situation is one that typical
> client code does not deal with, particularly when lease considerations are
> concerned.  Normally, one issues a read with a stateid that was gotten in
> the context of the client issuing the request.  If lease expiration occurs,
> it is clear which lease has expired.  Here we have two clients and the
> stateid was gotten by one and is used by a different one.  If
> NFS4ERR_EXPIRED means a lease has expired, the obvious question is "which
> one?" and RFC5661 is not very clear about that since the situation doesn't
> occur in  v4.1.  I was assuming that the lease expiration would be for the
> original client while you seem to be assuming it would be for the
> dst-client.
> >
> > > The READ is a normal READ.
> >
> > It's normal in form, but it is done with a stateid that it didn't get
> with an OPEN that it did, which makes it kind of "non-standard", if
> "abnormal" seems inappropriate.
> >
> > > So if the dst-client gets an NFS4ERR_EXPIRED on the READ, it must
> assume that the stateid is bad, and try to recover it.
> >
> > But there would be no way for it to do that.  The normal way to do that is
> to do an OPEN, but the dst-client doesn't have the information to do that.
> >
> > > But the stateid is not bad, the clientID is not expired. This has
> nothing to do with the cnr_lease_time and in my opinion, should not be
> returned.
> >
> > OK, but I want to explain my logic, which I think still works, if one
> assumes, as I did, that the client code, like the server code, will have to
> be aware of the non-standard nature of what is going on.
> Since, there are two leases involved, I was assuming NFS4ERR_EXPIRED would
> mean that one or more of the following had occurred:
> >       • The original client's lease has expired.  In this case, the best
> thing to do is to fail the copy immediately and let the original client do
> any necessary lease recovery on his own
> >       • The dst-client's lease has expired.  As you point out, this case is
> unlikely to arise, since the clientid has just been established, but it is
> still (just barely) possible that a communication break at just the wrong
> time can cause this to occur.  The dst-client can test any stateids that
> it got itself to see if this is the case that occurred.
>
> Um - the stateid in question did not come from the dest-client. It came
> from the client. This means that the stateid used by the READ to the source
> server is not associated with the clientID established by the mount of the
> dest-client. We already have special code on the source server to allow for
> a lookup of a stateid against a different clientID than the SEQUENCE
> operation resolves to - special for the COPY READ. How can it test the
> stateid!! We would need more special code on the server. This is IMHO not
> the direction to go.
>
> >  If it didn't, it can just fail the copy since it doesn't matter whether
> 1 or 3 occurred.  If 2 has occurred, the dst-client can only recover
> stateids it got for itself.  It has no ability to recover the others and so
> in this case as well it should fail the copy.  In this case, the original
> client might see that its stateid is usable and so reissue the copy
> immediately
> >       • The cnr_lease_time has expired.  In this case also, the best
> thing to do is to fail the copy immediately and let the original client do
> any necessary lease recovery on his own
>
> All of the above complexity goes away if a READ for the COPY is able to
> return NFS4ERR_PARTNER_NO_AUTH.
>
> —>Andy
>
> >
> >
> > On Fri, Dec 18, 2015 at 10:08 AM, Adamson, Andy <
> William.Adamson@netapp.com> wrote:
> >
> > > On Dec 17, 2015, at 10:46 PM, David Noveck <davenoveck@gmail.com>
> wrote:
> > >
> > > > 1) What is the error returned by the source server on the READ?
> > >
> > > > It’s not NFS4ERR_EXPIRED
> > >
> > > I would not be so quick to dismiss this.  See below.
> > >
> > > > as this refers to the client's lease
> > >
> > > Normally it does but since there is no NFS4ERR_CNR_LEASE_EXPIRED, I
> think it is a reasonable accommodation.
> >
> > What if the stateid has actually expired, and the source server is
> saying ‘recover the stateid’ when it returns NFS4ERR_EXPIRED on the READ?
> In other words, there is no way to add the ‘this NFS4ERR_EXPIRED means that
> the cnr_lease_time, not the stateid, has expired’ indication.
> >
> > >
> > > > and will prompt the client to recover the stateid.
> > >
> > > It would if the client got it, but the client is not going to get it
> in this case.
> >
> > ?? The destination server is a client (referred to as dst-client) - it
> loads the client module, and the dst-client mounts and reads the data from
> the source server with _no_ change to the client code. The READ is a normal
> READ. So if the dst-client gets an NFS4ERR_EXPIRED on the READ, it must
> assume that the stateid is bad, and try to recover it. But the stateid is
> not bad, the clientID is not expired. This has nothing to do with the
> cnr_lease_time and in my opinion, should not be returned.
> >
> > >  The destination server is going to get it and he could reasonably
> conclude that either:
> > >       • The client's lease has expired:
> >
> >         in which case the dst-client will try to recover the stateid,
> and if that indicates that the clientID has expired, then recover the
> clientID, which is nuts, as the dst-client has _just_ established the
> clientID, as all that dest-client does is mount, READ, umount.
> >
> > >       • cnr_lease_time has expired
> >
> >         in which case the destination server would just fail the COPY.
> >
> > > It would be nice if he could know which of these two occurred but it
> is not essential.
> > > In either case, the COPY has to be failed and the client will find
> out soon enough whether his lease for the source server has expired or not.
> >
> > Really? How does the client know if the lease for the source server has
> expired?
> >
> >
> > >  If it has, he is in a position to re-establish it.
> >
> > Well, a new COPY needs to be started.
> >
> > >
> > > Another possibility is NFS4ERR_ADMIN_REVOKED if you want to
> distinguish this from a true lease expiration.
> >
> > This also implies a stateid problem, not a cnr_lease_time problem and
> will prompt stateid recovery.
> > >
> > >
> > > > 2) What is the error returned by the destination server on the COPY?
> > >
> > > One possibility is NFS4ERR_PARTNER_NO_AUTH.  That's an exact fit if
> you know cnr_lease_time expired.  It's kind of a rough fit if either lease
> could have expired.
> >
> >
> > Ah! This is the error I was looking for.
> >
> > Why not add NFS4ERR_PARTNER_NO_AUTH as an error on a READ? This seems a
> simple addition to the protocol and would be
> >
> > —>Andy
> >
> >
> >
> > >
> > > If you are unsure of the specific lease expiration causing the failure,
> NFS4ERR_OFFLOAD_DENIED seems like it would do what you want.
> > >
> > > On Thu, Dec 17, 2015 at 3:42 PM, Adamson, Andy <
> William.Adamson@netapp.com> wrote:
> > > From draft-ietf-nfsv4-minorversion2-39 Section 15.3.3.  DESCRIPTION of
> COPY_NOTIFY:
> > >
> > >    If this operation succeeds, the source server will allow the
> > >    cna_destination_server to copy the specified file on behalf of the
> > >    given user as long as both of the following conditions are met:
> > >
> > >
> > >       The destination server begins reading the source file before the
> > >       cnr_lease_time expires.
> > >
> > >
> > >
> > > So on an inter-SSC the source server starts the cnr_lease_time upon
> the reply to COPY_NOTIFY, and
> > > if the cnr_lease_time expires prior to the beginning of the READ from
> the source
> > >  server, the source server fails the READ.
> > >
> > > 1) What is the error returned by the source server on the READ?
> > >
> > > It’s not NFS4ERR_EXPIRED as this refers to the client’s lease and will
> prompt the client to recover the stateid.
> > >
> > >
> > > 2) What is the error returned by the destination server on the COPY?
> > >
> > > I would hope it is the same error as returned by READ.
> > >
> > > Do we need a new error code?
> > >
> > > Suggestions?
> > >
> > > —>Andy
> > >