Re: [nfsv4] What error to return if destination server fails to READ within cnr_lease_time

David Noveck <davenoveck@gmail.com> Fri, 18 December 2015 18:14 UTC

In-Reply-To: <E19898F5-E239-42D8-87FE-870712D5DA63@netapp.com>
References: <2EE02221-E9C5-4087-AFA6-1A1D52308C0C@netapp.com> <CADaq8jfxrJDvtfhEVeGMXkD4PLFcv5sfBP4g4B6VUfdhB_Ks2w@mail.gmail.com> <E19898F5-E239-42D8-87FE-870712D5DA63@netapp.com>
Date: Fri, 18 Dec 2015 13:14:33 -0500
Message-ID: <CADaq8jePDoEj=jt2pC5_ko+8wiX4d7EVO-EDKPmPJTNqthhFKg@mail.gmail.com>
From: David Noveck <davenoveck@gmail.com>
To: "Adamson, Andy" <William.Adamson@netapp.com>
Content-Type: multipart/alternative; boundary="089e011602f879e9f10527301cf9"
Archived-At: <http://mailarchive.ietf.org/arch/msg/nfsv4/pOrWraf0GzUTQPZBlx069aCzbag>
Cc: NFSv4 <nfsv4@ietf.org>
Subject: Re: [nfsv4] What error to return if destination server fails to READ within cnr_lease_time

> ?? The destination server is a client (referred to as dst-client)

Yes, it is a client, but there is another client as well.  I was referring
to that other one as "the client", which didn't help clarify things.

> it loads the client module, and the dst-client mounts and reads the data
> from the source server with _no_ change to the client code.

I can see why one might want that, but the situation is one that typical
client code does not deal with, particularly where lease considerations are
concerned.  Normally, one issues a READ with a stateid that was obtained in
the context of the client issuing the request.  If lease expiration occurs,
it is clear which lease has expired.  Here we have two clients, and the
stateid was obtained by one and is used by a different one.  If
NFS4ERR_EXPIRED means a lease has expired, the obvious question is "which
one?", and RFC5661 is not very clear about that since the situation doesn't
occur in v4.1.  I was assuming that the lease expiration would be for the
original client, while you seem to be assuming it would be for the
dst-client.
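
To make the "which one?" question concrete, here is a toy sketch in Python.
It is not taken from any implementation: the Read record and the function
name are made up for illustration, and the only point is that an ordinary
READ has one candidate lease while the inter-server READ has two.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """Hypothetical record of a READ; both field names are illustrative."""
    issuer: str         # client that sends the READ
    stateid_owner: str  # client whose OPEN produced the stateid


def leases_expired_could_mean(read: Read) -> set[str]:
    """Client leases that NFS4ERR_EXPIRED on this READ could refer to."""
    return {read.issuer, read.stateid_owner}


# Ordinary case: the client reads with a stateid it obtained itself.
ordinary = Read(issuer="client", stateid_owner="client")

# Inter-server copy: the dst-client reads with the original client's stateid.
inter_server = Read(issuer="dst-client", stateid_owner="client")

print(sorted(leases_expired_could_mean(ordinary)))
print(sorted(leases_expired_could_mean(inter_server)))
```

The first call yields a single lease and the second yields two, which is
exactly the ambiguity described above.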

> The READ is a normal READ.

It's normal in form, but it is done with a stateid that the dst-client did
not obtain through an OPEN of its own, which makes it kind of
"non-standard", if "abnormal" seems inappropriate.

> So if the dst-client gets an NFS4ERR_EXPIRED on the READ, it must assume
> that the stateid is bad, and try to recover it.

But there would be no way for it to do that.  The normal way to do that is
to do an OPEN, but the dst-client doesn't have the information to do that.

> But the stateid is not bad, the clientID is not expired. This has nothing
> to do with the cnr_lease_time and in my opinion, should not be returned.

OK, but I want to explain my logic, which I think still works if one
assumes, as I did, that the client code will have to be aware, as the server
code will have to be, of the non-standard nature of what is going on.
Since there are two leases involved, I was assuming NFS4ERR_EXPIRED would
mean that one or more of the following had occurred:

   1. The original client's lease has expired.  In this case, the best
   thing to do is to fail the copy immediately and let the original client do
   any necessary lease recovery on its own.
   2. The dst-client's lease has expired.  As you point out, this case is
   unlikely to arise, since the clientid has just been established, but it is
   still (just barely) possible that a communication break at just the wrong
   time can cause this to occur.  The dst-client can test any stateids that
   it got itself to see whether this is the case that occurred.  If it isn't,
   it can just fail the copy, since it doesn't matter whether 1 or 3
   occurred.  If 2 has occurred, the dst-client can only recover stateids it
   got for itself.  It has no ability to recover the others, and so in this
   case as well it should fail the copy.  The original client might then see
   that its stateid is usable and so reissue the copy immediately.
   3. The cnr_lease_time has expired.  In this case also, the best thing to
   do is to fail the copy immediately and let the original client do
   any necessary lease recovery on its own.
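
The three cases above boil down to a small decision procedure on the
dst-client, which can be sketched roughly as follows.  This is only an
illustration in Python, not code from any client: the Outcome type and
the own_stateids_still_valid callback are hypothetical names, and the
callback stands in for whatever check the dst-client uses on stateids it
obtained itself (TEST_STATEID would be one candidate, but that is an
assumption on my part).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Outcome:
    """What the dst-client does next (hypothetical type for illustration)."""
    fail_copy: bool
    recover_own_lease: bool


def on_read_expired(own_stateids_still_valid: Callable[[], bool]) -> Outcome:
    """React to NFS4ERR_EXPIRED on a READ that used the original client's stateid.

    The callback reports whether the stateids the dst-client obtained for
    itself still test as valid.
    """
    if own_stateids_still_valid():
        # Case 1 or case 3: the dst-client cannot tell which, and does not
        # need to.  It fails the copy and leaves recovery to the original
        # client.
        return Outcome(fail_copy=True, recover_own_lease=False)
    # Case 2: the dst-client's own lease expired.  It can recover only the
    # stateids it got for itself, so the copy still has to be failed.
    return Outcome(fail_copy=True, recover_own_lease=True)


print(on_read_expired(lambda: True))   # case 1 or 3
print(on_read_expired(lambda: False))  # case 2
```

Note that the copy is failed on every path; the only thing the test of the
dst-client's own stateids changes is whether it also has lease recovery of
its own to do.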



On Fri, Dec 18, 2015 at 10:08 AM, Adamson, Andy <William.Adamson@netapp.com>
wrote:

>
> > On Dec 17, 2015, at 10:46 PM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > > 1) What is the error returned by the source server on the READ?
> >
> > > It’s not NFS4ERR_EXPIRED
> >
> > I would not be so quick to dismiss this.  See below.
> >
> > > as this refers to the clients lease
> >
> > Normally it does, but since there is no NFS4ERR_CNR_LEASE_EXPIRED, I
> > think it is a reasonable accommodation.
>
> What if the stateid has actually expired, and the source server is saying
> ‘recover the stateid’ when it returns NFS4ERR_EXPIRED on the READ? In other
> words, there is no way to add the ‘this NFS4ERR_EXPIRED refers to the
> cnr_lease_time, not the stateid’ qualification.
>
> >
> > > and will prompt the client to recover the stateid.
> >
> > It would if the client got it, but the client is not going to get it in
> > this case.
>
> ?? The destination server is a client (referred to as dst-client) - it
> loads the client module, and the dst-client mounts and reads the data from
> the source server with _no_ change to the client code. The READ is a normal
> READ. So if the dst-client gets an NFS4ERR_EXPIRED on the READ, it must
> assume that the stateid is bad, and try to recover it. But the stateid is
> not bad, the clientID is not expired. This has nothing to do with the
> cnr_lease_time and in my opinion, should not be returned.
>
> >  The destination server is going to get it and he could reasonably
> > conclude that either:
> >       • The client's lease has expired:
>
>         in which case the dst-client will try to recover the stateid, and
> if that indicates that the clientID has expired, then recover the clientID,
> which is nuts, as the dst-client has _just_ established the clientID,
> since all that the dst-client does is mount, READ, umount.
>
> >       • cnr_lease_time has expired
>
>         in which case the destination server would just fail the COPY.
>
> > It would be nice if he could know which of these two occurred but it is
> > not essential.
> > In either case, the COPY has to be failed and the client will find out
> > soon enough whether his lease for the source server has expired or not.
>
> Really? How does the client know if the lease for the source server has
> expired?
>
>
> >  If it has, he is in a position to re-establish it.
>
> Well, a new COPY needs to be started.
>
> >
> > Another possibility is NFS4ERR_ADMIN_REVOKED if you want to distinguish
> > this from a true lease expiration.
>
> This also implies a stateid problem, not a cnr_lease_time problem and will
> prompt stateid recovery.
> >
> >
> > > 2) What is the error returned by the destination server on the COPY?
> >
> > One possibility is NFS4ERR_PARTNER_NO_AUTH.  That's an exact fit if you
> > know cnr_lease_time expired.  It's kind of a rough fit if either lease
> > could have expired.
>
>
> Ah! This is the error I was looking for.
>
> Why not add NFS4ERR_PARTNER_NO_AUTH as an error on a READ? This seems a
> simple addition to the protocol and would be
>
> —>Andy
>
>
>
> >
> > If you are unsure of the specific lease expiration causing the failure,
> > NFS4ERR_OFFLOAD_DENIED seems like it would do what you want.
> >
> > On Thu, Dec 17, 2015 at 3:42 PM, Adamson, Andy
> > <William.Adamson@netapp.com> wrote:
> > From draft-ietf-nfsv4-minorversion2-39 Section 15.3.3.  DESCRIPTION of
> > COPY_NOTIFY:
> >
> >    If this operation succeeds, the source server will allow the
> >    cna_destination_server to copy the specified file on behalf of the
> >    given user as long as both of the following conditions are met:
> >
> >
> >       The destination server begins reading the source file before the
> >       cnr_lease_time expires.
> >
> >
> >
> > So on an inter-SSC the source server starts the cnr_lease_time upon the
> > reply to COPY_NOTIFY, and if the cnr_lease_time expires prior to the
> > beginning of the READ from the source server, the source server fails
> > the READ.
> >
> > 1) What is the error returned by the source server on the READ?
> >
> > It’s not NFS4ERR_EXPIRED as this refers to the client's lease and will
> > prompt the client to recover the stateid.
> >
> >
> > 2) What is the error returned by the destination server on the COPY?
> >
> > I would hope it is the same error as returned by READ.
> >
> > Do we need a new error code?
> >
> > Suggestions?
> >
> > —>Andy
> >