Re: [nfsv4] RFC: Is Open/Claim_Delegate_Prev done before/after Reclaim_complete?

Rick Macklem <rick.macklem@gmail.com> Thu, 11 April 2024 01:31 UTC

MIME-Version: 1.0
References: <CAM5tNy4y36AEyD5SeOZ2F1t8AYhu8-ZKPyNvCd-dmydFg1kJtA@mail.gmail.com> <19e3502ca1859608975bd17dc18e6774bf0bfab3.camel@hammerspace.com> <CAM5tNy41j0BLi0ixu=VA_rT14GxL7ZKKwbSyb1UAgf-xAE8m+g@mail.gmail.com> <0d1c12d3e85a5feb165c13b44b0f25111224166f.camel@hammerspace.com> <CAM5tNy7i3SyY0mLbd+vsmO-jpkZ6Bdr8hz3w5T-PhR8GdbMpnA@mail.gmail.com> <CADaq8jdEUrf7Bp5njeKEbzLFW+vYdvNAaZtqeyfCcVVHmaYMxg@mail.gmail.com> <CAM5tNy61c+4DtkLQBfJnHagd+uc-OpR7HfEL0TCqd-hw-DZeCA@mail.gmail.com>
In-Reply-To: <CAM5tNy61c+4DtkLQBfJnHagd+uc-OpR7HfEL0TCqd-hw-DZeCA@mail.gmail.com>
From: Rick Macklem <rick.macklem@gmail.com>
Date: Wed, 10 Apr 2024 18:31:07 -0700
Message-ID: <CAM5tNy4sK2mFNPcT-58b_ad1E6-Pijkiqu5xfTK3osON07nC3w@mail.gmail.com>
To: David Noveck <davenoveck@gmail.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/5AzdlyQ5XBpunrvm5eP8DYTQcwI>
Subject: Re: [nfsv4] RFC: Is Open/Claim_Delegate_Prev done before/after Reclaim_complete?

On Sun, Apr 7, 2024 at 8:33 AM Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Sun, Apr 7, 2024 at 7:52 AM David Noveck <davenoveck@gmail.com> wrote:
> >>
> >>
> >>
> >> The more I think about it, the more I think you are correct, at least
> >> by default.
> I was referring to the case where the client cannot reacquire the delegation,
> such as when the Open/Claim_delegate_prev gets a reply of NFS4ERR_RECLAIM_BAD.
> (Btw, Sec. 15.1.9.4 states "before the server restart or file system
> migration event".
> If we are going to use NFS4ERR_RECLAIM_BAD instead of NFS4ERR_EXPIRED, these
> words will need to be tweaked.)
>
> I am fine with either error return, but the current wording in Sec.
> 15.1.9.4 defines it for
> specific cases that are not this one.
>
> >
> >
> > Not fully correct. Some of what I told you is wrong and will be corrected in -04. The current target for submission is 4/23 but it is likely I can pull that forward by a week. In any case, I hope the following fragment will be helpful:
> >
> > /*
> >  * Includes multiple types of operations:
> >  *
> >  *    - Non-reclaim operations valid independent of grace period status.
> >  *    - Reclaim operations only used during a grace period.
> >  *    - Reclaim operations only used during a special delegation recovery
> >  *      period.
> >  */
> >
> > enum open_claim_type4 {
> >
> >         CLAIM_NULL              = 0,    /* Non-reclaim operation. */
> >         CLAIM_PREVIOUS          = 1,    /* Reclaim operation -- grace
> >                                            period only. */
> >         CLAIM_DELEGATE_CUR      = 2,    /* Non-reclaim operation. */
> >         CLAIM_DELEGATE_PREV     = 3,    /* Reclaim operation -- special
> >                                            delegation recovery period
> >                                            only. */
> Sounds correct to me, assuming that "special delegation recovery period" is
> described somewhere. Otherwise I'd just say something like
> "delegation reclaim - not during grace period".
>
> >         /*
> >          * Beyond this point, all values are new to v4.1.
> >          */
> >
> >         /*
> >          * Like CLAIM_NULL, but object identified
> >          * by the current filehandle.
> >          */
> >         CLAIM_FH                = 4,    /* Non-reclaim operation. */
> >
> >         /*
> >          * Like CLAIM_DELEGATE_CUR, but object identified
> >          * by current filehandle.
> >          */
> >         CLAIM_DELEG_CUR_FH      = 5,    /* Non-reclaim operation. */
> >
> >         /*
> >          * Like CLAIM_DELEGATE_PREV, but object identified
> >          * by current filehandle.
> >          */
> >         CLAIM_DELEG_PREV_FH     = 6     /* Reclaim operation -- special
> >                                            delegation recovery period
> >                                            only. */
> > };
> >
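For readers following along, the three categories named in the fragment's
header comment can be written as a small C helper. This is only a sketch and
is not part of the draft: claim_class, its enumerators, classify_claim() and
classify_claim_example() are hypothetical names, and the mapping simply
restates the per-value comments in the fragment above.

```c
#include <assert.h>

/* Values copied from the quoted open_claim_type4 fragment. */
enum open_claim_type4 {
    CLAIM_NULL          = 0,
    CLAIM_PREVIOUS      = 1,
    CLAIM_DELEGATE_CUR  = 2,
    CLAIM_DELEGATE_PREV = 3,
    CLAIM_FH            = 4,
    CLAIM_DELEG_CUR_FH  = 5,
    CLAIM_DELEG_PREV_FH = 6
};

/* Hypothetical names for the three categories in the header comment. */
enum claim_class {
    CLAIM_CLASS_NON_RECLAIM,     /* non-reclaim operation */
    CLAIM_CLASS_GRACE_RECLAIM,   /* reclaim, grace period only */
    CLAIM_CLASS_DELEG_RECOVERY,  /* reclaim, delegation recovery period only */
    CLAIM_CLASS_UNKNOWN          /* value not listed in the fragment */
};

/* Map a claim type to the category given in the fragment's comments. */
static enum claim_class classify_claim(enum open_claim_type4 claim)
{
    switch (claim) {
    case CLAIM_NULL:
    case CLAIM_DELEGATE_CUR:
    case CLAIM_FH:
    case CLAIM_DELEG_CUR_FH:
        return CLAIM_CLASS_NON_RECLAIM;
    case CLAIM_PREVIOUS:
        return CLAIM_CLASS_GRACE_RECLAIM;
    case CLAIM_DELEGATE_PREV:
    case CLAIM_DELEG_PREV_FH:
        return CLAIM_CLASS_DELEG_RECOVERY;
    }
    return CLAIM_CLASS_UNKNOWN;
}

/* Usage example: both *_PREV delegation claims fall in the same category. */
static void classify_claim_example(void)
{
    assert(classify_claim(CLAIM_DELEGATE_PREV) ==
           classify_claim(CLAIM_DELEG_PREV_FH));
    assert(classify_claim(CLAIM_PREVIOUS) == CLAIM_CLASS_GRACE_RECLAIM);
}
```

The helper only groups the values; it says nothing about when a server would
accept or reject each claim type.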
> >> Why?
> >> Because when a client crashes/reboots and there is dirty data in the
> >> RAM cache it gets
> >> tossed, so tossing dirty data in the non-volatile cache when it cannot safely
> >> write it back to the server would be consistent with what happens now.
> >
> >
> > Yes, but the idea is to do better than that, when it can be written safely.
> >>
> >>
> >> For the case where the server is in grace, the client can use
> >> Open/Claim_previous
> >> and then, if the reply indicates an immediate recall of the
> >> delegation, it can write the dirty data
> >> back to the server before doing a Delegreturn.
> >
> >
> > True but that is an unusual case.
> >>
> >>
> >> For the case where the server is not in grace and the Open/Claim_delegate_prev
> >> fails, I think there is another way that the client can safely write
> >> the dirty data back.
> >> - If the client keeps track of the server's value of the change
> >> attribute for the file
> >>   in non-volatile storage.
> >>   - The client can try an
> >> Open/Claim_null/Open_share_access_write/Open_share_deny_write
> >>      followed by a Getattr of change in the same compound.
> >>      - if this succeeds and the change attribute is the same as the
> >> stored one, then I think the client can
> >
> >
> > If the Open/Claim_delegate_prev fails, it would be because the delegation was revoked, in which
> > case the change attribute will be different.
> Why would the change attribute be different? It changes when there is
> a data/metadata change
> done to the file. Typically, a delegation would be revoked (instead of
> recalled) if there is a conflicting Open
> request and the lease has expired. This does not imply a change in the
> change attribute, does it?
> Of course, if any data/metadata changes are applied to the file by
> other clients (such as the one
> that acquired the conflicting open), then change would change.
> Or, I do not understand what you mean by revoke?
>
> >  Also, in that case, checking the change attribute and proceeding
> > to write is not safe, since the attribute can change after the check.  You need continuous possession
> > of the delegation, which is why claim-delegate-prev was created.
> Wouldn't an OPEN4_SHARE_DENY_WRITE be sufficient to guarantee that
> other clients are not writing to the file during the write-back? Once
> change is checked and found to be the same after the Open, no writes
> by other clients should already have been done, and the deny_write
> should guarantee that no other clients write to the file until the
> open_stateid is closed (or until something ugly happens, like a
> network partitioning/lease expiration). For this case, all the client
> is trying to do is write the dirty data back to the server safely,
> since it was not able to re-acquire the delegation. It will no longer
> be able to use the cached data once the open_stateid is closed.
> I have not tried this yet, so there might be issues that I have not
> recognized w.r.t. doing this.
Although I think what I stated above is correct, I now realize that it is
probably not that useful.
Why?
Because all it takes is another client doing a Getattr of change that
triggers a CB_GETATTR to change the value of change on the server.
As such, the likelihood of the change attribute remaining unchanged is
fairly low.

I am not planning on implementing the above Open with WRITE_ACCESS
and WRITE_DENY for now, rick
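
To make the check discussed above concrete, here is a rough C sketch of the
decision only. It is not implemented anywhere and is not a proposal: struct
cached_file, struct open_getattr_reply, enum writeback_action and
decide_writeback() are hypothetical names, and the reply struct merely stands
in for the result of an Open/Claim_null with OPEN4_SHARE_ACCESS_WRITE and
OPEN4_SHARE_DENY_WRITE followed by a Getattr of change in the same compound.
"Set aside" covers either tossing the dirty data or flagging it for a
sysadmin/user.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-file record kept in the client's non-volatile cache. */
struct cached_file {
    uint64_t stored_change;   /* server's change attribute as last recorded */
};

/*
 * Hypothetical summary of the reply to one compound:
 * Open/Claim_null with OPEN4_SHARE_ACCESS_WRITE and
 * OPEN4_SHARE_DENY_WRITE, followed by a Getattr of change.
 */
struct open_getattr_reply {
    bool     open_ok;         /* did the Open succeed? */
    uint64_t change;          /* change attribute returned by the Getattr */
};

enum writeback_action {
    WB_SET_ASIDE,             /* no open_stateid: toss or flag the data */
    WB_CLOSE_AND_SET_ASIDE,   /* file changed: close, then toss or flag */
    WB_WRITE_THEN_CLOSE       /* unchanged: write back, then close */
};

/* Decide what to do with dirty data after Open/Claim_delegate_prev failed. */
static enum writeback_action
decide_writeback(const struct cached_file *cf,
                 const struct open_getattr_reply *reply)
{
    assert(cf != NULL && reply != NULL);

    if (!reply->open_ok)
        return WB_SET_ASIDE;
    if (reply->change != cf->stored_change)
        return WB_CLOSE_AND_SET_ASIDE;
    return WB_WRITE_THEN_CLOSE;
}
```

Given the CB_GETATTR point above, the WB_WRITE_THEN_CLOSE branch would
probably be reached only rarely in practice.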

>
> I agree that reclaiming the delegation is preferable, since caching of
> the file can continue and
> the write-back does not need to be done right away.
>
> >
> >>        safely write the dirty data back to the server using the open_stateid
> >>        - then close the open_stateid
> >>
> >> An option other than tossing the data would be to flag it and let a
> >> sysadmin/user decide if
> >> merging it back to the server's file makes sense. Ugly but doable when
> >> the dirty data is in
> >> client non-volatile storage.
> >
> >
> > He could decide but I'm not sure on what basis he could decide, especially since telling
> > him all the relevant information could be a security hole a megaparsec wide. :-(
> Well, I wore a sysadmin hat for 30+ years and one of my more enjoyable
> work items
> went something like...
> A prof. would email saying they had made a bunch of changes to a file
> and then lost them.
> Can you get the changes back?
> - The attempt usually involved finding the most recent backup of the
> file, along with stuff
>    like an editor tmp file left around to try and get back what I could.
> So, yes, for the user it could easily be a security issue (it happens
> that this caching mechanism
> uses a separate file for each file being cached, so access to the
> cache for the file could be
> the same as the file), but for a sysadmin it is just a bothersome manual action.
>
> rick
>
> >>
> >>
> >> rick
> >>
> >> >
> >> > --
> >> > Trond Myklebust
> >> > Linux NFS client maintainer, Hammerspace
> >> > trond.myklebust@hammerspace.com
> >> >
> >> >
> >>