Re: [nfsv4] RFC: Is Open/Claim_Delegate_Prev done before/after Reclaim_complete?

Rick Macklem <rick.macklem@gmail.com> Sun, 07 April 2024 15:33 UTC

To: David Noveck <davenoveck@gmail.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/wg6twDVTKlTqBRoUm30DJujdc3M>

On Sun, Apr 7, 2024 at 7:52 AM David Noveck <davenoveck@gmail.com> wrote:
>>
>>
>>
>> The more I think about it, the more I think you are correct, at least
>> by default.
I was referring to the case where the client cannot reacquire the delegation,
such as when the Open/Claim_delegate_prev gets a reply of NFS4ERR_RECLAIM_BAD.
(Btw, Sec. 15.1.9.4 states "before the server restart or file system
migration event".
If we are going to use NFS4ERR_RECLAIM_BAD instead of NFS4ERR_EXPIRED, these
words will need to be tweaked.)

I am fine with either error return, but the current wording in Sec. 15.1.9.4
defines it for specific cases that are not this one.

>
>
> Not fully correct.  Some of what I told you is wrong and will be corrected in -04.  The current target for submission is 4/23 but it is likely I can pull that forward by a week.  In any case, I hope the following fragment will be helpful:
>
> /*
>  * Includes multiple types of operations:
>  *
>  *    - Non-reclaim operations valid independent of grace period status.
>  *    - Reclaim operations only used during a grace period.
>  *    - Reclaim operations only used during a special delegation recovery
>  *      period.
>  */
>
> enum open_claim_type4 {
>
>         CLAIM_NULL              = 0,    /* Non-reclaim operation. */
>         CLAIM_PREVIOUS          = 1,    /* Reclaim operation -- grace
>                                            period only. */
>         CLAIM_DELEGATE_CUR      = 2,    /* Non-reclaim operation. */
>         CLAIM_DELEGATE_PREV     = 3,    /* Reclaim operation -- special
>                                            delegation recovery period
>                                            only. */
Sounds correct to me, assuming that "special delegation recovery period" is
described somewhere. Otherwise I'd just say something like
"delegation reclaim - not during grace period".

>         /*
>          * Beyond this point, all values are new to v4.1.
>          */
>
>         /*
>          * Like CLAIM_NULL, but object identified
>          * by the current filehandle.
>          */
>         CLAIM_FH                = 4,    /* Non-reclaim operation. */
>
>         /*
>          * Like CLAIM_DELEGATE_CUR, but object identified
>          * by current filehandle.
>          */
>         CLAIM_DELEG_CUR_FH      = 5,    /* Non-reclaim operation. */
>
>         /*
>          * Like CLAIM_DELEGATE_PREV, but object identified
>          * by current filehandle.
>          */
>         CLAIM_DELEG_PREV_FH     = 6     /* Reclaim operation -- special
>                                            delegation recovery period
>                                            only. */
> };
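
The classification in the fragment's comments can be sketched as a small C
predicate. This is only an illustration of the comments as quoted, not text
from any draft: `server_period` and `claim_permitted` are hypothetical names,
and treating the grace period and the "special delegation recovery period" as
mutually exclusive states is an assumption. It also says nothing about which
error a server returns when a claim type is used at the wrong time.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the quoted XDR fragment. */
enum open_claim_type4 {
    CLAIM_NULL          = 0,
    CLAIM_PREVIOUS      = 1,
    CLAIM_DELEGATE_CUR  = 2,
    CLAIM_DELEGATE_PREV = 3,
    CLAIM_FH            = 4,
    CLAIM_DELEG_CUR_FH  = 5,
    CLAIM_DELEG_PREV_FH = 6
};

/* Hypothetical: one name per situation listed in the fragment's header
 * comment. Modelling them as disjoint states is an assumption. */
enum server_period {
    PERIOD_NORMAL,          /* neither of the two periods below */
    PERIOD_GRACE,           /* grace period */
    PERIOD_DELEG_RECOVERY   /* "special delegation recovery period" */
};

/* Returns true if the fragment's comments allow this claim type in the
 * given period. */
bool claim_permitted(enum open_claim_type4 claim, enum server_period period)
{
    switch (claim) {
    case CLAIM_PREVIOUS:
        /* Reclaim operation -- grace period only. */
        return period == PERIOD_GRACE;
    case CLAIM_DELEGATE_PREV:
    case CLAIM_DELEG_PREV_FH:
        /* Reclaim operation -- special delegation recovery period only. */
        return period == PERIOD_DELEG_RECOVERY;
    case CLAIM_NULL:
    case CLAIM_DELEGATE_CUR:
    case CLAIM_FH:
    case CLAIM_DELEG_CUR_FH:
        /* Non-reclaim operations: described as valid independent of
         * grace period status. */
        return true;
    default:
        assert(0 && "claim type not in the quoted fragment");
        return false;
    }
}
```

For example, `claim_permitted(CLAIM_DELEGATE_PREV, PERIOD_GRACE)` is false
under this reading, which is the "delegation reclaim - not during grace
period" behaviour mentioned above.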
>
>> Why?
>> Because when a client crashes/reboots and there is dirty data in the
>> RAM cache it gets
>> tossed, so tossing dirty data in the non-volatile cache when it cannot safely
>> write it back to the server would be consistent with what happens now.
>
>
> Yes, but the idea is to do better than that, when it can be written safely.
>>
>>
>> For the case where the server is in grace, the client can use
>> Open/Claim_previous
>> and then, if the reply indicates an immediate recall of the
>> delegation, it can write the dirty data
>> back to the server before doing a Delegreturn.
>
>
> True but that is an unusual case.
>>
>>
>> For the case where the server is not in grace and the Open/Claim_delegate_prev
>> fails, I think there is another way that the client can safely write
>> the dirty data back.
>> - If the client keeps track of the server's value of the change attribute
>>   for the file in non-volatile storage.
>>   - The client can try an
>>     Open/Claim_null/Open_share_access_write/Open_share_deny_write
>>     followed by a Getattr of change in the same compound.
>>     - if this succeeds and the change attribute is the same as the
>>       stored one, then I think the client can
>
>
> If the Open/Claim_delegate_prev fails, it would be because the delegation was revoked, in which
> case the change attribute will be different.
Why would the change attribute be different? It changes when a data/metadata
change is made to the file. Typically, a delegation would be revoked (instead
of recalled) if there is a conflicting Open request and the lease has expired.
This does not imply a change in the change attribute, does it?
Of course, if any data/metadata changes are applied to the file by other
clients (such as the one that acquired the conflicting open), then change
would change.
Or do I not understand what you mean by revoke?

>  Also, in that case, checking the change attribute and proceeding
> to write is not safe, since the attribute can change after the check.  You need continuous possession
> of the delegation, which is why claim-delegate-prev was created.
Wouldn't an OPEN4_SHARE_DENY_WRITE be sufficient to guarantee that other
clients are not writing to the file during the write-back? Once change is
checked and found to be the same after the Open, there shouldn't have been any
writes already done by other clients, and the deny_write should guarantee that
no other clients write to the file until the open_stateid is closed (barring
something ugly like a network partitioning/lease expiration). For this case,
all the client is trying to do is write the dirty data back to the server
safely, since it was not able to re-acquire the delegation. It will no longer
be able to use the cached data once the open_stateid is closed.
I have not tried this yet, so there might be issues that I have not recognized
w.r.t. doing this.
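
A minimal sketch of the client-side check described above, in C. It only
models the decision, not the RPCs: `open_getattr_reply`, `writeback_decision`
and `decide_writeback` are hypothetical names, and the struct is an assumed
summary of the reply to a compound holding the Open (Claim_null, share access
write, share deny write) followed by the Getattr of change. It is untested
against any real client, as noted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the compound's reply. */
struct open_getattr_reply {
    bool     open_ok;   /* Open succeeded, so the deny-write share is held */
    uint64_t change;    /* change attribute returned by the Getattr */
};

enum writeback_decision {
    WRITEBACK_SAFE,               /* write dirty data with the open_stateid,
                                     then close the open_stateid */
    WRITEBACK_UNSAFE_OPEN_FAILED, /* no deny-write share was obtained */
    WRITEBACK_UNSAFE_FILE_CHANGED /* file changed since the value was stored;
                                     close the open without writing */
};

/* stored_change is the server's change value that the client saved in
 * non-volatile storage while it still held the delegation. */
enum writeback_decision
decide_writeback(uint64_t stored_change,
                 const struct open_getattr_reply *reply)
{
    assert(reply != NULL);

    if (!reply->open_ok)
        return WRITEBACK_UNSAFE_OPEN_FAILED;

    /* The comparison is only meaningful because the change value was read
     * after the deny-write Open in the same compound. */
    if (reply->change != stored_change)
        return WRITEBACK_UNSAFE_FILE_CHANGED;

    return WRITEBACK_SAFE;
}
```

In either of the unsafe cases the client is back to the other options in this
thread: toss the dirty data or flag it for manual handling.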

I agree that reclaiming the delegation is preferable, since caching of the
file can continue and the write-back does not need to be done right away.

>
>>        safely write the dirty data back to the server using the open_stateid
>>        - then close the open_stateid
>>
>> An option other than tossing the data would be to flag it and let a
>> sysadmin/user decide if
>> merging it back to the server's file makes sense. Ugly but doable when
>> the dirty data is in
>> client non-volatile storage.
>
>
> He could decide but I'm not sure on what basis he could decide, especially since telling
> him all the relevant information could be a security hole a megaparsec wide. :-(
Well, I wore a sysadmin hat for 30+ years and one of my more enjoyable work
items went something like...
A prof. would email saying they had made a bunch of changes to a file and then
lost them. Can you get the changes back?
- The attempt usually involved finding the most recent backup of the file,
  along with stuff like an editor tmp file left around, to try and get back
  what I could.
So, yes, for the user it could easily be a security issue (it happens that
this caching mechanism uses a separate file for each file being cached, so
access to the cache for the file could be the same as the file), but for a
sysadmin it is just a bothersome manual action.

rick

>>
>>
>> rick
>>
>> >
>> > --
>> > Trond Myklebust
>> > Linux NFS client maintainer, Hammerspace
>> > trond.myklebust@hammerspace.com
>> >
>> >
>>