[nfsv4] Re: RFC: new MUTEX_BEGIN/MUTEX_END operations
Pali Rohár <pali-ietf-nfsv4@ietf.pali.im> Fri, 09 August 2024 16:55 UTC
Date: Fri, 09 Aug 2024 18:55:28 +0200
From: Pali Rohár <pali-ietf-nfsv4@ietf.pali.im>
To: Rick Macklem <rick.macklem@gmail.com>
CC: NFSv4 <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/e6m3TIgqorxqSMyyQSZ3NUtwEUU>

On Friday 09 August 2024 08:12:39 Rick Macklem wrote:
> On Fri, Aug 9, 2024 at 2:00 AM Pali Rohár <pali-ietf-nfsv4@ietf.pali.im>
> wrote:
>
> > Hello, if I understand correctly then this functionality could be
> > already possible via NFS v4.1 by using persistent reply cache of the
> > session.
> >
> > RFC 8881 section 2.10.6.5. says:
> >
> > "A persistent reply cache places certain demands on the server. The
> > execution of the sequence of operations (starting with SEQUENCE) and
> > placement of its results in the persistent cache MUST be atomic."
> >
> > It should be enough for a NFS v4.1 client to create a new session with
> > flag CREATE_SESSION4_FLAG_PERSIST and place those "atomic" operations
> > into that session.
> >
> Hmm. I think that "atomic" w.r.t. persistent sessions refers to the
> changes done to session state and the file system are committed to
> storage at one time, using some sort of log that allows partially completed
> compounds to be "rolled back" after a reboot such that the session either
> represents the state before the compound began or after it has been
> completed.

Yes, it sounds like that. It requires something like what databases use
to keep transactions consistent while they execute.

> As an implementer I will note that many (maybe all extant) servers do not
> support persistent sessions. (I know that the FreeBSD server does not.)

I do not think I have seen an NFS v4.1 server that implements persistent
sessions properly, so it is a really rare thing.

> However, I do not think the above precludes another client from performing
> operations on the same file concurrently (interspersed) with the operations
> on the compound.

I think that this is questionable, as the description "The execution
of the sequence of operations (starting with SEQUENCE) ... MUST be
atomic." is not fully clear.
If we take atomicity in the strict sense, then in your example no other
operation that could change the outcome of VERIFY and WRITE can be
executed between them. Some future clarification of this functionality
would be nice.

> An example case that MUTEX_BEGIN/MUTEX_END is meant to address might be
> an attempt to implement append writing (Client A and B both have the CFH
> set to the same file):
> (in temporal ordering)
> Client A                               Client B
> - without MUTEX_BEGIN/MUTEX_END
> VERIFY (size == N) returns NFS_OK
>                                        VERIFY (size == N) returns NFS_OK
> WRITE (at offset N) returns NFS_OK
>                                        WRITE (at offset N) returns NFS_OK
> --> Client B overwrites Client A's write
> - with MUTEX_BEGIN/MUTEX_END
> MUTEX_BEGIN returns NFS_OK
>                                        MUTEX_BEGIN - blocks (or replies
>                                        NFS4ERR_LOCKED)
> VERIFY (size == N) returns NFS_OK
> WRITE (at offset N) returns NFS_OK
> MUTEX_END returns NFS_OK
>                                        MUTEX_BEGIN - returns NFS_OK
>                                        VERIFY (size == N) returns
>                                        NFS4ERR_NOTSAME
>                                        --> this causes an implicit MUTEX_END
>
> Now, it is true that, for writing, byte range locking could be used,
> but that adds overhead and requires that all writing do the locking
> (since most/all extant servers do advisory byte locking and the client
> has no way of knowing whether or not a server is doing advisory vs mandatory
> byte range locking).

For this purpose there are share reservations. You can take a DENY_WRITE
share reservation and be sure that nobody overwrites the file while you
hold it.

> rick
> >
> > On Thursday 08 August 2024 13:36:25 Rick Macklem wrote:
> > > Hi,
> > >
> > > Over the years, I've run into cases where it would be really
> > > nice to be able to perform multiple NFSv4 operations on a
> > > file without other operations done by other clients
> > > "gumming up the works" by changing the file's data/metadata
> > > between the operations in the compound.
> > >
> > > So, what do others think about an extension to NFSv4.2 that
> > > adds 2 new operations:
> > > MUTEX_BEGIN(CFH)
> > > MUTEX_END(CFH)
> > > Both would use the CFH as argument, and no other client would
> > > be allowed to perform operations on the CFH between the MUTEX_BEGIN
> > > and MUTEX_END.
> > >
> > > I think there would need to be a couple of properties for these:
> > > - There would need to be an "implicit" MUTEX_END when any operation
> > >   between MUTEX_BEGIN and MUTEX_END returns a status other than NFS_OK.
> > > - I think you would want a restriction of only one mutex for one CFH
> > >   at a time in a compound. Without that, there could easily be deadlocks
> > >   caused by other compounds acquiring mutexes on the same CFHs in a
> > >   different order.
> > > - Only one compound can hold a mutex on a given CFH at any time.
> > > - MUTEX_BEGIN/MUTEX_END can only be used in compounds where SEQUENCE
> > >   is the first operation.
> > > - All mutexes are discarded by a server when it crashes/recovers.
> > >   (Any time a client receives a NFS4ERR_STALE_CLIENTID.)
> > >   That way RPC retries after a server reboot should work ok, I think?
> > >
> > > I am not sure what the semantics for reading data/metadata should be,
> > > but I was thinking that would be allowed to be done by compounds for
> > > other clients for the CFH. If a client wanted to serialize against
> > > other compounds for the CFH, it could do a MUTEX_BEGIN/MUTEX_END.
> > >
> > > I see this as useful in a variety of ways:
> > > - The example in the previous email of:
> > >     MUTEX_BEGIN
> > >     NVERIFY acl_truform ACL_MODEL_NFS4
> > >     SETATTR posix_access_acl
> > >     MUTEX_END
> > > - Append writing:
> > >     MUTEX_BEGIN
> > >     VERIFY size "offset in WRITE that follows"
> > >     WRITE "offset" etc
> > >     MUTEX_END
> > > - A bunch of cases where NFSv4 lacks the postop_attributes
> > >   that were in NFSv3.
> > >     MUTEX_BEGIN
> > >     WRITE
> > >     GETATTR size, change,..
> > >     MUTEX_END
> > >
> > > So, what do others think?
> > > (This was obviously not possible without sessions.)
> > >
> > > rick
> >
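
The append-writing interleaving quoted above can be replayed with a small in-memory model. This is only a sketch under stated assumptions: MUTEX_BEGIN/MUTEX_END are a proposal, not existing NFSv4 operations, and `ToyServer` with all of its methods is a hypothetical stand-in written for illustration. It takes the "replies NFS4ERR_LOCKED" option rather than blocking, and it treats a failed VERIFY as an implicit MUTEX_END, as the proposal suggests.

```python
NFS_OK = "NFS_OK"
NFS4ERR_NOTSAME = "NFS4ERR_NOTSAME"
NFS4ERR_LOCKED = "NFS4ERR_LOCKED"


class ToyServer:
    """Hypothetical single-file server model; not a real NFS implementation."""

    def __init__(self, initial=b""):
        self.data = bytearray(initial)
        self.mutex_holder = None  # client holding the proposed mutex, if any

    def mutex_begin(self, client):
        # Non-blocking variant: refuse instead of waiting for the holder.
        if self.mutex_holder not in (None, client):
            return NFS4ERR_LOCKED
        self.mutex_holder = client
        return NFS_OK

    def mutex_end(self, client):
        if self.mutex_holder == client:
            self.mutex_holder = None
        return NFS_OK

    def verify_size(self, client, expected):
        if len(self.data) == expected:
            return NFS_OK
        # A failing operation implicitly releases the caller's mutex.
        self.mutex_end(client)
        return NFS4ERR_NOTSAME

    def write(self, client, offset, payload):
        # Writes from other clients are refused while the mutex is held.
        if self.mutex_holder not in (None, client):
            return NFS4ERR_LOCKED
        end = offset + len(payload)
        if end > len(self.data):
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[offset:end] = payload
        return NFS_OK


def race_without_mutex():
    """Both clients VERIFY size == N, then both WRITE at offset N."""
    srv = ToyServer(b"base")
    n = len(srv.data)
    srv.verify_size("A", n)
    srv.verify_size("B", n)
    srv.write("A", n, b"AAAA")
    srv.write("B", n, b"BBBB")  # lands on top of A's data
    return bytes(srv.data)


def race_with_mutex():
    """Same interleaving, with each client bracketing its operations."""
    srv = ToyServer(b"base")
    n = len(srv.data)
    log = [("A", srv.mutex_begin("A")),
           ("B", srv.mutex_begin("B"))]         # refused while A holds it
    srv.verify_size("A", n)
    srv.write("A", n, b"AAAA")
    srv.mutex_end("A")
    log.append(("B", srv.mutex_begin("B")))     # succeeds now
    log.append(("B", srv.verify_size("B", n)))  # size is no longer N
    return bytes(srv.data), log, srv.mutex_holder


if __name__ == "__main__":
    print(race_without_mutex())
    print(race_with_mutex())
```

In the first run Client B's data replaces Client A's at offset N. In the second, Client B's VERIFY fails with NFS4ERR_NOTSAME after it finally obtains the mutex, so it would have to re-read the size and retry its append.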