[nfsv4] Re: RFC: new MUTEX_BEGIN/MUTEX_END operations

Pali Rohár <pali-ietf-nfsv4@ietf.pali.im> Tue, 20 August 2024 21:04 UTC

Date: Tue, 20 Aug 2024 23:03:52 +0200
From: Pali Rohár <pali-ietf-nfsv4@ietf.pali.im>
To: Rick Macklem <rick.macklem@gmail.com>
CC: NFSv4 <nfsv4@ietf.org>
Subject: [nfsv4] Re: RFC: new MUTEX_BEGIN/MUTEX_END operations
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/DgmAo_bU9xjYbbu_hLXZ9Gk3V6I>

On Thursday 08 August 2024 13:36:25 Rick Macklem wrote:
> Hi,
> 
> Over the years, I've run into cases where it would be really
> nice to be able to perform multiple NFSv4 operations on a
> file without other operations done by other clients
> "gumming up the works" by changing the file's data/metadata
> between the operations in the compound.
> 
> So, what do others think about an extension to NFSv4.2 that
> adds 2 new operations:
>   MUTEX_BEGIN(CFH)
>   MUTEX_END(CFH)
> Both would use the CFH as argument, and no other client would
> be allowed to perform operations on the CFH between the MUTEX_BEGIN
> and MUTEX_END.
> 
> I think there would need to be a couple of properties for these:
> - There would need to be an "implicit" MUTEX_END when any operation
>   between MUTEX_BEGIN and MUTEX_END returns a status other than NFS_OK.
> - I think you would want a restriction of only one mutex for one CFH
>   at a time in a compound. Without that, there could easily be deadlocks
>   caused by other compounds acquiring mutexes on the same CFHs in a
>   different order.
> - Only one compound can hold a mutex on a given CFH at any time.
> - MUTEX_BEGIN/MUTEX_END can only be used in compounds where SEQUENCE
>   is the first operation.
> - All mutexes are discarded by a server when it crashes/recovers.
>   (Any time a client receives a NFS4ERR_STALE_CLIENTID.)
>   That way RPC retries after a server reboot should work ok, I think?
> 
> I am not sure what the semantics for reading data/metadata should be,
> but I was thinking that would be allowed to be done by compounds for
> other clients for the CFH. If a client wanted to serialize against
> other compounds for the CFH, it could do a MUTEX_BEGIN/MUTEX_END.
> 
> I see this as useful in a variety of ways:
> - The example in the previous email of:
>   MUTEX_BEGIN
>   NVERIFY acl_trueform ACL_MODEL_NFS4
>   SETATTR posix_access_acl
>   MUTEX_END
> - Append writing:
>   MUTEX_BEGIN
>   VERIFY size "offset in WRITE that follows"
>   WRITE "offset" etc
>   MUTEX_END
> - A bunch of cases where NFSv4 lacks the postop_attributes
>   that were in NFSv3.
>   MUTEX_BEGIN
>   WRITE
>   GETATTR size, change,..
>   MUTEX_END
> 
> So, what do others think?
> (This was obviously not possible without sessions.)
> 
> rick

Hello,

Having thought more about this, these mutexes look like a performance
killer for any NFS4 server that handles thousands of parallel
operations. They could also open a vector for denial-of-service attacks
against servers that are not implemented carefully: a malicious client
could take mutexes and block all other operations on those files.

Wouldn't it be better to address the existing problems with separate
mechanisms? For example, the append issue could be solved by a new
append operation, which an NFS4 server could implement more efficiently
(e.g. if its storage or API already provides an append operation, as
all POSIX systems do via open() with O_APPEND).
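
To illustrate what a server could rely on locally, here is a minimal
sketch in Python (whose os module maps directly onto the POSIX calls).
It assumes a local POSIX filesystem underneath; the function name is
made up for this example and is not part of any proposal. Two
independent descriptors opened with O_APPEND never overwrite each
other, because the kernel picks the end-of-file offset for every write:

```python
import os
import tempfile


def demo_interleaved_appends() -> bytes:
    """Append through two independent descriptors and return the file content."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log")
        # Two separate opens, standing in for two clients of the server.
        fd_a = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        fd_b = os.open(path, os.O_WRONLY | os.O_APPEND)
        try:
            # With O_APPEND each write() goes to the current end of file,
            # so no "VERIFY size" + "WRITE offset" pair is needed.
            os.write(fd_a, b"first\n")
            os.write(fd_b, b"second\n")  # lands after "first\n", not at offset 0
            os.write(fd_a, b"third\n")   # lands after "second\n"
        finally:
            os.close(fd_a)
            os.close(fd_b)
        with open(path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo_interleaved_appends())
```

This only shows the local behaviour that a dedicated append operation
could be mapped onto; it says nothing about how such an operation would
look on the wire.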

Also, this mutex mechanism does not give atomicity in a multiprotocol
environment, where other protocols that lack this kind of mutex coexist
on the same storage (e.g. interop with SMB or NFS3). If NFS4 and Samba
are implemented in different userspace processes, the NFS4 mutex will
hold back other NFS4 compound operations but not Samba operations. So
atomicity would not be guaranteed at all.

NFS4 servers are not necessarily implemented in the kernel, where they
could lock the inode to keep other parts of the system from operating
on it (which is what a mutex on a filehandle amounts to). From POSIX
userspace, such inode locking is practically impossible to implement.
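
A small sketch of the problem, again in Python on top of the POSIX
calls. It assumes the userspace server would use an advisory flock() as
its mutex; that choice and the function name are assumptions made for
this example only. A second party that never asks for the lock, as
another service on the same storage would not, modifies the file while
the lock is held:

```python
import fcntl
import os
import tempfile


def demo_advisory_lock_is_bypassed() -> bytes:
    """Hold an exclusive flock() and show that an unaware writer still gets in."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file")
        with open(path, "wb") as f:
            f.write(b"original")

        # "NFS4 server": takes an exclusive advisory lock as its mutex.
        server_fd = os.open(path, os.O_RDWR)
        fcntl.flock(server_fd, fcntl.LOCK_EX)
        try:
            # "Other service": never calls flock() and writes anyway.
            other_fd = os.open(path, os.O_WRONLY)
            try:
                os.pwrite(other_fd, b"CLOBBER!", 0)
            finally:
                os.close(other_fd)
            # Still holding the "mutex", the server now reads changed data.
            return os.pread(server_fd, 8, 0)
        finally:
            fcntl.flock(server_fd, fcntl.LOCK_UN)
            os.close(server_fd)


if __name__ == "__main__":
    print(demo_advisory_lock_is_bypassed())
```

The lock only serializes parties that agree to take it, which is
exactly the situation between separate NFS4 and Samba processes.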

I just cannot imagine how to implement an NFS4 server that supports
this mutex on top of system filesystem storage using only the POSIX API.

Pali