[nfsv4] Notes regarding discussion of directory scalability issues
David Noveck <davenoveck@gmail.com> Fri, 26 June 2020 17:54 UTC
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/ak4ADoj_-xRa5TL-fnSAIHrYQZU>
*Introduction*

On 6/22, Chuck and I held a discussion to try to resolve some issues that arose in our discussion of scalability issues for directory operations. After our original presentations on this topic at the post-IETF107 virtual interim, it was anticipated that we would discuss those on the wg mailing list. Since that didn't work out, Chuck and I decided to clarify and possibly resolve our differences of approach in a short phone call, a format that worked well.

We were able to clarify but not resolve two issues. We hope to be able to resolve these eventually, but not necessarily during the July 9th meeting.

- We did agree that we will need to better understand, and try to resolve, issues regarding the compatibility of directory notifications with client handling of directory cookies and their caching. See *Directory Delegation Issues* for details.
- We also explored issues raised by the possible addition of ops to aid recursive directory traversals, building on Chuck's suggestion, at the earlier meeting, of possible protocol aid for "rm -r". See *Ops to Add Aiding Recursive Directory Traversal* for details.

*Directory Delegation Issues*

Implementations of directory delegations are quite limited, which has made it not worth investing in server-side implementations. A useful client-side implementation would require the ability to efficiently cache large, slowly changing directories. Unfortunately, that is not currently possible, at least for the Linux client, since the creation or removal of a file in a directory requires that the entire directory (which can be very large) be refetched. This now occurs for directory changes that the client makes itself, but it would presumably also apply in the case of directory notifications, making them not very useful in the important case of large, slowly changing directories.

As Chuck related to me, this requirement for directory refetching is predicated on the possibility that any change to a directory (e.g. remove, create, rename) could potentially invalidate directory entry cookies for all the cached directory entries. While this is unlikely to happen in practice, I can see that clients might be unwilling to rely on file systems not changing these. The thing I don't understand, and Chuck was unable to clarify for me, is why such cookies need to be cached at all, and essentially treated as if they were attributes.

While the directory notification feature has the ability to propagate attribute changes to clients with delegations, there is no such ability in the case of cookie changes. It appears that the feature was designed assuming these would not be necessary. As a result, it appeared that one of the following would have to be done:

1. Clients might be modified so as not to depend on a supposed fixity of such cookies. That's clearly my favorite, as I believe the client returning a cached directory could synthesize its own cookies to allow users to fetch directory information across multiple requests. However, I'm not sure client implementers would agree, and this is a matter on which consensus is important.
2. Provide a way in which the server could communicate that the theoretical possibility that a remove, create, or rename could cause a revision of cookies for uninvolved directory entries does not occur for a given fs. This could be done by adding a new fs-scope attribute providing information about an fs's directory entry cookie management. The downside is that this would be v4.2-only and would require an additional RFC to make a v4.1 feature effectively usable 😞
3. Provide another way to exclude the possibility of the client being blind-sided by a server-side fs prone to directory cookie reassignment. For example, any change to the cookie of a directory entry that is not being removed or renamed could require delegation recall. Unfortunately, this is a big change to an existing feature 😖

As Chuck and I finished the discussion, we anticipated a long-term process aimed at securing a consensus about which of these choices the working group would adopt to make directory delegations useful. This could start at the 7/9 meeting, but the need to make sure we had the active participation of client implementers meant it wasn't a sure thing for initial discussion at the 7/9 meeting.

Lately, I've been looking at directory notifications in more detail and have come to the conclusion that the way the notifications use cookies to update cached directories is really predicated on the expectation that cookies for directory entries not involved in the specific update will not change. For example, insert and rename notifications include the cookie of the entry before the insertion point, which is not really useful if the server-side fs is free to change cookies for entries not involved in the directory operation.

I now believe that it is possible to rework the description of this feature so that the server, by supporting directory notifications, is effectively providing an assurance that the wholesale revision of cookies as a result of directory modification operations, about which clients are concerned, cannot happen. This could not be done as a consequence of an errata report, but it is the sort of clarification/revision that I feel could be done as a part of rfc5661bis, assuming that we can reach a working group consensus on the matter. We intend to do changes of this scale for some REJECTED errata reports.

I will take about 5-10 minutes for a presentation of this issue at the 7/9 meeting, hoping to stimulate later discussion of this issue and future directory delegation/notification implementation possibilities on the mailing list.

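The dependence of insert notifications on stable cookies can be illustrated with a minimal Python sketch. This is a simplified model for discussion only, not the notification XDR from rfc5661 and not taken from any client implementation; the names Entry, CachedDirectory, and apply_add are hypothetical. It assumes a notification carries the new entry plus the cookie of the entry that precedes the insertion point.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    cookie: int  # server-assigned directory entry cookie (hypothetical model)
    name: str


class CachedDirectory:
    """Client-side cache of a directory, kept in the order the server returned it."""

    def __init__(self, entries: List[Entry]) -> None:
        self.entries = list(entries)

    def _index_of_cookie(self, cookie: int) -> Optional[int]:
        for i, entry in enumerate(self.entries):
            if entry.cookie == cookie:
                return i
        return None

    def apply_add(self, new: Entry, prev_cookie: Optional[int]) -> bool:
        """Apply an 'entry added' notification.

        prev_cookie is the cookie of the entry before the insertion point
        (None means insert at the start). Returns False when the cached
        cookies no longer match, i.e. the whole directory must be refetched.
        """
        if prev_cookie is None:
            self.entries.insert(0, new)
            return True
        idx = self._index_of_cookie(prev_cookie)
        if idx is None:
            return False  # the cookie we were told about is unknown to the cache
        self.entries.insert(idx + 1, new)
        return True

    def names(self) -> List[str]:
        return [entry.name for entry in self.entries]


# Case 1: cookies of uninvolved entries are stable, so the update is cheap.
cache = CachedDirectory([Entry(10, "a"), Entry(20, "b"), Entry(30, "d")])
ok = cache.apply_add(Entry(25, "c"), prev_cookie=20)

# Case 2: the server renumbered its cookies as part of the create, so the
# notification refers to "b" by a cookie (2) the client has never seen.
stale = CachedDirectory([Entry(10, "a"), Entry(20, "b"), Entry(30, "d")])
ok_after_renumber = stale.apply_add(Entry(3, "c"), prev_cookie=2)
```

In the first case the cache is updated in place; in the second the client has no choice but to refetch. A renumbered cookie could also happen to match a different cached entry, in which case the entry would be placed wrongly without any error, which is why the notification design only makes sense if uninvolved cookies do not change.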
*Ops to Add Aiding Recursive Directory Traversal*

In Chuck's presentation he alluded to the possibility of the protocol giving "rm -r" more help, and there are a number of useful extensions that could be defined. As I considered other applications that involve recursive directory traversal, there appeared to be a number of possible READDIR extensions that could be sensibly proposed and would be useful in software build workloads.

The problem with such extensions is that they are unlikely to be used unless complementary work is done to provide useful APIs giving access to any helpful protocol extensions. Although such work is out-of-scope for the working group, the working group has to avoid investing in protocol extensions which, realistically, will never be used. As a result, I will not present regarding possible extensions at the 7/9 meeting but may do this later if there is interest in compatible APIs for important clients.

*Miscellaneous Items.*

We also had occasion to discuss some other issues regarding the agenda of the forthcoming meeting:

- Chuck reminded me of my previously mentioned intention to use github for the handling of the writing and review of wg documents associated with rfc5661bis (i.e. draft-ietf-nfsv4-internationalization, draft-ietf-nfsv4-security-needs, draft-ietf-nfsv4-security, draft-ietf-nfsv4-rfc5661bis, and possibly others). Chuck wasn't clear what exactly I might need from him to help the process and, it turned out, I wasn't sure either. We agreed that I would mention the issues in my general slides on the rfc5661bis process and that I'd give Chuck an early opportunity to respond to those.
- Chuck pointed out the issue of the length of rfc5661 and the need to address concerns about that. I was concerned that previous suggestions in this regard (with regard to pNFS file) might result in multiple documents that don't fit together all that well. I agreed to look at other approaches to the issue and present those to the working group at the 7/9 meeting.