[nfsv4] review of draft-ietf-nfsv4-internationalization-10.txt
Christoph Hellwig <hch@lst.de> Wed, 24 July 2024 17:23 UTC
Return-Path: <hch@lst.de>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 60B1CC157931; Wed, 24 Jul 2024 10:23:19 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.909
X-Spam-Level:
X-Spam-Status: No, score=-1.909 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id P7rdLvgbGx6r; Wed, 24 Jul 2024 10:23:18 -0700 (PDT)
Received: from verein.lst.de (verein.lst.de [213.95.11.211]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 75E48C1CAE8A; Wed, 24 Jul 2024 10:23:16 -0700 (PDT)
Received: by verein.lst.de (Postfix, from userid 2407) id C61F468AFE; Wed, 24 Jul 2024 19:23:13 +0200 (CEST)
Date: Wed, 24 Jul 2024 19:23:13 +0200
From: Christoph Hellwig <hch@lst.de>
To: David Noveck <davenoveck@gmail.com>
Message-ID: <20240724172313.GA16524@lst.de>
References: <172001876016.915067.10248092184385058958@dt-datatracker-5f88556585-g8gwj> <CADaq8jczyQ4sBb3V+eSd1ijo4JHA0cz8SoWVWTU27YrvOB5WkA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <CADaq8jczyQ4sBb3V+eSd1ijo4JHA0cz8SoWVWTU27YrvOB5WkA@mail.gmail.com>
User-Agent: Mutt/1.5.17 (2007-11-01)
Message-ID-Hash: NGTWNBCXWVJAL6N6LMJG66UPPMH3VS56
X-Message-ID-Hash: NGTWNBCXWVJAL6N6LMJG66UPPMH3VS56
X-MailFrom: hch@lst.de
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; header-match-nfsv4.ietf.org-0; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: NFSv4 <nfsv4@ietf.org>, nfsv4-chairs <nfsv4-chairs@ietf.org>, arnt@gulbrandsen.priv.no
X-Mailman-Version: 3.3.9rc4
Precedence: list
Subject: [nfsv4] review of draft-ietf-nfsv4-internationalization-10.txt
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/NHf0OzZwP4nFNEiWrU-iR9UX7wQ>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nfsv4>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Owner: <mailto:nfsv4-owner@ietf.org>
List-Post: <mailto:nfsv4@ietf.org>
List-Subscribe: <mailto:nfsv4-join@ietf.org>
List-Unsubscribe: <mailto:nfsv4-leave@ietf.org>
Hi Dave, this is a review of draft-ietf-nfsv4-internationalization-10.txt. First, please run the document through a spell checker, there are a lot of typos and annoying inconsistencies like using implementers and implementors right next to each other. Second, please refer to my recent mail to the WG list about covering 4.0 or not. I think that needs to be a WG conclusion an is not document specific, but highly relevant here. The document refers to the term 'physical file systems' without explaining it first. A more common term would be 'local file system' but even that is not entirely correct as NFS servers can also export remote or distributed file systems, so 'underlying file system' might be a better choice. Either way the term should be explained in the General Definitions section. Section 2.2. "Requirements Language Derivation" is very confusing. The BCP 14 terms designate requirements of the specification, and the reason why the specification has these requirements does not matter. As far as I can tell simply removing this section would improve the document significantly. The forward reference from Section 6 to Section 7 harms the readability of the document. Instead of referring to different modes of operation that are defined later, please define the high level operation modes earlier my moving section 7 before section 6. In general I also don't find references to "NFSv3" or "older" string handling very useful. To an unencumbered implementer, spelling out the actual modes with descriptive names will be a lot more useful and descriptive and useful. That does not preclude mentioning what NFSv3 did, but it should be part of the primary definition. It would also really help to define the two operation modes and the terms used by them in a General Definitions section. The name caching described in Section 6.3 can work in a slightly modified form when file name normalization is used. Linux implements it for local file systems using Unicode based normalization and case insensitivity. In the NFS context that would require that the client knows the exact normalization behavior and Unicode version used on the server, and thus is relatively impractical. Given how little NFS is used with case insensitive file systems is also doesn't really matter. Section 8 and especially Section 8.1 are very meandering and the wording feels like an essay and not suitable for a normative document. I would suggest to replace Section 8 with a short introduction of what changes compared to the definition in RFC8881 and drop section 8.1 entirely. The details history of the text is of no importance to implementers. If you feel it should be properly archive somewhere an informational document on the history of internationalization in NFS might be somewhat interesting. Section 8.2 should be worded as actual normative language. My quick and dirty proposal is below: -------------------------------- snip -------------------------------- This section replaces Section 14.4 of [RFC8881], taking into account the behavior of existing implementation of [RFC5661] [RFC8881] while providing best effort compatibility with the definition in [RFC5661] and [RFC8881]. const FSCHARSET_CAP4_CONTAINS_NON_UTF8 = 0x1; const FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 = 0x2; typedef uint32_t fs_charset_cap4; This attribute provides a simple way of determining whether a particular file system behaves as a UTF-8-only server and rejects file names which are not valid UTF8-encoded strings. When this attribute is supported and the value returned has the FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 flag set, the error NFS4ERR_INVAL MUST be returned if any file name argument contains a string which is not a valid UTF8-encoded string. When this attribute is supported and the value returned has the FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 flag clear, the error NFS4ERR_INVAL will not be returned based on adherence to the rules of UTF-8. The FSCHARSET_CAP4_CONTAINS_NON_UTF8 flags exist for historical reasons only and has no clear behavior associated with it. Servers SHOULD set the FSCHARSET_CAP4_CONTAINS_NON_UTF8 when they set the FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 flag, and clear the FSCHARSET_CAP4_CONTAINS_NON_UTF8 when they clear FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 to provide the best compatibility to historic clients. Clients SHOULD ignore the FSCHARSET_CAP4_CONTAINS_NON_UTF8 flag. When the fs_charset_cap attribute is not supported, the client can perform a LOOKUP using a name not conforming to the rules of UTF-8 and use the error returned to determine whether non-UTF-8 names are accepted. -------------------------------- snip -------------------------------- Section 12 is closely related to Sections 6 and 7 and it would be very helpful if it was located close to them in the document. The section could use a bit of rewording, and the use of BCP 14 terms to describe actual server behavior and not just protocol requirements seems wrong. And possible replacement for the start of this section is provided below: -------------------------------- snip -------------------------------- Servers MAY accept component names that are not valid UTF-8 strings on all or on some subset of file systems exported. A typical pattern is for a server to use UTF‑8-unaware underlying file systems that treat component names as uninterpreted strings of bytes, rather than having any awareness of the character set being used. Such servers do not change the stored representation of component names from those received on the wire and use an octet-by-octet comparison of component name strings to determine equivalence (as opposed to any broader notion of string comparison). This is because the server has no knowledge of the specific character encoding being used. -------------------------------- snip -------------------------------- Section 13 feels like editorializing. I don't think this belongs into a normative section. If you feel attached to it maybe move it to an appendix or the above mentioned informational document? In general the document could use a bit less of term "As stated" {,above,previously}". This review didn't make it to the appendices yet, but I plan to look over them in the future.
- [nfsv4] Fwd: New Version Notification for draft-i… David Noveck
- [nfsv4] Re: review of draft-ietf-nfsv4-internatio… David Noveck
- [nfsv4] review of draft-ietf-nfsv4-internationali… Christoph Hellwig