[nfsv4] Re: review of draft-ietf-nfsv4-internationalization-10.txt

David Noveck <davenoveck@gmail.com> Wed, 24 July 2024 17:37 UTC

From: David Noveck <davenoveck@gmail.com>
Date: Wed, 24 Jul 2024 13:37:30 -0400
To: Christoph Hellwig <hch@lst.de>
CC: NFSv4 <nfsv4@ietf.org>, nfsv4-chairs <nfsv4-chairs@ietf.org>, arnt@gulbrandsen.priv.no
Subject: [nfsv4] Re: review of draft-ietf-nfsv4-internationalization-10.txt
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/KTDBHZBOLHsnjJxyOy1GeiM_xow>

Thanks for the review.   Am looking to provide a new draft in early August.

On Wed, Jul 24, 2024 at 1:23 PM Christoph Hellwig <hch@lst.de> wrote:

> Hi Dave,
>
> this is a review of draft-ietf-nfsv4-internationalization-10.txt.
>
> First, please run the document through a spell checker; there are a lot
> of typos and annoying inconsistencies, like using implementers and
> implementors right next to each other.
>
> Second, please refer to my recent mail to the WG list about covering 4.0
> or not.  I think that needs to be a WG conclusion and is not document
> specific, but highly relevant here.
>
> The document refers to the term 'physical file systems' without
> explaining it first.  A more common term would be 'local file system'
> but even that is not entirely correct as NFS servers can also
> export remote or distributed file systems, so 'underlying file system'
> might be a better choice.  Either way the term should be explained in
> the General Definitions section.
>
> Section 2.2. "Requirements Language Derivation" is very confusing.
> The BCP 14 terms designate requirements of the specification, and
> the reason why the specification has these requirements does not
> matter.  As far as I can tell simply removing this section would
> improve the document significantly.
>
> The forward reference from Section 6 to Section 7 harms the readability
> of the document.  Instead of referring to different modes of operation
> that are defined later, please define the high level operation modes
> earlier by moving section 7 before section 6. In general I also don't
> find references to "NFSv3" or "older" string handling very useful.
> To an unencumbered implementer, spelling out the actual modes with
> descriptive names will be a lot more useful.
> That does not preclude mentioning what NFSv3 did, but it should not be part
> of the primary definition.  It would also really help to define the
> two operation modes and the terms used by them in a General Definitions
> section.
>
> The name caching described in Section 6.3 can work in a slightly modified
> form when file name normalization is used.  Linux implements it for
> local file systems using Unicode based normalization and case
> insensitivity.  In the NFS context that would require that the
> client knows the exact normalization behavior and Unicode version used
> on the server, and thus is relatively impractical.  Given how little
> NFS is used with case-insensitive file systems, it also doesn't
> really matter.
>
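A rough sketch of the name-caching point above, in Python for brevity. The class NameCache and the helpers exact_key and folded_key are hypothetical names, and the NFD-plus-case-folding rule is only an assumed server behavior, not something taken from the draft. The sketch shows that a client cache keyed on folded names gives correct hits only if the client folds names exactly as the server does, which also depends on the Unicode version on each side.

```python
import unicodedata
from typing import Callable, Dict, Optional


def exact_key(name: str) -> bytes:
    # Safe default: cache entries match only on identical UTF-8 bytes.
    return name.encode("utf-8")


def folded_key(name: str) -> bytes:
    # ASSUMPTION: the server compares names as NFD(casefold(NFD(name))).
    # A real server may use a different form or Unicode version.
    nfd = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFD", nfd.casefold()).encode("utf-8")


class NameCache:
    """Hypothetical client-side name cache mapping names to file ids."""

    def __init__(self, key_fn: Callable[[str], bytes]) -> None:
        self._key_fn = key_fn
        self._entries: Dict[bytes, int] = {}

    def insert(self, name: str, fileid: int) -> None:
        self._entries[self._key_fn(name)] = fileid

    def lookup(self, name: str) -> Optional[int]:
        return self._entries.get(self._key_fn(name))
```

With exact_key, a lookup of "README.TXT" misses after "readme.txt" was cached and has to go to the server. With folded_key it hits, which is correct only when the server really applies the same folding.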
> Section 8 and especially Section 8.1 are very meandering and the
> wording feels like an essay and not suitable for a normative document.
>
> I would suggest replacing Section 8 with a short introduction of
> what changes compared to the definition in RFC8881 and dropping section
> 8.1 entirely.  The detailed history of the text is of no importance to
> implementers.  If you feel it should be properly archived somewhere,
> an informational document on the history of internationalization in
> NFS might be somewhat interesting.
>
> Section 8.2 should be worded as actual normative language.  My
> quick and dirty proposal is below:
>
> -------------------------------- snip --------------------------------
> This section replaces Section 14.4 of [RFC8881], taking into account the
> behavior of existing implementations of [RFC5661] and [RFC8881] while
> providing best effort compatibility with the definition in [RFC5661] and
> [RFC8881].
>
>
>   const FSCHARSET_CAP4_CONTAINS_NON_UTF8  = 0x1;
>   const FSCHARSET_CAP4_ALLOWS_ONLY_UTF8   = 0x2;
>
>   typedef uint32_t        fs_charset_cap4;
>
> This attribute provides a simple way of determining whether a particular
> file system behaves as a UTF-8-only server and rejects file names which
> are not valid UTF-8-encoded strings. When this attribute is supported and
> the value returned has the FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 flag set, the
> error NFS4ERR_INVAL MUST be returned if any file name argument contains a
> string which is not a valid UTF-8-encoded string.
>
> When this attribute is supported and the value returned has the
> FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 flag clear, the error NFS4ERR_INVAL will
> not be returned based on adherence to the rules of UTF-8.
>
> The FSCHARSET_CAP4_CONTAINS_NON_UTF8 flag exists for historical reasons
> only and has no clear behavior associated with it.  Servers SHOULD set
> the FSCHARSET_CAP4_CONTAINS_NON_UTF8 flag when they set the
> FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 flag, and clear the
> FSCHARSET_CAP4_CONTAINS_NON_UTF8 flag when they clear the
> FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 flag, to provide the best compatibility
> with historic clients.
>
> Clients SHOULD ignore the FSCHARSET_CAP4_CONTAINS_NON_UTF8 flag.
>
> When the fs_charset_cap attribute is not supported, the client can
> perform a LOOKUP using a name not conforming to the rules of UTF-8
> and use the error returned to determine whether non-UTF-8 names are
> accepted.
> -------------------------------- snip --------------------------------
>
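A minimal sketch of the name check implied by the proposed text, in Python for brevity. The flag values are copied from the proposal and the error numbers are the ones RFC 8881 assigns; the function names check_component_name and is_valid_utf8 are hypothetical and not part of any implementation.

```python
# Flag values as given in the proposed text.
FSCHARSET_CAP4_CONTAINS_NON_UTF8 = 0x1
FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 = 0x2

# Status codes as numbered in RFC 8881.
NFS4_OK = 0
NFS4ERR_INVAL = 22


def is_valid_utf8(name: bytes) -> bool:
    # Python's strict decoder rejects malformed sequences, overlong
    # encodings and encoded surrogates.
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def check_component_name(name: bytes, fs_charset_cap: int) -> int:
    """Hypothetical server-side check for one file name argument."""
    utf8_only = bool(fs_charset_cap & FSCHARSET_CAP4_ALLOWS_ONLY_UTF8)
    if utf8_only and not is_valid_utf8(name):
        return NFS4ERR_INVAL
    # Flag clear: never fail based on adherence to the rules of UTF-8.
    # FSCHARSET_CAP4_CONTAINS_NON_UTF8 is deliberately not consulted.
    return NFS4_OK
```

For example, the Latin-1 bytes b"caf\xe9" are rejected when the flag is set and accepted when it is clear, while the UTF-8 bytes b"caf\xc3\xa9" are accepted in both cases.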
> Section 12 is closely related to Sections 6 and 7 and it would
> be very helpful if it was located close to them in the document.
>
> The section could use a bit of rewording, and the use of BCP 14
> terms to describe actual server behavior and not just protocol
> requirements seems wrong.  A possible replacement for the start
> of this section is provided below:
>
> -------------------------------- snip --------------------------------
> Servers MAY accept component names that are not valid UTF-8 strings on
> all or on some subset of file systems exported.
>
> A typical pattern is for a server to use UTF-8-unaware underlying file
> systems that treat component names as uninterpreted strings of bytes,
> rather than having any awareness of the character set being used.
> Such servers do not change the stored representation of component names
> from those received on the wire and use an octet-by-octet comparison of
> component name strings to determine equivalence (as opposed to any
> broader notion of string comparison).
>
> This is because the server has no knowledge of the specific character
> encoding being used.
> -------------------------------- snip --------------------------------
>
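To illustrate the octet-by-octet comparison described above, here is a small Python sketch. The helper names_equal_bytewise and the dictionary used as a stand-in directory are hypothetical. It shows that two canonically equivalent spellings of the same visible name are distinct names to a UTF-8-unaware server, and that a non-UTF-8 name is stored as received.

```python
import unicodedata


def names_equal_bytewise(a: bytes, b: bytes) -> bool:
    # The only equivalence a UTF-8-unaware file system can offer.
    return a == b


# The same visible name in two Unicode normalization forms.
nfc_name = unicodedata.normalize("NFC", "caf\u00e9").encode("utf-8")
nfd_name = unicodedata.normalize("NFD", "caf\u00e9").encode("utf-8")

# A name in a legacy encoding (Latin-1); not valid UTF-8.
latin1_name = "caf\u00e9".encode("latin-1")

# Stand-in for a directory: names are uninterpreted byte strings.
directory = {}
for fileid, name in enumerate([nfc_name, nfd_name, latin1_name], start=1):
    directory[name] = fileid
```

All three names end up as separate directory entries, since no pair of them is byte-identical.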
> Section 13 feels like editorializing.  I don't think this belongs in
> a normative section.  If you feel attached to it, maybe move it to
> an appendix or the above-mentioned informational document?
>
> In general the document could use a bit less of the term "As stated
> {,above,previously}".
>
> This review didn't make it to the appendices yet, but I plan to
> look over them in the future.
>