Re: [nfsv4] Making I18N exciting by getting it mostly out of NFSv4

David Noveck <davenoveck@gmail.com> Fri, 10 July 2020 11:07 UTC

References: <20200709165910.GU3100@localhost>
In-Reply-To: <20200709165910.GU3100@localhost>
From: David Noveck <davenoveck@gmail.com>
Date: Fri, 10 Jul 2020 07:07:34 -0400
Message-ID: <CADaq8jdcMJk8L98qpx2t0K6Vv00-Uc_Pph4HfHzGnaWOYzOFXA@mail.gmail.com>
To: Nico Williams <nico@cryptonector.com>
Cc: NFSv4 <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/nsYhnXprRekQ8KByyjnIGbVvD80>

On Thu, Jul 9, 2020, 12:59 PM Nico Williams <nico@cryptonector.com> wrote:

> I performed an early review of draft-dnoveck-nfsv4-internationalization-
> 01 on behalf of the i18ndir.  I'm not holding up progress.  It was only
> an early review.  I am taking advantage of this opportunity to revisit
> filesystem internationalization though.
>
> Yes, I do believe we (the IETF, the NFSv4 WG) have had filesystem I18N
> wrong for 17 years now, i.e., since RFC 3530.


The IESG clearly got this wrong in RFC 3530, but the working group did not.
The working group was not involved in drafting the internationalization
section that appeared in RFC 3530, and that section was never implemented.

In my view, RFC 7530 dealt with this adequately, although its handling of
the case-insensitive case needs work. The job now is to fix that and apply
the RFC 7530 treatment to all NFSv4 minor versions.

> This is not something new
> -- I've been saying something like this for almost as long as those 17
> years, and I've written about this on my blog and elsewhere many times.
>

Understood.


> Specifically:
>
>  - putting responsibility for normalization and case handling in a
>    filesystem server is simply wrong and -since implementors understand
>    that!- does not reflect running code
>

That's only true if one considers the server-side file system as not part
of the server.

I see no upside in this peculiar approach to terminology, and a great deal
of confusion arising from it.


>  - I18N responsibilities belong almost entirely to each filesystem
>

No. As you point out, I18N *implementation* needs to be done within
server-side file systems.

The question of responsibility is different. Within protocol
specifications, responsibilities are assigned to the communicating
entities.  There is no way to assign responsibility to N'th parties within
those entities.


>  - clients that cache directory contents with intent to perform local
>    lookups against the cache do need to know the I18N rules applied by
>    the remote filesystem
>

Good point. NFSv4 needs to do that.


> The NFSv4 protocol (and others) does need to support telling _caching_
> clients what I18N rules are applied by each filesystem (or even each
> directory).


Agreed.  That should be addressed via an NFSv4.2 extension. I hope your
document can provide the IANA basis for that.
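
To make the caching-client point concrete, here is a minimal Python sketch
of how a client might use such rules once a server can advertise them. The
NameRules type, its two fields, and the function names are hypothetical;
no existing NFSv4 attribute is implied, and NFD is used only as an
illustrative comparison form.

```python
import unicodedata
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class NameRules:
    """Hypothetical per-filesystem naming rules a server might advertise."""
    form_insensitive: bool
    case_insensitive: bool


def comparison_key(name: str, rules: NameRules) -> str:
    """Build a key used only for comparison; cached names are never altered."""
    key = name
    if rules.form_insensitive:
        key = unicodedata.normalize("NFD", key)
    if rules.case_insensitive:
        key = key.casefold()
        if rules.form_insensitive:
            # Case folding can produce denormalized text, so normalize again.
            key = unicodedata.normalize("NFD", key)
    return key


def cached_lookup(cached_names: Iterable[str], wanted: str,
                  rules: Optional[NameRules]) -> Optional[str]:
    """Return the matching cached name, or None if the cache cannot answer."""
    names = list(cached_names)
    if rules is None:
        # Rules unknown: only an exact match can safely be answered locally.
        return wanted if wanted in names else None
    wanted_key = comparison_key(wanted, rules)
    for name in names:
        if comparison_key(name, rules) == wanted_key:
            return name
    return None
```

Without the rules, a client holding a precomposed name cannot tell whether
a decomposed spelling of it is the same file and has to ask the server.
With them, it can answer from its cache.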

> And perhaps NFSv4 servers need to reject invalid file name
> UTF-8 on create (and lookup),


This is currently OPTIONAL.  Your draft has this as a SHOULD.  We need to
discuss.
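
For reference, the check itself is small. A minimal Python sketch, assuming
the name arrives as raw bytes (the function name is hypothetical, and
whether a server applies such a check is exactly the OPTIONAL-versus-SHOULD
question):

```python
def is_valid_utf8_name(name: bytes) -> bool:
    """Return True if the bytes are well-formed UTF-8.

    Python's strict decoder rejects truncated sequences, overlong
    encodings, and encoded surrogates.
    """
    try:
        name.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

This only establishes that the bytes are well-formed UTF-8. It says nothing
about normalization form or case.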

> but that's it for the server's
> responsibilities.
>

Accepting, for the moment, your approach to terminology, yes.


> To help with this, I've written and submitted draft-williams-filesystem-
> 18n-00.


I've read it.  I see a lot of value within this document but have the
following major issues with the approach:

  o The approach taken to terminology makes it hard to distinguish the
document's real advances from the pseudo-advances that come from making a
problem someone else's responsibility.

  o Given that the entities whose behavior is discussed are outside the
IETF's area of responsibility, the document makes more sense as
Informational.

> Soon I'll submit -01.  (Back in 2013 I also submitted
> draft-williams-i18n-boundary-analysis-00.)
>
> This has led to lively


I assume you mean "heated".

> discussion on the i18ndir mailing list.  Some
> concede that this approach is correct but don't care to change past
> consensus.  Others think otherwise.  We've not reached consensus within
> the i18ndir, but we will, I'm sure.


This discussion has been going on for a very long time. I'm not holding my
breath.

> You can find the mailing list
> archives, though subscription is limited to members of the
> directorate.
>

That is not reasonable.  This document needs to be discussed by the wide
range of people it might affect. An appropriate forum needs to be
found/created.


> For this WG, this work should be exciting because it will mostly get the
> WG out of the business of dealing with I18N.


I don't see how.

> There will be a modicum of
> I18N work to do with caching clients.
>

I still anticipate problems getting people to read the documents.

Let's face it. You and I think this stuff is interesting but nobody else
does.


> Also, regarding running code, now that APFS implements the same approach
> as ZFS has for some 14 years now, I think it's fair to say that the best
> current practice for I18N in filesystems is:
>
>  - regarding normalization, implement form-insensitive, form-preserving
>    behavior
>
>  - do it in the filesystem
>
> Note that this treats normalization like case is typically treated in
> case-insensitive filesystems: preserve {case, form}, but be {case-,
> form-} insensitive.  Indeed, case-insensitivity is a lot like extending
> the equivalence problem normally dealt with by normalization.
>

True. The only problem is the locale-dependence of the latter, because of
the way Unicode (mis)handled Turkish.
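
A minimal Python sketch of the {case, form}-preserving, {case-, form-}
insensitive behavior, and of where Turkish breaks the analogy. The
Directory class and lookup_key function are hypothetical illustrations,
not any file system's implementation, and NFD plus Unicode default case
folding are assumed as the comparison rules:

```python
import unicodedata
from typing import Dict, Optional


def lookup_key(name: str, case_insensitive: bool = False) -> str:
    """Key used only for comparison; the stored name is never altered."""
    key = unicodedata.normalize("NFD", name)
    if case_insensitive:
        # Locale-independent default folding, renormalized afterwards
        # because folding can produce denormalized text.
        key = unicodedata.normalize("NFD", key.casefold())
    return key


class Directory:
    """Toy directory: insensitive on lookup, preserving on create."""

    def __init__(self, case_insensitive: bool = False) -> None:
        self._case_insensitive = case_insensitive
        self._entries: Dict[str, str] = {}  # comparison key -> name as created

    def create(self, name: str) -> None:
        key = lookup_key(name, self._case_insensitive)
        if key in self._entries:
            raise FileExistsError(name)
        self._entries[key] = name

    def lookup(self, name: str) -> Optional[str]:
        return self._entries.get(lookup_key(name, self._case_insensitive))


d = Directory()
d.create("caf\u00e9")                 # precomposed e-acute
found = d.lookup("cafe\u0301")        # decomposed spelling finds the same entry
same = lookup_key("I", True) == lookup_key("i", True)              # True
turkish = lookup_key("\u0131", True) == lookup_key("I", True)      # False
```

Normalization equivalence is the same everywhere, so the first lookup works
for every user. Default case folding pairs "I" with "i", whereas a Turkish
user expects "I" to pair with dotless "ı" (U+0131), and no single
locale-independent folding satisfies both.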

>
> The only reasonable alternative is HFS+'s, which is closer to 20 years
> old now, and that is to normalize on create (and lookup).  But Apple has
> clearly abandoned this approach with APFS.
>
> I took a look and Linux does have kernel Unicode form- and case-
> insensitivity utility code, but it's generally only used by
> case-insensitive filesystems.  It should be relatively simple to add
> support for a form-insensitive filesystem option.  This isn't needed to
> support the case about running code, but anyways, it'd be nice.
>

I'd like to see an unencumbered open source version. Sigh!


> Nico