Re: [I18ndir] I-D on filesystem I18N

Nico Williams <nico@cryptonector.com> Tue, 07 July 2020 07:05 UTC

Date: Tue, 07 Jul 2020 02:04:57 -0500
From: Nico Williams <nico@cryptonector.com>
To: Patrik Fältström <patrik@frobbit.se>
Cc: i18ndir@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/zF37iw6enYtPPe6LFYo6y1T1iW8>

On Tue, Jul 07, 2020 at 07:01:42AM +0200, Patrik Fältström wrote:
> Nico, I think this is good stuff. I must though of course(!) come with
> some suggestions and ideas that can make this even better.

Thanks!

> When you talk about what context this document is about, I feel you
> should explicitly say that you do not deal with RTL/LTR issues. This
> ends up being something that is very very important as well, but
> display issues are definitely not within scope for this document.

Quite true!

> https://stupid.domain.name/node/681
> https://stupid.domain.name/node/682
> https://stupid.domain.name/node/683

:)

> When looking at case folding, I normally succeed by explaining that
> the definition of case folding in fact is a function, not an attribute
> on the character. I.e. one can apply the function lower_case() to a
> character, and then with the help of that function say that "a
> character is said to be lower case if lower_case(s) is s", i.e. that
> the character is stable when applying the lower_case() function to it.
> 
> The reason for this is that it makes it easier to explain in the next
> step that the function might very well be (as you say) locale
> dependent, and I think more important that lower_case() and
> upper_case() are two functions that might not be inverse of each
> other. I.e. just because t = lower_case(s) might not imply s =
> upper_case(t).

And case folding can be designed for case-insensitive comparisons, so it
need not be the same as tolower().  That's why we speak of case folding,
not "lowering case" or such like.
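
To make the points above concrete, here is a minimal sketch using Python's
built-in str methods, which implement the locale-independent Unicode
default mappings. The helper name is_lower_case() is hypothetical; it only
illustrates the "stable under the function" definition, and the sample
characters are my own choice.

```python
def is_lower_case(s: str) -> bool:
    """Hypothetical helper: s is 'lower case' if lowering leaves it unchanged."""
    return s.lower() == s


# Lowering and uppercasing are not inverses of each other.
s = "\u00df"                 # LATIN SMALL LETTER SHARP S
t = s.lower()                # already stable, so t == s
not_inverse = t.upper() != s # upper() yields "SS", which is not s

# Case folding is not the same operation as lowering.
# Both sigmas are stable under lower(), yet they fold to the same string.
final_sigma = "\u03c2"       # GREEK SMALL LETTER FINAL SIGMA
sigma = "\u03c3"             # GREEK SMALL LETTER SIGMA
both_lower = is_lower_case(final_sigma) and is_lower_case(sigma)
fold_equal = final_sigma.casefold() == sigma.casefold()

print(is_lower_case(s), not_inverse, both_lower, fold_equal)
```

Locale-dependent rules (for example Turkish dotted and dotless i) are not
covered by these default mappings and would need tailoring.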

> When this is understood(!) one can move forward and explain why for
> example IDNA only look at stability of normalisation and case folding
> to lower case. I.e. that that is one of the main ways "stability"
> regarding characters is defined in the IETF, and then how/why we need
> that stability after storage (as you explained).

Yes.  But it's easier for filesystems anyway, since the rules can vary
by filesystem and are applied by the filesystem (and caching clients),
so we have more locality and less in the way of global rules.  That
goes for case, at least; fortunately there's no need for local
variations in normalization forms (NFs)!  In contrast, in DNS, the rules
are not applied by the servers[*], which complicates any attempt to vary
the rules applied by clients to domain names under different TLDs.

That's a distinction probably worth being more explicit about.  In order
to engineer a solution, one has to know the constraints.  The
constraints in DNS are different than in filesystems.
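
As an illustration of rules that are chosen per filesystem and applied by
the filesystem itself, here is a toy sketch.  Everything in it (the
Directory class, fold_key(), the case_insensitive flag) is hypothetical and
not taken from any real filesystem.  The key function uses NFC plus
Python's str.casefold() as a simplification; a real implementation might
well pick a different normalization form or folding table.

```python
import unicodedata
from typing import Dict, Optional


def nfc(name: str) -> str:
    """Normalization-insensitive key: compare names by their NFC form."""
    return unicodedata.normalize("NFC", name)


def fold_key(name: str) -> str:
    """Case- and normalization-insensitive key (simplified, an assumption)."""
    return nfc(nfc(name).casefold())


class Directory:
    """Toy directory: the filesystem, not the caller, applies the rules."""

    def __init__(self, case_insensitive: bool) -> None:
        # The case policy is a per-filesystem choice; normalization is not.
        self._key = fold_key if case_insensitive else nfc
        self._entries: Dict[str, str] = {}

    def create(self, name: str) -> bool:
        key = self._key(name)
        if key in self._entries:
            return False              # an equivalent name already exists
        self._entries[key] = name     # store the name exactly as given
        return True

    def lookup(self, name: str) -> Optional[str]:
        return self._entries.get(self._key(name))


ci = Directory(case_insensitive=True)
ci.create("R\u00e9sum\u00e9.txt")                  # precomposed e-acute
found_ci = ci.lookup("RE\u0301SUME\u0301.TXT")     # decomposed, upper case

cs = Directory(case_insensitive=False)
cs.create("R\u00e9sum\u00e9.txt")
found_nf = cs.lookup("Re\u0301sume\u0301.txt")     # decomposed, same case
found_case = cs.lookup("re\u0301sume\u0301.txt")   # differs in case
```

Both directories treat differently normalized names as the same entry;
only the case policy differs between them, and the stored name keeps the
form in which it was created.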

[*] In principle, I suppose, DNS servers could pretend that all case
    variations of U-labels exist with the same content as the canonical
    A-labels.  (After all, that's what DNS servers are required to do
    for ASCII labels.)  But I suspect no one wants to implement this
    functionality in their DNS servers!  (This would be not unlike,
    e.g., Active Directory pretending that all case variations of the
    same realm name are essentially the same.)
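
For comparison, the ASCII-only matching that DNS servers already do can be
sketched as below.  The function names are hypothetical.  Only the octets
for A-Z are folded, so case variants of non-ASCII characters in UTF-8 are
left distinct, which is the gap the footnote is about.

```python
def dns_ascii_fold(label: bytes) -> bytes:
    """Fold only the ASCII letters A-Z (0x41-0x5A); leave other octets alone."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def labels_match(a: bytes, b: bytes) -> bool:
    """ASCII case-insensitive label comparison."""
    return dns_ascii_fold(a) == dns_ascii_fold(b)


ascii_match = labels_match(b"Example", b"eXAMPLE")
# UTF-8 for capital and small e-acute: not folded by the ASCII rule.
utf8_match = labels_match("\u00c9".encode("utf-8"), "\u00e9".encode("utf-8"))
print(ascii_match, utf8_match)
```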

Nico
--