Re: [I18ndir] I-D on filesystem I18N

John C Klensin <john-ietf@jck.com> Wed, 08 July 2020 01:32 UTC

Date: Tue, 07 Jul 2020 21:31:55 -0400
From: John C Klensin <john-ietf@jck.com>
To: Asmus Freytag <asmusf@ix.netcom.com>, Nico Williams <nico@cryptonector.com>
cc: i18ndir@ietf.org
Message-ID: <9044C737C36C0787B9EAE190@PSB>
In-Reply-To: <90740541-ab72-ffaf-ff3e-5a27b5805eae@ix.netcom.com>
References: <20200706225139.GJ3100@localhost> <90740541-ab72-ffaf-ff3e-5a27b5805eae@ix.netcom.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/Dxs41w4juS4BvsmimR0ev7WLsbE>

Just two quick comments - can't spend more time on this today:

--On Tuesday, July 7, 2020 16:43 -0700 Asmus Freytag
<asmusf@ix.netcom.com> wrote:

> However, early case-insensitive file systems did not preserve
> case. Not sure how rare this has become.

Well, Unix, and every Unix-derived system I know of (definitely
including Linux, FreeBSD, and NetBSD) are case-sensitive and
getting anywhere near their file names with Case Folding or even
lower casing will cause rather interesting problems.  On the
case-insensitive side, I've heard of a system called "Windows"
that usually preserves case but does not guarantee to do so, and
many operations, in practice, don't.

I assume I don't need to comment on how rare those systems are.
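
As a rough illustration of the "interesting problems" on a
case-sensitive system, the sketch below groups names that would
become indistinguishable if case folding were applied to them.
It is only a sketch under stated assumptions: the helper
find_collisions and the sample names are hypothetical, and
Python's str.casefold() stands in for whatever folding a given
design might apply.

```python
def find_collisions(names):
    """Group names that become identical after Unicode case folding.

    Hypothetical helper for illustration only; it works on plain
    strings and never touches a real file system.
    """
    groups = {}
    for name in names:
        # str.casefold() applies Unicode full case folding.
        groups.setdefault(name.casefold(), []).append(name)
    # Keep only folded keys shared by more than one original name.
    return {key: members for key, members in groups.items() if len(members) > 1}


# Assumed sample directory listing on a case-sensitive system.
names = ["Makefile", "makefile", "README", "notes.txt"]
collisions = find_collisions(names)
```

On a case-sensitive file system "Makefile" and "makefile" are
two distinct entries; after folding they share a single key, so
any layer that folds names can no longer tell them apart.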

In addition, Asmus caught at least one case where your
terminology is inconsistent with that of Unicode.  Such
inconsistencies (at least if not specifically identified and
justified) are an invitation to reader confusion or worse.  As
one more example, you should not be talking about case folding
and then lower casing.  They are different and have different
implications.  And, fwiw, most of my Turkish-speaking and
writing colleagues would disagree that U+0131 "could" or should
be considered equivalent to U+0069.  In my experience, that
would be considered even less correct than treating U+0061 and
U+00FC as equivalent in Swedish or German.  There are certainly
theories and models that allow that, but they are closely
related to the ones that suggest that the easy way to handle
non-English languages on the Internet is by transliteration
rather than dealing with all of those messy non-Latin, and maybe
decorated Latin, scripts and writing systems.
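
To make the distinction between lower casing and case folding
concrete, here is a minimal sketch using Python's built-in
str.lower() and str.casefold(). The choice of Python and the
sample word are assumptions made for illustration; nothing in
this thread prescribes them. It also shows that the default,
locale-independent operations do not map U+0131 to U+0069
directly, although an upper-then-lower round trip does.

```python
DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I
DOTTED_I = "\u0069"   # LATIN SMALL LETTER I

# Lower casing and case folding differ: folding expands U+00DF
# (sharp s) to "ss", lower casing leaves it alone.
lowered = "Stra\u00dfe".lower()
folded = "Stra\u00dfe".casefold()

# Default case folding leaves U+0131 unchanged, so it is not
# treated as equivalent to U+0069 by folding alone.
dotless_folded = DOTLESS_I.casefold()

# A locale-independent upper/lower round trip loses the
# distinction: U+0131 -> U+0049 -> U+0069.
round_trip = DOTLESS_I.upper().lower()

print(lowered == folded)           # False
print(dotless_folded == DOTTED_I)  # False
print(round_trip == DOTTED_I)      # True
```

The last line is the sort of silent conflation that a Turkish
reader would consider wrong, and it arises only because the
default mappings ignore language context.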

best,
   john