Re: [I18ndir] I-D on filesystem I18N

Nico Williams <nico@cryptonector.com> Wed, 08 July 2020 22:10 UTC

Date: Wed, 08 Jul 2020 17:08:45 -0500
From: Nico Williams <nico@cryptonector.com>
To: John Levine <johnl@taugh.com>
Cc: i18ndir@ietf.org, john-ietf@jck.com
Message-ID: <20200708220843.GR3100@localhost>
References: <9044C737C36C0787B9EAE190@PSB> <20200708203811.6CE9D1C6D7BD@ary.qy> <20200708211825.GQ3100@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/qlKlhJWmhp_EySGpn9XLPRLkYF0>

Apparently, with APFS Apple switched from normalize-on-create to
form-insensitivity:

https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html#//apple_ref/doc/uid/TP40016999-CH6-DontLinkElementID_3

| APFS accepts only valid UTF-8 encoded filenames for creation, and
| preserves both case and normalization of the filename on disk in all
| variants. APFS, like HFS+, is case-sensitive on iOS and is available in
| case-sensitive and case-insensitive variants on macOS, with
| case-insensitive being the default.
| 
| In macOS High Sierra, APFS is normalization-insensitive in both the
| case-insensitive and case-sensitive variants, using a hash-based native
| normalization scheme. In iOS 11, APFS is normalization-insensitive as
| well, using either a native normalization scheme (erase restores only)
| or runtime normalization scheme (upgrades from previous versions).
| Runtime normalization will also be available in iOS 10.3.3 and macOS
| Sierra 10.12.6. Being normalization-insensitive ensures that
| normalization variants of a filename cannot be created in the same
| directory, and that a filename can be found with any of its
| normalization variants. This means that you don’t need to do any
| additional work to ensure correct normalization behavior in these
| versions of macOS and iOS.

With this, I think it's fair to say that form-insensitivity is now the
best common practice in filesystems!
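
To make "form-insensitive, form-preserving" concrete, here is a minimal
Python sketch of a toy directory. It is only an illustration: the class
name FormInsensitiveDir and the choice of NFD as the comparison form are
assumptions of mine, and this is not how APFS implements it (Apple
describes a hash-based scheme). Names are stored exactly as created but
compared in normalized form.

```python
import unicodedata


class FormInsensitiveDir:
    """Toy directory: names are stored as given, compared by normalized key."""

    def __init__(self):
        self._entries = {}  # normalized key -> (name as created, contents)

    @staticmethod
    def _key(name):
        # Assumption: NFD as the comparison form; any single form would do.
        return unicodedata.normalize("NFD", name)

    def create(self, name, contents):
        key = self._key(name)
        if key in self._entries:
            # A normalization variant of this name already exists.
            raise FileExistsError(name)
        self._entries[key] = (name, contents)

    def lookup(self, name):
        return self._entries[self._key(name)][1]

    def listing(self):
        # Names come back in the form they were created with.
        return [stored for stored, _ in self._entries.values()]


nfc = "caf\u00e9"    # precomposed e-acute
nfd = "cafe\u0301"   # "e" followed by a combining acute accent

d = FormInsensitiveDir()
d.create(nfc, b"hello")
found = d.lookup(nfd)  # found via the other normalization variant
```

Creating the NFD spelling after the NFC one raises FileExistsError here,
which matches the "variants cannot be created in the same directory"
behavior in the quote, while the listing still returns the name as
originally typed.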

(It'd be nice to have a survey of Unicode support in the various Linux
and *BSD filesystems, like BTRFS, HAMMER, HAMMER2, etc.
https://en.wikipedia.org/wiki/Comparison_of_file_systems does not
mention this stuff.)

Earlier versions of APFS, though, did no normalization and were not
form-insensitive:

| In iOS 10.3 and in the case-sensitive variant of the developer preview
| of APFS in macOS Sierra, APFS is normalization-sensitive ...

which yielded this complaint:

https://eclecticlight.co/2017/04/06/apfs-is-currently-unusable-with-most-non-english-languages/
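
For anyone who hasn't run into this: the underlying problem is that the
two normalization forms of the same visible name are different byte
strings, so a normalization-sensitive filesystem treats them as
different names. A small Python sketch (variable names are mine, purely
illustrative):

```python
import unicodedata

composed = "\u00e9"                                  # e-acute as one code point
decomposed = unicodedata.normalize("NFD", composed)  # "e" + U+0301

composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

# Byte-wise comparison, as a normalization-sensitive filesystem would do.
same_bytes = composed_bytes == decomposed_bytes

# Comparison after normalizing both sides to one form.
same_after_nfc = (unicodedata.normalize("NFC", composed)
                  == unicodedata.normalize("NFC", decomposed))
```

Here same_bytes is False and same_after_nfc is True, so an application
that hands one form to open(2) after creating the file with the other
form will not find it on a normalization-sensitive filesystem.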

Oops.  So Apple says:

| To avoid introducing bugs in your code with mismatched Unicode
| normalization (for iOS 10.3.0, 10.3.1 and 10.3.2) in filenames, do the
| following:
|
|  o Use high-level Foundation APIs such as NSFileManager and NSURL when
|    interacting with the filesystem.
|
|  o Use the fileSystemRepresentation property of NSURL objects when
|    creating and opening files with lower-level filesystem APIs such
|    as POSIX open(2), or when storing filenames externally from the
|    filesystem.

So... like Git and rsync as mentioned here earlier.  But Apple did fix
APFS.

Nico
--