Re: [apps-discuss] Review of draft-ietf-appsawg-file-scheme

Ned Freed <> Wed, 11 May 2016 14:39 UTC

Date: Wed, 11 May 2016 07:27:54 -0700 (PDT)
From: Ned Freed <>
In-reply-to: "Your message dated Tue, 10 May 2016 12:06:28 -0400" <>
To: John C Klensin <>
Cc: Julian Reschke <>,, IETF Apps Discuss <>
Subject: Re: [apps-discuss] Review of draft-ietf-appsawg-file-scheme

> --On Tuesday, May 10, 2016 19:19 +0900 "Martin J. Dürst"
> <> wrote:

> > In general, NFC gives you a higher chance for a match than
> > NFD. The Mac filesystem uses (mostly) NFD internally, but is
> > able to handle NFC. On the other hand, Windows and Linux don't
> > do normalization inside the file system, but the chances that
> > files were created in NFC are higher than for NFD.
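
A minimal sketch of the matching problem described above, using Python's standard `unicodedata` module. The filename and the `same_name` helper are made up for illustration; nothing here is taken from the draft.

```python
import unicodedata

# Hypothetical filename. "é" is one code point in NFC (U+00E9) and
# two in NFD ("e" followed by U+0301 COMBINING ACUTE ACCENT).
name_nfc = unicodedata.normalize("NFC", "caf\u00e9.txt")
name_nfd = unicodedata.normalize("NFD", "caf\u00e9.txt")


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two names after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A file system that does no normalization compares the raw strings,
# so the two spellings of the same visible name do not match.
print(name_nfc == name_nfd)
print(len(name_nfc), len(name_nfd))

# Normalizing both sides to one form before comparing makes them match.
print(same_name(name_nfc, name_nfd))
```

Which form to normalize to is exactly the open question in this thread; the sketch only shows that both sides have to agree on one.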

> Agreed.  But note that this is partially an artifact that
> illustrates why the i18n / "multilingual" versus localization
> issues are important.    NFC gives a higher chance for a match,
> especially with strings that are not systematically normalized
> because, if one is using a keyboard designed for a particular
> language or location, that keyboard is likely to support
> locally-used characters and hence far more likely to produce
> precomposed characters than combining sequences.  The same is
> generally true when people select characters from some sort of
> online character-picker, assuming the precomposed forms exist at
> all.   On the other hand, if I'm an experienced user of one
> script trying to use a keyboard designed for a wildly different
> script or one with too many distinct character forms
> ("graphemes" or "grapheme clusters") to allow single-stroke
> arrangements to work well, all bets are off.

> For some scripts, there are also what look from the outside like
> internal consistency problems with Unicode: for example, NFD is
> more internally consistent than NFC because many recently-added
> precomposed characters decompose under NFC rather than
> composing.  And some don't, leading to some of the problems that
> led to the "non-decomposable character" mess that led to the
> LUCID BOF and the IETF's apparent paralysis about Unicode 7.x.
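
The two behaviours mentioned above can be seen with Python's standard `unicodedata` module. The code points below are my own choice of examples, not ones named in the thread: U+0958 is a precomposed character that NFC decomposes, and U+08A1 (added in Unicode 7.0) has no decomposition at all.

```python
import unicodedata


def nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)


# U+0958 DEVANAGARI LETTER QA is a composition exclusion: even NFC
# turns it into U+0915 followed by U+093C (nukta).
qa = "\u0958"
print([hex(ord(c)) for c in nfc(qa)])

# U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE behaves "normally":
# NFD splits it into U+0627 + U+0654, and NFC puts it back together.
alef_hamza = "\u0623"
print(nfc(nfd(alef_hamza)) == alef_hamza)

# U+08A1 ARABIC LETTER BEH WITH HAMZA ABOVE has no decomposition, so
# the sequence U+0628 + U+0654 never normalizes to it in either form.
beh_hamza = "\u08a1"
beh_plus_hamza = "\u0628\u0654"
print(nfd(beh_hamza) == beh_hamza)
print(nfc(beh_plus_hamza) == beh_hamza)
```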

> It is hard to say something in cases like this that will always
> deliver the best, or even the most-expected, results.

I'm trying, but I find it difficult to have much sympathy here. The underlying
problem is that repeated applications of the "this is a tiny bit better for
this constituency so we must have it or at least allow for it" rule in this
space have led to a situation where no single best practice exists.

If this sounds like we've reached a distinctly suboptimal Pareto optimum, it's
because that's exactly what has happened.

I therefore think the best thing is to document the situation as best
we can and be done with it. This is supposed to be a specification
for a URL scheme; there's no requirement that such documents also serve
as a BCP.