Re: [I18ndir] [art] Modern Network Unicode

John C Klensin <> Wed, 10 July 2019 23:34 UTC

Date: Wed, 10 Jul 2019 19:33:52 -0400
From: John C Klensin <>
To: Asmus Freytag <>,, Carsten Bormann <>

--On Wednesday, July 10, 2019 16:00 -0700 Asmus Freytag
<> wrote:

> On 7/10/2019 12:09 AM, John C Klensin wrote:
>> For the record, I do have one other concern.  The examples
>> above use extended Latin script.  Because of its NVT origins,
>> much of 5198 makes assumptions about that script or scripts
>> closely related to it.  If you are doing something for this
>> century and beyond, you should really think carefully about
>> the implications of scripts that are very different.
> There are a few scripts where un-normalized text is
> "preferred" by the user community over NFC. In some cases, the
> most natural ordering of combining marks does not match NFC's
> canonical ordering. In other cases, NFC does not compose some
> sequences while local user communities strongly prefer the
> precomposed code points (e.g. Bengali).
> Those scripts would be an exception to John's statement: "NFC
> is also a close approximation to what any sensible terminal
> driver or IME is going to produce natively from a plausible
> keyboard layout for the relevant script", a statement that
> otherwise holds well.

Indeed.  The explanation is also more detailed than I had time
for, and better than I would have written if I had tried, as to
why encouraging everything to be forced into NFC before
transmission is probably unwise.
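
A minimal sketch of the Bengali case mentioned above, assuming Python's
standard unicodedata module and picking U+09DC (BENGALI LETTER RRA) as
the illustrative code point; that choice is mine, not from the thread.
U+09DC is listed as a composition exclusion, so NFC leaves it in
decomposed form, while a typical Latin sequence is composed:

```python
import unicodedata


def codepoints(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# Illustrative choice: U+09DC BENGALI LETTER RRA, a composition exclusion.
bengali_precomposed = "\u09DC"
bengali_nfc = unicodedata.normalize("NFC", bengali_precomposed)

# Contrast: Latin "e" followed by U+0301 COMBINING ACUTE ACCENT.
latin_decomposed = "e\u0301"
latin_nfc = unicodedata.normalize("NFC", latin_decomposed)

print(codepoints(bengali_precomposed), "->", codepoints(bengali_nfc))
print(codepoints(latin_decomposed), "->", codepoints(latin_nfc))
```

So forcing NFC before transmission would replace the precomposed
letter that the user typed with a two-code-point sequence.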