Re: [I18ndir] [art] Modern Network Unicode

John C Klensin <> Fri, 12 July 2019 02:36 UTC

Date: Thu, 11 Jul 2019 22:36:28 -0400
From: John C Klensin
To: Carsten Bormann, Ira McDonald
cc: "Asmus Freytag (c)"

--On Friday, July 12, 2019 01:46 +0200 Carsten Bormann wrote:

> This is a great discussion.
> To me, it seems to converge on the following.
> (1) Sending sane data is the job of the data originator.  
> (2) Do not include gratuitous normalization steps in your
> processing, once the data have been originated in a sane form.
> (2a) If you broke it, you fix it (as far as possible): If your
> processing steps did involve gratuitous normalization, you
> have to renormalize to NFC before sending.
> Here, "sane" is defined as:
> (0) Data SHOULD be originated in NFC, unless that would be
> inappropriate for the specific script, in which case the
> community consensus rules for the script govern.
> For Latin script, this happens to collapse to what 5198 says.
> This set of rules places the onus on the place where the data
> is generated, which is usually the place that knows most about
> the specific script and about the intent of the originator.
> If you know that place isn't doing its job, add the rule:
> (1a) If the data originator does not do (0), the software
> placing the data on the network may need to sanitize
> (normalize towards sane).
> 1a is similar to 2a in that it doesn't create perfect
> results, so both SHOULD be avoided — there is no way to,
> after the fact, perfectly sanitize data that weren't
> originated sane or that were gratuitously normalized on the
> way.
> With these definitions, MNU can direct towards: 
> (A) Senders: send sane data
> (B) Recipients: break as little as reasonable when data
> received isn't sane 
> (C) B is not a valid excuse not to do A,
> and specifically: recipients are not expected to clean up
> after senders (because there is no correct way to do that).
> (Rule C is the often forgotten third rule of the Postel
> principle. It also means that an entity that is a recipient of
> MNU and then sends the data on as MNU has no need to
> gratuitously normalize, but it does not entirely get rid of
> rule 1a for recipients of data from places known not to be
> sane.)

I think this is exactly right.  And, fwiw, I agree with your
observation about the Postel principle.
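
For concreteness, here is a rough Python sketch of how rules (0), (2),
and (2a) from the quoted summary might look in code. It is only one
reading of those rules, not anything MNU or RFC 5198 specifies. The
function names are hypothetical, and it ignores the "unless that would
be inappropriate for the specific script" exception in rule (0). Only
the standard-library unicodedata.normalize() call is assumed.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Check whether text is already in NFC (the 'sane' form assumed here)."""
    return unicodedata.normalize("NFC", text) == text


def originate(text: str) -> str:
    """Rule (0): the originator produces NFC.

    Assumption: NFC is appropriate for the script in question.
    """
    return unicodedata.normalize("NFC", text)


def forward(text: str) -> str:
    """Rule (2): an intermediary passes data on without normalizing it."""
    return text


def renormalize_before_sending(text: str) -> str:
    """Rule (2a): if our own processing disturbed the form, restore NFC.

    This is a best-effort repair. It cannot recover distinctions that
    an earlier normalization step already erased.
    """
    return text if is_nfc(text) else unicodedata.normalize("NFC", text)


# "e" followed by U+0301 COMBINING ACUTE ACCENT composes to U+00E9.
decomposed = "e\u0301"
composed = originate(decomposed)

# U+212B ANGSTROM SIGN becomes U+00C5 under NFC, so the original code
# point cannot be recovered afterwards.
angstrom = originate("\u212b")
```

The last assignment is the kind of case behind the remark that data
cannot be perfectly sanitized after the fact: once U+212B has been
folded into U+00C5, a recipient has no way to tell which code point
the originator actually sent.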