Re: [art] Modern Network Unicode — –02 submitted

John C Klensin <john-ietf@jck.com> Tue, 09 July 2019 21:09 UTC

Date: Tue, 09 Jul 2019 17:09:10 -0400
From: John C Klensin <john-ietf@jck.com>
To: Peter Occil <poccil14@gmail.com>, Carsten Bormann <cabo@tzi.org>, "Manger, James" <James.H.Manger@team.telstra.com>
cc: art@ietf.org
Message-ID: <F1C5158B356431D0A50952A8@PSB>
In-Reply-To: <5d23ea83.1c69fb81.23b36.b4fb@mx.google.com>
References: <CE3AD543-5847-4CAA-9B37-B293BF74C7D8@tzi.org> <SY2PR01MB2764141CB5B9863D0B358C39E5F60@SY2PR01MB2764.ausprd01.prod.outlook.com> <790BA1EF-7C0C-4E14-8BB2-4AD421ACAFB5@tzi.org> <5d23ea83.1c69fb81.23b36.b4fb@mx.google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/art/lpPzEAnYVmw7uzL06M2fOD60ABw>
List-Id: Applications and Real-Time Area Discussion <art.ietf.org>

Carsten and Peter,

Citing a corrigendum for Unicode 7 at this point is probably a
bad idea.  If you need a reference or more explanation for this,
see Section 23.7 of Unicode 11.0 (same section number in Unicode
10; I haven't taken the time to check Unicode 12).  That newer
section is also more complete than the text Peter cites.

best,
    john


--On Monday, July 8, 2019 21:14 -0400 Peter Occil
<poccil14@gmail.com> wrote:

> Saying U+FFFE and U+FFFF (or any other noncharacters) "MUST
> NOT be used" merely "as per the Unicode specification"
> ought to be guided by looking at what the intent of Unicode
> actually is.
> 
> See corrigendum 9 of the Unicode Standard for that intent:
> http://www.unicode.org/versions/corrigendum9.html
> 
> "Noncharacters in the Unicode Standard are intended for
> internal use and have no standard interpretation when
> exchanged outside the context of internal use. However, they
> are not illegal in interchange nor do they cause ill-formed
> Unicode text. This has always been the intent of the standard,
> as expressed by the Unicode Technical Committee."
> 
> Since noncharacters have no standard meaning outside of
> internal use, however, they may be even more problematic than
> the Unicode paragraph and line separators (which do have
> standard meaning but are forbidden in the draft for Modern
> Network Unicode).  Many kinds of protocol strings, such as
> URIs, IRIs, and strings complying with the PRECIS framework,
> do not allow noncharacter code points, while other kinds of
> UTF-8 text, such as JSON, do allow noncharacters.  XML allows
> all noncharacters except for U+FFFE and U+FFFF.
> 
> --Peter
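
For reference, the noncharacters discussed above are the 32 code
points U+FDD0..U+FDEF plus the last two code points of each of the
17 planes, 66 in total.  A minimal Python sketch of that
classification follows, alongside the XML 1.0 Char production that
the quoted text refers to.  The function names are hypothetical,
chosen for illustration only; they do not come from the draft or
from any library.

```python
def is_noncharacter(cp):
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # Contiguous block of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


def is_xml10_char(cp):
    """Return True if cp matches the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def find_noncharacters(text):
    """List (index, 'U+XXXX') for each noncharacter in text."""
    return [(i, "U+%04X" % ord(ch))
            for i, ch in enumerate(text)
            if is_noncharacter(ord(ch))]
```

Run over the whole code space, the first predicate matches 66
values, and U+FFFE and U+FFFF are the only noncharacters that fail
the XML check.  Whether a protocol should reject, replace, or pass
the code points that find_noncharacters reports is the policy
question under discussion here; the sketch only identifies them.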
> 
> From: Carsten Bormann
> Sent: Monday, July 8, 2019 4:24 PM
> To: Manger, James
> Cc: art@ietf.org
> Subject: Re: [art]Modern Network Unicode — –02 submitted
> 
> Hi James, Tim, Martin,
> 
> thank you for the quick feedback!  As suggested, I have
> started turning this into a standalone document (of course,
> still requiring RFC 3629, the UTF-8 definition):
> 
> Status:
> https://datatracker.ietf.org/doc/draft-bormann-dispatch-modern-network-unicode/
> Htmlized:
> https://tools.ietf.org/html/draft-bormann-dispatch-modern-network-unicode-02
> Diff:
> https://tools.ietf.org/rfcdiff?url2=draft-bormann-dispatch-modern-network-unicode-02
> 
> I hope I haven't missed anything important from RFC 5198
> that I wanted to keep.   Maybe time to involve the authors of
> that RFC…
> 
> Grüße, Carsten
> 
> 
>> On Jul 8, 2019, at 04:18, Manger, James
>> <James.H.Manger@team.telstra.com> wrote:
>> 
>> Would be nicer if it wasn't written as a diff from RFC5198
>> (Network Unicode). That is, if you could get all the rules
>> directly from this doc. For instance, I assume Clean Modern
>> Network Unicode must/should be NFC. Keep the RFC5198
>> comparisons for an informative annex.
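
The NFC condition James assumes can be tested with Python's
standard unicodedata module.  The sketch below is illustrative
only: the helper names are hypothetical, and whether the draft
actually requires NFC is not settled in this thread.

```python
import unicodedata


def is_nfc(text):
    """Return True if text is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def to_nfc(text):
    """Return text converted to Normalization Form C."""
    return unicodedata.normalize("NFC", text)


# "e" followed by U+0301 COMBINING ACUTE ACCENT is not NFC;
# normalizing composes the pair into U+00E9.
decomposed = "e\u0301"
composed = to_nfc(decomposed)
```

A receiver that requires NFC could reject input for which is_nfc
returns False, while a sender could apply to_nfc before
transmission; which side carries that burden is a design choice
for the specification.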