Re: [I18ndir] [art] New Version Notification for draft-bray-unichars-06.txt

Carsten Bormann <cabo@tzi.org> Sun, 01 October 2023 16:34 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/QWn7KL0dL5aB18gQmqdCal5F9mQ>

On 2023-10-01, at 18:27, Rob Sayre <sayrer@gmail.com> wrote:
> 
> Normalization actually does come up in application code sometimes. For example, tweets need to be NFC to count characters correctly [1]. But the person writing that code is not going to learn much from this document (I do find the subsets useful and concise, though).

See https://www.ietf.org/archive/id/draft-bormann-dispatch-modern-network-unicode-03.html#name-normalization
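To make the counting point concrete, here is a minimal sketch, assuming "characters" are counted as code points (the actual tweet-counting rules are not reproduced here). It only shows that the same visible text has different lengths before and after NFC; the variable names are illustrative.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: the decomposed spelling of "é".
decomposed = "e\u0301"

# NFC composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# Same text to a reader, different code point counts.
print(len(decomposed), len(composed))
```

A counter that skips normalization would therefore charge the decomposed spelling twice as much as the composed one.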

> Unpaired surrogate code points and their escaped form need to be covered, since the standard behavior of billions of web browsers is to smuggle them in well-formed UTF-8 via escape sequences. [2]

But this is about JavaScript (and its half-broken character model), not about Unicode.
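For readers who want to see the escaped form in action, a small sketch follows. It uses Python's json module rather than JavaScript, purely as an illustration; the helper name encodes_as_utf8 is hypothetical. The JSON text on the wire is plain ASCII, so it is well-formed UTF-8, yet the decoded string holds a lone surrogate that cannot itself be encoded as UTF-8.

```python
import json

# JSON text containing the six-character escape \ud800 (an unpaired high surrogate).
wire_text = '"\\ud800"'
wire_bytes = wire_text.encode("utf-8")  # ASCII only, so well-formed UTF-8

# Python's json module accepts the lone escape and yields a str holding U+D800.
decoded = json.loads(wire_bytes.decode("utf-8"))

def encodes_as_utf8(s: str) -> bool:
    """Hypothetical helper: report whether s can be encoded as UTF-8."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False

print(hex(ord(decoded)), encodes_as_utf8(decoded))
```

The surrogate only comes into existence when the escape is interpreted, which is why a check for well-formed UTF-8 on the transport does not catch it.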

Regards, Carsten