Re: [I18ndir] [art] New Version Notification for draft-bray-unichars-06.txt

"Manger, James" <James.H.Manger@team.telstra.com> Sat, 07 October 2023 01:09 UTC

From: "Manger, James" <James.H.Manger@team.telstra.com>
To: Tim Bray <tbray@textuality.com>
CC: "i18ndir@ietf.org" <i18ndir@ietf.org>, ART Area <art@ietf.org>
Thread-Topic: [art] New Version Notification for draft-bray-unichars-06.txt
Thread-Index: AQHZ9UuFPCV3ExadKEGgYt15nltTT7A3GDaAgAABPJ6ABlCCAIAAE/ds
Date: Sat, 07 Oct 2023 01:09:06 +0000
Message-ID: <SY4PR01MB5980D4B50C9E4E4A1AF92CC4E5C8A@SY4PR01MB5980.ausprd01.prod.outlook.com>
References: <169566019635.41806.9804796677919971070@ietfa.amsl.com> <CAHBU6is-wU2NLXNWL56nSJ4=nKvDzGv_Aw4qJN6N2O8CuM4-yw@mail.gmail.com> <SYBPR01MB59814B3448F5754AAEDA1740E5C7A@SYBPR01MB5981.ausprd01.prod.outlook.com> <CAHBU6iueqtd5T1T-ciYUMWvmo8XqBQqO5LkWbdRaoXQzPYSQOQ@mail.gmail.com> <SY4PR01MB5980D009F1623E3694B871B7E5C5A@SY4PR01MB5980.ausprd01.prod.outlook.com> <CAChr6SzMXqmEJvwQ0Vb0+CfchBn2kMueQJ-2Th1=4Oct8b9t6A@mail.gmail.com> <E1464943-EB11-4FA4-B933-4F138C6C34A0@tzi.org> <CAHBU6itgC07j0P5DcACDyHSjEOG6=j5kWE=eYF8E0NA3mm_b5A@mail.gmail.com> <SY4PR01MB59803C733B6B6A1C9D4E04F4E5C5A@SY4PR01MB5980.ausprd01.prod.outlook.com> <CAHBU6iuEbKOri56HiTB+HcsPKOpXJArFpbkVnf68=5i8FMWPUg@mail.gmail.com>
In-Reply-To: <CAHBU6iuEbKOri56HiTB+HcsPKOpXJArFpbkVnf68=5i8FMWPUg@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/KauyMjLMf9ZTNk1NX4WUE3-MYTQ>
Subject: Re: [I18ndir] [art] New Version Notification for draft-bray-unichars-06.txt

On Oct 2, 2023 at 5:20:59 PM, "Manger, James" <James.H.Manger@team.telstra.com> wrote:
draft-bray-unichars<https://datatracker.ietf.org/doc/html/draft-bray-unichars> §3 “Dealing with problematic code points” suggests “replacing problematic code points with "�" (U+FFFD, REPLACEMENT CHARACTER)” (or signalling an error, but I’ll only talk about the replacement option in this email).

  1.  An ill-formed sequence of code units needs to be replaced. It is far less obvious to me that “problematic” scalars should be replaced. Even for noncharacters, Unicode provides a good FAQ<https://www.unicode.org/faq/private_use.html#nonchar9> and corrigendum #9 “Clarification about noncharacters”<https://www.unicode.org/versions/corrigendum9.html>, which suggest that passing them along (treating them like unassigned scalars) is often the best policy (because the internal/interchange boundary is blurry).
> OK, that’s worth a reference.

  1.  So §4.3 defining unicode-assignable to exclude noncharacters is fine; when to be lenient on receiving a supposed unicode-assignable value is less obvious.
But §3 looks dodgy.
> Would a note that it might be reasonable to accept nonchars, referencing that corrigendum, de-dodgify it in your view?

If it might be reasonable to accept nonchars, it presumably might be reasonable to accept controls or any scalar. To de-dodgify, the text should not conflate ill-formed code units with scalars.
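To illustrate the distinction, here is a minimal Python sketch. The helper name `describe` is hypothetical and not from the draft; it relies only on Python's built-in strict UTF-8 decoder. Ill-formed code units never yield a scalar value at all, whereas a noncharacter or a control is a perfectly well-formed scalar that a decoder passes through.

```python
def describe(data: bytes) -> str:
    """Report whether bytes are ill-formed UTF-8 or decode to scalar values."""
    try:
        text = data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return "ill-formed"
    return "scalars: " + " ".join(f"U+{ord(ch):04X}" for ch in text)


print(describe(b"\xc0\xaf"))      # overlong encoding: not UTF-8 at all
print(describe(b"\xef\xbf\xbe"))  # U+FFFE, a noncharacter: well-formed
print(describe(b"\x00"))          # U+0000, a control: well-formed
```

Only the first input is an encoding-level failure; the other two are questions about a repertoire, which is a separate layer.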

“Virtuous intolerance” [RFC9413<https://www.rfc-editor.org/rfc/rfc9413.html#name-virtuous-intolerance>] with respect to UTF-8/16/32 is clear and widely implemented: signal an error, or replace with U+FFFD (or U+003F). Presumably this is why JavaScript is changing JSON.stringify to always escape an unpaired surrogate (not just accepting escaped unpaired surrogates in JSON.parse).
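As a rough sketch of those two widely implemented options, Python's built-in decoder exposes both through its `errors` parameter. The last part is only a Python analogue of the JSON point (it says nothing about JavaScript's actual behaviour): `json.dumps`, with its default `ensure_ascii=True`, escapes a lone surrogate instead of emitting text that cannot be encoded as UTF-8. The sample values are made up for illustration.

```python
import json

raw = b"ab\xffcd"  # the byte 0xFF can never appear in well-formed UTF-8

# Option 1: signal an error.
try:
    raw.decode("utf-8", errors="strict")
    strict_result = "accepted"
except UnicodeDecodeError:
    strict_result = "rejected"

# Option 2: replace the ill-formed sequence with U+FFFD.
replaced = raw.decode("utf-8", errors="replace")

# Python analogue of the JSON point: a lone surrogate is written as an
# ASCII escape, so the serialized text is always encodable as UTF-8.
lone = "\ud83d"
escaped = json.dumps(lone)

print(strict_result)
print(replaced == "ab\ufffdcd")
print(escaped)
print(json.loads(escaped) == lone)
```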

“Virtuous intolerance” with respect to xml-character or unicode-assignable is less clear to me. Maybe it is left to future specs that refer to these repertoires? Or maybe this doc can pick “rules for consistent handling of aberrant conditions”. That means this doc doesn’t merely name some repertoires but adds handling rules. Sounds feasible; could be controversial. Can we pick “always signal an error”? Or do we need to offer “or replace scalars-not-in-the-repertoire with U+FFFD”? It’s just that I’m not sure any systems do the latter.
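For concreteness, the two candidate rules could be sketched as below. This is an assumption-laden illustration, not the draft's definition: `in_repertoire` is a hypothetical approximation of unicode-assignable (everything except surrogates, legacy controls other than tab/LF/CR, and noncharacters), and the function names are invented here. The draft's own ABNF is the authority on the exact ranges.

```python
REPLACEMENT = "\ufffd"


def in_repertoire(cp: int) -> bool:
    """Hypothetical approximation of a unicode-assignable style repertoire."""
    if 0xD800 <= cp <= 0xDFFF:  # surrogates
        return False
    if cp < 0x20 and cp not in (0x09, 0x0A, 0x0D):  # C0 controls except tab, LF, CR
        return False
    if 0x7F <= cp <= 0x9F:  # DEL and C1 controls
        return False
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:  # noncharacters
        return False
    return True


def check_or_raise(text: str) -> str:
    """Rule A: always signal an error for a scalar outside the repertoire."""
    for i, ch in enumerate(text):
        if not in_repertoire(ord(ch)):
            raise ValueError(f"U+{ord(ch):04X} at index {i} is outside the repertoire")
    return text


def replace_outside(text: str) -> str:
    """Rule B: replace each scalar outside the repertoire with U+FFFD."""
    return "".join(ch if in_repertoire(ord(ch)) else REPLACEMENT for ch in text)
```

Both functions operate on already-decoded text, so they sit above (and separate from) the UTF-8/16/32 well-formedness check.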

In any case, I’d like to see any such “virtuous intolerance” rules for this doc’s repertoires described separately from Unicode’s existing “virtuous intolerance” rules for UTF-8/16/32.

--
James Manger