Re: [I18ndir] [art] New Version Notification for draft-bray-unichars-06.txt

"Manger, James" <James.H.Manger@team.telstra.com> Tue, 03 October 2023 00:21 UTC

From: "Manger, James" <James.H.Manger@team.telstra.com>
To: Tim Bray <tbray@textuality.com>
CC: "i18ndir@ietf.org" <i18ndir@ietf.org>, ART Area <art@ietf.org>
Thread-Topic: [art] New Version Notification for draft-bray-unichars-06.txt
Thread-Index: AQHZ9UuFPCV3ExadKEGgYt15nltTT7A3GDaAgAABPJ4=
Date: Tue, 03 Oct 2023 00:20:59 +0000
Message-ID: <SY4PR01MB59803C733B6B6A1C9D4E04F4E5C5A@SY4PR01MB5980.ausprd01.prod.outlook.com>
References: <169566019635.41806.9804796677919971070@ietfa.amsl.com> <CAHBU6is-wU2NLXNWL56nSJ4=nKvDzGv_Aw4qJN6N2O8CuM4-yw@mail.gmail.com> <SYBPR01MB59814B3448F5754AAEDA1740E5C7A@SYBPR01MB5981.ausprd01.prod.outlook.com> <CAHBU6iueqtd5T1T-ciYUMWvmo8XqBQqO5LkWbdRaoXQzPYSQOQ@mail.gmail.com> <SY4PR01MB5980D009F1623E3694B871B7E5C5A@SY4PR01MB5980.ausprd01.prod.outlook.com> <CAChr6SzMXqmEJvwQ0Vb0+CfchBn2kMueQJ-2Th1=4Oct8b9t6A@mail.gmail.com> <E1464943-EB11-4FA4-B933-4F138C6C34A0@tzi.org> <CAHBU6itgC07j0P5DcACDyHSjEOG6=j5kWE=eYF8E0NA3mm_b5A@mail.gmail.com>
In-Reply-To: <CAHBU6itgC07j0P5DcACDyHSjEOG6=j5kWE=eYF8E0NA3mm_b5A@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/-upaXvTI7KP41hbC0o4a1Wk1TO0>
Subject: Re: [I18ndir] [art] New Version Notification for draft-bray-unichars-06.txt

draft-bray-unichars<https://datatracker.ietf.org/doc/html/draft-bray-unichars> §3 “Dealing with problematic code points” suggests “replacing problematic code points with "�" (U+FFFD, REPLACEMENT CHARACTER)” (or signalling an error, but I’ll only talk about the replacement option in this email).


  1.  An ill-formed sequence of code units needs to be replaced. It is far less obvious to me that “problematic” scalars should be replaced. Even for noncharacters, Unicode provides a good FAQ<https://www.unicode.org/faq/private_use.html#nonchar9> and Corrigendum #9 “Clarification about noncharacters”<https://www.unicode.org/versions/corrigendum9.html>, which suggest that passing them along (treating them like unassigned scalars) is often the best policy (because the internal/interchange boundary is blurry).
So §4.3 defining unicode-assignable to exclude noncharacters is fine; when to be lenient on receiving a supposed unicode-assignable value is less obvious.
But §3 looks dodgy.
  2.  U+FFFD is an obvious choice to replace code units or scalars you don’t want. But Unicode does allow choices. Unicode ch3<https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf> C10 only says “with a marker such as U+FFFD”. Unicode TR36<https://unicode.org/reports/tr36/#Substituting_for_Ill_Formed_Subsequences> says “where U+FFFD is not available, a common alternative is "?"”. Java, for instance, uses “?” in some common circumstances. Unichars does not admit such an option.
  3.  “Silently ignoring” is the wrong phrase. The security risk is “deleting” ill-formed sequences or unwanted scalars. “Silently ignoring” feels the same as “deleting” when decoding code units to scalars, but feels different when processing input chars to output chars, where it also covers passing along an unwanted scalar untouched.
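
A minimal sketch of the distinctions in points 1 to 3, using Python's built-in codec error handlers. The sample byte string and variable names are illustrative assumptions, not taken from the draft or from Unicode; the point is only to contrast replacing, deleting, and passing along.

```python
# Illustrative input: 0xFF can never appear in well-formed UTF-8.
raw = b"<scr\xffipt>"

# Replacing: the ill-formed byte becomes the marker U+FFFD.
replaced = raw.decode("utf-8", errors="replace")

# Deleting: the ill-formed byte is dropped, so the text on either
# side is joined into a string that was never sent.
deleted = raw.decode("utf-8", errors="ignore")

# Where U+FFFD is not available (here, ASCII-only output),
# Python's "replace" handler substitutes "?" instead.
fallback = replaced.encode("ascii", errors="replace")

# A noncharacter is still a well-formed scalar value: a UTF-8
# round trip passes it along untouched rather than replacing it.
nonchar = "a\ufdd0b"
round_tripped = nonchar.encode("utf-8").decode("utf-8")

print(repr(replaced), repr(deleted), fallback, round_tripped == nonchar)
```

Here "ignore" is Python's name for what point 3 calls deleting; passing a noncharacter through unchanged is the other thing "silently ignoring" could be read to mean.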

--
James Manger





From: Tim Bray <tbray@textuality.com>
Date: Tuesday, 3 October 2023 at 9:40 am
To: Carsten Bormann <cabo@tzi.org>
Cc: Manger, James <James.H.Manger@team.telstra.com>, i18ndir@ietf.org <i18ndir@ietf.org>, ART Area <art@ietf.org>, Rob Sayre <sayrer@gmail.com>
Subject: Re: [art] New Version Notification for draft-bray-unichars-06.txt
On Oct 2, 2023 at 9:14:18 AM, Carsten Bormann <cabo@tzi.org> wrote:
 The IETF could pound its collective fist and say “all ill-formed Unicode must be rejected”,

Yes, please.
The fact that this is the only reasonable way forward is the point of RFC 9413.

Now we agree! And further (especially given the threats described in Unicode TR36) you often also want to reject control codes and noncharacters.  I think the IETF should be shouting this!

To promote this, it would be helpful if people actually understood what the problems are, and which code points to reject, and had a reference that explained the issues and provided ABNF for what to accept, at increasing levels of fussiness. Then when the IETF starts shouting, there will be a short clean reference to accompany the shouting.
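
As a rough illustration of "which code points to reject", the sketch below flags surrogates, noncharacters, and control codes in a Python string. It is not the draft's ABNF: the function names, the category labels, and the choice to let tab, LF, and CR through by default are assumptions made for the example. The code point ranges themselves are the standard Unicode ones.

```python
def is_surrogate(cp: int) -> bool:
    # U+D800..U+DFFF are reserved for UTF-16 and are not scalar values.
    return 0xD800 <= cp <= 0xDFFF

def is_noncharacter(cp: int) -> bool:
    # U+FDD0..U+FDEF, plus the last two code points of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def is_control(cp: int) -> bool:
    # C0 controls, DEL, and C1 controls.
    return cp <= 0x1F or 0x7F <= cp <= 0x9F

def find_problematic(text: str, allow: str = "\t\n\r"):
    """Return (index, category) for each flagged code point.

    `allow` is a hypothetical policy knob: characters listed there
    are accepted even though they fall in a flagged range.
    """
    found = []
    for i, ch in enumerate(text):
        if ch in allow:
            continue
        cp = ord(ch)
        if is_surrogate(cp):
            found.append((i, "surrogate"))
        elif is_noncharacter(cp):
            found.append((i, "noncharacter"))
        elif is_control(cp):
            found.append((i, "control"))
    return found

print(find_problematic("a\x00b\ufdd0\n\ud800"))
```

A stricter or more lenient receiver would change `allow` or drop one of the categories, which is one way to picture the increasing levels of fussiness.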