Re: [Idna-update] Genart telechat review of draft-faltstrom-unicode11-08

Michel Suignard <michel@suignard.com> Wed, 20 March 2019 00:11 UTC

From: Michel Suignard <michel@suignard.com>
To: "Asmus Freytag (c)" <asmusf@ix.netcom.com>, =?utf-8?B?TWFydGluIEouIETDvHJzdA==?= <duerst@it.aoyama.ac.jp>, "idna-update@ietf.org" <idna-update@ietf.org>
Date: Wed, 20 Mar 2019 00:11:28 +0000
Message-ID: <BYAPR19MB2806D0E3053AB196D49F0AAEA2410@BYAPR19MB2806.namprd19.prod.outlook.com>
References: <155289429627.26188.2047331005281292889@ietfa.amsl.com> <458987D953A5B3227D3A791F@PSB> <EA2B2A09-152C-4AF3-B0C8-0D352CCA6647@netnod.se> <6b149a8d-9102-ea1a-5048-b83842fc66c0@ix.netcom.com> <3a7bd491-ab06-b8c7-9a8b-862c7a3cd122@it.aoyama.ac.jp> <c7885cd4-cfd5-1458-e58f-e5f83cde1cad@ix.netcom.com>
In-Reply-To: <c7885cd4-cfd5-1458-e58f-e5f83cde1cad@ix.netcom.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idna-update/Wc1djPKjEM_3Va3YEr4Hb8u5pLg>

To add some context here (I happen to be the project editor for ISO/IEC 10646, the ISO ‘equivalent’ of Unicode, and also the de facto liaison from Unicode to IETF):
ISO/IEC 10646 has long used a subset of Unicode properties by reference. That subset includes the General Category (Gc), which is used by IDNA2008, and the normalization forms (NFC, NFD, NFKC, NFKD). There is zero appetite (and frankly no resource) on the ISO side to change that situation.
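
For readers who want to look at these two properties directly, here is a minimal sketch using Python's standard unicodedata module. The helper name describe is my own, and the results reflect whichever Unicode version is bundled with the interpreter (reported by unicodedata.unidata_version), not any particular version under discussion in this thread.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the General Category and NFC stability of one character.

    Hypothetical helper for illustration; not part of any IDNA tooling.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",
        "gc": unicodedata.category(ch),
        # A character is NFC-stable if normalizing it leaves it unchanged.
        "nfc_stable": unicodedata.normalize("NFC", ch) == ch,
    }


if __name__ == "__main__":
    # The Unicode version this interpreter's data tables were built from.
    print("Unicode data version:", unicodedata.unidata_version)
    print(describe("a"))
    # ANGSTROM SIGN has a canonical singleton mapping, so NFC changes it.
    print(describe("\u212B"))
```

Both values feed into the IDNA2008 derived property, which is why a change to either one for an already-assigned code point can change its IDNA status.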

Concerning stability guarantees, I can assure everyone on this list that Unicode is far more concerned about them than ISO is. So far the only stability guarantees in 10646 cover the character names and the non-deletion of code points.

To the best of my knowledge there has been a single problematic change: U+19DA NEW TAI LUE THAM DIGIT ONE, which goes from PVALID to DISALLOWED. All the others go from DISALLOWED to PVALID. That is not very different from the common change from UNASSIGNED to a mix of PVALID/DISALLOWED for new repertoires, based on their Gc value and NFC behavior, which has happened for tens of thousands of code points since Unicode 5.2.
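
The distinction between these kinds of change can be sketched as a small classifier over (old, new) derived property values. This is only an illustration under my own assumptions: the function name classify and the category labels are invented here, and the only real data point is U+19DA as described above. The other two rows are placeholders, not actual code points.

```python
def classify(old: str, new: str) -> str:
    """Label a change in IDNA2008 derived property between two versions.

    The labels are illustrative, not terminology from RFC 5892.
    """
    if old == new:
        return "unchanged"
    if old == "UNASSIGNED":
        # Newly encoded characters routinely get a property on assignment.
        return "new assignment"
    if old == "PVALID" and new == "DISALLOWED":
        # Existing labels may stop being valid: the problematic direction.
        return "problematic"
    if old == "DISALLOWED" and new == "PVALID":
        # No existing label can be invalidated by this direction.
        return "relaxation"
    return "other"


# U+19DA is the case named in the message; the rest are placeholders.
changes = {
    "U+19DA": ("PVALID", "DISALLOWED"),
    "placeholder-1": ("DISALLOWED", "PVALID"),
    "placeholder-2": ("UNASSIGNED", "PVALID"),
}

for name, (old, new) in changes.items():
    print(f"{name}: {old} -> {new}: {classify(old, new)}")
```

Only the "problematic" direction can break labels that were valid under an earlier table, which is why it is the one case singled out here.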

Whether Category G should contain that single case (or any similar case) is open to debate, which we can entertain. I don’t think the other cases deserve it.

The point concerning modern scripts is well taken as well. Typically, Gc changes result from a reinterpretation of a writing system that was not well understood. This is unlikely to happen for a script in modern use. There is some tension on NFC stability (a recent case in Bengali), but most interested parties in Unicode and IETF know better than to give up on that kind of case.

There is one case where a somewhat ‘modern’ script is currently under revision: the Mongolian script. You would have to be a fool to create Mongolian IDNA labels at this point, even though there are many Mongolian PVALID characters and they appear in the IDNA tables published by IANA. Therefore, there is a non-zero chance that Mongolian code points could show up in Category G in the future. But Mongolian has so many confusability issues that it is unlikely it could ever be used for IDNA labels.

Finally, there is some hope that the Unicode interpretation of IDNA (UTS #46) may get better aligned with IDNA2008 and RFC 5892. I have been considering putting some resources into that.

Michel