Re: [Idna-update] Genart telechat review of draft-faltstrom-unicode11-08

Martin J. Dürst <duerst@it.aoyama.ac.jp> Wed, 20 March 2019 05:35 UTC

From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
To: Michel Suignard <michel@suignard.com>, "idna-update@ietf.org" <idna-update@ietf.org>
Date: Wed, 20 Mar 2019 05:35:28 +0000
Message-ID: <39d5059c-43f1-0bd8-ac40-52fc9863fbb6@it.aoyama.ac.jp>
References: <155289429627.26188.2047331005281292889@ietfa.amsl.com> <458987D953A5B3227D3A791F@PSB> <EA2B2A09-152C-4AF3-B0C8-0D352CCA6647@netnod.se> <6b149a8d-9102-ea1a-5048-b83842fc66c0@ix.netcom.com> <3a7bd491-ab06-b8c7-9a8b-862c7a3cd122@it.aoyama.ac.jp> <c7885cd4-cfd5-1458-e58f-e5f83cde1cad@ix.netcom.com> <BYAPR19MB2806D0E3053AB196D49F0AAEA2410@BYAPR19MB2806.namprd19.prod.outlook.com>
In-Reply-To: <BYAPR19MB2806D0E3053AB196D49F0AAEA2410@BYAPR19MB2806.namprd19.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idna-update/ZZhWeg_tmTaucO9lBcxrMY6atBI>
Subject: Re: [Idna-update] Genart telechat review of draft-faltstrom-unicode11-08

Hello Michel,

Glad to hear from you.

On 2019/03/20 09:11, Michel Suignard wrote:
> To add some context here (I happen to be the project editor for ISO/IEC 10646, the ISO ‘equivalent’ of Unicode, and also the de facto liaison from Unicode to IETF):
> ISO/IEC 10646 has long used a subset of Unicode properties by reference. That includes the General Category (Gc), which is used by IDNA2008, and the normalization forms (NFC, NFD, NFKC, NFKD). There is zero appetite (and frankly no resource) on the ISO side to change that situation.
> 
> Concerning stability guarantees, I can assure everyone on this list that Unicode is far more concerned about them than ISO is. So far the only stability guarantees in 10646 are for the character names and the non-deletion of code points.
> 
> To the best of my knowledge there has been a single problematic change: U+19DA NEW TAI LUE THAM DIGIT ONE, which goes from PVALID to DISALLOWED. All others go from DISALLOWED to PVALID (which is not that different from the common change from UNASSIGNED to a mix of PVALID/DISALLOWED for new repertoires, based on their Gc value and NFC behavior, which has happened for tens of thousands of code points since Unicode 5.2).
> 
> Whether Category G should contain that single case (or any similar case) is open to debate, which we can entertain.

Yes, we could do that. But the decision not to do it is over 8 years old 
now, and I haven't heard any actual complaint from an interested user 
about it. We are just discussing it here as part of the overall 
discussion of moving on with updating IDNA to newer Unicode versions.
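
For anyone who wants to check the U+19DA case locally, here is a minimal
sketch in Python using only the standard unicodedata module. It assumes
that the only rule of interest is the LetterDigits test from RFC 5892,
section 2.1, and it ignores every other rule (Exceptions, Unstable, and
so on), so it is an illustration and not a derived-property calculator.
The helper name in_letter_digits is made up for this example, and the
result depends on the Unicode version bundled with the Python in use.

```python
import unicodedata

# General_Category values that make up the LetterDigits set
# (RFC 5892, section 2.1).
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def in_letter_digits(ch: str) -> bool:
    """Return True if ch has a General_Category in the LetterDigits set.

    Hypothetical helper: a simplification that skips all other
    RFC 5892 rules, so True does not by itself mean PVALID.
    """
    return unicodedata.category(ch) in LETTER_DIGITS


# U+19DA is the code point discussed above; U+19D1 is the ordinary
# New Tai Lue digit one, shown for comparison.
for cp in (0x19DA, 0x19D1):
    ch = chr(cp)
    print(
        f"U+{cp:04X} {unicodedata.name(ch)}: "
        f"Gc={unicodedata.category(ch)}, "
        f"LetterDigits={in_letter_digits(ch)}, "
        f"Unicode={unicodedata.unidata_version}"
    )
```

With a Python built on Unicode 6.0 or later, U+19DA should report a Gc
outside the LetterDigits set, which is why it falls through to
DISALLOWED unless it is listed in Category G.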

> I don’t think the other cases deserve it.
> 
> The point concerning modern scripts is well taken as well. Typically, Gc changes result from a reinterpretation of a writing system that was not well understood. This is unlikely to happen for a script in modern use. There is some tension on NFC stability (a recent case in Bengali), but most interested parties in Unicode and IETF know better than to give up in such cases.
> 
> There is one case where a somewhat ‘modern’ script is currently under revision: the Mongolian script. You would have to be a fool to create Mongolian IDNA labels at this point, even though there are many Mongolian PVALID characters and they appear in the IDNA tables published by IANA. Therefore, there is a non-zero chance that Mongolian code points could show up in Category G in the future. But Mongolian has so many confusability issues that it is unlikely it could ever be used for IDNA labels.

One attempt I once made at understanding the Mongolian script was to 
describe it as "imagine writing English with IPA code points, but 
displaying it the way we see it every day, adding some variation 
selectors". I was quickly told by an expert that it's even more difficult.


> Finally, there is some hope that the Unicode interpretation of IDNA (UTS #46) may get better aligned with IDNA2008 and RFC 5892. I have been considering putting some resources into that.

That is very good to hear. Any details worth mentioning? Anything that 
would help?

Regards,   Martin.