Re: [I18ndir] Review of new characters for Unicode 12.0.0

Martin J. Dürst <duerst@it.aoyama.ac.jp> Mon, 18 March 2019 23:47 UTC

From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
To: Patrik Fältström <paf=40frobbit.se@dmarc.ietf.org>
CC: "i18ndir@ietf.org" <i18ndir@ietf.org>
Date: Mon, 18 Mar 2019 23:47:22 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/YJ3iTla91n3p5s224aVA0Bzgl8A>

Hello Patrik,

On 2019/03/18 21:29, Patrik Fältström wrote:
> [Now without my list of code points]

In the meantime, with a bit more patience, I have managed to extract 
the file and look at it in an editor.

> Martin, as I have implemented IDNA2008 from scratch, i.e. not using any 
> libraries at all,

I'm also not using any libraries at all, just relying on the Unicode 
version in Ruby to be the same as the target version. But I understand 
what you're saying.

> let me suggest we sync on what output format we use so 
> that we can do "diff" between your and my lists. My list for 12.0.0 is 
> btw attached.

I think it's a good idea to have this for some cross-checks. But one of 
the features of my program was that it listed only the new characters, 
because these were the ones I wanted to review. So I'll definitely keep 
this ability.
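
A minimal sketch of how such a cross-check and a "new characters only" 
listing could be done in Ruby, assuming both lists use the four-field, 
semicolon-separated format proposed in this thread. The names Entry, 
parse_list, added_entries and mismatches are hypothetical and not taken 
from either program.

```ruby
# Hypothetical record for one line of the proposed format:
# <Codepoint>;<Derived property value>;<Rule(s) that matched>;<Name>
Entry = Struct.new(:codepoint, :property, :rules, :name)

# Parse a list into a hash keyed by integer code point.
# Lines that do not start with a hex code point (blank lines or ":"
# elision markers) are skipped.
def parse_list(text)
  text.each_line.with_object({}) do |line, table|
    line = line.chomp
    next unless line.match?(/\A\h{4,6};/)
    cp, property, rules, name = line.split(';', 4)
    table[cp.hex] = Entry.new(cp.hex, property, rules, name)
  end
end

# Entries whose code point appears in new_table but not in old_table,
# i.e. the "new characters only" view.
def added_entries(old_table, new_table)
  (new_table.keys - old_table.keys).sort.map { |cp| new_table[cp] }
end

# Code points present in both lists whose derived property value differs.
def mismatches(mine, theirs)
  (mine.keys & theirs.keys).sort.select do |cp|
    mine[cp].property != theirs[cp].property
  end
end

# Usage with a few lines in the proposed format (sample data only).
older = parse_list("0060;DISALLOWED;;GRAVE ACCENT\n")
newer = parse_list("0060;DISALLOWED;;GRAVE ACCENT\n" \
                   "0061;PVALID;AE;LATIN SMALL LETTER A\n")
added_entries(older, newer).each do |e|
  puts format('%04X;%s;%s;%s', e.codepoint, e.property, e.rules, e.name)
end
```

Keying on the integer code point rather than on the raw line keeps the 
comparison independent of differences in hex case or in the name field.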

> May I suggest what my program is doing (of course), which is as follows:
> 
> <Codepoint>;<Derived property value>;<Rule(s) that matched>;<Name>

I'm fine with that. I'll have to do more work on supporting ranges in 
UnicodeData.txt and supporting the rules that I ignored in my first 
quick attempt. So please don't rely on my work for moving forward.
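
For the ranges, a minimal sketch of what the support could look like, 
assuming the UnicodeData.txt convention in which a range is given as a 
pair of lines whose name fields read "<..., First>" and "<..., Last>". 
The method name each_unicode_data_range is hypothetical, and the sample 
lines are only an illustration.

```ruby
# Yield (range, name_or_label, general_category) for every entry in
# UnicodeData.txt-style text, collapsing each "<..., First>" /
# "<..., Last>" pair of lines into one range of code points.
def each_unicode_data_range(text)
  return enum_for(:each_unicode_data_range, text) unless block_given?

  first = nil
  text.each_line do |line|
    fields = line.chomp.split(';', -1)
    next if fields.size < 3 # skip blank or malformed lines

    cp = fields[0].hex
    name = fields[1]
    category = fields[2]
    m = name.match(/\A<(.+), (First|Last)>\z/)

    if m.nil?
      # Ordinary single-code-point entry.
      yield(cp..cp, name, category)
    elsif m[2] == 'First'
      first = cp
    else
      raise ArgumentError, "Last without First at #{fields[0]}" unless first
      yield(first..cp, m[1], category)
      first = nil
    end
  end
end

# Usage with a small sample (not the full data file).
sample_data = <<~DATA
  0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
  AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
  D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;
DATA
each_unicode_data_range(sample_data) do |range, label, category|
  puts format('%04X..%04X %s (%s)', range.first, range.last, label, category)
end
```

Yielding ranges instead of individual code points avoids materializing 
every code point of a large block when only its boundaries are needed.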

Regards,   Martin.

> :
> :
> 0058;DISALLOWED;AB;LATIN CAPITAL LETTER X
> 0059;DISALLOWED;AB;LATIN CAPITAL LETTER Y
> 005A;DISALLOWED;AB;LATIN CAPITAL LETTER Z
> 005B;DISALLOWED;;LEFT SQUARE BRACKET
> 005C;DISALLOWED;;REVERSE SOLIDUS
> 005D;DISALLOWED;;RIGHT SQUARE BRACKET
> 005E;DISALLOWED;;CIRCUMFLEX ACCENT
> 005F;DISALLOWED;;LOW LINE
> 0060;DISALLOWED;;GRAVE ACCENT
> 0061;PVALID;AE;LATIN SMALL LETTER A
> 0062;PVALID;AE;LATIN SMALL LETTER B
> 0063;PVALID;AE;LATIN SMALL LETTER C
> 0064;PVALID;AE;LATIN SMALL LETTER D
> 0065;PVALID;AE;LATIN SMALL LETTER E
> 0066;PVALID;AE;LATIN SMALL LETTER F
> 0067;PVALID;AE;LATIN SMALL LETTER G
> 0068;PVALID;AE;LATIN SMALL LETTER H
> 0069;PVALID;AE;LATIN SMALL LETTER I
> 006A;PVALID;AE;LATIN SMALL LETTER J
> :
> :
> 
>     Patrik
> .
>