Re: [I18n-discuss] Is this an A-label or what?

Martin J. Dürst <duerst@it.aoyama.ac.jp> Thu, 14 February 2019 06:25 UTC

From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
To: John R Levine <johnl@taugh.com>, "i18n-discuss@iab.org" <i18n-discuss@iab.org>
Date: Thu, 14 Feb 2019 06:25:41 +0000
Message-ID: <ffabe498-44ad-500b-d684-fe7b84ef4af1@it.aoyama.ac.jp>
References: <alpine.OSX.2.21.1902131031480.12648@ary.qy>
In-Reply-To: <alpine.OSX.2.21.1902131031480.12648@ary.qy>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18n-discuss/9SSNGSjdGj_l1YLniaa7eTMngT0>
Subject: Re: [I18n-discuss] Is this an A-label or what?


On 2019/02/14 00:42, John R Levine wrote:
> This does what you'd expect, two A-labels turn into two U-labels:
> 
> >>> idn2.atou('xn--eckwd4c7c.xn--zckzah')
> 'ドメイン.テスト'
> 
> But someone added a space at the end.  GNU idn2 decodes it as punycode, 
> presumably using some implicit negative code value for the space:
> 
> >>> idn2.atou('xn--eckwd4c7c.xn--zckzah ')
> 'ドメイン.テニスト'

Just for those who don't read Japanese, and don't trust their eyes with 
details in foreign scripts: The two results are clearly different. The 
first reads "domain.test"; the second reads "domain.tenist" ("tenisuto" 
in a more literal transcription), which is not a word. The "ニ" is 
clearly superfluous.
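
For readers who would rather compare code points than glyphs, here is a
small Python sketch that lists the characters of the two second labels by
Unicode name. It is an illustration only; the variable and function names
are made up for this example.

```python
import unicodedata

expected = "テスト"    # second label decoded from the clean input
observed = "テニスト"  # second label decoded from the input with a trailing space

def describe(text):
    """Return one 'U+XXXX NAME' string per character of text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# Characters that appear in the observed label but not in the expected one.
extra = [ch for ch in observed if ch not in expected]

for line in describe(observed):
    print(line)
print("extra:", describe(extra))  # extra: ['U+30CB KATAKANA LETTER NI']
```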

> The existing python code barfs because it doesn't expect a space:
> 
> idna.decode('xn--eckwd4c7c.xn--zckzah ')
> IndexError: string index out of range

That's a good thing to do.

> The punycode algorithm will never create a label with a space at the 
> end. If you see one, do you decode it anyway?  Treat it as some random 
> ASCII label and pass it through?  Reject it because it looks sort of 
> like an A-label but it's not a hostname?

I could imagine a library in some more lenient language just ignoring 
the space. Using the space as part of the calculation seems really 
weird. Of course Python, being a rather strict language, barfs.
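
A minimal sketch of the "reject it" option in Python, assuming the
standard library's punycode codec rather than the idna package or the
idn2 bindings quoted above. The function names are hypothetical, and the
sketch skips all IDNA2008 validity checks on the decoded result. It only
refuses anything that is not a plain LDH label (letters, digits, hyphen)
before any punycode arithmetic is attempted:

```python
import re

# LDH label: ASCII letters, digits and hyphens, 1 to 63 characters,
# with no hyphen at the start or the end.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def decode_label_strict(label: str) -> str:
    """Decode one label, rejecting anything that is not an LDH label."""
    if not _LDH_LABEL.fullmatch(label):
        raise ValueError(f"not an LDH label: {label!r}")
    if label[:4].lower() != "xn--":
        return label  # ordinary ASCII label, passed through unchanged
    # Strip the ACE prefix and decode the rest with the stdlib codec.
    return label[4:].encode("ascii").decode("punycode")

def decode_domain_strict(domain: str) -> str:
    """Decode every dot-separated label of a domain name."""
    return ".".join(decode_label_strict(part) for part in domain.split("."))

print(decode_domain_strict("xn--eckwd4c7c.xn--zckzah"))  # ドメイン.テスト

try:
    decode_domain_strict("xn--eckwd4c7c.xn--zckzah ")
except ValueError as err:
    print("rejected:", err)
```

A lenient variant would call domain.strip() before splitting. Either way
the space never reaches the punycode calculation.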

Regards,   Martin.

> Regards,
> John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
> Please consider the environment before reading this e-mail. https://jl.ly