RE: Possible BofF question -- I18n (was: Re: Possible OBF question -- I18n)

Larry Masinter <masinter@adobe.com> Fri, 01 June 2018 04:51 UTC

From: Larry Masinter <masinter@adobe.com>
To: Nico Williams <nico@cryptonector.com>, Peter Saint-Andre <stpeter@mozilla.com>
CC: John C Klensin <john-ietf@jck.com>, John Levine <johnl@taugh.com>, "ietf@ietf.org" <ietf@ietf.org>, Patrik Fältström <paf@frobbit.se>
Subject: RE: Possible BofF question -- I18n (was: Re: Possible OBF question -- I18n)
Date: Fri, 01 Jun 2018 04:51:48 +0000
Message-ID: <DM5PR0201MB3461D96B0526D648593D1424C3620@DM5PR0201MB3461.namprd02.prod.outlook.com>
References: <20180530231127.17198276FEE3@ary.qy> <071E6235FE7B088A2B56A238@PSB> <0093E2CD-670E-47B6-A286-4FDEB140FAD9@frobbit.se> <20180531172228.GF14446@localhost> <383c2404-7beb-63e9-b2b2-e75fd1b174f1@mozilla.com> <20180601041949.GH14446@localhost>
In-Reply-To: <20180601041949.GH14446@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/eGwSbSKKP-GLMkYpG1kJmrBlFQc>

A modest proposal (I'm sure this is controversial so flame away...)

A big part of the i18n problem in IETF protocols has to do with extending protocol elements from ASCII to Unicode, and how to avoid difficulties when that happens.

Protocol elements include domain names, URLs, email addresses, and file names.

But where do these Unicode names come from? They're not arbitrarily generated by automated processes; they're constructed from strings that are selected, typed in, registered. So focus on encouraging people to choose strings that won't cause problems.

A large specification of all of the use cases to avoid is very difficult to write and hard to review. There are very many obscure special cases (final sigma, umlauts, private name characters, non-normalization of combined forms), with expertise widely distributed. I'm not sure the solution is "more specs".
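
To make two of those special cases concrete, here is a minimal Python sketch using only the standard unicodedata module. The helper name same_after_nfc and the variable names are hypothetical, chosen for illustration; nothing here is a proposed API.

```python
import unicodedata


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Umlaut: one precomposed code point vs. base letter plus combining mark.
composed = "\u00fc"     # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"  # "u" followed by U+0308 COMBINING DIAERESIS

raw_equal = composed == decomposed                # compares code points only
nfc_equal = same_after_nfc(composed, decomposed)  # compares normalized forms

# Final sigma: two distinct lowercase code points for one letter.
final_sigma = "\u03c2"   # U+03C2 GREEK SMALL LETTER FINAL SIGMA
medial_sigma = "\u03c3"  # U+03C3 GREEK SMALL LETTER SIGMA

sigma_raw_equal = final_sigma == medial_sigma
sigma_fold_equal = final_sigma.casefold() == medial_sigma.casefold()
```

The two umlaut strings look identical on screen but differ as code-point sequences until normalized, and the two sigmas only compare equal after case folding. Which of these comparisons a given protocol actually performs is exactly the kind of detail that is hard to specify.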

I wonder if there's any interest in building an open-source service that would, when given a proposed domain name or URL or email address, tell you what problems various subsets of users would have when trying to deploy that name (e.g., names that don't display properly on popular platforms, names that can't be reliably typed in correctly even if they can be viewed, those that are likely to get confused with other similar but different names).
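
As a very rough idea of what a first check in such a service might look like, here is a Python sketch. Everything in it is an assumption for illustration: the function name check_name, the problem labels, and especially the script test, which guesses a script from the first word of each letter's Unicode character name because the standard library does not expose the Unicode Script property. A real service would need far more (display and input-method data per platform, confusable tables, and so on).

```python
import unicodedata
from typing import List


def check_name(name: str) -> List[str]:
    """Return a list of hypothetical problem labels for a proposed name."""
    problems = []

    # Names that are not in NFC may compare unequal to what users type.
    if unicodedata.normalize("NFC", name) != name:
        problems.append("not-nfc")

    # Private-use (Co) and unassigned (Cn) code points have no
    # interoperable meaning or rendering.
    if any(unicodedata.category(ch) in ("Co", "Cn") for ch in name):
        problems.append("private-use-or-unassigned")

    # Crude heuristic (an assumption, not the Unicode Script property):
    # take the first word of each letter's character name, e.g. "LATIN".
    scripts = {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in name
        if unicodedata.category(ch).startswith("L")
    }
    if len(scripts) > 1:
        problems.append("mixed-scripts")

    return problems


if __name__ == "__main__":
    for candidate in ["example", "u\u0308ber", "p\u0430ypal", "a\ue000"]:
        print(repr(candidate), check_name(candidate))
```

The sample candidates are a plain ASCII label, a decomposed umlaut, a Latin string containing Cyrillic U+0430, and a string with a private-use code point. Returning labels rather than a pass/fail answer fits the idea of reporting which subsets of users would have which problems.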

Perhaps get started at a Hackathon? 

I did reserve the domain name "caniuse.name" that I will offer to any sincere effort.