Re: [I18nrp] Conservatism principle doesn't go far enough

"John Levine" <johnl@taugh.com> Fri, 01 February 2019 02:18 UTC

Date: Thu, 31 Jan 2019 21:18:01 -0500
Message-Id: <20190201021802.A5160200D93BBA@ary.qy>
From: John Levine <johnl@taugh.com>
To: i18nrp@ietf.org
Cc: john@jck.com
In-Reply-To: <A0F4590C3A54B244C34226F8@PSB>
Organization: Taughannock Networks
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/F3CGo9hZgMiPMKV8P-fhVxJ9-No>
Subject: Re: [I18nrp] Conservatism principle doesn't go far enough

In article <A0F4590C3A54B244C34226F8@PSB> you write:
>--On Thursday, January 31, 2019 16:17 -0800 Larry Masinter
><LMM@acm.org> wrote:
>
>> https://chromium.googlesource.com/chromium/src/+/master/docs/security/url_display_guidelines/url_display_guidelines.md
>
>> is interesting; one wouldn't want to register a domain which a
>> popular browser / OS won't display.

I am reasonably sure that any domain you could register today in an
ICANN contracted TLD would be OK with Chrome.  You can put non-OK
subdomains under an OK 2LD, but I can't see what registries or
registrars can do about that, particularly when they're not hosting
the domains' DNS.

>deliberately), but the specification is quite clear.  And yet,
>surveys in ICANN-land have shown SLD registrations for many
>such labels.

Funny you should mention that.  I went through and looked at all of
the IDNs in every contracted TLD zone file.  There are about 2 million,
half in .COM, half in other TLDs.  I checked for IDNA2003 and 2008
compliance, and I also parsed the per-TLD script rules at IANA and
checked whether the names match the rules.
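
A rough sketch of the first step of a survey like this, not the
scripts actually used: pull the A-labels (the "xn--" names) out of
zone file lines and decode them to Unicode.  It uses only Python's
standard punycode codec, so it decodes but does not validate; a real
IDNA2008 check needs a separate library.  The function name
idn_labels and the simplified zone line format are assumptions made
for illustration.

```python
import codecs

ACE_PREFIX = "xn--"

def idn_labels(zone_lines):
    """Yield (a_label, u_label) once for each owner name whose
    leftmost label is an A-label.  u_label is None when the
    Punycode part cannot be decoded."""
    seen = set()
    for line in zone_lines:
        fields = line.split()
        # Skip blank lines, comments, $DIRECTIVES and continuation lines.
        if not fields or line[0] in ";$" or line[0].isspace():
            continue
        # Owner name is the first field; keep only its leftmost label.
        label = fields[0].rstrip(".").split(".")[0].lower()
        if not label.startswith(ACE_PREFIX) or label in seen:
            continue
        seen.add(label)
        try:
            raw = label[len(ACE_PREFIX):].encode("ascii")
            u_label = codecs.decode(raw, "punycode")
        except ValueError:  # UnicodeError is a subclass of ValueError
            u_label = None
        yield label, u_label
```

Feeding it the NS lines of a zone gives one (A-label, U-label) pair
per delegated IDN, which is the list the later checks would run over.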

I was pleasantly surprised to find that over 99% of the names are valid
under IDNA2008 and match one of the script rules for the TLD.  Of the
maybe 5000 that aren't, the biggest single group are ancient names in
.COM and .NET registered a decade or longer ago before there were
script rules, and that contain stupid stuff like currency symbols and
other punctuation.  I did some spot checks and could not find any that
were in use other than maybe pointing to a parked or for-sale web
page, so really, who cares?
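
For illustration, labels with currency symbols and other punctuation
can be spotted with a crude filter on Unicode general category.  This
is only an approximation: IDNA2008 decides validity per code point
(the derived properties in RFC 5892), not by category alone.  The
function name symbol_chars is made up for this sketch.

```python
import unicodedata

def symbol_chars(u_label):
    """Return the characters in a decoded label whose general
    category is Symbol (S*) or Punctuation (P*).  The hyphen is
    skipped because it is legal inside a label."""
    return [
        ch for ch in u_label
        if ch != "-" and unicodedata.category(ch)[0] in "SP"
    ]
```

A label for which this returns a non-empty list is a candidate for
the "ancient and broken" pile, pending a proper IDNA2008 check.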

Some of the new TLDs are just lazy.  They have to tell ICANN what
scripts they'll allow, and send the script rules to IANA.  Most do,
it's not hard, but a few can't be bothered. I found some TLDs with
names in scripts not on their list (Chinese mostly), and some with
scripts on their lists but not at IANA.  Nearly all of the rogue
Chinese names would be fine if the TLDs published the same Chinese
script tables as other gTLDs, so they seem lazy, not malicious, at
least in their choice of names.  ICANN agrees these are compliance
issues and I'm arranging to give them a feed of violators.
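
A sketch of the "script not on the TLD's list" check, under loud
assumptions: Python's standard library has no Unicode Script
property, so this guesses the script from the first word of each
character's Unicode name ("LATIN", "CYRILLIC", "CJK", and so on).
Real script rules are the per-TLD tables published at IANA, which
list individual code points; the allowed set below is a hypothetical
stand-in for one of those tables.

```python
import unicodedata

def scripts_in(u_label):
    """Approximate the set of scripts used in a decoded label by
    taking the first word of each character's Unicode name."""
    scripts = set()
    for ch in u_label:
        # ASCII digits and hyphen are script-neutral in this sketch.
        if ch.isascii() and (ch.isdigit() or ch == "-"):
            continue
        name = unicodedata.name(ch, "")
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts

def off_list(u_label, allowed):
    """Return the scripts found in the label that are not in the
    TLD's allowed set (hypothetical stand-in for its IANA table)."""
    return scripts_in(u_label) - set(allowed)
```

For example, off_list("中文", {"LATIN", "CYRILLIC"}) reports {"CJK"},
the pattern described above for TLDs that never published a Chinese
table.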

The ccTLDs of countries big enough to matter seem OK.  I can't get
many zone files but of the few I can get, .US has no IDNs, .se and .nu
(managed by .se) are squeaky clean, and a few small African countries
that allow public AXFR have no IDNs in their tiny zones.

There are a few bad actors like .WS and .LA who have sold their TLDs
to speculators whose registration rules primarily involve checking
that the charge went through.  There are emoji 2LDs in .la and .ws,
but given how few names there are in those TLDs that anyone cares
about, overblocking would be appropriate.

I'll be giving a talk on Two Million IDNs and What I Found at ICANN
Tech Day in Kobe in March.

R's,
John
-- 
Regards,
John Levine, johnl@iecc.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly