Re: [I18ndir] Study Group on Use of Emoji as Second Level Domain
John C Klensin <john-ietf@jck.com> Fri, 08 March 2019 17:08 UTC
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/rZi4-suV0LjAVVgVGyHgr6ltK_o>
Marc,

I was trying to be brief in the hope of having time to look
carefully at Harald's note today but, to clarify...

--On Friday, March 8, 2019 09:59 -0500 Marc Blanchet
<marc.blanchet@viagenie.ca> wrote:

>> (3) I've looked briefly at UTR#46 for Unicode 12
>> (https://www.unicode.org/reports/tr46/ and
>> https://unicode.org/Public/idna/12.0.0/IdnaMappingTable.txt
>> and it still allows emoji (or at least a considerable number
>> of them -- I haven't spotted any exceptions).  Because
>> (except for earlier emoticons) they are prohibited by IDNA2003
>
> not sure that IDNA2003 prohibited them specifically because at
> that time, we were using Unicode 3.2 as the base and emojis
> did not exist. And all « Unassigned » code points were
> giving some direction (store/query) but was underspecified
> essentially.

One can certainly call it "underspecified", but it seems to me
that the following are/were perfectly clear in IDNA2003, at least
for strings accepted by registries for delegation and storage in
the DNS, which is, AFAIK, the only issue before ICANN.  By
definition, such a string is a "stored string" (see Section 4(1)
of RFC 3490 and Section 7.2 of RFC 3454 (Stringprep)).  The latter
includes the statement "(stored strings MUST NOT contain
unassigned code points,...", which, to me, is about as close to
"prohibited" as the IETF gets.  Obviously that might be different
if StringPrep (and possibly NamePrep) had been updated after
Unicode 3.2, but they weren't, so any claim for validity under
IDNA2003 takes us into the very strange territory of local
interpretations and extrapolations of what the IDNA2003 tables
might have been with speculative updates.
That, in turn, leads us down the path to many alternative alleged
IDNA standards (not just IDNA2003, UTR#46, and IDNA2008, but many,
many versions and interpretations of what an IDNA2003bis might
look like) -- a subject about which Patrik, SSAC, and others have
written at length (independent of my concerns and, if you like,
ranting on the subject).

>> as well as
>> IDNA2008, there should be no further pretense that it is about
>> "transition".  Instead, it is a third, orthogonal, standard.

> The Unicode IDNA mapping table such as
> https://unicode.org/Public/idna/12.0.0/IdnaMappingTable.txt
> but all versions down to 6.X (see
> https://unicode.org/Public/idna/6.0.0/IdnaMappingTable.txt)
> actually make emojis « valid » for the purpose of TR46. So
> this is not so much news.

Yes.  I didn't mean to imply that it was new.  I have been hoping
that, with each recent version of UTR#46 and its tables, and with
additional evidence about the unsuitability of emoji for
identifiers (including, fwiw, their exclusion by UAX#31 [1]), the
decision to treat emoji as valid in UTR#46 would be reversed or at
least supplemented there by a strong cautionary note.  While I
gather that the stability rules would prevent an actual change
from "valid" to "disallowed" without some very fancy footwork,
nothing I know of would prevent such a warning.  Those changes
haven't happened.  I suppose I might have inserted "still" in my
comment but, the more UTR#46 is pushed as an alternate or
supplemental standard relative to IDNA2008, the more I think we
(and ICANN, etc.) need to be aware of the divergence.

best,
   john

[1] Yes, I'm aware of the claim that different (and less
restrictive) rules are appropriate for IDNs relative to UAX#31
"identifiers" because the latter are about programming language
identifiers, but I just don't buy it.
It would probably be reasonable if one believes that DNS names are
not identifiers (or intended for use as identifiers) but are,
instead, value or vanity tokens to be bought and sold without
concern about identifier usability.  If one considers DNS names as
identifiers for use by end users, many of whom may have little
knowledge of what is going on, then there is a strong case that
rules for what is allowed in those identifiers should be more
restrictive, not less, than rules for identifiers to be used by
programmer-specialists in programming languages.
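
Two of the points above can be spot-checked with a minimal Python sketch, offered as an illustration and not as anything from the original thread. It assumes that the standard library's stringprep module is an adequate stand-in for RFC 3454 Table A.1 (code points unassigned in Unicode 3.2), and that Python's str.isidentifier(), whose rules are derived from UAX#31 with small additions, is a reasonable proxy for UAX#31 identifier syntax. The helper names are hypothetical. It checks only the unassigned-code-point rule, not full Nameprep processing, and says nothing about UTR#46 or IDNA2008.

```python
import stringprep


def has_unassigned_3_2(label: str) -> bool:
    """True if any character of label is in RFC 3454 Table A.1,
    i.e. was unassigned in Unicode 3.2.  Under Stringprep, a
    "stored string" must not contain such code points."""
    return any(stringprep.in_table_a1(ch) for ch in label)


def is_uax31_like_identifier(label: str) -> bool:
    """Approximate a UAX#31 identifier test using Python's own
    identifier syntax (an assumption; see the note above)."""
    return label.isidentifier()


# Sample labels chosen for illustration only.
samples = {
    "ASCII label": "example",
    "U+1F600 GRINNING FACE (added after Unicode 3.2)": "\U0001F600",
    "U+263A WHITE SMILING FACE (an earlier symbol)": "\u263A",
}

for description, label in samples.items():
    print(description)
    print("  unassigned in Unicode 3.2:", has_unassigned_3_2(label))
    print("  identifier-like per Python:", is_uax31_like_identifier(label))
```

The U+263A sample corresponds to the "except for earlier emoticons" caveat in the quoted text: it was already assigned in Unicode 3.2, so the unassigned-code-point rule alone does not catch it, while the identifier test still rejects it.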