Re: [I18ndir] Study Group on Use of Emoji as Second Level Domain

John C Klensin <john-ietf@jck.com> Fri, 08 March 2019 14:45 UTC

Date: Fri, 08 Mar 2019 09:45:04 -0500
From: John C Klensin <john-ietf@jck.com>
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
cc: IETF I18N Directorate <i18ndir@ietf.org>
Message-ID: <8893E807E58D89AEDAB37E6B@PSB>
In-Reply-To: <1d07e7ef-7c2f-e98a-4ff8-a1de5a8102dc@it.aoyama.ac.jp>
References: <1d07e7ef-7c2f-e98a-4ff8-a1de5a8102dc@it.aoyama.ac.jp>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/LmBWvPgmhN7GPOd4ER-jKPYDlqM>

A few small observations to supplement Patrik's two messages,
partially because the first point below interacts with what
draft-faltstrom-unicode11 should say...

(1) Unless ICANN decides to depart from IDNA2008, allowing emoji
in domain names would require a substantive change to RFC 5892
and probably 5891.  Perhaps one should not extrapolate, but,
given our inability to engage on even relatively minor
clarifications, I'm a little skeptical about the IETF being able
to do that.  From a technical standpoint, such a change would
either require allowing all So code points or would require
continuing to disallow that general category but then allowing
emoji by exception.  Such an exception, whether by block or by
enumerating code points, would almost certainly lead to a
requirement for normative updates to 5892 with every version of
Unicode and more pressure to do them quickly, not just
statements that nothing of significance has changed.
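
To make the general-category point concrete, here is a minimal
sketch, assuming Python's bundled unicodedata tables (whose Unicode
version depends on the interpreter).  The LETTER_DIGITS set and the
helper names are illustrative; the set is a simplified stand-in for
one rule in RFC 5892, not the full derived-property calculation.

```python
import unicodedata

# General categories named by the LetterDigits rule of RFC 5892.
# Assumption: this is a simplification; the real derivation applies
# further rules (exceptions, stability checks, contextual rules).
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def general_category(ch: str) -> str:
    """Return the Unicode General_Category of a single code point."""
    return unicodedata.category(ch)


def in_letter_digits(ch: str) -> bool:
    """True if the code point falls in one of the letter/digit categories."""
    return general_category(ch) in LETTER_DIGITS


if __name__ == "__main__":
    # Two letters, then HEAVY BLACK HEART and GRINNING FACE.
    for ch in ["a", "\u00fc", "\u2764", "\U0001F600"]:
        print(f"U+{ord(ch):04X} {general_category(ch)} "
              f"letter_digits={in_letter_digits(ch)}")
```

The two emoji report category So, which is outside the set, so a
rule of this shape excludes them unless So is admitted wholesale or
they are listed as exceptions.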

(2) In addition to the issues mentioned by Patrik about
representations of individual characters, emoji combining
sequences are defined by Unicode, but are defined without
normalization or rendering rules that would facilitate
unambiguous comparison, either of strings by computers or by
people looking at displays.   To the extent to which we consider
non-decomposing code points a problem to be concerned about,
even if all we do is to put those code points on a "troublesome"
list, emoji strings would pose a far more challenging set of
issues.  The combining problem could be eliminated by
prohibiting emoji-containing labels of more than one code point,
but not only would that contradict the long-standing principle
of avoiding single-character labels (because of the original
reason for that principle), but, despite what I gather is
current practice, I can't imagine those who are anxious to sell
domain names with emoji in them would allow such a restriction
for very long.
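
The comparison problem can be sketched as follows, assuming
Python's unicodedata module; the helper name equal_after is
illustrative.  HEAVY BLACK HEART with and without VARIATION
SELECTOR-16 may look alike on many displays, yet no normalization
form maps one to the other, unlike a precomposed letter and its
decomposed equivalent.

```python
import unicodedata

HEART_TEXT = "\u2764"          # HEAVY BLACK HEART
HEART_EMOJI = "\u2764\ufe0f"   # same, followed by VARIATION SELECTOR-16

E_ACUTE = "\u00e9"             # precomposed e with acute
E_PLUS_ACUTE = "e\u0301"       # e followed by COMBINING ACUTE ACCENT


def equal_after(form: str, a: str, b: str) -> bool:
    """Compare two strings after applying one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form,
              "hearts equal:", equal_after(form, HEART_TEXT, HEART_EMOJI),
              "e-acute equal:", equal_after(form, E_ACUTE, E_PLUS_ACUTE))
```

Normalization reconciles the two spellings of e-acute but leaves the
two heart sequences distinct, so a registry or resolver comparing
code points would treat them as different labels.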

(3) I've looked briefly at UTR#46 for Unicode 12
(https://www.unicode.org/reports/tr46/ and
https://unicode.org/Public/idna/12.0.0/IdnaMappingTable.txt) and
it still allows emoji (or at least a considerable number of them
-- I haven't spotted any exceptions).  Because (except for
earlier emoticons) they are prohibited by IDNA2003 as well as
IDNA2008, there should be no further pretense that it is about
"transition".  Instead, it is a third, orthogonal, standard.  It
also causes a nasty problem (and ambiguity in that spec) for ZWJ
and ZWNJ, the former of which has a special interpretation for
emoji that I'll leave as an exercise.
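
For readers who want a starting point for that exercise, a small
sketch, assuming Python's unicodedata module; the describe helper
and the choice of sequence are illustrative.  In IDNA2008, ZWJ and
ZWNJ are permitted only under the contextual rules of RFC 5892,
whereas in emoji ZWJ joins separate pictographs into what is
typically rendered as a single glyph.

```python
import unicodedata

ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# WOMAN + ZWJ + PERSONAL COMPUTER, commonly rendered as one
# "woman technologist" glyph where the sequence is supported.
WOMAN_TECHNOLOGIST = "\U0001F469" + ZWJ + "\U0001F4BB"


def describe(s: str) -> list[str]:
    """List each code point of s with its Unicode character name."""
    return [f"U+{ord(c):04X} {unicodedata.name(c, '<unnamed>')}" for c in s]


if __name__ == "__main__":
    for line in describe(WOMAN_TECHNOLOGIST):
        print(line)
    # Dropping the invisible joiner yields a different string.
    print(WOMAN_TECHNOLOGIST.replace(ZWJ, "") == WOMAN_TECHNOLOGIST)
    print(unicodedata.category(ZWJ), unicodedata.category(ZWNJ))
```

The joiner is an invisible format character (category Cf), so two
labels that differ only by its presence are distinct strings whose
visual difference depends entirely on font and platform support.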

Madness.  I hope those who are at the ICANN meeting,
particularly Harald (as IETF liaison) and Patrik will be
explaining that to relevant parties.

    john


--On Friday, March 8, 2019 10:15 +0000 "Martin J. Dürst"
<duerst@it.aoyama.ac.jp> wrote:

> Dear I18N Directorate members,
> 
> I don't want to interrupt Harald's important work on our
> review, and  would like the topic of this mail to be treated
> as a separate topic,  with less priority.
> 
> Because ICANN meets next week (starting already tomorrow my
> time, i.e.  already this Friday for some of you) in Kobe,
> Japan, I at one point was  considering attending. I had to
> abandon that plan because of other commitments, but as a
> result of looking at the schedule carefully, I  found the
> following:
> 
> https://64.schedule.icann.org/meetings/962146
> ccNSO: Study Group on Use of Emoji as Second Level Domain
>...