Re: [Idna-update] emoji and security

Andrew Sullivan <ajs@anvilwalrusden.com> Tue, 13 March 2018 20:25 UTC

Date: Tue, 13 Mar 2018 16:25:05 -0400
From: Andrew Sullivan <ajs@anvilwalrusden.com>
To: idna-update@ietf.org
Message-ID: <20180313202505.ztersmy2z5xuxlvp@mx4.yitter.info>
References: <533bb471-da9b-64d0-76aa-a8a1251d256b@ix.netcom.com> <DM5PR1901MB219712F39A6297F9A147312DA2D30@DM5PR1901MB2197.namprd19.prod.outlook.com>
In-Reply-To: <DM5PR1901MB219712F39A6297F9A147312DA2D30@DM5PR1901MB2197.namprd19.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idna-update/FMKBFw81OIGxTNITG8FhvTetwQg>
Subject: Re: [Idna-update] emoji and security

On Mon, Mar 12, 2018 at 06:21:25PM +0000, Michel Suignard wrote:

> I could not resist commenting on this, because unlike emoji, Egyptian
> Hieroglyphs, while looking often like colorful pictures, are the expression of
> a fully formed writing system with ideograms and phonograms and all the usual
> attributes of such a system in terms of sentence structure.

Yes.  Which is why they have a general category that …

> Jokes aside, the 100s of Egyptian hieroglyphs contain hundreds of semantic and
> phonetic variants and are totally unsuited for any identifier usage and are
> still fully IDN 2008 PVALID (unless of course you limit the scope using variant
> sets).

… causes them to be PVALID.  Which means they're _not_ totally
unsuited to any identifier usage, but certainly they are unsuited to
any general-purpose identifier usage and almost certainly they will
remain in at best extremely limited use effectively forever.  Emojis,
as you know perfectly well, do not have the same general category and
therefore are DISALLOWED.
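
To make the general-category point concrete, here is a rough sketch in
Python.  It checks only whether a code point's General_Category falls
in the LetterDigits set from RFC 5892, Section 2.1.  It is not the full
derivation: the exceptions, the stability tests, and the contextual
rules are all ignored, and the helper name in_letter_digits is made up
for illustration.  The two sample code points (U+13000 and U+1F600) are
my own picks.

```python
import unicodedata

# General categories in the RFC 5892 "LetterDigits" set (Section 2.1).
# Assumption: this is only one rule of the derivation; the real algorithm
# applies exceptions, stability checks, and contextual rules as well.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def in_letter_digits(ch: str) -> bool:
    """Return True if ch's General_Category is in the LetterDigits set."""
    return unicodedata.category(ch) in LETTER_DIGITS


# U+13000 is an Egyptian hieroglyph; U+1F600 is an emoji.
for ch in ("\U00013000", "\U0001F600"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), in_letter_digits(ch))
```

The hieroglyph comes out as Lo (Letter, other) and passes the category
test; the emoji comes out as So (Symbol, other) and does not.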

If the general-category approach to IDNA2008 was unsuitable in 2008
because it yielded examples of things that should never in principle
be identifiers, I wish someone had made that argument.  I do
not recall it having been made, and I believe I participated pretty
avidly.  I do recall some people arguing that various archaic things
were "not needed" or "almost always unusable by anyone", but that
would have put us in the situation of going through everything one
code point at a time.  There were some who wanted to try to re-do
IDNA2003 only updated for then-current versions of Unicode, but I
think experience shows that wouldn't have worked out too well either.

Best regards,

A

-- 
Andrew Sullivan
ajs@anvilwalrusden.com