Re: [Lucid] [mark@macchiato.com: Re: Non-normalizable diacritics - new property]

Shawn Steele <Shawn.Steele@microsoft.com> Wed, 11 March 2015 19:43 UTC

From: Shawn Steele <Shawn.Steele@microsoft.com>
To: Ted Hardie <ted.ietf@gmail.com>, "Asmus Freytag (t)" <asmus-inc@ix.netcom.com>
Date: Wed, 11 Mar 2015 19:43:12 +0000
Message-ID: <CY1PR0301MB07310C68F6CFDD46AE22086F82190@CY1PR0301MB0731.namprd03.prod.outlook.com>
References: <20150311013300.GC12479@dyn.com> <CA+9kkMDZW9yPtDxtLTfY1=VS6itvHtXHF1qdZKtXdwwORwqnew@mail.gmail.com> <55008F97.8040701@ix.netcom.com> <CA+9kkMAcgSA1Ch0B9W1Np0LMn2udegZ=AzU1b26dAi+SDcbGgg@mail.gmail.com>
In-Reply-To: <CA+9kkMAcgSA1Ch0B9W1Np0LMn2udegZ=AzU1b26dAi+SDcbGgg@mail.gmail.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/lucid/6U-lTwuLOg-WaWCmu0x4i1Np0PA>
Cc: "lucid@ietf.org" <lucid@ietf.org>, Dave Thaler <dthaler@microsoft.com>, Andrew Sullivan <ajs@anvilwalrusden.com>
Subject: Re: [Lucid] [mark@macchiato.com: Re: Non-normalizable diacritics - new property]

Re: audio vs visual rendering…  I pronounce my droid “L3-G0” as “El Three Gee Oh”, which leads to problems because the last character is a zero, not the letter O.

This document draws a hard line between “homoglyphs” and “visually similar”.  Specifically, it says (paraphrasing) to just use a font that makes the characters look different.  I’d assert that there isn’t a hard line between these concepts, and that “just pick the right font” glosses over the difficulties in the problem.

The ʻokina is a decent example: it looks a lot like another character, and fonts often even use the same glyph for both, but sometimes font designers choose to make a distinction.  It’s nearly impossible to tell a developer to “use the right font”.
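
To make the ʻokina case concrete, here is a minimal sketch using Python's standard unicodedata module. The choice of look-alikes to compare (the left single quotation mark and the ASCII apostrophe) and the helper names are my own assumptions for illustration, not something taken from the document under discussion. It only shows that the three marks are separate code points and that NFKC normalization does not fold them together, so any distinction the user sees is left to the font.

```python
import unicodedata

# Look-alike marks chosen for illustration (an assumption, not a list
# from the document being discussed).
CANDIDATES = {
    "okina": "\u02bb",       # MODIFIER LETTER TURNED COMMA
    "left_quote": "\u2018",  # LEFT SINGLE QUOTATION MARK
    "apostrophe": "\u0027",  # APOSTROPHE
}


def describe(ch):
    """Return (code point label, Unicode character name) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch))


def distinct_after_nfkc(chars):
    """True if no two of the given characters share the same NFKC form."""
    forms = [unicodedata.normalize("NFKC", c) for c in chars]
    return len(set(forms)) == len(forms)


if __name__ == "__main__":
    for label, ch in CANDIDATES.items():
        print(label, *describe(ch))
    # Normalization keeps all three apart; only rendering can (or cannot)
    # tell them apart for a human reader.
    print("distinct after NFKC:", distinct_after_nfkc(CANDIDATES.values()))
```

Normalization is doing its job here, since these are different characters. That is the point: whether a person can tell them apart is decided by the font, which the protocol does not control.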

Additionally, the document continues to treat these newly noticed characters as a special case without considering the many existing problems.

I’m also confused that the document stresses the need for unique identifiers at the beginning, but then looks only at the existing IDNA problem.  I don’t consider IDNA able to provide “secure” (meaning unconfusable) identifiers.

I’m definitely in the 4.1 camp, “Just another species of Confusables”.  However, I disagree with the final paragraph.

IMO, the cost of any mitigation (and the implied cost of having a fire drill the next time one of these happens) outweighs any benefit we might get from reducing the confusable space.  We’re spending hundreds of hours and thousands of dollars on this topic, but I’d be really surprised if an attacker even bothered with this code point.  If they did, it’d be 0.1% of the attacks, with 90% of the attacks using far less esoteric methods.

IMO, the only reason to solve the problem with this character would be as part of a general solution to confusability attacks in IDNA.  Yet this solution addresses only a few characters out of what are probably billions of ways to make confusable attacks.

-Shawn

http://L3-G0.com