Re: [idn] Mac OS X Safari and IDN spoofing

Erik van der Poel <erik@vanderpoel.org> Wed, 23 March 2005 19:41 UTC

Message-ID: <4241C4AF.1020707@vanderpoel.org>
Date: Wed, 23 Mar 2005 11:34:07 -0800
From: Erik van der Poel <erik@vanderpoel.org>
User-Agent: Mozilla Thunderbird 1.0 (X11/20041206)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: James Seng <james@seng.cc>, Gervase Markham <gerv@mozilla.org>
CC: idn@ops.ietf.org
Subject: Re: [idn] Mac OS X Safari and IDN spoofing
References: <p06210212be663c22ba1c@[10.20.30.249]> <4240917F.30801@mozilla.org> <9271f2a6d20072ae7e9f1cf9e74cce45@seng.cc>
In-Reply-To: <9271f2a6d20072ae7e9f1cf9e74cce45@seng.cc>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Sender: owner-idn@ops.ietf.org
Precedence: bulk

Opera addressed the IDN spoofing issue with a number of changes:

In 8.0 beta 2, they introduced a whitelist of TLDs that they consider 
safe because they appear to have good policies in place: no, jp, de, se, 
kr, tw, cn, at, dk, ch, and li. TLDs not on this list have their domain 
labels checked for characters outside Latin-1 (ISO 8859-1, i.e. Unicode 
code points up to U+00FF). If a label contains characters outside 
Latin-1, it is displayed in Punycode.
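
To make the beta 2 rule concrete, here is a minimal sketch in Python of 
how I read it. This is not Opera's code: the function name display_label 
and the set name TRUSTED_TLDS are my own, and the sketch skips Nameprep 
entirely.

```python
# Sketch of the 8.0 beta 2 display rule as described above.
# Names are hypothetical; this is not Opera's implementation.
TRUSTED_TLDS = {"no", "jp", "de", "se", "kr", "tw", "cn", "at", "dk", "ch", "li"}


def display_label(label: str, tld: str) -> str:
    """Return the form of a domain label to show to the user."""
    if tld.lower() in TRUSTED_TLDS:
        # Whitelisted TLD: the registry's policy is trusted, show as-is.
        return label
    if all(ord(ch) <= 0xFF for ch in label):
        # Every character is within Latin-1, show as-is.
        return label
    # Otherwise fall back to the ASCII form. Python's "punycode" codec
    # produces the bare encoding, so the ACE prefix is added here.
    return "xn--" + label.encode("punycode").decode("ascii")
```

With this sketch, a Cyrillic label under com comes back in its xn-- form, 
while the same label under a whitelisted TLD is returned unchanged.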

In 8.0 beta 3, they added hu and museum to the TLD whitelist, and they 
allowed the user to switch to a blacklist using the tilde (~), e.g. 
~:com:tw:. The character checking now allows a single script or specific 
script combinations in each domain label or sublabel, separated by dot 
(.) and hyphen (-). This allows e.g. xml-ccccccc, where xml is ASCII and 
ccccccc stands in for the Russian word for "documents" in Cyrillic (I 
think).
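
The per-sublabel check could be sketched roughly as follows. Python's 
standard library does not expose the Unicode Script property, so this 
uses the first word of the character name as a crude stand-in, which is 
my assumption and not how Opera does it. The function names are 
hypothetical, and the "specific script combinations" that Opera permits 
are not modeled.

```python
import re
import unicodedata


def rough_script(ch: str):
    """Crude script guess: the first word of the Unicode character name."""
    if ch.isascii() and ch.isdigit():
        return None  # ASCII digits are treated as script-neutral here
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def sublabels_single_script(name: str) -> bool:
    """True if every part between dots and hyphens uses one script only."""
    for part in re.split(r"[.-]", name):
        scripts = {rough_script(ch) for ch in part} - {None}
        if len(scripts) > 1:
            return False
    return True
```

Under this heuristic an ASCII sublabel followed by a hyphen and a 
Cyrillic sublabel passes, while a single sublabel that mixes Latin and 
Cyrillic letters fails.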

I have added links to Opera's 8.0 beta 2 and 3 release notes and IDN 
Security Advisory to my Related Work section:

http://nameprep.org/#related-work

Another idea that I mentioned a while ago in a couple of forums is to 
check for characters used in the user's languages, which can be found in 
the browser localization and HTTP Accept-Language list. The result could 
be displayed in many ways, e.g. Punycode for labels with characters 
outside the user's languages. Another idea is to use pale 
green for characters in the user's main language, pale yellow for those 
in the user's secondary languages, and pale red for characters outside 
those languages. These colors are based on traffic lights.
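
A rough sketch of the traffic-light idea, to show what I mean. The 
LANGUAGE_SCRIPTS table is a tiny hypothetical placeholder (a real one 
would have to come from locale data), the function names are made up, 
and the character-name heuristic for scripts is only an approximation. 
The first language in the Accept-Language list is taken as the main one 
and q-values are ignored.

```python
import unicodedata

# Hypothetical, deliberately tiny table mapping language to script.
LANGUAGE_SCRIPTS = {"en": "LATIN", "nl": "LATIN", "ru": "CYRILLIC", "el": "GREEK"}


def parse_accept_language(header: str):
    """Return primary language subtags in listed order, ignoring q-values."""
    langs = []
    for item in header.split(","):
        tag = item.split(";")[0].strip().lower()
        if tag:
            langs.append(tag.split("-")[0])
    return langs


def color_label(label: str, accept_language: str):
    """Pair each character with 'green', 'yellow' or 'red'."""
    langs = parse_accept_language(accept_language)
    main = LANGUAGE_SCRIPTS.get(langs[0]) if langs else None
    secondary = {LANGUAGE_SCRIPTS.get(lang) for lang in langs[1:]} - {None}
    result = []
    for ch in label:
        script = unicodedata.name(ch, "UNKNOWN").split()[0]
        if script == main:
            color = "green"    # main language
        elif script in secondary:
            color = "yellow"   # secondary languages
        else:
            color = "red"      # outside the user's languages
        result.append((ch, color))
    return result
```

As written, digits and the hyphen come out red because they belong to no 
script in the table; a real implementation would need to treat them as 
neutral.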

James Seng wrote:

> now, do we want to standard "this" or do we want apps people to continue 
> to evolve the mechanism to deal with spoofing? i prefer the latter.

I agree that the IETF should not standardize these types of UI policies, 
though it might be a good idea to have some recommendations in an 
informative appendix or something.

However, the IETF may wish to consider standardizing a limited set of 
characters in IDN. For example, we may wish to extend RFC 952's host 
name rules (LDH = Letters, Digits and Hyphen) to a Unicode equivalent, 
thereby disallowing such characters as the slash homographs (e.g. the 
math symbol for division, U+2215 DIVISION SLASH).
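
One possible reading of a "Unicode LDH" rule, sketched in Python using 
general categories. The choice of categories is my assumption, not a 
proposal text: letters are any L* category, digits are Nd, and the only 
hyphen is U+002D. Combining marks, which many scripts need, are left out 
to keep the sketch short.

```python
import unicodedata


def is_unicode_ldh(label: str) -> bool:
    """True if the label holds only letters, decimal digits and '-'."""
    if not label:
        return False
    for ch in label:
        if ch == "-":
            continue
        category = unicodedata.category(ch)
        if category.startswith("L") or category == "Nd":
            continue
        # Symbols such as U+2215 DIVISION SLASH (category Sm) end up here.
        return False
    return True
```

This rejects the slash lookalikes because they are math symbols, but it 
does nothing about letters that resemble other letters.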

As I wrote this email, Mark Davis sent a very relevant email to the 
Unicode list:

http://www.unicode.org/mail-arch/

Click the first link, user unicode-ml, password unicode.

Erik