Re: [idn] nameprep2 and the slash homograph issue

Erik van der Poel <erik@vanderpoel.org> Thu, 24 February 2005 15:21 UTC

Received: from psg.com (mailnull@psg.com [147.28.0.62]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA01630 for <idn-archive@lists.ietf.org>; Thu, 24 Feb 2005 10:21:55 -0500 (EST)
Received: from majordom by psg.com with local (Exim 4.44 (FreeBSD)) id 1D4Kbn-000NuQ-9d for idn-data@psg.com; Thu, 24 Feb 2005 15:08:55 +0000
Received: from [207.115.63.102] (helo=pimout3-ext.prodigy.net) by psg.com with esmtp (Exim 4.44 (FreeBSD)) id 1D4Kbk-000Nty-Rg for idn@ops.ietf.org; Thu, 24 Feb 2005 15:08:53 +0000
Received: from [10.1.1.2] (adsl-64-174-147-206.dsl.sntc01.pacbell.net [64.174.147.206]) by pimout3-ext.prodigy.net (8.12.10 milter /8.12.10) with ESMTP id j1OF8lpY115802; Thu, 24 Feb 2005 10:08:48 -0500
Message-ID: <421DEDFF.2000300@vanderpoel.org>
Date: Thu, 24 Feb 2005 07:08:47 -0800
From: Erik van der Poel <erik@vanderpoel.org>
User-Agent: Mozilla Thunderbird 1.0 (X11/20041206)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: IETF idn working group <idn@ops.ietf.org>
Subject: Re: [idn] nameprep2 and the slash homograph issue
References: <421B8484.3070802@vanderpoel.org> <20050223072837.GA21463~@nicemice.net> <D872CCF059514053ECF8A198@scan.jck.com> <20050223105244.GE21463~@nicemice.net> <421CA114.9090302@vanderpoel.org> <20050224081721.GB12336~@nicemice.net>
In-Reply-To: <20050224081721.GB12336~@nicemice.net>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on psg.com
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.0.1
Sender: owner-idn@ops.ietf.org
Precedence: bulk

Adam M. Costello wrote:
> Erik van der Poel <erik@vanderpoel.org> wrote:
>>The IETF generally only specifies the "wire" protocol, not UI
>>behavior.  The IETF does not specify how apps interface with users;
> 
> Generally, that's true, but IDNA is an exception.  It states four
> requirements (RFC 3490 section 3.1), and one of those four has rather
> little to do with wire protocols, and quite a lot to do with UI
> behavior:
> 
>    3) ACE labels obtained from domain name slots SHOULD be hidden from
>       users when it is known that the environment can handle the non-ACE
>       form, except when the ACE form is explicitly requested.  When it
>       is not known whether or not the environment can handle the non-ACE
>       form, the application MAY use the non-ACE form (which might fail,
>       such as by not being displayed properly), or it MAY use the ACE
>       form (which will look unintelligible to the user).
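
For readers unfamiliar with the two forms named in that requirement, the
following is a minimal sketch of converting between the non-ACE and ACE
forms. It assumes Python's built-in "idna" codec, which implements RFC 3490;
the function names to_ace and to_unicode and the sample name are
illustrative and do not come from the thread.

```python
# Sketch only: relies on Python's built-in "idna" codec (RFC 3490 / IDNA 2003).
# to_ace and to_unicode are hypothetical helper names chosen for illustration.

def to_ace(name: str) -> str:
    """Return the ACE (xn--...) form of a domain name."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Return the non-ACE (Unicode) form of an ACE domain name."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    # "b\u00fccher" is "bucher" with u-umlaut, written as an escape
    # so that this listing stays 7-bit clean.
    print(to_ace("b\u00fccher.example"))        # xn--bcher-kva.example
    print(to_unicode("xn--bcher-kva.example"))
```

An application following requirement 3 would normally show the second form
and fall back to the first only when it cannot render the characters or
when the user asks for it.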

I don't think IDNA is an exception. Note that the part you quote above 
uses words like SHOULD and MAY. I would say that those words were chosen 
for *exactly* the reasons I mentioned (i.e. IETF *specifies* wire 
protocols, not UI behavior). See section 6 of:

http://ietf.org/rfc/rfc2119.txt

This RFC focusses on "interoperability" (but also mentions "harm") so I 
would say that the wire protocol is the main concern.

> I think this discussion is headed toward an update to IDNA that would
> add a second exception to that requirement, for protecting the user
> against phishing.  What we need to figure out is how to describe that
> exception, and how specific or deliberately vague that description
> should be.

Here I agree with you. I'm not going to try to come up with the wording 
for that, but this morning I started to think that the right-to-left DNS 
and IDN spoofing problems *could* be addressed at the UI level by 
providing a *tool* that security-conscious users could *choose* to use.

I'm thinking of a tool that might be implemented as an extension for 
Mozilla, for example. It would offer to display domain names in the safe 
order, i.e. left-to-right for users whose main language is 
left-to-right. I have not heard of any UIs that offer top-to-bottom in 
their menus, dialogs, etc, so I would guess that this would be omitted 
in the extension too, though right-to-left might be offered for 
right-to-left users (many of whom are in the Middle East -- Hebrew and 
Arabic).
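
One possible reading of "safe order" is that the most significant label
(the TLD) comes first in the user's reading direction. Under that
assumption, which the message does not spell out, a rough sketch for a
left-to-right user could look like this. The function name
hierarchical_order is hypothetical.

```python
# Sketch only: assumes "safe order" means most-significant label first.
# hierarchical_order is a hypothetical name, not part of any existing tool.

def hierarchical_order(name: str) -> str:
    """Reorder a dotted domain name so the TLD is the leftmost label."""
    labels = name.rstrip(".").split(".")  # drop a trailing root dot, if any
    return ".".join(reversed(labels))


if __name__ == "__main__":
    print(hierarchical_order("www.example.com"))  # com.example.www
```

This only reorders labels for display; the name sent on the wire would be
unchanged.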

In addition, such a tool would offer to display domain names in a clear 
font, unlike the sans-serif that is commonly used today. This would make 
the distinction between lowercase l and digit 1 clearer. And it would 
separate the domain name from its context, e.g. using color.

Finally, this tool would offer to display characters outside the user's 
language(s) in a special way, to make them stand out and catch the 
user's attention. I believe we need to focus on the user here, because 
we are talking about how things *look* to the user.
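
As a rough illustration of "display in a special way", the sketch below
replaces any character outside the user's chosen scripts with a visible
[U+XXXX] marker. It is an assumption-laden approximation: Python's standard
library does not expose the Unicode Script property, so the first word of
the character name (e.g. LATIN, CYRILLIC, CJK) stands in as a script hint.
The names script_hint and flag_foreign are hypothetical.

```python
import unicodedata

# Sketch only: the first word of the Unicode character name is used as a
# crude stand-in for the Script property, which the stdlib does not provide.
# script_hint and flag_foreign are hypothetical names.

def script_hint(ch: str) -> str:
    """Return a rough script label such as 'LATIN' or 'CYRILLIC'."""
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:  # character has no name in the database
        return "UNKNOWN"


def flag_foreign(label: str, user_scripts: set) -> str:
    """Mark every character whose script hint is not in user_scripts."""
    out = []
    for ch in label:
        if ch.isascii() and (ch.isalnum() or ch in "-."):
            out.append(ch)  # plain LDH characters and dots pass through
        elif script_hint(ch) in user_scripts:
            out.append(ch)  # character belongs to one of the user's scripts
        else:
            out.append(f"[U+{ord(ch):04X}]")  # make the outsider stand out
    return "".join(out)


if __name__ == "__main__":
    # The second letter is CYRILLIC SMALL LETTER A (U+0430), not Latin "a".
    print(flag_foreign("p\u0430ypal.example", {"LATIN"}))  # p[U+0430]ypal.example
```

A real extension would more likely use color or a different font than
bracketed code points, and would take the set of scripts from the user's
language preferences.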

For example, people in the Far East are used to spotting small 
differences between complicated characters because they were taught as 
children to read and write the thousands of complex "Han" characters 
that differ in small ways.

Americans, on the other hand, are only used to seeing a small number of 
characters, and would not even be able to *read* Han characters, let 
alone spot differences between them. This is why I believe that a tool 
that focusses on the user might be a good idea.

You may claim that nobody would ever want to read domain names 
left-to-right, to which I would counter that some people are willing to 
try Dvorak keyboards, which are totally different from QWERTY. I.e. it's 
the user's choice. Internet security education may eventually lead 
*some* users to make this choice.

Erik