Re: [idn] Re: character tables

Erik van der Poel <erik@vanderpoel.org> Wed, 02 March 2005 01:52 UTC

Message-ID: <42251B80.5050503@vanderpoel.org>
Date: Tue, 01 Mar 2005 17:48:48 -0800
From: Erik van der Poel <erik@vanderpoel.org>
User-Agent: Mozilla Thunderbird 1.0 (X11/20041206)
MIME-Version: 1.0
To: Paul Hoffman <phoffman@imc.org>
CC: Gervase Markham <gerv@mozilla.org>, John C Klensin <klensin@jck.com>, idn@ops.ietf.org
Subject: Re: [idn] Re: character tables
References: <421B8484.3070802@vanderpoel.org> <20050223072837.GA21463~@nicemice.net> <D872CCF059514053ECF8A198@scan.jck.com> <421D8411.9030006@vanderpoel.org> <p06210208be4390618c81@[192.168.0.101]> <421E0D0C.2000309@vanderpoel.org> <p06210202be43c3888991@[192.168.0.101]> <E07CE813AD23B2D95DA0C740@scan.jck.com> <421E30F2.1040408@vanderpoel.org> <0E7F74C71945B923C52211F3@scan.jck.com> <421EA0C9.1010500@vanderpoel.org> <00a401c51af3$7863aae0$030aa8c0@DEWELL> <A574CA1BE87BFDA3C2A1AC0E@scan.jck.com> <421FA55B.9000308@vanderpoel.org> <421FCBD7.8000805@vanderpoel.org> <42227EBF.9040703@vanderpoel.org> <45781B7428C6AA07C3B283BD@scan.jck.com> <42229BBC.8020608@vanderpoel.org> <p0621021ebe484f52c0c5@[10.20.30.249]> <4225ABAB.60002@mozilla.org> <p0621022dbe4ab4b8a3fa@[10.20.30.249]>
In-Reply-To: <p0621022dbe4ab4b8a3fa@[10.20.30.249]>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Sender: owner-idn@ops.ietf.org
Precedence: bulk

Paul Hoffman wrote:
> At 12:03 PM +0000 3/2/05, Gervase Markham wrote:
> 
>> Could you tell us more about the problems you found with the ideas of 
>> bundling and blocking?
> 
> It was impossible to come up with a bundling scheme that kept everyone 
> happy. The needs of the Chinese language communities for bundling were 
> different than the needs of the Scandinavian language communities, which 
> in turn were different than the needs of the Indic language communities, 
> which were different than the needs of the Arabic language communities, 
> and so on. Then toss in the communities that truly want multiple scripts 
> but want to avoid homograph attacks (yes, we really did think about that 
> years ago...), and your brain starts dripping from your ears.

Yes, as a long-time internationalization engineer, I can imagine that it 
was difficult to come up with a single set of guidelines for all of the 
world's registries. (In addition to language differences, some comments 
on this list have led me to believe that there are also protocol 
differences between the registries, e.g. VeriSign's multiple versions of 
RRP vs. the EPP that Edmon Chung seems to have been working on vs. fax 
and sneakernet vs. any others?)

However, I note that this particular conversation is between a browser 
developer (Gervase) and one of the IDNA authors (Paul), neither of whom 
is a registry representative, so why exactly are you two having this 
conversation? :-)

Sorry, I'm half joking. Half, because you two have every right to 
discuss whatever you wish. The other half because I believe browser 
developers can afford to focus more on their end of things. Allow me to 
insert an excerpt from a previous email I wrote up:

-----------------

It is pretty clear that none of the organizations can completely solve
the problem on its own. Unicode can warn about these issues, but that is
all they can do. They cannot remove characters. The IETF is currently
discussing the prohibition of certain characters or character types.
Even if the IETF publishes updated versions of the specs, there will
still be the problem of certain characters being unfamiliar to many
users (simply because they do not know all the legitimate characters in
the world), thereby leaving them exposed to the phishers. The registries
can enforce rules at their level, but nobody has yet shown that they can
truly enforce any rules at other levels. So, the browser developers must
address that problem.

There are several issues here. One is that domain names are typically
displayed inside something else, e.g. a URI. This, in itself, gives the
phishers something to work with. So the browser developers must think
about other ways to display domain names. This is not very easy. People
exchange URIs via email and other means all the time. Apps turn those
URIs into clickable links, as a service to users. If not, users can copy
and paste the URI into the browser's URI field. Both of these methods
could be improved to highlight the domain name in the interests of
security.
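
To make the highlighting idea concrete, here is a minimal sketch in
Python. It is only an illustration, not something any browser is known
to do: the function name highlight_host and the "<<" / ">>" markers are
assumptions invented for this example, and a real user interface would
more likely use color or font weight than text markers.

```python
from urllib.parse import urlsplit


def highlight_host(uri, open_mark="<<", close_mark=">>"):
    """Return the URI with only its host name wrapped in markers.

    Hypothetical helper: userinfo and port are left outside the markers,
    so "trusted-looking" text before an "@" does not get emphasized.
    """
    netloc = urlsplit(uri).netloc
    if not netloc:
        return uri  # e.g. mailto: URIs have no host component

    # Separate optional userinfo ("user:pass@") from "host[:port]".
    userinfo, at, hostport = netloc.rpartition("@")
    if hostport.startswith("["):
        # Bracketed IPv6 literal: the host ends at the closing bracket.
        end = hostport.find("]") + 1
        host, rest = hostport[:end], hostport[end:]
    else:
        host, colon, port = hostport.partition(":")
        rest = colon + port
    if not host:
        return uri

    # The authority starts right after the first "//" in the URI.
    start = uri.find("//") + 2
    if uri[start:start + len(netloc)] != netloc:
        return uri  # unexpected layout; leave the URI untouched
    marked = userinfo + at + open_mark + host + close_mark + rest
    return uri[:start] + marked + uri[start + len(netloc):]


print(highlight_host("http://www.bank.example@evil.example:8080/x"))
```

In the last line, the familiar-looking "www.bank.example" is only
userinfo; the markers end up around "evil.example", which is the name
that actually gets resolved.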

Another problem is that humans are only familiar with a small set of
characters. Some humans know *many* characters (e.g. East Asians),
but most know far fewer. Now, within the set of characters
that each user is familiar with, there are no homograph problems (or
just a few). However, as soon as you stray outside any single user's
familiar set, there are many homographs, near-homographs and unfamiliar
symbols. When a typical computer user is faced with something
unfamiliar, they are quite likely to shrug it off and assume it's just
one of those "computer" things that they cannot understand. This is
something that IDN phishers could take advantage of, if the browsers do
not take steps to highlight the unfamiliar characters (via HTTP
Accept-Language and browser localization as I suggested). Of course,
highlighting is not sufficient. Education is also very important.

So, instead of wasting time talking about a non-solution (white/black
lists), it would be nice to see these parties spending their valuable
time on real solutions. The registries could be working on the
guidelines, to address the concerns about language tagging, variants and
so on. They could also get in touch with the IETF, to let them know
which Unicode characters and character types they wish to use, so that
the IETF can consider how to publish new specs that might prohibit other
characters. Browser developers could start working on ways to display
domain names in ways that give the phishers less to work with.

---------------------

In other words, I do not think browser developers need to be overly 
concerned with the particular bundling/blocking schemes that the 
registries might be using. Instead, I wish the browser developers would 
focus more on the *user*, who may be "surfing" from one site to the 
next, spanning the globe, and crossing language boundaries. In order to 
protect such a user, the browser should focus on the core set of 
characters that s/he is familiar with, and provide some sort of 
indication when unfamiliar characters appear, so that the 
security-conscious, educated user may know when to be careful. I.e. the 
language of the *user* is important, not the language of the domain name.
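
As a rough illustration of what such an indication could be based on, 
here is a minimal Python sketch. Everything in it is an assumption made 
for the example: the FAMILIAR_SCRIPTS table is a tiny hand-written 
mapping rather than real data, the first word of a character's Unicode 
name is used as a crude stand-in for its script (Python's standard 
library does not expose the Script property), Latin is treated as 
familiar to everyone, and the domain name is assumed to be already 
decoded from its ACE form.

```python
import unicodedata

# Hypothetical mapping from primary language subtag to scripts the user
# is assumed to read. A real table would need to be far more complete.
FAMILIAR_SCRIPTS = {
    "en": {"LATIN"},
    "nl": {"LATIN"},
    "ru": {"CYRILLIC"},
    "el": {"GREEK"},
}

# Characters that appear in ordinary ASCII domain names and are never flagged.
NEUTRAL = set("0123456789-.")


def primary_languages(accept_language):
    """Extract primary language subtags from an Accept-Language value."""
    langs = []
    for item in accept_language.split(","):
        tag = item.split(";", 1)[0].strip().lower()  # drop any ";q=..." part
        if tag and tag != "*":
            langs.append(tag.split("-", 1)[0])
    return langs


def rough_script(ch):
    """Crude script guess: the first word of the Unicode character name."""
    name = unicodedata.name(ch, "")
    return name.split(" ", 1)[0] if name else "UNKNOWN"


def unfamiliar_chars(domain, accept_language):
    """List (index, character, script) for characters outside the user's scripts."""
    familiar = {"LATIN"}  # assumption: ASCII domain names are familiar to all
    for lang in primary_languages(accept_language):
        familiar |= FAMILIAR_SCRIPTS.get(lang, set())
    flagged = []
    for i, ch in enumerate(domain):
        if ch in NEUTRAL:
            continue
        script = rough_script(ch)
        if script not in familiar:
            flagged.append((i, ch, script))
    return flagged


# U+0430 is CYRILLIC SMALL LETTER A, which looks like a Latin "a".
print(unfamiliar_chars("p\u0430ypal.com", "en-US,en;q=0.8"))
print(unfamiliar_chars("p\u0430ypal.com", "ru,en;q=0.5"))
```

Note that the second call flags nothing, because a Russian-reading user 
is assumed to know both scripts. That is the limit of this approach: it 
only tells the user that something unfamiliar is present, and does not 
detect homographs within the user's own familiar set.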

I am *not* saying that this would be easy to implement. I am not at all 
surprised that Mozilla and Opera have chosen an easy stopgap, hopefully 
only for the interim. It's great to see Mozilla and Opera lead the way 
as they have been!

Erik
