Re: [idn] process

Erik van der Poel <erik@vanderpoel.org> Fri, 25 February 2005 22:32 UTC

Received: from psg.com (mailnull@psg.com [147.28.0.62]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA09539 for <idn-archive@lists.ietf.org>; Fri, 25 Feb 2005 17:32:02 -0500 (EST)
Received: from majordom by psg.com with local (Exim 4.44 (FreeBSD)) id 1D4nrv-000GHC-PB for idn-data@psg.com; Fri, 25 Feb 2005 22:23:31 +0000
Received: from [207.115.63.98] (helo=pimout4-ext.prodigy.net) by psg.com with esmtp (Exim 4.44 (FreeBSD)) id 1D4nru-000GGy-52 for idn@ops.ietf.org; Fri, 25 Feb 2005 22:23:30 +0000
Received: from [10.1.1.2] (adsl-64-174-147-206.dsl.sntc01.pacbell.net [64.174.147.206]) by pimout4-ext.prodigy.net (8.12.10 milter /8.12.10) with ESMTP id j1PMNOHb145020; Fri, 25 Feb 2005 17:23:25 -0500
Message-ID: <421FA55B.9000308@vanderpoel.org>
Date: Fri, 25 Feb 2005 14:23:23 -0800
From: Erik van der Poel <erik@vanderpoel.org>
User-Agent: Mozilla Thunderbird 1.0 (X11/20041206)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: John C Klensin <klensin@jck.com>
CC: Doug Ewell <dewell@adelphia.net>, idn@ops.ietf.org
Subject: Re: [idn] process
References: <421B8484.3070802@vanderpoel.org> <20050223072837.GA21463~@nicemice.net> <D872CCF059514053ECF8A198@scan.jck.com> <421D8411.9030006@vanderpoel.org> <p06210208be4390618c81@[192.168.0.101]> <421E0D0C.2000309@vanderpoel.org> <p06210202be43c3888991@[192.168.0.101]> <E07CE813AD23B2D95DA0C740@scan.jck.com> <421E30F2.1040408@vanderpoel.org> <0E7F74C71945B923C52211F3@scan.jck.com> <421EA0C9.1010500@vanderpoel.org> <00a401c51af3$7863aae0$030aa8c0@DEWELL> <A574CA1BE87BFDA3C2A1AC0E@scan.jck.com>
In-Reply-To: <A574CA1BE87BFDA3C2A1AC0E@scan.jck.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on psg.com
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.0.1
Sender: owner-idn@ops.ietf.org
Precedence: bulk

John,

Thank you for taking the time to write such a well-thought-out response. 
I agree with some of the points you make, but I'm going to present 
arguments against the others. I'm currently leaning towards *not* 
changing IDNA (other than to fix mistakes and clarify some sections).

John C Klensin wrote:
> 
> (1) To the extent possible, we should accommodate all Unicode
> characters, excluding as little as possible.  This position was
> reinforced by the view that, at the time, the Unicode
> classifications of characters were considered a little soft and
> a general conviction that the IETF should not be making
> character-by-character decisions.   A counter-principle, now if
> not then, is that we should permit a relatively narrow extension
> of the "letter-digit-hyphen" rule, i.e., permitting, only
> letters (in any alphabet or script), perhaps local digits, and
> the hyphen, but no other punctuation, symbols, drawing
> characters, or other non-letter characters.  Adam has argued for
> that revised principle recently; several people argued for it
> when IDNA was being produced.  We could probably still impose
> it, and, in any event, it would not require a change in the
> basic architecture (see below).

I believe it would be difficult to reach consensus on a relatively
narrow extension of the LDH rule. Just for starters, the hyphen that
separates names and other strings in the Western world is not used for
Katakana in Japan; Katakana uses a middle dot (U+30FB) to separate two
Katakana strings. In fact, this character is allowed in .jp.
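
To make the Katakana point concrete, here is a minimal Python sketch of
what a narrow "letters, digits and hyphen" check might look like. The
function name and the rule itself are assumptions for illustration only
(IDNA specifies nothing of the kind); the check relies on Unicode
general categories from the standard unicodedata module.

```python
import unicodedata


def is_narrow_ldh(label: str) -> bool:
    """Hypothetical narrow LDH extension: letters and decimal digits of
    any script, plus the ASCII hyphen. Not part of IDNA."""
    for ch in label:
        if ch == "-":
            continue
        cat = unicodedata.category(ch)
        # Categories starting with "L" are letters; "Nd" is a decimal digit.
        if not (cat.startswith("L") or cat == "Nd"):
            return False
    return True


# Two Katakana words joined by U+30FB KATAKANA MIDDLE DOT.
katakana = "\u30c6\u30b9\u30c8\u30fb\u30b5\u30a4\u30c8"

print(is_narrow_ldh("example-1"))  # plain ASCII LDH label
print(is_narrow_ldh(katakana))     # U+30FB is punctuation, so this fails
```

Under this assumed rule the ASCII label passes and the Katakana label
is rejected solely because of its separator. Combining marks (general
category M) would be rejected as well, which hints at how quickly such
a rule would need further extensions.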

If we do *not* allow these special local characters that function in the
same way as the hyphen does in the West, then people in other parts of
the world would not only claim that our spec is unfair, they might even
ignore it. If we *do* allow this Japanese example, then we have started
down a slippery slope that ends with a rather large extension of the LDH
rule (for the rest of the world), and the phishing problem would not be
alleviated as much as we might have hoped when we started with just LDH.
That would be a lot of work for little gain.

So it's a lose-lose situation. Instead, we should probably stick to 
IDNA's original principle of allowing a lot of Unicode, and have the 
local registries, zone administrators and apps address the phishing problem.

> (2) When code points had been identified by UTC as the same as,
> or equivalent to, others, we tended to map them together, rather
> than picking one and prohibiting the others.   This has caused
> more problems than most of us expected, with people being
> surprised when they register or query using one character and
> the result that comes back uses another.  It also creates a
> near-homograph problem that we haven't "discovered" in the last
> couple of weeks: If we have character X mapping to character Y,
> but X looks vaguely like Z, then there may be no Y-Z homograph,
> but there may be an X-Z one.  That could make display decisions,
> etc., quite critical and, unless applications got it entirely
> right, we might end up with a new family of attacks.  Again,
> that decision could be reviewed.  Perhaps there are groups of
> characters that should be prohibited from being included in a
> lookup or registration operation, not just mapped to something
> more reasonable.  And, again, this would be a tuning of tables,
> not a change in the basis architecture.

It may be possible to "tune" the tables, but nowhere in your email do I 
find any reference to the ACE prefix. I think that we should also figure 
out exactly which types of changes would absolutely require a new ACE 
prefix, and then explore in detail what all the affected parties would 
have to do to add a new prefix to the mix or to transition to it. The 
parties I'm thinking of are app developers and registries, mostly, but 
content developers might also be affected.
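
For readers less familiar with the ACE prefix, here is a minimal Python
sketch. It assumes Python's built-in "idna" codec, which implements the
IDNA ToASCII operation of RFC 3490; the sample label is arbitrary. It
shows that the prefix is a fixed string placed in front of the Punycode
output, which is why every app and registry that recognizes "xn--"
today would have to learn about any new prefix.

```python
ACE_PREFIX = b"xn--"  # the ACE prefix defined for IDNA in RFC 3490

# "bucher" with u-umlaut (U+00FC); an arbitrary sample label.
label = "b\u00fccher"

ace = label.encode("idna")       # ToASCII via the built-in codec
puny = label.encode("punycode")  # bare Punycode, without any prefix

print(ace)   # b'xn--bcher-kva'
print(puny)  # b'bcher-kva'
print(ace == ACE_PREFIX + puny)
```

A label encoded under a different prefix would look like an ordinary
ASCII label to software that only knows "xn--", so it would not be
decoded for display.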

> The assumption I referred to above was that ICANN would take a
> strong role in determining which characters were really
> appropriate for registration and under what circumstances, that
> they would institute and enforce appropriate rules, and that
> everyone relevant would pay attention to whatever they said.
> Every element of that assumption has turned out to be false:
> they haven't taken that role; their guidelines are weak,
> ambiguous, and at least partially wrong; and some registries
> have just ignored the rules that do exist without any penalty.
> If there is a problem, either we are going to need to solve it,
> or we are going to risk different solutions in different
> applications that, taken together, compromise interoperability.

I'm currently thinking that we (the IETF) can't really solve these
problems, and that the registries and apps are going to have to address
them. But I strongly sympathize with your stated concern about differing
solutions leading to interoperability problems, and so I think "we" (not
the IETF) must come up with much better registry guidelines, and even
recommendations and proposals for the apps. Such documents would not
necessarily be IETF documents, though they could be if they are merely
informational (not standards track). Other organizations like ICANN
could then take some of that material and fold it into their own
documents, but they would probably make some of it normative (i.e.,
MUST-level). There isn't really a single organization for the apps (the
W3C doesn't cover all of them), so an IETF informational RFC might be
good for them.

> If
> we can get past "right to register", we need to look at the
> experience of the browser implementers who have already
> concluded that, registered or not, they really don't want to
> recognize or process domain names containing such characters.

Some of these implementors might decide to disable IDNA labels under
some circumstances, but the existence of a number of IDN plug-ins for
MSIE, the extensibility of Mozilla, and the need for IDNs around the
world suggest that their decisions may be circumvented. Eventually,
these implementors may decide to improve their own IDN support. I
realize that these short-term decisions may be bad for IDN, but I am
hopeful for the future.

Erik