Re: [idn] space-like unicode char

Erik van der Poel <erik@vanderpoel.org> Fri, 08 April 2005 19:44 UTC

Message-ID: <4256DD38.3070708@vanderpoel.org>
Date: Fri, 08 Apr 2005 12:36:24 -0700
From: Erik van der Poel <erik@vanderpoel.org>
User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317)
MIME-Version: 1.0
To: Soobok Lee <lsb@lsb.org>
CC: idn@ops.ietf.org
Subject: Re: [idn] space-like unicode char
References: <42181FD5.3070608@lsb.org> <4255E488.8010302@vanderpoel.org> <42562D22.3090609@lsb.org>
In-Reply-To: <42562D22.3090609@lsb.org>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Sender: owner-idn@ops.ietf.org
Precedence: bulk

Soobok Lee wrote:
> The U+1160 problem was raised 3.5 years ago (you can search this
> huge idn-list archive for the keywords 1160 or filler),
> along with some additional hangul jamo problems. I submitted a draft
> (you may find it at www.i-d-n.net)
> to filter out these invalid char sequences, but the draft was
> discarded. Someone argued that such filtering *complicates*
> stringprep algorithms with context-sensitive filtering/prohibiting and
> that the problem is up to UTC/NFC, not the IETF. Of course, I couldn't accept that.

The i-d-n.net name no longer takes you to a real site, but I believe I 
found your draft here:

http://www.watersprings.org/pub/id/draft-ietf-idn-hangeulchar-00.txt

I agree that handling the U+1160 issues would complicate a spec, and I 
can see why the IETF decided not to include them in the RFCs. But now 
that we have seen a number of implementations display this character 
as something that looks like blank space, which is potentially 
dangerous, we should reconsider the specs.

Unicode may not be able to address these issues in the normalization 
spec, since the Unicode Consortium has promised not to make any 
incompatible changes to it. Unicode might be able to address the issues 
in other normative or informative parts of its book or documents, and 
the IETF might then just want to refer to those parts of Unicode.
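
To illustrate why normalization alone does not help here, the following 
is a minimal sketch, assuming Python's standard unicodedata module. The 
helper name survives_normalization is mine, purely for illustration. It 
checks whether U+1160 is still present after normalization:

```python
import unicodedata

FILLER = "\u1160"  # HANGUL JUNGSEONG FILLER


def survives_normalization(ch, form="NFKC"):
    """Return True if `ch` is still present after normalizing it alone.

    Hypothetical helper, for illustration only.
    """
    return ch in unicodedata.normalize(form, ch)


if __name__ == "__main__":
    print(unicodedata.name(FILLER))
    for form in ("NFC", "NFKC"):
        # U+1160 has no decomposition mapping, so both forms keep it.
        print(form, survives_normalization(FILLER, form))
```

If this behaves as I expect, the filler passes through both NFC and NFKC 
unchanged, so any remedy would have to live outside normalization.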

Alternatively, the IETF can write up its own specifications or 
recommendations. It's not immediately clear to me whether U+1160 ought 
to be addressed in Stringprep or Nameprep. As we have seen, Stringprep 
is used in various protocols, including SASLprep, which is for user 
names and passwords. Some perverse people might suggest that passwords 
ought to allow strange character sequences like multiple consecutive 
U+1160s in order to make it harder to guess the password. I'm new to 
Stringprep, so I don't know how most IETFers feel about this type of thing.
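
For what it is worth, a simple context-free prohibition (as opposed to 
the context-sensitive filtering that was objected to) could look roughly 
like the sketch below. This is Python, and the names EXTRA_PROHIBITED, 
find_prohibited and check_label are hypothetical. The set of code points 
is my own assumption and is not taken from any Stringprep or Nameprep 
table:

```python
# Assumed extra "prohibited" set for illustration; NOT from any RFC table.
EXTRA_PROHIBITED = frozenset({0x1160})  # HANGUL JUNGSEONG FILLER


def find_prohibited(label, prohibited=EXTRA_PROHIBITED):
    """Return the indexes of characters in `label` that are prohibited."""
    return [i for i, ch in enumerate(label) if ord(ch) in prohibited]


def check_label(label, prohibited=EXTRA_PROHIBITED):
    """Raise ValueError if `label` contains any prohibited code point."""
    hits = find_prohibited(label, prohibited)
    if hits:
        raise ValueError(
            "prohibited code point(s) at index(es) %s" % hits
        )
    return label


if __name__ == "__main__":
    print(find_prohibited("ab\u1160c"))
    print(find_prohibited("abc"))
```

A profile like Nameprep could apply such a check while a password 
profile could pass an empty set, which is one way to leave the choice to 
each Stringprep profile.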

In the meantime, I have added U+1160 and the combining mark issue to my 
list and I have filed a bug report for Mozilla:

http://nameprep.org/#display
https://bugzilla.mozilla.org/show_bug.cgi?id=289588

Erik
