Re: FWD: I-D ACTION:draft-klensin-unicode-escapes-00.txt

"Clive D.W. Feather" <clive@demon.net> Mon, 22 January 2007 08:48 UTC

Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com) by megatron.ietf.org with esmtp (Exim 4.43) id 1H8uqt-0000U1-IT; Mon, 22 Jan 2007 03:48:31 -0500
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org) by megatron.ietf.org with esmtp (Exim 4.43) id 1H8uqr-0000Tu-Gi for discuss@apps.ietf.org; Mon, 22 Jan 2007 03:48:29 -0500
Received: from anchor-internal-1.mail.demon.net ([195.173.56.100]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1H8uqq-0005rp-3v for discuss@apps.ietf.org; Mon, 22 Jan 2007 03:48:29 -0500
Received: from finch-staff-1.server.demon.net (finch-staff-1.server.demon.net [193.195.224.1]) by anchor-internal-1.mail.demon.net with ESMTP id l0M8mRL1003927; Mon, 22 Jan 2007 08:48:27 GMT
Received: from clive by finch-staff-1.server.demon.net with local (Exim 3.36 #1) id 1H8uqp-000IuS-00; Mon, 22 Jan 2007 08:48:27 +0000
Date: Mon, 22 Jan 2007 08:48:27 +0000
From: "Clive D.W. Feather" <clive@demon.net>
To: John C Klensin <klensin@jck.com>
Subject: Re: FWD: I-D ACTION:draft-klensin-unicode-escapes-00.txt
Message-ID: <20070122084827.GI60599@finch-staff-1.thus.net>
References: <891E235E7A867F0DB506C90A@p3.JCK.COM>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <891E235E7A867F0DB506C90A@p3.JCK.COM>
User-Agent: Mutt/1.5.3i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 50a516d93fd399dc60588708fd9a3002
Cc: discuss@apps.ietf.org
X-BeenThere: discuss@apps.ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: general discussion of application-layer protocols <discuss.apps.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/discuss>, <mailto:discuss-request@apps.ietf.org?subject=unsubscribe>
List-Post: <mailto:discuss@apps.ietf.org>
List-Help: <mailto:discuss-request@apps.ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/discuss>, <mailto:discuss-request@apps.ietf.org?subject=subscribe>
Errors-To: discuss-bounces@apps.ietf.org

John C Klensin said:
> If you have any interest in internationalization issues, a
> careful reading of, and comments on, this proposal would be
> greatly appreciated.  

Please remove the term "extended ASCII" from 1.1 - there is no such thing.
If you want to talk about an 8-bit character set, then "the ISO 8859
family" would do.

At the end of 3.1, you can't say "the same considerations" apply to
Punycode and such encodings, because they *are* more compact (your third
and textually closest point).

I disagree with the claim that the \u notation is easier to read and less
"ugly and awkward" than the HTML one. Is:

    Ank\uabcdef

really easier to parse than:

    Ank&#xabcd;ef

? I think - particularly in contexts using alphabetic text - that having a
clear and obvious delimiter is far preferable and will produce far
fewer mistakes.
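
To make the parsing point concrete, here is a rough sketch in Python (mine,
not from the draft). It assumes \u takes exactly four hex digits, as in C,
and that the HTML form is &#x...; with any number of digits. The names
UNDELIMITED, DELIMITED, GREEDY and decode are made up for illustration.

```python
import re

# Assumption: \u is followed by exactly four hex digits, as in C99.
UNDELIMITED = re.compile(r"\\u([0-9A-Fa-f]{4})")
# Assumption: HTML-style hexadecimal reference, terminated by ';'.
DELIMITED = re.compile(r"&#x([0-9A-Fa-f]+);")
# A reader that does not count digits takes every hex digit in sight.
GREEDY = re.compile(r"\\u([0-9A-Fa-f]+)")

def decode(pattern, text):
    """Replace each match of pattern with the code point its hex digits name."""
    return pattern.sub(lambda m: chr(int(m.group(1), 16)), text)

expected = "Ank" + chr(0xABCD) + "ef"
print(decode(UNDELIMITED, r"Ank\uabcdef") == expected)  # True, but only by counting digits
print(decode(DELIMITED, "Ank&#xabcd;ef") == expected)   # True, the ';' marks the end
print(GREEDY.search(r"Ank\uabcdef").group(1))           # abcdef
```

A program that counts to four gets the right answer either way. The last
line shows what happens to anyone, human or software, who doesn't: the
trailing "ef" is swallowed into the escape.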

I can accept that the specific HTML notation is a bit clumsy, and in
particular that the use of 'x' implies other bases are available. Therefore
I would prefer something like \u(xxxx).
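
As a rough illustration of what I mean, a decoder for a parenthesised form
could look like the Python sketch below. The exact syntax is only my
suggestion, and the limit of one to six hex digits is an assumption chosen
to cover U+0000 to U+10FFFF; PAREN_ESCAPE and decode_paren are made-up
names.

```python
import re

# Hypothetical syntax: \u( followed by 1 to 6 hex digits, then ).
PAREN_ESCAPE = re.compile(r"\\u\(([0-9A-Fa-f]{1,6})\)")

def decode_paren(text):
    """Replace each \\u(X...) escape with the code point it names."""
    def replace(match):
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            # Six hex digits can exceed the Unicode range, so check explicitly.
            raise ValueError("code point out of range: " + match.group(0))
        return chr(code_point)
    return PAREN_ESCAPE.sub(replace, text)

print(decode_paren(r"Ank\u(abcd)ef") == "Ank" + chr(0xABCD) + "ef")  # True
print(decode_paren(r"\u(41)bc"))                                     # Abc
```

Because the closing parenthesis ends the escape, the digit count no longer
matters and following text that happens to be hex digits causes no trouble.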

The text in 3.2 implies that \U is followed by 6 digits, not 8. That risk
of confusion (particularly since Unicode ends at U+10FFFF) is itself a
reason for concern with any undelimited option.
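
A small Python sketch of that confusion, under the assumption that one
implementer reads \U as taking 8 digits and another reads it as taking 6.
The helper decode_big_u is made up for illustration and handles only the
first escape in the string.

```python
def decode_big_u(text, digits):
    """Decode the first \\U escape, reading exactly `digits` hex digits."""
    start = text.index("\\U") + 2
    code_point = int(text[start:start + digits], 16)
    return text[:start - 2] + chr(code_point) + text[start + digits:]

sample = r"\U0001F600"
# Eight-digit reading: one character, U+1F600.
print(decode_big_u(sample, 8) == chr(0x1F600))       # True
# Six-digit reading: U+01F6 followed by the literal text "00".
print(decode_big_u(sample, 6) == chr(0x1F6) + "00")  # True
```

Both readings succeed without error, so nothing warns either party that
they disagree.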

I haven't seen a "consensus" that there isn't a problem with being
case-sensitive; the fact that you messed up the ABNF on this particular
point is evidence that this is unwise. I know C does it that way (and in
retrospect, I would have fought harder for doing it differently), but C has
always been case sensitive in such matters. IETF protocols mostly aren't.

Reference [ISO-C-Chars] is wrong - the \u and \U notation was added to
ISO 9899 in the 1999 edition, not in that TR. They aren't extensions; they
are part of the core language.

-- 
Clive D.W. Feather  | Work:  <clive@demon.net>   | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive@davros.org>  | Fax:    +44 870 051 9937
Demon Internet      | WWW: http://www.davros.org | Mobile: +44 7973 377646
THUS plc            |                            |