Re: Next step

"Clive D.W. Feather" <clive@demon.net> Thu, 25 January 2007 08:11 UTC

Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com) by megatron.ietf.org with esmtp (Exim 4.43) id 1H9zhW-0004Ap-Mh; Thu, 25 Jan 2007 03:11:18 -0500
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org) by megatron.ietf.org with esmtp (Exim 4.43) id 1H9zhW-0004Aj-7P for discuss@apps.ietf.org; Thu, 25 Jan 2007 03:11:18 -0500
Received: from anchor-internal-1.mail.demon.net ([195.173.56.100]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1H9zhU-0007ST-Px for discuss@apps.ietf.org; Thu, 25 Jan 2007 03:11:18 -0500
Received: from finch-staff-1.server.demon.net (finch-staff-1.server.demon.net [193.195.224.1]) by anchor-internal-1.mail.demon.net with ESMTP id l0P8BGrw006061; Thu, 25 Jan 2007 08:11:16 GMT
Received: from clive by finch-staff-1.server.demon.net with local (Exim 3.36 #1) id 1H9zhT-0005Ic-00; Thu, 25 Jan 2007 08:11:15 +0000
Date: Thu, 25 Jan 2007 08:11:15 +0000
From: "Clive D.W. Feather" <clive@demon.net>
To: Frank Ellermann <nobody@xyzzy.claranet.de>
Subject: Re: Next step
Message-ID: <20070125081115.GH18174@finch-staff-1.thus.net>
References: <B1930392E9C03720F9E495F8@p3.JCK.COM> <45B7F44F.4675@xyzzy.claranet.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <45B7F44F.4675@xyzzy.claranet.de>
User-Agent: Mutt/1.5.3i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 769a46790fb42fbb0b0cc700c82f7081
Cc: discuss@apps.ietf.org
X-BeenThere: discuss@apps.ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: general discussion of application-layer protocols <discuss.apps.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/discuss>, <mailto:discuss-request@apps.ietf.org?subject=unsubscribe>
List-Post: <mailto:discuss@apps.ietf.org>
List-Help: <mailto:discuss-request@apps.ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/discuss>, <mailto:discuss-request@apps.ietf.org?subject=subscribe>
Errors-To: discuss-bounces@apps.ietf.org

Frank Ellermann said:
>    o  Java uses the form \uNNNN, but can represent characters outside
>       Plane 0 (i.e., above U+FFFF) only by the use of surrogate pairs.
> 
> One of the reasons why anything with \u or \U is a non-starter, there
> are too many incompatible conventions in use.

In particular, C uses \U with 8 hex digits because, at the time, there was
a serious possibility that ISO 10646 would still allow all 32 bits (or at
least 31) to be used.

Had it been a few years later, we would probably have made \U indicate 6
hex digits rather than 8.
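
To make the digit counts concrete: the current upper limit, U+10FFFF, fits
in six hex digits, while a 31-bit code space needs eight. The sketch below
is only an illustration in Python (which happens to follow the C convention
of \U plus exactly eight digits). The helper names are invented for this
example, and the surrogate-pair arithmetic is the standard UTF-16 rule that
the Java form relies on.

```python
MAX_UNICODE = 0x10FFFF      # current upper limit of the code space
MAX_31_BIT = 0x7FFFFFFF     # the 31-bit space that was once a possibility

def hex_digits_needed(value):
    """Number of hex digits required to write `value`."""
    return len(format(value, "X"))

def c_style_escape(cp):
    """\\uNNNN for Plane 0, \\UNNNNNNNN otherwise (the C convention)."""
    if cp <= 0xFFFF:
        return "\\u%04X" % cp
    return "\\U%08X" % cp

def java_style_escape(cp):
    """\\uNNNN only; code points above U+FFFF become a surrogate pair."""
    if cp <= 0xFFFF:
        return "\\u%04X" % cp
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits of the offset
    return "\\u%04X\\u%04X" % (high, low)

print(hex_digits_needed(MAX_UNICODE))   # 6
print(hex_digits_needed(MAX_31_BIT))    # 8
print(c_style_escape(0x12345))          # \U00012345
print(java_style_escape(0x12345))       # \uD808\uDF45
```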

>    There is one significant disadvantage of the recommended form.  The
> 
> No, there are more, folks will assume that it's a convention they know
> or a variant of U+NNNN[N[N]] with an arbitrary number of leading 0s.
> Nobody will use \U012345 when they can hope to get away with \U12345.

+1
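
A rough Python sketch of how this goes wrong. Both decoders are
hypothetical and written only for this example: one insists on exactly six
hex digits after \U, the other takes however many (one to six) it finds, as
a writer hoping to "get away with \U12345" would expect.

```python
import re

MAX_UNICODE = 0x10FFFF

def _to_char(match):
    """Convert the captured hex digits to a character, rejecting out-of-range values."""
    cp = int(match.group(1), 16)
    if cp > MAX_UNICODE:
        raise ValueError("code point out of range: %X" % cp)
    return chr(cp)

def decode_fixed6(text):
    """Hypothetical strict decoder: \\U followed by exactly six hex digits."""
    return re.sub(r"\\U([0-9A-Fa-f]{6})", _to_char, text)

def decode_greedy(text):
    """Hypothetical lenient decoder: \\U followed by one to six hex digits."""
    return re.sub(r"\\U([0-9A-Fa-f]{1,6})", _to_char, text)

# The well-formed six-digit form decodes the same way in both.
print(decode_fixed6(r"\U012345") == decode_greedy(r"\U012345"))   # True

# A dropped leading zero: the strict decoder silently leaves it undecoded.
print(decode_fixed6(r"\U12345 x"))                                # \U12345 x

# Variable length swallows a following hex letter: the writer meant
# U+00E9 followed by "cole", but the lenient decoder reads U+0E9C + "ole".
print(decode_greedy(r"\U00E9cole") == chr(0x0E9C) + "ole")        # True
```

Neither behaviour is what the writer intended in the last two cases, and
the two decoders disagree with each other on the same input.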

>    should not introduce any security issues that are not present as a
> 
> My objections are also security considerations, because folks will
> screw up with this encoding it could cause havoc.

+2

If UTF-8 can make a security issue out of having more than one way to
encode a character, so can we.
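
For anyone who has not met the UTF-8 problem, here is a small Python
sketch. The naive decoder is deliberately unsafe and invented for this
example; it omits the shortest-form check, so the two-byte sequence C0 AF
decodes to "/" even though the only valid encoding of "/" is the single
byte 2F. Python's own strict decoder rejects the sequence.

```python
def naive_utf8_decode_2byte(data):
    """Decode a two-byte UTF-8 sequence WITHOUT the shortest-form check (unsafe)."""
    b0, b1 = data
    if (b0 & 0xE0) != 0xC0 or (b1 & 0xC0) != 0x80:
        raise ValueError("not a two-byte UTF-8 sequence")
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))

overlong_slash = bytes([0xC0, 0xAF])    # non-shortest form of U+002F

# The naive decoder yields a slash that a byte-level filter never saw.
print(naive_utf8_decode_2byte(overlong_slash))    # /

# A conforming decoder refuses the overlong form.
try:
    overlong_slash.decode("utf-8")
except UnicodeDecodeError:
    print("strict decoder rejects the overlong form")
```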

[Which reminds me: being able to encode ASCII characters in this form might
be a security issue as well, or it might be a useful benefit.]

> In theory your proposal is compatible with C044, but in practice I
> fear that it won't work as you expect it.  I could live with e.g.
> "authors SHOULD either pick hex. NCRs as in XML or" (your proposal),
> but in fact I think that the XML-notation is much better.

My only, small, discomfort is that people will expect all protocols to
accept both hex (&#x1234;) and decimal (&#1234;).
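
A short Python sketch of that mismatch. html.unescape is the standard
library's HTML-style decoder and accepts both forms; decode_hex_ncr_only is
a hypothetical stand-in for a protocol that adopted only the hex form.

```python
import html
import re

def decode_hex_ncr_only(text):
    """Hypothetical strict decoder: accepts &#xH...; only, leaves decimal forms untouched."""
    def repl(match):
        cp = int(match.group(1), 16)
        if cp > 0x10FFFF:
            raise ValueError("code point out of range: %X" % cp)
        return chr(cp)
    return re.sub(r"&#x([0-9A-Fa-f]{1,6});", repl, text)

# An HTML-style decoder accepts both notations, with different results:
print(html.unescape("&#x1234;") == "\u1234")    # True: hex 1234 is U+1234
print(html.unescape("&#1234;") == "\u04D2")     # True: decimal 1234 is U+04D2

# A hex-only protocol decodes the first and passes the second through as text.
print(decode_hex_ncr_only("&#x1234;") == "\u1234")   # True
print(decode_hex_ncr_only("&#1234;"))                # &#1234;
```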

-- 
Clive D.W. Feather  | Work:  <clive@demon.net>   | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive@davros.org>  | Fax:    +44 870 051 9937
Demon Internet      | WWW: http://www.davros.org | Mobile: +44 7973 377646
THUS plc            |                            |