Re: FWD: I-D ACTION:draft-klensin-unicode-escapes-00.txt
"Clive D.W. Feather" <clive@demon.net> Mon, 22 January 2007 09:02 UTC
Received: from [127.0.0.1] (helo=stiedprmman1.va.neustar.com)
by megatron.ietf.org with esmtp (Exim 4.43)
id 1H8v4X-0005fz-Cr; Mon, 22 Jan 2007 04:02:37 -0500
Received: from [10.91.34.44] (helo=ietf-mx.ietf.org)
by megatron.ietf.org with esmtp (Exim 4.43) id 1H8v4V-0005fD-UO
for discuss@apps.ietf.org; Mon, 22 Jan 2007 04:02:35 -0500
Received: from anchor-internal-1.mail.demon.net ([195.173.56.100])
by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1H8v4U-0000Yj-2p
for discuss@apps.ietf.org; Mon, 22 Jan 2007 04:02:35 -0500
Received: from finch-staff-1.server.demon.net (finch-staff-1.server.demon.net [193.195.224.1])
by anchor-internal-1.mail.demon.net with ESMTP id l0M92UmJ010234; Mon, 22 Jan 2007 09:02:30 GMT
Received: from clive by finch-staff-1.server.demon.net with local (Exim 3.36
#1) id 1H8v4Q-000JIz-00; Mon, 22 Jan 2007 09:02:30 +0000
Date: Mon, 22 Jan 2007 09:02:30 +0000
From: "Clive D.W. Feather" <clive@demon.net>
To: John C Klensin <john-ietf@jck.com>
Subject: Re: FWD: I-D ACTION:draft-klensin-unicode-escapes-00.txt
Message-ID: <20070122090230.GJ60599@finch-staff-1.thus.net>
References: <891E235E7A867F0DB506C90A@p3.JCK.COM> <45B0F363.6020102@cs.utk.edu>
<E77D46A3FD71DD741ED1BE85@p3.JCK.COM>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <E77D46A3FD71DD741ED1BE85@p3.JCK.COM>
User-Agent: Mutt/1.5.3i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 73734d43604d52d23b3eba644a169745
Cc: discuss@apps.ietf.org, Keith Moore <moore@cs.utk.edu>
X-BeenThere: discuss@apps.ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: general discussion of application-layer protocols
<discuss.apps.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/discuss>,
<mailto:discuss-request@apps.ietf.org?subject=unsubscribe>
List-Post: <mailto:discuss@apps.ietf.org>
List-Help: <mailto:discuss-request@apps.ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/discuss>,
<mailto:discuss-request@apps.ietf.org?subject=subscribe>
Errors-To: discuss-bounces@apps.ietf.org

John C Klensin said:
>> - it should be clear that this is for newly-designed protocols
>> only. it shouldn't be interpreted as a request to change
>> existing protocols (including deployed and nonstandard
>> protocols being standardized by IETF), as this would generally
>> break backward compatibility by changing the meaning of '\'
>
> That was intended to be clear already. If it is not
> sufficiently so, suggested text, or at least a place to put it,
> would be welcome.

How about adding "new" before "protocols" in the middle paragraph of 1.1
and the abstract?

>> - it should be clear that this is for occasional use of
>> non-ASCII characters within a protocol field that is
>> constrained to contain only ASCII characters (or a subset),
>> rather than a recommendation for how to represent non-ASCII
>> characters in a protocol field that is capable of carrying,
>> say, UTF-8.
>
> I don't know if it is clear enough or not. At some level, if
> you didn't conclude that it was clear on reading the draft, then
> that is evidence that it isn't clear enough... but I don't know
> how carefully you read it.

I don't think it would hurt to add something in 1.1. I'm not sure how to
word it, but something about "Some protocols already accept native UTF-8
or some other encoding of Unicode, and this recommendation does not apply
to such protocols."

> I've looked at several RFCs
> and U+NNNN seems to be the preferred format for character
> literals and, more commonly, for identifying the code point
> associated with a named character. It is also, fwiw, the one I
> prefer for that purpose. But it is fairly poor for inline use
> in a protocol. The authoritative definition and reference for
> that form is the "Code Points" section of "Appendix A:
> Notational Conventions" of Unicode 5.0 (the reference to the
> book is the I-D).

I don't have that book. The online version 4.1 suggests the notation
<U+0061, U+0300>, which can be abbreviated to <0061, 0300>.

This would still need some kind of introductory indicator (like \u) to
show that it's a Unicode escape.

>> one more caveat: protocol specifications need to specify this
>> notation explicitly (either directly or by reference to the
>> published RFC) if they are going to use it. conversely, this
>> notation SHOULD NOT (maybe MUST NOT) be used unless it is part
>> of the protocol specification.
>
> Please suggest text for specifying those rules. I constructed
> this rather more as advice to protocol designers and, to a
> lesser extent, to document authors, rather than a base for
> notational definitions to be included by reference. That could
> be changed, but I'd welcome textual suggestions.

"This specification is a recommendation to protocol designers and
document authors. A protocol or other specification MUST NOT be
interpreted as using it unless it explicitly copies this syntax or
refers to this RFC as normative."

> But it is also, if I have done
> the calculation correctly, %C3%83 and that form (used in URIs
> and IRIs) is seriously non-intuitive and certainly can't be
> converted visually.

I certainly agree that encoding of UTF-8 sequences is the wrong thing
to do.

Oh: you should explicitly forbid the use of surrogates to encode
characters above U+FFFF.

> But I
> have no particularly strong commitment to any particular
> recommendation as long as we establish a recommendation.

(1) I agree that anything is better than nothing.

(2) While \uXXXX is better than encoded UTF-8, it's far worse than
something explicitly delimited.

-- 
Clive D.W. Feather  | Work:  <clive@demon.net>   | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive@davros.org>  | Fax:    +44 870 051 9937
Demon Internet      | WWW: http://www.davros.org | Mobile: +44 7973 377646
THUS plc            |                            |
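
To make the points in the message above concrete, here is a minimal Python sketch of an explicitly delimited escape combined with the bracketed code point notation. The syntax `\u<XXXX, YYYY>` is an assumption made up for this illustration; neither the draft nor the message defines it, and the function names are hypothetical. The sketch rejects surrogate code points, as the message asks, and also shows how the percent-encoded UTF-8 form %C3%83 arises for U+00C3.

```python
import re
from urllib.parse import quote

# Hypothetical syntax, assumed for illustration only:
#   \u<XXXX>  or  \u<XXXX, YYYY, ...>
# Each item is 4 to 6 hex digits with an optional "U+" prefix.
ESCAPE = re.compile(r"\\u<([^<>]*)>")
ITEM = re.compile(r"(?:U\+)?([0-9A-Fa-f]{4,6})")


def decode_code_point(hex_digits: str) -> str:
    # Convert one hex code point to a character, refusing surrogates.
    cp = int(hex_digits, 16)
    if 0xD800 <= cp <= 0xDFFF:
        # Characters above U+FFFF must be written directly (e.g. 1D11E),
        # never as a UTF-16 surrogate pair.
        raise ValueError(f"surrogate code point U+{cp:04X} is not allowed")
    if cp > 0x10FFFF:
        raise ValueError(f"U+{cp:X} is outside the Unicode code space")
    return chr(cp)


def expand_escapes(text: str) -> str:
    # Replace every delimited escape in text with the characters it names.
    def replace(match: re.Match) -> str:
        chars = []
        for item in match.group(1).split(","):
            m = ITEM.fullmatch(item.strip())
            if m is None:
                raise ValueError(f"malformed escape item: {item!r}")
            chars.append(decode_code_point(m.group(1)))
        return "".join(chars)

    return ESCAPE.sub(replace, text)


def percent_encode_utf8(ch: str) -> str:
    # The URI/IRI style: percent-encode each byte of the UTF-8 encoding.
    return quote(ch, safe="")
```

Because the closing `>` ends the escape, a hex digit that follows it in the surrounding text cannot be mistaken for part of the code point, which is the ambiguity a bare \uXXXX form runs into once code points above U+FFFF are allowed.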