Re: draft-klensin-unicode-escapes-01
Frank Ellermann <nobody@xyzzy.claranet.de> Tue, 06 February 2007 19:57 UTC
To: discuss@apps.ietf.org
From: Frank Ellermann <nobody@xyzzy.claranet.de>
Subject: Re: draft-klensin-unicode-escapes-01
Date: Tue, 06 Feb 2007 20:53:01 +0100
Organization: <URL:http://purl.net/xyzzy>
Lines: 89
Message-ID: <45C8DC9D.3D61@xyzzy.claranet.de>
References: <AF334D6BB0BFF3037B0DE609@p3.JCK.COM>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
John C Klensin wrote:

>> When I mentioned hex. NCRs I meant XML, not SGML and its many
>> ways to save keystrokes.

> And I was commenting only on the suggestion that all reference
> to HTML be removed, however horrible it is.

That's one of those mail communication breakdowns: my remark here
about the wonders of SGML (as seen or not in most browsers trying
to display HTML) was in reply to Clive.  It was no objection to or
suggestion for your I-D.

Of course I support CharMod C044 with explicit delimiters (as in
XML but not SGML) and CharMod C043 with hex. escapes (something
XML and SGML couldn't do, but your I-D and RFC 4646 got it right).

[unlike RFC 2231 a URL can't say what it is]

> I think I've understood both of those things.  I just haven't
> seen the justification or requirement to start exploring existing
> protocols in this document, famous or not.

IIRC you have a SHOULD.  One accepted justification to violate a
SHOULD is "your new rule came too late for my old implementation",
and so far it's unnecessary to talk about it.

But IMO there can also be reasons to violate this SHOULD in future
protocols, if it's in a context remotely related to IRIs.  Or in
similar situations where, say, using B64-encoded UTF-8 is better
than ASCII with hex. NCRs.

If you think that's obvious it's okay.  Sometimes folks ask why a
SHOULD is "only" a SHOULD, and want to know what a _good_ reason
to violate it could be (apart from the clear "too late"), and for
that I thought the IRI example might help.  From a "protocol
lawyer" POV, RFC 2324 is "only" informational <eg>

> every time I put something into a document that is not strictly
> necessary I get attacked for excessive length, etc.

If you think that an example for this SHOULD is unnecessary it's
fine.  With Murphy somebody will attack you later claiming that
the potential exceptions have to be spelled out.

[21 bits vs. 7 bits]

> Sure.  But we routinely express ASCII in terms of octets.  We
> don't use the "7-bit" or "21-bit" language very often.

Yes, the matter of 21 vs. 31 bits was recently discussed on the
Unicode list in conjunction with a (hypothetical) "UTF-21", maybe
Clive had that discussion in mind.

I'm also fascinated by such charset encoding details.  One reason
that I've not yet published a "UTF-4" I-D was RFC 4042 with its
UTF-9 and UTF-18 "nonets".  Your "net UTF-8" I-D also mentions
"nonets" (not using that name).

Of course you don't need to mention that matter in the "escapes"
I-D, unless you want to explain why old conventions demand _eight_
hex. digits where (today) _six_ should be good enough.  I only
tried to state that Clive's remark wasn't off topic or something.

> I don't think I disagree with your point -- it is certainly
> factual -- but don't yet see the need to open this topic up in
> this document (see the comment about length, etc., above).

It's perfectly okay if you stick to the "octet layer" in this I-D.

FWIW, in theory "UTF-4" (like the "old" UTF-8) could be extended
to 31 bits (again), but of course it will never happen.  It would
break UTF-16, BOCU-1, and all implementations of STD 66.  A lame
excuse is that those aliens are supposed to bring their own kind
of "Intergalacode" when they need more bits.

[about RFC 2345, unrelated to the Unicode escapes]

> I've become convinced, as I have delved further into the history
> of telnet-based protocols, that it was a mistake.

IBTD, if that's about using UTF-8 in whois.  UTF-8 doesn't need
some of the "critical" (wrt telnet) octets, especially no 0xFF.

RFC 3912 was a huge victory for any anti-1591 cabal, but the
whois-battlefield in the war on spam isn't completely lost yet...
<beg>

[about net-UTF-8, unrelated to the Unicode escapes]

> I'm typically willing to answer questions about comments of mine
> that seem obscure if that is more efficient for you.

The rest can wait for net-utf8-03.  If you have by chance an old
copy of RFC 97, it's AWOL in all RFC collections I've heard of.
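
For anyone who wants to check the arithmetic behind the remarks on
21 vs. 31 bits, six vs. eight hex. digits, and the missing 0xFF, here
is a minimal sketch in Python.  It is only an illustration, not
anything taken from the I-D: the helper names (bits_needed,
hex_digits_needed, utf8_octets_used) are made up for this note, and
the 31-bit limit 0x7FFFFFFF is assumed to be what "old" UTF-8 refers
to.

```python
MAX_CODE_POINT = 0x10FFFF      # highest code point in today's Unicode
OLD_UTF8_LIMIT = 0x7FFFFFFF    # assumed 31-bit limit of the "old" UTF-8


def bits_needed(value):
    """Number of bits needed to represent value as an unsigned integer."""
    return value.bit_length()


def hex_digits_needed(value):
    """Number of hex digits needed to write value without padding."""
    return len(format(value, "X"))


def utf8_octets_used():
    """Set of octet values occurring in the UTF-8 form of any scalar value."""
    used = set()
    for cp in range(MAX_CODE_POINT + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points have no UTF-8 encoding
        used.update(chr(cp).encode("utf-8"))
    return used


if __name__ == "__main__":
    print(bits_needed(MAX_CODE_POINT), hex_digits_needed(MAX_CODE_POINT))
    print(bits_needed(OLD_UTF8_LIMIT), hex_digits_needed(OLD_UTF8_LIMIT))
    octets = utf8_octets_used()
    print(0xFF in octets, sorted(set(range(256)) - octets))
```

The last line lists the octet values that never occur in UTF-8; under
the assumptions above, 0xFF (the telnet IAC octet) is among them.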
Frank