Re: Comments on draft-klensin-net-utf8-06
"Frank Ellermann" <nobody@xyzzy.claranet.de> Wed, 17 October 2007 06:24 UTC
To: discuss@apps.ietf.org
From: Frank Ellermann <nobody@xyzzy.claranet.de>
Subject: Re: Comments on draft-klensin-net-utf8-06
Date: Wed, 17 Oct 2007 08:21:08 +0200

John C Klensin wrote:

>> * Section 4: "the string order of RFC 3629". It's not very
>> clear to me what is meant with this. Byte order? Sorting
>> order?

> 3629 specifies a byte order (in section 4). It does not address
> or mention sort order except to note (in the introduction) that
> UTF-8 preserves it and that sort order based on code point
> sequence is likely to be fairly useless.

> I _think_ I would welcome text to clarify this

Please simplify the remark and remove one "that":

-| Were Unicode to be changed in a way that violated these
-| assumptions, i.e., that either invalidated the string order
-| of RFC 3629 or that that changed the stability of NFC as
-| stated above, this specification would not apply.

+| Were Unicode to be changed in a way that violated these
+| assumptions, i.e., that changed the stability of NFC as
+| stated above, this specification would not apply.

UTF-8 as specified in STD 63 is stable.

> So I am loathe to cover things that are well-covered in
> 3629 lest more confusion be created.

Yes, just don't mention the "byte-value lexicographic sorting
order of UTF-8 strings", it's covered in STD 63, and besides
not very interesting.

>> * Section 4: I would drop the last paragraph, since it is a
>> repetition of what is exhaustively explained in section 5.2.
>> I got a parsing error at the last sentence of that paragraph
>> anyway.

Indeed, that paragraph is unnecessary. I also can't parse its
first sentence.

> That last sentence could be restated, less formally, as:
> If one encounters a UTF-8 string in a protocol, and its
> syntax and properties are not specifically defined, then
> it is reasonable to assume that it conforms to this
> specification.

I still don't understand this. What is an UTF-8 string with
"unspecified syntax"? STD 63 specifies the syntax of UTF-8,
anything not following this syntax is invalid. The net-utf8
I-D doesn't specify any default properties, what is an
assumption that "unspecified properties" conform to net-utf8
supposed to do? If you're talking about unassigned code
points please say so. In that case it's covered in 5.2, and
you can simply delete the last paragraph of section 4.

> I'm going to hold the document for a few days before
> re-posting in the hope of getting comments from others.

Please update the [NFC] reference, s/March 2005/2006-10-12/
for the version belonging to TUS 5.0.

Frank

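The properties the message leans on (byte-value order of UTF-8
matching code point order, syntactic validity per STD 63, and a
string already being in NFC) can be checked with a minimal Python
sketch. The function names are hypothetical and chosen only for
illustration. Python's strict "utf-8" codec is assumed to stand in
for the STD 63 syntax check, and `unicodedata` reflects whatever
Unicode version ships with the interpreter, which is not
necessarily TUS 5.0.

```python
import unicodedata


def utf8_order_matches_code_point_order(strings):
    """Return True if sorting by UTF-8 bytes equals sorting by code points."""
    # Sort key 1: the sequence of code point values of each string.
    by_code_point = sorted(strings, key=lambda s: [ord(c) for c in s])
    # Sort key 2: the raw UTF-8 byte sequence of each string.
    by_utf8_bytes = sorted(strings, key=lambda s: s.encode("utf-8"))
    return by_code_point == by_utf8_bytes


def is_valid_utf8(data):
    """Return True if the byte string decodes under the strict UTF-8 codec."""
    try:
        data.decode("utf-8")  # rejects overlong forms and encoded surrogates
    except UnicodeDecodeError:
        return False
    return True


def is_nfc(text):
    """Return True if normalizing to NFC leaves the text unchanged."""
    return unicodedata.normalize("NFC", text) == text


if __name__ == "__main__":
    samples = ["z", "\u00e9", "\u20ac", "\U00010348", "a", "ab"]
    print(utf8_order_matches_code_point_order(samples))
    print(is_valid_utf8(b"\xc0\x80"))   # overlong encoding of U+0000
    print(is_nfc("e\u0301"))            # "e" followed by combining acute
```

The sample list mixes one-, two-, three- and four-byte sequences
on purpose, since the ordering claim is only interesting across
different encoded lengths.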
- Comments on draft-klensin-net-utf8-06 Marcos Sanz/Denic
- Re: Comments on draft-klensin-net-utf8-06 John C Klensin
- Re: Comments on draft-klensin-net-utf8-06 Frank Ellermann
- Re: Comments on draft-klensin-net-utf8-06 Marcos Sanz/Denic
- Re: Comments on draft-klensin-net-utf8-06 Clive D.W. Feather
- Re: Comments on draft-klensin-net-utf8-06 John C Klensin