Re: Comments on draft-klensin-net-utf8-06
Marcos Sanz/Denic <sanz@denic.de> Thu, 18 October 2007 08:45 UTC
To: John C Klensin <john-ietf@jck.com>
From: Marcos Sanz/Denic <sanz@denic.de>
Date: Thu, 18 Oct 2007 10:44:30 +0200
Cc: discuss@apps.ietf.org
John,

> While I would welcome suggestions about other text and ways to
> organize this,

What about making a forecasting note right away under bullet 2:

OLD TEXT:
CR SHOULD NOT appear except when followed by LF.

SUGGESTED TEXT:
CR SHOULD NOT appear except when followed by LF. The only other
allowed appearance is in the combination CR NUL, which is not
recommended (see note at the end of this section).

There is a similar contradictory (less restrictive at the beginning
and suddenly more restrictive at the end) situation in bullet 3. The
first sentence goes

  [...] control characters (U+0000 to U+001F and U+007F to U+009F)
  SHOULD generally be avoided

vs the last sentence

  the so-called "C1 Controls" (U+0080 through U+009F) MUST NOT appear

This is a nightmare for any implementor. The double negation doesn't
provide for clarity either. What about changing

OLD TEXT:
control characters (U+0000 to U+001F and U+007F to U+009F) SHOULD
generally be avoided.

SUGGESTED TEXT:
control characters (U+0000 to U+001F and U+007F) SHOULD NOT be used
and the so-called "C1 controls" (U+0080 to U+009F) MUST NOT be used.

Then you can drop the last sentence of the bullet.

> > Suggested text:
> >
> > That is, if a string does not contain any unassigned
> > characters for a given version of Unicode, and it is
> > normalized according to
> > the definition of NFC in that version, it will always result
> > in the same normalized string according to all future
> > versions of the Unicode Standard.
>
> The text that was used was supplied by Mark Davis after my first
> attempt didn't come out right.

He'll certainly know better.

> > * Section 4: "the string order of RFC 3629". It's not very
> > clear to me what is meant with this. Byte order? Sorting
> > order?
>
> 3629 specifies a byte order (in section 4). It does not address
> or mention sort order except to note (in the introduction) that
> UTF-8 preserves it and that sort order based on code point
> sequence is likely to be fairly useless.
>
> I _think_ I would welcome text to clarify this

I support Frank's suggestion, which I'll copy here again for clarity:

-| Were Unicode to be changed in a way that violated these
-| assumptions, i.e., that either invalidated the string order
-| of RFC 3629 or that that changed the stability of NFC as
-| stated above, this specification would not apply.
+| Were Unicode to be changed in a way that violated these
+| assumptions, i.e., that changed the stability of NFC as
+| stated above, this specification would not apply.

And again, UTF-8 as specified in STD 63 is stable.

> So I am loathe to cover things that
> are well-covered in 3629 lest more confusion be created.

We fully agree. I only think that the reference in the old text is
not necessary.

> > * Section 4: I would drop the last paragraph, since it is a
> > repetition of what is exhaustively explained in section 5.2.
> > I got a parsing error at the last sentence of that paragraph
> > anyway.
>
> Hmm. It parses for me. But I agree about the redundancy,

Ok, so we drop it. Now about the last sentence:

> except for that last sentence, which makes a normative assertion
> about this specification that does not appear in Section 5.
> That last sentence could be restated, less formally, as:
>
> If one encounters a UTF-8 string in a protocol, and its
> syntax and properties are not specifically defined, then
> it is reasonable to assume that it conforms to this
> specification.

The old formulation mentioned "unidentified UTF-8 strings", the new
formulation mentions a UTF-8 string with syntax and properties "not
specifically defined". I am sure you have something in mind, but it
still doesn't get through. And if you are aiming at a normative
assertion, then normative language should be used and not something
vague like "it is reasonable to assume".

Thanks and best regards,
Marcos
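
For illustration only, the wording suggested above for bullets 2 and 3 could be checked roughly as in the following Python sketch. The function name check_controls, the split into "errors" (MUST NOT) and "warnings" (SHOULD NOT), and the treatment of an LF not preceded by CR as an ordinary C0 control are assumptions made for this sketch, not text from the draft.

```python
def check_controls(text):
    """Return (errors, warnings) for control characters in `text`.

    errors   -> MUST NOT violations (C1 controls, U+0080..U+009F)
    warnings -> SHOULD NOT violations (C0 controls, U+007F, stray CR)
    """
    errors, warnings = [], []
    i = 0
    while i < len(text):
        cp = ord(text[i])
        if cp == 0x0D:  # CR
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == "\n":
                i += 2  # CR LF is the regular line ending; skip both
                continue
            if nxt == "\x00":
                # CR NUL: allowed by the suggested text, but not recommended
                warnings.append(f"CR NUL at index {i} (allowed, not recommended)")
                i += 2
                continue
            warnings.append(f"bare CR at index {i}")
        elif 0x80 <= cp <= 0x9F:
            errors.append(f"C1 control U+{cp:04X} at index {i}")
        elif cp <= 0x1F or cp == 0x7F:
            # Assumption: an LF without a preceding CR lands here as well
            warnings.append(f"control U+{cp:04X} at index {i}")
        i += 1
    return errors, warnings
```

With the two requirement levels kept apart like this, an implementor can reject on errors and merely log warnings, which is the distinction the single "SHOULD generally be avoided" sentence blurs.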