Re: I-D Action:draft-klensin-net-utf8-04.txt
John C Klensin <john-ietf@jck.com> Fri, 05 October 2007 15:54 UTC
Date: Fri, 05 Oct 2007 11:54:30 -0400
From: John C Klensin <john-ietf@jck.com>
To: Stephane Bortzmeyer <bortzmeyer@nic.fr>, discuss@apps.ietf.org
Subject: Re: I-D Action:draft-klensin-net-utf8-04.txt

--On Friday, 05 October, 2007 17:07 +0200 Stephane Bortzmeyer
<bortzmeyer@nic.fr> wrote:

> On Fri, Oct 05, 2007 at 01:40:02AM -0400,
> Internet-Drafts@ietf.org <Internet-Drafts@ietf.org> wrote
> a message of 91 lines which said:
>
>> Title : Unicode Format for Network Interchange
>> Author(s) : J. Klensin, M. Padlipsky
>> Filename : draft-klensin-net-utf8-04.txt
>
> I have read and studied this I-D and I find it basically OK,
> suitable for approval and very useful for the Internet, where
> internationalization is an important issue.
>
> I have some reservations, which are mostly details:
>
>> Section 1.1 [...] preferred to the double-byte encoding of
>> "extended ASCII" [RFC0698]
>
> This reference to a very obsolete system does not bring useful
> information. Delete it or move it to the interesting "History
> and Context" appendix.

Officially, RFC698 is not obsolete and applies specifically to
NVT-like streams, which is why it seemed worth singling out.
The spec should have listed "obsoletes RFC 698", which means
that this can't be in a section that is purely informative.
I'll try to figure out a better way to handle it, but am going
to leave the text more or less as is until I see further
comments.

A better alternative would be for someone to create an RFC
titled something like "implications of the character set
policy" that clears out internationalization cruft like RFC 698
and "extended ASCII". Opinions and volunteers would be welcome.

>> Section 2.1 [...] None of those uses is inappropriate for
>> streams of plain text.
>
> Isn't it a typo? It should be "appropriate".

Yes. It was a typo. Fixed in -05. There are several other
typos, including syntax that omits closing single quotes, that
have been reported offlist and fixed in -05.

>> Section 3 [...] Recognition of the fact that some applications
>> implementations may rely on operating system libraries over
>> which they have little control and adherence to the
>> robustness principle suggests that receivers of such strings
>> should be prepared to receive unnormalized ones
>
> This is also a security issue. An attacker could deliberately
> send unnormalized text even if the specification says MUST. As
> such, it is worth a mention in the security considerations.

Reasonable idea. Text added.

>> Section 5.2 [...] internationalized domain names (IDNA
>> [RFC3490]) [...] specific difficulties with IDNA in this
>> regard are discussed in [RFC4690]
>
> The two mentions of IDNA brings no value and really smell like
> a personal issue. Discussions of the UTF-8 RFC are
> understandable but other RFC talking about Unicode are not
> mentioned. Why specifically IDNA?

It isn't really an IDNA issue at all, but an issue with Unicode
versioning and libraries. 4690 contains the best discussion of
that subject of anything now published in the RFC series (the
discussion in draft-klensin-idnabis-issues is arguably even
better, but that document is in a sufficiently preliminary
state that having this one reference it would be unwise).

If you think it would significantly improve the document, I
could make that text say, e.g., that IDNA, SASLPrep, and
possibly other protocols are tied to Unicode 3.2 via Stringprep
and then point to 4690. But, either way, it is just a comment
about something we've done that is weak and should not be
repeated for Net-Unicode, plus an informative reference for
further reading.

>> Section 6 [...]
>
> A mention about firewalls and unnormalized UTF-8 streams could
> be useful. Something like "Firewalls and other systems
> interpreting UTF-8 streams should be developed with the clear
> knowledge that an attacker may deliberately send unnormalized
> text, for instance to avoid detection by naive text-matching
> systems."

Done.

>> Appendix A [...] whois [RFC0954]
>
> If it is the current version, it should be RFC3912. If it is
> the original one, which would make sense in an historical
> section, it should be RFC0812.

Here, I disagree. RFC954 was chosen because it was the last,
and clearest, version of the original spec. RFC3912 is
different in several respects and arguably introduces new
ambiguities. I could make it "[RFC0812] [RFC0954]" if you
think that would improve clarity.

thanks,
    john
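
For readers following the Section 6 point about unnormalized text slipping past naive text-matching systems, here is a minimal sketch of the problem. It is not text from the draft. It assumes NFC as the normalization form being compared against, and the function names and sample strings are hypothetical, chosen only for illustration.

```python
import unicodedata


def naive_match(pattern: str, text: str) -> bool:
    """Code-point-for-code-point substring search, as a naive filter might do."""
    return pattern in text


def normalized_match(pattern: str, text: str) -> bool:
    """Normalize both sides to NFC (assumed form) before searching."""
    def nfc(s: str) -> str:
        return unicodedata.normalize("NFC", s)
    return nfc(pattern) in nfc(text)


# Hypothetical blocked word, stored in precomposed form (U+00E9).
blocked = "caf\u00e9"

# Same visible text, sent as "e" followed by COMBINING ACUTE ACCENT (U+0301).
evasive = "cafe\u0301 menu"

print(naive_match(blocked, evasive))       # False: the code points differ
print(normalized_match(blocked, evasive))  # True: equal after normalization
```

A receiver that normalizes before comparing catches the decomposed variant, while one that compares raw code points does not. That is the reason for asking receivers to be prepared for unnormalized input even when senders are required to normalize.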