Re: I-D Action:draft-klensin-net-utf8-04.txt

Stephane Bortzmeyer <bortzmeyer@nic.fr> Fri, 05 October 2007 15:08 UTC

Date: Fri, 05 Oct 2007 17:07:59 +0200
From: Stephane Bortzmeyer <bortzmeyer@nic.fr>
To: discuss@apps.ietf.org
Subject: Re: I-D Action:draft-klensin-net-utf8-04.txt

On Fri, Oct 05, 2007 at 01:40:02AM -0400,
 Internet-Drafts@ietf.org <Internet-Drafts@ietf.org> wrote 
 a message of 91 lines which said:

> 	Title           : Unicode Format for Network Interchange
> 	Author(s)       : J. Klensin, M. Padlipsky
> 	Filename        : draft-klensin-net-utf8-04.txt

I have read and studied this I-D and I find it basically OK, suitable
for approval and very useful for the Internet, where
internationalization is an important issue.

I have some reservations, which are mostly details:

> Section 1.1 [...] preferred to the double-byte encoding of "extended
> ASCII" [RFC0698]

This reference to a long-obsolete system adds no useful
information. Delete it or move it to the interesting "History and
Context" appendix.

> Section 2.1 [...] None of those uses is inappropriate for streams of
> plain text.

Isn't this a typo? It should be "appropriate".

> Section 3 [...] Recognition of the fact that some applications
> implementations may rely on operating system libraries over which
> they have little control and adherence to the robustness principle
> suggests that receivers of such strings should be prepared to
> receive unnormalized ones

This is also a security issue. An attacker could deliberately send
unnormalized text even if the specification says MUST. As such, it is
worth a mention in the security considerations.
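
To illustrate what "prepared to receive unnormalized ones" could mean
in practice, here is a minimal sketch. It assumes NFC is the
normalization form in question; the function names is_nfc and receive
are mine and purely illustrative, not taken from the draft.

```python
import unicodedata

def is_nfc(text: str) -> bool:
    """True if the string is already in NFC."""
    return unicodedata.normalize("NFC", text) == text

def receive(text: str) -> str:
    """Hypothetical receiver hook: never trust the sender's MUST,
    always bring incoming text to NFC before using it."""
    return text if is_nfc(text) else unicodedata.normalize("NFC", text)

# The same word, sent in two different ways.
decomposed = "cafe\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed = "caf\u00e9"     # precomposed U+00E9

print(is_nfc(decomposed))               # the decomposed form is not NFC
print(receive(decomposed) == composed)  # after normalization both forms agree
```

The receiver pays the cost of one normalization pass, but it no longer
depends on the good behaviour of the sender.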

> Section 5.2 [...] internationalized domain names (IDNA [RFC3490])
> [...]  specific difficulties with IDNA in this regard are discussed
> in [RFC4690]

The two mentions of IDNA bring no value and really smell like a
personal issue. Discussions of the UTF-8 RFC are understandable, but
other RFCs dealing with Unicode are not mentioned. Why specifically
IDNA?

> Section 6 [...]

A mention of firewalls and unnormalized UTF-8 streams could be
useful. Something like "Firewalls and other systems interpreting UTF-8
streams should be developed with the clear knowledge that an attacker
may deliberately send unnormalized text, for instance to avoid
detection by naive text-matching systems."
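
A small sketch of the evasion I have in mind. Everything here is
hypothetical (the blocked word, the payload, the function names), and
NFC is assumed as the normalization form; it only shows why a
byte-level match is not enough.

```python
import unicodedata

BLOCKED = "caf\u00e9"  # hypothetical keyword, stored in precomposed (NFC) form

def naive_match(payload: bytes, needle: str) -> bool:
    """Byte-level search, as a naive text-matching filter might do it."""
    return needle.encode("utf-8") in payload

def normalizing_match(payload: bytes, needle: str) -> bool:
    """Decode, bring both sides to NFC, then search."""
    text = unicodedata.normalize("NFC", payload.decode("utf-8", errors="replace"))
    return unicodedata.normalize("NFC", needle) in text

# The attacker sends 'e' + U+0301 instead of the precomposed U+00E9.
evasive = "un cafe\u0301 noir".encode("utf-8")

print(naive_match(evasive, BLOCKED))        # the naive filter misses it
print(normalizing_match(evasive, BLOCKED))  # the normalizing filter catches it
```

Both payloads display identically to the user, which is exactly what
makes the trick attractive to an attacker.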

> Appendix A [...] whois [RFC0954]

If this is meant to be the current version, it should be RFC3912. If
it is the original one, which would make sense in a historical
section, it should be RFC0812.