Re: Character encodings in headers [i74][was: Straw-man charter for http-bis]
Martin Duerst <duerst@it.aoyama.ac.jp> Mon, 20 August 2007 07:56 UTC
Message-Id: <6.0.0.20.2.20070820162657.08bf55a0@localhost>
X-Sender: duerst@localhost
X-Mailer: QUALCOMM Windows Eudora Version 6J
Date: Mon, 20 Aug 2007 16:54:20 +0900
To: Mark Nottingham <mnot@mnot.net>
From: Martin Duerst <duerst@it.aoyama.ac.jp>
Subject: Re: Character encodings in headers [i74][was: Straw-man
charter for http-bis]
In-Reply-To: <088FB13E-F12F-4BE7-94FB-78B21C51512E@mnot.net>
References: <BA772834-227A-4C1B-9534-070C50DF05B3@mnot.net>
<392C98BA-E7B8-44ED-964B-82FC48162924@mnot.net>
<p06240843c2833f4d7f2f@[10.20.30.108]> <465D9142.9050506@gmx.de>
<6.0.0.20.2.20070610165356.0a69cec0@localhost>
<088FB13E-F12F-4BE7-94FB-78B21C51512E@mnot.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: Richard Ishida <ishida@w3.org>, Apps Discuss <discuss@apps.ietf.org>,
Felix Sasaki <fsasaki@w3.org>,
"ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>,
Paul Hoffman <phoffman@imc.org>

Hello Mark,

Thanks for giving this an issue number.

At 12:40 07/08/20, Mark Nottingham wrote:
>On 10/06/2007, at 6:05 PM, Martin Duerst wrote:
>> - RFC 2616 prescribes that headers containing non-ASCII have to use
>>   either iso-8859-1 or RFC 2047. This is unnecessarily complex and
>>   not necessarily followed. At the least, new extensions should be
>>   allowed to specify that UTF-8 is used.
>
>My .02;
>
>I'm concerned about allowing UTF-8; it may break existing
>implementations.
>
>I'd like to see the text just require that the actual character set
>be 8859-1, but to allow individual extensions to nominate encodings
>*like* 2047, without being restricted to it.

What do you mean by "encodings *like* 2047"?

And why do you think that UTF-8 may break existing implementations?
UTF-8 has virtually the same footprint in terms of bytes as
ISO-8859-1: all bytes above 0x7F may be used. Implementations that
have to deal with ISO-8859-1 usually do this by just being
8-bit-transparent; that works for UTF-8, too.

If your opinion is that UTF-8 cannot be allowed at all, then that's
going to be a problem for cases where it's already in use; see e.g.
earlier posts in the http list.

It's easy to say "may break existing implementations", but in over
10 years of being involved in the Web, I haven't heard about that
happening. If you or anybody else has, please speak up.

>For example, the encoding
>specified in 3987 is appropriate for URIs.

As one of the authors of RFC 3987, I know what you mean, but
"the encoding specified in 3987" wouldn't be enough for a spec.
Also, it's not really very well suited to the job, because
%hh-encoding is used to escape any bytes, not only UTF-8, and
there is a considerable length increase for some scripts.

>However, it *has* to be
>explicit; I've heard some people read this requirement and think that
>they need to check *every* header for 2047 encoding.

I have read it that way, too. If it can be safely argued that it
was never intended that way, and that no harm is produced if this
is restricted, then I'd be fine with restricting it, because
checking everything for 2047 is indeed tough, but I'd really like
to make sure that this isn't creating problems (i.e. that a careful
examination of the various headers makes some reasonably
conservative assumptions).

>So, I think this means;
>
>1) Change
>   "Words of *TEXT MAY contain characters from character sets other
>than ISO-8859-1 [22] only when encoded according to the rules of RFC
>2047 [14]."
>to
>   "Words of *TEXT MUST NOT contain characters from character sets
>other than ISO-8859-1 [22]."
>and,
>
>2) Identify headers that may have non-8859

There are many parts to ISO-8859, not just ISO-8859-1.

>content and explicitly say
>how to encode them (IRI, 2047, whatever; the existing ones will have
>to be 2047, I believe), modifying their BNF to suit.
>
>3) When we document extensibility, require new headers to nominate
>any encoding explicitly.

If that includes UTF-8, I'd be fine with it. If it excludes UTF-8,
I think that would be a problem.

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
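
The byte-level claims in the message above can be checked with a small sketch. It is only an illustration under stated assumptions, not part of the original post: the helper names (high_bytes, non_ascii_stays_high, footprint) and the sample strings are made up here, and Python's standard urllib.parse.quote and email.header.Header stand in for %hh-escaping of UTF-8 bytes and for RFC 2047 encoded-words respectively.

```python
from email.header import Header
from urllib.parse import quote


def high_bytes(data: bytes) -> int:
    """Count bytes above 0x7F, i.e. bytes outside US-ASCII."""
    return sum(1 for b in data if b > 0x7F)


def non_ascii_stays_high(text: str) -> bool:
    """True if every non-ASCII character encodes to UTF-8 bytes above 0x7F.

    This is why an 8-bit-transparent parser never sees a stray ASCII
    delimiter (CR, LF, ':') inside a multi-byte UTF-8 sequence.
    """
    return all(
        b > 0x7F
        for ch in text
        if ord(ch) > 0x7F
        for b in ch.encode("utf-8")
    )


def footprint(text: str) -> dict:
    """Compare the size of `text` as raw UTF-8 and as %hh-escaped UTF-8."""
    utf8 = text.encode("utf-8")
    return {
        "utf8_bytes": len(utf8),
        "utf8_high_bytes": high_bytes(utf8),
        # quote() percent-encodes the UTF-8 bytes; safe="" escapes everything
        # except unreserved ASCII characters.
        "percent_encoded_chars": len(quote(text, safe="")),
    }


if __name__ == "__main__":
    for sample in ("Dürst", "青山"):  # hypothetical sample header values
        print(sample, footprint(sample))
    # ISO-8859-1 also uses only bytes above 0x7F for its non-ASCII characters.
    print(high_bytes("Dürst".encode("iso-8859-1")))
    # An RFC 2047 encoded-word is pure ASCII, at the cost of extra syntax.
    print(Header("Dürst", "utf-8").encode())
```

For a mostly-ASCII Latin string the %hh form is modestly longer, while for a script whose characters each take three UTF-8 bytes it is three times the raw UTF-8 length, which is the "considerable length increase" mentioned in the message.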