Re: [EAI] UTF-8/MIME
John C Klensin <klensin@jck.com> Thu, 19 August 2010 16:34 UTC
Return-Path: <klensin@jck.com>
X-Original-To: ima@core3.amsl.com
Delivered-To: ima@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 9AC123A6915 for <ima@core3.amsl.com>; Thu, 19 Aug 2010 09:34:27 -0700 (PDT)
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1m8gv1VLFvzH for <ima@core3.amsl.com>; Thu, 19 Aug 2010 09:34:26 -0700 (PDT)
Received: from bs.jck.com (ns.jck.com [209.187.148.211]) by core3.amsl.com (Postfix) with ESMTP id 32AEC3A693D for <ima@ietf.org>; Thu, 19 Aug 2010 09:34:26 -0700 (PDT)
Received: from [127.0.0.1] (helo=localhost) by bs.jck.com with esmtp (Exim 4.34) id 1Om84t-000B9l-4n; Thu, 19 Aug 2010 12:34:55 -0400
Date: Thu, 19 Aug 2010 12:34:54 -0400
From: John C Klensin <klensin@jck.com>
To: Charles Lindsey <chl@clerew.man.ac.uk>, IMA <ima@ietf.org>
Message-ID: <366A47AA8992A5250A7A8820@PST.JCK.COM>
In-Reply-To: <op.vhotq0c26hl8nm@clerew.man.ac.uk>
References: <E14011F8737B524BB564B05FF748464A0E2374D7@TK5EX14MBXC141.redmond.corp.microsoft.com> <op.vhotq0c26hl8nm@clerew.man.ac.uk>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Subject: Re: [EAI] UTF-8/MIME
X-BeenThere: ima@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: "EAI \(Email Address Internationalization\)" <ima.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/ima>, <mailto:ima-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/ima>
List-Post: <mailto:ima@ietf.org>
List-Help: <mailto:ima-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ima>, <mailto:ima-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 19 Aug 2010 16:34:27 -0000
--On Thursday, August 19, 2010 14:43 +0100 Charles Lindsey
<chl@clerew.man.ac.uk> wrote:

> On Wed, 18 Aug 2010 19:31:52 +0100, Shawn Steele
> <Shawn.Steele@microsoft.com> wrote:
>
>>> my understanding is that headers are in utf-8,
>>> the body part might still use mime(gb2312).
>>
>> I'm obviously really confused. My understanding was that EAI
>> required UTF-8 headers and MIME encoded bodies, which
>> should also be UTF-8, but an appropriate MIME part.
>> Presumably the body could also be GB2312 MIME, though I'd
>> discourage that as much as possible.

The implication of draft-iab-idn-encoding is that we ought to be
gradually deprecating everything but ASCII and UTF-8 (I would say
"everything but UTF-8", but there is specific language in the MIME
specs stating a preference for coding body parts with no non-ASCII
characters as "us-ascii" rather than "utf-8"). So I agree with
Shawn's observation but don't think it is a matter for EAI to
discuss or offer advice about (see note 1 below).

> Now you've got me confused. An EAI message (i.e. one coming
> from an agent asserting a need for UTF8SMTP) might contain
> headers using UTF-8, but the body entirely in ASCII. So no
> MIME stuff anywhere in it. So you cannot say that EAI REQUIRES
> MIME, though for sure it would be stupid to implement it
> without MIME.

But there will always be "MIME stuff" in an EAI-conformant message
with non-ASCII header material. Remember, UTF8SMTP[bis] requires
8BITMIME, 8BITMIME requires MIME, and MIME requires at least a
MIME-version header field, so, even if there is only one body part
and the content type is allowed to default to 'text/plain;
charset="us-ascii"' (i.e., no Content-Type header field present),
there is "MIME stuff" present.

> AIUI, if you assert UTF8SMTP and have UTF8 in the headers, and
> then want to use UTF-8 in the Body (a common situation,
> presumably), then you are supposed to include a Content-Type:
> specifying charset=utf8 and a suitable
> Content-Transfer-Encoding.

Yes. Since 8BITMIME is needed to transport those UTF-8 headers, the
C-T-E can reasonably be "8bit".

> What we MIGHT do is to state that, for such EAI messages the
> default body charset was UTF-8 and the default CTE was 8bit
> (and most modern MUAs would likely display that correctly).
> That's just another extension to RFC 553[12], and we have
> extended those already. The downside would be that any attempt
> at downgrading would have to put those assumed-by-default
> headers back.

I think this would be certain to cause problems with other MIME
implementations. Remember that messages are passed outside the
transport system and that making up other defaults --body parts not
requiring Content-Type or C-T-E fields-- would violate the 8BITMIME
spec.

Now, if we were to say that use of EAI required either the normal
default Content-Type (text/plain; charset="us-ascii") or that text
types were required to be charset="utf-8", I think that would be ok.
However, I think it would accomplish very little in practice that a
simple recommendation to use UTF-8 in body parts where possible
(labeled as the MIME and 8BITMIME specs required) would not. In
particular, if someone were determined to use GB body parts, I don't
think we are going to be able to effectively prohibit that. We
should just try to insist that they be properly labeled and
identified.

> In fact, I suspect that default practice is going to happen
> anyway within EAI-only communities, so we might as well make
> it official.

Partially because of the wave of MTAs that, in self-defense, took
very harsh measures to deal with unlabeled 8bit content, there is, in
my experience, little current practice of using non-ASCII body parts
without Content-type labeling. I see no reason at all to try to
reintroduce that practice, especially when we remember how common
text/html and various binary ("application") and image type body
parts have to be handled by those same MTAs and MUAs.

    john
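
Illustration of the labeling discussed in the message above: a minimal
sketch using Python's standard email package, not anything taken from
the EAI drafts. The function names and the example.com addresses are
assumptions made up for this example. The first function builds a
message whose UTF-8 body is explicitly labeled with a charset and an
"8bit" C-T-E; the second builds an ASCII-only message that carries
nothing but a MIME-Version field and so falls back on the default
content type.

```python
from email import policy
from email.message import EmailMessage


def build_labeled_message(body: str) -> EmailMessage:
    """UTF-8 body, explicitly labeled (hypothetical helper)."""
    # policy.SMTPUTF8 allows raw UTF-8 in header fields when serializing.
    msg = EmailMessage(policy=policy.SMTPUTF8)
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg["Subject"] = "Grüße"                # non-ASCII header material
    # Label the body: charset="utf-8" in Content-Type, C-T-E "8bit".
    # set_content() also adds MIME-Version: 1.0 if it is missing.
    msg.set_content(body, charset="utf-8", cte="8bit")
    return msg


def build_default_ascii_message(body: str) -> EmailMessage:
    """ASCII-only body, no Content-Type field (hypothetical helper)."""
    msg = EmailMessage(policy=policy.SMTP)
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    # Even with the content type left to default, MIME-Version is present.
    msg["MIME-Version"] = "1.0"
    msg.set_payload(body)
    return msg


if __name__ == "__main__":
    labeled = build_labeled_message("Body text with non-ASCII: héllo\n")
    for name in ("MIME-Version", "Content-Type", "Content-Transfer-Encoding"):
        print(f"{name}: {labeled[name]}")

    plain = build_default_ascii_message("Plain ASCII body\n")
    # No Content-Type field was set; the library reports the default type.
    print("Content-Type" in plain, plain.get_content_type())
```

In both cases a MIME-Version field is present, which is the sense in
which there is always "MIME stuff" in such a message. A body in some
other charset (GB2312, say) would be labeled the same way, with a
different charset value.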