Re: [apps-discuss] RFC 6657 on Update to MIME regarding "charset" Parameter Handling in Textual Media Types

Ned Freed <ned.freed@mrochek.com> Tue, 10 July 2012 17:42 UTC

Message-id: <01OHOK4TIDIW0006TF@mauve.mrochek.com>
Date: Tue, 10 Jul 2012 10:18:38 -0700
From: Ned Freed <ned.freed@mrochek.com>
In-reply-to: "Your message dated Tue, 10 Jul 2012 09:14:12 +0100" <4FFBE454.1020601@zoo.ox.ac.uk>
MIME-version: 1.0
Content-type: TEXT/PLAIN; Format="flowed"
References: <20120710000754.6BF59B1E006@rfc-editor.org> <4FFBE454.1020601@zoo.ox.ac.uk>
To: Graham Klyne <graham.klyne@zoo.ox.ac.uk>
Cc: apps-discuss@ietf.org
Subject: Re: [apps-discuss] RFC 6657 on Update to MIME regarding "charset" Parameter Handling in Textual Media Types

> On 10/07/2012 01:07, rfc-editor@rfc-editor.org wrote:
> >
> > A new Request for Comments is now available in online RFC libraries.
> >
> >
> >          RFC 6657
> >
> >          Title:      Update to MIME regarding "charset"
> >                      Parameter Handling in Textual Media Types

> I didn't see this one coming.

It was discussed at considerable length both here and on the IETF list.

>  I'm a bit confused by the specification.

You need to keep in mind that this only applies to subtypes of text.

> If we define a media type that is *always* UTF-8, does this count as
> transporting its own charset information?

That's one approach you can use. The alternatives are to allow or require
a charset parameter, always with the value utf-8. The best approach depends
on the specifics of the type.

>  Should we say that the media type
> SHOULD NOT be included, or that it SHOULD be included with value UTF-8?

Included where? Within the content? If so, that's up to the registration to
say. There are plenty of utf-8 based formats that don't provide for inclusion
of media type information - and that includes some that use XML syntax.

> Section
> 3 implies the latter, but it also talks about media types defining their own
> default encoding.

Relying on defaults is discouraged for historical reasons - they don't work
very well. As such, if it's possible for the type to explicitly say what the
charset is, that's probably the best way to do it. If the type isn't capable of
that for whatever reason, your options are to simply say it's always utf-8, or
alternatively to allow or require a charset parameter with utf-8 as the only
value.
The best approach depends on the situation, which is why the document is full
of SHOULDs, not MUSTs.
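
As a rough illustration of the last two options (a sketch only, not text from
RFC 6657), a receiver for a UTF-8-only type might resolve the charset along
these lines. The subtype name text/example and the function effective_charset
are hypothetical, and the choice to reject a non-utf-8 value is an assumption
that a real registration would have to state for itself.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value

# Hypothetical UTF-8-only subtype, used purely for illustration.
UTF8_ONLY_TYPE = "text/example"


def effective_charset(content_type_header: str) -> str:
    """Return the charset to use for a body labelled with the given header.

    Assumed policy: the charset parameter is optional, and if it is present
    its only permitted value is utf-8 (compared case-insensitively).
    """
    msg = Message()
    msg["Content-Type"] = content_type_header

    # get_content_type() returns the type/subtype in lower case.
    if msg.get_content_type() != UTF8_ONLY_TYPE:
        raise ValueError("not the media type this sketch handles")

    declared = msg.get_param("charset")  # None when the parameter is absent
    if declared is None:
        # Option: the type is simply defined as always being utf-8.
        return "utf-8"

    # Option: a charset parameter is allowed, with utf-8 as the only value.
    declared = collapse_rfc2231_value(declared)
    if declared.lower() != "utf-8":
        raise ValueError("charset must be utf-8 for this type")
    return "utf-8"
```

Under these assumptions, both "text/example" and
'TEXT/EXAMPLE; charset="UTF-8"' resolve to utf-8, while a label such as
"text/example; charset=iso-8859-1" is rejected rather than silently honored.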

> (This is not an academic question - a W3C group I'm involved with is about to
> submit a registration for a UTF-8 only text/... media type)

Does this type actually meet the criteria for text specified in RFC 2046
section 4.1? I rather suspect it doesn't. If not, it really has no business
being a text subtype, and all of this is moot.

				Ned