Re: [apps-discuss] Feedback about "Update to MIME regarding Charset Parameter Handling in Textual Media Types"

Henri Sivonen <hsivonen@iki.fi> Thu, 23 February 2012 09:51 UTC

From: Henri Sivonen <hsivonen@iki.fi>
To: apps-discuss@ietf.org
Cc: Anne van Kesteren <annevk@opera.com>

On Thu, Feb 23, 2012 at 7:53 AM, "Martin J. Dürst"
<duerst@it.aoyama.ac.jp> wrote:
> I'm not opposed to this in principle, but I'm worried that as written, it
> will lead to more and more unnecessarily different 'algorithms' and
> heuristics. The spec should make clear that where possible, established
> patterns should be followed to reduce variety.

I'm suggesting that new stuff be UTF-8 only.

For old stuff, everything is a bit different. Off the top of my head:
* text/javascript uses HTTP charset or the BOM or declaration from the
  referrer (the charset="" attribute on <script>) or the encoding of the
  referrer
* text/css uses what text/javascript uses or an in-band declaration
* text/plain uses what I described in my message (well, strictly what I
  said was a synthesis of what Gecko, IE and WebKit do; Gecko will do it
  when I get around to changing the precedence of the BOM to work as in
  WebKit and IE)
* text/html uses what text/plain uses or an in-band declaration
* text/xml uses HTTP charset or the BOM or an in-band declaration

text/cache-manifest is sane and always uses UTF-8 (HTTP charset is ignored).

For new stuff, I'd much rather recommend the text/cache-manifest
pattern than the BOM/HTTP charset/in-band commonality of text/css,
text/html and text/xml.
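
To make the text/cache-manifest pattern concrete, here is a minimal
sketch of a "UTF-8 only" decoder. The function name and signature are
hypothetical. Stripping a leading UTF-8 BOM and replacing malformed
sequences with U+FFFD are assumptions of this sketch, not something
stated above. The point is only that the HTTP charset parameter is
accepted and then deliberately ignored.

```python
import codecs
from typing import Optional


def decode_utf8_only(body: bytes, http_charset: Optional[str] = None) -> str:
    """Decode a payload as UTF-8 no matter what the HTTP charset says.

    http_charset is accepted only to show that it is ignored on purpose.
    """
    # Assumption: a leading UTF-8 BOM is dropped rather than kept as U+FEFF.
    if body.startswith(codecs.BOM_UTF8):
        body = body[len(codecs.BOM_UTF8):]
    # Assumption: malformed byte sequences become U+FFFD instead of
    # raising an error.
    return body.decode("utf-8", errors="replace")
```

With this shape there is nothing to sniff and nothing to configure: a
mislabeled charset="ISO-8859-1" on the response does not change the
result.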

>> 4. Determining the character encoding for text/plain
>
> This looks like a very long algorithm. It may work pretty well in a Web
> context (definitely better than a US-ASCII default), but what about other
> contexts (e.g. mail)?

I don't see why mail would need to be different. There are just some
steps in the algorithm (same-origin navigation and XMLHttpRequest or
similar) that never apply to mail.

>> If the entity is being loaded into a browsing context and is being
>> fetched from a location from which an entity has been loaded before
>> and the previous character encoding has been cached, the character
>> encoding is the cached encoding. Terminate these steps.
>>
>> If the entity is being loaded via a non-browsing context mechanism
>> (such as XMLHttpRequest) that defines a fallback encoding, use that
>> encoding. Terminate these steps.
>>
>> Otherwise, the character encoding is a configuration-dependent
>> encoding. The default configuration SHOULD depend on the locale of the
>> user agent according to the table given in step 8 in [5]. Terminate
>> these steps.
>
>
> What is missing here is that there is sometimes a need for user intervention
> if all else fails. While it may be possible to look at this as something
> outside of this algorithm, or as something subsumed under "configuration" or
> whatever, I'm afraid that the current text will give many implementers the
> impression that user overrides are not allowed. So I suggest adding
> something like:
>
>   In an interactive context, the user may occasionally want to
>   override the character encoding determined by this algorithm.

Right. User override should come right after the BOM steps.
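
As a rough sketch of that ordering (BOM first, then the user override,
then everything else), something like the following would do. All
names here are hypothetical. The set of BOMs checked and the
"windows-1252" stand-in for the remaining steps are assumptions of the
sketch, not taken from the draft.

```python
import codecs
from typing import Callable, Optional

# Assumption: only these three BOMs are recognized in this sketch.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def sniff_bom(body: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    return None


def choose_encoding(
    body: bytes,
    user_override: Optional[str] = None,
    remaining_steps: Callable[[], str] = lambda: "windows-1252",
) -> str:
    """Pick an encoding: BOM, then user override, then the other steps."""
    bom_encoding = sniff_bom(body)
    if bom_encoding is not None:
        return bom_encoding
    # The user override sits right after the BOM steps, ahead of
    # HTTP charset, cached encodings and locale defaults.
    if user_override is not None:
        return user_override
    # Placeholder for the rest of the algorithm.
    return remaining_steps()
```

So a BOM still wins over the user's choice, but the user's choice wins
over whatever the rest of the algorithm would have picked.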

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
