Re: Call for Adoption: draft-reschke-rfc54987bis

"Poul-Henning Kamp" <> Thu, 01 October 2015 06:36 UTC

To: Mark Nottingham <>
cc: HTTP Working Group <>
From: Poul-Henning Kamp <>
Date: Thu, 01 Oct 2015 06:33:06 +0000

In message <>, Mark Nottingham writes:

>We're belatedly adopting this; Julian asked for a breather while he
>finished other work, and now he's ready to commence.

I think adopting the draft is a good idea.

But I find some bits of the low level mechanics proposed troublesome.

For instance, it worries me a lot to use '*' as a magic marker in
fields which are historically thrown around fast and loose in all
sorts of programming environments where it may or may not be a
metacharacter.
Can we find a less overloaded, preferably non-meta, character?

If we can find two less overloaded characters, one can indicate
UTF-8, and the other that the charset is explicitly specified.

Judging from experience, these headers are going to vary a lot, so
if we can shave 5 characters off their length in the usual case,
that's a tangible benefit.

Something like:

    UTF-8 implied:

	foo: bar; title<='en'%C2%A3%20rates

    Charset explicitly specified:

	foo: bar; title>=iso-8859-1'en'%A3%20rates

(Where I'm not specifically proposing '<' or '>' but merely using them
for the example.)
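
To make the two forms above concrete, here is a minimal Python sketch of
how a receiver might decode them.  It assumes the placeholder markers
'<' (UTF-8 implied) and '>' (charset explicit) exactly as written in the
example; the function name decode_param is hypothetical and nothing here
is taken from the draft itself.

```python
from urllib.parse import unquote_to_bytes


def decode_param(param):
    """Decode 'name<=...' (UTF-8 implied) or 'name>=...' (explicit charset).

    Returns (name, charset, language, text).  The '<' and '>' markers are
    only the placeholders used in the example above, not a proposal.
    """
    name_part, sep, value = param.partition("=")
    if not sep or name_part[-1:] not in ("<", ">"):
        raise ValueError("no extended-value marker found")
    marker = name_part[-1]
    name = name_part[:-1].strip()

    if marker == "<":
        # UTF-8 implied: value looks like 'lang'percent-encoded-text
        if not value.startswith("'"):
            raise ValueError("expected leading quote before language tag")
        charset = "utf-8"
        language, closing, encoded = value[1:].partition("'")
    else:
        # Explicit charset: value looks like charset'lang'percent-encoded-text
        charset, _, rest = value.partition("'")
        language, closing, encoded = rest.partition("'")

    if not closing:
        raise ValueError("malformed extended value")

    # Percent-decode to raw bytes first, then interpret them in the charset.
    text = unquote_to_bytes(encoded).decode(charset)
    return name, charset, language, text
```

Both example headers above decode to the same text ("£ rates"); the
UTF-8 form is five characters shorter on the wire.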

But going even further:  I have a hard time coming up with a credible
(ie: non-demented) scenario for having multiple different charsets
in the same header.

Therefore I would prefer to put the charset at the front of the headers:

    UTF-8 implied:

	foo: = bar; title='en'%C2%A3%20rates

    Charset explicitly specified:

	foo: =iso-8859-1= bar; title='en'%A3%20rates
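
A rough Python sketch of how a receiver could peel such a prefix off a
header value.  The exact grammar is an assumption read off the two
examples above (a lone '=' followed by a space means UTF-8, '=name='
names the charset), and split_charset_prefix is a hypothetical helper
name.

```python
def split_charset_prefix(value):
    """Split a header value into (charset, rest).

    Returns (None, value) when no prefix is present, i.e. a header
    written under the old convention.
    """
    value = value.strip()
    if not value.startswith("="):
        return None, value

    body = value[1:]
    if body == "" or body[0] == " ":
        # Lone '=': UTF-8 is implied.
        return "utf-8", body.strip()

    # '=charset=' form: everything up to the next '=' is the charset name.
    charset, sep, rest = body.partition("=")
    if not sep:
        raise ValueError("unterminated charset prefix")
    return charset.strip().lower(), rest.strip()
```

Because the charset is known before any parameter is looked at, the
rest of the value can be handed to ordinary parameter parsing with a
single charset in hand.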

Some advantages:

* Very likely to break in the majority of code which
  doesn't understand the new convention.  (ref: "Postel Was Wrong")

* Header compression algorithms can be smart about it.

* Charset can be converted transparently by proxies, servers,
  frameworks etc.

And we can go even further if we want to:

   If header contains a charset spec (as above) the rest of the
   header can use all byte values from the range [0x20-0xff] and
   %xx encoding/decoding SHALL NOT be performed.
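
As an illustration of that last rule, a minimal Python sketch under the
same assumed prefix grammar as the examples above ('= ' for UTF-8,
'=name=' for an explicit charset).  The function name decode_raw_header
is hypothetical.

```python
def decode_raw_header(raw):
    """Decode a raw header value (bytes) that carries a charset spec.

    Bytes 0x20-0xff are taken literally and %xx sequences are NOT
    decoded, per the rule sketched above.
    """
    if not raw.startswith(b"="):
        raise ValueError("header carries no charset spec")

    body = raw[1:]
    if body[:1] in (b"", b" "):
        # Lone '=': UTF-8 is implied.
        charset, rest = "utf-8", body
    else:
        name, sep, rest = body.partition(b"=")
        if not sep:
            raise ValueError("unterminated charset prefix")
        charset = name.decode("ascii")

    # Only byte values in [0x20-0xff] are allowed; reject control bytes.
    if any(b < 0x20 for b in rest):
        raise ValueError("control byte in header value")

    # No percent-decoding: the bytes are interpreted directly in the charset.
    return rest.strip().decode(charset)
```

Note that a literal "%20" in such a header stays the three characters
"%20" after decoding.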

Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.