Re: [ietf-types] The application/www-form-urlencoded format

"Anne van Kesteren" <> Sun, 26 September 2010 20:40 UTC

To: Bjoern Hoehrmann <>
References: <> <op.vjmuz10364w2qv@anne-van-kesterens-macbook-pro.local> <>
Date: Sun, 26 Sep 2010 22:40:37 +0200
From: Anne van Kesteren <>
Organization: Opera Software
Message-ID: <op.vjnqdzm664w2qv@anne-van-kesterens-macbook-pro.local>
In-Reply-To: <>
User-Agent: Opera Mail/10.62 (MacIntel)
Subject: Re: [ietf-types] The application/www-form-urlencoded format

On Sun, 26 Sep 2010 22:19:40 +0200, Bjoern Hoehrmann <> wrote:
> * Anne van Kesteren wrote:
>> I think it is unfortunate it still allows encoding in various ways. So
>> while things could be more readable, as you pointed out in the past, user
>> agents are still allowed to obscure most everything.
> I do think that encoder implementers can make reasonable choices about
> that. If an implementer decides it's best to escape, say, the zero-width
> space character because it's invisible then I see nothing wrong with
> that. If another implementer decides to not escape it, that's fine too.

I suppose. And I guess if we ever decide to implement any of this we could  
make more specific requirements in the forms section of HTML for that  
class of implementors.
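
To make the encoder-choice point concrete, here is a minimal Python sketch. It uses the standard library's percent-encoding helpers as a stand-in, not the grammar from the draft, and the variable names are made up for illustration. One encoder escapes the zero-width space and another leaves it alone; a decoder recovers the same value either way.

```python
from urllib.parse import quote, unquote

ZWSP = "\u200b"  # zero-width space (invisible when rendered)
value = "a" + ZWSP + "b"

# Encoder A (hypothetical): percent-escape the invisible character.
escaped = quote(value, safe="")

# Encoder B (hypothetical): leave the character as-is.
literal = value

# Both forms decode to the same original value.
decoded_escaped = unquote(escaped)
decoded_literal = unquote(literal)

print(escaped)
print(decoded_escaped == decoded_literal == value)
```

The two outputs differ only in how readable the encoded form is, which is the trade-off being discussed.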

After having thought about it some more I am doubting this would be a very  
useful addition though. The benefit over application/x-www-form-urlencoded  
seems marginal. Not high enough to warrant the cost.

>> The bug about + seems to still be there. Escapes are first decoded and
>> then + is replaced with U+0020. Also application/x-www-form-urlencoded is
>> on its way to being standardized as part of HTML5 now.
> I don't think there is anything that "HTML5" could standardize about it,
> other than define how form submission using that type should be
> implemented in HTML implementations. As for the +, the text refers to the
> abstract syntax tree you get when parsing a string using the grammar, so
> an escaped plus sign is no instance of `plus`. I take it that's one of
> the text's rough edges I should smooth out.

I guess. I'm not very good at reading grammar / prose combinations. (I  
thought that after the escape was handled you would have instances of  
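
The two readings of the + handling can be sketched in Python as follows. This is only an illustration of the ordering question using the standard library's `unquote`, not an implementation of the draft's grammar, and both function names are hypothetical.

```python
from urllib.parse import unquote

def decode_escapes_first(s):
    # Reading described as the bug: percent-decode first, then replace
    # every + with U+0020. An escaped plus (%2B) ends up as a space.
    return unquote(s).replace("+", " ")

def decode_plus_first(s):
    # Other reading: only a literal + in the input becomes U+0020;
    # percent-decoding happens afterwards, so %2B survives as "+".
    return unquote(s.replace("+", " "))

sample = "a+b%2Bc"
print(repr(decode_escapes_first(sample)))
print(repr(decode_plus_first(sample)))
```

With the first order there is no way to transmit a literal plus sign, which is why the order matters.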

>> Most other such formats ignore a leading U+FEFF.
> I can't think of any format that's always UTF-8 encoded yet allows for
> it, but anyway, treating it as part of a name is simpler than ignoring
> it; I think people are more likely to implement that correctly than if
> I were to require recognizing it.

text/event-stream and text/cache-manifest.
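
For comparison, a small Python sketch of the two behaviours for a leading U+FEFF. The parser here is a deliberately simplified name=value splitter with no percent-decoding; it is an assumption made for illustration, not the draft's grammar, and the function names are hypothetical.

```python
BOM = "\ufeff"

def strip_leading_bom(text):
    # Drop a single leading U+FEFF, if present.
    return text[1:] if text.startswith(BOM) else text

def parse_pairs(text, ignore_bom):
    # Simplified parser: split on "&", then on the first "=".
    if ignore_bom:
        text = strip_leading_bom(text)
    pairs = []
    for field in text.split("&"):
        name, _, value = field.partition("=")
        pairs.append((name, value))
    return pairs

data = BOM + "name=value"
print(parse_pairs(data, ignore_bom=True))   # BOM ignored
print(parse_pairs(data, ignore_bom=False))  # BOM kept as part of the first name
```

When the U+FEFF is not ignored, the first name silently differs from what the author typed, which is the case the other formats avoid.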

Anne van Kesteren