Re: [ietf-types] The application/www-form-urlencoded format

"Anne van Kesteren" <annevk@opera.com> Sun, 26 September 2010 09:42 UTC

To: ietf-types@iana.org, Bjoern Hoehrmann <derhoermi@gmx.net>
References: <k1os96p03o78p78490hei104biadpiepit@hive.bjoern.hoehrmann.de>
Date: Sun, 26 Sep 2010 11:22:40 +0200
From: Anne van Kesteren <annevk@opera.com>
Organization: Opera Software
Message-ID: <op.vjmuz10364w2qv@anne-van-kesterens-macbook-pro.local>
In-Reply-To: <k1os96p03o78p78490hei104biadpiepit@hive.bjoern.hoehrmann.de>

On Sat, 25 Sep 2010 23:14:39 +0200, Bjoern Hoehrmann <derhoermi@gmx.net>  
wrote:
>   http://tools.ietf.org/html/draft-hoehrmann-urlencoded -- the draft
> describes the application/www-form-urlencoded format, a variant of the
> application/x-www-form-urlencoded format first described in RFC 1866.

I think it is unfortunate that it still allows the same data to be
encoded in various ways. So while things could be more readable, as you
pointed out in the past, user agents are still allowed to obscure almost
everything.


> RFC 1866 recommended that implementations also accept ";" as separator,
> which the draft permits and prefers, and RFC 1866 failed to define the
> character encoding to use, the draft addresses that by mandating UTF-8,
> which is universally recommended these days. As the ampersand will be
> handled as it would be for the RFC 1866 format, I believe the formats
> are similar enough that using the very similar name is justified. [1]
>
> The draft probably still has some rough edges in the prose but the
> format is not going to change. I believe it addresses the feedback I
> got since the first draft published four years ago; public feedback at
> http://lists.w3.org/Archives/Public/www-archive/2006Sep/thread.html#msg30

The bug about + still seems to be there. Escapes are decoded first and
only then is + replaced with U+0020, so an escaped + (%2B) ends up as a
space as well. Also, application/x-www-form-urlencoded is on its way to
being standardized as part of HTML5 now.
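
To illustrate the ordering problem, here is a minimal Python sketch. The
two function names are made up for this example and the code is not taken
from the draft; it only contrasts the two possible orders of operations.

```python
from urllib.parse import unquote

def decode_escapes_first(value: str) -> str:
    # Order criticized above: percent-decode, then map every "+" to U+0020.
    # A "+" that was escaped as %2B is turned into a space as well.
    return unquote(value).replace("+", " ")

def decode_plus_first(value: str) -> str:
    # Alternative order: map literal "+" to U+0020, then percent-decode.
    # An escaped %2B survives as a real "+".
    return unquote(value.replace("+", " "))

encoded = "1%2B1+equals+2"
print(repr(decode_escapes_first(encoded)))  # '1 1 equals 2'
print(repr(decode_plus_first(encoded)))     # '1+1 equals 2'
```

With the first order there is no way left to transmit a literal "+".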


>       Note: The media type does not have a 'charset' parameter, it
>       is incorrect to specify one and to associate any significance to
>       it if specified. The character encoding is always UTF-8. The
>       Unicode encoding form signature is not supported; a leading
>       U+FEFF character will be considered part of a <name>.

Most other such formats ignore a leading U+FEFF.
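
A rough Python sketch of the difference, assuming a hypothetical
parse_pairs helper that only splits on "&" (the ";" separator and error
handling are left out for brevity):

```python
from urllib.parse import unquote_plus

BOM = "\ufeff"

def parse_pairs(body: bytes, strip_bom: bool):
    # Python's plain "utf-8" codec keeps a leading U+FEFF in the result.
    text = body.decode("utf-8")
    if strip_bom and text.startswith(BOM):
        # What most other formats do: ignore the signature.
        text = text[len(BOM):]
    pairs = []
    for field in text.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs

body = b"\xef\xbb\xbfq=test"
print(parse_pairs(body, strip_bom=False))  # name is "\ufeffq"
print(parse_pairs(body, strip_bom=True))   # name is "q"
```

Under the quoted note the first name is U+FEFF followed by "q", so a
lookup for "q" would not find it.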


> [1] The regular expression to match both names is slightly simpler if
>     they only differ in the "x-", and it seems fitting to standardize
>     an "x-" type by removing the "x-", but if there is a good argument
>     why using similar names is a bad idea, I am also quite open to name
>     it application/name-value-pairs or something like that instead.

Well, application/x-www-form-urlencoded is not going away anytime soon.
It is what you get for <form> by default. That some consider it
non-standard is just semantics. Having said that, I do not mind dropping
the x- for an improved version of that format.
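
For what it is worth, the regular expression from footnote [1] could look
roughly like this in Python. The is_form_type helper is hypothetical, and
the sketch only checks the type name; it says nothing about the two
formats being processed identically.

```python
import re

# Matches both names; they differ only in the optional "x-".
FORM_TYPE = re.compile(r"application/(?:x-)?www-form-urlencoded", re.IGNORECASE)

def is_form_type(media_type: str) -> bool:
    # Drop any parameters (";charset=..." and the like) before comparing.
    essence = media_type.split(";", 1)[0].strip()
    return FORM_TYPE.fullmatch(essence) is not None

print(is_form_type("application/x-www-form-urlencoded"))  # True
print(is_form_type("application/www-form-urlencoded"))    # True
print(is_form_type("application/name-value-pairs"))       # False
```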


-- 
Anne van Kesteren
http://annevankesteren.nl/