Re: [http-state] Ticket 11: Character encoding for non-ASCII cookies values

Adam Barth <ietf@adambarth.com> Wed, 03 March 2010 05:47 UTC

In-Reply-To: <4BF4ABE3-7699-4D75-9E3C-48871CBA13E8@gbiv.com>
References: <5c4444771003021624qc0b00cet27e348cb6d023b08@mail.gmail.com> <4BF4ABE3-7699-4D75-9E3C-48871CBA13E8@gbiv.com>
From: Adam Barth <ietf@adambarth.com>
Date: Tue, 02 Mar 2010 21:46:32 -0800
Message-ID: <5c4444771003022146h1e4dfc3fi4196b5697725ebc3@mail.gmail.com>
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: http-state <http-state@ietf.org>
Subject: Re: [http-state] Ticket 11: Character encoding for non-ASCII cookies values

On Tue, Mar 2, 2010 at 5:08 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
> On Mar 2, 2010, at 4:24 PM, Adam Barth wrote:
>> The draft treats the cookie values as opaque octets throughout for use
>> on the wire.  I've added a SHOULD-level requirement to use UTF-8 when
>> converting the octets to characters (e.g., for use in the user agent's
>> user interface).
>>
>> Given that the encoding issue doesn't appear to affect
>> interoperability on the wire, I think a SHOULD-level recommendation is
>> appropriate here.  If specific APIs (e.g., document.cookie) have more
>> specific needs, they can add additional requirements.
>>
>> Thoughts?
>
> I think that is fine if it is made clear that UTF-8 is only applicable
> after the field value is extracted from the rest of the message.  I.e.,
> the HTTP parser must be ASCII-based and thus not vulnerable to
> invalid Unicode byte sequences.

Hopefully that is clear in the draft.  The encoding is mentioned
at the end of the serialization section (which is two sections after
the parsing section).
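
To illustrate the separation discussed above, here is a minimal sketch in
Python.  It is not taken from the draft: the function names
parse_cookie_header and value_for_display are hypothetical, the parser is
deliberately simplified, and replacing invalid sequences with U+FFFD is
just one possible policy.  The point is that splitting happens on ASCII
delimiters over raw octets, and UTF-8 is applied only to an extracted
value, so invalid byte sequences cannot disturb the parse.

```python
def parse_cookie_header(raw: bytes) -> list[tuple[bytes, bytes]]:
    """Split a Cookie header value into (name, value) pairs of raw octets.

    Only the ASCII delimiters ";" and "=" are examined; no character
    decoding happens at this stage.
    """
    pairs = []
    for part in raw.split(b";"):
        name, sep, value = part.partition(b"=")
        if not sep:
            continue  # simplification: skip segments that have no "="
        pairs.append((name.strip(b" \t"), value.strip(b" \t")))
    return pairs


def value_for_display(value: bytes) -> str:
    """Decode an already-extracted value as UTF-8, e.g. for a UI.

    Assumed policy: invalid sequences become U+FFFD rather than raising.
    """
    return value.decode("utf-8", errors="replace")


# One valid UTF-8 value, one invalid byte sequence, one plain ASCII value.
header = b"name=caf\xc3\xa9; broken=\xff\xfe; sid=abc123"
cookies = parse_cookie_header(header)
shown = [value_for_display(v) for _, v in cookies]
```

Every octet of a UTF-8 multi-byte sequence, and every invalid octet in the
example, is 0x80 or above, so none can be mistaken for ";" or "=".  The
stored values stay as octets for use on the wire, and only the display
path sees decoded characters.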

Adam