Re: [http-state] non-ASCII cookie values (was Re: Closing Ticket 3: Public Suffixes)

Adam Barth <> Wed, 03 February 2010 00:30 UTC

From: Adam Barth <>
Date: Tue, 02 Feb 2010 16:30:46 -0800
Message-ID: <>
To: Maciej Stachowiak <>

On Tue, Feb 2, 2010 at 4:27 PM, Maciej Stachowiak <> wrote:
> On Feb 1, 2010, at 10:54 PM, Adam Barth wrote:
>> On Mon, Feb 1, 2010 at 9:07 PM, David Morris <> wrote:
>>> On Mon, 1 Feb 2010, Maciej Stachowiak wrote:
>>>> But there are two ways in which it matters:
>>>> 1) A header with non-ASCII bytes gets set via Set-Cookie, then read through a JS API such as document.cookie. document.cookie gives a UTF-16 encoded string, so at this point the server has to decide how to interpret non-ASCII bytes in the cookie value.
>>>> 2) If you set a cookie via document.cookie and include non-ASCII characters in the value, what bytes get sent?
>>> Seems to me that the platform providing the document.cookie object is
>>> responsible for making any value placed in the cookie: header correct.
>> That seems like a wise course of action.  This document understands
>> octet sequences.  HTML5 should define how to translate those octet
>> sequences into JavaScript characters (when reading document.cookie)
>> and how to perform the reverse translation (when writing
>> document.cookie).
> HTML5 does not spec this detail and apparently expects the cookie spec to expose a string interface, not an octet-sequence interface:
> I think defining conversion between octet sequence and string could plausibly go in either spec. I think the cookie spec would be a better place, because other string-oriented interfaces from web platform specs to cookies (if any) should probably use the same conversion, and browser UI for managing cookies should probably use the same conversion too. So it would be a useful thing to define even if it's not used by the network protocol. However, if you don't think it should be in the cookie spec, I can file a bug against HTML5.

I'm happy to do whatever you and Ian think is best.  If the algorithm
needed for HTML5 is sensible enough to be re-used by other string
consumers, that would be a reason to put it in this document.  If the
algorithm is nonsensical, we might be doing other applications a
service by keeping it HTML-specific.
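
To make the question concrete, here is a minimal Python sketch of the two
conversions under discussion: octets to string (reading document.cookie)
and string to octets (writing it). The helper names and the two candidate
encodings (UTF-8 and ISO-8859-1) are assumptions for illustration only;
neither this document nor HTML5 specifies them, and real browsers may
behave differently.

```python
# Hypothetical helpers: nothing here is defined by the cookie draft or HTML5.

def octets_to_string(value: bytes, encoding: str) -> str:
    """Decode cookie-value octets for a string API such as document.cookie."""
    # errors="replace" yields U+FFFD for undecodable bytes instead of raising.
    return value.decode(encoding, errors="replace")


def string_to_octets(value: str, encoding: str) -> bytes:
    """Encode a string written through a string API back into octets."""
    return value.encode(encoding)


# The same octets on the wire ("caf" followed by 0xC3 0xA9) ...
raw = b"caf\xc3\xa9"

# ... produce different strings depending on the assumed encoding.
as_utf8 = octets_to_string(raw, "utf-8")         # 4 characters
as_latin1 = octets_to_string(raw, "iso-8859-1")  # 5 characters

# ISO-8859-1 maps every octet to exactly one code point, so it round-trips
# arbitrary octet sequences; UTF-8 decoding does not have that property.
round_trip = string_to_octets(as_latin1, "iso-8859-1")
```

Whichever spec ends up owning the algorithm, the point of the sketch is
that the choice of encoding changes what a script sees, so string
consumers that share one conversion will at least agree with each other.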