Re: [http-state] comments on -07

Adam Barth <ietf@adambarth.com> Thu, 22 April 2010 05:17 UTC


Hi Dan,

Thanks for your detailed comments.  I've adopted the vast majority of
your recommendations.  See below for details.

Thanks!
Adam


On Mon, Apr 19, 2010 at 1:06 PM, Dan Winship <dan.winship@gmail.com> wrote:
>> 1.  Introduction
>
> should explain the relation to 2109 and 2965?

Done:

[[
      <t>Prior to this document, there were at least three descriptions of
      cookies: the so-called "Netscape cookie specification," RFC 2109 <xref
      target="RFC2109"/>, and RFC 2965 <xref target="RFC2965"/>.  However, none
      of these documents describe how the Cookie and Set-Cookie headers are
      actually used on the Internet.  By contrast, this document attempts to
      specify the syntax and semantics of these headers as they are actually
      used on the Internet.</t>
]]

> And are we obsoleting 2965 too?

Nope.  2965 discusses the Set-Cookie2 and Cookie2 headers, which are
different (although related).  Some folks are using Cookie2, so we
don't want to pull the rug out from under them.

>> The scope indicates the
>> maximum amount of time the user agent should retain the cookie, to
>> which servers the user agent should return the cookie, and for which
>> protocols the cookie is applicable.
>
> would read a little better if the 2nd and 3rd clauses were flipped to
> be parallel with the first. "the maximum amount of time the user agent
> should retain the cookie, the servers to which the user agent should
> return the cookie, and the protocols for which the cookie is
> applicable".

Done.

>> The OWS (optional whitespace) rule
>
> ...is not explicitly defined either here or in RFC 5234.

I believe it is defined by this paragraph (which I've stolen from HTTPbis):

[[
        <t>The OWS (optional whitespace) rule is used where zero or more
        linear whitespace characters MAY appear. OWS SHOULD either not be
        produced or be produced as a single SP character. Multiple OWS
        characters that occur within field-content SHOULD be replaced with a
        single SP before interpreting the field value or forwarding the
        message downstream.</t>
]]

Is there some other paragraph from HTTPbis that I should steal instead?

> It only ends up being used in two rules:
>
>> set-cookie-header = "Set-Cookie:" OWS set-cookie-string OWS
>> cookie-header = "Cookie:" OWS cookie-string OWS
>
> And really, since set-cookie-header is only used for production, not
> parsing, it should just be
>
>     set-cookie-header = "Set-Cookie:" SP set-cookie-string
>
> and then for cookie-header, we can just do
>
>     cookie-header = "Cookie:" WSP* cookie-string
>
> and put the notes about correct spacing in the text near there.

My understanding from Julian Reschke is that using OWS in this way is
the proper way to give the grammar for HTTP headers in the new "bis"
world.  If you can convince him that we should make these changes, I'm
happy to do it.  Honestly, this part is just a mysterious incantation
to me.

>> Servers MAY return a Set-Cookie response header with any response.
>
> Even non-2xx responses? Have we tested that?

As far as I know, yes.  The test suite has lots of 302 redirects, for example.

>> User agents SHOULD send a Cookie request header, subject to other
>> rules detailed below, with every request.
>
> "subject to other rules detailed below" makes this rule basically a
> no-op, since user agents are allowed to discard or ignore cookies for
> any reason they want to.

Removed.

> Assuming we keep the sentence, should we say "User agents that support
> cookies" or something like that? We said before "The terms user agent,
> client, server, proxy, and origin server have the same meaning as in
> the HTTP/1.1 specification", which implies that "User agents SHOULD
> send a Cookie request header" applies to ALL user agents... At a
> minimum, we should be clear here that servers can't assume that all
> clients will support cookies.

I don't think we need it.  User agent implementors can decide whether
or not to implement cookies without us giving them that advice.

>> An origin server MAY include multiple Set-Cookie header fields in a
>> single response.  Note that an intervening gateway MUST NOT fold
>> multiple Set-Cookie header fields into a single header field.
>
> "Gateways that want to be transparent to cookies MUST NOT fold
> multiple Set-Cookie header fields into a single header field."

Done.
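
To illustrate why folding is a problem: the Expires attribute contains
a comma, so a folded header cannot be split back apart reliably.  A
minimal Python sketch, where fold_headers and naive_unfold are
hypothetical helpers (not taken from the draft) and the SID value is
made up for illustration:

```python
def fold_headers(values):
    """Fold several header field values into one, as a generic gateway might."""
    return ", ".join(values)


def naive_unfold(folded):
    """Split a folded value on commas, as a generic HTTP list parser would."""
    return [part.strip() for part in folded.split(",")]


set_cookies = [
    "SID=abc123; Path=/",  # illustrative value only
    "lang=; Expires=Sun, 06 Nov 1994 08:49:37 GMT",
]

folded = fold_headers(set_cookies)
recovered = naive_unfold(folded)
# The comma inside the Expires date yields an extra, bogus element,
# so the two original cookies cannot be recovered.
```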

>> If a server sends multiple responses containing Set-Cookie headers
>> concurrently to the user agent (e.g., when communicating with the
>> user agent over multiple sockets), these responses create a "race
>> condition" that can lead to unpredictable behavior.
>
> might be better somewhere else?

I've moved this to the section where we talk about all the gotchas
with Set-Cookie.

>> == Server -> User Agent ==
>> Set-Cookie: lang=; Expires=Sun, 06 Nov 1994 08:49:37 GMT
>>
>> == User Agent -> Server ==
>> (No Cookie header)
>
> you should either state explicitly that each of the examples is
> independent of the others, or else redo them to be cumulative (eg,
> still returning the SID cookie in this example). (I think cumulative
> would be better.)

Fixed the example to be cumulative (this required tweaking the
previous example too.)

>> Informally, the Set-Cookie response header comprises the token Set-
>> Cookie:
>
> Don't say "token" since "Set-Cookie:" is not an HTTP token.
>
>    the header name "Set-Cookie", followed by a ":" and the cookie
>    value
>
> or something.
>
>> Each cookie begins with a name-value-
>> pair, followed by zero or more attribute-value pairs.
>
> no hyphen between "name-value" and "pair"

Fixed.

>> cookie-pair       = cookie-name "=" cookie-value
>> cookie-name       = token
>> cookie-value      = token
>
> Did we come to a consensus on restricting cookie-value to "token"? It
> seems a little limiting (though I guess the server can always just
> encode the data...)

I'm not sure we discussed it in detail.  I originally picked token
because that's what's used by the other cookie specs.  Note that we're
not actually restricting the server to using only token, we're just
recommending it.

> I had a thought at one point that we could do something like
>
>    cookie-value         = token / cookie-quoted-string
>    cookie-quoted-string = DQUOTE *(cqdtext / cookie-quoted-pair) DQUOTE
>    cqdtext              = WSP / %x21 / %x23-3A / %x3C-5B / %x5D-7E
>    cookie-quoted-pair   = "\" ( %x20-3A / %x3C-7E)
>
> ie, you can use quoted strings as long as they don't contain
> semicolons, such that clients-that-treat-quoted-strings-specially and
> clients-that-just-treat-cookie-value-as-a-pile-of-bytes would treat
> them exactly the same way.

That seems like way overkill.  If there's a production from HTTP that
we should reference, then I'm open to that.  Ideally we should make
this as "HTTP-like" as possible.

>> cookie-av         = expires-av / max-age-av / domain-av /
>>                     path-av / secure-av / httponly-av
>
>  / extension-av ?

I don't think we should recommend that servers send unspecified
attribute values.  They can, of course, and user agents are required
to process them correctly, but why would we recommend that they send
something with no semantics?

>> expires-av        = "Expires" "=" sane-cookie-date
>
> it may be worth pointing out explicitly in the running text that
> expires-av is NOT quoted.

Why?  Is there some reason folks would think that rfc1123-date would be quoted?

>> path-value        = <abs_path, as defined in RFC 2616>
>
> except that it can't contain any ";"s

Frown.  Good catch.  Fixed.

>> To maximize compatibility with user agents, servers that wish to
>> store non-ASCII data in a cookie-value SHOULD encode that data using
>> a printable ASCII encoding, such as base64.
>
> Base64 uses "/" and "=", which are not allowed in tokens, and thus not
> allowed in cookie-value according to the current grammar. See, I told
> you token was too restrictive. :-)

Haha, ok.  Fixed.
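
For anyone who wants to see the conflict concretely, here is a small
Python sketch that checks base64 output against the RFC 2616 token
rule.  The is_token helper is hypothetical and written only for this
illustration:

```python
import base64

# Separators from the RFC 2616 "token" rule.  A token is one or more
# printable ASCII characters excluding these.
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')


def is_token(value):
    """Return True if value matches the RFC 2616 token production."""
    return bool(value) and all(
        0x21 <= ord(ch) <= 0x7E and ch not in SEPARATORS for ch in value
    )


padded = base64.b64encode(b"a").decode("ascii")              # ends in "=" padding
slashes = base64.b64encode(b"\xff\xff\xff").decode("ascii")  # consists of "/"

print(padded, is_token(padded))
print(slashes, is_token(slashes))
```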

>> NOTE: Some user agents represent dates using 32-bit integers.  Some
>> of these user agents might contain bugs that cause them to process dates
>> after the year 2038 incorrectly.  Servers wishing to interoperate
>> with these user agents might wish to use dates before 2038.
>
> Such user agents might also wrap around on the other side and treat
> dates before 1902 as being in the future, so we might want to suggest
> a lower limit on pre-expired dates too (1970?). Maybe both of these
> belong in 4.1.2.1 though?

We could, but I'm not sure it's a big enough problem to worry about.
We have seen examples of distant dates in the future, but I haven't
seen examples of dates in the distant past.

> (Also, it's not really "32-bit integers", it's specifically "32-bit
> UNIX time_t values".)

Fixed.
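
For reference, the bounds of a signed 32-bit time_t can be computed
directly.  This is only a sketch of the arithmetic behind the 2038 and
1902 figures mentioned above, not text from the draft:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest and smallest values of a signed 32-bit time_t.
TIME_T_MAX = 2**31 - 1
TIME_T_MIN = -(2**31)

latest = EPOCH + timedelta(seconds=TIME_T_MAX)
earliest = EPOCH + timedelta(seconds=TIME_T_MIN)

print(latest.isoformat())    # 2038-01-19T03:14:07+00:00
print(earliest.isoformat())  # 1901-12-13T20:45:52+00:00
```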

>> NOTE: The syntax above allows whitespace around the U+003D ("=")
>> characters.
>
> Actually, it doesn't, since we're using RFC 5234 ABNF, not RFC 2616
> crazy implied LWS ABNF.
>
> That means you also need an "SP" after the ";" in set-cookie-string.

Fixed.

>> and it expires at the end of the
>> current session (as defined by the user agent).
>
> We might want more discussion about that (maybe in the security
> considerations section?) In particular wrt mobile devices.

What did you have in mind?  It's really a vague concept that's up to
the user agent.

>>    WARNING: Not all user agents support the Max-Age attribute.  User
>>    agents that do not support the Max-Age attribute will retain the
>>    cookie for the current session only.
>
> Well, unless there's an Expires attribute. Move this after the "both
> Max-Age and Expires" paragraph, and change the second sentence to:
> "User agents that do not support the Max-Age attribute will determine
> the cookie lifetime based solely on the Expires attribute".

That's too complicated.  I just said that these user agents will ignore
the Max-Age attribute.

>> 4.1.2.3.  The Domain Attribute
>
> Since the 4.1.1 grammar limits Domain to "token", we should probably
> point out that this uses the IDNA A-label form.

Do you mean the subdomain production from
http://www.ietf.org/rfc/rfc3492.txt ?  I've made the requirements on
the value of the Domain attribute explicitly reference this production
now.

>> is ignored.)  If the server omits the Domain attribute, the user
>> agent will return the cookie only to the origin server.
>
> OK, I was annoyed that the Expires section didn't say anything about
> session cookies, but then decided, "well, I guess he mentioned it in
> 4.1.2 so he doesn't have to mention it here". But you mentioned
> origin-server-only cookies there too, and mention them again here. So
> maybe it would be nice to mention session cookies a second time too.
> (Maybe after the "both Max-Age and Expires" paragraph, have a "neither
> Max-Age nor Expires" paragraph.)

Done.

>>    host name.  For example, if example.com returns a Set-Cookie
>>    header without a Domain attribute, these user agents will
>>    erroneously send the cookie to www.example.com.
>
> add "as well" to the end?

Fixed.

>> The Path attribute limits the scope of the cookie to a set of paths.
>
> the most common usage of Path is to *un*limit the scope ("Path=/").
> How about:
>
>     The scope of the cookie also includes a set of paths. The Path
>     attribute can be used to set this explicitly; if the server omits
>     the Path attribute, the user agent will use the directory of the
>     Request-URI's path component as the default value.
>
>     The user agent will include the cookie in an HTTP request only if
>     the path portion of the Request-URI matches (or is a subdirectory
>     of) the cookie's Path attribute, where the U+002F ("/") character
>     is interpreted as a directory separator.

Done.
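
The matching rule in the second paragraph above can be sketched as
follows.  This is one reading of "the U+002F character is interpreted
as a directory separator", namely that a prefix only counts when it
ends on a "/" boundary.  The path_matches name is hypothetical and the
draft's exact algorithm may differ:

```python
def path_matches(request_path, cookie_path):
    """Return True if request_path equals cookie_path or lies beneath it."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Only match on a "/" boundary, so "/foo" does not match "/foobar".
        if cookie_path.endswith("/"):
            return True
        return request_path[len(cookie_path)] == "/"
    return False
```

With this reading, a cookie with Path=/foo is sent for /foo/bar.html
but not for /foobar, and a cookie with Path=/ is sent for every path.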

>> 2.  The "same-origin" policy implemented by many user agents does not
>>     isolate different paths within an origin.  For example, /foo/
>>     bar.html can read cookies with a Path attribute of "/baz" because
>>     they are within the "same origin".
>
> This is talking about javascript, right? It's pretty confusing if you
> don't know that. ("can read cookies" how?)
>
>     2.  The "same-origin" policy implemented by web browsers does not
>         isolate different paths within an origin. For example, a
>         script running on a page downloaded from /foo/bar.html can
>         read cookies from the browser's cookie store with a Path
>         attribute of "/baz", because they are within the "same
>         origin".
>
> or maybe just forward-ref the whole thing to the security
> considerations section.

I've changed this to a forward reference.  We added that text before
we had a security considerations section.

>> 4.1.2.6.  The HttpOnly Attribute
>>
>> The HttpOnly attribute limits the scope of the cookie to HTTP
>> requests.  In particular, the attribute instructs the user agent to
>> omit the cookie when providing access to its cookie store via "non-
>> HTTP" APIs (as defined by the user agent).
>
> That makes it almost uselessly vague. I'd change "(as defined by the
> user agent)" to "(such as JavaScript's document.cookie API)".

Fixed.  (Well, it's technically HTML's API.)

> And we should say somewhere *why* you'd want to do that? (Maybe in 8.5?)

I'm happy to put in some text if you'd like to suggest a coherent
reason.  It's very difficult to articulate precisely what security
benefit you get from HttpOnly cookies.  We might just end up with some
mushy "it sounds like a good idea" text.

>> 4.2.1.  Syntax
>>
>> The user agent returns stored cookies to the origin server in the
>> Cookie header.  If the server conforms to the requirements in this
>> section, the requirements in the next section will cause the user
>> agent to return a Cookie header that conforms to the following
>> grammar:
>
> "If the server conforms to the requirements in section 4.1, then the
> requirements in section 5 will cause the user agent to..."

Fixed.

>> 4.2.2.  Semantics
>
> is this, like 4.1.2, non-normative?

It is normative.  The Set-Cookie section isn't normative because the
semantics are much more complicated than what's explained there.  The
section on Cookie is actually correct because the Cookie header is
much, much simpler.

>> In particular,
>> if the Cookie header contains two cookies with the same name, servers
>> SHOULD NOT rely upon the order in which these cookies appear in the
>> header.
>
> maybe "In particular, if a user agent has two cookies with the same
> name (but different Path or Domain attributes) for a given server, the
> server SHOULD NOT assume that the two cookies will appear in the
> header in a particular (or even consistent) order."

I've adopted the spirit of your suggestion, but with different text.

>> The user agent MUST use the following algorithm to *parse a cookie-
>> date*:
>
> the asterisks there seem superfluous?

I can remove the asterisks if they're not common in IETF specs, but
they serve to denote that the text is a "key phrase" that can be
referred to by other parts of the document and by other specs.

> also, "MUST"? Is anyone actually planning to rewrite their cookie date
> parser to be exactly equivalent to this grammar? Mm... I think I'll
> split "conformance" discussion into a separate mail.

IMHO, yes.  The point of writing a spec is to improve interoperability.

>> mystery         = <anything except a delimiter>
>
> not valid ABNF, you need to specify the byte-ranges

Blah.  It seems clearer this way.

> It would be nice to add a comment to "delimiter" explaining what
> characters it includes too.

I'd rather folks just copied and pasted this into their code instead of
thinking about what ASCII characters these bytes represent.

>> 3.  Abort these steps and *fail to parse* if
>
> asterisks again

See above.

>>     *  the year-value is less than 1601 or greater than 30827,
>
> 30827???

Removed.  I'm not sure where that came from.

> It's not clear to me that there's any good reason for
> allowing/requiring 5DIGIT years.

Removed.

>> 4.  If the year-value is greater than 68 and less than 100, increment
>>     the year-value by 1900.
>>
>> 5.  If the year-value is greater than or equal to 0 and less than 69,
>>     increment the year-value by 2000.
>
> You need to move these steps to before step 3, since you've already
> aborted if the year was less than 1601. Probably just make them
> sub-steps of the "date-token matches the year production" step.

I just moved them up.  I didn't want to break the parallelism of the
earlier steps.
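
The ordering issue is easy to see in code.  A minimal Python sketch of
the reordered steps, using a hypothetical normalize_year helper that
returns None for "fail to parse":

```python
def normalize_year(year_value):
    """Expand a two-digit year, then apply the range check."""
    if 69 <= year_value <= 99:
        year_value += 1900
    elif 0 <= year_value <= 68:
        year_value += 2000
    # The range check has to come after the expansion; otherwise every
    # two-digit year would be rejected as "less than 1601".
    if year_value < 1601:
        return None
    return year_value
```

With the original ordering, a year-value of 94 would have aborted
before it could become 1994.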

>> A *canonicalized* host-name is the host-name converted to lower case
>> and expressed in punycode [RFC3492].
>
> The idnabis drafts ought to be real RFCs soon... Per their rules,
> "punycode" refers only to the *algorithm* defined by 3492, and you
> want to refer to "A-labels". Or probably something like "converted to
> the A-label form as used for DNS lookup according to RFC xxxx"
> (whatever draft-idnabis-protocol becomes).

I couldn't find a reference for A-labels and I'm not wild about citing
an RFC that doesn't exist.  Here's what I've got now: "converted to
ASCII by the punycode algorithm."

> It's weird that it talks about lowercasing and punycoding the
> request-host, but not the cookie-domain. Of course, that turns out to
> be because we "already" lowercased the cookie-domain, even though we
> won't learn about that until later on in the document...

I've pushed the canonicalization into the callers so the two arguments
are more parallel now.
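
As a rough illustration of the canonicalization, here is a Python
sketch.  It relies on Python's built-in "idna" codec, which implements
the older RFC 3490 processing (nameprep followed by punycode) rather
than the bare punycode algorithm, so treat it as an approximation.
The canonicalize_host name is hypothetical:

```python
def canonicalize_host(host_name):
    """Lower-case a host name and convert it to its ASCII form."""
    # Lower-case first: the codec passes all-ASCII labels through as-is.
    return host_name.lower().encode("idna").decode("ascii")


print(canonicalize_host("WWW.Example.COM"))
print(canonicalize_host("B\u00fccher.example"))
```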

>> 5.1.3.  Paths
>>
>> The user agent MUST use the following algorithm to compute the
>> *default-path* of a cookie:
>
> It is confusing that the algorithm for determining cookie-domain from
> the Domain attribute is in 5.2, but the algorithm for determining
> cookie-path from Path is (for the most part) in 5.1. This ties in to
> the comment above about request-host vs cookie-domain
> canonicalization. I'm not sure having the algorithms split out from
> the processing really makes sense. Or alternatively, it might make
> more sense if you swapped 5.1 and 5.2, so the high-level description
> comes first, and the details afterward.

I've ordered things in this way so that all the normative references
point upwards.  In this case, determining the default cookie-domain is
trivial: "Set the cookie's domain to the canonicalized request-host."
We could inline the default-path algorithm, but it's logically
independent of the parsing, which is already complicated enough.

>> 1.  Let uri-path be the path portion of the Request-URI.
>
> More precision here might be good, especially since Request-URI as
> defined in RFC 2616 is known to be wrong. (The "abs_path" case is
> supposed to be "abs_path [ "?" query ]".)
>
>     1.  Let uri-path be the abs_path portion of the Request-URI.
>         That is, if the Request-URI contains just a path (and
>         optional query string), then the uri-path is that path
>         (without the "?" or query), and if the Request-URI contains a
>         full absoluteURI, the uri-path is the abs_path component of
>         that URI.

Done.

> What about the "OPTIONS * HTTP/1.1" and "CONNECT example.com:443
> HTTP/1.1" cases? Do clients send cookies with those as though path was
> "/"?

Not that I'm able to detect.  In my experiments, they don't include a
Cookie header, presumably because every cookie is required to have a
path that starts with "/" and these requests don't have a path that
starts with "/".  My methodology isn't great here, so if you have data
to the contrary, please let me know.

>> 2.  If the first character of the uri-path is not a U+002F ("/")
>>     character, output U+002F ("/") and skip the remaining steps.
>
> As defined above, uri-path must either start with "/" or be empty, so
> this is equivalent to "If the uri-path is the empty string, ..."

I've added the empty case here, but I've kept the other requirement.
It's an important invariant of the protocol.  It's worth stating even
if it's redundant once you chase down all the definitions.
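
Putting steps 1 and 2 together, a default-path computation might look
like the Python sketch below.  The final "directory" step is an
assumption based on the earlier description ("the directory of the
Request-URI's path component"), since the thread does not quote the
remaining steps.  The default_path name is hypothetical:

```python
from urllib.parse import urlsplit


def default_path(request_uri):
    """Compute a default cookie path from a Request-URI."""
    # Step 1: take the path only, without the "?" or the query.
    uri_path = urlsplit(request_uri).path
    # Step 2: an empty path, or one not starting with "/", defaults to "/".
    if not uri_path.startswith("/"):
        return "/"
    # Assumed remaining step: everything before the right-most "/".
    directory = uri_path.rsplit("/", 1)[0]
    return directory or "/"
```

Both the abs_path form and the absoluteURI form of the Request-URI go
through the same function, which is the point of the more precise
wording in step 1.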

>> the user agent *receives a set-cookie-string* consisting of the value
>
> more gratuitous asterisking. I didn't comment on some of the other
> ones before, but this one, like the others I mentioned, is never
> referred to again. I think in general it's probably better to just
> refer to "parsing a domain attribute according to section 5.blah"
> rather than having magic defined terms.

This is a cultural thing w.r.t. how to write specs to be referenced by
other specs.  I haven't changed these, but I'm happy to if this isn't
how the IETF likes to do things.

>> 1.  If the set-cookie-string is empty or consists entirely of WSP
>>     characters, ignore the set-cookie-string entirely.
>
> this is unnecessary, since it's covered by step 3.

Removed.

> OTOH, you should say
> explicitly that any leading whitespace is not part of the
> set-cookie-string, otherwise "  =foo" would result in setting a
> nameless cookie (via step 6).

I fixed this by moving the check for an empty name to the end (after
all the canonicalization).

>> 7.  The cookie-name is the name string, and the cookie-value is the
>>     value string.
>
> Somewhere in here we need a note mentioning that the client can reject
> cookies that are too large.

This is already allowed, but I've added "too large" to the list of
example reasons why you can reject storing a cookie.

>>        Consume the characters of the unparsed-attributes up to, but
> ...
>>     Let the cookie-av string be the characters consumed in this step.
>
> "consume" tends to imply "throw away" to me. (Especially since you
> used it in exactly that sense in the previous step, "Consume the
> ';'".) It might be that the easiest fix is to replace the earlier use
> of "consume" with something else...

I changed the first occurrence to "discard."

>> 6.  Process the attribute-name and attribute-value according to the
>>     requirements in the following subsections.
>
> Need to say somewhere that unrecognized attribute-names are ignored.

This is actually already true, but I added a parenthetical to make
this more explicit.

> Sort of related, if the unparsed-attributes ends with a trailing ";",
> then you'll end up parsing out an attribute with an empty name and
> value.

Well, you won't actually finish parsing it, but yeah.
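
A simplified Python sketch of the attribute handling discussed in the
last two comments.  It is not the draft's character-by-character
algorithm; parse_attributes is a hypothetical helper, the attribute
list comes from the cookie-av grammar quoted earlier, and
case-insensitive name matching is an assumption:

```python
KNOWN_ATTRIBUTES = {"expires", "max-age", "domain", "path", "secure", "httponly"}


def parse_attributes(unparsed_attributes):
    """Split the text after the cookie-pair into recognized (name, value) pairs."""
    recognized = []
    for cookie_av in unparsed_attributes.split(";"):
        name, _, value = cookie_av.partition("=")
        name, value = name.strip().lower(), value.strip()
        # Unrecognized names are ignored.  A trailing ";" produces an
        # empty name, which is simply never recognized.
        if name in KNOWN_ATTRIBUTES:
            recognized.append((name, value))
    return recognized


attrs = parse_attributes("; Path=/; Foo=bar; HttpOnly;")
```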

>> If delta-seconds is less than or equal to zero (0), let expiry-time
>> be the earliest representable date and time.  Otherwise, let the
>> expiry-time be the current date and time plus delta-seconds seconds.
>
> To be consistent with Expires parsing, shouldn't it be the Date header
> plus delta-seconds seconds?

That's the opposite of what the Expires thing does.  I think the two are
consistent.
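
For concreteness, the quoted Max-Age rule can be sketched in Python as
below.  EARLIEST and LATEST are stand-ins for whatever the
implementation can represent, the clamp on overflow is an addition of
this sketch, and expiry_from_max_age is a hypothetical name:

```python
from datetime import datetime, timedelta, timezone

# Stand-ins for the earliest and latest representable date and time.
EARLIEST = datetime.min.replace(tzinfo=timezone.utc)
LATEST = datetime.max.replace(tzinfo=timezone.utc)


def expiry_from_max_age(delta_seconds, now=None):
    """Map a parsed Max-Age value to an expiry time."""
    if now is None:
        # "The current date and time" is the user agent's own clock.
        now = datetime.now(timezone.utc)
    if delta_seconds <= 0:
        return EARLIEST
    try:
        return now + timedelta(seconds=delta_seconds)
    except OverflowError:
        # Not in the quoted text: clamp values too large to represent.
        return LATEST
```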

>> Convert the cookie-domain to lower case.
>
> and IDNA A-label? I guess you're assuming people aren't going to try
> to use non-ASCII Domain attributes, but given that we warn about
> non-ASCII cookie-values, is that reasonable?

There's no conversion, as far as I can tell.  Servers need to use
punycode.  The UA doesn't need to recover from that error.

>> 1.   A user agent MAY ignore a received cookie in its entirety.  For
>>      example, the user agent might wish to block receiving cookies
>>      from "third-party" responses.
>
> But we've already said that they MUST have parsed it... So they MUST
> parse it but then MAY ignore it?

That's how the spec is implemented, but UAs are free to implement the
requirements in any way that is blackbox indistinguishable.  In
particular, if they know in advance that they aren't going to store
the cookie, they can skip all the parsing work.  Logically, however,
ignoring received cookies means that you don't store them.  As another
example, if the UA is configured to prompt the user, the UA will often
parse the cookie first to have something sensible to show in the
prompt.

>> 5.   If the user agent is configured to use a "public suffix" list
>>      and the domain-attribute is a public suffix:
>
> Maybe add some wiggle room here for draft-pettersen-subtld-structure
> or other future improvements and just say "If the user agent is able
> to determine that the domain-attribute is a public suffix" (keeping
> the "NOTE" pointing out that the best current solution is the public
> suffix list).

Done (although with slightly different text).

>>         If the domain-attribute is identical to the canonicalized
>>         Request-URI's host:
>
> you say "canonicalized host of the Request-URI" below, which is
> clearer that it's the host being canonicalized, not the URI.

I've cleaned up all these references.

>> 10.  If the cookie's name and value are both empty, abort these steps
>>      and ignore the cookie entirely.
>
> That's an artifact from an earlier draft. We've already thrown out
> nameless cookies

Removed.

>> 11.  If the cookie was received from a non-HTTP API and the cookie's
>>      http-only-flag is set, abort these steps and ignore the cookie
>>      entirely.
>
> There is no defined way to *receive[s] a cookie* other than via HTTP.

Well, receiving a cookie is an extensibility point that other specs
(e.g., HTML5) can/will use to give us cookies via other APIs.

> And if we're going to define semantics for processing document.cookie
> then we need to allow nameless cookies, right? (Or did we decide we
> don't even in that case? I forget.)

AFAIK, we decided we didn't need them.

>>      2.  If the newly created cookie was received from a non-HTTP
>>          API and the old-cookie's host-only-flag is set, abort these
>>          steps and ignore the newly created cookie entirely.
>
> s/host-only/http-only/ ?

Good catch.

>> 13.  Insert the newly created cookie into the cookie store.
>
> should probably say "if it doesn't have an expiry-time in the past"?

We could say that, but it's redundant.  On the other hand, it's a very
important invariant.

>> The user agent MUST evict a cookie from the cookie store if, at any
>> time, a cookie exists in the cookie store with an expiry date in the
>> past.
>
> That reads like "if any cookie is expired, then you have to evict
> *SOME* cookie". Hm... reading on, it appears that that's the intended
> reading. That's bizarre. I think you should just say that cookies are
> evicted when they expire, and also, that when inserting new cookies
> into the store, old cookies can be evicted if adding the new cookie
> would result in too many per-host or total cookies.

I was trying to state the requirements declaratively.  We can change
it to a more algorithmic style, but my understanding is that the IETF
prefers declarative specifications, so I've tried to be as
declarative as possible (although I suspect I could improve here if I
spent more careful thought on, e.g., the storage model).

>>     *  The Request-URI's path patch-matches cookie's path.
>
> s/patch/path/

Fixed.

>>     *  If the cookie's http-only-flag is true, then exclude the
>>        cookie unless the cookie-string is being generated for an
>>        "HTTP" API (as defined by the user agent).
>
> We're talking about generating a Cookie header for an HTTP request.
> It's an HTTP API.

Not necessarily.  For example, the document.cookie API will use this
algorithm to compute its cookie-string.  I've added the asterisks to
make that clearer, although I'm not sure how much that helps.

>>     NOTE: Not all user agents sort the cookie-list in this order, but
>>     this order reflects common practice when this document was
>>     written.  The specific ordering might not be optimal in every
>>     metric, but using the consensus ordering is a relatively low cost
>>     way to improve interoperability between user agents.
>
> "not be optimal" isn't the point. The debate wasn't about a "best"
> ordering, it was about whether there should be an ordering at all.
>
>       NOTE: Not all user agents sort the cookie-list in this order,
>       but this order reflects common practice when this document was
>       written, and historically, there have occasionally been servers
>       that (erroneously) depended on this order.

Fixed (although I removed "occasionally" because I don't have hard
numbers on how often it happens).

>> NOTE: Despite its name, the cookie-string is actually a sequence of
>> octets, not a sequence of characters.  To convert the cookie-string
>> into a sequence of characters (e.g., for presentation to the user),
>> the user agent SHOULD use the UTF-8 character encoding [RFC3629].
>
> That's a little confusing... I'd say something more like "When
> presenting the cookie-string to the user, user agents SHOULD assume
> that the string is UTF-8". But also, it doesn't belong here anyway;
> user agents aren't generally going to present cookie-strings, they're
> going to present cookie-names and cookie-values. Probably this should
> go in 7.2.

If you believe that a string is a sequence of characters, then it
doesn't make sense to say that a string is in UTF-8 because UTF-8 is
about how to represent characters as octets.  I've taken your point
about parts of the cookie-string: "To convert the cookie-string (or
components thereof) into ..."

>> Servers SHOULD use as few and as small cookies as possible to avoid
>> reaching these implementation limits and to avoid network latency due
>> to the Cookie header being included in every request.
>
> s/avoid network latency/minimize network bandwidth/.

Done, but avoiding latency is important too.

>> Servers should gracefully degrade if the user agent fails to return
>
> s/should/SHOULD/ ?

Fixed.

>> One reason the Cookie and Set-Cookie headers use such esoteric
>> syntax is because many platforms (both in servers and user agents)
>> provide a string-based application programmer interface (API) to
>> cookies, requiring application-layer programmers to generate and
>> parse the syntax used by the Cookie and Set-Cookie headers.
>
> Needs to end with something like ", which many programmers have done
> incorrectly, resulting in interoperability problems."

Fixed.

>> Cookies have a number of security and privacy pitfalls.
>
> add something like "The following are merely a brief overview." ?
>
> (also, should it still say "privacy" here, since this is the "Security
> Considerations" section and you already talked about privacy in
> section 7?)

Fixed:

[[
        <t>Cookies have a number of security pitfalls.  This section overviews
        a few of the more salient issues.</t>
]]

>> Servers SHOULD encrypt and sign the contents of cookies when
>
> really? I mean, yeah, they should, but... no one's going to.

Many frameworks do this automatically.  For example, ASP.NET encrypts
and signs every cookie.

> And especially if the cookie content is just a nonce, a
> signed-and-encrypted nonce is not much different from a plain nonce.

Sure.

> Maybe just "SHOULD NOT transmit sensitive information unencrypted in
> cookies".

I think that's too vague.  I'd rather everyone just encrypted and
signed their cookies and called it a day.
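
To give a flavour of the signing half only (encryption would need a
cipher library, which is left out here), below is a minimal Python
sketch using HMAC-SHA256.  It is not how ASP.NET or any particular
framework does it; sign_value, verify_value and the "body.tag" layout
are hypothetical choices made for this illustration:

```python
import base64
import hashlib
import hmac


def sign_value(key, payload):
    """Return payload plus an HMAC-SHA256 tag, using token-safe characters."""
    # URL-safe base64 without padding avoids "/" and "=" in the value.
    body = base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")
    tag = hmac.new(key, body.encode("ascii"), hashlib.sha256).hexdigest()
    return body + "." + tag


def verify_value(key, cookie_value):
    """Return the original payload, or None if the tag does not verify."""
    body, sep, tag = cookie_value.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison of the received and expected tags.
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("ascii")):
        return None
    padding = "=" * (-len(body) % 4)
    return base64.urlsafe_b64decode(body + padding)
```

A server would call sign_value when emitting Set-Cookie and
verify_value on each received Cookie header, treating None as "no
cookie".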

>> Cookies do not always provide isolation by path.  Although the
>> network-level protocol does not send cookie stored for one path to
>
> s/cookie/cookies/

Fixed.

>> 8.6.  Weak Integrity
>
> the breakdown of sections here still seems odd to me. In particular,
> the fact that "servers SHOULD NOT both run mutually distrusting
> services on different ports of the same host and use cookies to store
> security-sensitive information" appears under "Weak Confidentiality",
> but "servers SHOULD NOT both run mutually distrusting services on
> different paths of the same host and use cookies to store
> security-sensitive information" appears under "Weak Integrity".

That's because the first is a problem for confidentiality+integrity
and the second is a problem only for integrity.  Confidentiality is
more important for cookie security, hence it gets top billing.

>> cookies by storing large number of cookies.  Once the user agent
>
> s/storing large/storing a large/ (or s/number/numbers/)

Fixed.