Re: [http-state] Gen-ART LC review of draft-ietf-httpstate-cookie-18

Adam Barth <ietf@adambarth.com> Mon, 29 November 2010 20:25 UTC

Return-Path: <ietf@adambarth.com>
X-Original-To: http-state@core3.amsl.com
Delivered-To: http-state@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 46B3228C0F2; Mon, 29 Nov 2010 12:25:12 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.351
X-Spam-Level:
X-Spam-Status: No, score=-2.351 tagged_above=-999 required=5 tests=[AWL=-2.674, BAYES_00=-2.599, FM_FORGED_GMAIL=0.622, MANGLED_LIPS=2.3]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id iiU8QYCj1ceN; Mon, 29 Nov 2010 12:25:10 -0800 (PST)
Received: from mail-gy0-f172.google.com (mail-gy0-f172.google.com [209.85.160.172]) by core3.amsl.com (Postfix) with ESMTP id 4DEBA3A6BB4; Mon, 29 Nov 2010 12:25:10 -0800 (PST)
Received: by gyb13 with SMTP id 13so2613006gyb.31 for <multiple recipients>; Mon, 29 Nov 2010 12:26:20 -0800 (PST)
Received: by 10.42.227.67 with SMTP id iz3mr1798709icb.93.1291062379696; Mon, 29 Nov 2010 12:26:19 -0800 (PST)
Received: from mail-iw0-f172.google.com (mail-iw0-f172.google.com [209.85.214.172]) by mx.google.com with ESMTPS id 34sm6214441ibi.8.2010.11.29.12.26.17 (version=SSLv3 cipher=RC4-MD5); Mon, 29 Nov 2010 12:26:17 -0800 (PST)
Received: by iwn40 with SMTP id 40so6387472iwn.31 for <multiple recipients>; Mon, 29 Nov 2010 12:26:16 -0800 (PST)
Received: by 10.231.11.3 with SMTP id r3mr3148120ibr.53.1291062376690; Mon, 29 Nov 2010 12:26:16 -0800 (PST)
MIME-Version: 1.0
Received: by 10.231.12.77 with HTTP; Mon, 29 Nov 2010 12:25:46 -0800 (PST)
In-Reply-To: <6EA546A6-28EF-43E4-B245-B7DE13A179DC@bbn.com>
References: <6EA546A6-28EF-43E4-B245-B7DE13A179DC@bbn.com>
From: Adam Barth <ietf@adambarth.com>
Date: Mon, 29 Nov 2010 12:25:46 -0800
Message-ID: <AANLkTikjppMeR-a3ZFS80o671NyExLZ8QBuW3e5wo-+S@mail.gmail.com>
To: "Richard L. Barnes" <rbarnes@bbn.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Cc: gen-art@ietf.org, draft-ietf-httpstate-cookie@tools.ietf.org, ietf@ietf.org, http-state <http-state@ietf.org>
Subject: Re: [http-state] Gen-ART LC review of draft-ietf-httpstate-cookie-18
X-BeenThere: http-state@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Discuss HTTP State Management Mechanism <http-state.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/http-state>, <mailto:http-state-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/http-state>
List-Post: <mailto:http-state@ietf.org>
List-Help: <mailto:http-state-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/http-state>, <mailto:http-state-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 29 Nov 2010 20:25:12 -0000

Thanks for the review.  Comments inline.

On Sun, Nov 28, 2010 at 12:47 PM, Richard L. Barnes <rbarnes@bbn.com> wrote:
> I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
>
> Please resolve these comments along with any other Last Call comments you may receive.
>
> Document: draft-ietf-httpstate-cookie-18
> Reviewer: Richard Barnes
> Review Date: 28 November 2010
> IETF LC End Date: 2 December 2010
> IESG Telechat date: (unknown)
>
> Summary: This document is close to being ready for publication, but could benefit from several clarifications and upgrades to requirements language.  In particular, user agent security requirements need to be slightly stronger.
>
>
> Major issues:
>
> [S8.6] Many of these integrity issues are caused by user agents accepting cookies from outside the context where they would send them, in particular with the Secure and Path attributes.  It seems like this document, in specifying the desired/proper behavior of user agents, should require behavior that would mitigate these attacks.  In direct parallel to the handling of the Domain attribute:
> 1. Set-Cookies with the Secure flag should only be accepted over secure channels
> 2. Set-Cookies with the Path attribute set should only be accepted if the path value matches the request-uri of the request
> (That is, update Steps 7 and 8 of S5.3 to match Steps 6 and 9/10) I realize that as long as there are legacy user agents out there, servers can't rely on these protections, but if any user agents implement these requirements, then fewer transactions will be subject to these attacks.

I agree that these would be desirable changes to the cookie protocol.
However, the http-state working group is not chartered to change the
cookie protocol.  In particular, the charter reads as follows:

[[
The working group must not introduce any new syntax or new semantics
not already in common use.
]] --- https://datatracker.ietf.org/wg/httpstate/charter/

These changes would introduce new semantics not already in common use.

> Minor issues:
>
> [General] It would be very helpful if this document had a summary of changes from RFC 2965, and possibly from RFC 2109 as well.  In particular, the fate of the Cookie2 and Set-Cookie2 header fields is a little unclear.  If the intent of obsoleting 2965 is to deprecate the use of these fields, it would be good to state that explicitly.

The plan is to obsolete RFC 2965 and move it to historic(al).  Is
there a better way of stating that the use of Cookie2 and Set-Cookie2
is deprecated?

> [General] It's unclear where this document fits on the scale of "documenting existing behavior" to "describing proper behavior".  It seems like the focus should be on documenting a proper behavior that is maximally compatible with existing behavior.  The document should try to clearly distinguish when it is talking about the desired behavior vs. the existing behavior.  The former should be subject to strong requirements "MUSTs" while the latter will have no requirements at all.

I'm not sure I understand the distinction between the two.  The
document describes a protocol that matches, to a large extent,
existing implementations of the protocol.  Where existing user agents
differ from the specified protocol in significant ways, the document
contains a note explaining the difference.

> [S2.3 P2] Is the request-host the same as the value of the HTTP/1.1 Host header field?

No.  For example, the HTTP/1.1 Host header field sometimes contains
the port whereas the request-host never contains the port.

> [S3 P4] It seems like there should be an all-caps requirement here, e.g., that "Origin servers MUST NOT fold multiple Set-Cookie header fields into a single header field".

Origin servers certainly can fold multiple Set-Cookie header fields
into a single header field if they desire.  However, doing so changes
the semantics.  Given the bytes on the wire, there's no experiment we
can run to see if the origin server did or did not perform this
folding.  Therefore, a MUST-level requirement is inappropriate.

> [S4.1.1 P1] It seems like it should be MUST NOT instead of SHOULD NOT: "Servers MUST NOT send Set-Cookie headers that fail to conform to the following grammar:".

This requirement was discussed at length in the working group.  I
advocated for a MUST-level requirement here as well.  However, some
members of the working group felt strongly that the server ought to be
able to make use of the full cookie protocol as specified in Section
5.  Therefore, we reduced this requirement from a MUST to a SHOULD.

> [S4.1.1] Is there a reason that path-value is so unconstrained?  Based on S5.1.4, I would have expected it to be abs_path (from RFC 2396).

The document used to refer to abs_path, but someone in the working
group pointed out some cases where this was incorrect.  We tried a
couple different variations, but none of them were satisfactory.  For
this reason, we ended up with the relatively permissive requirement
you see today.  I don't recall all the details, but they're in the
mailing list archive.

> [S4.1.1 P5] Seems like the SHOULD NOT should be "MUST NOT produce more than one attribute with the same name".

See my response to S4.1.1 P1.

> [S4.1.1 P6] Either this should be a "MUST NOT" or it should define which a user agent MUST accept -- or both.  The third paragraph of 4.1.2 seems to imply that the user agent MUST accept the last one (in order of appearance in the HTTP header).

The user agent requirements are contained in Section 5.

> [S4.1.1 P8] Again, "SHOULD" -> "MUST"

See my response to S4.1.1 P1.

> [S8.1] I find this paragraph vague and confusing.  I would suggest separating protocol vulnerabilities from implementation vulnerabilities.  Suggested text:
> "
> Transport-layer encryption, such as that employed in HTTPS, is insufficient to prevent a network attacker from obtaining or altering a victim's cookies.  The cookie protocol itself allows cookies to be accessed or modified from other contexts (subdomains, subordinate paths, or other ports) and some user agents do not even provide the separations required by the protocol.  (See "Weak Confidentiality" and "Weak Integrity", below).
> "

I'm not sure I agree with your diagnosis.  All the issues described in
that paragraph are vulnerabilities in the cookie protocol.  None of
them are implementation vulnerabilities.

> Nits/editorial comments:
>
> [S2.1 P3 "not intended to be performant"] I'm not sure the word "performant" has the meaning that seems to be intended here.  (The definitions I'm finding indicate "one who performs".)  Should probably expand to "not intended to require the performance of specific steps".

This text is boilerplate commonly used in a number of specifications.
Like the goofy grammar in the RFC2119 boilerplate, it's easier if we
all just use exactly the same text.

> [S3 P2 "the default scope"] The notion of the "scope" of a cookie was only briefly introduced in the Introduction.  Could be good to remind the reader in the first paragraph.  "... the user agent will return in future HTTP requests." -> "... the user will return in future HTTP requests that are within the scope of the cookie".

Done.

> [S4.1.1 P9] The paragraph as it stands is kind of useless -- what am I supposed to do with this information?  Suggest changing to:
> "
> Some user agents represent dates using 32-bit UNIX time_t values.  These user agents will process dates after January 19, 2038 incorrectly.  Origin servers SHOULD NOT set expiration times later than this time.
> "

There was a lot of discussion of this paragraph in the working group.
At one point we had text similar to what you propose, but various
folks objected.  The text we have now is the compromise we reached.

> [S4.1.2] The phrase "Non-Normative" seems inapt here.  Suggest changing the heading to something like "Summary of Semantics".

That text is there to emphasize the fact that the text in that section
is non-normative.  Remove that phrase might confuse readers into
thinking that the section actually describes the semantics of the
protocol (which it does not).

> [S4.1.2.3 P1 "... only to the origin server"] It would be helpful to clarify here by adding "... and not any sub-domains".

Done (slightly tweaked).

> [S4.1.2.6] It could be helpful here to note that the "HttpOnly" attribute is not the opposite of the "Secure" attribute (which one might initially imagine as "HttpsOnly"), and thus that setting them both is not a contradiction.

Done.

> [S5.1.1] Just as in Section 5.2, it would help to explain here why you're not just using the rfc1123 grammar (from above and RFC 2616 S3.1.1).  E.g.:
> "
> NOTE: This grammar is more general than the sane-cookie-date grammar used in the definition of the Set-Cookie header. This allows user agents to interoperate with servers that do not implement this specification.
> "

This is already explained two paragraphs earlier:

[[
      For historical reasons, the full semantics of cookies (as presently
      deployed on the Internet) contain a number of exotic quirks. This section
      is intended to specify the Cookie and Set-Cookie headers in sufficient
      detail to allow a user agent implementing these requirements precisely to
      interoperate with existing servers.
]]

> [S5.1.1] It would be helpful to introduce the found-* flags at the beginning of step 2.

They're already introduced:

[[
          Note that the various boolean
          flags defined as a part of the algorithm are initially "not set".
]]

> [S5.1.1 Step5] "the cookie-date if" -> "the cookie-date if any of the following conditions are true:"

Making that change would make that sentence grammatically incorrect.

> [S5.1.1 Step5] I don't understand why the year 1601 is a cutoff here.  Wouldn't any past date be usable to kill a cookie?

This requirement is based on how user agents behave.  I suspect dates
prior to 1601 are more likely to be errors.  For example, a date of
128 is unlikely to actually refer to the year 128.  In any case, it
doesn't matter why we use 1601 as a cut off, just that we do.

> [S8.2] It might be helpful to have a reference or two here, e.g., explaining XSRF attacks and mitigations.

Sure.  Which references do you suggest?

Thanks again,
Adam