Re: [websec] Principles of the Same-Origin Policy

John Kemp <john@jkemp.net> Tue, 22 February 2011 13:46 UTC

To: Adam Barth <ietf@adambarth.com>
Cc: websec <websec@ietf.org>

Hey Adam,

On Feb 21, 2011, at 11:35 PM, Adam Barth wrote:

> Hi John,
> 
> Thanks for the feedback.  Most of your comments revolve around scope.

Yes, and after reviewing your responses, I agree with that.

> This document is based on some text I wrote for an audience that was
> pretty clear on the scope.  The web platform has a bunch of different
> names.  Sometimes the W3C calls it the Open Web Platform, as in
> <http://www.w3.org/QA/2010/10/html5_the_jewel_in_the_open_we.html>.
> Historically, I think O'Reilly gets credit (at least according to
> Wikipedia) for conceptualizing the "Web as Platform."  Certainly,
> implementors think about shaping the direction of the web platform,
> e.g., <http://www.chromium.org/developers/web-platform-status>.
> 
> We should certainly spend some time in the introduction setting
> expectations about scope.  Part of the motivation for writing this
> document is that many of these web technologies interact strongly in
> the platform.  Without a coherent security model, it's easy for
> complexity to spiral out of control.

Agreed.

>  It's taken us (where "us" here
> is broadly defined) a while to converge on a coherent security model,
> but this is pretty well-established stuff at this point, albeit
> perhaps not clearly conceptualized and nominalized for folks who don't
> work with this stuff on a daily basis.
> 
> Detailed responses inline.
> 
> On Mon, Feb 21, 2011 at 6:30 PM, John Kemp <john@jkemp.net> wrote:
>> On Feb 21, 2011, at 5:10 PM, Adam Barth wrote:
>>> Pursuant to the charter, I've posted an informational draft that
>>> "describes the same-origin security model overall:"
>>> 
>>> http://www.ietf.org/id/draft-abarth-principles-of-origin-00.txt
>>> 
>>> I don't expect this document to be very controversial.  I'm sure folks
>>> will nitpick me over renaming URL to URI and MIME types to media
>>> types, however.  :)
>>> 
>>> Feedback welcome.
>> 
>> Some feedback which does not nitpick about your usage of URL or MIME:
>> 
>> i) Introduction
>> 
>>   * What is the "web platform", for the purposes of this discussion?
> 
> See above.
> 
>> i) Section 2. 'Trust':
>> 
>>   * Is trust always specified by URL?
> 
> As far as I can tell, yes.  Do you have any examples that don't fit this theory?

Well, for example, trust is also granted by the recipient of an HTTP message with a particular Origin header to the sender of that message. 
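
To make that concrete, here is a rough sketch of a recipient granting trust on the basis of the Origin header rather than a URL. It is only an illustration: the allow-list, the example origins and the function name are all my own assumptions, not anything taken from the draft.

```python
# Hypothetical allow-list of origins the recipient has decided to trust.
# These values are examples only.
TRUSTED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}


def recipient_trusts_sender(headers: dict) -> bool:
    """Return True if the request's Origin header names a trusted origin."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    origin = normalised.get("origin")
    # A missing Origin header gives the recipient nothing to base trust on.
    return origin is not None and origin in TRUSTED_ORIGINS
```

The point of the sketch is that the trust decision here is made by the server about the sender, which is a different direction from a document designating trust via a URL.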

> 
>> Who is the trust specified by,
> 
> Documents (or perhaps document authors if you don't want to ascribe
> intention to documents).
> 
>> and to whom is it granted?
> 
> Whoever controls the resource designated by the URL.
> 
>>   * What do you consider to be a "user agent" - do you mean a Web browser, or the larger class of things which have often been called user agents?
> 
> If it's helpful for you to think about the user agent as a browser,
> please feel free.  However, there are more user agents than just
> browsers.  For example, the Adobe Air execution environment is a user
> agent for the purposes of this document.  Essentially, the user agent
> is whatever provides the client execution environment in the web platform
> (as distinct from the network and server execution environments).
> 
>> Wikipedia, for example (http://en.wikipedia.org/wiki/User_agent), mentions search engine crawlers and screen readers. Is 'curl' a user agent for the purposes of your statements about what a user agent does when accessing a script?
> 
> No.  Curl doesn't provide a web platform client execution environment.

Is a JavaScript interpreter then required? A (standardized?) DOM?

> You might be able to twist your mind into thinking of it as such in
> some sort of degenerate way, but it's certainly not a particularly
> interesting example from the point of view of this document.
> 
>> Is the content at the URL always "executed" by the user agent?
> 
> Yes.

Even if the script is delivered to the user agent with a Content-Type header with the value 'text/plain'?

> 
>>   * You mention the term 'principal' - ('principals export data to URLs') - do you mean "security principal", or "user", and are they always synonymous?
> 
> Principal is an abstract notion, which I believe originates from the
> "orange book."  I suspect it's more akin to what you mean by a
> security principal, and it's distinct from the user.
> 
>> ii) 2.1 Pitfalls
>> 
>>   * Is your only "pitfall" that someone might use the http URI scheme for both TLS and non-TLS protected resources?
> 
> That's just an example.  There are lots of ways of screwing this up.
> In fact, I wrote a whole paper about examples of people falling into
> this pitfall:
> 
> http://www.adambarth.com/papers/2008/jackson-barth-b.pdf
> 
> The HTTP / HTTPS example is just one that's easy to explain and fairly
> intuitive.

Yes, it's also very specific, and not (in my opinion) illustrative even of the examples you discuss in the above-referenced paper.

You are also quite narrowly discussing how to prevent privilege escalation within a set of documents which are part of the same origin. That's a very specific (yes, important) kind of trust for the specific cases you mention.  

> 
>> Might there not be other important trust distinctions visible in URLs? Perhaps some examples of distinguishing trust via URLs would be helpful here?
> 
> The more interesting cases are when folks intend there to be trust
> distinctions, but those distinctions *aren't* visible in URLs.  That
> leads quickly to disaster.

I think the examples you mention in your paper, regarding 'origin contamination' (EV certs, cookie path access etc.) support your case better FWIW.

> 
>> iii) 3. Origin
>> 
>>   * Some user agents do already treat each URL as a separate principal (at least in my understanding of a user agent)
> 
> That might well be, but then they're not really participating in the
> common security model of the web platform and are not relevant to this
> discussion.
> 
>>   * Might be worth referencing the definition of origin as scheme, host and port
> 
> Sure.  Thankfully this working group is also working on precisely such
> a document.  :)
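
For reference, the scheme/host/port comparison mentioned here can be sketched roughly as follows. This is a simplified illustration using only the Python standard library; the function names and the default-port table are assumptions made for the example, and the sketch ignores details (such as internationalized host names) that a real definition of origin has to address.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports assumed for the two schemes used in this example.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Tuple[str, str, Optional[int]]:
    """Reduce a URL to an illustrative (scheme, host, port) triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fill in the scheme's default port when the URL does not name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin only if all three components match."""
    return origin_of(a) == origin_of(b)
```

Under this sketch, http://example.com/a and http://example.com:80/b share an origin, while http://example.com/ and https://example.com/ do not, which is the HTTP / HTTPS distinction discussed under 2.1.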
> 
>> iv) 4. Authority
>> 
>>   * You mention serving content as image/png instead of text/html - why not recommend either serving the content without a Content-Type header at all (as suggested by the W3C TAG finding on authoritative metadata - http://www.w3.org/2001/tag/doc/mime-respect) and having recipients follow your content sniffing algorithm (for example), or serving the content as 'application/octet-stream'?
> 
> That doesn't make sense.  This document is talking about what
> authority the user agent bestows upon a resource.  It's making the
> point that the authority depends on the MIME type of the resource.
> Omitting the MIME type is below the level of abstraction here.

How can that be - if authority is granted based on MIME type, what does it mean if no MIME type is given? 

> 
>>   * In general, what is the relationship between your content sniffing draft and this section on authority-as-conveyed-by-MIME-type?
> 
> It's the reason why content sniffing is tricky.  If you aren't careful
> about how you sniff, you might give too much authority to a resource
> and create a vulnerability.

I understand that to be the very reason you wrote your sniffing draft, no? And servers can surely help here, either by giving no indication of authority when they do not feel comfortable doing so, or by specifying application/octet-stream to indicate that the recipient should not grant the content any authority at all.
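
Here is a rough sketch of what I mean, with authority looked up from the declared media type. Everything in it is my own assumption for the sake of illustration: the three authority levels, the table entries, and especially the choice to grant nothing when no type is declared. The draft does not specify any of this, and a user agent that sniffs would replace the "no type" branch with its sniffing algorithm.

```python
from enum import Enum
from typing import Optional


class Authority(Enum):
    NONE = "none"        # treated as inert data
    PASSIVE = "passive"  # rendered, but not run with the origin's privileges
    FULL = "full"        # may act with the origin's privileges


# Illustrative assignments only; not taken from any specification.
AUTHORITY_BY_TYPE = {
    "text/html": Authority.FULL,
    "image/png": Authority.PASSIVE,
    "application/octet-stream": Authority.NONE,
}


def authority_for(content_type: Optional[str]) -> Authority:
    """Map a Content-Type header value to an assumed authority level."""
    if content_type is None:
        # No declared type: this sketch grants nothing rather than guessing.
        return Authority.NONE
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Unknown types also receive no authority in this sketch.
    return AUTHORITY_BY_TYPE.get(media_type, Authority.NONE)
```

With a table like this, the missing-header case has to be answered one way or another, which is why I don't see how it can be below the level of abstraction.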

> 
>>   * How is the amount of authority designated?  What constitutes full (or partial) authority?
> 
> That's defined by the specifications that describe the semantics of
> the various MIME types.

Considering application/javascript or text/javascript, whose specification seems to be here: http://tools.ietf.org/html/rfc4329, there is no mention of 'authority' as a concept. Is the 'security considerations' section (http://tools.ietf.org/html/rfc4329#section-5) adequate for your purposes? Or am I looking in the wrong place?

> 
>> v) 5. Policy
>> 
>>   * Is it worth mentioning iframes and the iframe sandbox attribute here, in relation to scripts accessing objects belonging to the parent document?
> 
> That's too detailed.  This is an overview of the security model, not a
> guide to all the various bells, whistles, and knobs.  Other documents,
> such as <http://code.google.com/p/browsersec/>, are more appropriate
> for that.
> 
>>   * You mention that blocking cross-origin requests would prevent users from following hyperlinks (and that this is core to web architecture). This highlighted (to me at least) that trust in URLs is *not* always origin-based. A user may trust content from multiple origins, and compose a page which contains such content.
> 
> We're not talking about what the user does or does not trust.  In
> fact, we're not talking about the user at all.  We're talking about
> the security model of the platform, independent of any user.  That
> trips people up at first because they're used to a user-centric
> world-view, but I can assure you that this model is sensible in the
> absence of the user.

Well, you seem to be talking about trust granted by one software component to another. Which makes barely any sense to me at all, based on the general untrustworthiness exhibited by most software components I have used in my lifetime ;) But anyway, I understand where you are coming from, even if I would still argue that the trust of the user is important, and that trust of the various non-UA components is also important.

> 
>>   * To whom is the "value proposition high enough" in making a cross-origin request?
> 
> The designer and/or implementor of the API that permits the
> cross-origin access.  Keep in mind that this document is descriptive,
> not prescriptive.

Where is the user in all this? Don't API designers design their APIs so that developers can write applications for users to use?

> 
>>   * Can you explain, or provide an example, that illustrates your discussion about granting a privilege to one document and withholding it from another, even though this document is from the same origin?
> 
> Sure.  Please see the seven examples in Section 2 of
> <http://www.adambarth.com/papers/2008/jackson-barth-b.pdf>.

Thanks.

> 
>> vi) 6. Conclusion
>> 
>>   * I find it hard to believe that all trust relationships on the Web are designated via URLs, and that all security policy is associated with origins.
> 
> Keep in mind that this is a model.  Not everything in the world fits
> the model.  Consider Newton's theory of gravity.  You would be correct
> in being skeptical that not all the orbits of all the planets can be
> described by Newtonian mechanics (in fact, observations of Mercury's
> orbit famously don't quite fit).  However, it's still a useful model
> and worth understanding.

I don't argue with any of that, but the question is about scope again, isn't it? Specifying a model becomes much easier when you know to what you should apply the model. Personally, I don't think of the whole Web as being limited to a particular class of user agents (whether they are called the "web platform" or not). I think the Web includes web servers. I think the Web includes users, crawlers and screen readers. Addressing the security model of a class of user agents is an excellent idea. But it is not a security model for the whole Web. 

Regards,

- John

> 
> In our case, the more modern pieces of the web platform are more
> likely to follow this model.  It's mostly just the old crufty parts of
> the platform that don't fit as well because we hadn't sorted all this
> stuff out when we designed them.  Unsurprisingly, that's also where
> most of the design-level vulnerabilities come from.
> 
>> Certainly it is one usable model, but there are others (you mention one yourself - "user agents could treat every URL as a separate principal").
> 
> Indeed, just as Newtonian mechanics is a really bad fit for Gödel's
> rotating universe.  However, in our universe, Newtonian mechanics is
> often quite useful.
> 
>> Although the title of this document is 'Principles of the Same-Origin Policy', you have partially described a security model of the web based in origin. It feels as if you should either restrict this document to talk only about origin-based security policy, or more fully describe the web security model to which you allude. Do screen readers, crawlers and curl/wget fit into that model?
> 
> Which brings us back to clarifying the scope.  I agree that's an
> important piece missing from the document.
> 
> Kind regards,
> Adam