Re: [hybi] #1: HTTP Compliance

"Shelby Moore" <shelby@coolpage.com> Sun, 15 August 2010 14:54 UTC

Message-ID: <30e5b12387f782afa0cb1df77de847fa.squirrel@sm.webmail.pair.com>
In-Reply-To: <4C67EABA.1020109@gmx.de>
References: <f7d4bb98e444b85b9bf1af6d4d9f0772.squirrel@sm.webmail.pair.com> <20100815100717.GA27614@1wt.eu> <1c2251f66b8d01880b6943561c07d3cb.squirrel@sm.webmail.pair.com> <20100815122648.GC27614@1wt.eu> <91f61d138b8a8a80271688a0e10a685a.squirrel@sm.webmail.pair.com> <4C67EABA.1020109@gmx.de>
Date: Sun, 15 Aug 2010 10:55:28 -0400
From: Shelby Moore <shelby@coolpage.com>
To: Julian Reschke <julian.reschke@gmx.de>, Willy Tarreau <w@1wt.eu>
Cc: hybi@ietf.org
Subject: Re: [hybi] #1: HTTP Compliance

Hi again all. Please do not take my reply as combative. It is late here
and I am trying to wrap this up, so I am sure I have not composed this
reply with the degree of care that I would like.

> On 15.08.2010 15:06, Shelby Moore wrote:
>  > ...
>> For as long as servers are free to have variable
>> interpretations/implementations of HTTP (within the broad constraints of
>> the spec), then HTTP request smuggling will exist.  And since I understand
>> that variable interpretations/implementations are critical
>> to HTTP robustness, I conclude that "security" at the *transparent*
>> proxy is an oxymoron.  A *transparent* proxy should never be used for
>  > ...
>
> Request smuggling happens when an HTTP message is invalid (for instance,
> because of duplicate Content-Length headers). There is no requirement
> for "robustness" in the HTTP spec related to this (if you disagree,
> please be more specific...).
>
> As a matter of fact, HTTPbis is going to require that recipients  reject
> messages with this kind of problem; see
> <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-11.html#rfc.section.3.3.p.7>.

The HTTP header interface is exposed to millions of naive programmers
through popular and easy means, e.g. PHP's header(), so user agents that
want to be robust on the WWW have no choice (failing on popular content is
not a choice). The compatibility modes for HTML rendering in browsers are
another example. A specification is only a request; the actual standard is
whatever happens in the marketplace. Given sufficient participation (a
large enough sample), entropy will yield a bell curve of compliance, with
the center of the curve being imperfectly compliant (the average
implementor), one tail more compliant, and the other less so.
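To make the smuggling mechanism concrete, here is a minimal, hypothetical
Python sketch (not any real server's code) of how two implementations that
each tolerate a duplicate Content-Length can still disagree about where
the body ends:

```python
# Hypothetical illustration of HTTP request smuggling via duplicate
# Content-Length headers.  The request bytes and both "parsers" are
# made up for this sketch; no real implementation is being quoted.

RAW = (b"POST / HTTP/1.1\r\n"
       b"Host: example.com\r\n"
       b"Content-Length: 4\r\n"
       b"Content-Length: 11\r\n"
       b"\r\n"
       b"AAAABBBBBBB")

def body_length(raw: bytes, pick_first: bool) -> int:
    """Return the body length this (deliberately lenient) parser reads.

    One parser honors the first Content-Length, the other the last --
    both behaviors have existed in the wild.
    """
    head, _, _ = raw.partition(b"\r\n\r\n")
    lengths = [int(line.split(b":", 1)[1])
               for line in head.split(b"\r\n")
               if line.lower().startswith(b"content-length:")]
    return lengths[0] if pick_first else lengths[-1]

# A first-wins front end forwards a 4-byte body; a last-wins back end
# reads 11 bytes.  The 7 bytes in between desynchronize the connection
# and become the start of a "smuggled" next request.
print(body_length(RAW, pick_first=True))   # 4
print(body_length(RAW, pick_first=False))  # 11
```

Whichever length a front end honors, a peer honoring the other leaves
stray bytes on the reused connection, which is exactly the ambiguity
HTTPbis now requires recipients to reject.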

Security that depends on a specification is not very strong security,
because it is impractical to quantify its determinism (you would have to
compile statistics on the market of implementations across all anticipated
scenarios, i.e. massive regression testing).  Stronger security is a
deterministic body of code that cannot be subverted by any anticipated
outlier (entropy, or Murphy's Law).  Security is never absolute, because
nothing is ever 100% deterministic (the Shannon-Nyquist theorem guarantees
this, since nothing can be sampled/measured/contemplated with infinite
precision or for infinite time/experience).

On top of this, it is my understanding (correct me if I am wrong) that
HTTP itself allows varying interpretations that might someday prove
exploitable.  Off the top of my head: legacy support for the caching
changes since the early versions.  I suspect there are many such examples
that I am not aware of or that haven't been exploited yet.  My suspicion
derives from the differences between what the spec says user agents are
required to do and what they are allowed to do, and because the various
syntax elements of the specification are composable in probably more
permutations than any human has ever held in their head.  I wish I could
present a better example, but I am not much of a hacker nor an HTTP
expert.

The following example of a "claimed" mistake in HTML5 (I haven't found
time to email the HTML5 list yet) with respect to HTTP caching illustrates
the entropy in the market versus the perfection anticipated by the HTTP
spec:

https://bugzilla.mozilla.org/show_bug.cgi?id=563439#c8

So I guess what I am saying is that security becomes a weaker and weaker
proposition the more of the elements you rely upon are outside your
control and/or cannot be reasonably deterministically anticipated.  For
example, the more interfaces you proliferate under your control (including
permutations of composable elements), the weaker your security, due to the
inability to anticipate all the scenarios that nature will generate.  So I
prioritize security the closer it is to the inner core, and I don't
promise security across the world of proxies when I really need to admit I
am just hoping.

> On Sun, Aug 15, 2010 at 09:06:56AM -0400, Shelby Moore wrote:
>> I felt compelled to add my (what I feel is a more convincing) point to
>> the
>> mix, because the security argument against the extra 8 bytes appears to
>> be
>> weaker, because HTTP request smuggling already exists.
>>
>> http://www.owasp.org/index.php/HTTP_Request_Smuggling
>
> It only exists when servers are not cautious enough with dangerous
> inputs (eg: conflicting content-lengths). And it is a problem when
> two implementations decide differently how to interpret the request
> because the spec makes it ambiguous or does not cover the case. That's
> been addressed in http-bis, and now that there is some guidance so that
> everyone tries to do the same thing,

I predict you will find that the exponential function runs away faster
than the perfection can fix itself.  Allocating human capital is a very
critical function.  Standards that do it well become popular; the others
die in the market.  We need to be realistic about the fact of entropy.

> we should not try to bring the
> issue back with WebSocket !

Agreed, let's not add exploits, especially when doing so gains us nothing
and also carries a huge implementation cost.

>
>> And since I understand
>> that variable interpretations/implementations are critical
>> to HTTP robustness, I conclude that "security" at the *transparent*
>> proxy is an oxymoron.
>
> I don't agree. First we're not talking about "transparent proxies" but
> reverse-proxies, that act as the server when seen by the client.

I thought transparent meant that the endpoint user agent was not aware of
them.  That is all I needed for my point.

> There
> are many of them around you without you ever noticing. That makes them
> "transparent". They're generally quite robust, and most often they bring
> the security themselves because their HTTP stack is well polished and
> strict enough to protect the servers from issues that could result from
> poor implementation in the application.

I can see that, if they have enough experience in the marketplace to know
all the possible scenarios.  But isn't it a huge effort, given that a new
application can come along and break them, like WebSocket's proposed
security hole?  My point is that it isn't a very strong argument to say
that applications must be careful not to break the security of the
middleware, when the middleware has to deal with entropy anyway.  Again,
it is about the 80/20 rule and allocating human capital efficiently.

Joke: well, I guess it is good for jobs; so is giving people spoons
instead of shovels.


>> A *transparent* proxy should never be used for
>> mission critical functions,
>
> It's quite the opposite. The more mission critical, the more of them you
> find because they bring protocol validation, security, filtering, load
> balancing, surge protection, content switching, etc... basically things
> you don't need to show your photos to your family, but that are needed
> everywhere applications are setup to serve tens to hundreds of thousands
> of concurrent clients.

Won't we find that most of those are optional, supplemental technologies
that fall back to something other than total failure for the majority of
the market, even when some specification is violated by the market?

> In fact, the unreliable components are mainly those which work on string
> matching in packets. Those do not rely on proxies and don't have a very
> robust state machine. They can easily be fooled in many ways with hand
> crafted requests. Some of them will already not be able to process the
> protocol upgrade and will likely hang or reset the connection.

I don't know much about this, but to the extent that the middleware is
parsing the application data in transit and acting upon it outside of any
deterministic specification, we are going to have problems with any HTTP
Upgrade extension; I assume that is what the encryption solves?  I was
trying to get into that topic, to compare the tradeoffs of using BOSH
versus more market-standard HTTP.

>
>> Thus I don't think the security argument against proposal 76 is as
>> convincing.  Feel free to correct me, as I am only using deductive logic
>> of what I have read today, and am not speaking as an expert in the
>> field.
>
> It's not the security argument against draft 76. It's that draft 76 is
> not HTTP compliant and breaks with HTTP-compliant reverse proxies. The
> only dirty hack that could be imagined to make those reverse-proxies
> support it would not be acceptable in those reverse proxies because it's
> incompatible with HTTP and it would break their security.

Agreed. I was just trying to present what I thought was an even stronger
reason: if the HTTP proxies can work around all kinds of market
application errors, they could support WebSocket proposal 76.  The HTTP
proxies that allow the widest interoperability on the WWW thus support
HTTP request smuggling now (they allow the two Content-Length headers to
pass through).

In short, security, features, and time are in tension with each other.
There will always be some tradeoff.  At some points along that solution
space, the security argument is weaker than other priorities.  I felt that
by presenting the point I did, the case against proposal 76 became even
more of a slam dunk, the better to persuade the fence sitters.

>
>> > What is important is that if we make the websocket handshake pass
>> through
>> > HTTP components, it must comply with the protocol. HTTP is clear on
>> that:
>> > the protocol has switched *only once* the 101 has been seen. So a
>> reverse
>> > proxy must not pass those 8 bytes before seeing the 101.
>>
>> Otherwise we de facto change HTTP for every HTTP Upgrade protocol
>> extension.  We know we are changing HTTP, because the implementing
>> server
>> or proxy has to implement a new code path for our Upgrade protocol
>> extension.
>
> No, as explained above, we must not do that because it opens a new can
> of holes. HTTP requires too much interoperability, it must not to be
> cross-dressed for every protocol pretending to rely on it.

Agreed.  That was my point too.
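For concreteness, here is a hedged sketch in Python of why those 8 bytes
collide with HTTP framing (the header and key values are the illustrative
ones from the draft-76 examples, not from any real session): a GET carries
no Content-Length, so under HTTP rules the request ends at the blank line,
and the trailing key bytes belong to nothing until the 101 arrives.

```python
# Sketch of the draft-76 (hixie-76) client handshake.  An HTTP-compliant
# intermediary parses the GET, finds no body, and is left holding 8
# opaque bytes it cannot attribute to any message.

HANDSHAKE = (b"GET /chat HTTP/1.1\r\n"
             b"Host: example.com\r\n"
             b"Upgrade: WebSocket\r\n"
             b"Connection: Upgrade\r\n"
             b"Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5\r\n"
             b"Sec-WebSocket-Key2: 12998 5 Y3 1  .P00\r\n"
             b"\r\n"
             b"^n:ds[4U")  # 8 key bytes sent BEFORE any 101 response

def http_request_end(raw: bytes) -> int:
    """Where an HTTP-compliant parser says this request ends: at the
    blank line, since a GET with no Content-Length has no body."""
    return raw.index(b"\r\n\r\n") + 4

end = http_request_end(HANDSHAKE)
trailing = HANDSHAKE[end:]
# The 8 key bytes fall outside the request.  A compliant reverse proxy
# must not forward them before seeing the server's 101, yet the server
# needs them to compute that very 101: the two requirements deadlock.
print(len(trailing))  # 8
```

This is exactly the incompatibility Willy describes: the proxy is not
broken, the handshake simply is not HTTP.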

>
>> >> It would be an exponentially increasing cost problem of
>> interoperability
>> >> if every Upgrade protocol requires a new code path in the proxies.
>> >
>> > And it would not make sense at all since this would not be HTTP
>> anymore.
>>
>> Agreed.  Ditto I wrote above.
>
> I think we agree, it's just that you seem to be dismissing the security
> impacts of changing the way the HTTP Upgrade mechanism works, and I'd like
> to ensure that the issues are clear.

I am not totally dismissing the security impacts; I am only trying to
point out what the opposing view might say about the priority of "perfect
security in the real world".  I presented another point that we both agree
on, and I don't think the opposing camp can disagree that it is a huge
cost.  The security hole is not really adding any significant cost that
doesn't already exist (notwithstanding the hope to fix it in HTTPbis).