Re: [hybi] #1: HTTP Compliance

Willy Tarreau <w@1wt.eu> Sun, 15 August 2010 15:48 UTC

Date: Sun, 15 Aug 2010 17:49:10 +0200
From: Willy Tarreau <w@1wt.eu>
To: Shelby Moore <shelby@coolpage.com>
Message-ID: <20100815154910.GF27614@1wt.eu>
References: <f7d4bb98e444b85b9bf1af6d4d9f0772.squirrel@sm.webmail.pair.com> <20100815100717.GA27614@1wt.eu> <1c2251f66b8d01880b6943561c07d3cb.squirrel@sm.webmail.pair.com> <20100815122648.GC27614@1wt.eu> <91f61d138b8a8a80271688a0e10a685a.squirrel@sm.webmail.pair.com> <4C67EABA.1020109@gmx.de> <30e5b12387f782afa0cb1df77de847fa.squirrel@sm.webmail.pair.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <30e5b12387f782afa0cb1df77de847fa.squirrel@sm.webmail.pair.com>
User-Agent: Mutt/1.4.2.3i
Cc: hybi@ietf.org
Subject: Re: [hybi] #1: HTTP Compliance

On Sun, Aug 15, 2010 at 10:55:28AM -0400, Shelby Moore wrote:
> Hi again all, please do not take my reply as combative.  It is late here
> and I am trying to wrap this up, and I am sure I have not taken enough
> time to compose this reply with the degree of care that I would like.

OK.

> > As a matter of fact, HTTPbis is going to require that recipients  reject
> > messages with this kind of problem; see
> > <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-11.html#rfc.section.3.3.p.7>.
> 
> The HTTP header is exposed to millions of naive programmers through
> popular and easy means, e.g. PHP header(), thus user agents that want to
> be robust on the WWW have no choice (the choice of failing for popular
> content is not a choice).

We're not talking about failing on popular content. Quite the opposite in
fact: we're talking about remaining compatible with clean implementations
so that we don't cause them to fail on popular content.

HTTP is a very widely deployed protocol and, despite the numerous
implementations, all of them are highly interoperable. One of the
reasons is that most mistakes related to the messaging semantics cause
the other side either to fall back to the lowest-level compatibility
mode (e.g. Connection: close) or to fail (400 Bad Request, 502 Bad
Gateway, etc.). Thus, while you may find a lot of buggy applications,
they still have their message framing correct (or at least correct
enough to be transported reliably).

> Security that depends on specification is not very strong security because
> it is impractical to quantify the determinism

No, that's where you are mistaken in my opinion. Security does not depend
on the spec, but the spec dictates what to do to avoid known security issues.
For instance, some requirements were added following the discovery of HTTP
request smuggling attacks (cf. Julian's link above).

> (you would have to compile
> statistics in the market of implementations in all anticipated scenarios,
> i.e. massive regression). 

You have to consider that this is the 4th RFC being written, and its main
goal is not to specify how things should be, but to document what is
currently being done and how to interoperate with existing components. So
there is no reason to expect regressions. Quite the opposite: it could help
various existing actors quickly become compatible with other implementations
they're not yet aware of.

> On top of this, it is my understanding (incorrect?) that HTTP itself
> allows for varying interpretations that might someday prove to be
> exploitable.

That's what is currently being addressed. If you're aware of some
vulnerabilities in some implementations that are only caused by a
different interpretation of the spec, then you should really join
the http-bis WG and report the issue so that it can be considered.

> My reasonable suspicion derives
> from the differences between what the spec says that user agents are
> required to do and what they are allowed to do. And because the various
> syntax elements of the specification are composable in probably way more
> permutations than any human has ever wrapped inside their head.

I really invite you to read the link above. The whole p1-messaging and
p2-semantics are very interesting to read. You'll find how accurate they
are on many aspects, and very little is left to chance.

> >> http://www.owasp.org/index.php/HTTP_Request_Smuggling
> >
> > It only exists when servers are not cautious enough with dangerous
> > inputs (eg: conflicting content-lengths). And it is a problem when
> > two implementations decide differently how to interpret the request
> > because the spec makes it ambiguous or does not cover the case. That's
> > been addressed in http-bis, and now that there is some guidance so that
> > everyone tries to do the same thing,
> 
> I predict you will find that the exponential function runs away faster
> than the perfection can fix itself.

Nobody is seeking perfection here, just trying to avoid known bad patterns.

> > There
> > are many of them around you without you ever noticing. That makes them
> > "transparent". They're generally quite robust, and most often they bring
> > the security themselves because their HTTP stack is well polished and
> > strict enough to protect the servers from issues that could result from
> > poor implementation in the application.
> 
> I can see that, if they have enough experience in the market place to know
> all the possible scenarios.  But isn't it a huge effort, because a new
> application comes along and breaks them, like WebSocket's proposed
> security hole?

No, applications don't break implementations, so applications, even if
poorly designed, don't introduce security holes into those components.
What introduces security holes is poor implementations. Trying to modify
a clean implementation to add support for an incompatible application may
lead to a poor implementation which is susceptible to security holes. But no
application will cause security holes by itself.

> My point is that it isn't a very strong argument to argue
> that the applications must be careful not to break the security of the
> middle ware, when the middleware has to deal with entropy any way.

No, let me insist. The middleware *is not* susceptible to such issues and
tries hard not to be. However, if draft-76 were to be ratified, all
implementors of intermediaries would have to choose between being secure
and HTTP compliant, or being compatible with WebSocket. And *that* is not
acceptable.
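
To make that choice concrete, here is a minimal sketch in Python. It assumes
(this detail is not stated in this message) that the draft-76 client handshake
is followed by 8 bytes that are not announced by any Content-Length header.
The function name split_request and the sample request are hypothetical, and
Transfer-Encoding is ignored for brevity.

```python
def split_request(buffer: bytes):
    """Split `buffer` into (head, body, leftover) using HTTP framing only.

    Simplified: only Content-Length is considered, not Transfer-Encoding.
    """
    head, sep, rest = buffer.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header block")
    length = 0  # a request without Content-Length carries no body
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return head, rest[:length], rest[length:]


# Hypothetical handshake: the trailing 8 bytes stand in for the
# unannounced bytes assumed above.
handshake = (
    b"GET /demo HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
    b"\r\n"
    b"12345678"
)
head, body, leftover = split_request(handshake)
```

Under these assumptions a compliant intermediary sees an empty body and treats
the 8 bytes as the start of another request. Forwarding them as part of the
handshake would require departing from HTTP framing for every request.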

> > It's quite the opposite. The more mission critical, the more of them you
> > find because they bring protocol validation, security, filtering, load
> > balancing, surge protection, content switching, etc... basically things
> > you don't need to show your photos to your family, but that are needed
> > everywhere applications are setup to serve tens to hundreds of thousands
> > of concurrent clients.
> 
> Won't we find that most all of those are optional and supplemental
> technologies that fallback to something other than total failure for the
> majority of the market even when some specification is violated by the
> market?

Interestingly, with highly interoperable protocols such as HTTP, the
market cannot really violate the basic specs, otherwise the components
simply don't work at all and cannot sell. If you're an ISP and you buy
a proxy-cache that does not correctly parse content-length, you will
probably not ask all your clients to switch to a patched browser, nor
will you contact all the sites they're accessing to ask them to change
their implementation. No. You will simply throw that buggy component in
the trash and buy another one from a different vendor.

> > In fact, the unreliable components are mainly those which work on string
> > matching in packets. Those do not rely on proxies and don't have a very
> > robust state machine. They can easily be fooled in many ways with hand
> > crafted requests. Some of them will already not be able to process the
> > protocol upgrade and will likely hang or reset the connection.
> 
> I don't know much about this but to the extent the middleware is parsing
> the application data in transit and acting upon it outside of any
> deterministic specification, we are going to have problem with any HTTP
> Upgrade extension and I assume that is what the encryption solves?

It's not the "encryption" by itself, but the fact of relying on something
non-HTTP. But it also comes with many new issues which are already addressed
by HTTP: no more session stickiness, no more accessibility from inside
most enterprises, no more virtual hosting, etc. So nothing is completely
black or white.

> > It's not the security argument against draft 76. It's that draft 76 is
> > not HTTP compliant and breaks with HTTP-compliant reverse proxies. The
> > only dirty hack that could be imagined to make those reverse-proxies
> > support it would not be acceptable in those reverse proxies because it's
> > incompatible with HTTP and it would break their security.
> 
> Agreed. I was just trying to present what I thought was an even stronger
> reason. If the HTTP proxies can work around all kinds of market
> application errors, they could support WebSocket proposal 76.

No, because doing this would break all other usages. To give you an idea,
think about a reverse proxy in front of an infrastructure hosting 100,000
domains, most of which run only HTTP, and many of which also run WebSocket.
You would have to change the way HTTP works on this reverse proxy just in
order to support the WebSocket stack on some of them. And with such numbers,
configuring all of them by hand is not a solution.

> The HTTP
> proxies that allow widest interoperability on the WWW, thus support HTTP
> request smuggling now (they allow the two Content-length headers to pass
> through).

No. Those which allow request smuggling are not the most interoperable
because if they randomly consider one of the two CL headers, half of the
time they'll pick the wrong one. The most interoperable ones are the
ones that fix or reject the wrong messages.
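
As a rough illustration of "fix or reject", here is a minimal Python sketch.
It assumes one possible policy: identical duplicate Content-Length values are
collapsed (the "fix"), while differing or malformed values cause rejection.
The function name resolve_content_length and the (name, value) header
representation are hypothetical, not taken from any particular proxy.

```python
def resolve_content_length(headers):
    """Return the single announced body length, or None if none is present.

    `headers` is a list of (name, value) string pairs in wire order.
    Raises ValueError when the message should be rejected.
    """
    values = []
    for name, value in headers:
        if name.strip().lower() == "content-length":
            # One header line may itself carry a comma-separated list.
            values.extend(part.strip() for part in value.split(","))
    if not values:
        return None
    for v in values:
        # Accept plain ASCII digits only: no sign, no blanks, no empty value.
        if not (v.isascii() and v.isdigit()):
            raise ValueError(f"invalid Content-Length: {v!r}")
    lengths = {int(v) for v in values}
    if len(lengths) > 1:
        # Two different lengths: picking either one enables smuggling.
        raise ValueError("conflicting Content-Length values")
    return lengths.pop()
```

An intermediary using such a check would answer a ValueError with a 400 and
never forward the ambiguous message, so the next hop cannot pick a different
length than the one the intermediary used.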

> In short, security and features and time are in tension against each
> other.  There will always be some tradeoff.  At some points along that
> solution space, the security argument is weaker than other priorities.  I
> felt by presenting the point that did, the case against proposal 76 was
> even more slamdunk, in order to persuade other fence sitters.

Well, once again, my point is that while security around the new protocol
may not be the most important argument, not forcing reliable and secure
components to become unreliable and insecure for all existing applications
is a very important argument.

Regards,
Willy