Re: [hybi] #1: HTTP Compliance

Willy Tarreau <w@1wt.eu> Sun, 15 August 2010 20:39 UTC

Date: Sun, 15 Aug 2010 22:40:23 +0200
From: Willy Tarreau <w@1wt.eu>
To: Shelby Moore <shelby@coolpage.com>
Message-ID: <20100815204023.GG27614@1wt.eu>
References: <f7d4bb98e444b85b9bf1af6d4d9f0772.squirrel@sm.webmail.pair.com> <20100815100717.GA27614@1wt.eu> <1c2251f66b8d01880b6943561c07d3cb.squirrel@sm.webmail.pair.com> <20100815122648.GC27614@1wt.eu> <91f61d138b8a8a80271688a0e10a685a.squirrel@sm.webmail.pair.com> <4C67EABA.1020109@gmx.de> <30e5b12387f782afa0cb1df77de847fa.squirrel@sm.webmail.pair.com> <20100815154910.GF27614@1wt.eu> <392847bd7906678afc333a9011ae9aab.squirrel@sm.webmail.pair.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <392847bd7906678afc333a9011ae9aab.squirrel@sm.webmail.pair.com>
User-Agent: Mutt/1.4.2.3i
Cc: hybi@ietf.org
Subject: Re: [hybi] #1: HTTP Compliance

On Sun, Aug 15, 2010 at 01:46:20PM -0400, Shelby Moore wrote:
> This is a horrible reply, sorry I am cross-eyed at 1:45am...just let me
> punt after this please...

Well, this thread is becoming really too off-topic for this list, as it's
only discussing general points about HTTP. You should probably join the
httpbis list. Also, you should get some sleep; this is not a battle, your
arguments can wait until the next day.

Still, a few short responses below to address some of your concerns:

> >> The HTTP header is exposed to millions of naive programmers through
> >> popular and easy means, e.g. PHP header(), thus user agents that want to
> >> be robust on the WWW have no choice (the choice of failing for popular
> >> content is not a choice).
> >
> > We're not talking about failing for popular contents.
> 
> We would be if duplicate Content-length was a popular content error.

I've already seen duplicate CL headers in network captures, but those were
exactly that -- duplicates with identical values. I suggested on the httpbis
list to relax the check to allow at least that case (all CL values must be
equal), and some considered it "acceptable". That's what I did for haproxy.
Mind you, after tens of billions of requests passed through, I have yet to
see one which fails ;-) In fact, if some application were to send two
distinct content lengths, it would have no chance of working on the Net,
because of the variety of client and intermediary behaviours.
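To make the relaxed rule concrete, here is a minimal sketch (assumed logic,
not haproxy's actual code) of "all CL values must be equal": duplicates with
identical values are tolerated, anything else is rejected.

```python
def effective_content_length(header_values):
    """Return the single Content-Length if all duplicates agree.

    header_values: list of raw Content-Length values, e.g. ["42", "42"].
    Raises ValueError when the values disagree or are malformed, in which
    case a strict intermediary should reject the whole message.
    """
    lengths = set()
    for value in header_values:
        value = value.strip()
        if not value.isdigit():
            raise ValueError("malformed Content-Length: %r" % value)
        lengths.add(int(value))
    if len(lengths) != 1:
        raise ValueError("conflicting Content-Length values: %r"
                         % sorted(lengths))
    return lengths.pop()
```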

> What I meant is that the only way to become more secure is to limit
> features (so that we limit the variables/permutations of the unknown
> attacks and non-obvious non-compliance).  Limiting features is not
> popular.

Not always. If you don't provide enough building blocks, people try to
misuse existing ones in a variety of dangerous ways, then it's often
impossible to later implement controls for correctness. I appreciate
it a lot that this WG tries to define a rich set of building blocks
that applications will be able to use in a safe way. It will surely
result in more secure implementations.

> > For instance, there have been some add-ons due to the discovery of HTTP
> > request smuggling attacks (cf Julian's link above).
> 
> And for the future attacks discovered some more years later?  All during
> that time there will be false sense of security.

No, you're painting everything black. HTTP is everywhere, but it does not do
everything. It's a transport protocol, not much more. Security issues with
HTTP fall into a small set of categories:
  - making messages appear different to different agents (smuggling, encoding).
    Generally used to bypass filtering.
    => not due to the protocol by itself but to the way it is implemented
       and how corner cases are handled;

  - cache corruption => injection of bad data into caches that many people
    will be able to retrieve. Generally caused by bad implementations or
    poor configurations.

  - server information leak => retrieving data from servers or caches that
    should not have been accessible with the given credentials. Poor
    implementations or configs too (eg: "GET /..", proxy requests, TRACE and
    OPTIONS requests, authenticating the connection instead of the
    request, ...)

  - user information theft => try to trick browsers into believing it's safe
    to send some credentials, cookies or any information, by XSS, CSRF, ...
    Related to a specific usage.

  - mass attacks => use the ability to direct a visitor's browser to any
    victim (CSRF, cross-protocol attacks, redirects, ...). Related to a
    specific usage too.

  - and obviously all types of denial of service that depend on specific
    implementations (eg: preallocating buffers based on Content-Length,
    improper session termination on errors at some points, proxies looping
    over themselves, etc...).

I'm not saying that no other categories exist, but HTTP is low-level
enough to provide only limited support for building exploits. And it's not
about a "false sense of security"; it's just that unplanned cases tend to
fall into a known category and can be processed in a known manner.

The funny WebSocket handshake also tries to prevent some low-quality
implementations from falling into some of the categories above.
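The first category above (messages appearing different to different agents)
can be sketched in a few lines. This is an illustrative toy, not any real
proxy's code: two naive parsers, one taking the first Content-Length and one
taking the last, disagree on where the body ends, and the leftover byte
becomes the start of a "new" request for one of them -- exactly the
disagreement that smuggling exploits.

```python
RAW = (b"POST / HTTP/1.1\r\n"
       b"Host: example.com\r\n"
       b"Content-Length: 6\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"AAAAAB")  # 6 bytes after the blank line

def body_seen(raw, pick):
    """Return the body as seen by a parser using pick() to choose among
    duplicate Content-Length values."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    values = [line.split(b":", 1)[1].strip()
              for line in head.split(b"\r\n")
              if line.lower().startswith(b"content-length:")]
    return rest[:int(pick(values).decode())]

first_wins = body_seen(RAW, lambda v: v[0])   # sees 6 body bytes
last_wins = body_seen(RAW, lambda v: v[-1])   # sees only 5 body bytes
# For the second parser, the trailing "B" looks like the start of the
# next request on the connection.
```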

> > There's no seek for perfection here. Just to avoid known bad patterns.
> 
> 
> No argument against that.  I am just saying that avoiding bad SECURITY
> patterns when we know from nature there will be some in our design any way
> that we are not aware of, may not be the strongest argument in terms of
> prioritizing with design.  Because security is a very difficult
> proposition especially as we add features in finite time.

Exactly! And that's why I argued that WS should be HTTP compliant instead
of trying to imitate HTTP and covering, one by one, a few of the identified
security risks.

(...)
> > No, applications don't break implementations,
> 
> Applications are implementations.

Hey, they're free to break themselves, and after all, that happens every
day; it's called a bug. But I mean they won't break other components'
implementations. That's very important.

> > so applications even if
> > poorly designed, don't introduce security holes into those components.
> 
> Of course they do, as they are part of the overall system implementation.
> 
> > What introduces security holes is poor implementations.
> 
> Agreed.  And specifications can not stop that, because specifications
> actually introduce features.

Specifications tell implementations how to act on bad inputs and how to
react to unfriendly or unexpected external actions.

> >> My point is that it isn't a very strong argument to argue
> >> that the applications must be careful not to break the security of the
> >> middle ware, when the middleware has to deal with entropy any way.
> >
> > No, let me insist. The middleware *is not* sensible to such issues and
> > tries hard not to be. However, if draft-76 were to be ratified, all
> > implementors of intermediaries would have to choose between being secure
> > and HTTP compliant,
> 
> You don't know that they are perfectly "secure and HTTP compliant".  You
> don't even know all the ways that they are not.  No one can know that.

I disagree. While I know that most people don't test equipment before
deploying it, I can assure you that there are some places where every bit
of ARP, IP, TCP and HTTP behaviour is tested before the equipment can be
deployed, and whatever does not work as expected has to be worked around
or fixed by the vendor. Sometimes it can take more than one year to have
something fully validated. And when you know that your application firewall
is HTTP compliant with the Upgrade header, and you want it to behave that
way because it's safe, you certainly don't want it to suddenly be degraded
because some moron has added support for draft-76 to it!

> > Interestingly, with highly interoperable protocols such as HTTP, the
> > market cannot really violate the basic specs, otherwise the components
> > simply don't work at all and cannot sell.
> 
> You don't know that.  It is not "breaking the spec", rather it is doing
> things that work within the spec (at least how the market has implemented
> the spec) and yet still open security attacks (even ones we are not aware
> of yet).

The only part of the specs that can be silently violated is the control
of inputs. That's why you should run tests. How do your proxies behave
when facing a GET request with a Content-Length? How do they behave when
the Content-Length is between 2^31 and 2^32-1, or larger than 2^32, or
with multi-line headers, or with trailers? How do they resolve multiple
Host headers? How do they resolve IP addresses larger than 2^32? Does
your cache change the size of the cached object if it sees a
Content-Length in the trailers, etc.? Once you have established the set
of incompatibilities, it's up to you to decide whether there is any risk
in deploying that component, or to try another one. But in this respect,
security and features are not different: you want something that works.
Specs indicate how to correctly implement something so that it is safe
and reliable, but implementors have the right to be on the unsafe side.
Still, that does not compromise other components' own security.
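As a sketch of what such tests might look like (probe names and values are
my own invention, not any vendor's suite), each entry below is a raw
request exercising one of the corner cases just listed; a compliant
intermediary must either handle it or cleanly reject it:

```python
import socket

# Hypothetical corner-case probes; keys are descriptive labels only.
PROBES = {
    "get-with-body":  b"GET / HTTP/1.1\r\nHost: t\r\n"
                      b"Content-Length: 4\r\n\r\nabcd",
    "cl-31-to-32bit": b"POST / HTTP/1.1\r\nHost: t\r\n"
                      b"Content-Length: 3000000000\r\n\r\n",
    "cl-over-32bit":  b"POST / HTTP/1.1\r\nHost: t\r\n"
                      b"Content-Length: 5000000000\r\n\r\n",
    "two-hosts":      b"GET / HTTP/1.1\r\nHost: a\r\nHost: b\r\n\r\n",
    "folded-header":  b"GET / HTTP/1.1\r\nHost: t\r\nX-A: 1\r\n 2\r\n\r\n",
}

def probe(host, port, raw, timeout=5.0):
    """Send one raw request and return the status line the device answers
    with, so rejections (4xx) can be told apart from blind forwarding."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(raw)
        return s.recv(4096).split(b"\r\n", 1)[0]
```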

> Seems using a dedicated port would be a lot cleaner.

It will not pass through many enterprise proxies and firewalls, resulting
in lower adoption.

> > To give you an idea,
> > think about a reverse proxy in front of an infrastructure hosting 100000
> > domains, most of which run only HTTP, and many of which also run
> > WebSocket.
> > You have to change the way HTTP works on this reverse-proxy just in order
> > to support the WebSocket stack on some of them. And with such numbers,
> > it's
> > not a solution to think about configuring all of them by hand.
> 
> But if 99,900 of them work fine when passing the 8 extra characters
> through, then it was a nice 80/20 hack if there was any benefit to the 8
> extra chars (but I agree there isn't any benefit).

But it's not a matter of working fine. In fact, 100% of them will still
work fine *by default*. It's just that once you start weakening your
reverse proxy to send those 8 bytes, you end up truncating requests at a
different place on the reverse proxy than on the server, making it easy
for attackers to send uncontrolled requests to all those 100000 domains.
And now it's quite hard to tell which of those 100000 domains won't be
affected by the unexpectedly transformed requests.

> If the end server does not fix or reject the message, the most
> interoperable proxy in that case of two Content-length would be the one
> that acted as it if did not exist, otherwise the client is going to get a
> different result when it connects to the end server without going through
> your proxy.

That's precisely why specs are important, and the HTTP specs in particular
fall back to the lowest common capability. If you can't parse the
Content-Length header, you have to fall back to "Connection: close" mode
and only process one request. This can be announced to the other side by
adding an explicit header, so the other side will not be fooled. That's
how haproxy worked for the first 8 years of its existence, without
particular problems. Once again, the HTTP specs are there to permit all
sorts of behaviours and ensure maximal interoperability.
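A minimal sketch of that fallback (assumed logic, not haproxy's source;
Transfer-Encoding is ignored here for brevity): when the body length
cannot be determined, downgrade to "Connection: close" and let the peer
read until the connection ends.

```python
def forward_framing(headers):
    """Decide how to frame a forwarded message.

    headers: dict mapping lower-cased header name -> value; mutated in
    place when a downgrade is needed.
    Returns ("length", n) when the body length is known, or
    ("close", None) after downgrading to Connection: close.
    """
    cl = headers.get("content-length")
    if cl is not None and cl.strip().isdigit():
        return ("length", int(cl))
    # Framing unknown: announce the downgrade explicitly so the other
    # side is not fooled, and delimit the body by closing the connection
    # after this single request/response exchange.
    headers["connection"] = "close"
    return ("close", None)
```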

OK, stopping here now, I'm really late on my code :-/

Regards,
Willy