Re: [hybi] #1: HTTP Compliance

"Shelby Moore" <shelby@coolpage.com> Sun, 15 August 2010 17:45 UTC

Message-ID: <392847bd7906678afc333a9011ae9aab.squirrel@sm.webmail.pair.com>
In-Reply-To: <20100815154910.GF27614@1wt.eu>
References: <f7d4bb98e444b85b9bf1af6d4d9f0772.squirrel@sm.webmail.pair.com> <20100815100717.GA27614@1wt.eu> <1c2251f66b8d01880b6943561c07d3cb.squirrel@sm.webmail.pair.com> <20100815122648.GC27614@1wt.eu> <91f61d138b8a8a80271688a0e10a685a.squirrel@sm.webmail.pair.com> <4C67EABA.1020109@gmx.de> <30e5b12387f782afa0cb1df77de847fa.squirrel@sm.webmail.pair.com> <20100815154910.GF27614@1wt.eu>
Date: Sun, 15 Aug 2010 13:46:20 -0400
From: Shelby Moore <shelby@coolpage.com>
To: Willy Tarreau <w@1wt.eu>
User-Agent: SquirrelMail/1.4.20
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
Cc: hybi@ietf.org
Subject: Re: [hybi] #1: HTTP Compliance
X-BeenThere: hybi@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
Reply-To: shelby@coolpage.com
List-Id: Server-Initiated HTTP <hybi.ietf.org>
List-Archive: <http://www.ietf.org/mail-archive/web/hybi>
X-List-Received-Date: Sun, 15 Aug 2010 17:45:47 -0000

This is a horrible reply; sorry, I am cross-eyed at 1:45am... just let me
punt after this, please...

> On Sun, Aug 15, 2010 at 10:55:28AM -0400, Shelby Moore wrote:
>> Hi again all, please do not take my reply as combative.  It is late here
>> and I am trying to wrap this up, and I am sure I have not taken enough
>> time to compose this reply with the degree of care that I would like.
>
> OK.
>
>> > As a matter of fact, HTTPbis is going to require that recipients
>> reject
>> > messages with this kind of problem; see
>> > <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-11.html#rfc.section.3.3.p.7>.
>>
>> The HTTP header is exposed to millions of naive programmers through
>> popular and easy means, e.g. PHP header(), thus user agents that want to
>> be robust on the WWW have no choice (the choice of failing for popular
>> content is not a choice).
>
> We're not talking about failing for popular contents.

We would be if duplicate Content-Length were a popular content error.  I
am thinking in terms of non-obvious cases that you are not aware of yet;
"clean" is hindsight, not foresight.  Security is a constant battle: it is
never won, and never 100% secure.
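To make the duplicate Content-Length hazard concrete, here is a minimal
sketch (the raw request and the parser below are hypothetical, not any
real implementation):

```python
# Sketch (hypothetical request): why duplicate Content-Length headers
# enable request smuggling -- two naive parsers can disagree on how
# many body bytes belong to the request.
raw = ("POST / HTTP/1.1\r\n"
       "Host: example.com\r\n"
       "Content-Length: 6\r\n"
       "Content-Length: 0\r\n"
       "\r\n"
       "GET /x HTTP/1.1\r\n\r\n")

def content_lengths(raw):
    head, _, body = raw.partition("\r\n\r\n")
    values = []
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            values.append(int(value.strip()))
    return values, body

values, body = content_lengths(raw)
# A front end honoring the FIRST value forwards 6 bytes of "body";
# a back end honoring the LAST value sees length 0 and treats those
# same bytes as the start of a new, smuggled request.
print(values)  # -> [6, 0]
```

The ambiguity is not in either parser alone; it is in two reasonable
parsers picking different values for the same message.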

What I meant is that the only way to become more secure is to limit
features (so that we limit the variables/permutations of the unknown
attacks and non-obvious non-compliance).  Limiting features is not
popular.

It is important to admit that even one security hole is insecure.
Relative security is hard to quantify, because the next security attack
may harm more than the prior one, or the prior 99 of them.  The harm of a
security attack does not diminish with the duration of prior regression.


> Quite the opposite
> in
> fact, we're talking about remaining compatible with clean implementations
> so that we're not causing them to fail on popular contents.

We don't know about security attacks that haven't happened yet but are
already possible in that "clean" marketplace.  That something as obvious
as duplicate Content-Length headers is only now being dealt with makes me
reasonably wary of what is lurking.

> HTTP is a very widely deployed protocol and despite the numerous
> implementations, all of them are very much interoperable.

Agreed, that is a strong point: wide-scale use and open source code tend
to illuminate security risks sooner (more regression in the marketplace).

Open source helps too because hackers can identify vulnerabilities faster.

So the attacks may become less frequent, but that does not guarantee that
when an attack does eventually happen, it will be less harmful.


> One of the
> reasons is that most mistakes related to the messaging semantics cause
> the other side either to fall back to the lowest level compatibility
> mode (eg: connection: close) or to fail (400 bad request, 502 bad
> gateway, etc...). Thus, while you may find a lot of buggy applications,
> they still have their message framing correct (or at least correct
> enough to reliably be transported).


Yeah, the most obvious compliance is in the center of the bell curve.
That was my point.


>> Security that depends on specification is not very strong security
>> because
>> it is impractical to quantify the determinism
>
> No, that's where you are mistaken in my opinion. Security does not depend
> on the spec, but the spec dictates what to do to avoid known security
> issues.

If security depends on compliance and the market has non-compliance, then
security fails.  Worse is when we don't realize the market is
non-compliant until after the security breach.  And future cases will be
much less obvious than the duplicate Content-Length headers.
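For contrast, the stricter behavior HTTPbis asks of recipients can be
sketched roughly like this (function name and error text are my own
illustration, not from any draft):

```python
# Sketch of the stricter recipient behavior: refuse to guess when
# Content-Length headers conflict, rather than silently picking one.
def body_length(headers):
    """headers: list of (name, value) pairs from one message."""
    values = {v.strip() for (n, v) in headers
              if n.lower() == "content-length"}
    if len(values) > 1:
        # Conflicting declarations: reject instead of guessing.
        raise ValueError("400 Bad Request: conflicting Content-Length")
    return int(values.pop()) if values else 0

ok = body_length([("Host", "example.com"), ("Content-Length", "42")])
print(ok)  # -> 42
try:
    body_length([("Content-Length", "6"), ("Content-Length", "0")])
except ValueError as e:
    print(e)  # -> 400 Bad Request: conflicting Content-Length
```

Note that duplicate headers carrying the *same* value collapse to one
entry here, so only genuinely conflicting messages are rejected.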

> For instance, there have been some add-ons due to the discovery of HTTP
> request smuggling attacks (cf Julian's link above).

And what about the future attacks discovered some years later?  All during
that time there will be a false sense of security.


>> (you would have to compile
>> statistics in the market of implementations in all anticipated
>> scenarios,
>> i.e. massive regression).
>
> You have to consider the fact that the 4th RFC is being written, and its
> main
> goal is not to specify how things should be, but what is currently being
> done
> and how to interoperate with existing components. So there is no reason
> for
> regressions, quite the opposite, it could help various existing actors
> quickly
> become compatible with other implementations they're not yet aware of.


Great but that doesn't negate my points.


>> On top of this, it is my understanding (incorrect?) that HTTP itself
>> allows for varying interpretations that might someday prove to be
>> exploitable.
>
> That's what is currently being addressed. If you're aware of some
> vulnerabilities in some implementations that are only caused by a
> different interpretation of the spec, then you should really join
> the http-bis WG and report the issue so that it can be considered.


But that still wouldn't negate my points.  There is always another
unforeseen attack around the corner.


>> My reasonable suspicion derives
>> from the differences between what the spec says that user agents are
>> required to do and what they are allowed to do. And because the various
>> syntax elements of the specification are composable in probably way
>> more
>> permutations than any human has ever wrapped inside their head.
>
> I really invite you to read the link above. The whole p1-messaging and
> p2-semantics are very interesting to read. You'll find how accurate they
> are on many aspects, and very little is left to chance.

I will try to find time to read it.  Thank you.

>
>> >> http://www.owasp.org/index.php/HTTP_Request_Smuggling
>> >
>> > It only exists when servers are not cautious enough with dangerous
>> > inputs (eg: conflicting content-lengths). And it is a problem when
>> > two implementations decide differently how to interpret the request
>> > because the spec makes it ambiguous or does not cover the case. That's
>> > been addressed in http-bis, and now that there is some guidance so
>> that
>> > everyone tries to do the same thing,
>>
>> I predict you will find that the exponential function runs away faster
>> than the perfection can fix itself.
>
> There's no seek for perfection here. Just to avoid known bad patterns.


No argument against that.  I am just saying that avoiding known bad
SECURITY patterns, when we know by nature there will be some in our design
anyway that we are not aware of, may not be the strongest argument when
prioritizing the design.  Security is a very difficult proposition,
especially as we add features in finite time.

Here is the big epiphany.  Features only have to work how we expect them
to for the cases that we expect and are interested in.  But security has
to work 100% of the time for all possible permutations of those
imperfectly designed and implemented features, or we can lose everything.

"Improving security" is mostly a lie to ourselves.  We think it is
improved, until we suffer a catastrophic failure and realize it isn't.
Insurance is a big fraud of mankind (and yes, this is a technical
response, although I guess too bizarre for most people to agree, and I
don't really have time to expound; I think it is not worth it, as we have
only a minor disagreement on the importance of security, not on the
consensus to require HTTP Compliance):

http://www.marketoracle.co.uk/Article21650.html
http://www.marketoracle.co.uk/Article21864.html#comment93787

>
>> > There
>> > are many of them around you without you ever noticing. That makes them
>> > "transparent". They're generally quite robust, and most often they
>> bring
>> > the security themselves because their HTTP stack is well polished and
>> > strict enough to protect the servers from issues that could result
>> from
>> > poor implementation in the application.
>>
>> I can see that, if they have enough experience in the market place to
>> know
>> all the possible scenarios.  But isn't it a huge effort, because a new
>> application comes along and breaks them, like WebSocket's proposed
>> security hole?
>
> No, applications don't break implementations,

Applications are implementations.

> so applications even if
> poorly designed, don't introduce security holes into those components.

Of course they do, as they are part of the overall system implementation.

> What introduces security holes is poor implementations.

Agreed.  And specifications cannot stop that, because specifications
actually introduce features.

> Trying to modify
> a clean implementation

There is no perfectly clean implementation of anything in the world.

> to add support for an incompatible application may
> lead to a poor implementation which is susceptible to security holes.

Agreed, we shouldn't do it, given no overriding priority.  But if there
was sufficient priority, I have argued that the security argument could
possibly be weaker.

> But no
> application will cause security holes by itself.

No, just as no server implementation will cause a security breach by
itself.  They interact, and errors that arise as side effects of features
in each interact too.  These interactions sometimes present security
holes.


>
>> My point is that it isn't a very strong argument to argue
>> that the applications must be careful not to break the security of the
>> middle ware, when the middleware has to deal with entropy any way.
>
> No, let me insist. The middleware *is not* susceptible to such issues and
> tries hard not to be. However, if draft-76 were to be ratified, all
> implementors of intermediaries would have to choose between being secure
> and HTTP compliant,

You don't know that they are perfectly "secure and HTTP compliant".  You
don't even know all the ways that they are not.  No one can know that.

> or being compatible with WebSocket. And *that* is not
> acceptable.
>
>> > It's quite the opposite. The more mission critical, the more of them
>> you
>> > find because they bring protocol validation, security, filtering, load
>> > balancing, surge protection, content switching, etc... basically
>> things
>> > you don't need to show your photos to your family, but that are needed
>> > everywhere applications are setup to serve tens to hundreds of
>> thousands
>> > of concurrent clients.
>>
>> Won't we find that most all of those are optional and supplemental
>> technologies that fallback to something other than total failure for the
>> majority of the market even when some specification is violated by the
>> market?
>
> Interestingly, with highly interoperable protocols such as HTTP, the
> market cannot really violate the basic specs, otherwise the components
> simply don't work at all and cannot sell.

You don't know that.  It is not "breaking the spec"; rather, it is doing
things that work within the spec (at least as the market has implemented
the spec) and yet still open up security attacks (even ones we are not
aware of yet).


> If you're an ISP and you buy
> a proxy-cache that does not correctly parse content-length, you will
> probably not ask all your clients to switch to a patched browser, nor
> will you contact all the sites they're accessing to ask them to change
> their implementation. No. You will simply put that buggy component to
> the trash and buy another one from a different vendor.


I am not talking about popular content that fails in the market place.  It
wouldn't be popular.  That would be an oxymoron.  I am contemplating
popular content that has security attacks that you and others are not
aware of.

>
>> > In fact, the unreliable components are mainly those which work on
>> string
>> > matching in packets. Those do not rely on proxies and don't have a
>> very
>> > robust state machine. They can easily be fooled in many ways with hand
>> > crafted requests. Some of them will already not be able to process the
>> > protocol upgrade and will likely hang or reset the connection.
>>
>> I don't know much about this, but to the extent the middleware is
>> parsing the application data in transit and acting upon it outside of
>> any deterministic specification, we are going to have problems with any
>> HTTP Upgrade extension, and I assume that is what the encryption solves?
>
> it's not the "encryption" by itself, but the fact of relying on something
> non-HTTP. But it also comes with many new issues which are already
> addressed
> by HTTP : no more session stickiness, no more accessibility from the
> inside
> of most enterprises, no more virtual hosting, etc... So nothing is
> completely
> white or black.


Yeah, it really looks like a mess to me.  Seems using a dedicated port
would be a lot cleaner.  Sometimes entropy stacks up against you, and you
need to start clean.  You can only put so many band-aids or so much
spaghetti on a paradigm before chaos defeats us.  Everything in the
universe is decaying; it is the second law of thermodynamics.


>
>> > It's not the security argument against draft 76. It's that draft 76 is
>> > not HTTP compliant and breaks with HTTP-compliant reverse proxies. The
>> > only dirty hack that could be imagined to make those reverse-proxies
>> > support it would not be acceptable in those reverse proxies because
>> it's
>> > incompatible with HTTP and it would break their security.
>>
>> Agreed. I was just trying to present what I thought was an even stronger
>> reason. If the HTTP proxies can work around all kinds of market
>> application errors, they could support WebSocket proposal 76.
>
> No, because doing this would break all other usages. To give you an idea,
> think about a reverse proxy in front of an infrastructure hosting 100000
> domains, most of which run only HTTP, and many of which also run
> WebSocket.
> You have to change the way HTTP works on this reverse-proxy just in order
> to support the WebSocket stack on some of them. And with such numbers,
> it's
> not a solution to think about configuring all of them by hand.


But if 99,900 of them work fine when passing the 8 extra characters
through, then it was a nice 80/20 hack, if there was any benefit to the 8
extra chars (but I agree there isn't any benefit).  And to be honest,
everything we do is a hack of varying degree.  That is why the simplest
designs win: much less collateral damage.
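For reference, this is the framing problem with those 8 extra characters:
draft-76's client handshake sends them after the blank line with no
declared length, so an HTTP-compliant intermediary has no rule telling it
to forward them.  A minimal sketch (header values loosely modeled on the
draft's example, abbreviated):

```python
# Sketch: a draft-76 client handshake is a GET with no Content-Length,
# followed by 8 key bytes AFTER the end of the headers. An HTTP parser
# sees a bodyless GET and has no framing rule for the trailing bytes.
handshake = (
    "GET /chat HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Upgrade: WebSocket\r\n"
    "Connection: Upgrade\r\n"
    "Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5\r\n"
    "Sec-WebSocket-Key2: 12998 5 Y3 1  .P00\r\n"
    "\r\n"
).encode() + b"^n:ds[4U"   # the 8 extra bytes, undeclared by any header

head, _, trailing = handshake.partition(b"\r\n\r\n")
print(len(trailing))  # -> 8 bytes an HTTP parser cannot attribute
```

A proxy that forwards only what HTTP framing accounts for will hold or
drop those 8 bytes, and the handshake stalls.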

>
>> The HTTP
>> proxies that allow widest interoperability on the WWW, thus support HTTP
>> request smuggling now (they allow the two Content-length headers to pass
>> through).
>
> No. Those which allow request smuggling are not the most interoperable
> because if they randomly consider one of the two CL headers, half of the
> times they'll pick the wrong one. The most interoperable ones are the
> ones that fix or reject the wrong messages.

If the end server does not fix or reject the message, the most
interoperable proxy in that case of two Content-Length headers would be
the one that acted as if the duplicate did not exist; otherwise the
client is going to get a different result when it connects to the end
server without going through your proxy.


>
>> In short, security and features and time are in tension against each
>> other.  There will always be some tradeoff.  At some points along that
>> solution space, the security argument is weaker than other priorities.
>> I
>> felt by presenting the point that did, the case against proposal 76 was
>> even more slamdunk, in order to persuade other fence sitters.
>
> Well, once again, my point is that while security around the new protocol
> may not be the most important argument, not forcing reliable and secure
> components to become unreliable an insecure for all existing applications
> is a very important argument.

Qualifying "important" is the issue we were discussing.  In any case, I
will give you the last word.  Because my replies will be getting too far
off topic and I don't want to burden you all with that.

I do appreciate your debate and we could continue it in private if you
want after you take last public word.

Maybe we can move on to figuring out how to move WebSockets forward.
I think we already agree with the consensus on HTTP Compliance.

I have posted too much, exceeded my worthwhile time limit.  I don't think
it is that important to defend my ideas on security, this isn't a security
working group and besides I am not trying to become an expert in security.

I simply want a reliable and easy way to do server push.  I would also
like a reliable and easy P2P channel (but that is off topic too).

Wish it wasn't so hard to achieve.  Entropy is really kicking our butt. 
Maybe we need a new port!!  For P2P too!! Time to break clean.