Re: [hybi] Handshake was: The WebSocket protocol issues.

Willy Tarreau <w@1wt.eu> Mon, 11 October 2010 20:41 UTC

Date: Mon, 11 Oct 2010 22:42:28 +0200
From: Willy Tarreau <w@1wt.eu>
To: Eric Rescorla <ekr@rtfm.com>
Cc: hybi <hybi@ietf.org>, Bjoern Hoehrmann <derhoermi@gmx.net>
Subject: Re: [hybi] Handshake was: The WebSocket protocol issues.

On Mon, Oct 11, 2010 at 07:07:18AM -0700, Eric Rescorla wrote:
> On Sun, Oct 10, 2010 at 10:33 PM, Willy Tarreau <w@1wt.eu> wrote:
> 
> > On Sun, Oct 10, 2010 at 09:17:21PM -0700, Eric Rescorla wrote:
> > (...)
> > > Thus, it's quite possible to implement an HTTP server which does not
> > > deadlock
> > > without looking at the Connection header at all, simply by having a short
> > > timeout.
> >
> > That's what I meant with the "deadlock", we can only end the transfer on
> > a timeout if the client expects a close and the server does not close.
> > Even if the timeout is short, it makes the situation very uncomfortable
> > for the user.
> 
> 
> I don't understand what you mean here. There are two issues:
> 
> (1) when the response is finished
> (2) when the connection can be closed.
> 
> Only the first affects the user experience, but they don't really interact.

Yes they do for components that rely on the close alone and don't
require any Content-Length (e.g. most monitoring tools). If the server
ignored the close in the request and waited for a second request
instead of closing, the client would wait for the server's close to
determine the end of the response, which only comes after the server's
keep-alive timeout.
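
To illustrate (a minimal sketch of such a close-delimited client, with
an illustrative host; this is not taken from any real tool), the final
recv() below only returns once the server actually closes, i.e. after
its keep-alive timeout if it ignored the requested close:

    import socket

    # HTTP/1.0-style client that delimits the response by the server's
    # close, the way many monitoring tools do.
    sock = socket.create_connection(("example.com", 80))
    sock.sendall(b"GET / HTTP/1.0\r\n"
                 b"Host: example.com\r\n"
                 b"Connection: close\r\n\r\n")

    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:      # EOF: the server finally closed...
            break          # ...only now is the response "complete"
        response += chunk
    sock.close()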

> Even
> if the client is using persistent connections, the server still needs to
> either
> incorporate a length indication or terminate the response after sending the
> response. In neither case does the user experience a stall.

This is mandatory for persistent connections only. While it's
recommended for other connections too, we still see some servers that
don't advertise a content length for HTTP/1.0 requests (even Tomcat
until very recently).
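
For reference, the delimiting logic a client ends up implementing looks
roughly like this (a simplified sketch of the RFC 2616 section 4.4
rules, ignoring HEAD and multipart/byteranges):

    def response_end_strategy(headers):
        # headers: dict mapping lower-cased header names to values
        if headers.get("transfer-encoding", "identity") != "identity":
            return "chunked"   # read chunk by chunk
        if "content-length" in headers:
            return "length"    # read exactly that many bytes
        return "close"         # no framing: wait for the server to close

The last branch is exactly the case above: without a length, the close
is the only delimiter.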

> > > Accordingly, it's not at all clear to me that it's safe to rely on
> > > Connection: close
> >
> > Well, the best way to wipe any doubt would be to test it on a variety of
> > servers.
> 
> No, I don't agree on this. Once you're reduced to this kind of survey you
> no longer have any reasonable assurance of security. Surveys are sometimes
> OK for resolving interoperability questions, but in this case you are
> relying on this property for a security purpose.

I still think that since the complexity of the handshake rests only on
a fear of massive attacks making use of browsers and shared hosting
environments, if we fail to find any deployed vulnerable server, the
massive-attack risk vanishes. That's not to say that no braindead
server could be attacked, but that such a vector is basically useless.

> > BTW, I believe that Adam's example was that he could write a program on
> > a shared server that could return a valid handshake to the CONNECT request.
> > But since the valid response is a 200, by definition it's an establishment
> > of a tunnel between both sides, which ends only by the close. So once again
> > there is no other request on the wire after the handshake (rfc2817, #5.3).
> >
> 
> Yes, that's why using CONNECT is a desirable feature, since for
> interoperability
> reasons servers/proxies cannot treat data that appears in the tunnel as if
> it were HTTP traffic.

And that's also why there cannot be a second request after the 200.
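
To make the sequence explicit (a sketch; the proxy address and target
are illustrative):

    import socket

    # Tunnel establishment via CONNECT (rfc2817).  After the 200, the
    # connection carries an opaque byte stream: the proxy cannot treat
    # what follows as HTTP, and there is no way to send a "second
    # request" to it on this connection.
    sock = socket.create_connection(("proxy.example.com", 3128))
    sock.sendall(b"CONNECT target.example.com:443 HTTP/1.1\r\n"
                 b"Host: target.example.com:443\r\n\r\n")

    status = sock.recv(4096)
    if status.split(b" ", 2)[1] == b"200":
        pass  # raw tunnel bytes from here until either side closes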

> > I don't agree with that point at all. We're doing the same mistake again
> > that we did with -76 handshake : the intermediaries should not wait for
> > the connection to be completely established to take a routing decision.
> > Look at this very common example :
> >
> >                       <--- hosting provider's infrastructure --->
> >
> >                                   /---- server farm A
> >  client --- internet --- content <----- server farm B
> >                          switch   \---- server farm C
> >
> > Some server farms are shared and other ones are dedicated to some
> > customers, which is the typical scenario we find at almost every
> > hosting provider's, because some customers with very poor code,
> > high traffic or nasty reputations can cause negative side effects
> > on other sites if shared on the same farms. Here, an HTTP content
> > switch (reverse proxy and/or load balancer) will simply look at
> > the host header and forward the request accordingly to the proper
> > server.
> >
> > With Adam's proposed handshake, this is not possible anymore with
> > currently deployed components. We would have to implement WebSocket in
> > all front components just so that they can decrypt the host header and
> > see what farm is supposed to process it, if any at all.
> 
> 
> Yes, I agree that this would be required. I don't agree that it's a
> dealbreaker.

Well, first it's not compatible with any existing infrastructure,
which makes it of limited use outside the lab. Second, it means that
the front server would first have to acknowledge a WS handshake on
behalf of any possible server behind it, and finally have to return in
WS frames a "Sorry man, I thought I could hand that connection to
someone but there's nobody where you want to go".
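
The routing decision such a content switch takes today is trivial
precisely because the Host header is readable up front; a rough sketch
(farm addresses and host names are made up):

    # Peek at the request head and pick a farm from the Host header.
    # With an encrypted handshake, no existing component can do this.
    FARMS = {
        "shared.example.com":  ("10.0.1.10", 80),  # shared farm A
        "bigcust.example.com": ("10.0.2.10", 80),  # dedicated farm B
    }

    def pick_backend(request_head: bytes):
        for line in request_head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"host":
                return FARMS.get(value.strip().decode("ascii", "replace"))
        return None  # no Host header: nowhere to route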

> > Not only this
> > is not compatible with existing HTTP infrastructure, but doing so makes
> > the frontend component sensible to new DoS attacks because it has to
> > maintain a context before even knowing if it has to handle the request.
> >
> >
> I don't see how this creates any meaningful increase in the state that must
> be maintained by the infrastructure element, which must already maintain
> TCP buffers, which are far larger.

Exactly, it *has* to maintain TCP buffers as well as contexts; it
cannot filter based on anything advertised in the handshake.

Right now in HTTP, it's easy to put a very short timeout on the
accept-to-request step because neither is subject to the RTT, so the
request must appear very quickly after the accept. Then the component
just has to proceed with the request and possibly reject it ASAP.

Here with the proposed handshake, you have to do the same thing, but
then send a WS handshake with a ping and wait for the pong for quite a
longer time (one RTT) *just to know* whether you have to handle this
connection. In my opinion, it makes DoSes much easier to perform on
such components than what is currently possible.
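
Concretely, the difference is between these two patterns (a sketch;
the timeout value is illustrative):

    import socket

    listener = socket.socket()
    listener.bind(("", 8080))
    listener.listen(128)
    conn, _ = listener.accept()

    # Plain HTTP: the request is not RTT-bound, so a very short timeout
    # is safe -- a legitimate client's request bytes are already in
    # flight when the connection is accepted.
    conn.settimeout(1.0)
    request = conn.recv(16384)

    # With the proposed handshake, we would instead have to answer the
    # handshake, send a ping, then block for up to a full RTT per
    # connection before knowing whether this connection is ours at all.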

All that is just based on the fear that some components will agree to
process an HTTP request after both a "Connection: close" and a CONNECT
that returned a 200.

I'd be tempted to suggest that such broken implementations simply deserve
to be taken down...

That said, even if we want to cover that case, we could also make it
mandatory to prefix the first frame in each direction with a pattern
made of bytes that are impossible in HTTP, such as "0x00 0xFF".
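
Enforcing that would be trivial on both sides; a sketch:

    FRAME_PREFIX = b"\x00\xff"  # bytes that can never start an HTTP message

    def send_first_frame(sock, payload: bytes):
        sock.sendall(FRAME_PREFIX + payload)

    def is_valid_first_frame(data: bytes) -> bool:
        # Drop anything that does not carry the non-HTTP prefix.
        return data.startswith(FRAME_PREFIX)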

Regards,
Willy