Re: [hybi] TSV-Directorate review of draft-ietf-hybi-thewebsocketprotocol-07

Willy Tarreau <w@1wt.eu> Mon, 02 May 2011 08:55 UTC

Date: Mon, 02 May 2011 10:55:34 +0200
From: Willy Tarreau <w@1wt.eu>
To: Magnus Westerlund <magnus.westerlund@ericsson.com>
Message-ID: <20110502085534.GO10529@1wt.eu>
References: <4DBAEEC0.8050409@ericsson.com> <20110429183447.GD4085@1wt.eu> <4DBE66C3.4010805@ericsson.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <4DBE66C3.4010805@ericsson.com>
User-Agent: Mutt/1.4.2.3i
Cc: "hybi@ietf.org" <hybi@ietf.org>, "ifette+ietf@google.com" <ifette+ietf@google.com>, TSV Dir <tsv-dir@ietf.org>
Subject: Re: [hybi] TSV-Directorate review of draft-ietf-hybi-thewebsocketprotocol-07

Hi,

On Mon, May 02, 2011 at 10:09:39AM +0200, Magnus Westerlund wrote:
> > From memory, it's because XHR can't set a Sec- header so we're certain
> > that evil code running in browsers will not fake WS handshakes. We don't
> > want to prevent any non-browser from establishing a connection (that is
> > neither possible nor desired), but to ensure that owned browsers won't
> > do that.
> 
> I agree that the web-application running in the browser can't fake this
> if you trust the browser. However, a server has no way of knowing if
> this header comes from a browser doing everything correctly or if the
> websocket establishment is from a do-no-good implementation. Thus, I fail
> to see the security value of it.

The idea was that the risk comes from browsers. Those are the only ones
that can be remotely controlled on a massive scale, simply by visiting an
evil site. Other tools require access to the system. That said, I'm not
certain there is more risk in letting javascript fake a WS connection than
in letting javascript send a valid POST request somewhere.

> >> 22. Section 4.3:
> >> If I understand the masking correctly, for one particular frame it
> >> uses the same mask over and over again over each 32 bits. As the frame
> >> format supports messages of size up to 2^63, I think I can perform a cache
> >> poisoning attack anyway.
> > 
> > There was a very lengthy debate on this several months ago and it was hard
> > to reach a rough consensus. I invite you to consult the archives. In short,
> > we're certain products exist with a risk for the first bytes, we're not
> > aware of any product which present this risk past whatever bytes break
> > HTTP parsing, and the way to write such a vulnerable product appears very
> > difficult to several developers, though it can't be demonstrated that none
> > exists. The resulting proposal was accepted as a trade-off between a very
> > low risk and some serious performance hits in some specific usages, and
> > still provides good protection against the known issues.
> 
> Ok, long discussion has happened and you have a rough consensus. From my
> perspective I think you are going to need to defend the trade-off in two
> ways. First, make it clear in the security consideration that the attack
> is still possible to some degree. Secondly include some of the reasoning
> behind it. Otherwise people will continue to comment on it. And it will
> be discussed on IESG level I am pretty certain.

One of the points was that the 32-bit "GET\n" is already a valid request
for Apache, so whatever key size you use, it will still appear there
about once every 4 GB. However, Apache, just like any other known
HTTP server supporting keep-alive, does not scan randomly through a stream
looking for something that resembles a request.
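
To put numbers on this, here is a minimal Python sketch. It assumes the
masking is a plain XOR with a 4-byte key repeated over the payload, as
described in the quoted review comment; the helper name mask() and the
sample values are made up for illustration and are not taken from the draft.

```python
import os

def mask(payload: bytes, key: bytes) -> bytes:
    """XOR each payload byte with a 4-byte key, repeating the key every 4 bytes."""
    if len(key) != 4:
        raise ValueError("masking key must be 4 bytes")
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))

TARGET = b"GET\n"  # the 32-bit pattern discussed above

# Chance that a uniformly random 4-byte window equals TARGET.
p_window = 1 / 2**32

# Average amount of random data before one such window is expected.
expected_bytes = 2**32
expected_gib = expected_bytes / 2**30

# For any 4-byte plaintext there is exactly one key that turns it into TARGET.
plaintext = b"abcd"
bad_key = bytes(p ^ t for p, t in zip(plaintext, TARGET))
assert mask(plaintext, bad_key) == TARGET

# Applying the same key twice restores the original payload.
key = os.urandom(4)
assert mask(mask(b"hello websocket", key), key) == b"hello websocket"
```

With a random key the sender cannot pick bad_key, so hitting the pattern
is a matter of chance: roughly one window in 2^32, whatever the key.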

If you want my personal opinion, if the IESG could rely on deeper HTTP
knowledge and conclude that the masking needs to better protect the
first few bytes and does not need to be applied later, we'd be able to
make it optional and simplify the spec. I mean that in my opinion, the
masking is too much for an imagined risk, and its protection may be
limited anyway if the risky component was carefully designed to fail on this.

> From a personal view I do think the WG is taking the wrong trade-off to
> leave an evident security vulnerability in the protocol. By not securing
> this sufficiently the effect is going to be that people write firewall
> rules that block websocket. I do understand that there will be a
> performance hit. But from the perspective of a somewhat slower protocol
> that can be used and a protocol that can't I know what I pick.

Well, as I explained above, the risk has been "imagined" from scratch.
Anyone who has had to develop a server-side HTTP stack knows that it's
useless and almost impossible to scan a stream trying to
parse an HTTP request where there is none. Basically we're talking about
looking for "GET\n" in /dev/random. So any such method always comes down
to statistics in the end.

Regarding the performance hit, we showed that the only way to safely
try to encrypt the traffic was by putting the key at the end of a frame,
which forces every intermediary to buffer everything until it gets the
key to decode the frame. And that still does not stop the 32-bit pattern
above from appearing.

There were valid alternative suggestions, such as prepending a fake request
to the payload to force non-compliant intermediaries to process
this fake request instead of parsing the payload. Others consisted of
using a masking with the 7th bit always set so that a valid request could
never be encoded in the stream.
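
The second idea can be sketched as follows, under one possible reading
only: "7th bit" is taken here to mean the most significant bit (0x80) of
each key byte, and the payload is assumed to be 7-bit text. The names
high_bit_key() and mask() are hypothetical, not part of any proposal text.

```python
import os

def high_bit_key() -> bytes:
    """Hypothetical: random 4-byte key with bit 0x80 forced on in every byte."""
    return bytes(b | 0x80 for b in os.urandom(4))

def mask(payload: bytes, key: bytes) -> bytes:
    """XOR the payload with the 4-byte key, repeating the key every 4 bytes."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))

# A 7-bit ASCII payload that would look like a request if sent unmasked.
request = b"GET / HTTP/1.1\r\nHost: victim\r\n\r\n"
key = high_bit_key()
masked = mask(request, key)

# Every byte on the wire now has the high bit set, so no ASCII
# request line can appear in the masked stream.
assert all(b >= 0x80 for b in masked)

# The receiver recovers the payload with the same key.
assert mask(masked, key) == request
```

Note that this sketch only holds for 7-bit payloads: a payload byte that
already has 0x80 set would come out with the bit cleared, so a real scheme
would need more than this.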

> >> 29. Section 4.5.2:
> >>    An endpoint MAY send a Ping message any time after the connection is
> >>    established and before the connection is closed.  NOTE: A ping
> >>    message may serve either as a keepalive, or to verify that the remote
> >>    endpoint is still responsive.
> >>
> >> So one allows to have multiple outstanding PING messages? That sounds a
> >> bit strange over a reliable in-order delivery channel such as TCP. And
> >> if one does allow it should there be any rules for answering pings in
> >> order one receives them?
> > 
> > Indeed, we should possibly remind implementors that it's useless to send
> > a new ping as long as one is still pending.
> 
> I think there are two possible ways. Allow it and simply include a note
> that it is mostly useless. Or make it clear that one is not allowed to
> send new pings until one has received the pong.

Good point.
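
For the stricter of the two options, a minimal sketch of what an endpoint
could do is shown below. The PingTracker class is hypothetical, and it
assumes the pong carries the same application data as the ping it answers.

```python
class PingTracker:
    """Hypothetical helper: allows at most one unanswered ping at a time."""

    def __init__(self) -> None:
        self.pending = None  # payload of the outstanding ping, or None

    def try_send_ping(self, payload: bytes) -> bool:
        """Return True if a ping may be sent now, False if one is still pending."""
        if self.pending is not None:
            return False
        self.pending = payload
        return True

    def on_pong(self, payload: bytes) -> bool:
        """Return True if this pong answers the pending ping, and clear it."""
        if self.pending is not None and payload == self.pending:
            self.pending = None
            return True
        return False


tracker = PingTracker()
first = tracker.try_send_ping(b"ka-1")    # allowed, nothing pending
second = tracker.try_send_ping(b"ka-2")   # refused, ka-1 is unanswered
answered = tracker.on_pong(b"ka-1")       # matching pong clears the state
third = tracker.try_send_ping(b"ka-3")    # allowed again
```

The looser option (allow it, note that it is mostly useless) would need
no state at all on the sender.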

> >> 36. Section 7.1.1.
> >> This section should recommend that the client is the one that performs
> >> the TCP close so that it holds the TIME_WAIT state and not the server.
> > 
> > If we recommend something, we should recommend that the server does it, as
> > only the client is impacted by holding a time_wait, as it prevents it from
> > re-opening the connection for 2 MSL, while on the server it has no impact
> > as reopening a time_wait connection is immediate upon a new SYN with a
> > higher seq number. Do not forget that there will be quite a number of
> > intermediaries in the whole chain and we don't want to cause basic networking
> > issues between server-side proxies and the servers themselves.
> 
> Ok, I haven't thought of it from this perspective. That makes sense as
> long as you have no real issues with potential build up of TIME_WAIT
> states on the server. I guess the main issue would be the sheer number
> of connection states

TIME_WAIT states are always very cheap. My record was 35 million on a 4-GB
core2duo machine :-)  And internet-exposed HTTP servers are already tuned
to support large numbers. BTW, there will likely be many more TW from HTTP
than from WS because the former leads to far more open/close events.
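
As a rough sanity check on "cheap", the figures above give an upper bound
on the memory per TIME_WAIT entry. This assumes "4-GB" means 4 GiB and
that all of it was available to the entries, which overstates the real
per-entry cost.

```python
# Figures taken from the paragraph above; the GiB interpretation is an assumption.
ram_bytes = 4 * 2**30
time_wait_entries = 35_000_000

# Upper bound: pretend every byte of RAM was used by TIME_WAIT entries.
max_bytes_per_entry = ram_bytes / time_wait_entries
print(f"at most about {max_bytes_per_entry:.0f} bytes per TIME_WAIT entry")
```

So each entry cost well under 128 bytes on that machine.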

Cheers,
Willy