Re: [hybi] Comments to draft-ietf-hybi-thewebsocketprotocol-05

Ian Fette (イアンフェッティ) <ifette@google.com> Sun, 20 February 2011 02:40 UTC

As per the first page of the draft, the proper forum is hybi@ietf.org.
Replies inline.

On Sat, Feb 19, 2011 at 11:01 AM, Silvio Ventres
<silvio.ventres@gmail.com> wrote:

> Hello, Ian.
>
> Please let me know if there exists a forum for discussion of these points.
> At this point, it seems the WebSocket has been designed with very
> vague goals, while the spec draft has been written to encompass
> additional goals, which can only be guessed, and are not explicitly
> specified anywhere, such as prevention of some usage scenarios or
> stream fragmentation.
>
> The comments follow section numbers:
>
> 1.6: quote: "fields starting with |Sec-| cannot be set by an attacker
> from a Web browser, even when using |XMLHttpRequest|."
> "Sec-" are _only_ filtered for XHR as specified in XHR spec, not any
> HTTP specs per se.
>
>
No other browser API allows these headers to be sent, and browser vendors
would consider it a bug if one did. XHR is the closest thing to sending an
arbitrary request from a browser, and it forbids Sec- headers.


> 1.6: quote: "fail to establish a connection when data from other
> protocols, especially HTTP, is sent to a WebSocket server, for example
> as might happen if an HTML |form| were submitted to a
>   WebSocket server.  This is primarily achieved by requiring that the
> server prove that it read the handshake"
> What does the "server prove that it read the handshake" mean? How does
> the server "prove" anything? If the server sends any response to the
> HTML <form> request, that's all the response that was expected anyway,
> so who cares about "proof" ?
>

By performing a computation on data from the handshake, the server "proves"
that it actually parsed the request, at least in some minimal way, as
opposed to returning a canned response. The intent is to ensure the client
is communicating with a WebSocket server, not an arbitrary server or
intermediary that may be confused and returning a response without actually
understanding WebSockets.


>
> 1.6: quote: "This protocol is intended to fail to establish a connection
> with
>   servers of pre-existing protocols like SMTP or HTTP, while allowing
>   HTTP servers to opt-in to supporting this protocol if desired.  This
>   is achieved by having a strict and elaborate handshake, and by
>   limiting the data that can be inserted into the connection before the
>   handshake is finished (thus limiting how much the server can be
>   influenced)."
>
> What are the malicious scenarios that you are defending from here ?
> Seems like the "intention to fail" establishing connection is there as
> a "good to have" thing, without elaborating as to why it's a good
> idea.
> Moreover, the "strict and elaborate" handshake does not seem to be
> necessary anymore, as the data frame format differs significantly in
> the latest versions of the draft from plaintext-based SMTP and HTTP,
> and thus WS clients will be incompatible with non-WS services with or
> without the handshake.
>
>
Please see the literally hundreds of emails on this mailing list, especially
those from Adam Barth, about how deployed intermediaries can be confused by
attacker-controlled data, with the effect of poisoning their caches. This is
not something that the WG is going to revisit.


> 4.1: "..masked to avoid confusing network intermediaries, such as
> intercepting proxies."
> 4.2: "MUST NOT make it simple for a server to predict the masking-key
> for a subsequent frame."
> Wait. Can it be decided what the masking-key is there for? Is it to
> avoid confusing proxies? Or making life harder for servers?
> Why would you want to "NOT make it simple for a server to predict the
> masking-key" ?
> The masking key may as well have been chosen statically at compile
> time, or fixed in spec, in same way
> as the "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" nonce was.
> Or just removed completely, as it doesn't serve any clearly defined
> purpose.
>
>
If the next key is predictable, then the server can cause the client to send
data with a predictable value on-the-wire, which is exactly what masking is
designed to prevent. Again, I would encourage you to read the history of
this mailing list for further details on masking.


> 4.2: "..unpredictability of the masking-nonce is essential.."
> What is masking-nonce? Why is unpredictability essential?
>
>
See above.


> "..prevent the author of malicious application data from selecting.."
> 4.2: If talking about the masking-key, it's transmitted along with the
> data every time, how does that prevent anyone from decoding the data?
> Sending encoded data + key is same as sending plaintext in the first
> place, why add this complexity?
>

It's not to prevent people from decoding the data. It is to prevent an
attacker from controlling the bytes on the wire and potentially sending data
that could cause an intermediary to misinterpret the data for ill effect.
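
To make the point concrete, here is a minimal sketch of the masking step. It
assumes the scheme as later published in RFC 6455 (a 4-byte key, payload
octet i XORed with key octet i mod 4); the draft under discussion may differ
in details, and the name apply_mask is hypothetical. The receiver undoes the
mask trivially because the key travels with the frame, so nothing is hidden.
What the fresh random key changes is the byte pattern on the wire.

```python
import secrets


def apply_mask(data: bytes, key: bytes) -> bytes:
    """XOR every octet with key[i % 4]; the same call masks and unmasks."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(data))


# Application data chosen by a script; it looks like an HTTP request on purpose.
payload = b"GET /poison HTTP/1.1\r\n"

# Assumption: the client draws a fresh, unpredictable 4-byte key per frame.
key = secrets.token_bytes(4)

wire = apply_mask(payload, key)      # what intermediaries see
recovered = apply_mask(wire, key)    # what the receiving endpoint reconstructs

print(wire.hex())
print(recovered == payload)
```

Because the key is chosen after the script has handed over its payload, the
script cannot arrange for the masked bytes to spell out anything in
particular.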


> While at that, what or who is "author of malicious application data"
> and what is meant by "selecting" ?
>

Again, read the mailing list, but in brief: a user goes to a webpage
controlled by an attacker (or one showing an ad supplied by an attacker). The
attacker supplies JavaScript that, upon request from the attacker's server,
sends data to the attacker's server. If the attacker can control the format
of this data on the wire, there is a threat that the data can be used to
trick intermediaries for ill effect.


>
> 4.4:
>  TCP already handles fragmentation. Why add additional layer of
> fragmentation?
>

Because the frame includes the length of the frame, and without
fragmentation one would have to know the length of the message before being
able to send any part of the message. With fragmentation, a server can begin
sending a message whilst the message is being generated.
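
A minimal sketch of that idea, assuming the frame layout as later published
in RFC 6455 (FIN bit, 4-bit opcode, 7-bit length with 16- and 64-bit
extensions, unmasked server-to-client frames); the opcode values and layout
in the draft under discussion may differ. The names encode_frame and
fragment_stream are hypothetical. Each chunk is framed as soon as the next
one is known to exist, so the total message length is never needed up front.

```python
import struct
from typing import Iterable, Iterator

OP_CONTINUATION = 0x0
OP_TEXT = 0x1


def encode_frame(fin: bool, opcode: int, payload: bytes) -> bytes:
    """Build one unmasked frame: FIN/opcode octet, length field, payload."""
    first = (0x80 if fin else 0x00) | opcode
    n = len(payload)
    if n < 126:
        header = struct.pack("!BB", first, n)
    elif n < (1 << 16):
        header = struct.pack("!BBH", first, 126, n)
    else:
        header = struct.pack("!BBQ", first, 127, n)
    return header + payload


def fragment_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Frame chunks as they are produced, marking only the last one FIN."""
    it = iter(chunks)
    try:
        current = next(it)
    except StopIteration:
        yield encode_frame(True, OP_TEXT, b"")  # empty message
        return
    opcode = OP_TEXT
    for upcoming in it:
        # Another chunk exists, so `current` cannot be the final fragment.
        yield encode_frame(False, opcode, current)
        opcode = OP_CONTINUATION
        current = upcoming
    yield encode_frame(True, opcode, current)


frames = list(fragment_stream([b"Hel", b"lo"]))
print([f.hex() for f in frames])
```

Only the length of each fragment has to be known when its header is written,
which is the property TCP segmentation alone does not give a length-prefixed
framing.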

Again, please see the mailing list archives.


> If the reasoning is "to allow application to use incomplete received
> data instead of blocking waiting for completion",
> then the additional fragmentation doesn't add any benefit. Moreover,
> the WS fragments might _not_ align with TCP fragments,
> leading to delays in processing, while WS parser waits for the
> completion of "WS-non-fragmented" data which was fragmented
> by TCP. If the reasoning was, at some point, "to allow connection
> through proxies", then, again, by using CONNECT, the proxy
> doesn't care about the request length.
>
> 5.1.2: "If the user agent already has a WebSocket connection.."
> What is meant by "has a WebSocket connection" ? Is the connection in
> handshake state? In TCP CONNECTING state?
> Seems like it's implied that there is no limit to _established_
> connections, just limit on connections in CONNECTING
> or handshake state. This should be clarified.
>

I will clarify this. It is intended that there only be one connection in
CONNECTING state.


>
> Regarding the handshake and security in general, what is the
> Sec-WebSocket-Key made to prevent?
> Non-WebSocket servers answering WebSocket clients?
>
>
Yes


> It won't work anyway, because of the response requirement to contain
> echoes of custom headers and 101 status code.
>

Getting something to echo a request is not a significant enough bar to
satisfy various members of the working group. Performing computation on that
data ensures that the endpoint is WS-aware, especially given that some of the
data required for the computation is not sent, but instead exists only in
the protocol spec.
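
As a rough illustration, the computation as later published in RFC 6455
looks like the sketch below; the draft under discussion may differ in
details. The function name compute_accept is hypothetical, the GUID is the
fixed string from the spec, and the sample key is the one used in the RFC
6455 example.

```python
import base64
import hashlib

# Fixed string that appears only in the protocol spec, never on the wire.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


sample_key = "dGhlIHNhbXBsZSBub25jZQ=="
accept = compute_accept(sample_key)
print(accept)  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

An endpoint that merely echoes request headers cannot produce this value,
because the GUID is not part of the request.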


> But, by adding this complexity, you're practically removing embedded
> and low-computing-power devices from being able to
> act as WS servers.
>

This was already discussed ad nauseam; again, I refer you to the archives of
the list. Many of us do care about power, but believe that the cost of the
handshake is not a significant barrier.


> Section 5.2 actually talks about a server whose only purpose is
> processing the complex handshake. Why do you want to
> force people to add such front-end-servers instead of connecting to
> the real ws-server directly?
>
>
In reality, many people in large deployments do have frontends/load
balancers that terminate the connection. The spec does not require this
though.


> 5.2.2.3.3 "The nonce MUST be randomly selected randomly for each
> connection"
> Aside from the double-"randomly", why is the randomness so important here ?
> What is it made to prevent?
>

Double randomly is a typo :) As for why it's important, as I said earlier in
the response, if the server is controlled by an attacker and the server can
predict the next nonce, the server can instruct the client to send data
that, when masked with that predicted nonce, will have an appearance on the
wire that could confuse the intermediary for ill effect.
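
A minimal sketch of that attack, assuming XOR masking with a 4-byte key as
later published in RFC 6455 (the draft under discussion may differ in
details). The predictable key, the host name, and the function name mask are
hypothetical and exist only for illustration.

```python
def mask(data: bytes, key: bytes) -> bytes:
    """XOR every octet with key[i % 4]; applying it twice is the identity."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(data))


# Bytes the attacker wants an intermediary to see on the wire.
desired_wire = b"GET /poison HTTP/1.1\r\nHost: victim.example\r\n\r\n"

# Hypothetical flaw: the client picks keys predictably, so the
# attacker-controlled server knows the next one in advance.
predicted_key = bytes([0x12, 0x34, 0x56, 0x78])

# The server tells its script to send this pre-masked payload...
crafted_payload = mask(desired_wire, predicted_key)

# ...and the client's own masking step turns it back into the desired bytes.
on_the_wire = mask(crafted_payload, predicted_key)

print(on_the_wire == desired_wire)
```

With an unpredictable key the attacker cannot compute crafted_payload ahead
of time, which is why the randomness requirement matters.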


> Non-WS client getting data from the WS server?
> They wouldn't be able to parse the response anyway. And if they would,
> how is that different from a buggy "WS-conforming" client?
> The keys/nonces don't solve any real problems, they just add
> complexity to the handshake.
>
>
> Thank you for reading.
> Hope the draft can be mended to fix this, and the protocol simplified
> to conform to the goals declared.
>
> --
>  silvio
>