Re: [hybi] #1: HTTP Compliance

Ian Hickson <ian@hixie.ch> Thu, 22 July 2010 06:31 UTC

On Wed, 21 Jul 2010, John Tamplin wrote:
> On Tue, May 18, 2010 at 5:16 PM, Ian Hickson <ian@hixie.ch> wrote:
> > On Tue, 18 May 2010, Greg Wilkins wrote:
> > >
> > > If the handshake is HTTP compliant, then the connection for a 
> > > websocket handshake could be taken from the existing pool of idle 
> > > connections to a host.  That would save the time needed to establish 
> > > the connection.
> >
> > The resemblance to HTTP is nothing more than a hack to allow us to
> > share ports in certain advanced scenarios. Most Web Socket servers 
> > will know nothing about HTTP.
> 
> I disagree completely -- there is going to be a web server involved in 
> delivering the web application, and I think the more usual scenario is 
> that web server also implements the WebSocket server.  Why would someone 
> prefer to deploy two servers rather than one, except in the case where 
> their web server doesn't yet support WebSocket?  If this protocol is 
> successful, over time that will drop to 0.

I see no advantage in the common case to using the same software for both. 
It's far easier to just host them separately.

Why don't people use the same software for HTTP and DNS? Or IMAP and SMTP?  
Or IRC and FTP? Why would they act differently for HTTP and WebSocket?


> > Reusing connections is a level of complexity that is completely 
> > unwarranted and that would only be useful in the rarest of cases. It's 
> > a proposal that lies on completely the wrong side of the 80/20 line 
> > and would introduce _massive_ complexity for authors, who would have 
> > no idea why their WebSocket servers were suddenly receiving random 
> > HTTP requests and vice versa.
> 
> I'm not sold on connection reuse, but I am not sure where these random 
> HTTP requests would be coming from.  If a connection was to 
> ws://foo.org/socket, the connection was closed, and then another 
> connection was needed for http://foo.org/image.gif, presumably the 
> server at foo.org:80 is capable of answering either request since it 
> would have had to handle either request on a new connection.

In this scenario, we are assuming that the server _can't_ answer the Web 
Socket request (otherwise the connection wouldn't be reused). So we are 
talking about cases where people are attempting to connect to servers that 
don't exist. If we're talking about that, then I don't see why it's any 
more of a stretch to imagine trying to connect to a Web Socket server, 
having it succeed from the server's point of view but fail from the 
client's point of view, and then having the client reuse the connection 
for some bogus HTTP request.

In any case, reusing connections when the server fails to return a valid 
Web Socket response but does return a valid HTTP response is an 
optimisation that will help in only the rarest of cases, all of which 
indicate failure and are thus likely to be cases where the user doesn't 
really care about the milliseconds saved.
 

On Thu, 22 Jul 2010, Greg Wilkins wrote:
> 
> Currently the WS handshake can only be rejected by closing the 
> connection and discarding any potential HTTP response.  Thus a webapp 
> that wishes to fall back to a non-ws transport will have to establish a 
> new connection, maybe negotiate TLS, then handshake the new transport.  
> Thus there will be an extra 2 or 3 round trips to establish the 
> fall-back transport.

The only time this would be useful is when the script doesn't know ahead 
of time which host it will be connecting to, and doesn't know ahead of 
time what protocols that host will support, but where it does know that it 
will support either a Web Socket server or an HTTP-based mechanism. This 
will only occur during the transition period where some sites provide an 
HTTP-based protocol but not a Web Socket version, but where other sites 
provide Web Socket equivalents.

This is such an edge case that optimising for it should only be done if it 
can be essentially done for free. This is not the case here. Debugging 
connection reuse will be a huge pain. It's not worth it.

If you really truly want to handle this case, just invoke the 
XMLHttpRequest constructor at the same time as the WebSocket constructor, 
and then drop whichever one fails.
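
A minimal sketch of that pattern, written in Python rather than browser 
script so it is self-contained. The two connector callables are 
hypothetical stand-ins for the WebSocket and XMLHttpRequest constructors; 
the name connect_either and the choice to prefer the Web Socket when both 
succeed are assumptions made for illustration only.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

# Hypothetical connector type: an async callable that returns a connection
# handle (a plain string here) or raises if the attempt fails.
Connector = Callable[[], Awaitable[str]]


async def connect_either(ws: Connector, http: Connector) -> Tuple[str, str]:
    """Start both attempts at once and keep whichever one did not fail."""
    # return_exceptions=True hands failures back as values instead of
    # raising, so one failed attempt does not abort the other.
    ws_result, http_result = await asyncio.gather(
        ws(), http(), return_exceptions=True
    )
    if not isinstance(ws_result, BaseException):
        # Assumption: prefer the Web Socket when both attempts succeed.
        # A real client would also close the unused HTTP connection here.
        return ("websocket", ws_result)
    if not isinstance(http_result, BaseException):
        return ("http", http_result)
    raise ConnectionError("both transports failed")
```

This waits for both attempts to settle before choosing, which keeps the 
logic simple at the cost of not returning as early as it could.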


On Thu, 22 Jul 2010, Jamie Lokier wrote:
> 
> As noted some time ago, even when WS negotiation *succeeds*, it can be
> slower than comet-style HTTP, both slower in sending the first messages, 
> and slower in receiving the first responses.
> 
> It means latency-optimised apps may open *two* connections in parallel: 
> One comet-style HTTP, and one WebSocket.  They will communicate initial 
> messages over the HTTP connection, and switch to the WebSocket 
> connection when that is ready.  That's not kind on low bandwidth links, 
> nor easy to program, so it's an ugly compromise.

Could you give an example of an app where the speed at which the Web 
Socket connection is established matters? I can't think of any case where 
the client needs to send information that quickly -- after all, the user 
won't have started doing anything within one RTT of the page loading. (The 
server can easily include any data it wants in its response to the 
original HTTP request, so this is presumably not to _get_ information.)


On Wed, 21 Jul 2010, Roberto Peon wrote:
> >
> > I could see trying multiple WebSocket protocols over one connection, 
> > but trying to try both HTTP or WebSocket connections, not to mention 
> > any other protocols the servers might provide, seems like massive 
> > complexity for negligible gain overall.
>
> I fully expect that we'll end up with multiple websocket "sockets" per 
> tab

Presumably to different hosts.


> and we typically end up with many tabs.

Tabs can share a single WebSocket to a single host using shared workers.


> [...] As for complexity! At worst, you have flow control and 
> multiplexing. Multiplexing involves a unique ID per channel. Flow 
> control involves sending periodic updates telling the other side how 
> much it can send safely. Of course you also need to have a table in 
> which you do a lookup to see that there is already a connection for that 
> domain, including a reference to that connection. None of this is 
> difficult, even in concert.

Over the past couple of months, we've had several Web developers come into 
the #whatwg channel and ask for help implementing the current Web Socket 
draft. We've seen all kinds of difficulties implementing just the current 
spec! People using regular expressions over the buffer to parse the 
handshake [1], people not considering that the handshake might be split 
into two packets, people writing code that reads straight off the end of 
their buffer if the data sent to their server isn't well-formed per the 
handshake... none of what's in the current draft is "difficult", but it's 
still difficult enough, as far as I can tell.

[1] (Which I incidentally expected; that's why there are two keys, so you 
can't trick such naive implementations by smuggling a key in the resource 
name. I didn't expect to see code saved by this so soon.)
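
To illustrate two of the pitfalls listed above (a handshake split across 
packets, and reading past the end of the buffer), here is a minimal 
sketch of an incremental reader. It only waits for the blank line that 
ends an HTTP-style header block; it does not model the keys or any other 
detail of the draft. The class name, the method names and the 8192-byte 
cap are all assumptions for illustration.

```python
from typing import Optional

MAX_HANDSHAKE_BYTES = 8192   # assumed cap, not a value from the draft
TERMINATOR = b"\r\n\r\n"     # blank line ending an HTTP-style header block


class HandshakeBuffer:
    """Accumulates bytes until a complete header block has arrived."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> Optional[bytes]:
        """Add a chunk; return the header block once complete, else None."""
        self._buf.extend(chunk)
        # Search the whole buffer, so a terminator that straddles two
        # chunks is still found.
        end = self._buf.find(TERMINATOR)
        if end == -1:
            # Incomplete: refuse to grow without bound on malformed input.
            if len(self._buf) > MAX_HANDSHAKE_BYTES:
                raise ValueError("handshake too large")
            return None
        cut = end + len(TERMINATOR)
        head = bytes(self._buf[:cut])
        del self._buf[:cut]  # keep any bytes that followed the headers
        return head

    @property
    def pending(self) -> bytes:
        """Bytes received after the header block, not yet consumed."""
        return bytes(self._buf)
```

The caller feeds every chunk read from the socket and parses only once 
feed() returns a value, so it never looks at bytes that have not arrived.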


On Thu, 22 Jul 2010, Jamie Lokier wrote:
> Willy Tarreau wrote:
> >
> > [Good description of transparent proxies at ISPs with configurable 
> > HTTP-aware rules on the routers.]

What Willy wrote was not a description of transparent proxies but of 
man-in-the-middle proxies. Transparent proxies are a different beast 
altogether. Please see the HTTP spec for details. Man-in-the-middle 
proxies are not legitimate per the HTTP spec as far as I can tell, and are 
the cause of many problems on the Web (such as the lack of our ability to 
deploy pipelining).


On Thu, 22 Jul 2010, Willy Tarreau wrote:
> 
> There are not that many ISPs in each country, I mean there are far fewer
> ISPs than there are web sites or potential WebSocket implementers. 
> There's a high pressure on them to work as expected by customers.

Not high enough, clearly, or we'd be able to deploy pipelining.


On Thu, 22 Jul 2010, Willy Tarreau wrote:
> > 
> > (Note: from a conformance standpoint, the "server" includes the 
> > proxy.)
> 
> as seen from the client, yes. As seen from the proxy or the server or 
> any intermediate between them, no :-)

I mean as seen from the point of view of conformance to the specification.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'