Re: [hybi] #1: HTTP Compliance

Roberto Peon <fenix@google.com> Wed, 21 July 2010 14:42 UTC

Date: Wed, 21 Jul 2010 07:43:03 -0700
Message-ID: <AANLkTinN=f5tOur+GN9KQF+z90iNDSTH1wGgxPk1Gh8k@mail.gmail.com>
From: Roberto Peon <fenix@google.com>
To: Ian Hickson <ian@hixie.ch>
Cc: "hybi@ietf.org" <hybi@ietf.org>
Subject: Re: [hybi] #1: HTTP Compliance

On Tue, Jul 20, 2010 at 11:47 PM, Ian Hickson <ian@hixie.ch> wrote:

> On Wed, 19 May 2010, Jamie Lokier wrote:
> > >
> > > If we want to support multiple subprotocols at once, we can do so
> > > quite easily by just making the subprotocol list be comma-separated.
> > > Would this be a good idea?
> >
> > I think it is a good idea, although there is a risk of low-quality
> > but significant server implementations matching a literal string,
> > breaking when other values are added to the comma-separated list, and
> > therefore making it impossible for clients to actually use the
> > capability.
>
> Yeah, though since that bug would be specific to implementations of
> particular subprotocols, it would be pretty localised to individual
> communities. That's probably an acceptable problem.
>
> I've added this to the spec for now.
>
>   http://html5.org/tools/web-apps-tracker?from=5172&to=5173
>
>
> > But assuming the comma-separated list did catch on, a consequence of
> > that would be the "below the API" part of WebSocket would have a place
> > to add its own distinct entries to the comma-separated list, for
> > recognition by the other side as transport option requests (such as
> > compression, etc.), safe in the knowledge it wouldn't break negotiation.
> >
> > (Separate headers would be much cleaner for that, though).
>
> I don't see why we wouldn't just use separate fields for that. No need to
> overload the subprotocol field.
>
>
> > > >    - Trying a HTTP-based protocol if WebSocket is unavailable (ditto)
> > >
> > > That wouldn't work anyway because the WebSocket object isn't the same
> > > object as the XMLHttpRequest object and they therefore create entirely
> > > different connections.
> >
> > Just being different objects is not a technical reason for different
> > connections. Separate XMLHttpRequest objects have no problem sharing
> > connections. There is no technical reason why a WebSocket object could
> > not do the same, just as easily.  Especially in browsers which routinely
> > do this!
>
> I guess. I'm not convinced it's a good idea though.
>
>
> > > In any case, why wouldn't you know that Web Socket is available? I
> > > don't understand why you would guess that it is and then find it
> > > isn't. You need a URL to connect to a Web Socket, why would you make
> > > up a URL without knowing whether it'll work?
> >
> > Earlier in this thread, in a reply to you, we already gave an example.
> > It was the social networking client that talks to numerous de facto
> > standardised but different WebSocket protocols that the user shouldn't
> > have to know about.
> >
> > Basically because autonegotiation is more user-friendly, both in terms
> > of users having to be given and enter fewer details, and in terms of
> > telling the user what went wrong if things didn't work out.
> >
> > As you noted, people don't always pass around URLs - they often pass
> > around just a domain name, or a truncated URL that doesn't include the
> > "http:" prefix.
> >
> > I would agree if it were _complicated_ to do, but we are talking about
> > really trivial stuff here.  Trying one thing, then trying another, is
> > already handled by the handshake, and it is utterly trivial to do from a
> > script.  It's so easy that it's hard to see why any random web developer
> > wouldn't use it whenever it seemed like a good idea.
> >
> > It seems obvious to me that countless Javascripts will do exactly that.
>
> I could see trying multiple WebSocket protocols over one connection, but
> trying both HTTP and WebSocket connections, not to mention any other
> protocols the servers might provide, seems like massive complexity for
> negligible gain overall.
>
>
I fully expect that we'll end up with multiple WebSocket "sockets" per tab,
and we typically end up with many tabs.
This is true even when visiting only one large site. For instance, it is
fairly normal to have both corporate and personal email and calendaring open
at all times.
If each WebSocket is its own TCP socket, we'll face a drastic increase in
the number of connections from everyone.
Worse, these connections won't go away; they'll send junk bytes all the time
(keep-alives); they'll be unable to go to the same server if one is using
load balancing; they'll have more than an order of magnitude greater chance
of overloading NAT tables; they'll make DST caches in the kernel less
effective due to dilution (for smaller servers); etc.

Each and every one of these is a good reason that the complexity is warranted.
I have some "small" experience with ensuring that things can serve at scale.
There are very good reasons that I'm concerned about websocket scalability.

As for complexity: at worst, you have flow control and multiplexing.
Multiplexing involves a unique ID per channel. Flow control involves
sending periodic updates telling the other side how much it can send safely.
Of course you also need to have a table in which you do a lookup to see that
there is already a connection for that domain, including a reference to
that connection. None of this is difficult, even in concert.
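
To make the three pieces concrete, here is a minimal sketch, not a proposal
for the wire format. All names (ConnectionTable, SharedConnection, Channel,
send_credit) are hypothetical, credit is counted in bytes as an assumption,
and the "wire" is just a list standing in for a real TCP socket.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """One logical WebSocket multiplexed onto a shared connection."""
    channel_id: int   # unique ID per channel (multiplexing)
    send_credit: int  # bytes the peer has said we may still send


@dataclass
class SharedConnection:
    """Stand-in for a single TCP connection carrying many channels."""
    origin: str
    channels: dict = field(default_factory=dict)
    next_id: int = 1
    wire: list = field(default_factory=list)  # (channel_id, payload) frames

    def open_channel(self, initial_credit):
        channel = Channel(self.next_id, initial_credit)
        self.channels[channel.channel_id] = channel
        self.next_id += 1
        return channel

    def send(self, channel_id, payload):
        """Send as much as the credit allows; return the bytes accepted."""
        channel = self.channels[channel_id]
        allowed = min(len(payload), channel.send_credit)
        if allowed:
            self.wire.append((channel_id, payload[:allowed]))
            channel.send_credit -= allowed
        return allowed

    def on_credit_update(self, channel_id, extra):
        """Periodic update from the peer: it can take `extra` more bytes."""
        self.channels[channel_id].send_credit += extra


class ConnectionTable:
    """Lookup table: is there already a connection for this domain?"""

    def __init__(self):
        self._by_origin = {}

    def get(self, origin):
        conn = self._by_origin.get(origin)
        if conn is None:
            conn = SharedConnection(origin)
            self._by_origin[origin] = conn
        return conn
```

Two tabs asking the table for the same origin get the same SharedConnection
back, and each opens its own channel on it instead of a new TCP socket.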

-=R

>
> > > This can be supported entirely in Web Socket if we want to do that.
> > > Currently, it's not clear that supporting this is even desirable, due
> > > to the UI issues with doing so.
> >
> > (On that topic, how do you propose to get through a proxy on port 443
> > that needs proxy authentication before it passes a CONNECT request?)
>
> I guess the UA would have to use the existing (suboptimal) UI for proxy
> auth. It's not a good situation to be in; I wouldn't recommend it.
>
>
> > > WebSocket and HTTP are different protocols. Reusing the connection
> > > makes as much sense as reusing the connection of an FTP connection
> > > attempt to do an HTTP connection attempt.
> >
> > This is different.  Both WebSockets and HTTP requests are transient in
> > many cases.
>
> FTP is more transient than WebSockets. WebSockets is intended for
> connections that just stay open. It's not intended for transient use. It's
> not at all designed for transient use. It will do poorly when used for
> transient use.
>
>
> > If FTP was also transient, and it shared the same port with
> > HTTP, and many servers especially those designed for high performance
> > implemented both together, and many clients implemented both, then of
> > course it would make sense to _permit_ connection reuse between them,
> > without _requiring_ any implementation to do so.
>
> I think it would be pretty silly even in those cases, personally. Reusing
> connections amongst multiple protocols is simply asking for bugs. The
> benefits are minimal and the costs are high. If you want to do multiple
> things on a connection, use a protocol that supports all those things,
> don't try to switch back and forth between protocols.
>
> (By extension, I think the Upgrade mechanism in HTTP isn't a particularly
> good idea. The number of times the mechanism has been used to great
> success on the Web somewhat supports my position on this, I think.)
>
>
> > There is a much stronger case for reusing a WebSocket connection after a
> > graceful close that puts it into some kind of idle state.  That's
> > because users follow links every few seconds - and therefore WebSocket
> > scripts in ordinary pages will connect and disconnect from the same
> > server and port every few seconds in the current model.
>
> What use cases are you imagining that have Web Sockets used in such
> scenarios? None of the intended use cases even remotely resemble this. Web
> Sockets is intended for use on pages that are long-lived. It's quite
> possible that there are use cases on short-lived pages also, but they
> probably need their own protocol, optimised for those cases (e.g. HTTP and
> the text/event-stream EventSource mechanism). WebSockets isn't trying to
> be everything for everyone, it's just trying to address the specific use
> case of trivially-implementable long-lived TCP-like connections for Web
> browsers.
>
>
> > Making it optional for both sides adds *zero* complexity for authors who
> > don't do it. I am not seeing how you can think it makes any difference
> > to them.
>
> Clients have to support it. This means implementation cost, testing cost,
> and bug-fixing cost. It then propagates to documentation, which means
> reference costs, tutorial costs, etc. This further means that authors will
> see the feature in documentation. This leads to cognition costs when
> learning the material, information retrieval costs when distinguishing
> whether something is relevant or not when debugging, and communication
> costs when discussing the feature with others, e.g. when determining
> whether the feature is relevant to the question someone is asking. Then
> there's the cost of maintaining code that someone else has written that
> _does_ use the feature, there's the cost of debugging browser bugs when
> the feature is misimplemented and interacts even with code that doesn't
> use the feature, and finally there's the cost of implementing the feature
> on the server side once it makes its way onto check-mark lists of features
> that every web developer customer wants to support.
>
>
> > Seriously, how do you imagine it affects them?
>
> Optional features are a lie. Nothing is really optional in a platform like
> the Web's -- the only way a feature can be "free" is if it doesn't exist.
> This is why we have to justify everything we add, and make sure it's on
> the right side of the 80/20 line.
>
>
> > > who would have no idea why their WebSocket servers were suddenly
> > > receiving random HTTP requests and vice versa.
> >
> > That's a function of connecting to the wrong type of server already, and
> > it's already dealt with by the spec'd negotiation, which the wrong type
> > of server handles already by design.  Nothing new there.
>
> If we supported connection reuse and there was misconfiguration, then the
> kinds of failure scenarios would be much more varied than they are now.
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'