Re: [hybi] Last Call: <draft-ietf-hybi-thewebsocketprotocol-10.txt> (The WebSocket protocol) to Proposed Standard

Willy Tarreau <w@1wt.eu> Sun, 24 July 2011 10:52 UTC

Date: Sun, 24 Jul 2011 12:52:23 +0200
From: Willy Tarreau <w@1wt.eu>
To: Dave Cridland <dave@cridland.net>
Cc: Server-Initiated HTTP <hybi@ietf.org>, IETF-Discussion <ietf@ietf.org>
Subject: Re: [hybi] Last Call: <draft-ietf-hybi-thewebsocketprotocol-10.txt> (The WebSocket protocol) to Proposed Standard

[ Should we leave ietf@ietf.org in CC or not? I suspect that people
  who read this address will quickly get bored by hybi traffic. ]

On Sun, Jul 24, 2011 at 10:35:45AM +0100, Dave Cridland wrote:
> You're saying that you have a nebulous connection thing, that you  
> pump HTTP requests down, and you may want to use that for WebSocket  
> upgrade requests as well, by taking the ws URI and performing a  
> simple transformation on it into an HTTP URI to make the upgrade  
> request on.
> 
> So, given ws://example.com/foo/bar, you'll make the upgrade request  
> on http://example.com/foo/bar

No, I'm not saying that, because I don't understand what you mean here.
What I'm saying is that browsers try to reuse existing connections to
host:port. So if you want to connect to ws://example.com/foo/bar and
the browser already has a connection to example.com:80 because of a
previous HTTP request to that host, it's much more advantageous for it to
reuse that existing connection.
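
To illustrate the reuse argument, here is a minimal Python sketch of a
connection pool keyed on host:port. The names pool_key and ConnectionPool
are hypothetical and not taken from any browser; real browsers apply more
conditions before reusing a connection.

```python
from urllib.parse import urlsplit

# Default ports per scheme; ws/wss share the HTTP/HTTPS defaults.
DEFAULT_PORTS = {"http": 80, "ws": 80, "https": 443, "wss": 443}


def pool_key(uri):
    """Return the (host, port) pair a pool would be keyed on (assumption)."""
    parts = urlsplit(uri)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return (parts.hostname, port)


class ConnectionPool:
    """Toy pool: at most one idle connection per (host, port)."""

    def __init__(self):
        self._idle = {}

    def release(self, uri, conn):
        # Park a connection so that a later request can reuse it.
        self._idle[pool_key(uri)] = conn

    def acquire(self, uri):
        # Return an idle connection to the same host:port, or None.
        return self._idle.pop(pool_key(uri), None)
```

In this sketch, http://example.com/ and ws://example.com/foo/bar map to
the same key, so the second can reuse the first's connection. A URI on
ws.example.com maps to a different key and would need a new connection.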

An additional DNS request and a new connection both come with a cost.
In mobile environments, it's not uncommon to see 300-500 ms RTT. This
means that performing the additional DNS request could cost up to one
extra second.
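
As a rough worked example of that figure, the sketch below assumes one
round trip for the DNS lookup and one for the TCP handshake. Both counts
and the function name are assumptions made for illustration; real lookups
may hit a cache or need more round trips.

```python
def extra_setup_delay_ms(rtt_ms, dns_round_trips=1, connect_round_trips=1):
    """Estimate the added delay of a new lookup plus a new connection."""
    return rtt_ms * (dns_round_trips + connect_round_trips)


# RTT range quoted for mobile environments.
for rtt in (300, 500):
    print(f"RTT {rtt} ms -> about {extra_setup_delay_ms(rtt)} ms extra")
# prints 600 ms for a 300 ms RTT and 1000 ms for a 500 ms RTT
```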

> What Iñaki and I are arguing for is that for WS, there is simply one  
> additional step prior to finding a suitable connection, which is  
> performing an SRV lookup/selection.

That's clearly what I understood.

> And this new step can actually be performed before any of the HTTP  
> connection steps. So, roughly, in order to "do" WebSocket for  
> ws://example.com/foo/bar you would first do SRV lookup, which would  
> give you an equivalent HTTP URI to perform the upgrade request. So  
> for example, if you got:
> 
> _ws._tcp.example.com SRV 1 0 ws.example.com
> 
> You'd make the upgrade request on the HTTP URI of:
> 
> http://ws.example.com/foo/bar
> 
> Of course, if this failed to connect, you can find another HTTP URI  
> to try - something you cannot do without SRV.

If the connection fails here, HTTP will fail too, and right now, people
don't use SRV records with HTTP. I'm not judging whether it's good or
bad, I'm just saying what I'm seeing. Instead, people are using load
balancers, BGP and DNS mechanisms to ensure their site is available.

So whatever principle they're currently using with HTTP will also
provide the same mechanism for free to WS.

Yes, it could be nice to have SRV on both HTTP and WS, but we need to
get it adopted for HTTP first. And if it's not adopted at all in the
case of HTTP, that is probably because it's not perceived as bringing
a lot of benefit *for that specific usage*. Instead, it may even add
one DNS round trip, which clearly is not desirable with highly interactive
protocols such as HTTP.

Also, the notion of "failure to connect" is hard to qualify in interactive
environments. If my browser tries to connect and waits more than a few
seconds before retrying on another server, I will probably already have
pressed the Esc key. We all do that when clicking links on Google search
results: if you don't get a result in one second, you click another one.
It is not reasonable to use even lower timeouts when using SRV records,
because packets are lost on the net quite often, and sometimes
it's the client side which is slow. If the browser decides to fail within
one second and try another host, it will experience very frequent
difficulties.

Client-based failover (which is what SRV and MX provide) is nice with
protocols that can wait long enough to ensure the server is dead. Some
mailers try to connect for one minute or even more. At that point, the
client is certain that the server is dead. In a browser, nobody waits
that long, so giving such a verdict is quite hard.

Still, I think that SRV might be used for site failover; it would be a
lot cleaner than using very short TTLs as some are doing. Such a usage
is compatible with deployed infrastructures, because nothing prevents
a client from preferring an SRV record over a CNAME record.

Now with that in mind, I don't see how one could suggest a MUST for this.
MUSTs that don't directly come from absolute requirements are never 100%
respected and cause much more trouble than SHOULDs, which are recommended
for optimal usage. If we say SHOULD, sites will ensure they have a working
DNS config which always announces valid addresses in A/AAAA or CNAME and
that SRV is used for failover. If we say MUST, sites won't bother emitting
correct addresses in A/AAAA or CNAME, or will even announce different ones,
and we'll get a lot of trouble at many places where the wrong record will
be picked by proxies or by clients which can't use SRV for whatever
reason.
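
One way to picture the SHOULD behaviour is the Python sketch below. It
assumes the SRV records have already been resolved and are passed in as
(priority, weight, port, target) tuples. The function name is
hypothetical, weights are ignored for brevity, and this is not taken
from the draft.

```python
def candidate_endpoints(host, port, srv_records):
    """Order the endpoints a client could try for host:port.

    srv_records is a possibly empty list of
    (priority, weight, port, target) tuples. A client that cannot
    use SRV simply passes an empty list.
    """
    # Lower priority values are tried first; weight is ignored here.
    ordered = sorted(srv_records, key=lambda record: record[0])
    candidates = [(target, srv_port) for _prio, _weight, srv_port, target in ordered]

    # The plain A/AAAA (or CNAME) name must stay valid, so it is always
    # kept as the last resort.
    fallback = (host, port)
    if fallback not in candidates:
        candidates.append(fallback)
    return candidates
```

With an empty record list the function returns only the plain host and
port, which is why the A/AAAA or CNAME entry has to keep announcing
valid addresses.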

Regards,
Willy