Re: [hybi] Last Call: <draft-ietf-hybi-thewebsocketprotocol-10.txt> (The WebSocket protocol) to Proposed Standard

Willy Tarreau <> Sun, 24 July 2011 20:42 UTC

Date: Sun, 24 Jul 2011 22:42:36 +0200
From: Willy Tarreau <>
To: Dave Cridland <>
Cc: Server-Initiated HTTP <>, IETF-Discussion <>

On Sun, Jul 24, 2011 at 09:18:40PM +0100, Dave Cridland wrote:
> >> Open the fastest web page and tell me how long it takes. Probably you
> >> have performed a DNS A query. I don't think that a xtra DNS query
> >> would be the bottleneck, never.
> >
> >On lossy networks such as 3G, they definitely are. A lost UDP packet is
> >not retransmitted nor signaled as lost, so the browser has to retry. However,
> >once the connection is established to the server, most losses are more or
> >less smoothed by TCP extensions such as SACK. So yes, it can take several
> >seconds to just resolve a host and then only a few hundreds of ms to retrieve
> >the objects. I've observed it.
> I think what might be colouring your opinion regarding DNS resolution  
> times on mobile is the difference between the first and subsequent  
> RTTs.

Note that in the point above I was not even talking about RTT but
explaining to Iñaki that sometimes DNS can be slower than the rest
of the transfer due to losses and slow retransmits.
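
To put rough numbers on this, here is a minimal sketch of the expected
lookup time when a lost query or response is only recovered by a resolver
timeout. The model and every parameter (loss rate, RTT, a fixed timeout,
the number of attempts) are illustrative assumptions, not measurements
from a real network or the behaviour of any particular resolver.

```python
import math


def expected_lookup_time(loss, rtt, timeout, max_attempts):
    """Return (expected seconds given success, probability of total failure).

    Assumptions of this sketch: each attempt needs both the UDP query and
    the UDP response to survive, losses are independent, and the resolver
    waits a fixed `timeout` before each retry.
    """
    p_ok = (1.0 - loss) ** 2      # query and response must both get through
    p_reach = 1.0                 # probability that attempt k is needed at all
    p_success = 0.0
    weighted_time = 0.0
    for k in range(max_attempts):
        p_here = p_reach * p_ok   # first success happens on attempt k
        weighted_time += p_here * (k * timeout + rtt)
        p_success += p_here
        p_reach *= 1.0 - p_ok
    if p_success == 0.0:
        return math.inf, 1.0
    return weighted_time / p_success, p_reach


if __name__ == "__main__":
    # Hypothetical values: 200 ms RTT, 5 s retry timeout, 2 attempts.
    for loss in (0.0, 0.05, 0.2):
        seconds, failure = expected_lookup_time(loss, 0.2, 5.0, 2)
        print(f"loss={loss:.2f} expected={seconds:.2f}s failure={failure:.3f}")
```

Under these assumptions a single retry costs a full timeout, which is
far more than the RTT itself. That is the effect described above, and it
is independent of how fast the link is once TCP is established.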

> 3G sessions, in a reasonable area, drop to around 100-150ms, although  
> they can go up to 300ms or higher if the network condition  
> deteriorates.

I agree. However, it can get a lot worse as soon as you have even a
little traffic on your link (e.g. objects fetched in parallel).

> However, the setup of DCH, the radio state normally  
> used for internet traffic (and needed for DNS requests and  
> responses), takes a healthy number of round-trips, so that the first  
> RTT is about three times longer.

Yes, and depending on the equipment vendor, you may even experience
losses during this phase for several seconds. But I was really not
talking about such issues, which add to the bad experience and should
not be considered the norm.

> Moreover, it's not clear to me that SRV lookup always (or even  
> commonly) adds an additional round-trip. Take an XMPP client SRV  
> lookup to my own server:
> $ dig SRV
> ; IN	SRV
> 86400 IN SRV 5 2 5222  
>	86400	IN	NS
> 86400	IN	A
> 86400	IN	AAAA	2001:470:1f09:882:2e0:81ff:fe29:d16a
> Note that the addresses of the actual server are returned in the  
> additional section. My understanding is that in practical terms  
> this'd always happen for in-zone cases. (If there's a large number,  
> you may not get them all, since they can be discarded without error,  
> but in practise you're likely to).

That's very interesting. For some reason, doing the same request from
here using either dig or host only retrieves the answer section, and
neither the authority nor the additional section.
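
The difference between the two observations can be sketched as follows.
This is only a toy model: the `SrvRecord` and `DnsResponse` types, the
host names and the addresses are all hypothetical placeholders, and no
real DNS library is involved. It shows that a client gets away with a
single query only when the additional section carries the addresses of
the SRV target.

```python
from dataclasses import dataclass, field


@dataclass
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


@dataclass
class DnsResponse:
    # Answer section: the SRV records themselves.
    answer: list
    # Additional section, modelled as target name -> list of addresses.
    additional: dict = field(default_factory=dict)


def targets_needing_lookup(response):
    """Return the SRV targets whose addresses were not supplied as glue."""
    return [
        record.target
        for record in response.answer
        if not response.additional.get(record.target)
    ]


srv = SrvRecord(priority=5, weight=2, port=5222, target="xmpp.example.net")

# Response with the additional section filled in: no follow-up query.
full = DnsResponse([srv], {"xmpp.example.net": ["192.0.2.10", "2001:db8::1"]})

# Response with only the answer section: one more A/AAAA query is needed.
stripped = DnsResponse([srv])

if __name__ == "__main__":
    print(targets_needing_lookup(full))
    print(targets_needing_lookup(stripped))
```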

> Finally, as I've said before, I think that any overhead involved is  
> going to be swallowed up in the noise of general session startup in  
> the WebSocket case. I do appreciate things are at the very least  
> perceived as different in the HTTP case, although I think SRV would  
> help solve issues (like off-site failover) there, too, but I think  
> the moment you have long-running stateful sessions, you'll end up  
> with the same impact to user experience of a few extra RTTs at  
> startup as is seen in XMPP, SIP, and so on - that is, none.
> 100ms extra on a 100ms request/response would be bad, I agree, but  
> that's not what we're talking about.

To ensure nobody gets me wrong: I'm certain this can help solve issues
*if it is optional*. If it becomes a MUST, then the negative effects
will outweigh the positive ones. In my opinion, the client should decide
whether to enable it or not. If architectures are built with that in
mind, then it will not be an absolute requirement, and migration to SRV
could happen smoothly, just as the migration to a mandatory Host header
did. What is needed is a way to avoid the extra DNS lookup in the many
cases where the SRV record is not there. Coupling it with CNAME as I
proposed could be a reasonable solution. There are surely better ones.
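
As a minimal sketch of what "optional, decided by the client" could look
like: the function below only issues an SRV query when the client has
opted in, and falls back to the plain host and default port otherwise.
The `_ws._tcp` service label, the `lookup_srv` callable and the tuple
layout are assumptions made for illustration. None of them comes from
the draft.

```python
def choose_endpoint(host, default_port, lookup_srv, use_srv=False,
                    service_label="_ws._tcp"):
    """Return the (hostname, port) pair the client should connect to.

    `lookup_srv(name)` is a caller-supplied function returning a list of
    (priority, weight, port, target) tuples, or an empty list when no SRV
    record exists. The service label is hypothetical.
    """
    if use_srv:
        records = lookup_srv(f"{service_label}.{host}")
        if records:
            # Lowest priority value wins; weight is ignored in this sketch.
            _priority, _weight, port, target = min(records, key=lambda r: r[0])
            return target, port
    # SRV disabled, or no SRV record published: behave as clients do today.
    return host, default_port


if __name__ == "__main__":
    table = {"_ws._tcp.srv.example.org": [(10, 0, 8080, "b.example.org"),
                                          (5, 0, 8443, "a.example.org")]}
    lookup = lambda name: table.get(name, [])
    print(choose_endpoint("srv.example.org", 80, lookup, use_srv=True))
    print(choose_endpoint("plain.example.org", 80, lookup, use_srv=True))
```

Note that the second call still pays for one SRV query that returns
nothing. That wasted lookup is exactly the cost that has to be avoided
for hosts which publish no SRV record.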

But we have to keep in mind that SRV cannot be made mandatory, because
existing infrastructure simply does not support it. This point alone is
enough to kill the mandatory requirement. And once it is optional, we
have to promote its adoption through a perceived benefit for the end
user at almost zero cost. Weighing all those points against the use
cases of WS leaves little room for SRV, but maybe someone wants to work
on it, identify the *real* benefits, and then write a draft.