Re: [hybi] Framing take IV

Jamie Lokier <jamie@shareable.org> Wed, 04 August 2010 03:34 UTC

Return-Path: <jamie@shareable.org>
X-Original-To: hybi@core3.amsl.com
Delivered-To: hybi@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id CF4963A67BD for <hybi@core3.amsl.com>; Tue, 3 Aug 2010 20:34:55 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.512
X-Spam-Level:
X-Spam-Status: No, score=-2.512 tagged_above=-999 required=5 tests=[AWL=0.087, BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id B9LxmPUohq8W for <hybi@core3.amsl.com>; Tue, 3 Aug 2010 20:34:54 -0700 (PDT)
Received: from mail2.shareable.org (mail2.shareable.org [80.68.89.115]) by core3.amsl.com (Postfix) with ESMTP id 6D0DB3A6405 for <hybi@ietf.org>; Tue, 3 Aug 2010 20:34:54 -0700 (PDT)
Received: from jamie by mail2.shareable.org with local (Exim 4.63) (envelope-from <jamie@shareable.org>) id 1OgUlE-0005cI-Rx; Wed, 04 Aug 2010 04:35:20 +0100
Date: Wed, 04 Aug 2010 04:35:20 +0100
From: Jamie Lokier <jamie@shareable.org>
To: Ian Hickson <ian@hixie.ch>
Message-ID: <20100804033520.GW27827@shareable.org>
References: <AANLkTinyrDoG5d_Ur6HVRy=SgMPjLzJtpJ++Ye=1DQdj@mail.gmail.com> <Pine.LNX.4.64.1008040050040.5947@ps20323.dreamhostps.com> <28A6543A-5CA6-42B7-8D2E-F5511EE20008@apple.com> <AANLkTik0-gG-CE5LNt+qDN9X1QupN0dnFtKcbt2athqO@mail.gmail.com> <Pine.LNX.4.64.1008040134310.5947@ps20323.dreamhostps.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <Pine.LNX.4.64.1008040134310.5947@ps20323.dreamhostps.com>
User-Agent: Mutt/1.5.13 (2006-08-11)
Cc: hybi@ietf.org
Subject: Re: [hybi] Framing take IV
X-BeenThere: hybi@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Server-Initiated HTTP <hybi.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/hybi>, <mailto:hybi-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/hybi>
List-Post: <mailto:hybi@ietf.org>
List-Help: <mailto:hybi-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/hybi>, <mailto:hybi-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 04 Aug 2010 03:34:56 -0000

Ian Hickson wrote:
> On Tue, 3 Aug 2010, Ian Fette (イアンフェッティ) wrote:
> > 
> > For us it's the number of open sockets on the frontend, plain and 
> > simple. Each open socket == more kernel memory for socket buffers. 
> > Whereas, if we have multiplexing, I can take in a frame, see what stream 
> > it is, send it off to the appropriate backend and be done. Holding open 
> > multiple connections for a single user kills us.
> 
> We shouldn't design a protocol for the next few decades around a 
> limitation of a particular kernel implementation and how it affects one 
> particular deployment. Why don't we just change the kernel to send it off 
> to the appropriate backend and be done instead of using lots of kernel 
> memory for socket buffers?

Virtually all kernels handle sockets the same way.  Windows is a
notable exception.  WebSocket may or may not take over the world, but
it's still unlikely to matter enough to cause all major OSes to change
their kernels that much :-)

But anyway, it's a limitation of TCP, not particular kernels.

TCP requires, in practice, that both ends be able to buffer roughly a
full TCP window, and that window must be large enough for reasonable
throughput.  The amount needed grows as links get faster: some links
need gigabytes of buffer per socket to reach maximum sustained
throughput!
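To make that concrete: the buffer each end must be able to commit is the bandwidth-delay product of the path. A minimal sketch (the link figures below are illustrative examples, not taken from this thread):

```python
# The bandwidth-delay product sets how much data can be "in flight",
# and hence how much window (and buffer) a TCP endpoint must commit
# to sustain full throughput on a given path.
def bdp_bytes(bandwidth_bits_per_sec, rtt_sec):
    """Bandwidth-delay product: bytes in flight at full utilisation."""
    return int(bandwidth_bits_per_sec * rtt_sec / 8)

# A 1 Gbit/s path with 100 ms RTT needs ~12.5 MB of window per socket:
print(bdp_bytes(1_000_000_000, 0.100))   # 12500000
# A 10 Gbit/s path with 200 ms RTT needs ~250 MB per socket:
print(bdp_bytes(10_000_000_000, 0.200))  # 250000000
```

Multiply by tens of thousands of sockets on a busy frontend and the kernel-memory concern quoted above follows directly.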

No kernel can avoid this, although good kernels reduce the average
memory used (allowing it to be used for caches), without reducing the
committed memory potentially used.

However, let's not pretend that multiplexing easily avoids this
buffering.  While you can dispatch messages as they come in (with a
single TCP window's worth of buffer), a general implementation still
needs a comparable amount of memory to buffer the incoming
*dispatched* messages.

For example, browsers would buffer incoming WebSocket messages in the
DOM event queues, before they are processed by the independent
handlers of different WebSocket objects in different windows, perhaps
running in different processes (Chrome).  Those queues also use
comparable memory.

Servers will have equivalents to those queues, perhaps in pipes to
application processes or threads.

Although you can pick off one message at a time, dispatching and
processing it before picking off the next one, and that *can* use much
less memory, that strategy breaks the independence of flows, so it is
not usable as a "transparent" form of multiplexing multiple processes
over a single socket.
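The two points above can be sketched together. In this illustrative dispatch loop (the frame shape and names are mine, not from any draft), the reader needs only one window's worth of parse buffer, but each stream still needs its own queue for dispatched messages — and making those queues bounded is exactly what destroys flow independence:

```python
import queue

def mux_reader(read_frame, streams):
    """Dispatch frames to per-stream queues until read_frame() returns
    None (EOF). read_frame() yields (stream_id, payload) tuples."""
    while True:
        frame = read_frame()
        if frame is None:
            break
        stream_id, payload = frame
        q = streams.setdefault(stream_id, queue.Queue())
        # Unbounded put: memory grows with any slow consumer -- the
        # "comparable memory" for dispatched messages.  If this were a
        # bounded, blocking put instead, one slow consumer would stall
        # the reader, and with it every other stream: the loss of flow
        # independence described above.
        q.put(payload)

# Demo: three frames on two streams, then EOF.
frames = iter([(1, b"hello"), (2, b"world"), (1, b"again"), None])
streams = {}
mux_reader(lambda: next(frames), streams)
```

The browser DOM event queues and server-side pipes mentioned above play the role of `streams` here.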

Nonetheless, even though transparent multiplexing also uses quite a
bit of memory, it does give applications much more control over
buffering choices, including committing far less memory to each flow
than TCP would on a high bandwidth(-delay) link, while maintaining
efficient throughput on those links.
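One way an application can exercise that control is per-flow, credit-based flow control: the receiver advertises a small fixed budget per flow and tops it up as the application drains its buffer. This is a sketch of the general idea, not anything proposed in this thread; the class and method names are mine:

```python
class Flow:
    """One multiplexed flow with a receiver-chosen credit window,
    decoupled from the (much larger) TCP window of the shared socket."""

    def __init__(self, credit):
        self.credit = credit      # bytes the peer may still send us
        self.buffered = []

    def receive(self, payload):
        """Accept a chunk; a conforming peer never exceeds our credit."""
        if len(payload) > self.credit:
            raise RuntimeError("peer overran its credit window")
        self.credit -= len(payload)
        self.buffered.append(payload)

    def consume(self):
        """Application drains the buffer; the freed bytes would be
        returned to the peer as a window-update control message."""
        drained = b"".join(self.buffered)
        self.buffered.clear()
        self.credit += len(drained)
        return drained
```

The per-flow commitment is `credit` bytes, chosen by the receiver, rather than a full bandwidth-delay product per flow; the shared TCP connection still keeps the link full.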

-- Jamie