Re: [hybi] Flow control quota

"Arman Djusupov" <arman@noemax.com> Thu, 07 June 2012 11:03 UTC

From: Arman Djusupov <arman@noemax.com>
To: 'Jamie Lokier' <jamie@shareable.org>
References: <001a01cd3e69$4a221c10$de665430$@noemax.com> <4FC732DC.3000308@250bpm.com> <000e01cd3f1c$af15ad40$0d4107c0$@noemax.com> <4FC880A7.9070007@250bpm.com> <CAH9hSJaWrUX6gFNLT4xkXLYKHSUH5+Y7AvqN9cD_CwekvsNu3A@mail.gmail.com> <001001cd4000$fe2c82c0$fa858840$@noemax.com> <4FCCAE6B.1010306@250bpm.com> <002d01cd4262$747957b0$5d6c0710$@noemax.com> <20120607022312.GA26406@jl-vm1.vm.bytemark.co.uk>
In-Reply-To: <20120607022312.GA26406@jl-vm1.vm.bytemark.co.uk>
Date: Thu, 07 Jun 2012 14:02:56 +0300
Message-ID: <000e01cd449d$1c0ed220$542c7660$@noemax.com>
Cc: hybi@ietf.org
Subject: Re: [hybi] Flow control quota

We have been trying to find some sort of workaround in order to avoid
enveloping the WebSocket frame within a mux frame. If mux were a separate
layer, then a WebSocket frame would need to be enveloped within a mux frame,
including the WebSocket frame header. This would resolve the unfragmentable
frame issue, but it would also mean that in most cases WebSocket would be
writing two headers per frame. Writing a frame header twice (once for the
mux header and once for the actual frame header) would have a performance
overhead. Note that enveloping the frames would also add considerable
bandwidth overhead when lots of small messages are being enveloped.

However, the performance overhead of mux in such a case has not been tested
yet. We can try to experiment with this design. A mux frame should be able
to carry either fragments of a WebSocket frame or multiple WebSocket frames
that belong to the same channel. If the length of an enveloped frame is less
than the length of the mux frame, then the mux frame contains more than one
enveloped frame. If the number of bytes remaining in a mux frame is less
than the length of the last enveloped frame encountered, then that frame is
fragmented and the remaining bytes are expected to arrive in a subsequent
mux frame of the same channel.
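
As a rough sketch, the receiving side of this rule might look something
like the following (Python; the two-byte length prefix per enveloped frame
and the names are purely an illustration, not from any draft):

    # Split a mux frame payload into complete enveloped WebSocket frames.
    # 'carry' holds the tail of a frame that started in an earlier mux
    # frame of the same channel; leftover bytes are returned as the new
    # carry-over for that channel.
    def split_mux_payload(payload, carry=b""):
        buf = carry + payload
        frames = []
        pos = 0
        while len(buf) - pos >= 2:                    # room for a length prefix
            length = int.from_bytes(buf[pos:pos + 2], "big")
            if len(buf) - pos - 2 < length:           # frame continues in the
                break                                 # next mux frame
            frames.append(buf[pos + 2:pos + 2 + length])
            pos += 2 + length
        return frames, buf[pos:]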

Going in this direction is a big change from the original spec design, so I
wasn't even considering it. But if the group decides that it needs to be
evaluated, I can try working on it.

With best regards,
Arman


-----Original Message-----
From: Jamie Lokier [mailto:jamie@shareable.org] 
Sent: Thursday, June 07, 2012 5:23 AM
To: Arman Djusupov
Cc: 'Martin Sustrik'; hybi@ietf.org
Subject: Re: [hybi] Flow control quota

Hi everyone,

I haven't been following the hybi list for a while (it was too
depressing/exhausting), but I'm really pleased to see deadlock-free,
starvation-free mux is being taken more seriously now.  I guess SPDY's
helped with that.

The problem being described in this thread occurs because the layers aren't
really being kept separate.

If they are properly separated, everything works, and it's easier to
implement and even a bit faster on the network.

I see a lot of confusion about the role and purpose of "fragments" -
particularly the idea that where something produces fragments, those must
literally correspond with the wire protocol, despite there being another
transformative layer before the wire.

Arman Djusupov wrote:
> According to the WebSocket specification extensions are layered and 
> should be applied in a specific order. In this particular case the mux 
> specification provides two ways of applying the per frame compression
> extension: compression can be applied either before or after the mux 
> extension. In the first case the frame being multiplexed is already 
> compressed and should not be fragmented.
                 ^^^^^^^^^^^^^^^^^^^^^^^^

Mux should be able to further fragment - in a way that compression does not
see.

> This allows intermediaries to de-multiplex frames without 
> decompressing them. In the second case the mux frame is compressed 
> along with the mux header, so it cannot be de-multiplexed unless it is 
> first decompressed. Supporting both options is beneficial.
> 
> In any case the problem preventing  unfragmentable frames from being 
> relayed over flow-controlled logical connections should be resolved. 
> The per frame compression is not the only case when it might be 
> impossible to fragment a frame. A mux intermediary should either be 
> able to control the size of the frame that the sending side produces or
> should be able to fragment them.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Imo, that's the only sensible answer.

Anything else leads to a spiral of hacks - see this thread for examples!

Basically the output of compression is a stream of "unfragmentable"
things.  But several things, mux being one, need the ability to break up the
data stream where it's useful - which is the point of fragments.

Deadlock-free mux pretty much _requires_ per-channel flow control and
fragmentation at the mux.  And it should be visible only to the mux.
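
As a minimal sketch of what that means in code (Python; the credit-based
scheme and the names are invented for illustration, not taken from the mux
draft):

    # Per-channel flow control with fragmentation done by the mux itself.
    class MuxChannel:
        def __init__(self, channel_id, send_credit):
            self.channel_id = channel_id
            self.credit = send_credit      # bytes the peer will currently accept
            self.pending = bytearray()     # data queued by the layer above

    # Emit at most one mux fragment, limited by credit and fragment size.
    # A large queued message is simply cut at the limit, so it can never
    # monopolise the connection or starve other channels.
    def next_mux_fragment(channel, max_fragment):
        budget = min(channel.credit, max_fragment, len(channel.pending))
        if budget == 0:
            return None                    # wait for more credit or more data
        fragment = bytes(channel.pending[:budget])
        del channel.pending[:budget]
        channel.credit -= budget
        return (channel.channel_id, fragment)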

Compression (and other layers) need to be _strictly_ separate.  But in these
discussions they've been combined in a fuzzy way, which makes everything
complicated.

I guess it was a misguided attempt to keep the wire concepts simple (fit
everything into "frames" and "fragments"), which actually makes it more
complicated.

It doesn't really matter whether compression is layered above or below, both
work fine, as long as it's a separate layer.

If you treat mux like this:

    1. Zero or more "streams of fragments" (of the next layer up) in.
    2. One "stream of fragments" (of mux protocol) out.

and demux like this:

    1. One "stream of fragments" (of mux protocol) in.
    2. Zero or more "streams of fragments" (of the next layer up) out.

Where "mux protocol" means it's encapsulating channel numbers, flow control
and fragmentation, everything behaves nicely.
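
In code terms the shape is roughly this (Python sketch; the types and the
round-robin policy are invented for illustration):

    from collections import defaultdict

    # Zero or more per-channel fragment streams in, one mux stream out.
    def mux(per_channel_streams):
        out = []
        # Naive round-robin interleave; a real mux would also apply credit.
        while any(per_channel_streams.values()):
            for channel_id, fragments in per_channel_streams.items():
                if fragments:
                    out.append((channel_id, fragments.pop(0)))
        return out

    # One mux stream in, zero or more per-channel fragment streams out.
    def demux(mux_stream):
        channels = defaultdict(list)
        for channel_id, fragment in mux_stream:
            channels[channel_id].append(fragment)
        return channels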

There is no need for oddities like negative flow-control tokens or wasteful
round trips, and no deadlock, starvation, or buffer overflow at
intermediaries due to large fragments or bandwidth limitations.

It just works(tm).  (Well you do have to implement a good mux - but that's
all, it's confined to the mux implementation.)

For compression above the mux, this means:

    - Compression layer outputs a stream of compressed fragments.
    - Mux takes it in, and produces its _own_ stream of mux fragments.
    - Demux takes that, and recombines to make the original fragments.
    - Decompression gets what it needed.

Compression below the mux is similar, but the other way.  See?
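
A tiny self-contained illustration of why this works (Python, with zlib
standing in for whatever compression extension is actually used):

    import zlib

    message = b"hello " * 1000
    compressed_frame = zlib.compress(message)      # compression layer output

    # The mux cuts the compressed frame wherever it likes (here, every 100
    # bytes); compression never sees these boundaries.
    mux_fragments = [compressed_frame[i:i + 100]
                     for i in range(0, len(compressed_frame), 100)]

    # The demux recombines the fragments before handing them back to the
    # decompression layer, which gets exactly what it needed.
    reassembled = b"".join(mux_fragments)
    assert zlib.decompress(reassembled) == message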

In many cases the fragment boundaries of the two layers will coincide, but
it shouldn't be assumed or restricted to that.  If there is a desire to
encode _that case_ efficiently, that's fine, as long as it's just an
encoding.

It breaks the whole point of flow-control and deadlock-avoidance if the
layers have to maintain the same boundaries - as this thread demonstrates.
One large compressed frame/fragment, and all other channels are starved -
nobody will use mux if it's that unreliable!
Certainly not cooperatively/opportunistically.

Or, if you limit the compression size to the whole path's capacity - that's
also a waste (of compression opportunity).  In that case, the attempt to
save a few bytes in the header encoding is counterproductive (but you can
still save those bytes, just make sure it's purely a syntax optimisation).

Oh, one other thing: It's best if compression is free to make the best
compression decisions without blocking _other_ things that need
fragmentation below - namely control frames, which must be sendable and
maybe prioritisable.

As a practical matter, I'm thinking it makes sense for the compression layer
to see things this way:

   1. Receive stream of application frames (WS messages).
         -> Compressor ->
   2. Stream of "frames" (not fragments) out, meaning "non-fragmentable".
         -> Decompressor ->
   3. Emit stream of application frames (WS messages).

The "frames" in 2 above are compression-protocol messages, and do not have
to correspond with application messages.  They are actually the output that
current compression proposals would call "fragments".
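
One way to picture step 2 (a Python sketch only; zlib and the batching
threshold stand in for whatever the real extension actually does):

    import zlib

    # The compressor emits its own "frames" whose boundaries (sync flushes
    # here) need not line up with application message boundaries.
    class BatchingCompressor:
        def __init__(self, flush_threshold=4096):
            self._z = zlib.compressobj()
            self._pending = b""
            self._buffered = 0
            self._threshold = flush_threshold

        def compress_message(self, message):
            # Returns a complete compression-protocol frame, or None while
            # still batching small messages into the current frame.
            self._pending += self._z.compress(message)
            self._buffered += len(message)
            if self._buffered < self._threshold:
                return None
            self._pending += self._z.flush(zlib.Z_SYNC_FLUSH)  # frame boundary
            frame, self._pending = self._pending, b""
            self._buffered = 0
            return frame                  # one frame, possibly several messages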

The only difference, really, is syntax in the pipeline when passed to the
layer below.  So actually this is a very small change.  The compression
method, decisions it makes, etc. are unchanged.

My summary:

Keep the mux and other layers separate, not leaky, and all the muxy things
(flow control etc.) can be implemented in the mux layer alone.

Everything will work, and the implementation will be simpler as well (more
modular, fewer dependencies).

When a layer (such as compression) outputs a stream of things with essential
boundaries, treat it as a stream of frames; don't conflate it with "fragments"
at the wire level - even if it did involve splitting the WS application's
original messages.  Keep the relationship between these frames and
application (or higher layer) frames internal to the compression layer and
protocol (same for other layers).

It is much better to maintain clear semantics: frame boundaries are
immutable because the upper layer requires them, fragment boundaries are
_always_ allowed to be split and merged for optimal transport decisions, and
they only correspond _literally_ to the wire protocol for the lowest layer
in the stack.

As the wire protocol is now defined, it doesn't match up well with the above
semantics, but it's a syntax (header encoding) issue only.

It will even go faster on the network due to freeing up each component to do
the best for its part.  Think about the different combinations of layers
when several fragments/frames are in flight, and also multi-hop paths.  They
all benefit.

All the best,
-- Jamie