Re: [hybi] Flow control quota

"Arman Djusupov" <arman@noemax.com> Fri, 08 June 2012 11:07 UTC

From: Arman Djusupov <arman@noemax.com>
To: 'Greg Wilkins' <gregw@intalio.com>
Date: Fri, 08 Jun 2012 14:06:56 +0300
Cc: hybi@ietf.org
Subject: Re: [hybi] Flow control quota

Ignoring the RSV bits does not resolve the problem of compressed frames
being unfragmentable. Even if the compression flag were carried in the
payload, it would still not allow us to reassemble a compressed frame once
it has been fragmented. Since a compressed frame has its final 4 bytes
stripped off, the receiver needs to know the original frame boundary in
order to re-append those 4 bytes at the proper place before decompressing.
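
For illustration, the receiver side of the usual deflate-based scheme looks
roughly like this in zlib terms (the function name is mine and nothing here
is normative; the stripped 4 octets are the 00 00 ff ff tail of a DEFLATE
sync flush):

    import zlib

    # Raw deflate stream, hence the negative wbits.
    inflater = zlib.decompressobj(wbits=-zlib.MAX_WBITS)

    def inflate_frame(compressed_payload: bytes) -> bytes:
        # The tail must be re-appended exactly at the original frame
        # boundary, which is why that boundary has to survive transport.
        return inflater.decompress(compressed_payload + b"\x00\x00\xff\xff")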

We could instead perform a compression flush at the message boundary rather
than on each frame. In that case the final frame of the message would mark
the end of the self-contained compressed chunk, and every frame would be
fragmentable. This is actually more in tune with the WS spec, where an
individual frame is not required to contain interpretable data. But as far
as the mux extension is concerned this would still be just a workaround,
since there might be other frame-level extensions that require the frame to
be preserved.
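
A rough sending-side sketch of that approach, assuming zlib and purely
illustrative names (compress_message, max_fragment), would be:

    import zlib

    deflater = zlib.compressobj(wbits=-zlib.MAX_WBITS)

    def compress_message(message: bytes, max_fragment: int):
        # Yield (payload, fin) pairs: flush only once, at the message
        # boundary, so non-final fragments can be split further without
        # breaking the compressed stream.
        for offset in range(0, len(message), max_fragment):
            chunk = message[offset:offset + max_fragment]
            fin = offset + max_fragment >= len(message)
            payload = deflater.compress(chunk)   # may be empty: zlib buffers
            if fin:
                payload += deflater.flush(zlib.Z_SYNC_FLUSH)
                payload = payload[:-4]           # drop the 00 00 ff ff tail
            yield payload, fin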

At the same time, Jamie's solution of allowing mux to envelop and fragment
all frames on the wire also pushes the problem to the next layer. When
compression is used with mux, the receiving side has to buffer and assemble
the fragments of a frame before decompressing it. So on one side the mux
flow control algorithm would consider the data as received and would send
quota to the remote side, while the frame being received would have to be
buffered with its final chunk still on the wire. An intermediary or server
performing decompression would therefore have to be smart enough to
decompress and forward whatever it can, buffering only the final part of
the frame that it cannot yet decompress.
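
In zlib terms such a relaying intermediary could look roughly like this
(channel.replenish_quota and channel.forward_downstream are hypothetical
hooks, not anything from the mux draft):

    import zlib

    inflater = zlib.decompressobj(wbits=-zlib.MAX_WBITS)

    def relay_fragment(channel, payload: bytes, is_final: bool) -> None:
        # Flow control counts the bytes as consumed here...
        channel.replenish_quota(len(payload))
        if is_final:
            payload += b"\x00\x00\xff\xff"   # restore the stripped tail
        # ...while zlib keeps whatever it cannot decode yet inside its own
        # state, so only the not-yet-decodable remainder stays buffered.
        channel.forward_downstream(inflater.decompress(payload))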

Since we don't have better ideas yet, I currently see two options:

1. Permit the sender to exceed the quota when the frame that has to be sent
is not fragmentable, letting the quota go negative (a rough sketch of this
follows below the list). The sender then has to wait for the quota to
become positive again before it can send the next frame. The receiver has
to make sure it replenishes the sender's quota to a normal flow control
window value once it detects that the sender has sent a frame above the
quota. Under normal circumstances the size of the frame sent above the
quota would itself be normal (i.e. more or less equal to the size of the
frames sent previously). If the receiver detects abuse of this allowance it
may discard all frames and drop the logical channel.

2. Permit the mux extension to fragment frames of any type and to
reassemble those frames into their original form on the receiving side.
This is probably the most flexible solution, provided we can find an
optimal format for writing those enveloped frames.
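
As a rough sketch of the sender-side bookkeeping option 1 implies (class
and method names are illustrative, not taken from the mux draft):

    class SendChannel:
        def __init__(self, initial_quota: int):
            self.quota = initial_quota

        def may_send(self, frame_len: int, fragmentable: bool) -> bool:
            if fragmentable:
                return self.quota >= frame_len
            return self.quota > 0        # unfragmentable: may overshoot once

        def record_sent(self, frame_len: int) -> None:
            self.quota -= frame_len      # may go negative

        def add_quota(self, delta: int) -> None:
            self.quota += delta          # receiver tops the window back up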

A solution for the 2nd option:
We need to encapsulate frame fragments into mux messages. In this case all
mux frames would have their FIN bit set to 1 and opCode set to 2 (binary),
so the payload of a mux frame passes through intermediaries that do not
support mux without being interpreted. Mux frames would retain the current
format and would carry the channelID in their extension data. In addition,
each mux message that corresponds to the first fragment of the original
frame would include the frame length, opCode, FIN flag and RSV bits of the
original frame; subsequent mux messages would include only the payload of
the fragment.

This format would introduce an additional overhead of 16 to 72 bits per
original frame.
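
The exact envelope encoding is left open here, so the following only
sketches the demux-side reassembly it implies; all names are illustrative:

    class FrameReassembler:
        def __init__(self):
            self.pending = {}   # channel_id -> [metadata, chunks, missing]

        def first_fragment(self, channel_id, metadata, payload):
            # metadata carries the original frame's FIN/RSV/opCode/length.
            missing = metadata["length"] - len(payload)
            if missing == 0:
                return metadata, payload             # frame arrived whole
            self.pending[channel_id] = [metadata, [payload], missing]
            return None

        def continuation(self, channel_id, payload):
            entry = self.pending[channel_id]
            entry[1].append(payload)
            entry[2] -= len(payload)
            if entry[2] > 0:
                return None                          # still incomplete
            del self.pending[channel_id]
            return entry[0], b"".join(entry[1])      # original frame restored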

These are just some options; I would be happy to read more ideas.

With best regards,
Arman

-----Original Message-----
From: Greg Wilkins [mailto:gregw@intalio.com] 
Sent: Thursday, June 07, 2012 3:41 PM
To: Arman Djusupov
Cc: Jamie Lokier; hybi@ietf.org
Subject: Re: [hybi] Flow control quota

I think we have made a rod for our own back with the RSV bits.

Because they are in the frame header, they make extensions very vulnerable
to fragmentation.

For example, if the compressed bit were in the payload rather than the frame
header, then it could be fragmented by extensions below the compression
extension without consequences - so long as it was reassembled in the way
that Jamie describes.
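
For illustration only (a hypothetical one-octet flags field, not anything
any draft defines), payload-carried signalling could be as simple as:

    # Intermediaries that re-fragment the data never need to understand or
    # preserve an RSV bit; only the compression layer reads the flags octet.
    COMPRESSED = 0x01

    def wrap(payload: bytes, compressed: bool) -> bytes:
        return bytes([COMPRESSED if compressed else 0]) + payload

    def unwrap(data: bytes):
        return bool(data[0] & COMPRESSED), data[1:]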

The moment an extension uses an RSV bit, the frame cannot be fragmented by
anything that does not understand the extension - and even that causes
problems, because fragmenting a compressed frame is not just a matter of
setting the RSV bit on all the fragments: the data needs to be split into
self-contained compressed chunks if the spirit of the framing is to be
respected.

Maybe we would do best to leave the RSV bits dark for now and do all
extension signalling in the payload?

regards





On 7 June 2012 13:02, Arman Djusupov <arman@noemax.com> wrote:
> We have been trying to find some sort of workaround in order to avoid 
> enveloping the WebSocket frame within a mux frame. If mux was a 
> separate layer then a WebSocket frame would need to be enveloped 
> within a mux frame including the WebSocket frame header. This would 
> resolve the unfragmentable frame issue. But this would also mean that 
> in most cases WebSocket would be writing two headers per frame. 
> Writing a frame header twice (once for mux and once for the actual frame
> header) would have a performance overhead. Note that enveloping the
> frame would produce considerable bandwidth usage when lots of small
> messages are being enveloped.
>
> However the performance overhead of mux in such a case is not tested 
> yet. We can try to experiment with this design. A mux frame should be 
> able to include fragments or multiple WebSocket frames that belong to 
> the same channel. If the length of an enveloped frame is less than the
> length of the mux frame, then the mux frame contains more than one
> enveloped frame. If the number of bytes remaining in a mux frame is
> less than the length of the last enveloped frame encountered, then that
> frame is fragmented and the remaining bytes are expected to arrive in a
> subsequent mux frame of the same channel.
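
To illustrate the parsing rule described above with a rough sketch (the
2-byte length prefix and the helper name are only assumptions; no envelope
encoding has been defined):

    import struct

    def split_mux_payload(data: bytes, carry: bytearray):
        # Return the complete enveloped frames found in one mux frame's
        # payload; a trailing partial frame is left in `carry` and is
        # completed by later mux frames of the same channel.
        buf = bytes(carry) + data
        carry.clear()
        frames, pos = [], 0
        while len(buf) - pos >= 2:
            (length,) = struct.unpack_from("!H", buf, pos)
            if len(buf) - pos - 2 < length:
                break                        # enveloped frame is fragmented
            frames.append(buf[pos + 2:pos + 2 + length])
            pos += 2 + length
        carry.extend(buf[pos:])              # wait for the next mux frame
        return frames
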
>
> Going towards this direction is a big change from the original spec 
> design, so I wasn't even considering it. But if the group decides that 
> it needs to be evaluated, I can try working on this direction.
>
> With best regards,
> Arman
>
>
> -----Original Message-----
> From: Jamie Lokier [mailto:jamie@shareable.org]
> Sent: Thursday, June 07, 2012 5:23 AM
> To: Arman Djusupov
> Cc: 'Martin Sustrik'; hybi@ietf.org
> Subject: Re: [hybi] Flow control quota
>
> Hi everyone,
>
> I haven't been following the hybi list for a while (it was too 
> depressing/exhausting), but I'm really pleased to see deadlock-free, 
> starvation-free mux is being taken more seriously now.  I guess SPDY's 
> helped with that.
>
> The problem being described in this thread occurs because the layers 
> aren't really being kept separate.
>
> If they are properly separated, everything works, and it's easier to 
> implement and even a bit faster on the network.
>
> I see a lot of confusion about the role and purpose of "fragments".
> Particularly the idea that where something produces fragments, those 
> must literally correspond with the wire protocol, despite having 
> another transformative layer before the wire.
>
> Arman Djusupov wrote:
>> According to the WebSocket specification extensions are layered and 
>> should be applied in a specific order. In this particular case the 
>> mux specification provides two ways of applying the per frame 
>> compression
>> extension: compression can be applied either before or after the mux 
>> extension. In the first case the frame being multiplexed is already 
>> compressed and should not be fragmented.
>                 ^^^^^^^^^^^^^^^^^^^^^^^^
>
> Mux should be able to further fragment - in a way that compression 
> does not see.
>
>> This allows intermediaries to de-multiplex frames without 
>> decompressing them. In the second case the mux frame is compressed 
>> along with the mux header, so it cannot be de-multiplexed unless it 
>> is first decompressed. Supporting both options is beneficial.
>>
>> In any case the problem preventing  unfragmentable frames from being 
>> relayed over flow-controlled logical connections should be resolved.
>> The per frame compression is not the only case when it might be 
>> impossible to fragment a frame. A mux intermediary should either be 
>> able to control the size of the frame that the sending side produces 
>> or
> should be able to fragment them.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Imo, that's the only sensible answer.
>
> Anything else leads to a spiral of hacks - see this thread for examples!
>
> Basically the output of compression is a stream of "unfragmentable"
> things.  But several things, mux being one, need the ability to break
> up the data stream where it's useful - which is the point of fragments.
>
> Deadlock-free mux pretty much _requires_ per-channel flow control and 
> fragmentation at the mux.  And it should be visible only to the mux.
>
> Compression (and other layers) need to be _strictly_ separate.  But in 
> these discussions they've been combined in a fuzzy way, which makes 
> everything complicated.
>
> I guess it was a misguided attempt to keep the wire concepts simple 
> (fit everything into "frames" and "fragments"), which actually makes 
> it more complicated.
>
> It doesn't really matter whether compression is layered above or 
> below, both work fine, as long as it's a separate layer.
>
> If you treat mux like this:
>
>    1. Zero or more "stream of fragments" (of next layer up) in.
>    2. One "stream of fragments" (of mux protocol) out.
>
> and demux like this:
>
>    1. One "stream of fragments" (of mux protocol) in.
>    2. Zero or more "stream of fragments" (of next layer up) down.
>
> Where "mux protocol" means it's encapsulating channel numbers, flow 
> control and fragmentation, everything behaves nicely.
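
As a rough sketch of the interface described above (names and the fixed
window size are purely illustrative assumptions; real flow control would be
dynamic):

    from typing import Iterator, Tuple

    def mux_channel(channel_id: int,
                    fragments: Iterator[Tuple[bytes, bool]],
                    window: int) -> Iterator[Tuple[int, bytes, bool]]:
        # Per-channel (payload, fin) fragments in; mux-protocol fragments
        # out, each tagged with its channel and cut to the window size.
        for payload, fin in fragments:
            while len(payload) > window:
                yield channel_id, payload[:window], False   # mux's own fragmentation
                payload = payload[window:]
            yield channel_id, payload, fin
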
>
> There is no need for oddities like negative flow control tokens or
> wasteful round trips, and no deadlock, starvation, or buffer overflow
> at intermediaries due to large fragments or bandwidth limitations.
>
> It just works(tm).  (Well you do have to implement a good mux - but 
> that's all, it's confined to the mux implementation.)
>
> For compression above the mux, this means:
>
>    - Compression layer outputs a stream of compressed fragments.
>    - Mux takes it in, and produces its _own_ stream of mux fragments.
>    - Demux takes that, and recombines to make the original fragments.
>    - Decompression gets what it needed.
>
> Compression below the mux is similar, but the other way.  See?
>
> In many cases the fragment boundaries of the two layers will
> coincide, but it shouldn't be assumed or restricted to that.  If there
> is a desire to encode _that case_ efficiently, that's fine, as long as
> it's just an encoding.
>
> It breaks the whole point of flow-control and deadlock-avoidance if
> the layers have to maintain the same boundaries - as this thread
> demonstrates. One large compressed frame/fragment, and all other
> channels are starved - nobody will use mux if it's that unreliable!
> Certainly not cooperatively/opportunistically.
>
> Or, if you limit the compression size to the whole path's capacity -
> that's also a waste (of compression opportunity).  In that case, the
> attempt to save a few bytes in the header encoding is
> counterproductive (but you can still save those bytes, just make sure
> it's purely a syntax optimisation).
>
> Oh, one other thing: It's best if compression is free to make the best 
> compression decisions without blocking _other_ things that need 
> fragmentation below - namely control frames, that must be sendable, 
> and maybe prioritisable.
>
> As a practical matter, I'm thinking it makes sense for the compression 
> layer to see things this way:
>
>   1. Receive stream of application frames (WS messages).
>         -> Compressor ->
>   2. Stream of "frames" (not fragments) out, meaning "non-fragmentable".
>         -> Decompressor ->
>   3. Emit stream of application frames (WS messages).
>
> The "frames" in 2 above are compression-protocol messages, and do not 
> have to correspond with application messages.  They are actually the 
> output that current compression proposals would call "fragments".
>
> The only difference, really, is syntax in the pipeline when passed to 
> the layer below.  So actually this is a very small change.  The 
> compression method, decisions it makes, etc. are unchanged.
>
> My summary:
>
> Keep the mux and other layers separate, not leaky, and all the muxy 
> things (flow control etc.) can be implemented in the mux layer alone.
>
> Everything will work, and the implementation will be simpler as well 
> (more modular, less dependencies).
>
> When a layer (such as compression) outputs a stream of things with 
> essential boundaries, treat it as a stream of frames; don't conflate
> it with "fragments"
> at the wire level - even if it did involve splitting the WS 
> application's original messages.  Keep the relationship between these 
> frames and application (or higher layer) frames internal to the 
> compression layer and protocol (same for other layers).
>
> It is much better to maintain clear semantics: frame boundaries are 
> immutable because the upper layer requires them, fragment boundaries 
> are _always_ allowed to be split and merged for optimal transport 
> decisions, and they only correspond _literally_ to the wire protocol 
> for the lowest layer in the stack.
>
> As the wire protocol is now defined, it doesn't match up well with the 
> above semantics, but it's a syntax (header encoding) issue only.
>
> It will even go faster on the network due to freeing up each component 
> to do the best for its part.  Think about the different combinations 
> of layers when several fragments/frames are in flight, and also 
> multi-hop paths.  They all benefit.
>
> All the best,
> -- Jamie
>
> _______________________________________________
> hybi mailing list
> hybi@ietf.org
> https://www.ietf.org/mailman/listinfo/hybi



--
Greg Wilkins <gregw@intalio.com>
www.webtide.com
Developer advice, services and support
from the Jetty & CometD experts.