Re: [hybi] Multiplexing: Pre-AddChannelResponse quota

"Arman Djusupov" <arman@noemax.com> Thu, 14 June 2012 08:55 UTC

Return-Path: <arman@noemax.com>
X-Original-To: hybi@ietfa.amsl.com
Delivered-To: hybi@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2795221F8697 for <hybi@ietfa.amsl.com>; Thu, 14 Jun 2012 01:55:22 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level:
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[AWL=0.000, BAYES_00=-2.599]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VU5dEofE9B7x for <hybi@ietfa.amsl.com>; Thu, 14 Jun 2012 01:55:20 -0700 (PDT)
Received: from mail.noemax.com (mail.noemax.com [64.34.201.8]) by ietfa.amsl.com (Postfix) with ESMTP id 9F59521F8698 for <hybi@ietf.org>; Thu, 14 Jun 2012 01:55:19 -0700 (PDT)
DKIM-Signature: a=rsa-sha1; t=1339664110; x=1340268910; s=m1024; d=noemax.com; c=relaxed/relaxed; v=1; bh=uEO+OsnTUGQZl9i9O4ilD8n68xo=; h=From:Subject:Date:Message-ID:To:Cc:MIME-Version:Content-Type:Content-Transfer-Encoding:In-Reply-To:References; b=XWELGv+qSyvJOisn/5EGKjD4ibeAyCtdRqdgMKu85gaR9aa1MMmnc5AT+yrm2QFuvkOb9g9+0CkQ0cNfWZHHJY9yLpuTeFq3Je/NNHUWWWxzLKC3w7Tqw7eT/x9o+zyGnFsf5wJtoZKT2kdzlfEIw7e2XOLJL7UYv+pJwnjPyqA=
Received: from mail.noemax.com by mail.noemax.com (IceWarp 10.4.1) with ASMTP (SSL) id YOR39809; Thu, 14 Jun 2012 11:55:09 +0300
From: Arman Djusupov <arman@noemax.com>
To: 'Jamie Lokier' <jamie@shareable.org>, 'Takeshi Yoshino' <tyoshino@google.com>
References: <CAH9hSJZUAHQzDm4ofq6onc620SNretLQDOcjSnr2eQ0YA9yFdQ@mail.gmail.com> <002f01cd3cc4$4791b380$d6b51a80$@noemax.com> <20120607034440.GD26406@jl-vm1.vm.bytemark.co.uk> <CAH9hSJZWMgvQMNLapZAg_CS0vri=jZbfPLpLhfninjzG+JxxmA@mail.gmail.com> <20120613175433.GC5812@jl-vm1.vm.bytemark.co.uk>
In-Reply-To: <20120613175433.GC5812@jl-vm1.vm.bytemark.co.uk>
Date: Thu, 14 Jun 2012 11:54:57 +0300
Message-ID: <000701cd4a0b$640f1c60$2c2d5520$@noemax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Outlook 14.0
Thread-Index: AQHbEEjWH8q2xL+U7IgUz4ieagM0MAIpCHL8APvLbp8B3rSIHAH8ws5ulqXzKqA=
Content-Language: en-us
Cc: hybi@ietf.org
Subject: Re: [hybi] Multiplexing: Pre-AddChannelResponse quota
X-BeenThere: hybi@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Server-Initiated HTTP <hybi.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/hybi>, <mailto:hybi-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/hybi>
List-Post: <mailto:hybi@ietf.org>
List-Help: <mailto:hybi-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/hybi>, <mailto:hybi-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 14 Jun 2012 08:55:22 -0000

We should not overestimate the difference between the pessimistic and
optimistic scenarios. The major difference is in the amount of quota
they grant to the remote side. In the pessimistic case the receiver
grants a small quota, sized so that it can buffer all input even in a
worst-case scenario, while in the optimistic case the receiver grants a
larger quota based on the rate at which it has been able to process
incoming frames so far, without relying entirely on its buffering
capacity. Both strategies might work fine and both might end in
failure. In the worst-case scenario, when the receiver runs out of
buffers, both must be ready to fall back to TCP flow control or to
discard data.
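
To illustrate the distinction, here is a minimal sketch of how a
receiver might size its grants under each strategy. It is only an
illustration; the function names and numbers are invented and are not
taken from the mux draft.

    # Hypothetical receiver-side quota sizing; names and numbers are
    # invented for illustration.

    def pessimistic_grant(free_buffer_bytes, open_channels):
        # Grant only what can be buffered even if every channel uses its
        # full quota at once (the worst case).
        if open_channels == 0:
            return 0
        return free_buffer_bytes // open_channels

    def optimistic_grant(bytes_consumed, interval_seconds, rtt_seconds):
        # Grant based on the observed processing rate, aiming to keep the
        # pipe full for about one round trip, without relying entirely on
        # buffering capacity.
        rate = bytes_consumed / interval_seconds
        return int(rate * rtt_seconds)

    print(pessimistic_grant(free_buffer_bytes=64 * 1024, open_channels=8))  # 8192
    print(optimistic_grant(bytes_consumed=200_000,
                           interval_seconds=1.0, rtt_seconds=0.2))          # 40000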

If the mux client can send data up to the pre-handshake quota after the
AddChannelRequest without waiting for an AddChannelResponse, this
already minimizes the round-trip delay in the case where the data to be
sent is not larger than the pre-handshake quota. If the data is larger
than the pre-handshake quota, then a delay may occur (it depends on how
quickly the server sends the AddChannelResponse and the first FC
frame). The same applies to an HTTP server that does not use any mux:
the client cannot send a complete request if the server will not
process or buffer it right away. Servers usually buffer inbound
requests up to some limit and then wait for a worker thread to pick up
the request and process it; only then can the client send the remaining
data. In the case of HTTP such a quota is enforced by TCP flow control,
due to the server's refusal to buffer the complete request. HTTP also
supports the Expect: 100-continue header and the 100 Continue response,
which practically constitute flow control.
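
As a rough illustration of that send-up-to-quota behaviour, here is a
minimal client-side sketch. The quota constant, frame names and helper
functions below are invented for illustration and are not the draft's
wire format.

    # Hypothetical client-side send loop; PRE_HANDSHAKE_QUOTA and the
    # send/grant helpers are invented for illustration.
    from collections import deque

    PRE_HANDSHAKE_QUOTA = 4096  # bytes allowed before AddChannelResponse

    def open_channel_and_send(send_frame, wait_for_grant, data):
        """send_frame(kind, payload) transmits one frame;
        wait_for_grant() blocks until the server grants more bytes."""
        send_frame("AddChannelRequest", b"")
        quota = PRE_HANDSHAKE_QUOTA
        pending = deque([data])
        while pending:
            if quota == 0:
                # Data exceeds the pre-handshake quota: this is where the
                # round-trip delay can appear, just as with an HTTP server
                # that refuses to buffer the whole request.
                quota = wait_for_grant()
            chunk = pending.popleft()
            sendable, rest = chunk[:quota], chunk[quota:]
            send_frame("Data", sendable)
            quota -= len(sendable)
            if rest:
                pending.appendleft(rest)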

In other words, it is fine to impose pre-handshake quota limits. When
this causes delays, that is natural; such delays would occur with mux
or without it.

With best regards,
Arman

-----Original Message-----
From: Jamie Lokier [mailto:jamie@shareable.org] 
Sent: Wednesday, June 13, 2012 8:55 PM
To: Takeshi Yoshino
Cc: Arman Djusupov; hybi@ietf.org
Subject: Re: [hybi] Multiplexing: Pre-AddChannelResponse quota

Takeshi Yoshino wrote:
>    On Thu, Jun 7, 2012 at 12:44 PM, Jamie Lokier <[1]jamie@shareable.org>
>    wrote:
> 
>    Arman Djusupov wrote:
>    >
>    >    Resending would be required when the allocation of a logical
>    >    channel has failed on the server or the intermediary due to
>    >    any kind of mux related error. For example: Maximum number
>    >    of logical channels per physical connection has been reached
>    >    or Intermediary cannot reach the destination endpoint. The
>    >    current draft requires that if an AddChannelResponse is
>    >    received with a failed flag, the client attempts to open a
>    >    new physical connection. Since the browser must keep the mux
>    >    extension transparent it cannot let the JS application
>    >    handle the recovery from mux related errors.
> 
>      If there is a max_simultaneous_handshakes - why not recast that as
>      "initial new-channels window" - meaning a flow control grant,
>      which is updated by further grants from the server.  (Very
>      similar to the flow control per channel).
> 
>    Good idea. I'll take it.

Excellent :-)

The idea isn't entirely random; it's prompted by some general principles of
flow control which apply in a lot of situations; see my "some thoughts"
further down.

>      Specifying the amount used in the request would seem to
>      enforce that delay, which is unnecessary in principle.

The original paragraph said:

>> Anyway, the client shouldn't have to pause/delay sending data after 
>> AddChannelRequest, until the response, if there is sufficient quota 
>> to keep sending.  Specifying the amount used in the request would 
>> seem to enforce that delay, which is unnecessary in principle.

Consider this scenario:

   This is WebSocket, so it's quite likely client and server
   application processes are generating a continuous flow of messages
   in each direction, with a wide variety of timings.  In all cases we
   would like to minimise latency, and with mux we also want to avoid
   adding latency that would not be there without it.

   Client process (C) generates initial data stream.

   The first bytes available from the client process are included in
   the initial channel request.

   Server process (S) can begin processing those initial bytes, and
   generates a stream of response bytes.  Those will follow
   immediately after the channel response.

   Client process (C) is still generating data 0.1 second later, 0.2, 0.3...

   But round-trip time is 2 seconds.

If the client is allowed to send initial data with the AddChannelRequest,
but then not allowed to send more data until it receives the
AddChannelResponse, there is a pause in the flow: data produced at 0.1,
0.2, 0.3 seconds cannot be sent until 2 seconds have passed, cannot be
processed by the server, etc.

This is the unnecessary delay.  The client should be able to send more
data, up to the presumed initial-channel data window, in packets after
the AddChannelRequest.  This also means the client must be able to
choose the channel identifier (for sending, at least).

If those are little request messages over WebSocket, then it will be
*4* seconds before they get responses, due to doubling the round trip.
(On mobile networks, the delays are often larger.)
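
Putting rough numbers on that (purely illustrative arithmetic, assuming
the 2 second round trip above; nothing here is from the draft):

    # When does the reply to each message reach the client?
    rtt = 2.0                          # round-trip time, seconds
    produced_at = [0.1, 0.2, 0.3]      # data generated after AddChannelRequest

    # Blocked: nothing more may be sent until the AddChannelResponse at
    # t = rtt, so each reply arrives one further round trip later.
    blocked = [max(t, rtt) + rtt for t in produced_at]

    # Windowed: the client keeps sending within the presumed initial
    # window, so each reply arrives one round trip after production.
    windowed = [t + rtt for t in produced_at]

    for t, b, w in zip(produced_at, blocked, windowed):
        print(f"produced {t:.1f}s: reply {b:.1f}s blocked vs {w:.1f}s windowed")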

This additional overhead is not present without the Mux protocol, or for
applications doing their own app-mux instead.  So the Mux adds overhead
on top of non-Mux, and may even accumulate multiple such delays per
Mux/Demux hop (multiple proxies), which is quite bad.

Since the delay is unnecessary, the protocol shouldn't impose it.  There
is no practical complexity for implementations in allowing data after
the request as opposed to just with the request itself.

The protocol is possibly simpler that way: the request doesn't need to
handle a data payload.  All initial data can follow in its own messages.


Some thoughts on the subject of flow control and channel setup:


   - Receive without blocking other flows (channels) is a semantic
     requirement, not just a nicety.  Otherwise the flows don't have
     the same behaviour as separate TCP connections.

     Applications (including event-driven scripts) that use separate
     connections, expect them to behave quite independently, meaning
     data on one can't permanently block the flow of data on another.

     If Mux connections aren't that independent, supposedly
     independent operations can get in a tangle waiting on messages
     from each other.  (Even in a browser app, with just one server).
     (I wrote out more detail but it went on a bit long.)

     It is horrible to debug when that happens; it's one of those
     "nearly always doesn't happen, spend hours trying to understand
     why the app is broken" things, and we should never inflict it on
     anyone.  It's just a bad model, and unnecessary.

     But it's worse if Mux is a hidden optimisation, in the browser
     and/or intermediaries, as it changes the semantics.  Going from
     no-deadlock to deadlock possible is a change of semantics.

     That is the most important reason for pessimistic flow control:
     It is necessary for Mux to have the same semantics as non-Mux,
     before it can be safely deployed as a hidden transport
     optimisation.

     (Same applies to mux in SPDY, HTTP/2 etc.)

     (The various proposed standards could do with more explanation of
     why channel flow control is so important, and why it must use the
     patterns it uses to get independent flow semantics.)

   - The basic method is to never block TCP at the receiver while you
     have more than 1 flow muxed over it.  Blocking TCP is fine when
     you have only 1 flow.  But new channel setup requests are,
     logically, another flow, and so are control messages (such as
     clean shutdown).  So the 1 flow case doesn't happen, except
     during initial handshake.

   Assuming all that:

   - Pessimistic flow control, aka. grants or promises to receive
     without blocking other channels.  This keeps the complexity low
     at the sender.  The receiver is responsible for allocating its
     memory and managing fairness.  The principle applies just as well
     to connection setup counts, and initial channel data, as to
     regular channel data.
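     (A minimal sketch of this grant-based approach, together with the
     new-channels window idea below, follows after these notes.)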

   - Optimistic flow control (commands to allow sending more than the
     receiver can accept), mean the sender may need to abort and
     re-send (or flows will become deadlocked).  These are more
     complex for everyone, but in some circumstances provide
     on-average less latency (and sometimes the other way).

     But think about it: TCP doesn't do this, and it's fine without it.
     (It uses an exponentially rising method of handing out more
     window as long as there is no congestion.)

     So is this ever worth the complication?  I don't know, but I
     would suggest anything like this be explored as an experimental
     extension, on top of the basic pessimistic methods.

   - "Xon/xoff" start-stop style flow control is a disaster: the
     receiver cannot guarantee to store all that it receives before
     the stop token reaches the sender, so TCP will block the flow
     instead and there's no way forward until the receiver gains more
     memory.  Which it won't if it's deadlocked.  This is a form of
     optimistic flow control.

   - Start-stop and optimistic need pessimistic flow control anyway!!

     Otherwise, if TCP is blocked due to over-sending what the
     receiver cannot buffer, there's no way "stop" or "re-send later"
     messages can get through in that direction.  Think about it.

   - Summing up: You *have* to have pessimistic flow control tokens at
     some stage.  You can build other things on top, but there is no
     avoiding this if you want the protocol to be unable to get stuck.
     (I'm fairly confident of this but would love to be proven wrong.)

   - Whoever is initiating a new channel should select the channel id
     for binding data blocks together, and be allowed to send more
     data immediately after the initiating request (subject to normal
     flow control).

     This effectively makes the channel exist immediately, so the
     initiator can send data as it generates it.

     Otherwise, the initiator can send an initial "pulse" of data,
     followed by a round-trip delay until it can send more, which may
     not match the pattern of data generation, and causes mux to add
     latency, sometimes accumulating per-hop, in places where it isn't
     necessary.

     Example: Without the delay, it is possible to send HTTP-style
     requests and responses, creating a new channel for each request,
     as efficiently as pipelining with independent response orders, with
     all the benefits of flow control etc.  With the round-trip delay,
     this is not an efficient pattern, and the application will likely
     devise its own request muxing protocol over a single channel,
     sometimes accidentally reproducing some of the deadlock/flow
     control scenarios discussed above.

   - When assigning new channels, it's useful to have a data window
     known in advance (or negotiated) for new channels.  It could be
     shared among all new channels or per-channel immediately; does
     not matter.  Although a protocol where channels are setup first
     seems simpler, it does not actually save any memory at the
     receiver (which needs to have dedicated memory per-channel for
     the control structures anyway), nor make it any simpler.

   - Finally, whether to use pessimistic flow control for channel
     requests, or allow them to be rejected for resource reasons and
     tell the sender to try again later, or on another TCP connection.

     The discussion above about pessimistic vs. optimistic applies to
     these as well.  Just like data, channels use memory and their
     setup has similar flow control issues.  So the same flow control
     mechanisms can apply to channel setup allowances.  That's the
     "initial new-channels window" idea.

     However, here there's a stronger case for at least evaluating
     optimistic with retry.  Think about the way a web server
     regulates the load from new TCP connections, with an unknown
     number of clients: It simply ignores new TCP SYN packets, when
     there are too many connections.  Clients retry, and back off
     exponentially, thus regulating the load over a considerable range
     (it's not perfect).  This works because clients back off
     exponentially, and servers use the minimum of resources for
     denied requests (i.e. ignore them).

     With pessimistic flow control of new channels, an initiator will
     wait until it gets a message saying it can use another one.
     While everything is running smoothly, that's a good balancing of
     resources.

     But if there's a load spike, and at the same time TCP stalls at a
     series of retransmits due to congestion, after the load spike the
     message saying it's ok to use more channels may be delayed for a
     long time, such as 30s, because of the TCP stall waiting for the
     next retransmit.

     In that situation, a client may find that making a new TCP connection
     and starting Mux over it gets a new channel more quickly.  I
     don't know if, in that situation, it is better to allow
     optimistic channel setup requests on the existing TCP connection
     instead of a new TCP, or if there is no point because a stalled
     TCP would delay the request anyway.

     As with data flow control, I would suggest making the optimistic
     variation (setup requests with refusal and retry) an experimental
     extension, and the pessimistic variation the basic standard.
     Experiments may show if the optimistic extension is net positive
     or negative at high loads; it's not obvious to me which.
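
To make the grant-based ("pessimistic") flow control and the "initial
new-channels window" ideas above concrete, here is a minimal
sender-side sketch. It only illustrates the principle; the class,
method names and numbers are invented and are not taken from the mux
draft.

    # Hypothetical sender-side bookkeeping for grant-based flow control.
    # Nothing is sent unless the receiver has already promised to accept
    # it, so one stalled channel can never block the others at TCP level.

    class MuxSender:
        def __init__(self, initial_channel_window, new_channels_window):
            # Bytes the receiver promises to accept on each new channel
            # before any per-channel grant arrives.
            self.initial_channel_window = initial_channel_window
            # How many channels may be opened before a further grant
            # (the "initial new-channels window").
            self.new_channels_window = new_channels_window
            self.channel_quota = {}    # channel id -> bytes still allowed
            self.next_channel_id = 1

        def open_channel(self):
            if self.new_channels_window == 0:
                return None            # must wait for a new-channels grant
            self.new_channels_window -= 1
            cid = self.next_channel_id
            self.next_channel_id += 2  # the initiator picks its own ids
            self.channel_quota[cid] = self.initial_channel_window
            return cid

        def sendable(self, cid, nbytes):
            # How much may go out now without exceeding the promise.
            return min(nbytes, self.channel_quota.get(cid, 0))

        def on_channel_grant(self, cid, nbytes):
            self.channel_quota[cid] = self.channel_quota.get(cid, 0) + nbytes

        def on_new_channels_grant(self, count):
            self.new_channels_window += count

    s = MuxSender(initial_channel_window=4096, new_channels_window=2)
    a = s.open_channel()            # allowed immediately
    b = s.open_channel()            # allowed immediately
    c = s.open_channel()            # None: window exhausted, wait for grant
    print(a, b, c, s.sendable(a, 10_000))   # 1 3 None 4096

The key property is that the sender never transmits anything the
receiver has not already promised to accept, which is what keeps the
channels independent of each other at the TCP level.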

All the best,
-- Jamie