Re: HTTP/2 flow control <draft-ietf-httpbis-http2-17>

Roberto Peon <> Thu, 12 March 2015 01:54 UTC

From: Roberto Peon <>
To: Greg Wilkins <>
Cc: Bob Briscoe <>, HTTP Working Group <>

On Wed, Mar 11, 2015 at 6:42 PM, Greg Wilkins <> wrote:

> Roberto,
> nice summation.
> Note that #3 is almost an OK option in many situations.  When sending
> multiple streams S->C, each stream consumer is likely to progress at the
> same rate (just moving stream into memory) or at least faster than the
> network (when moving stream to disk). In such situations, allowing the
> multiplexed streams to flow at the maximum allowed by TCP flow control would
> not directly introduce any HOL blocking issues.  So long as large streams do
> not prevent small streams progressing, the server can send interleaved
> frames at a rate that will saturate the available bandwidth.
> The situation is not the same C->S, where large uploads and other request
> content can be consumed at widely diverse rates. With multiplexing alone,
> if a large upload were allowed to hit the TCP flow control, it could
> HOL-block other small requests that should have been allowed to proceed.
> So #4 is required for C->S.   But for that to work, we need the control
> frames sent in the other direction not to be HOL-blocked, which then rules
> out #3 for S->C!
> It all seems reasonable and I can't think of what could be done better
> given the basic context of the problem (other than not incorrectly calling it
> a "window"), but it does leave us with a solution that only works as
> designed if it never runs at the maximum throughput of the
> available TCP channel.  This might not be a problem if we routinely get
> close to the maximum available throughput, but experience may show that
> there are scenarios where we fall short of near optimal bandwidth
> utilisation.     It's something that we have to watch for carefully and
> luckily HTTP/1.1 is not going away any time soon, so we have an alternative
> for any such scenarios discovered.
> regards
> On 11 March 2015 at 12:38, Roberto Peon <> wrote:
>> If you have a multiplexed protocol, and you want to proxy or load-balance
>> (basically true for most larger sites), then you'll be met with a rate
>> mismatch where a sender on either side is sending faster than the receiver
>> is reading. You are then presented with a few options:
>>   1) drop the connection and lose all the streams in progress
>>   2) drop the stream/data and lose any progress on that stream
>>   3) pause (i.e. head of line block) for an indeterminate amount of time
>> until that stream drains
>>   4) exert flow control.
>>   5) infinitely buffer (i.e. OOM and lose all connections for all clients)
>> Of these, #4 seemed the best choice to keep the flows flowing.
>> I'd have loved to see the "blocked because of flow control" thing become
>> a reality and thus allow autotuning, but I haven't seen any moaning about
>> the lack of it. That implies either that stuff works fine, or that we're
>> not paying close enough attention.
>> -=R
>> On Tue, Mar 10, 2015 at 1:31 PM, Bob Briscoe <> wrote:
>>>  HTTP/2 folks,
>>> I know extensibility had already been discussed and put to bed, so the
>>> WG is entitled to rule out opening healed wounds.
>>> But have points like those I've made about flow control been raised
>>> before? Please argue. I may be wrong. Discussion can go on in parallel to
>>> the RFC publication process, even tho the process doesn't /require/ you to
>>> talk to me.
>>> If I'm right, then implementers are being mandated to write complex flow
>>> control code, when it might have little bearing on the performance benefits
>>> measured for http/2.
>>> Even if I'm right, and the WG goes ahead anyway, /I/ will understand. My
>>> review came in after your deadline.
>>> However, bear in mind that the Webosphere might not be so forgiving. If
>>> h2 goes ahead when potential problems have been identified, it could get a
>>> bad reputation simply due to the uncertainty, just when you want more
>>> people to take it up and try it out. Given you've put in a few person-years
>>> of effort, I would have thought you would not want to risk a reputation
>>> flop.
>>> I'm trying to help - I just can't go any faster.
>>> Bob
>>> At 14:43 06/03/2015, Bob Briscoe wrote:
>>> HTTP/2 folks,
>>> As I said, consider this as a late review from a clueful but fresh pair
>>> of eyes.
>>> My main concerns with the draft are:
>>> * extensibility (previous posting)
>>> * flow control (this posting - apologies for the length - I've tried to
>>> explain properly)
>>> * numerous open issues left dangling (see subsequent postings)
>>> The term 'window' as used throughout is incorrect and highly confusing,
>>> in:
>>> * 'flow control window' (44 occurrences),
>>> * 'initial window size' (5),
>>> * or just 'window size' (8)
>>> The HTTP/2 WINDOW_UPDATE mechanism constrains HTTP/2 to use only
>>> credit-based flow control, not window-based. At one point, it actually says
>>> it is credit-based (in flow control principle #2), but otherwise it
>>> incorrectly uses the term window.
>>> This is not just an issue of terminology. The more I re-read the flow
>>> control sections the more I became convinced that this terminology is not
>>> just /confusing/, rather it's evidence of /confusion/. It raises the
>>> questions
>>> * "Is HTTP/2 capable of the flow control it says it's capable of?"
>>> * "What type of flow-control protocol ought HTTP/2 to be capable of?"
>>> * "Can the WINDOW_UPDATE frame support the flow-control that HTTP/2
>>> needs?"
>>> To address these questions, it may help if I separate the two different
>>> cases HTTP/2 flow control attempts to cover (my own separation, not from
>>> the draft):
>>> a) Intermediate buffer control
>>> Here, a stream's flow enters /and/ leaves a buffer (e.g. at the
>>> app-layer of an intermediate node).
>>> b) Flow control by the ultimate client app.
>>> Here flow never releases memory (at least not during the life of the
>>> connection). The flow is solely consuming more and more memory (e.g. data
>>> being rendered into a client app's memory).
>>> ==a) Intermediate buffer control==
>>> For this, sliding window-based flow control would be appropriate,
>>> because the goal is to keep the e2e pipeline full without wasting buffer.
>>> Let me prove HTTP/2 cannot do window flow control. For window flow
>>> control, the sender needs to be able to advance both the leading and
>>> trailing edges of the window. In the draft:
>>> * WINDOW_UPDATE frames can only advance the leading edge of a 'window'
>>> (and they are constrained to positive values).
>>> * To advance the trailing edge, window flow control would need a
>>> continuous stream of acknowledgements back to the sender (like TCP). The
>>> draft does not provide ACKs at the app-layer, and the app-layer cannot
>>> monitor ACKs at the transport layer, so the sending app-layer cannot
>>> advance the trailing edge of a 'window'.
>>> So the protocol can only support credit-based flow control. It is
>>> incapable of supporting window flow control.
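The credit-only behaviour argued above can be made concrete with a small sender-side model. This is a minimal sketch, not code from the draft or any implementation: the class and method names are hypothetical, and the 65,535-byte starting value is the draft's initial window size. The only state the sender holds is one "credit remaining" counter, which DATA decreases and WINDOW_UPDATE increases; there is no trailing edge driven by acknowledgements.

```python
MAX_WINDOW = 2**31 - 1  # largest flow-control window the draft allows


class CreditSender:
    """Hypothetical sender-side view of one stream: a single credit counter."""

    def __init__(self, initial_credit=65_535):
        self.credit = initial_credit

    def send_data(self, nbytes):
        """Send up to nbytes of DATA payload; return how many bytes were allowed."""
        allowed = min(nbytes, max(self.credit, 0))
        self.credit -= allowed
        return allowed

    def on_window_update(self, increment):
        """Apply a WINDOW_UPDATE; the increment must be in 1..2^31-1."""
        if not 1 <= increment <= MAX_WINDOW:
            raise ValueError("WINDOW_UPDATE increment must be in 1..2^31-1")
        if self.credit + increment > MAX_WINDOW:
            raise OverflowError("flow-control window would exceed 2^31-1")
        self.credit += increment
```

With the default credit, a request to send 100,000 bytes is cut to 65,535, and nothing more can go out until a WINDOW_UPDATE arrives, however much of the earlier data the receiver has already consumed.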
>>> Next, I don't understand how a receiver can set the credit in
>>> 'WINDOW_UPDATE' to a useful value. If the sender needed the receiver to
>>> answer the question "How much more can I send than I have seen ACK'd?" that
>>> would be easy. But because the protocol is restricted to credit, the sender
>>> needs the receiver to answer the much harder open-ended question, "How much
>>> more can I send?" So the sender needs the receiver to know how many ACKs
>>> the sender has seen, but neither of them know that.
>>> The receiver can try, by taking a guess at the bandwidth-delay product,
>>> and adjusting the guess up or down, depending on whether its buffer is
>>> growing or shrinking. But this only works if the unknown bandwidth-delay
>>> product stays constant.
>>> However, BDP will usually be highly variable, as other streams come and
>>> go. So, in the time it takes to get a good estimate of the per-stream BDP,
>>> it will probably have changed radically, or the stream will most likely
>>> have finished anyway. This is why TCP bases flow control on a window, not
>>> credit. By complementing window updates with ACK stream info, a TCP sender
>>> has sufficient info to control the flow.
>>> The draft is indeed correct when it says:
>>> "   this can lead to suboptimal use of available
>>>    network resources if flow control is enabled without knowledge of the
>>>    bandwidth-delay product (see [RFC7323]).
>>> "
>>> Was this meant to be a veiled criticism of the protocol's own design? A
>>> credit-based flow control protocol like that in the draft does not provide
>>> sufficient information for either end to estimate the bandwidth-delay
>>> product, given it will be varying rapidly.
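To put rough numbers on the bandwidth-delay product concern, the arithmetic can be sketched as follows. The function names are illustrative, and the model assumes the simplest case: the sender can have at most one grant of credit in flight per round trip, so throughput is bounded by credit divided by RTT.

```python
def max_throughput_bps(credit_bytes, rtt_seconds):
    """Upper bound on throughput (bits/s) if at most credit_bytes
    can be in flight during each round trip."""
    return credit_bytes * 8 / rtt_seconds


def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product in bytes: the amount of data that must be
    in flight to keep a path of this bandwidth and RTT full."""
    return bandwidth_bps * rtt_seconds / 8
```

For example, `max_throughput_bps(65_535, 0.1)` gives roughly 5.2 Mbit/s for the draft's 65,535-byte initial window at a 100 ms RTT, while `bdp_bytes(100e6, 0.1)` shows that a 100 Mbit/s path at the same RTT needs about 1.25 MB in flight. A receiver that guesses too low leaves capacity idle; one that guesses too high commits buffer it may not want to commit.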
>>> ==b) Control by the ultimate client app==
>>> For this case, I believe neither window nor credit-based flow control is
>>> appropriate:
>>> * There is no memory management issue at the client end - even if
>>> there's a separate HTTP/2 layer of memory between TCP and the app, it would
>>> be pointless to limit the memory used by HTTP/2, because the data is still
>>> going to sit in the same user-space memory (or at least about the same
>>> amount of memory) when HTTP/2 passes it over for rendering.
>>> * Nonetheless, the receiving client does need to send messages to the
>>> sender to supplement stream priorities, by notifying when the state of the
>>> receiving application has changed (e.g. if the user's focus switches from
>>> one browser tab to another).
>>> * However, credit-based flow control would be very sluggish for such
>>> control, because credit cannot be taken back once it has been given (except
>>> HTTP/2 allows SETTINGS_INITIAL_WINDOW_SIZE to be reduced, but that's a
>>> drastic measure that hits all streams together).
>>> ==Flow control problem summary==
>>> With only a credit signal in the protocol, a receiver is going to have
>>> to allow generous credit in the WINDOW_UPDATEs so as not to hurt
>>> performance. But then, the receiver will not be able to quickly close down
>>> one stream (e.g. when the user's focus changes), because it cannot claw
>>> back the generous credit it gave, it can only stop giving out more.
>>> IOW: Between a rock and a hard place,... but don't tell them where the
>>> rock is.
>>> ==Towards a solution?==
>>> I think 'type-a' flow control (for intermediate buffer control) does not
>>> need to be at stream-granularity. Indeed, I suspect a proxy could control
>>> its app-layer buffering by controlling the receive window of the incoming
>>> TCP connection. Has anyone assessed whether this would be sufficient?
>>> I can understand the need for 'type-b' per-stream flow control (by the
>>> ultimate client endpoint). Perhaps it would be useful for the receiver to
>>> emit a new 'PAUSE_HINT' frame on a stream? Or perhaps updating per-stream
>>> PRIORITY would be sufficient? Either would minimise the response time to a
>>> half round trip. Whereas credit flow-control will be much more sluggish
>>> (see 'Flow control problem summary').
>>> Either approach would correctly propagate e2e. An intermediate node
>>> would naturally tend to prioritise incoming streams that fed into
>>> prioritised outgoing streams, so priority updates would tend to propagate
>>> from the ultimate receiver, through intermediate nodes, up to the ultimate
>>> sender.
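The responsiveness argument can be compared numerically. This is a rough sketch under stated assumptions, not a model of any real implementation: PAUSE_HINT is the hypothetical frame proposed above, the sender is assumed to stop the instant the hint arrives, and the path is assumed symmetric, so about one RTT of data still reaches the receiver (half an RTT already in flight, plus half an RTT sent while the hint travels). With credit alone, the sender may legitimately use everything already granted.

```python
def bytes_after_stop_credit(outstanding_credit):
    """Credit cannot be revoked, so the peer may still send all of it."""
    return outstanding_credit


def bytes_after_stop_hint(send_rate_bytes_per_s, rtt_s, outstanding_credit):
    """With a hypothetical PAUSE_HINT honoured on arrival, about one RTT of
    data still arrives, and never more than the credit already granted."""
    return min(outstanding_credit, send_rate_bytes_per_s * rtt_s)
```

For a sender at 1.25 MB/s (10 Mbit/s) over a 100 ms RTT with 1 MiB of credit outstanding, the credit-only case admits up to 1,048,576 more bytes, while the hint case admits roughly 125,000.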
>>> ==Flow control coverage==
>>> The draft exempts all TCP payload bytes from flow control except HTTP/2
>>> data frames. No rationale is given for this decision. The draft says it's
>>> important to manage per-stream memory, then it exempts all the frame types
>>> except data, even tho each byte of a non-data frame consumes no less memory
>>> than a byte of a data frame.
>>> What message does this put out? "Flow control is not important for one
>>> type of bytes with unlimited total size, but flow control is so important
>>> that it has to be mandatory for the other type of bytes."
>>> It is certainly critical that WINDOW_UPDATE messages are not covered by
>>> flow control, otherwise there would be a real risk of deadlock. It might be
>>> that there are dependencies on other frame types that would lead to a
>>> dependency loop and deadlock. It would be good to know what the rationale
>>> behind these rules was.
>>> ==Theory?==
>>> I am concerned that HTTP/2 flow control may have entered new theoretical
>>> territory, without suitable proof of safety. The only reassurance we have
>>> is one implementation of a flow control algorithm (SPDY), and the anecdotal
>>> non-evidence that no-one using SPDY has noticed a deadlock yet (however, is
>>> anyone monitoring for deadlocks?).
>>> Whereas SPDY has been an existence proof that an approach like http/2
>>> 'works', so far all the flow control algos have been pretty much identical
>>> (I think that's true?). I am concerned that the draft takes the InterWeb
>>> into uncharted waters, because it allows unconstrained diversity in flow
>>> control algos, which is an untested degree of freedom.
>>> The only constraints the draft sets are:
>>> * per-stream flow control is mandatory
>>> * the only protocol message for flow control algos to use is the
>>> WINDOW_UPDATE credit message, which cannot be negative
>>> * no constraints on flow control algorithms.
>>> * and all this must work within the outer flow control constraints of
>>> TCP.
>>> Some algos might use priority messages to make flow control assumptions.
>>> While other algos might associate PRI and WINDOW_UPDATE with different
>>> meanings. What confidence do we have that everyone's optimisation
>>> algorithms will interoperate? Do we know there will not be certain types of
>>> application where deadlock is likely?
>>> "   When using flow
>>>    control, the receiver MUST read from the TCP receive buffer in a
>>>    timely fashion.  Failure to do so could lead to a deadlock when
>>>    critical frames, such as WINDOW_UPDATE, are not read and acted upon.
>>> "
>>> I've been convinced (offlist) that deadlock will not occur as long as
>>> the app consumes data 'greedily' from TCP. That has since been articulated
>>> in the above normative text. But how sure can we be that every
>>> implementer's different interpretations of 'timely' will still prevent
>>> deadlock?
>>> Until a good autotuning algorithm for TCP receive window management was
>>> developed, good window management code was nearly non-existent. Managing
>>> hundreds of interdependent stream buffers is a much harder problem. But
>>> implementers are being allowed to just 'Go forth and innovate'. This might
>>> work if everyone copies available open source algo(s). But they might not,
>>> and they don't have to.
>>> This all seems like 'flying by the seat of the pants'.
>>> ==Mandatory Flow Control? ==
>>> "      3. [...] A sender
>>>        MUST respect flow control limits imposed by a receiver."
>>> This ought to be a 'SHOULD' because it is contradicted later - if
>>> settings change.
>>> "   6.  Flow control cannot be disabled."
>>> Also effectively contradicted half a page later:
>>> "   Deployments that do not require this capability can advertise a flow
>>>    control window of the maximum size (2^31-1), and by maintaining this
>>>    window by sending a WINDOW_UPDATE frame when any data is received.
>>>    This effectively disables flow control for that receiver."
>>> And contradicted in the definition of half closed (remote):
>>> "  half closed (remote):
>>>       [...] an endpoint is no longer
>>>       obligated to maintain a receiver flow control window.
>>> "
>>> And contradicted in Section 8.3, "The CONNECT Method", which says:
>>> "  Frame types other than DATA
>>>    or stream management frames (RST_STREAM, WINDOW_UPDATE, and PRIORITY)
>>>    MUST NOT be sent on a connected stream, and MUST be treated as a
>>>    stream error (Section 5.4.2) if received.
>>> "
>>> Why is flow control so important that it's mandatory, but so unimportant
>>> that you MUST NOT do it when using TLS e2e?
>>> Going back to the earlier quote about using the max window size, it
>>> seems perverse for the spec to require endpoints to go through the motions
>>> of flow control, even if they arrange for it to affect nothing, but to
>>> still require implementation complexity and bandwidth waste with a load of
>>> redundant WINDOW_UPDATE frames.
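The cost of those WINDOW_UPDATE frames can be estimated from the frame layout: a 9-octet frame header followed by a 4-octet payload carrying the 31-bit increment, with frame type 0x8. The sketch below encodes one such frame and computes the ratio of WINDOW_UPDATE bytes (sent in the reverse direction) to DATA frame bytes, under the assumption that the receiver sends exactly one WINDOW_UPDATE per DATA frame at a single level (stream or connection). The function names are illustrative.

```python
import struct

FRAME_HEADER_LEN = 9       # octets in the frame header
WINDOW_UPDATE_TYPE = 0x8   # frame type code for WINDOW_UPDATE
MAX_31_BIT = 2**31 - 1


def encode_window_update(stream_id, increment):
    """Encode a WINDOW_UPDATE frame (13 octets) for stream_id (0 = connection)."""
    if not 0 <= stream_id <= MAX_31_BIT:
        raise ValueError("stream id must fit in 31 bits")
    if not 1 <= increment <= MAX_31_BIT:
        raise ValueError("increment must be in 1..2^31-1")
    payload = struct.pack("!I", increment)
    length = struct.pack("!I", len(payload))[1:]  # 24-bit length field
    header = length + bytes([WINDOW_UPDATE_TYPE, 0x0]) + struct.pack("!I", stream_id)
    return header + payload


def window_update_overhead(data_payload_bytes):
    """Ratio of WINDOW_UPDATE bytes to DATA frame bytes, assuming one
    WINDOW_UPDATE is returned for every DATA frame."""
    window_update_len = FRAME_HEADER_LEN + 4
    return window_update_len / (FRAME_HEADER_LEN + data_payload_bytes)
```

Under this assumption the ratio depends entirely on DATA frame size: it is small for full 16,384-byte frames and grows quickly as frames shrink, and it doubles if an update is sent at both the stream and the connection level.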
>>> HTTP is used on a wide range of devices, down to the very small and
>>> challenged. HTTP/2 might be desirable in such cases, because of the
>>> improved efficiency (e.g. header compression), but in many cases the stream
>>> model may not be complex enough to need stream flow control.
>>> So why not make flow control optional on the receiving side, but
>>> mandatory to implement on the sending side? Then an implementation could
>>> have no machinery for tuning window sizes, but it would respond correctly
>>> to those set by the other end, which requires much simpler code.
>>> If a receiving implementation chose not to do stream flow control, it
>>> could still control flow at the connection (stream 0) level, or at least at
>>> the TCP level.
>>> ==Inefficiency?==
>>> 5.2. Flow Control
>>> "Flow control is used for both individual
>>>    streams and for the connection as a whole."
>>> Does this mean that every WINDOW_UPDATE on a stream has to be
>>> accompanied by another WINDOW_UPDATE frame on stream zero? If so, this
>>> seems like 100% message redundancy. Surely I must have misunderstood.
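The two-level accounting in the draft's WINDOW_UPDATE section can be sketched as two independent counters: each DATA frame is charged against both the stream's window and the connection's window, and a WINDOW_UPDATE replenishes only the level named by its stream identifier (0 for the connection). The class below is a hypothetical illustration under that reading, not an implementation of the full rules.

```python
DEFAULT_WINDOW = 65_535  # initial window size in the draft


class TwoLevelSender:
    """Hypothetical sender tracking per-stream and connection-level credit."""

    def __init__(self):
        self.conn_credit = DEFAULT_WINDOW
        self.stream_credit = {}

    def _stream(self, stream_id):
        # Streams start with the default window the first time they are used.
        return self.stream_credit.setdefault(stream_id, DEFAULT_WINDOW)

    def send_data(self, stream_id, nbytes):
        """Charge DATA against both windows; return the bytes allowed."""
        allowed = max(0, min(nbytes, self._stream(stream_id), self.conn_credit))
        self.stream_credit[stream_id] -= allowed
        self.conn_credit -= allowed
        return allowed

    def on_window_update(self, stream_id, increment):
        """Replenish the connection window (stream 0) or one stream's window."""
        if stream_id == 0:
            self.conn_credit += increment
        else:
            self.stream_credit[stream_id] = self._stream(stream_id) + increment
```

In this model a single connection-level update can cover data received on many streams, and a stream-level update does nothing for the connection window, so the two kinds of frame need not be sent in pairs; how often a receiver sends each is left to its own policy.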
>>> ==Flow Control Requirements==
>>> I'm not convinced that clear understanding of flow control requirements
>>> has driven flow control design decisions.
>>> The draft states various needs for flow-control without giving me a feel
>>> of confidence that it has separated out the different cases, and chosen a
>>> protocol suitable for each. I tried to go back to the early draft on flow
>>> control requirements, and I was not impressed.
>>> I have quoted below the various sentences in the draft that state what
>>> flow control is believed to be for. Below that, I have attempted to
>>> crystallise out the different concepts, each of which I have tagged within
>>> the quotes.
>>> * 2. HTTP/2 Protocol Overview says:
>>>   "Flow control and prioritization ensure that it is possible to
>>> efficiently use multiplexed streams. [Y]
>>>    Flow control (Section 5.2) helps to ensure that only data that can be
>>> used by a receiver is transmitted. [X]"
>>> * 5.2. Flow Control says:
>>>   "Using streams for multiplexing introduces contention over use of the
>>> TCP connection [X], resulting in blocked streams [Z]. A flow control scheme
>>> ensures that streams on the same connection do not destructively interfere
>>> with each other [Z]."
>>> * 5.2.2. Appropriate Use of Flow Control says:
>>> "  Flow control is defined to protect endpoints that are operating under
>>>    resource constraints.  For example, a proxy needs to share memory
>>>    between many connections, and also might have a slow upstream
>>>    connection and a fast downstream one [Y].  Flow control addresses cases
>>>    where the receiver is unable to process data on one stream, yet wants
>>>    to continue to process other streams in the same connection [X]."
>>> "  Deployments with constrained resources (for example, memory) can
>>>    employ flow control to limit the amount of memory a peer can consume. [Y]"
>>> Each requirement has been tagged as follows:
>>> [X] Notification of the receiver's changing utility for each stream
>>> [Y] Prioritisation of streams due to contention over the streaming
>>> capacity available to the whole connection.
>>> [Z] Ensuring one stream is not blocked by another.
>>> [Z] might be a variant of [Y], but [Z] sounds more binary, whereas [Y]
>>> sounds more like optimisation across a continuous spectrum.
>>> Regards
>>> Bob
>>> ________________________________________________________________
>>> Bob Briscoe,                                                  BT
> --
> Greg Wilkins <>  @  Webtide - *an Intalio subsidiary*
> HTTP, SPDY, Websocket server and client that scales
> advice and support for jetty and cometd.