Re: Priority implementation complexity (was: Re: Extensible Priorities and Reprioritization)

Kazuho Oku <> Tue, 16 June 2020 00:09 UTC

From: Kazuho Oku <>
Date: Tue, 16 Jun 2020 09:06:36 +0900
Message-ID: <>
To: Barry Pollard <>
Cc: Patrick Meenan <>, Yoav Weiss <>, Bence Béky <>, HTTP Working Group <>, Lucas Pardue <>, Stefan Eissing <>

On Tue, 16 Jun 2020 at 0:36, Barry Pollard <> wrote:

> Probably a stupid question, but should prioritisation (and in particular
> reprioritization) really affect server side processing (i.e. sending it
> downstream to a backend) or only in deciding what to send back? That is, is
> prioritisation primarily about using server side resources the most
> efficiently, or about using the client-side bandwidth the most efficiently?
> Ideally both of course, but with that comes added complexity, and to me
> client-side bandwidth limits are the main reason for prioritisation.

I think this is a very good way of phrasing the problem, and I think that
the answer is that it should not be written in the spec. It is up to the
internals of each server.

Let's use Stefan's example: an H3 reverse proxy sitting in front of a legacy
H1 server.

Assuming that the legacy H1 server cannot handle many connections at once
(e.g. a prefork server), the H3 reverse proxy needs to limit the number of
connections that it opens concurrently to the H1 server. For example, the
H3 reverse proxy might support a maximum stream concurrency of 256, but limit
the number of inflight requests forwarded to the backend server to 8 per
H3 connection. If the number of inflight requests sent from the client
exceeds the latter limit (i.e. 8), the excess requests get queued, and are
issued as the inflight requests to the backend server complete.

I'm not sure about other H2 / H3 servers, but H2O does provide this
capability of limiting the backend concurrency [1].

In such a deployment, I would argue that the H3 reverse proxy should
respect the priorities when choosing a request from the queue, regardless
of whether the prioritization signal is the initial one or a
reprioritization. Otherwise, the H3 reverse proxy would end up providing
responses to requests mostly in the order in which the requests were issued.
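
To make the idea concrete, here is a minimal sketch of such a queue. It is
only an illustration under stated assumptions, not how H2O or any other
server implements it: the class and method names (BackendQueue, submit,
reprioritize, complete) are hypothetical, the limit of 8 is the example
figure from above, and a lower urgency value is assumed to mean "more
urgent", as in Extensible Priorities.

```python
import heapq
import itertools


class BackendQueue:
    """Hypothetical per-connection queue; forwards at most `limit` requests."""

    def __init__(self, limit=8):
        self.limit = limit              # max inflight requests to the backend
        self.inflight = set()           # stream ids currently at the backend
        self.heap = []                  # entries: (urgency, seq, stream_id)
        self.current = {}               # stream_id -> live (urgency, seq) key
        self.seq = itertools.count()    # arrival order breaks urgency ties

    def submit(self, stream_id, urgency):
        """Queue a new request, then forward whatever the limit allows."""
        key = (urgency, next(self.seq))
        self.current[stream_id] = key
        heapq.heappush(self.heap, (*key, stream_id))
        return self._dispatch()

    def reprioritize(self, stream_id, urgency):
        """Update a queued request; already forwarded requests are left alone."""
        if stream_id not in self.current:
            return
        seq = self.current[stream_id][1]
        self.current[stream_id] = (urgency, seq)
        # The old heap entry becomes stale and is skipped in _dispatch().
        heapq.heappush(self.heap, (urgency, seq, stream_id))

    def complete(self, stream_id):
        """A backend request retired; pick the next queued one by priority."""
        self.inflight.discard(stream_id)
        return self._dispatch()

    def _dispatch(self):
        started = []
        while self.heap and len(self.inflight) < self.limit:
            urgency, seq, stream_id = heapq.heappop(self.heap)
            if self.current.get(stream_id) != (urgency, seq):
                continue                # stale entry left by a reprioritization
            del self.current[stream_id]
            self.inflight.add(stream_id)
            started.append(stream_id)
        return started
```

With a limit of 2 and five requests submitted at the same urgency, the
first two are forwarded immediately and the other three wait. If the fifth
is then reprioritized to a more urgent level, it is the one picked when a
backend slot frees up, ahead of the third and fourth. Without the
reprioritization the queue falls back to arrival order, which is the
behaviour I am arguing against.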

Looking at the issue from a theoretical point of view, the reason for having
a priority signal is to change the order in which the responses are provided.
It is beneficial to respect the signal at any point where we might see
contention, especially at points where we'd have severe contention. While it
is true that the biggest bottleneck is the downstream bandwidth, there could
be other severe bottlenecks depending on each deployment.


> So could you simplify this and concentrate on the return queue only - at
> least for reprioritization? There are already discussions to
> drop reprioritization completely, so limiting it to this (I'm assuming
> simpler) scope would perhaps save us from going to that extreme position
> but also avoid making this overly complex.
> So in Stefan's example earlier (reprioritization making request A already
> in process, dependent on the later received request B not started yet) we
> basically continue to process request A (as it's already started), and if
> it's ready to go, and B is still not, then start sending A - the dependency
> is only used when both requests are at the same stage and we need to make a
> priority call between them. If that takes some time to send and B becomes
> available during that time, then now we have a queue of bytes to send and
> so a choice about what to send in the next batch of bytes. Then we can use
> the prioritization hints to send B's bytes in preference to A's.
> So I agree with Stefan's point that "it is tricky to obey
> re-prioritizations to the letter" (impossible for responses already sent)
> but do we need to? Are they not just hints at the end of the day? And he
> also says "any change after that will 'only' affect the downstream DATA
> allocation" but I'd argue that's the main point.
> Or it could be I'm massively oversimplifying this and/or completely
> misunderstanding the problem. First post to this group after lurking for a
> while so be nice to me :-)
> On Mon, 15 Jun 2020 at 15:19, Patrick Meenan <> wrote:
>> Yep, totally solvable, just not "perfectly". Something like setting
>> maximums at global and "current urgency" level would probably work well
>> enough. i.e. 12 workers per connection max, 6 workers max for the current
>> highest-urgency level. The servers also have no idea how quickly the
>> back-ends will fulfill the requests and probably need to start workers for
>> lower priority streams if there are only a few high-priority streams in
>> flight to make sure they have data to fill the pipe.
>> A trivial implementation that spawned a new worker every time a higher
>> priority request came in would be DOSable with reprioritization if a client
>> kept issuing high-urgency requests and immediately de-prioritizing them.
>> It's probably also worthwhile for browsers and web devs to keep all of
>> this in mind when issuing the requests where the optimal behavior from the
>> server's perspective would be to receive the stream requests in the order
>> of priority. Even without reprioritization of existing streams, injecting
>> higher priority streams ahead of in-flight lower priority streams is going
>> to be worse for perf than if the streams were originally requested in the
>> optimal order (how much worse is going to depend on the server
>> implementation).  Since it's all discovered dynamically on the browser and
>> server side, the web devs play a pretty big role here as well (though
>> browsers might be able to help absorb some of it by pacing new stream
>> requests as the initial parse/loading is happening depending on what is
>> already in-flight).
>> On Mon, Jun 15, 2020 at 8:52 AM Yoav Weiss <> wrote:
>>> On Mon, Jun 15, 2020 at 2:10 PM Patrick Meenan <>
>>> wrote:
>>>> Even without a priority tree it is likely that the H3 extensible
>>>> priorities structure would cause not-yet-started responses to need to be
>>>> scheduled ahead of in-flight responses. The urgency value is effectively a
>>>> parent/child relationship.
>>> I wouldn't consider the inflight response dependent on the
>>> not-yet-started one here, because the not-yet-started one doesn't need a
>>> specific response to hold for it, it just needs *any* response to finish
>>> (and free up a worker thread in the architecture Stefan outlined) in order
>>> to be sent (with its appropriate priority).
>>> So, it might be somewhat delayed, but won't be held back forever if a
>>> specific response is a long sending one. (assuming the number of worker
>>> threads is not very small)
>>>> It's not as unbounded as H2 but if you churned through a bunch of
>>>> reprioritizations with stalled streams you could cause issues for a server
>>>> that didn't protect against it.
>>>> Limiting the reprioritizations to "what stream to pick next" would help
>>>> but wouldn't solve the long download problem.
>>>> On Mon, Jun 15, 2020 at 7:44 AM Yoav Weiss <> wrote:
>>>>> On Mon, Jun 15, 2020 at 1:18 PM Stefan Eissing <> wrote:
>>>>>> Stefan Eissing
>>>>>> <green/>bytes GmbH
>>>>>> Hafenweg 16
>>>>>> 48155 Münster
>>>>>> > On 15.06.2020 at 12:14, Yoav Weiss <> wrote:
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Mon, Jun 15, 2020 at 11:03 AM Stefan Eissing <> wrote:
>>>>>> >
>>>>>> > > On 15.06.2020 at 10:28, Yoav Weiss <> wrote:
>>>>>> > >
>>>>>> > >
>>>>>> > >
>>>>>> > > On Mon, Jun 15, 2020 at 9:55 AM Stefan Eissing <> wrote:
>>>>>> > > > On 11.06.2020 at 10:41, Kazuho Oku <> wrote:
>>>>>> > > >
>>>>>> > > > That depends on how much clients would rely on
>>>>>> reprioritization. Unlike H2 priorities, Extensible Priority does not have
>>>>>> inter-stream dependencies. Therefore, losing *some* prioritization signals
>>>>>> is less of an issue compared to H2 priorities.
>>>>>> > > >
>>>>>> > > > Assuming that reprioritization is used mostly for refining the
>>>>>> initial priorities of a fraction of all the requests, I think there'd be
>>>>>> benefit in defining reprioritization as an optional feature. Though I can
>>>>>> see some might argue for not having reprioritization even as an optional
>>>>>> feature unless there is proof that it would be useful.
>>>>>> > >
>>>>>> > >
>>>>>> > > > We should decide if reprioritization is good or bad, based on
>>>>>> as much data as we can pull, and make sure it's implemented only if we see
>>>>>> benefits for it in some cases, and then make sure it's only used in those
>>>>>> cases.
>>>>>> > >
>>>>>> > > When thinking about priority implementations, I recommend
>>>>>> thinking about a H3 reverse proxy in front of a legacy H1 server. Assume
>>>>>> limited memory, disk space and backend connections.
>>>>>> > >
>>>>>> > > (Re-)prioritization in H2 works well for flow control, among the
>>>>>> streams that have response data to send. Priorities can play a part in
>>>>>> server scheduling, but
>>>>>> > > it's more tricky. By "scheduling" I mean that the server has to
>>>>>> pick one among the opened streams for which it wants to compute a response.
>>>>>> This is often impossible to re-prioritize afterwards (e.g. suicidal
>>>>>> for a server implementation).
>>>>>> > >
>>>>>> > > Can you expand on why it is "suicidal"?
>>>>>> >
>>>>>> > It is tricky to obey re-prioritizations to the letter, managing
>>>>>> memory+backend connections and protecting the infrastructure against DoS
>>>>>> attacks. The reality is that there are limited resources and a server is
>>>>>> expected to protect those. It's a (pun intended) top priority.
>>>>>> >
>>>>>> > Another priority topping the streams is the concept of fairness
>>>>>> between connections. In Apache httpd, the resources to process h2 streams
>>>>>> are foremost shared evenly between connections.
>>>>>> >
>>>>>> > That makes sense. Would re-prioritization of specific streams
>>>>>> somehow require to change that?
>>>>>> >
>>>>>> > The share a connection gets is then allocated to streams based on
>>>>>> current h2 priority settings. Any change after that will "only" affect the
>>>>>> downstream DATA allocation.
>>>>>> >
>>>>>> > I *think* this makes sense as well, assuming that by "downstream"
>>>>>> you mean "future". Is that what you meant? Or am I missing something?
>>>>>> >
>>>>>> > Also, the number of "active" streams on a connection is dynamic. It
>>>>>> will start relatively small and grow if the connection is well behaving,
>>>>>> shrink if it is not. That is one of the reasons that Apache was only partially
>>>>>> vulnerable to a single issue on the Netflix h2 cve list last year (the
>>>>>> other being nghttp2).
>>>>>> >
>>>>>> > tl;dr
>>>>>> >
>>>>>> > By "suicidal" I mean a server failing the task of processing thousands
>>>>>> of connections in a consistent and fair manner.
>>>>>> >
>>>>>> > Apologies if I'm being daft, but I still don't understand how
>>>>>> (internal to a connection) stream reprioritization impacts cross-connection
>>>>>> fairness.
>>>>>> *fails to imagine Yoav as being daft*
>>>>> :)
>>>>> Thanks for outlining the server-side processing!
>>>>>> A server with active connections and workers. For simplicity, assume
>>>>>> that each ongoing request allocates a worker.
>>>>>> - all workers are busy
>>>>>> - re-prio arrives and makes a stream A, being processed, depend on a
>>>>>> stream B which has not been assigned a worker yet.
>>>>> OK, I now understand that this can be concerning.
>>>>> IIUC, this part is solved with Extensible Priorities (because
>>>>> there's no dependency tree).
>>>>> Lucas, Kazuho - can you confirm?
>>>>>> - ideally, the server would freeze the processing of A and assign the
>>>>>> resources to B.
>>>>>> - however re-allocating the resources is often not possible  (Imagine
>>>>>> a CGI process running or a backend HTTP/1.1 or uWSGI connection.)
>>>>>> - the server can only suspend the worker or continue processing,
>>>>>> ignoring the dependency.
>>>>>> - a suspended worker is very undesirable and a possible victim of a
>>>>>> slow-loris attack
>>>>>> - To make this suspending less severe, the server would need to make
>>>>>> processing of stream B very important, to unblock it quickly again. This is
>>>>>> then where unfairness comes in.
>>>>>> The safe option therefore is to continue processing stream A and
>>>>>> ignore the dependency on B. Thus, priorities are only relevant:
>>>>>> 1. when the next stream to process on a connection is selected
>>>>>> 2. when size/number of DATA frames to send is allocated on a
>>>>>> connection between all streams that want to send
>>>>>> (Reality is often not quite as bad as I described: when static
>>>>>> file/cache resources are served for example, a worker often just does the
>>>>>> lookup, producing a file handle very quickly. A connection easily juggles a
>>>>>> number of file handles to stream out according to priorities and stalling
>>>>>> one file on another comes at basically no risk and cost.)
>>>>>> Now, this is for H2 priorities. I don't know enough about QUIC
>>>>>> priorities to have an opinion on the proposals. Just wanted to point out
>>>>>> that servers see the world a little differently than clients. ;)
>>>>> I checked and it seems like Chromium does indeed change the parent
>>>>> dependency as part of reprioritization. If the scenario you outlined is a
>>>>> problem in practice, we should discuss ways to avoid doing that with H2
>>>>> priorities.
>>>>>> Cheers, Stefan
>>>>>> > >
>>>>>> > >
>>>>>> > > If we would do H2 a second time, my idea would be to signal
>>>>>> priorities in the HTTP request in a connection header and use this in the
>>>>>> H2 frame layer to allocate DATA space on the downlink. Leave out changing
>>>>>> priorities on a request already started. Let the client use its window
>>>>>> sizes if it feels the need.
>>>>>> > >
>>>>>> > > Cheers, Stefan (lurking)
>>>>>> >

Kazuho Oku