Re: Priority implementation complexity (was: Re: Extensible Priorities and Reprioritization)

Barry Pollard <barry@tunetheweb.com> Tue, 16 June 2020 16:50 UTC

From: Barry Pollard <barry@tunetheweb.com>
Date: Mon, 15 Jun 2020 16:36:33 +0100
Message-ID: <CAM37g06T_U+Th83D7sH3S_LB-b_9TcaX5b9nDSFE8Z-x5U+c9A@mail.gmail.com>
To: Patrick Meenan <patmeenan@gmail.com>
Cc: Yoav Weiss <yoav@yoav.ws>, Bence Béky <bnc@chromium.org>, HTTP Working Group <ietf-http-wg@w3.org>, Kazuho Oku <kazuhooku@gmail.com>, Lucas Pardue <lucaspardue.24.7@gmail.com>, Stefan Eissing <stefan.eissing@greenbytes.de>

Probably a stupid question, but should prioritisation (and in particular
reprioritization) really affect server side processing (i.e. sending it
downstream to a backend) or only in deciding what to send back? That is, is
prioritisation primarily about using server side resources the most
efficiently, or about using the client-side bandwidth the most efficiently?
Ideally both of course, but with that comes added complexity, and to me
client-side bandwidth limits are the main reason for prioritisation.

So could you simplify this and concentrate on the return queue only - at
least for reprioritization? There are already discussions to
drop reprioritization completely, so limiting it to this (I'm assuming
simpler) scope would perhaps save us from going to that extreme position
but also avoid making this overly complex.

So in Stefan's example earlier (reprioritization making request A, already
in process, dependent on the later-received request B, not started yet) we
basically continue to process request A (as it's already started), and if
it's ready to go, and B is still not, then start sending A - the dependency
is only used when both requests are at the same stage and we need to make a
priority call between them. If A takes some time to send and B becomes
available during that time, then we now have a queue of bytes to send and
so a choice about what to send in the next batch of bytes. Then we can use
the prioritization hints to send B's bytes in preference to A's.
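
To make that concrete, here is a rough Python sketch of a "return queue
only" scheduler. It is illustrative only: SendQueue, Stream and the method
names are made up for this example and are not any server's API. It assumes
a single numeric urgency per stream where a lower value is more urgent (as
in Extensible Priorities), and it ignores the incremental flag and flow
control. Backend processing is never touched; priority is consulted only
when picking whose bytes go into the next batch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Stream:
    stream_id: int
    urgency: int = 3  # assumed scale: 0 (most urgent) .. 7, default 3
    pending: bytearray = field(default_factory=bytearray)  # bytes ready to send


class SendQueue:
    """Hypothetical per-connection return queue (names are illustrative)."""

    def __init__(self) -> None:
        self.streams: Dict[int, Stream] = {}

    def open(self, stream_id: int, urgency: int = 3) -> None:
        self.streams[stream_id] = Stream(stream_id, urgency)

    def reprioritize(self, stream_id: int, urgency: int) -> None:
        # Only affects future send decisions; work already started continues.
        self.streams[stream_id].urgency = urgency

    def response_ready(self, stream_id: int, data: bytes) -> None:
        # Called whenever the backend produces bytes for a stream.
        self.streams[stream_id].pending += data

    def next_batch(self, max_bytes: int) -> Optional[Tuple[int, bytes]]:
        # Consider only streams that actually have bytes waiting.
        ready = [s for s in self.streams.values() if s.pending]
        if not ready:
            return None
        # Lowest urgency value wins; ties go to the lowest stream id.
        chosen = min(ready, key=lambda s: (s.urgency, s.stream_id))
        chunk = bytes(chosen.pending[:max_bytes])
        del chosen.pending[:max_bytes]
        return chosen.stream_id, chunk
```

With A as stream 1 (urgency 3) and B as stream 2 (urgency 1): if only A has
bytes ready, next_batch() sends A's bytes; once B's bytes arrive, the next
batch comes from B, and A resumes after B is drained.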

So I agree with Stefan's point that "it is tricky to obey
re-prioritizations to the letter" (impossible for responses already sent)
but do we need to? Are they not just hints at the end of the day? And he
also says "any change after that will 'only' affect the downstream DATA
allocation" but I'd argue that's the main point.

Or it could be I'm massively oversimplifying this and/or completely
misunderstanding the problem. First post to this group after lurking for a
while so be nice to me :-)

On Mon, 15 Jun 2020 at 15:19, Patrick Meenan <patmeenan@gmail.com> wrote:

> Yep, totally solvable, just not "perfectly". Something like setting
> maximums at global and "current urgency" level would probably work well
> enough. i.e. 12 workers per connection max, 6 workers max for the current
> highest-urgency level. The servers also have no idea how quickly the
> back-ends will fulfill the requests and probably need to start workers for
> lower priority streams if there are only a few high-priority streams in
> flight to make sure they have data to fill the pipe.
>
> A trivial implementation that spawned a new worker every time a higher
> priority request came in would be DOSable with reprioritization if a client
> kept issuing high-urgency requests and immediately de-prioritizing them.
>
> It's probably also worthwhile for browsers and web devs to keep all of
> this in mind when issuing the requests where the optimal behavior from the
> server's perspective would be to receive the stream requests in the order
> of priority. Even without reprioritization of existing streams, injecting
> higher priority streams ahead of in-flight lower priority streams is going
> to be worse for perf than if the streams were originally requested in the
> optimal order (how much worse is going to depend on the server
> implementation).  Since it's all discovered dynamically on the browser and
> server side, the web devs play a pretty big role here as well (though
> browsers might be able to help absorb some of it by pacing new stream
> requests as the initial parse/loading is happening depending on what is
> already in-flight).
>
> On Mon, Jun 15, 2020 at 8:52 AM Yoav Weiss <yoav@yoav.ws> wrote:
>
>>
>>
>> On Mon, Jun 15, 2020 at 2:10 PM Patrick Meenan <patmeenan@gmail.com>
>> wrote:
>>
>>> Even without a priority tree it is likely that the H3 extensible
>>> priorities structure would cause not-yet-started responses to need to be
>>> scheduled ahead of in-flight responses. The urgency value is effectively a
>>> parent/child relationship.
>>>
>>
>> I wouldn't consider the inflight response dependent on the
>> not-yet-started one here, because the not-yet-started one doesn't need a
>> specific response to hold for it, it just needs *any* response to finish
>> (and free up a worker thread in the architecture Stefan outlined) in order
>> to be sent (with its appropriate priority).
>> So, it might be somewhat delayed, but won't be held back forever if a
>> specific response is a long sending one. (assuming the number of worker
>> threads is not very small)
>>
>>
>>> It's not as unbounded as H2 but if you churned through a bunch of
>>> reprioritizations with stalled streams you could cause issues for a server
>>> that didn't protect against it.
>>>
>>> Limiting the reprioritizations to "what stream to pick next" would help
>>> but wouldn't solve the long download problem.
>>>
>>> On Mon, Jun 15, 2020 at 7:44 AM Yoav Weiss <yoav@yoav.ws> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Jun 15, 2020 at 1:18 PM Stefan Eissing <
>>>> stefan.eissing@greenbytes.de> wrote:
>>>>
>>>>>
>>>>> Stefan Eissing
>>>>>
>>>>> <green/>bytes GmbH
>>>>> Hafenweg 16
>>>>> 48155 Münster
>>>>> www.greenbytes.de
>>>>>
>>>>> > Am 15.06.2020 um 12:14 schrieb Yoav Weiss <yoav@yoav.ws>:
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Jun 15, 2020 at 11:03 AM Stefan Eissing <
>>>>> stefan.eissing@greenbytes.de> wrote:
>>>>> >
>>>>> > > Am 15.06.2020 um 10:28 schrieb Yoav Weiss <yoav@yoav.ws>:
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > > On Mon, Jun 15, 2020 at 9:55 AM Stefan Eissing <
>>>>> stefan.eissing@greenbytes.de> wrote:
>>>>> > > > Am 11.06.2020 um 10:41 schrieb Kazuho Oku <kazuhooku@gmail.com>:
>>>>> > > >
>>>>> > > > That depends on how much clients would rely on reprioritization.
>>>>> Unlike H2 priorities, Extensible Priority does not have inter-stream
>>>>> dependencies. Therefore, losing *some* prioritization signals is less of an
>>>>> issue compared to H2 priorities.
>>>>> > > >
>>>>> > > > Assuming that reprioritization is used mostly for refining the
>>>>> initial priorities of a fraction of all the requests, I think there'd be
>>>>> benefit in defining reprioritization as an optional feature. Though I can
>>>>> see some might argue for not having reprioritization even as an optional
>>>>> feature unless there is proof that it would be useful.
>>>>> > >
>>>>> > >
>>>>> > > > We should decide if reprioritization is good or bad, based on as
>>>>> much data as we can pull, and make sure it's implemented only if we see
>>>>> benefits for it in some cases, and then make sure it's only used in those
>>>>> cases.
>>>>> > >
>>>>> > > When thinking about priority implementations, I recommend thinking
>>>>> about a H3 reverse proxy in front of a legacy H1 server. Assume limited
>>>>> memory, disk space and backend connections.
>>>>> > >
>>>>> > > (Re-)prioritization in H2 works well for flow control, among the
>>>>> streams that have response data to send. Priorities can play a part in
>>>>> server scheduling, but
>>>>> > > it's more tricky. By "scheduling" I mean that the server has to
>>>>> pick one among the opened streams for which it wants to compute a response
>>>>> for. This is often impossible to re-prioritize afterwards (e.g. suicidal
>>>>> for a server implementation).
>>>>> > >
>>>>> > > Can you expand on why it is "suicidal"?
>>>>> >
>>>>> > It is tricky to obey re-prioritizations to the letter, managing
>>>>> memory+backend connections and protecting the infrastructure against DoS
>>>>> attacks. The reality is that there are limited resources and a server is
>>>>> expected to protect those. It's a (pun intended) top priority.
>>>>> >
>>>>> > Another priority topping the streams is the concept of fairness
>>>>> between connections. In Apache httpd, the resources to process h2 streams
>>>>> are foremost shared evenly between connections.
>>>>> >
>>>>> > That makes sense. Would re-prioritization of specific streams
>>>>> somehow require to change that?
>>>>> >
>>>>> > The share a connection gets is then allocated to streams based on
>>>>> current h2 priority settings. Any change after that will "only" affect the
>>>>> downstream DATA allocation.
>>>>> >
>>>>> > I *think* this makes sense as well, assuming that by "downstream"
>>>>> you mean "future". Is that what you meant? Or am I missing something?
>>>>> >
>>>>> > Also, the number of "active" streams on a connection is dynamic. It
>>>>> will start relatively small and grow if the connection is well behaving,
>>>>> shrink if it is not. That is one of the reasons that Apache was only partially
>>>>> vulnerable to a single issue on the Netflix h2 cve list last year (the
>>>>> other being nghttp2).
>>>>> >
>>>>> > tl;dr
>>>>> >
>>>>> > By "suicidal" I mean a server failing the task of processing thousands
>>>>> of connections in a consistent and fair manner.
>>>>> >
>>>>> > Apologies if I'm being daft, but I still don't understand how
>>>>> (internal to a connection) stream reprioritization impacts cross-connection
>>>>> fairness.
>>>>>
>>>>> *fails to imagine Yoav as being daft*
>>>>>
>>>> :)
>>>>
>>>> Thanks for outlining the server-side processing!
>>>>
>>>>
>>>>> A server with active connections and workers. For simplicity, assume
>>>>> that each ongoing request allocates a worker.
>>>>> - all workers are busy
>>>>> - re-prio arrives and makes a stream A, being processed, depend on a
>>>>> stream B which has not been assigned a worker yet.
>>>>>
>>>>
>>>> OK, I now understand that this can be concerning.
>>>> IIUC, this part is solved with Extensible Priorities (because
>>>> there's no dependency tree).
>>>>
>>>> Lucas, Kazuho - can you confirm?
>>>>
>>>>
>>>>> - ideally, the server would freeze the processing of A and assign the
>>>>> resources to B.
>>>>> - however re-allocating the resources is often not possible  (Imagine
>>>>> a CGI process running or a backend HTTP/1.1 or uWSGI connection.)
>>>>> - the server can only suspend the worker or continue processing,
>>>>> ignoring the dependency.
>>>>> - a suspended worker is very undesirable and a possible victim of a
>>>>> slow-loris attack
>>>>> - To make this suspending less severe, the server would need to make
>>>>> processing of stream B very important. To unblock it quickly again. This is
>>>>> then where unfairness comes in.
>>>>>
>>>>> The safe option therefore is to continue processing stream A and
>>>>> ignore the dependency on B. Thus, priorities are only relevant:
>>>>> 1. when the next stream to process on a connection is selected
>>>>> 2. when size/number of DATA frames to send is allocated on a
>>>>> connection between all streams that want to send
>>>>>
>>>>> (Reality is often not quite as bad as I described: when static
>>>>> file/cache resources are served for example, a worker often just does the
>>>>> lookup, producing a file handle very quickly. A connection easily juggles a
>>>>> number of file handles to stream out according to priorities and stalling
>>>>> one file on another comes at basically no risk and cost.)
>>>>>
>>>>> Now, this is for H2 priorities. I don't know enough about QUIC
>>>>> priorities to have an opinion on the proposals. Just wanted to point out
>>>>> that servers see the world a little differently than clients. ;)
>>>>>
>>>>
>>>> I checked and it seems like Chromium does indeed change the parent
>>>> dependency as part of reprioritization. If the scenario you outlined is a
>>>> problem in practice, we should discuss ways to avoid doing that with H2
>>>> priorities.
>>>>
>>>>
>>>>>
>>>>> Cheers, Stefan
>>>>>
>>>>>
>>>>> > >
>>>>> > >
>>>>> > > If we would do H2 a second time, my idea would be to signal
>>>>> priorities in the HTTP request in a connection header and use this in the
>>>>> H2 frame layer to allocate DATA space on the downlink. Leave out changing
>>>>> priorities on a request already started. Let the client use its window
>>>>> sizes if it feels the need.
>>>>> > >
>>>>> > > Cheers, Stefan (lurking)
>>>>> >
>>>>>
>>>>>