Re: [iccrg] Updates to draft-briscoe-iccrg-prague-congestion-control-03

Bob Briscoe <ietf@bobbriscoe.net> Tue, 31 October 2023 16:33 UTC

To: Marten Seemann <martenseemann@gmail.com>
Cc: iccrg IRTF list <iccrg@irtf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/dX4-pc8f0A4dVMEYWEf8qTAgEF8>

Marten,

On 31/10/2023 10:50, Marten Seemann wrote:
> I read the draft and I'm trying to figure out how I'd implement Prague 
> in my QUIC stack. There are a couple of things I've noticed:
>
> 1.
>
>     Section 2.3.2: It’s unclear to me when exactly /alpha/ is updated.
>     I assume that once I receive the first ACK, I save the timestamp.
>     When I receive a new ACK, there are two code paths: if it’s
>     received within one /rtt_virt/, just accumulate the counters used
>     to calculate /frac/. If it’s received after /rtt_virt/, update
>     /alpha/ according to the equation given in this section, reset the
>     counters for /frac/ and save the timestamp as the beginning of the
>     next /rtt_virt/ epoch. However, this would mean that the /alpha/
>     value used for multiplicative decrease (section 2.4.2) would
>     always be slightly outdated, which seems suboptimal for an
>     immediate response to a growing queue. Is there a better way?
>
[BB] Indeed. It's actually a lot worse than "slightly outdated". As it 
says at the end of the section you refer to:

    However, another approach is being investigated because these
    per-RTT updates introduce 1--2 rounds of delay into the congestion
    response on top of the inherent round of feedback delay (see Section
    3.1.3
    <https://datatracker.ietf.org/doc/html/draft-briscoe-iccrg-prague-congestion-control-03#pracc_faster_response>
    in the section on variants and future work).


Section 3.1.3 goes on to summarize the brief tech report I posted on 
arXiv that explains and defines an alternative approach that removes all 
this lag, cited as [PerAckEWMA]:
Removing the Clock Machinery Lag from DCTCP/Prague 
<https://arxiv.org/abs/2101.07727>

Joakim Misund was evaluating it about a year ago, when he decided to get 
a proper job ;) instead of doing his PhD with me. I am just getting 
round to working on it myself again. But if you wanted to try it 
yourself as well, we could certainly compare notes.
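
For reference, a minimal Python sketch of the per-rtt_virt update described in item 1 (not the per-ACK alternative from the tech report). The gain g = 1/16, the initial alpha of 1, and the class and method names are assumptions for illustration, not taken from the draft. The ACK that closes an epoch is counted in the epoch it closes.

```python
from dataclasses import dataclass
from typing import Optional

G = 1 / 16  # EWMA gain; assumed DCTCP-style default, not quoted from the draft


@dataclass
class AlphaEstimator:
    """Hypothetical helper: updates alpha once per rtt_virt epoch."""
    alpha: float = 1.0                   # assumed conservative initial value
    epoch_start: Optional[float] = None  # timestamp of the first ACK in the epoch
    acked: int = 0                       # packets acknowledged in this epoch
    marked: int = 0                      # of those, how many were CE-marked

    def on_ack(self, now: float, newly_acked: int, newly_marked: int,
               rtt_virt: float) -> bool:
        """Accumulate counters; return True if this ACK closed an epoch."""
        if self.epoch_start is None:
            self.epoch_start = now
        self.acked += newly_acked
        self.marked += newly_marked
        if now - self.epoch_start < rtt_virt or self.acked == 0:
            return False  # still inside the epoch: only accumulate
        frac = self.marked / self.acked
        self.alpha += G * (frac - self.alpha)  # EWMA step
        self.acked = self.marked = 0
        self.epoch_start = now                 # start of the next epoch
        return True
```

Because alpha only moves when an epoch closes, any decrease applied mid-epoch uses the value from the previous epoch, which is the lag discussed above.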

> 2.
>
>     Section 2.3.3: QUIC uses both packet- and time-threshold loss
>     detection (see sections 6.1.1 and 6.1.2 of RFC 9002). I’m not sure
>     what exactly the recommendation of this draft is.
>
[BB] I hadn't appreciated that QUIC deems there's been a loss if 
/either/ of these conditions is met [RFC9002; §6.1]:

    The packet was sent kPacketThreshold packets before an acknowledged
    packet (Section 6.1.1
    <https://datatracker.ietf.org/doc/html/rfc9002#packet-threshold>),
    or it was sent long enough in the past (Section 6.1.2
    <https://datatracker.ietf.org/doc/html/rfc9002#time-threshold>).


Whereas TCP RACK [RFC8985] only uses DupThresh at the start of a flow 
(when the RTT is likely to be inaccurate) until a decent reordering 
window is established (if reordering is detected at all):

    if some reordering has been observed, then RACK does not trigger
    fast recovery based on DupThresh.

Assuming the QUIC RFC is not just badly written, it doesn't seem right. 
Once a flow has got going, a time threshold is generally considered more 
robust than a packet-threshold. So, even if the packet-threshold is 
adapted in parallel to the time-threshold, using the logical OR of them 
both will often override the time-threshold with the less robust packet 
threshold.

> 2.
>
>     [cont] It would certainly be possible to turn off packet-threshold
>     loss detection, and rely on time-threshold altogether. Is that
>     what QUIC implementations should do?
>

[BB] Well, that's certainly what I thought RACK was meant to do.
This might end up requiring an erratum to RFC9002.
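
To make the logical OR concrete, here is a hedged Python sketch of the RFC 9002 §6.1 loss test for a packet that is still unacknowledged. The constants are the RFC's recommended values. The function name and the `use_packet_threshold` switch are hypothetical, added only to show what turning off the packet threshold would look like.

```python
K_PACKET_THRESHOLD = 3    # RFC 9002 recommended kPacketThreshold
K_TIME_THRESHOLD = 9 / 8  # RFC 9002 recommended kTimeThreshold
K_GRANULARITY = 0.001     # seconds; RFC 9002 recommended kGranularity


def is_lost(pkt_number: int, sent_time: float, largest_acked: int,
            now: float, latest_rtt: float, smoothed_rtt: float,
            use_packet_threshold: bool = True) -> bool:
    """Loss test for one unacknowledged packet (times in seconds)."""
    if pkt_number > largest_acked:
        return False  # only packets sent before an acked packet qualify
    loss_delay = max(K_TIME_THRESHOLD * max(latest_rtt, smoothed_rtt),
                     K_GRANULARITY)
    by_time = sent_time <= now - loss_delay
    by_count = (use_packet_threshold
                and largest_acked >= pkt_number + K_PACKET_THRESHOLD)
    return by_time or by_count  # either condition declares the loss
```

With `use_packet_threshold=False`, a packet overtaken by three later ones is only declared lost once the time threshold expires.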

> 3.
>
>     Section 2.4.2: Is the suppression of further decreases after one
>     ECN-triggered decrease for one /srtt/, or is it one /rtt_virt/?
>     Reading section 2.4.4 it sounds like it’s /rtt_virt/, but this
>     could probably be clarified in this section.
>

[BB] Yes, it's rtt_virt. I will check that the Linux code does that though.

We should have caught that mistake in the draft recently when we went 
through the draft checking all the places where it had said 'RTT' before 
we introduced rtt_virt. I've added this to my list of edits to make for 
the next rev.
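
A minimal Python sketch of suppressing further ECN-triggered decreases for one rtt_virt rather than one srtt. It assumes rtt_virt = max(srtt, RTT_VIRT_MIN) with a 25 ms floor and a DCTCP-style reduction of cwnd * (1 - alpha/2); both are assumptions here and should be checked against the draft. The class name is hypothetical.

```python
RTT_VIRT_MIN = 0.025  # seconds; assumed floor for the virtual RTT


def rtt_virt(srtt: float) -> float:
    """Virtual RTT: the smoothed RTT, floored at RTT_VIRT_MIN (assumption)."""
    return max(srtt, RTT_VIRT_MIN)


class DecreaseGate:
    """Hypothetical helper: at most one ECN decrease per rtt_virt."""

    def __init__(self) -> None:
        self.next_allowed = 0.0  # earliest time of the next decrease

    def try_decrease(self, now: float, srtt: float, cwnd: float,
                     alpha: float) -> float:
        """Return the new cwnd, unchanged if a decrease is still suppressed."""
        if now < self.next_allowed:
            return cwnd
        self.next_allowed = now + rtt_virt(srtt)
        return cwnd * (1 - alpha / 2)  # assumed DCTCP-style reduction
```

With srtt = 5 ms, a second CE mark arriving 10 ms after a decrease is ignored, whereas gating on srtt would have allowed a second cut.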


> 4.
>
>     Section 2.4.3: The QUIC ACK frame acknowledges (multiple) ranges
>     of packets at the same time, together with cumulative ECN counts.
>     It’s therefore not possible to tell which packet was ECN-marked.
>     This means that a QUIC stack will be able to determine
>     /acked_sacked/, but not /ece_delta/. Is it valid to approximate it
>     by assuming that all packets had the same average size? Either
>     way, this is pretty awkward to fit into the pseudo-code given in
>     appendix B.5 of RFC 9002.
>

[BB] Yes. It has to be, given the current QUIC protocol.

Despite having the opportunity to fully integrate ECN into the design of 
QUIC, it seems it was still a bit of an afterthought (I shouldn't 
complain, 'cos I volunteered to help with adding ECN, but then didn't 
get involved 'cos of other pressing work at the time).
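
The approximation could look like the following Python sketch: take the change in the cumulative CE count from the ACK frame and scale it by the average size of the newly acknowledged packets. The function and parameter names are hypothetical, and clamping the CE delta to the number of newly acked packets is an added safeguard, not something the draft specifies.

```python
def estimate_ce_bytes(newly_acked_bytes: int, newly_acked_packets: int,
                      ce_count: int, prev_ce_count: int) -> float:
    """Estimate ece_delta in bytes from QUIC's cumulative CE counter."""
    ce_packets = ce_count - prev_ce_count
    if newly_acked_packets == 0 or ce_packets <= 0:
        return 0.0
    # The counter cannot say which packets were marked, so assume every
    # newly acked packet had the same (average) size.
    ce_packets = min(ce_packets, newly_acked_packets)
    avg_size = newly_acked_bytes / newly_acked_packets
    return ce_packets * avg_size
```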


> 5.
>
>     Section 2.4.3: Similarly, what's the correct order to process an
>     ACK that reports an ECN marking: For example, an ACK might
>     acknowledge 20 new packets, and report one ECN marking. I think
>     the correct order would be applying the additive increase for 19
>     packets first, and then applying the multiplicative decrease
>     afterwards. This is because receiving a CE-marked packet would
>     elicit an immediate ACK frame from a QUIC receiver (RFC 9000,
>     section 13.2.1). The draft should probably be explicit about this.
>

[BB] Good point.
I agree with your logic, and I've added this to the list of edits too.
However, I think I'll word it as a SHOULD, 'cos it makes sense when CE 
triggers feedback, but the implementer might have better info in some 
scenarios.
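
A rough Python sketch of that ordering, with cwnd counted in packets: apply the additive increase for the unmarked packets first, then the multiplicative decrease. The function name is hypothetical, the reduction factor (1 - alpha/2) is assumed to be DCTCP-style, and the once-per-rtt_virt suppression of decreases is left out for brevity.

```python
def process_ack(cwnd: float, acked_packets: int, ce_packets: int,
                alpha: float, ai_per_rtt: float = 1.0) -> float:
    """Increase for unmarked packets first, then decrease once if CE was seen."""
    unmarked = acked_packets - ce_packets
    for _ in range(unmarked):
        cwnd += ai_per_rtt / cwnd  # per-packet additive increase
    if ce_packets > 0:
        cwnd *= 1 - alpha / 2      # assumed DCTCP-style decrease
    return cwnd
```

For the example in item 5, `process_ack(cwnd, 20, 1, alpha)` grows the window for 19 packets and then applies a single reduction.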

> 6.
>
>     Section 2.4.4: I'm struggling to follow how exactly cwnd is
>     supposed to change for small RTTs. Most important from an
>     implementation perspective: section 2.4.3 says that
>     /ai_per_rtt/ will have a different value for small RTTs. It would
>     be helpful if section 2.4.4 would contain an equation for
>     /ai_per_rtt/.
>

[BB] I think the equation you want is already in §2.4.4, altho I agree 
that the text could be clearer, because it states rules you might think 
are right then explains why they're wrong, rather than saying what is 
right first, then explaining why:

    Therefore, the increase in cwnd per packet has to be (1/M^2) * (1/cwnd).

This gives the increase per packet, not per RTT, but that's what is 
needed in an implementation, isn't it? Or am I misunderstanding why you 
want the increase per RTT in particular?
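
For what it's worth, the two forms are easy to relate. The Python sketch below assumes M = rtt_virt / srtt with rtt_virt = max(srtt, RTT_VIRT_MIN) and a 25 ms floor; these definitions are assumptions to be checked against §2.4.4, and the function names are hypothetical. Since roughly cwnd packets are acknowledged per actual RTT, a per-packet increase of (1/M^2) * (1/cwnd) corresponds to about 1/M^2 segments per actual RTT.

```python
RTT_VIRT_MIN = 0.025  # seconds; assumed floor for the virtual RTT


def scaling_m(srtt: float) -> float:
    """M = rtt_virt / srtt, which is 1 once srtt reaches RTT_VIRT_MIN."""
    return max(srtt, RTT_VIRT_MIN) / srtt


def ai_per_packet(cwnd: float, srtt: float) -> float:
    """Increase in cwnd (segments) per acked packet: (1/M^2) * (1/cwnd)."""
    return (1 / scaling_m(srtt) ** 2) * (1 / cwnd)


def ai_per_rtt(srtt: float) -> float:
    """Equivalent increase per actual RTT, about cwnd packets' worth."""
    return 1 / scaling_m(srtt) ** 2
```

With srtt = 5 ms, M is 5, so the window grows by about 1/25 of a segment per actual RTT; at or above 25 ms the usual 1 segment per RTT is recovered.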

Thanks for all these useful comments and questions.



Bob

>
>
> On Sat, 14 Oct 2023 at 19:45, Bob Briscoe 
> <ietf=40bobbriscoe.net@dmarc.ietf.org> wrote:
>
>     iccrg,
>
>     We've just posted an update to prague-congestion-control.
>     Links to diffs are quoted below.
>     The main technical changes:
>
>       * the Apple implementation falls back to CUBIC behaviour on loss
>         (both the reduction and the subsequent increase). Currently
>         the Linux implementation still falls back to Reno on loss, but
>         that is being changed.
>       * how the Apple implementation over QUIC behaves when the path
>         or the remote peer fails to support ECN properly
>       * the items already discussed on this list in response to Neal's
>         review, some of which were editorial, but others were
>         technical, e.g.
>           o pseudocode for removing integer rounding bias
>           o clarifying the RTT-independence approach
>
>     Cheers
>
>
>     Bob & co-authors
>
>
>
>     -------- Forwarded Message --------
>     Subject: New Version Notification for
>     draft-briscoe-iccrg-prague-congestion-control-03.txt
>     Date: Sat, 14 Oct 2023 05:07:58 -0700
>     From: internet-drafts@ietf.org
>     To: Bob Briscoe <ietf@bobbriscoe.net>, Koen De Schepper
>     <koen.de_schepper@nokia.com>, Olivier Tilmans
>     <olivier.tilmans@nokia-bell-labs.com>, Vidhi Goel
>     <vidhi_goel@apple.com>
>
>
>
>     A new version of Internet-Draft
>     draft-briscoe-iccrg-prague-congestion-control-03.txt has been
>     successfully submitted by Bob Briscoe and posted to the IETF
>     repository.
>
>     Name: draft-briscoe-iccrg-prague-congestion-control
>     Revision: 03
>     Title: Prague Congestion Control
>     Date: 2023-10-14
>     Group: Individual Submission
>     Pages: 34
>     URL:
>     https://www.ietf.org/archive/id/draft-briscoe-iccrg-prague-congestion-control-03.txt
>     Status:
>     https://datatracker.ietf.org/doc/draft-briscoe-iccrg-prague-congestion-control/
>     HTML:
>     https://www.ietf.org/archive/id/draft-briscoe-iccrg-prague-congestion-control-03.html
>     HTMLized:
>     https://datatracker.ietf.org/doc/html/draft-briscoe-iccrg-prague-congestion-control
>     Diff:
>     https://author-tools.ietf.org/iddiff?url2=draft-briscoe-iccrg-prague-congestion-control-03
>
>     Abstract:
>
>     This specification defines the Prague congestion control scheme,
>     which is derived from DCTCP and adapted for Internet traffic by
>     implementing the Prague L4S requirements. Over paths with L4S
>     support at the bottleneck, it adapts the DCTCP mechanisms to achieve
>     consistently low latency and full throughput. It is defined
>     independently of any particular transport protocol or operating
>     system, but notes are added that highlight issues specific to certain
>     transports and OSs. It is mainly based on experience with the
>     reference Linux implementation of TCP Prague and the Apple
>     implementation over QUIC, but it includes experience from other
>     implementations where available.
>
>     The implementation does not satisfy all the Prague requirements (yet)
>     and the IETF might decide that certain requirements need to be
>     relaxed as an outcome of the process of trying to satisfy them all.
>     Future plans that have typically only been implemented as proof-of-
>     concept code are outlined in a separate section.
>
>
>
>     The IETF Secretariat
>
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/