Re: [iccrg] Updates to draft-briscoe-iccrg-prague-congestion-control-03

Marten Seemann <martenseemann@gmail.com> Wed, 01 November 2023 06:05 UTC

References: <169728527879.18854.17962028148144369127@ietfa.amsl.com> <0c9d15e7-6f15-4b7c-b1ce-f50854152aef@bobbriscoe.net> <CAOYVs2rFgyRQ1Hdk6g1j9Ku23TS1FRjW2r104H_eUPJioLJLiw@mail.gmail.com> <ba04ef94-17b5-424b-a417-4fce9598ab1a@bobbriscoe.net> <9AA0A02D-0557-4F58-8D55-C28BE96C456E@apple.com>
In-Reply-To: <9AA0A02D-0557-4F58-8D55-C28BE96C456E@apple.com>
From: Marten Seemann <martenseemann@gmail.com>
Date: Wed, 01 Nov 2023 13:04:57 +0700
Message-ID: <CAOYVs2pWsHzAhTN0AJ5dGK+Smar=9O1=618DdK4XSPwynjf=Qg@mail.gmail.com>
To: Vidhi Goel <vidhi_goel@apple.com>
Cc: Bob Briscoe <ietf=40bobbriscoe.net@dmarc.ietf.org>, iccrg IRTF list <iccrg@irtf.org>
Content-Type: multipart/alternative; boundary="000000000000b2d6b50609110dfb"
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/737rw45Cw55YWIOAqConRM-8TFM>
Subject: Re: [iccrg] Updates to draft-briscoe-iccrg-prague-congestion-control-03

> With L4S, you can get consecutive CE marks and I don’t think receivers
> need to ACK every CE-marked packet. So, probably it’s best to not assume
> that every CE would elicit an immediate ACK. Another reason is ACK packets
> can get dropped or reordered and then there is no guarantee of the order
> (by looking at the ACK at sender) in which packets were received at the
> receiver.

That's what RFC 9000 (section 13.2.1) specifies, though. A compliant QUIC
implementation will send a new ACK for every CE-marked packet (ignoring
batching optimizations and loss of packets containing ACK frames). This
seems suboptimal if L4S regularly causes consecutive CE markings.
We used to have an "Ignore CE" field in the ACK_FREQUENCY frame. Maybe we
need to bring back that field, or add a field that allows for richer
signaling about which ECN feedback would be useful? This QUIC extension
just entered WGLC a few days ago. I've opened
https://github.com/quicwg/ack-frequency/issues/244 to discuss this.

On Wed, 1 Nov 2023 at 05:17, Vidhi Goel <vidhi_goel@apple.com> wrote:

>
>    1. Section 2.4.3: Similarly, what's the correct order to process an
>    ACK that reports an ECN marking: For example, an ACK might acknowledge 20
>    new packets, and report one ECN marking. I think the correct order would be
>    applying the additive increase for 19 packets first, and then applying the
>    multiplicative decrease afterwards. This is because receiving a CE-marked
>    packet would elicit an immediate ACK frame from a QUIC receiver (RFC 9000,
>    section 13.2.1). The draft should probably be explicit about this.
>
>
> [BB] Good point.
> I agree with your logic, and I've added this to the list of edits too.
> However, I think I'll word it as a SHOULD, 'cos it makes sense when CE
> triggers feedback, but the implementer might have better info in some
> scenarios.
>
>
> With L4S, you can get consecutive CE marks and I don’t think receivers
> need to ACK every CE-marked packet. So, probably it’s best to not assume
> that every CE would elicit an immediate ACK. Another reason is ACK packets
> can get dropped or reordered and then there is no guarantee of the order
> (by looking at the ACK at sender) in which packets were received at the
> receiver.
>
> Instead of thinking about the CC response by guessing the order in which
> packets were received at the receiver, we should ask: whatever the order
> of CE and non-CE packets at the receiver, is applying additive increase
> before multiplicative decrease better than doing it the other way round?
>
> 1. AI before MD - this applies the reduction to a larger cwnd, which is
> probably safer.
> 2. MD before AI - this applies the reduction to the existing cwnd (which is
> smaller than in option 1) and then does the additive increase. The final
> cwnd will be higher than in option 1.
>
> I think option 1 is probably safer, as it alleviates the congestion at the
> bottleneck faster.
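The two orderings above can be compared numerically. This is only a sketch under assumptions that are not taken from the thread: a Reno-style increase of 1/cwnd packets per acknowledged packet, and a DCTCP-style decrease of cwnd * alpha / 2. The function names are hypothetical.

```python
def ai_then_md(cwnd, unmarked_acked, alpha):
    """Option 1: additive increase for the unmarked packets, then one decrease."""
    for _ in range(unmarked_acked):
        cwnd += 1.0 / cwnd          # assumed Reno-style increase, in packets
    return cwnd * (1.0 - alpha / 2.0)  # assumed DCTCP-style reduction


def md_then_ai(cwnd, unmarked_acked, alpha):
    """Option 2: one decrease first, then additive increase on the smaller cwnd."""
    cwnd *= (1.0 - alpha / 2.0)
    for _ in range(unmarked_acked):
        cwnd += 1.0 / cwnd
    return cwnd


if __name__ == "__main__":
    # One ACK covering 20 packets, one of which was CE-marked.
    print(ai_then_md(100.0, 19, 0.5), md_then_ai(100.0, 19, 0.5))
```

Under these assumptions, option 2 always ends with the larger cwnd: the increase is applied after the reduction, so it is not scaled down and each step is taken from a smaller window.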
>
>
> Vidhi
>
> On Oct 31, 2023, at 9:33 AM, Bob Briscoe <ietf=40bobbriscoe.net@dmarc.ietf.org> wrote:
>
> Marten,
>
> On 31/10/2023 10:50, Marten Seemann wrote:
>
> I read the draft and I'm trying to figure out how I'd implement Prague in
> my QUIC stack. There are a couple of things I've noticed:
>
>    1. Section 2.3.2: It’s unclear to me when exactly *alpha* is updated.
>    I assume that once I receive the first ACK, I save the timestamp. When I
>    receive a new ACK, there are two code paths: if it’s received within one
>    *rtt_virt*, just accumulate the counters used to calculate *frac*. If
>    it’s received after *rtt_virt*, update *alpha* according to the
>    equation given in this section, reset the counters for *frac* and save
>    the timestamp as the beginning of the next *rtt_virt* epoch. However,
>    this would mean that the *alpha* value used for multiplicative
>    decrease (section 2.4.2) would always be slightly outdated, which seems
>    suboptimal for an immediate response to a growing queue. Is there a better
>    way?
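The per-epoch update described in this item could be sketched as follows. The assumptions here are not confirmed by the thread: the update is the DCTCP-style EWMA alpha += g * (frac - alpha) with g = 1/16, alpha starts at 1, and the ACK that closes an epoch is counted in that epoch. The class and parameter names are hypothetical.

```python
class AlphaEstimator:
    """Per-rtt_virt EWMA of the CE-marking fraction (illustrative sketch only)."""

    def __init__(self, g=1.0 / 16, initial_alpha=1.0):
        self.g = g                    # assumed EWMA gain
        self.alpha = initial_alpha    # assumed conservative starting value
        self.epoch_start = None       # timestamp of the first ACK in the epoch
        self.acked = 0                # packets acknowledged in this epoch
        self.marked = 0               # of those, how many were CE-marked

    def on_ack(self, now, newly_acked, newly_marked, rtt_virt):
        if self.epoch_start is None:
            self.epoch_start = now
        # Accumulate the counters used to compute frac.
        self.acked += newly_acked
        self.marked += newly_marked
        # Once a full rtt_virt has elapsed, fold frac into alpha and restart.
        if now - self.epoch_start >= rtt_virt and self.acked > 0:
            frac = self.marked / self.acked
            self.alpha += self.g * (frac - self.alpha)
            self.acked = 0
            self.marked = 0
            self.epoch_start = now
        return self.alpha
```

Between epoch boundaries `on_ack` returns the previous alpha unchanged, which is exactly the staleness the question is about.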
>
> [BB] Indeed. It's actually a lot worse than "slightly outdated". As it says
> at the end of the section you refer to:
>
> However, another approach is being investigated because these per-RTT
> updates introduce 1--2 rounds of delay into the congestion response on top
> of the inherent round of feedback delay (see Section 3.1.3
> <https://datatracker.ietf.org/doc/html/draft-briscoe-iccrg-prague-congestion-control-03#pracc_faster_response>
>  in the section on variants and future work).
>
>
> Section 3.1.3 goes on to summarize the brief tech report I posted on arXiv
> that explains and defines an alternative approach that removes all this
> lag, cited as [PerAckEWMA]:
>     Removing the Clock Machinery Lag from DCTCP/Prague
> <https://arxiv.org/abs/2101.07727>
>
> Joakim Misund was evaluating it about a year ago, when he decided to get a
> proper job ;) instead of doing his PhD with me. I am just getting round to
> working on it myself again. But if you wanted to try it yourself as well, we
> could certainly compare notes.
>
>
>    1. Section 2.3.3: QUIC uses both packet- and time-threshold loss
>    detection (see sections 6.1.1 and 6.1.2 of RFC 9002). I’m not sure what
>    exactly the recommendation of this draft is.
>
> [BB] I hadn't appreciated that QUIC deems there's been a loss if /either/
> of these conditions is met [RFC9002; §6.1]:
>
> The packet was sent kPacketThreshold packets before an acknowledged packet
> (Section 6.1.1
> <https://datatracker.ietf.org/doc/html/rfc9002#packet-threshold>), or it
> was sent long enough in the past (Section 6.1.2
> <https://datatracker.ietf.org/doc/html/rfc9002#time-threshold>).
>
>
> Whereas TCP RACK [RFC8985] only uses DupThresh at the start of a flow
> (when the RTT is likely to be inaccurate) until a decent reordering window
> is established (if reordering is detected at all):
>
> if some reordering has been observed, then RACK does not trigger fast
> recovery based on DupThresh.
>
> Assuming the QUIC RFC is not just badly written, it doesn't seem right.
> Once a flow has got going, a time threshold is generally considered more
> robust than a packet-threshold. So, even if the packet-threshold is adapted
> in parallel to the time-threshold, using the logical OR of them both will
> often override the time-threshold with the less robust packet threshold.
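The logical OR being discussed can be sketched along the lines of the loss-detection pseudocode in RFC 9002. The constants are the values recommended there (kPacketThreshold = 3, kTimeThreshold = 9/8, kGranularity = 1 ms). The `use_packet_threshold` switch is a hypothetical addition, not part of RFC 9002, included only to show what relying on the time threshold alone would look like.

```python
K_PACKET_THRESHOLD = 3      # RFC 9002 recommended value
K_TIME_THRESHOLD = 9 / 8    # RFC 9002 recommended value
K_GRANULARITY = 0.001       # seconds, RFC 9002 recommended value


def is_lost(pn, sent_time, largest_acked, now, latest_rtt, smoothed_rtt,
            use_packet_threshold=True):
    """Decide whether an unacknowledged packet `pn` is declared lost.

    Times are in seconds. `use_packet_threshold` is a hypothetical knob.
    """
    if pn > largest_acked:
        return False  # only packets sent before an acknowledged one qualify
    loss_delay = max(K_TIME_THRESHOLD * max(latest_rtt, smoothed_rtt),
                     K_GRANULARITY)
    by_time = sent_time <= now - loss_delay
    by_packets = (use_packet_threshold
                  and largest_acked >= pn + K_PACKET_THRESHOLD)
    # Either condition on its own is enough to declare loss.
    return by_time or by_packets
```

With the packet threshold enabled, a packet that is three packet numbers behind the largest acknowledged one is declared lost immediately, even if it was sent far less than 9/8 of an RTT ago.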
>
>
>    1. [cont] It would certainly be possible to turn off packet-threshold
>    loss detection, and rely on time-threshold altogether. Is that what QUIC
>    implementations should do?
>
>
> [BB] Well, that's certainly what I thought RACK was meant to do.
> This might end up requiring an erratum to RFC9002.
>
>
>    1. Section 2.4.2: Is the suppression of further decreases after one
>    ECN-triggered decrease for one *srtt*, or is it one *rtt_virt*?
>    Reading section 2.4.4 it sounds like it’s *rtt_virt*, but this could
>    probably be clarified in this section.
>
>
> [BB] Yes. I will check that the Linux code does that though.
>
> We should have caught that mistake in the draft recently when we went
> through the draft checking all the places where it had said 'RTT' before we
> introduced rtt_virt. I've added this to my list of edits to make for the
> next rev.
>
>
>
>    1. Section 2.4.3: The QUIC ACK frame acknowledges (multiple) ranges of
>    packets at the same time, together with cumulative ECN counts. It’s
>    therefore not possible to tell which packet was ECN-marked. This means that
>    a QUIC stack will be able to determine *acked_sacked*, but not
>    *ece_delta*. Is it valid to approximate it by assuming that all
>    packets had the same average size? Either way, this is pretty awkward to
>    fit into the pseudo-code given in appendix B.5 of RFC 9002.
>
>
> [BB] Yes. It has to be, given the current QUIC protocol.
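A minimal sketch of that approximation, assuming *acked_sacked* and *ece_delta* are both counted in bytes and that the sender remembers the ECN-CE count from the previous ACK frame. The function and parameter names are hypothetical, and capping the number of marks at the number of newly acknowledged packets is an extra assumption to cope with lost ACK frames.

```python
def estimate_ece_delta(acked_bytes, newly_acked_packets, ce_count, prev_ce_count):
    """Approximate the CE-marked bytes covered by one QUIC ACK frame.

    The ACK frame carries only a cumulative ECN-CE count, so the marks
    cannot be attributed to individual packets; every newly acknowledged
    packet is assumed to have the average size.
    """
    newly_marked = ce_count - prev_ce_count
    if newly_acked_packets <= 0 or newly_marked <= 0:
        return 0.0
    # Assumption: never attribute more marks than packets newly acknowledged.
    newly_marked = min(newly_marked, newly_acked_packets)
    avg_size = acked_bytes / newly_acked_packets
    return newly_marked * avg_size


if __name__ == "__main__":
    # 20 packets of 1200 bytes newly acknowledged, CE count went from 6 to 7.
    print(estimate_ece_delta(24000, 20, 7, 6))
```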
>
> Despite having the opportunity to fully integrate ECN into the design of
> QUIC, it seems it was still a bit of an afterthought (I shouldn't complain,
> 'cos I volunteered to help with adding ECN, but then didn't get involved
> 'cos of other pressing work at the time).
>
>
>
>    1. Section 2.4.3: Similarly, what's the correct order to process an
>    ACK that reports an ECN marking: For example, an ACK might acknowledge 20
>    new packets, and report one ECN marking. I think the correct order would be
>    applying the additive increase for 19 packets first, and then applying the
>    multiplicative decrease afterwards. This is because receiving a CE-marked
>    packet would elicit an immediate ACK frame from a QUIC receiver (RFC 9000,
>    section 13.2.1). The draft should probably be explicit about this.
>
>
> [BB] Good point.
> I agree with your logic, and I've added this to the list of edits too.
> However, I think I'll word it as a SHOULD, 'cos it makes sense when CE
> triggers feedback, but the implementer might have better info in some
> scenarios.
>
>
>    1. Section 2.4.4: I'm struggling to follow how exactly cwnd is
>    supposed to change for small RTTs. Most important from an implementation
>    perspective: section 2.4.3 says that *ai_per_rtt* will have a
>    different value for small RTTs. It would be helpful if section 2.4.4 would
>    contain an equation for *ai_per_rtt*.
>
>
> [BB] I think the equation you want is already in §2.4.4, altho I agree
> that the text could be clearer, because it states rules you might think are
> right then explains why they're wrong, rather than saying what is right
> first, then explaining why:
>
> Therefore, the increase in cwnd per packet has to be (1/M^2) * (1/cwnd).
>
> This gives the increase per packet, not per RTT, but that's what is needed
> in an implementation isn't it? Or am I misunderstanding why you want the
> increase per RTT in particular?
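One way to relate the per-packet formula to a per-RTT value, as a sketch only: if roughly cwnd packets are acknowledged per real round trip, a per-packet increase of (1/M^2) * (1/cwnd) adds up to 1/M^2 packets per RTT. The definitions M = rtt_virt / srtt and rtt_virt = max(srtt, RTT_VIRT_MIN), and the 25 ms floor, are assumptions made for this illustration; the function names are hypothetical.

```python
RTT_VIRT_MIN = 0.025  # seconds; assumed floor on the virtual RTT


def rtt_virt(srtt):
    """Assumed definition: the virtual RTT never drops below RTT_VIRT_MIN."""
    return max(srtt, RTT_VIRT_MIN)


def ai_per_rtt(srtt):
    """Additive increase per real RTT, in packets: 1 / M^2."""
    m = rtt_virt(srtt) / srtt
    return 1.0 / (m * m)


def increase_per_packet(cwnd, srtt):
    """Per-acknowledged-packet cwnd increase: (1/M^2) * (1/cwnd)."""
    return ai_per_rtt(srtt) / cwnd


if __name__ == "__main__":
    # srtt = 5 ms gives M = 5, so the per-RTT increase is 1/25 of a packet.
    print(ai_per_rtt(0.005), increase_per_packet(10.0, 0.005))
```

For srtt at or above the floor, M is 1 and this reduces to the usual 1/cwnd per acknowledged packet.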
>
> Thanks for all these useful comments and questions.
>
>
>
> Bob
>
>
>
> On Sat, 14 Oct 2023 at 19:45, Bob Briscoe <ietf=40bobbriscoe.net@dmarc.ietf.org> wrote:
>
>> iccrg,
>>
>> We've just posted an update to prague-congestion-control.
>> Links to diffs are quoted below.
>> The main technical changes:
>>
>>    - the Apple implementation falls back to CUBIC behaviour on loss
>>    (both the reduction and the subsequent increase). Currently the Linux
>>    implementation still falls back to Reno on loss, but that is being changed.
>>    - how the Apple implementation over QUIC behaves when the path or the
>>    remote peer fails to support ECN properly
>>    - the items already discussed on this list in response to Neal's
>>    review, some of which were editorial, but others were technical, e.g.
>>       - pseudocode for removing integer rounding bias
>>       - clarifying the RTT-independence approach
>>
>> Cheers
>>
>>
>> Bob & co-authors
>>
>>
>>
>> -------- Forwarded Message --------
>> Subject:  New Version Notification for
>> draft-briscoe-iccrg-prague-congestion-control-03.txt
>> Date:  Sat, 14 Oct 2023 05:07:58 -0700
>> From:  internet-drafts@ietf.org
>> To:  Bob Briscoe <ietf@bobbriscoe.net>, Koen De Schepper
>> <koen.de_schepper@nokia.com>, Olivier Tilmans
>> <olivier.tilmans@nokia-bell-labs.com>, Vidhi Goel <vidhi_goel@apple.com>
>>
>> A new version of Internet-Draft
>> draft-briscoe-iccrg-prague-congestion-control-03.txt has been successfully
>> submitted by Bob Briscoe and posted to the
>> IETF repository.
>>
>> Name: draft-briscoe-iccrg-prague-congestion-control
>> Revision: 03
>> Title: Prague Congestion Control
>> Date: 2023-10-14
>> Group: Individual Submission
>> Pages: 34
>> URL:
>> https://www.ietf.org/archive/id/draft-briscoe-iccrg-prague-congestion-control-03.txt
>> Status:
>> https://datatracker.ietf.org/doc/draft-briscoe-iccrg-prague-congestion-control/
>> HTML:
>> https://www.ietf.org/archive/id/draft-briscoe-iccrg-prague-congestion-control-03.html
>> HTMLized:
>> https://datatracker.ietf.org/doc/html/draft-briscoe-iccrg-prague-congestion-control
>> Diff:
>> https://author-tools.ietf.org/iddiff?url2=draft-briscoe-iccrg-prague-congestion-control-03
>>
>> Abstract:
>>
>> This specification defines the Prague congestion control scheme,
>> which is derived from DCTCP and adapted for Internet traffic by
>> implementing the Prague L4S requirements. Over paths with L4S
>> support at the bottleneck, it adapts the DCTCP mechanisms to achieve
>> consistently low latency and full throughput. It is defined
>> independently of any particular transport protocol or operating
>> system, but notes are added that highlight issues specific to certain
>> transports and OSs. It is mainly based on experience with the
>> reference Linux implementation of TCP Prague and the Apple
>> implementation over QUIC, but it includes experience from other
>> implementations where available.
>>
>> The implementation does not satisfy all the Prague requirements (yet)
>> and the IETF might decide that certain requirements need to be
>> relaxed as an outcome of the process of trying to satisfy them all.
>> Future plans that have typically only been implemented as proof-of-
>> concept code are outlined in a separate section.
>>
>>
>>
>> The IETF Secretariat
>>
>>
>> _______________________________________________
>> iccrg mailing list
>> iccrg@irtf.org
>> https://www.irtf.org/mailman/listinfo/iccrg
>>
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>