Re: [tsvwg] What is "Scalable Congestion Control" in L4S?

Sebastian Moeller <moeller0@gmx.de> Tue, 16 April 2024 10:30 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <30f6c4b411034046814d6a90956f9949@huawei.com>
Date: Tue, 16 Apr 2024 12:30:28 +0200
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <BD28D463-9D61-4E91-88B3-78875F6CA45E@gmx.de>
References: <30f6c4b411034046814d6a90956f9949@huawei.com>
To: Vasilenko Eduard <vasilenko.eduard=40huawei.com@dmarc.ietf.org>
X-Mailer: Apple Mail (2.3774.500.171.1.1)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/kOaFvQlute4HHy0upQT1AvxWKNA>
Subject: Re: [tsvwg] What is "Scalable Congestion Control" in L4S?

Hi Ed,

I stumbled over the same issue previously. The subtlety is that the formal definition is about marking rate in marks/second, while the second description (RFC 9332 section 2.1) looks at marking probability as marks/packet over a time window. While the marking rate stays constant, the resulting marking probability decreases with increasing packet rate. This is also true if marking probability is measured as marks/byte. However, I fail to see a clear method to deduce the relevant time window over which to calculate the marking probability.
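
To make the rate-versus-probability distinction concrete, here is a minimal sketch. It assumes an idealised flow that receives CE marks at a fixed rate while its packet rate scales; the function name and all numbers are illustrative and not taken from either RFC.

```python
def marking_probability(marks_per_second: float, packets_per_second: float) -> float:
    """Fraction of packets marked (marks/packet) for a given marking rate."""
    return marks_per_second / packets_per_second


# Hypothetical numbers: the marking rate is held at 100 marks/s
# while the packet rate of the flow grows by a factor of ten per step.
MARKS_PER_SECOND = 100.0

for pps in (1_000, 10_000, 100_000):
    p = marking_probability(MARKS_PER_SECOND, pps)
    print(f"{pps:>7} packets/s -> marking probability {p:.4f}")
```

The marking rate in Hz is the same in every row, yet the marking probability shrinks as the packet rate grows. The sketch sidesteps the open question of which time window to measure over by assuming steady-state averages.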


> On 16. Apr 2024, at 10:25, Vasilenko Eduard <vasilenko.eduard=40huawei.com@dmarc.ietf.org> wrote:
> 
> Hi all,
> 
> Both RFCs (9332, 9330) give a formal definition that:
> "Scalable Congestion Control: A congestion control where the average time from one congestion signal to the next (the recovery time) remains invariant as flow rate scales, all other factors being equal."
> It is just a rate of the congestion signal, a simple matter.

[SM] Yes, this is marking rate in Hz.

> 
> RFC 9332 section 2.1 gives the impression that Scalable Congestion Control has more fundamental differences:
> "the steady-state cwnd of Reno is inversely proportional to the square root of p" (drop probability)
> But
> "A supporting paper [https://dl.acm.org/doi/10.1145/2999572.2999578] includes the derivation of the equivalent rate equation for DCTCP, for which cwnd is inversely proportional to p

[SM] But here p is the marking probability, which depends on the actual data rate of the flow...

> (not the square root), where in this case p is the ECN-marking probability.

[SM] And that got me confused previously: for a given data rate, marking rate and marking probability are proportional, so I read p as just a different way of saying marking rate.
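
One way to see how the two descriptions fit together is to count congestion signals per round trip, which is roughly p * cwnd. The sketch below does that for both steady-state relations. Only the proportionalities (cwnd ~ 1/sqrt(p) for Reno, cwnd ~ 1/p for DCTCP) come from the quoted text; the constants 1.22 and 2 are commonly quoted approximations and are assumptions here, as are the function names.

```python
import math


def reno_cwnd(p: float) -> float:
    """Steady-state Reno window in packets; the constant 1.22 is an assumed approximation."""
    return 1.22 / math.sqrt(p)


def scalable_cwnd(p: float) -> float:
    """Steady-state DCTCP-like window in packets; the constant 2 is an assumed approximation."""
    return 2.0 / p


def signals_per_rtt(cwnd: float, p: float) -> float:
    """Expected congestion signals per round trip: packets per RTT times signal probability."""
    return cwnd * p


for p in (0.1, 0.01, 0.001):
    reno = signals_per_rtt(reno_cwnd(p), p)
    scal = signals_per_rtt(scalable_cwnd(p), p)
    print(f"p={p:<6} Reno cwnd={reno_cwnd(p):8.1f} signals/RTT={reno:.3f}   "
          f"scalable cwnd={scalable_cwnd(p):8.1f} signals/RTT={scal:.3f}")
```

Under these assumptions the scalable control sees the same number of signals per round trip at every window size, so the time between signals stays invariant as the flow rate scales, which is what the formal definition asks for. For Reno the signals per round trip fall as the window grows, so the recovery time gets longer.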

> DCTCP is not the only congestion control that behaves like this, so the term 'Scalable' will be used for all similar congestion control behaviours". Then in section 1.2 we see the BBR in the list of "Scalable CCs".
> 
> 1. The formal definition of "Scalable CC" looks wrong. At least it contradicts section 2.1.

[SM] Let's say that both descriptions would be well served by explicitly describing the rate-versus-probability issue.

> 2. It is difficult to believe that BBR and CUBIC/RENO have such different reactions to overload signals because they both play fairly (starting from BBRv2) in one queue as demonstrated in many tests.

[SM] But they do differ... Traditional Reno will halve its congestion window in response to a dropped packet (or, if RFC 3168 is in use, also in response to a CE-marked packet), while BBR will not do this. (Older versions of BBR completely ignore marks and also try to ignore drops; newer versions of BBR use a scalable response but still ignore drops up to a certain threshold.) But these differences are not that relevant to BBR's sharing behaviour, as BBR determines its equitable capacity share via its probing mechanism and hence arrives at a decent response under similar conditions as Reno, just based on different principles.
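
The contrast between a classic and a scalable reaction can be sketched as follows. This is a simplified illustration, not BBR's actual algorithm: the scalable side follows the DCTCP-style update described in RFC 8257 (an EWMA `alpha` of the marked fraction, with gain g = 1/16), and both functions ignore minimum-window limits. The function names are hypothetical.

```python
def reno_response(cwnd: float) -> float:
    """Classic reaction: halve the window on any congestion signal in a round trip."""
    return cwnd / 2.0


def update_alpha(alpha: float, marked_fraction: float, g: float = 1.0 / 16.0) -> float:
    """DCTCP-style EWMA of the fraction of packets that were CE-marked in the last round trip."""
    return (1.0 - g) * alpha + g * marked_fraction


def scalable_response(cwnd: float, alpha: float) -> float:
    """DCTCP-style reaction: reduce the window in proportion to the extent of marking."""
    return cwnd * (1.0 - alpha / 2.0)


# Hypothetical round trip in which 10% of packets were marked and alpha has converged to 0.1.
cwnd = 100.0
alpha = update_alpha(0.1, 0.1)
print("Reno-style window:    ", reno_response(cwnd))
print("Scalable-style window:", scalable_response(cwnd, alpha))
```

With light marking the scalable reaction is a small trim rather than a halving; only when every packet is marked (alpha close to 1) do the two reductions coincide.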

> It is probably impossible for such different sessions to share the load fairly if one session is reacting to p, but the other is reacting to the square root of p (p is the probability for congestion signal).

[SM] That is a fair point, and that is why L4S requires a strict separation between the different response types, with a specific AQM for each traffic type that takes this into account.
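
As an illustration of how an AQM can take the two response types into account, the sketch below assumes the squared coupling described for the DualQ Coupled AQM in RFC 9332, p_C = (p_CL / k)^2, with a coupling factor k = 2. The window constants 1.22 and 2 are assumed approximations and the function names are hypothetical.

```python
import math

K = 2.0  # coupling factor; assumed value for this sketch


def classic_probability(p_cl: float, k: float = K) -> float:
    """Drop/mark probability for classic traffic, derived from the scalable marking probability."""
    return (p_cl / k) ** 2


def reno_cwnd(p_c: float) -> float:
    """Steady-state Reno window; the constant 1.22 is an assumed approximation."""
    return 1.22 / math.sqrt(p_c)


def scalable_cwnd(p_cl: float) -> float:
    """Steady-state DCTCP-like window; the constant 2 is an assumed approximation."""
    return 2.0 / p_cl


for p_cl in (0.2, 0.1, 0.02):
    p_c = classic_probability(p_cl)
    ratio = reno_cwnd(p_c) / scalable_cwnd(p_cl)
    print(f"p_CL={p_cl:<5} p_C={p_c:.6f} Reno/scalable window ratio={ratio:.2f}")
```

Because the classic probability is the square of the scaled scalable probability, the square root in the Reno relation cancels and the window ratio stays the same at every load level under these assumptions, instead of drifting apart as it would if both flow types saw the same p.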

Regards
	Sebastian

> 
> Best Regards
> Eduard Vasilenko
> Senior Architect
> Network Algorithm Laboratory
> Tel: +7(985) 910-1105
>