Re: [tsvwg] L4S status: #17 Interaction w/ FQ AQMs

Greg White <> Wed, 14 August 2019 23:27 UTC

From: Greg White <>
To: Jonathan Morton <>
CC: Wesley Eddy <>, "" <>
Date: Wed, 14 Aug 2019 23:27:46 +0000
List-Id: Transport Area Working Group <>

Comments prefaced with "[GREG]" below.

On 8/12/19, 7:30 PM, "Jonathan Morton" <> wrote:

    > On 13 Aug, 2019, at 3:00 am, Greg White <> wrote:
    > [Sebastian]     But let's assume I have two flows one L4S one RFC 3168, and my link is already at saturation with each flow getting 50% of the shaper's configured bandwidth…
    Okay, this scenario covers a steady-state case.  I'm more concerned with transient behaviour, specifically ensuring that short-term changes in the traffic conditions result in only short-term effects on other traffic.  But there are a couple of fallacies in the following wall of text that I was able to spot, and which need squashing.
    > I don't see how the RFC3168 sender would be triggered to reduce its cwnd below 50% of BDP.  (by an FQ-AQM, competing with one other bulk flow)
    This is trivially demonstrable as soon as you have an RTT long enough for a given sampling tool to observe the effect reliably, and in fact you can see it in the SCE slide deck (in which there is one chart showing FQ-AQM behaviour).
    RFC-3168 senders respond to single CE marks with a Multiplicative Decrease, and FQ-AQMs will provide these marks very shortly after the fair share is reached.  Assuming the standard 50% decrease is employed with NewReno, that means it will be sent down to about 25% link capacity, or more precisely, half the fair share - from where it must linearly grow back again.  CUBIC reduces to 70%, which means it is sent down to about 35% briefly, and recovers more quickly with its polynomial response.

[GREG] I don't believe that what I described was a fallacy. To be clear, when an RFC-3168 sender responds to a CE mark with a multiplicative decrease, what it is multiplicatively decreasing is NOT its sending rate. It is decreasing its cwnd. When cwnd is greater than BDP, sending rate is no longer proportional to cwnd.

In an idealized case, the bottleneck link will trigger cwnd reductions such that the reduced cwnd is exactly equal to the BDP (and queuing delay is zero), which means that the cwnd reduction is triggered when cwnd = BDP/0.5 (for Reno) or cwnd = BDP/0.7 (for Cubic). This preserves continuous 100% link utilization with the minimum queuing delay, and cwnd is always >= BDP. The CA sender behavior when cwnd is >= BDP is (via ack clocking or pacing) that the sender sends traffic *just slightly* greater than the link bandwidth (B), causing the queue to grow. In the case of Reno the sender rate is B + (1 MTU / RTT) (where RTT = base RTT + queue delay), and the queue grows by one MTU each RTT.

The situation where the behavior you described would happen is if there is essentially zero buffering, and so the cwnd reduction happens when cwnd = BDP. This would reduce the cwnd to 0.5*BDP (Reno) or 0.7*BDP (Cubic), which (due to zero queuing delay) would actually equate to a sending rate of 0.5*B or 0.7*B.

If the AQM in the systems you are using causes this behavior, then I would argue that it is not CoDel. The design of CoDel was such that the *minimum* (not maximum) sojourn time over the last window (100ms) is driven to 5ms or below (not zero). It does happen that CoDel occasionally drops/marks excessively, such that the queue fully drains and the link is underutilized, but not to the extent you are describing (unless CoDel is implemented incorrectly or misconfigured). The goal of AQMs has generally been to preserve (in as many scenarios as possible) near 100% link utilization, while minimizing queuing delay. Is your AQM aiming at some other goal?
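
The cwnd-versus-rate distinction above can be made concrete with a small sketch. It uses an idealized fluid model (rate pinned at B while cwnd >= BDP, proportional to cwnd below that). The function names and the 50 Mbps / 10 ms figures are assumptions chosen for illustration, not values from this thread.

```python
def sending_rate(cwnd, bdp, link_rate):
    """Idealized delivery rate of a window-limited sender at a bottleneck.

    While cwnd >= bdp the excess window sits in the queue and the rate is
    pinned at link_rate; below bdp the rate scales linearly with cwnd.
    """
    return link_rate * min(1.0, cwnd / bdp)


def queue_delay(cwnd, bdp, base_rtt):
    """Standing queue delay implied by the part of cwnd above the BDP."""
    return max(0.0, cwnd - bdp) / bdp * base_rtt


# Illustrative values (assumed): B is the flow's share of the bottleneck.
LINK_RATE = 50e6                     # bits per second
BASE_RTT = 0.010                     # seconds
BDP = LINK_RATE * BASE_RTT / 8       # bytes

for name, beta in (("Reno", 0.5), ("CUBIC", 0.7)):
    # Idealized case: the mark arrives when cwnd = BDP / beta.
    peak = BDP / beta
    ideal = sending_rate(peak * beta, BDP, LINK_RATE)
    # Zero-buffer case: the mark arrives when cwnd = BDP.
    starved = sending_rate(BDP * beta, BDP, LINK_RATE)
    print(f"{name}: queue delay at peak {queue_delay(peak, BDP, BASE_RTT) * 1e3:.1f} ms, "
          f"rate after ideal cut {ideal / LINK_RATE:.0%} of B, "
          f"rate after zero-buffer cut {starved / LINK_RATE:.0%} of B")
```

In this model the rate only falls to 0.5*B or 0.7*B when the reduction is triggered with no standing queue, which is the zero-buffering case described above.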

    > It seems as if your concern is that, when the RFC3168 flow cuts its cwnd to drain standing queue, that the L4S sender will shut out the RFC3168 sender on the access network link, and thus force the RFC3168 flow's FQ queue to drain completely.
    No, the *eventual* convergence to fair shares in steady state is not in question.
[GREG] Neither (in my view) is the rapid convergence to fair shares**.  But this apparently needs to be demonstrated.

    FQ enforces it to the best of its ability (which, crucially, is determined by the traffic which makes it through the preceding network elements),
[GREG]  FQ enforces it *precisely* as long as neither queue drains completely.

    and Codel (at least in COBALT form, which is what we tested) does eventually ramp up its marking rate to the point where a DCTCP response function is properly controlled.  But it takes four whole seconds to get there when a new L4S flow starts up, with default settings and the short inherent RTTs that the L4S team appears to favour.  Note that I'm not claiming that four-second queues exist, only that elevated queuing latency (peaking around 125ms) persists for four seconds.

    A lot can happen in those four seconds, from the user's perspective.  It's long enough to seriously disrupt online gameplay, causing the player to miss crucial shots because his timing is thrown off by an eighth of a second.  VoIP streams will be forced to switch on their jitter buffers and pause the received audio stream to catch up, and will most likely never recover from that condition to the more realtime behaviour they previously enjoyed.  And so on.
[GREG] Ok, here I think we're on the same page. The CoDel response is too sluggish to bring latency under control as quickly as we would like. An L4S sender is aiming to see 2 marks per RTT, and the normal CoDel ramp-up of marking rate (in marks/second) over time t (in seconds) is approx. 7 + 50*t. So, for a 10ms RTT, where an L4S sender wants to see 200 marks per second, CoDel would take about 3.8 seconds to ramp up to that level. We will need to conduct the experiments to see what the latency looks like during that period, but if it is up to 125ms of queuing delay for 3.8 seconds, then that is not what an L4S sender would want to see. (Minor quibble: I'd argue that VoIP probably isn't the best example of an L4S flow, but cloud gaming certainly is.)
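
The ramp-up arithmetic above can be checked with a short sketch. It evaluates the 7 + 50*t approximation quoted above and, as a cross-check, steps through a simplified form of the CoDel control law (next mark after interval/sqrt(count), with the default 100 ms interval). The simplification is an assumption: it ignores how CoDel enters and leaves its marking state, and the function names are hypothetical.

```python
import math


def linear_ramp_time(target_rate, offset=7.0, slope=50.0):
    """Time (s) for a marking rate of offset + slope*t to reach target_rate."""
    return (target_rate - offset) / slope


def codel_ramp_time(target_rate, interval=0.100):
    """Time (s) for a simplified CoDel control law to reach target_rate.

    Assumes marking has just started (count = 1) and that each subsequent
    mark is scheduled interval / sqrt(count) after the previous one, so the
    instantaneous marking rate is sqrt(count) / interval.
    """
    t = 0.0
    count = 1
    while math.sqrt(count) / interval < target_rate:
        t += interval / math.sqrt(count)
        count += 1
    return t


RTT = 0.010                # seconds, as in the example above
MARKS_PER_RTT = 2          # what an L4S sender aims to see
target = MARKS_PER_RTT / RTT   # 200 marks per second

print(f"target marking rate: {target:.0f} marks/s")
print(f"linear approximation: {linear_ramp_time(target):.2f} s")
print(f"stepped control law:  {codel_ramp_time(target):.2f} s")
```

Both estimates land a little under four seconds for a 10 ms RTT, in line with the figures discussed above.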

    Here is a simple experiment that should verify the existence and extent of the problem:
    [Sender] -> [Baseline RTT 10ms] -> [FQ_Codel, 100% BW] -> [Receiver]
    [Sender] -> [Baseline RTT 10ms] -> [Dumb FIFO, 105% BW] -> [FQ_Codel, 100% BW] -> [Receiver]
    Background traffic is a sparse latency-measuring flow, essentially a surrogate for gaming or VoIP.  The instantaneous latency experienced by this flow over time is the primary measurement.
    The experiment is simply to start up one L4S flow in parallel with the sparse flow, and let it run in saturation for say 60 seconds.  Repeat with an RFC-3168 flow (NewReno, CUBIC, doesn't matter which) for a further experimental control.  Flent offers a convenient method of doing this.
    Correct behaviour would show a brief latency peak caused by the interaction of slow-start with the FIFO in the subject topology, or no peak at all for the control topology; you should see this for whichever RFC-3168 flow is chosen as the control.  Expected results with L4S in the subject topology, however, are a peak extending about 4 seconds before returning to baseline.
[GREG] I would not expect to see the FIFO holding a queue in either case (i.e. I would expect this phenomenon to only affect the latency of the L4S flow) as long as BW > 24 Mbps (see ** below), but we will see.

    Please let me know how you get on with reproducing this result.
     - Jonathan Morton
[GREG] ** In "Sebastian's Topology" the difference between the Dumb FIFO BW and the FQ_Codel BW needs to be greater than or equal to one MTU / RTT (for Reno or DCTCP/Prague) to prevent queuing from occurring in the Dumb FIFO for a single steady state flow.  I suppose it needs to be an even bigger difference to prevent FIFO queuing with Cubic.   What is the rationale for using a multiplicative factor (105%) rather than an additive one scaled based on expected CC dynamics?
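
The 24 Mbps threshold mentioned earlier follows from this condition: the 5% headroom between the FIFO and the FQ_Codel shaper has to be at least one MTU per RTT. A minimal sketch of the arithmetic, assuming a 1500-byte MTU and taking RTT as the 10 ms baseline (both are assumptions; the function name is hypothetical):

```python
def min_bottleneck_rate(mtu_bytes, rtt_s, headroom_fraction):
    """Smallest shaper rate (bits/s) at which headroom_fraction of that rate
    covers a probe rate of one MTU per RTT (Reno or DCTCP/Prague style)."""
    probe_rate = mtu_bytes * 8 / rtt_s      # bits/s added by +1 MTU per RTT
    return probe_rate / headroom_fraction


MTU_BYTES = 1500     # assumed Ethernet-sized MTU
RTT_S = 0.010        # baseline RTT from the proposed experiment
HEADROOM = 0.05      # FIFO at 105% of the FQ_Codel rate

rate = min_bottleneck_rate(MTU_BYTES, RTT_S, HEADROOM)
print(f"minimum FQ_Codel rate: {rate / 1e6:.1f} Mbps")
```

A longer RTT (including any queue delay) lowers the required rate, while a larger per-RTT increase, as with Cubic, raises it.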