Re: [L4s-discuss] Configuring a L4S test plant

Sebastian Moeller <moeller0@gmx.de> Fri, 06 October 2023 07:09 UTC

From: Sebastian Moeller <moeller0@gmx.de>
Date: Fri, 06 Oct 2023 09:09:41 +0200
Cc: Matteo Guarna S303434 <matteo.guarna@studenti.polito.it>, "l4s-discuss@ietf.org" <l4s-discuss@ietf.org>, "Chia-Yu Chang (Nokia)" <chia-yu.chang@nokia-bell-labs.com>
To: "Koen De Schepper (Nokia)" <koen.de_schepper=40nokia-bell-labs.com@dmarc.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/l4s-discuss/nyzEJdAhoXZJ1pRmghgrCn6TRHY>

Hi Koen,


> On Oct 5, 2023, at 21:43, Koen De Schepper (Nokia) <koen.de_schepper=40nokia-bell-labs.com@dmarc.ietf.org> wrote:
> 
> Hi Matteo,
> 
> Does your topology have a base RTT configured on your bottleneck, or is it just a LAN-type idle ping (a few µs)? The Prague version we have on the testing branch does not implement the "below a minimum window of 2" requirement. This means that every Prague flow will have at least 2 packets in flight per RTT, with the result that Prague flows on such low RTTs get a very big rate and become unresponsive when they should reduce their window below 2 packets per RTT. As a result the DualPI2 AQM will go into overload-protection mode and will drop both Prague and Classic traffic,

	[SM] Looking at the Prague trace, I do not seem to see dupACKs or other obvious signs of drops (I might be doing it wrong, though). In the cubic trace I do see dupACKs, so drops do happen there...
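
	[SM] (Aside, for anyone repeating this check: one quick way to count such events is via tshark's TCP analysis flags; "prague.pcap" below is only a stand-in for the actual capture file name.)

```shell
# Count duplicate ACKs and retransmissions that Wireshark's analysis
# engine flags in a capture (file name is a placeholder):
tshark -r prague.pcap -Y tcp.analysis.duplicate_ack | wc -l
tshark -r prague.pcap -Y tcp.analysis.retransmission | wc -l
```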



> where Prague with only 2 packets in flight has a high chance of losing all packets in flight and goes into RTO (I believe 200ms timeouts),

	[SM] In the Prague trace there are no signs of 200ms timeouts...


> causing a very low average rate. Cubic has many packets in flight and in a buffer, and usually never gets RTO events, so it can keep its more normal rate.
> 
> If you add a netem latency of 5 to 20ms on the middle router (which is typical in the real internet today) you will see that this issue no longer appears (at least not at 100Mbps).

	[SM] Speaking of 20ms, is the RTT-de-bias mode on by default in Prague, and if so, what value is the reference RTT set to?
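
	[SM] (Sketching Koen's netem suggestion, with placeholder interface names and delay: keep dualpi2 on the 100Mbps bottleneck egress and emulate the base delay on the opposite interface, so the AQM and netem do not have to share one qdisc tree. A 10ms one-way delay in one direction yields a 10ms base RTT.)

```shell
# dualpi2 stays on the bottleneck egress towards the receivers
tc qdisc replace dev eth1 root dualpi2
# emulate the path delay on the opposite (sender-facing) interface
tc qdisc replace dev eth0 root netem delay 10ms
```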


> 
> In our latest IETF interop we demonstrated the same problem on our PON (FttH) emulation setup, showing that this 'below minimum window of 2 packets' requirement is important to implement. But in this case we also assumed that the server is on an edge cloud, near the access network, to have such low base latencies (of 300µs RTT).

	[SM] QUESTION: since most deployed PONs seem to use some sort of request/grant mechanism (e.g. DBA in GPON) is 0.3ms actually an achievable RTT or would that not be closer to 1-2 ms?


> As a result we have implemented a new Prague update that can go to rates of below 2 packets in flight per RTT, and will be controllable down to 100kbps on any RTT below 25ms (even 0µs, and at that low rate gets theoretically 91% marking for L4S and 20% loss for Classic).
> So as a second testcase, you could try keeping that really low base RTT, and use the Prague kernel on the "ratebase" branch. Note this is an alpha version, not fully regression tested by us, but since you are facing the problem, you can use it. If you find any issues, contact me or Chia-Yu in CC.
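
	[SM] (Sanity-checking the quoted numbers: 91% L4S marking and roughly 20% Classic loss are consistent with the DualPI2 coupling p_C = (p_L / k)^2 at the default coupling factor k = 2.)

```python
# DualPI2 couples the Classic drop probability to the L4S marking
# probability as p_C = (p_L / k)^2, with default coupling factor k = 2.
k = 2.0
p_L = 0.91                 # 91% L4S marking probability, as quoted above
p_C = (p_L / k) ** 2
print(round(p_C * 100))    # ~21, i.e. roughly the quoted 20% Classic loss
```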
> 
> Success,
> Koen.
> 
> 
> -----Original Message-----
> From: Matteo Guarna S303434 <matteo.guarna@studenti.polito.it> 
> Sent: Thursday, October 5, 2023 6:22 PM
> To: l4s-discuss@ietf.org
> Subject: Re: [L4s-discuss] Configuring a L4S test plant
> 
> Hi Neal,
> 
> thank you for reaching out to me. I executed the script on both the prague and the cubic server as you asked.
> 
> The prague server has IP address 192.168.202.21 and transmits data towards 192.168.201.17. The cubic server has IP address 192.168.202.22 and transmits data towards 192.168.201.18.
> 
> All connections lasted for 20 seconds and were established via iperf3 in reverse mode.
> 
> Please forgive me for having the date on the two machines out of sync (the flows had in fact started at the same time):
> - the transmission timestamp on the prague server begins at Thu Oct 5 2023, 05:22:50 PM CEST
> - the transmission timestamp on the cubic server begins at Fri Sep 29 2023, 01:37:53, CEST
> 
> I am providing you with the captures as attachments to this mail: I named them with the "prague" and "cubic" suffixes after the servers where the capture took place.
> 
> 
> If you need more information please don't hesitate to contact me
> 
> Best regards and thank you in advance,
> 
> Matteo Guarna
> 
> 
> 
> On 2023-10-04 17:18, Neal Cardwell wrote:
>> Thanks for the report, Matteo.
>> 
>> To help debug this, could you please gather and share the following 
>> instrumentation during one of your tests? This would need to be 
>> collected on both data senders (servers), as root:
>> 
>> (while true; do date; ss -tenmoi; sleep 1; done) > /root/ss.txt &
>> tcpdump -w /root/dump.pcap -n -s 100 -c 1000000 host $REMOTE_HOST -i $INTERFACE &
>> nstat -n; (while true; do date; nstat; sleep 1; done) > /root/nstat.txt &
>> 
>> The data should probably only be needed for the time interval starting 
>> from before the test and ending when the flows reach steady state, 
>> which may be 10-20 secs into the test.
>> 
>> thanks,
>> neal
>> 
>> On Wed, Oct 4, 2023 at 6:03 AM Sebastian Moeller <moeller0@gmx.de>
>> wrote:
>> 
>>> Hi Matteo,
>>> 
>>>> On Oct 4, 2023, at 11:48, Matteo Guarna S303434
>>> <matteo.guarna@studenti.polito.it> wrote:
>>>> 
>>>> Hi Sebastian and thank you for your answer
>>>> 
>>>>> On 2023-10-03 16:39, Sebastian Moeller wrote:
>>>>> Hi Matteo.
>>>>>> On Oct 3, 2023, at 15:42, Matteo Guarna S303434
>>> <matteo.guarna@studenti.polito.it> wrote:
>>>>>> Greetings everyone,
>>>>>> I hope the question isn't too off-topic, please forgive me in
>>> advance if it is so.
>>>>>> I am still trying to perform some fairness measurements with
>>> both L4S and classic flow, although now on a physical test plant 
>>> instead of a virtualized one. I'm relying on the L4STeam Github 
>>> project for the deployment of the L4S architecture and I am looking 
>>> for someone who's familiar with the project and might be willing to 
>>> help me: in fact I seem not to be able to achieve the correct 
>>> configuration.
>>>>>> My setup is very simple: I have four servers (two senders and
>>> two receivers) exchanging two traffic flows through one server acting 
>>> as a router. One client-server pair uses Prague as CC, while the 
>>> other uses Cubic. All servers have the patched kernel provided in the 
>>> https://github.com/L4STeam/linux/ repository branch.
>>>>>> If I trigger congestion on the router by generating both the
>>> Prague and the Cubic flows (say the flows measure 100 Mbit/s each,
>>> and they come through an L2 switch, both on the same router input
>>> interface on a 1Gb Ethernet link; only a 100Mb link is in place on
>>> the output interface towards the receivers), I see the L4S flow
>>> having higher delay, higher jitter and a smaller (and more variable)
>>> bandwidth share. The Prague share is 1/4 of the Cubic share. I am
>>> sending an attachment with a graphical representation of the
>>> scenario described here.
>>>>>> I configured my L4S endpoints as follows:
>>>>>> - I set the CC as tcp Prague (sysctl -w
>>> net.ipv4.tcp_congestion_control=prague)
>>>>>> - I enabled AccECN, even though apparently it's not necessary
>>> (sysctl -w net.ipv4.tcp_ecn=3)
>>>>>> - I disabled the required offloading capabilities on the
>>> endpoints (sudo ethtool -K $NETIF tso off gso off gro off lro off)
>>>>> [SM] I think you need to do the same on the router... or, with
>>>>> your topology running prague and cubic over separate end-points,
>>>>> especially on the router itself. Side note: sch_cake grew a
>>>>> split-gso mode to automatically handle this issue, because it can
>>>>> be a bit of a whack-a-mole problem to make these configs stick
>>>>> (and in the case of cake the idea was to make deployment easy even
>>>>> for non-experts).
>>>> 
>>>> [MG] I tried as you suggested and unfortunately the situation
>>> remains unvaried.
>>> 
>>> [SM2] Hmmm, that would indicate that it might not be "lumpiness"
>>> of inputs into the router. I guess I would take packet captures on
>>> both interfaces of the router to see whether there is any unexpected
>>> distribution of packets between input and output. Also worth looking
>>> at is the CPU usage on the router... we occasionally run into issues
>>> with aggressive power/voltage/frequency scaling, where a CPU might
>>> take much longer to wake up than expected; the L-queue with its
>>> rather low (IMHO too low) reference delay of 1ms would be especially
>>> sensitive to such issues. Also, does your 100Mbps interface support
>>> BQL?
>>> 
>>>> Still, I think I missed the point regarding sch_cake, could you
>>> explain again what it is and if and how could it be useful?
>>> 
>>> [SM2] I am talking about Linux's cake qdisc. Just as an example,
>>> cake does not support special treatment of ECT(1) but implements
>>> rfc3168 ECN signaling for both ECT(0) and ECT(1). So for your
>>> experiments it might not be that useful (but for the fun of it,
>>> maybe try it as an alternative to DualQ); I just mentioned it as an
>>> example of a qdisc that opted for not simply disabling all offloads.
>>> After all, these offloads are quite useful, as they can considerably
>>> reduce the CPU cost of networking. (GSO/GRO work by amortizing the
>>> somewhat fixed per-packet cost of the Linux network stack over
>>> multiple ethernet frames; as long as the increased delay inherent in
>>> such batching approaches is acceptable, this can help a lot.)
>>> 
>>>> Apologies, I guess I perfectly fit the definition of
>>> "non-experts". I tried to look it up on the internet but struggled
>>> to find any clarification.
>>> 
>>> [SM2] Sorry, my bad, I should have been clearer that I was talking
>>> about a qdisc here; see "man tc-cake" on a sufficiently modern Linux
>>> system. The source code file is called sch_cake.c (see e.g.
>>> https://elixir.bootlin.com/linux/latest/source/net/sched/sch_cake.c).
>>> 
>>>> 
>>>>>> - I configured the fair queue on the endpoints (sudo tc qdisc
>>> replace dev $NETIF root fq)
>>>>>> I configured my router as follows:
>>>>>> - I enabled forwarding through these interfaces to obtain the
>>> routing capabilities (sudo sysctl -w net.ipv4.ip_forward=1)
>>>>>> - I set the dualpi2 on both interfaces (sudo tc qdisc replace
>>> dev $NETIF root dualpi2)
>>>>>> I then applied the fair queue and disabled the offloading
>>> capabilities on both my classic endpoints to ensure that the classic 
>>> and l4s flows act as fairly as possible, but to no avail (even 
>>> without these precautions the results remain roughly the same).
>>>>> [SM] Again, I think with your topology offloads at the endpoints
>>>>> should not have much influence, but at the router they well might.
>>>>> If that turns out to help, this might be explained by Prague's
>>>>> (and/or DualQ's L-queue) considerably higher sensitivity to bursty
>>>>> traffic compared to the classic traffic and queue.
>>>>>> I am sure I am missing some important details in the setup, and
>>> I would really appreciate some help.
>>>>> [SM] To me this looks rather straightforward, and I would
>>>>> probably try something similar, but I did not actually try it in
>>>>> practice.
>>>>> Regards & good luck
>>>>> Sebastian
>>>> 
>>>> [MG] Thanks in advance for your help, and if you have other
>>> tips or if you (or anyone else for that matter) are by any chance 
>>> aware of a paper or project using the prague branch of the L4STeam 
>>> repository, that might indeed be really helpful too.
>>> 
>>> [SM] I am not the best/most objective person to quiz here, as I
>>> consider L4S in general too little too late, and neither TCP Prague
>>> nor the DualQ AQM worth deploying in their current state (but that
>>> is why I consider your effort researching these admirable; both
>>> IMHO direly need more research).
>>> 
>>> I would always try to run the same tests over a bottleneck using
>>> an fq-scheduler, be it the all-in-one cake or fq_codel. Fq_codel can
>>> actually be configured to treat ECT(1) more in line with what TCP
>>> Prague desires, so that might well be a decent starting point for
>>> alternative measurements...
>>> 
>>> Regards
>>> Sebastian
>>> 
>>>> 
>>>> My best regards to you and the community, Matteo
>>>> 
>>>>>> Regards,
>>>>>> Matteo
>>>>>> P.s.
>>>>>> I just want to point out that by looking at the packet traces
>>> everything seems fine: Prague carries the ECN=1, the dualpi2 marks 
>>> packets with ECN=3, the AccEcn control signals on the ACE fields are 
>>> coherent, and no losses occur in the Prague flow, while they do 
>>> happen with the Cubic flow. It looks like Prague is underperforming 
>>> for whatever reason. Furthermore, if I switch back to two Cubic flows 
>>> I measure perfect share, equal delay and equal jitter, so it looks to 
>>> me like there are no physical impairments on the
>>> testbed. <testplant_issue.pdf>
>>>>>> --
>>>>>> L4s-discuss mailing list
>>>>>> L4s-discuss@ietf.org
>>>>>> https://www.ietf.org/mailman/listinfo/l4s-discuss
>>>> 
>>> 