Re: [L4s-discuss] Configuring a L4S test plant

Matteo Guarna S303434 <matteo.guarna@studenti.polito.it> Sat, 07 October 2023 15:31 UTC

MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Date: Sat, 07 Oct 2023 17:31:00 +0200
From: Matteo Guarna S303434 <matteo.guarna@studenti.polito.it>
To: l4s-discuss@ietf.org
In-Reply-To: <6F852039-08FB-419F-A396-C1F8EB1CD79D@gmx.de>
References: <7952e11516cc7b25484b53ae1380d88c@studenti.polito.it> <230D9924-C32F-4DE8-8BBD-F3D35D94B05B@gmx.de> <b82b81e36e168f6e627798d8cd588db8@studenti.polito.it> <A3BEF415-8574-4854-93D5-7CD1DB7B60F5@gmx.de> <CADVnQynOTd3FsHRk-BG5BTTmEYaM3JdnPj5qJQ9BHOqY_SPwsQ@mail.gmail.com> <727ed5bc3df58dff2e23115a8165b9b2@studenti.polito.it> <CADVnQyn=zSoDiCTK=wbXMt9zaSArYkTv_VTVtt=ve4R011GHxQ@mail.gmail.com> <8f0a95fe65ab1397269afabfd365aaaa@studenti.polito.it> <6F852039-08FB-419F-A396-C1F8EB1CD79D@gmx.de>
Message-ID: <650e1eda28d1d49dab091d04cd56ad15@studenti.polito.it>
X-Sender: matteo.guarna@studenti.polito.it
User-Agent: Roundcube Webmail/1.2-rc
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/l4s-discuss/7rLadYMoGn_IytINU_X82Uc4x7A>
Subject: Re: [L4s-discuss] Configuring a L4S test plant
X-BeenThere: l4s-discuss@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "Low Latency, Low Loss, Scalable Throughput \(L4S\) " <l4s-discuss.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/l4s-discuss>, <mailto:l4s-discuss-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/l4s-discuss/>
List-Post: <mailto:l4s-discuss@ietf.org>
List-Help: <mailto:l4s-discuss-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/l4s-discuss>, <mailto:l4s-discuss-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 07 Oct 2023 15:31:08 -0000

Hi Koen, thank you for always being ready to lend me your help!

On 2023-10-06 17:57, Sebastian Moeller wrote:
> Hi Matteo,
> 
> 
>> On Oct 6, 2023, at 14:32, Matteo Guarna S303434 
>> <matteo.guarna@studenti.polito.it> wrote:
>> 
>> Hi Neal. Thank you for providing me with your impressions so quickly,
>> 
>> On 2023-10-05 20:41 Neal Cardwell wrote:
>>> Thanks for the detailed data!
>>> You mention the L4S flow having a higher delay... what's the source
>>> for that data?
>>    [MG] I am using spindump to capture the flows passing through the 
>> router. Its code is available here: 
>> https://github.com/EricssonResearch/spindump
>> I can try to produce a log of the captures, but unfortunately I have 
>> to wait until Monday to access the test plant again. Still, I repeated 
>> my measurements many times over and I get a really consistent RTT 
>> measurement for Cubic each second (33.4 ms), while the per-second 
>> Prague measurements vary mostly between 33.9 ms and 34.7 ms.
>> 
>>> From a quick glance at the pcaps and ss data, it seems like:
>>> - From the ss data, CUBIC sees RTT delays between 35ms and 53ms;
>>> Prague sees RTT delays between 31ms and 35ms.
>> 
>>    [MG] Your observations are much more in line with their supposed 
>> behaviour than mine. I can see that myself on the ss capture, now that 
>> you're pointing that out... Maybe Spindump is having problems with the 
>> measurements for some reason? I will have to look into it, I guess. Thank you!
> 
> 	[SM] Hhmm, when comparing RTTs in the two traces, Prague and Cubic
> look for the longest time pretty close (Cubic has some "spikes" later
> in the trace), but that should not really be the case if the DualQ does its
> thing correctly... with DualQ as egress qdisc, how did you configure
> the actual interface (how deep were the interface buffers and was BQL
> active or not)?
> 
     [MG2] Honestly I do not know the depth of the buffer; I did not 
think about changing it, so it is going to be the default size for my 
machine. On Monday/Tuesday I will be able to access it again and I will 
surely check, but is there a chance for it to actually affect L4S's 
behaviour? For example, could a very deep buffer result in fewer drops 
than usual in the classic queue, and in turn a smaller share for the L4S 
traffic? Do you have any hypotheses? In any case, as soon as I return to 
the lab I will run some tests with various buffer sizes, so thank you a 
lot for the suggestion.
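
For reference, this is roughly how I plan to inspect the default buffering 
on the router's egress interface once I am back in the lab (the interface 
name enp2s0 is just a placeholder for whichever NIC faces the receivers):

   # hardware ring buffer sizes reported by the driver
   ethtool -g enp2s0
   # txqueuelen and the qdisc currently attached to the interface
   ip link show dev enp2s0
   tc -s qdisc show dev enp2s0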

Regarding BQL, I honestly didn't know it could be managed; how should it 
be set up to allow L4S to perform at its best?
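
From a quick look at the kernel's BQL documentation, it seems to be exposed 
per TX queue under sysfs, so I would try something like the following to 
check its state and to cap it (enp2s0 and the 30000-byte cap are just 
example values, and these knobs only have an effect if the NIC driver 
actually implements BQL; please correct me if this is the wrong approach):

   # show the current BQL state of each TX queue
   grep . /sys/class/net/enp2s0/queues/tx-*/byte_queue_limits/{limit,limit_max,inflight}
   # cap how many bytes the driver may have in flight (queued to the NIC
   # but not yet completed) per queue
   for q in /sys/class/net/enp2s0/queues/tx-*/byte_queue_limits/limit_max; do
       echo 30000 > "$q"
   done
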
> 
>> 
>>> - Prague is getting about a 6% ECN mark rate, and given that, it is
>>> correctly converging to a rate of roughly 1/.06 - 1 ~= 15 Mbps. That
>>> rate is far below its fair share of 50 Mbps. So if there is an issue
>>> here, it might be in dualpi2 providing too many ECN marks to the L4S
>>> flow and/or too few drops to the CUBIC flow.
>>    [MG] It may well be; in fact I generate traffic with iperf3 and I 
>> can see how many retransmissions actually happen during 60-second 
>> trials where I run both flows at 100 Mbps through the bottleneck. 
>> There, while I have virtually 0 retransmissions with Prague, I see 
>> very few retransmissions with Cubic, meaning around 20 in the first 
>> second and then 1 or 2 every three seconds on average. I think this 
>> might be a little too few; what do you think?
> 
> 	[SM] This matches what you can see in the packet captures as well if
> you do a tcptrace plot, essentially zero duplicate ACKs (signs of
> drops) for Prague and some for Cubic, so this is consistent...
> 

     [MG] That's reassuring to hear, but it raises a question: do you deem 
the number of drops in Cubic too low, or in line with your expectations? 
The scenario consists of two 100 Mbps flows on a 100 Mbps bottleneck.
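
For completeness, this is essentially how I start the two flows and where I 
read the retransmission counts (same addresses as in my earlier mail; I am 
reconstructing the exact commands from memory, so please take them as a 
sketch rather than a verbatim copy):

   # on the Prague receiver (192.168.201.17), pulling from the Prague server
   iperf3 -c 192.168.202.21 -R -t 60 -i 1
   # on the Cubic receiver (192.168.201.18), pulling from the Cubic server
   iperf3 -c 192.168.202.22 -R -t 60 -i 1
   # the "Retr" column in the sender-side summary is where I read the counts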

Thank you in advance

Matteo

> 
>> 
>> Thank you once again for your valuable insights
>> 
>> Matteo
>> 
>>> neal
>>> On Thu, Oct 5, 2023 at 12:23 PM Matteo Guarna S303434
>>> <matteo.guarna@studenti.polito.it> wrote:
>>>> Hi Neal,
>>>> thank you for reaching out to me. I executed the script on both the prague
>>>> and
>>>> the cubic server as you asked.
>>>> The prague server has IP address 192.168.202.21, and transmits data
>>>> towards 192.168.201.17
>>>> The cubic server has IP address 192.168.202.22, and transmits data
>>>> towards 192.168.201.18
>>>> All connections lasted for 20 seconds and were established via
>>>> iperf3 in
>>>> reverse mode
>>>> Please forgive me for having the date on the two machines out of
>>>> sync
>>>> (the flows had in fact started at the same time):
>>>> - the transmission timestamp on the prague server begins at Thu Oct
>>>> 5
>>>> 2023, 05:22:50 PM CEST
>>>> - the transmission timestamp on the cubic server begins at Fri Sep
>>>> 29
>>>> 2023, 01:37:53, CEST
>>>> I am providing you with the captures as attachments to this mail: I
>>>> named them with the "prague" and "cubic" suffixes after the servers
>>>> where the capture took place.
>>>> If you need more information please don't hesitate to contact me
>>>> Best regards and thank you in advance,
>>>> Matteo Guarna
>>>> On 2023-10-04 17:18, Neal Cardwell wrote:
>>>>> Thanks for the report, Matteo.
>>>>> To help debug this, could you please gather and share the
>>>> following
>>>>> instrumentation during one of your tests? This would need to be
>>>>> collected on both data senders (servers), as root:
>>>>> (while true; do date; ss -tenmoi; sleep 1; done) > /root/ss.txt &
>>>>> tcpdump -w /root/dump.pcap -n -s 100 -c 1000000 host $REMOTE_HOST -i $INTERFACE &
>>>>> nstat -n; (while true; do date; nstat; sleep 1; done) > /root/nstat.txt &
>>>>> The data should probably only be needed for the time interval
>>>> starting
>>>>> from before the test and ending when the flows reach steady state,
>>>>> which may be 10-20 secs into the test.
>>>>> thanks,
>>>>> neal
>>>>> On Wed, Oct 4, 2023 at 6:03 AM Sebastian Moeller
>>>> <moeller0@gmx.de>
>>>>> wrote:
>>>>>> Hi Matteo,
>>>>>>> On Oct 4, 2023, at 11:48, Matteo Guarna S303434
>>>>>> <matteo.guarna@studenti.polito.it> wrote:
>>>>>>> Hi Sebastian and thank you for your answer
>>>>>>> On 2023-10-03 16:39, Sebastian Moeller wrote:
>>>>>>>> Hi Matteo.
>>>>>>>>> On Oct 3, 2023, at 15:42, Matteo Guarna S303434
>>>>>> <matteo.guarna@studenti.polito.it> wrote:
>>>>>>>>> Greetings everyone,
>>>>>>>>> I hope the question isn't too off-topic, please forgive me in
>>>>>> advance if it is so.
>>>>>>>>> I am still trying to perform some fairness measurements with
>>>>>> both L4S and classic flows, although now on a physical test plant
>>>>>> instead of a virtualized one. I'm relying on the L4STeam Github
>>>>>> project for the deployment of the L4S architecture and I am
>>>> looking
>>>>>> for someone who's familiar with the project and might be willing
>>>> to
>>>>>> help me: in fact I seem not to be able to achieve the correct
>>>>>> configuration.
>>>>>>>>> My setup is very simple: I have four servers (two senders and
>>>>>> two receivers) exchanging two traffic flows through one server
>>>>>> acting as a router. One client-server pair uses Prague as CC,
>>>> while
>>>>>> the other uses Cubic. All servers have the patched kernel
>>>> provided
>>>>>> in the https://github.com/L4STeam/linux/ repository branch.
>>>>>>>>> If I trigger a congestion on the router by generating both the
>>>>>> Prague and the Cubic flows (let's say the flows measure 100
>>>> Mbit/s
>>>>>> each, and they come through an L2 switch both on the same router's
>>>>>> input interface on a 1Gb Ethernet link; only a 100M link though
>>>> is
>>>>>> in place on the output interface towards the receivers) I see the
>>>>>> L4S flow having higher delay, higher jitter and a smaller (and
>>>> more
>>>>>> variable) bandwidth share. The Prague share is 1/4 of the Cubic
>>>>>> share. I am sending an attachment with a graphical representation
>>>> of
>>>>>> the scenario here described.
>>>>>>>>> I configured my L4S endpoints as follows:
>>>>>>>>> - I set the CC to TCP Prague (sysctl -w net.ipv4.tcp_congestion_control=prague)
>>>>>>>>> - I enabled AccECN, even if it's apparently not necessary (sysctl -w net.ipv4.tcp_ecn=3)
>>>>>>>>> - I disabled the required offloading capabilities on the endpoints (sudo ethtool -K $NETIF tso off gso off gro off lro off)
>>>>>>>> [SM] I think you need to do the same on the router... or
>>>>>> with your
>>>>>>>> topology with running prague and cubic over separate end-points
>>>>>>>> especially on the router itself. Side-note, sch_cake grew a
>>>>>> split-gso
>>>>>>>> mode to automatically handle this issue because it can be a bit
>>>>>> of a
>>>>>>>> whack-a-mole problem to make these configs stick (and in the
>>>> case
>>>>>> of
>>>>>>>> cake the idea was to make deployment easy even for
>>>> non-experts).
>>>>>>> [MG] I tried as you suggested and unfortunately the situation
>>>>>> remains unchanged.
>>>>>> [SM2] Hmmm, that would indicate that it might not be
>>>>>> "lumpyness" of inputs into the router. I guess I would take
>>>> packet
>>>>>> captures on both interfaces of the router to see whether there is
>>>>>> any unexpected distribution of packets between both input and
>>>>>> output? Also worth looking at is the CPU usage on the router... we
>>>>>> occasionally run into issues with aggressive
>>>>>> power/voltage/frequency scaling where a CPU might take much
>>>> longer
>>>>>> to wake up than expected, the L-queue with its rather low (IMHO
>>>> too
>>>>>> low) reference delay of 1ms would be especially sensitive to such
>>>>>> issues.
>>>>>> Also does your 100Mbps interface support BQL?
>>>>>>> Still, I think I missed the point regarding sch_cake, could you
>>>>>> explain again what it is and if and how could it be useful?
>>>>>> [SM2] I am talking about Linux's cake qdisc and, just as an
>>>>>> example, cake does not support special treatment of ECT(1) but
>>>>>> implements rfc3168 ECN signaling for both ECT(0) and ECT(1). So
>>>> for
>>>>>> your experiments it might not be that useful (but for the fun of
>>>> it,
>>>>>> maybe try it as an alternative to DualQ). I just mentioned it as an
>>>>>> example for a qdisc that opted for not simply disabling all
>>>>>> offloads. After all these offloads are quite useful, as they can
>>>>>> considerably reduce the CPU cost of networking. (GSO/GRO work by
>>>>>> amortizing the somewhat fixed per-packet cost of the Linux
>>>>>> network stack over multiple ethernet frames; as long as the
>>>>>> increased delay inherent in such batching approaches is acceptable,
>>>>>> this can help a lot).
>>>>>>> Apologies, I guess I perfectly fit into the definition of "non
>>>>>> experts". I tried to look it up on the internet but I struggled
>>>> to
>>>>>> find any clarification.
>>>>>> [SM2] Sorry, my bad, I should have been clearer that I was
>>>>>> talking about a qdisc here, see "man tc-cake" on a sufficiently
>>>>>> modern Linux system, the source code file is called sch_cake.c
>>>> (see
>>>>>> e.g.
>>>> https://elixir.bootlin.com/linux/latest/source/net/sched/sch_cake.c)
>>>>>>>>> - I configured the fair queue on the endpoints (sudo tc qdisc replace dev $NETIF root fq)
>>>>>>>>> I configured my router as follows:
>>>>>>>>> - I enabled forwarding through these interfaces to obtain the routing capabilities (sudo sysctl -w net.ipv4.ip_forward=1)
>>>>>>>>> - I set the dualpi2 on both interfaces (sudo tc qdisc replace dev $NETIF root dualpi2)
>>>>>>>>> I then applied the fair queue and disabled the offloading
>>>>>> capabilities on both my classic endpoints to ensure that the
>>>> classic
>>>>>> and l4s flows act as fairly as possible, but to no avail (even
>>>>>> without these precautions the results remain roughly the same).
>>>>>>>> [SM] Again, I think with your topology offloads at the
>>>>>> endpoints
>>>>>>>> should not have much influence, but at the router they well
>>>> might.
>>>>>> If
>>>>>>>> that turns out to help this might be explained by Prague's
>>>>>> (and/or
>>>>>>>> DualQ's L-queue) considerably higher sensitivity to bursty
>>>>>> traffic
>>>>>>>> compared to classic traffic and queue.
>>>>>>>>> I am sure I am missing some important details in the setup,
>>>> and
>>>>>> I would really appreciate some help.
>>>>>>>> [SM] To me this looks rather straightforward, and I
>>>>>> probably would
>>>>>>>> try something similar, but I did not actually try in practice.
>>>>>>>> Regards & good luck
>>>>>>>> Sebastian
>>>>>>> [MG] Thanks in advance for your help, and if you have other
>>>>>> tips or if you (or anyone else for that matter) are by any chance
>>>>>> aware of a paper or project using the prague branch of the
>>>> L4STeam
>>>>>> repository, that might indeed be really helpful too.
>>>>>> [SM] I am not the best/most objective person to quiz here,
>>>>>> as I consider L4S in general too little too late and neither TCP
>>>>>> Prague nor the DualQ AQM worth deploying in their current state
>>>> (but
>>>>>> that is why I consider your effort researching these admirable,
>>>> both
>>>>>> IMHO direly need more research).
>>>>>> I would always try to run the same tests over a bottleneck using
>>>> a
>>>>>> fq-scheduler, be it the all-in-one cake or fq_codel. Fq_codel
>>>>>> actually can be configured to treat ECT(1) more in line with what
>>>>>> TCP Prague desires, so that might well be a decent starting point
>>>>>> for alternative measurements....
>>>>>> Regards
>>>>>> Sebastian
>>>>>>> My best regards to you and the community,
>>>>>>> Matteo
>>>>>>>>> Regards,
>>>>>>>>> Matteo
>>>>>>>>> P.s.
>>>>>>>>> I just want to point out that by looking at the packet traces
>>>>>> everything seems fine: Prague carries the ECN=1, the dualpi2
>>>> marks
>>>>>> packets with ECN=3, the AccEcn control signals on the ACE fields
>>>> are
>>>>>> coherent, and no losses occur in the Prague flow, while they do
>>>>>> happen with the Cubic flow. It looks like Prague is
>>>> underperforming
>>>>>> for whatever reason. Furthermore, if I switch back to two Cubic
>>>>>> flows I measure perfect share, equal delay and equal jitter, so
>>>> it
>>>>>> looks to me like there are no physical impairments on the
>>>>>>>>> testbed. <testplant_issue.pdf>
>> 