Re: [L4s-discuss] Configuring a L4S test plant
Neal Cardwell <ncardwell@google.com> Wed, 04 October 2023 15:18 UTC
References: <7952e11516cc7b25484b53ae1380d88c@studenti.polito.it> <230D9924-C32F-4DE8-8BBD-F3D35D94B05B@gmx.de> <b82b81e36e168f6e627798d8cd588db8@studenti.polito.it> <A3BEF415-8574-4854-93D5-7CD1DB7B60F5@gmx.de>
In-Reply-To: <A3BEF415-8574-4854-93D5-7CD1DB7B60F5@gmx.de>
From: Neal Cardwell <ncardwell@google.com>
Date: Wed, 04 Oct 2023 11:18:23 -0400
Message-ID: <CADVnQynOTd3FsHRk-BG5BTTmEYaM3JdnPj5qJQ9BHOqY_SPwsQ@mail.gmail.com>
To: Matteo Guarna S303434 <matteo.guarna@studenti.polito.it>
Cc: Sebastian Moeller <moeller0@gmx.de>, l4s-discuss@ietf.org
Content-Type: multipart/alternative; boundary="000000000000a642af0606e5853f"
Archived-At: <https://mailarchive.ietf.org/arch/msg/l4s-discuss/_Wgk5WpUxD5sTqfsaEEyLQWegAE>
Subject: Re: [L4s-discuss] Configuring a L4S test plant

Thanks for the report, Matteo.

To help debug this, could you please gather and share the following
instrumentation during one of your tests? This would need to be collected
on both data senders (servers), as root:

  (while true; do date; ss -tenmoi; sleep 1; done) > /root/ss.txt &
  tcpdump -w /root/dump.pcap -n -s 100 -c 1000000 host $REMOTE_HOST -i $INTERFACE &
  nstat -n; (while true; do date; nstat; sleep 1; done) > /root/nstat.txt &

The data should probably only be needed for the time interval starting
from before the test and ending when the flows reach steady state, which
may be 10-20 secs into the test.

thanks,
neal

On Wed, Oct 4, 2023 at 6:03 AM Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Matteo,
>
> > On Oct 4, 2023, at 11:48, Matteo Guarna S303434
> > <matteo.guarna@studenti.polito.it> wrote:
> >
> > Hi Sebastian and thank you for your answer
> >
> > On 2023-10-03 16:39 Sebastian Moeller wrote:
> >> Hi Matteo.
> >>> On Oct 3, 2023, at 15:42, Matteo Guarna S303434
> >>> <matteo.guarna@studenti.polito.it> wrote:
> >>> Greetings everyone,
> >>> I hope the question isn't too off-topic, please forgive me in
> >>> advance if it is so.
> >>> I am still trying to perform some fairness measurements with both
> >>> L4S and classic flows, although now on a physical test plant instead
> >>> of a virtualized one. I'm relying on the L4STeam Github project for
> >>> the deployment of the L4S architecture and I am looking for someone
> >>> who's familiar with the project and might be willing to help me: in
> >>> fact I seem not to be able to achieve the correct configuration.
> >>> My setup is very simple: I have four servers (two senders and two
> >>> receivers) exchanging two traffic flows through one server acting as
> >>> a router. One client-server pair uses Prague as CC, while the other
> >>> uses Cubic. All servers have the patched kernel provided in the
> >>> https://github.com/L4STeam/linux/ repository branch.
> >>> If I trigger a congestion on the router by generating both the
> >>> Prague and the Cubic flows (let's say the flows measure 100 Mbit/s
> >>> each, and they both come through an L2 switch on the same router
> >>> input interface on a 1Gb Ethernet link; only a 100M link though is
> >>> in place on the output interface towards the receivers) I see the
> >>> L4S flow having higher delay, higher jitter and a smaller (and more
> >>> variable) bandwidth share. The Prague share is 1/4 of the Cubic
> >>> share. I am sending an attachment with a graphical representation
> >>> of the scenario here described.
> >>> I configured my L4S endpoints as follows:
> >>> - I set the CC as tcp Prague
> >>>   (sysctl -w net.ipv4.tcp_congestion_control=prague)
> >>> - I set the AccEcn, even if it's not necessary apparently
> >>>   (sysctl -w net.ipv4.tcp_ecn=3)
> >>> - I disabled the required offloading capabilities on the endpoints
> >>>   (sudo ethtool -K $NETIF tso off gso off gro off lro off)
> >> [SM] I think you need to do the same on the router... or with your
> >> topology with running prague and cubic over separate end-points
> >> especially on the router itself. Side-note, sch_cake grew a split-gso
> >> mode to automatically handle this issue because it can be a bit of a
> >> whack-a-mole problem to make these configs stick (and in the case of
> >> cake the idea was to make deployment easy even for non-experts).
> >
> > [MG] I tried as you suggested and unfortunately the situation remains
> > unchanged.
>
> [SM2] Hmmm, that would indicate that it might not be "lumpiness" of
> inputs into the router. I guess I would take packet captures on both
> interfaces of the router to see whether there is any unexpected
> distribution of packets between both input and output? Also worth
> looking at is the CPU usage on the router... we occasionally run into
> issues with aggressive(?) power/voltage/frequency scaling where a CPU
> might take much longer to wake up than expected; the L-queue with its
> rather low (IMHO too low) reference delay of 1ms would be especially
> sensitive to such issues. Also does your 100Mbps interface support BQL?
>
> > Still, I think I missed the point regarding sch_cake, could you
> > explain again what it is and if and how it could be useful?
>
> [SM2] I am talking about Linux's cake qdisc, and just as an example:
> cake does not support special treatment of ECT(1) but implements
> rfc3168 ECN signaling for both ECT(0) and ECT(1). So for your
> experiments it might not be that useful (but for the fun of it, maybe
> try it as an alternative for DualQ). I just mentioned it as an example
> of a qdisc that opted for not simply disabling all offloads. After all
> these offloads are quite useful, as they can considerably reduce the
> CPU cost of networking. (GSO/GRO work by amortizing the somewhat fixed
> per-packet cost of the Linux network stack over multiple ethernet
> frames; as long as the increased delay inherent in such batching
> approaches is acceptable, this can help a lot.)
>
> > My apologies, I guess I perfectly fit into the definition of "non
> > experts". I tried to look it up on the internet but I struggled to
> > find any clarification.
>
> [SM2] Sorry, my bad, I should have been clearer that I was talking
> about a qdisc here, see "man tc-cake" on a sufficiently modern Linux
> system; the source code file is called sch_cake.c (see e.g.
> https://elixir.bootlin.com/linux/latest/source/net/sched/sch_cake.c)
>
> >
> >>> - I configured the fair queue on the endpoints
> >>>   (sudo tc qdisc replace dev $NETIF root fq)
> >>> I configured my router as follows:
> >>> - I enabled forwarding through these interfaces to obtain the
> >>>   routing capabilities (sudo sysctl -w net.ipv4.ip_forward=1)
> >>> - I set the dualpi2 on both interfaces
> >>>   (sudo tc qdisc replace dev $NETIF root dualpi2)
> >>> I then applied the fair queue and disabled the offloading
> >>> capabilities on both my classic endpoints to ensure that the classic
> >>> and l4s flows act as fairly as possible, but to no avail (even
> >>> without these precautions the results remain roughly the same).
> >> [SM] Again, I think with your topology offloads at the endpoints
> >> should not have much influence, but at the router they well might. If
> >> that turns out to help this might be explained by Prague's (and/or
> >> DualQ's L-queue) considerably higher sensitivity to bursty traffic
> >> compared to classic traffic and queue.
> >>> I am sure I am missing some important details in the setup, and I
> >>> would really appreciate some help.
> >> [SM] To me this looks rather straightforward, and I probably would
> >> try something similar, but I did not actually try in practice.
> >> Regards & good luck
> >> Sebastian
> >
> > [MG] Thanks in advance for your help, and if you have other tips or
> > if you (or anyone else for that matter) are by any chance aware of a
> > paper or project using the prague branch of the L4STeam repository,
> > that might indeed be really helpful too.
>
> [SM] I am not the best/most objective person to quiz here, as I
> consider L4S in general too little too late and neither TCP Prague nor
> the DualQ AQM worth deploying in their current state (but that is why I
> consider your effort researching these admirable, both IMHO really need
> more research direly).
>
> I would always try to run the same tests over a bottleneck using a
> fq-scheduler, be it the all-in-one cake or fq_codel. Fq_codel actually
> can be configured to treat ECT(1) more in line with what TCP Prague
> desires, so that might well be a decent starting point for alternative
> measurements....
>
> Regards
> Sebastian
>
> >
> > My best regards to you and the community,
> > Matteo
> >
> >>> Regards,
> >>> Matteo
> >>> P.s.
> >>> I just want to point out that by looking at the packet traces
> >>> everything seems fine: Prague carries the ECN=1, the dualpi2 marks
> >>> packets with ECN=3, the AccEcn control signals on the ACE fields are
> >>> coherent, and no losses occur in the Prague flow, while they do
> >>> happen with the Cubic flow. It looks like Prague is underperforming
> >>> for whatever reason. Furthermore, if I switch back to two Cubic
> >>> flows I measure perfect share, equal delay and equal jitter, so it
> >>> looks to me like there are no physical impairments on the testbed.
> >>> <testplant_issue.pdf>
> >>> --
> >>> L4s-discuss mailing list
> >>> L4s-discuss@ietf.org
> >>> https://www.ietf.org/mailman/listinfo/l4s-discuss
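
For anyone trying to reproduce the setup quoted in this thread, the endpoint and router steps could be collected into one small script. What follows is only a sketch under stated assumptions, not a verified recipe: the function names, the `run` helper, the `APPLY` switch and the `eth0` default are invented for this example, and turning offloads off on the router reflects Sebastian's suggestion rather than the original configuration. The sysctl, ethtool and tc invocations themselves are the ones quoted above; `tcp_ecn=3`, `prague` and `dualpi2` presumably require the patched L4STeam kernel and matching tools.

```shell
#!/bin/sh
# Sketch: recap of the configuration steps quoted in this thread.
# By default the commands are only printed; set APPLY=1 (as root) to run them.
# NETIF, APPLY, run, setup_l4s_endpoint and setup_router are names made up
# for this example.

NETIF="${NETIF:-eth0}"   # assumption: placeholder interface name
APPLY="${APPLY:-0}"      # 0 = dry run (print only), 1 = execute

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$APPLY" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# L4S sender/receiver: Prague CC, AccECN, offloads off, fq qdisc on $NETIF.
setup_l4s_endpoint() {
    run sysctl -w net.ipv4.tcp_congestion_control=prague
    run sysctl -w net.ipv4.tcp_ecn=3
    run ethtool -K "$NETIF" tso off gso off gro off lro off
    run tc qdisc replace dev "$NETIF" root fq
}

# Router: enable forwarding, then for every interface given as an argument
# turn offloads off (suggested in the thread) and install dualpi2.
setup_router() {
    run sysctl -w net.ipv4.ip_forward=1
    for dev in "$@"; do
        run ethtool -K "$dev" tso off gso off gro off lro off
        run tc qdisc replace dev "$dev" root dualpi2
    done
}
```

After sourcing the file, `setup_l4s_endpoint` or `setup_router eth1 eth2` (interface names are placeholders) prints the commands it would run, so they can be reviewed before re-running with `APPLY=1`.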