Re: [tsvwg] Why L4S need a separate queue?

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Thu, 18 April 2024 12:26 UTC

Message-ID: <da6b5bb7-56c9-42f2-9c21-6129278757b8@erg.abdn.ac.uk>
Date: Thu, 18 Apr 2024 13:26:38 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-GB
To: Sebastian Moeller <moeller0=40gmx.de@dmarc.ietf.org>, Vasilenko Eduard <vasilenko.eduard=40huawei.com@dmarc.ietf.org>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <6284927dbd704a6abdec7fca7f6bade2@huawei.com> <8a83c60e2dd841679f4a0b1848830b6f@huawei.com> <C8959351-9C9A-457E-9E6F-CE3ABA195DAE@gmx.de>
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Organization: UNIVERSITY OF ABERDEEN
In-Reply-To: <C8959351-9C9A-457E-9E6F-CE3ABA195DAE@gmx.de>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/HAjMW-OmTzMLSonfCl3mSpdsGZg>
Subject: Re: [tsvwg] Why L4S need a separate queue?
Precedence: list

On 18/04/2024 09:28, Sebastian Moeller wrote:
> Hi Ed,
>
>
>> On 18. Apr 2024, at 09:58, Vasilenko Eduard <vasilenko.eduard=40huawei.com@dmarc.ietf.org> wrote:
>>
>> If somebody in the future would develop yet another CC algorithm (nobody could predict future requirements),
>> But dependency between algorithms (formula) is hard-coded in the router’s hardware,
>> Then it is an infrastructure replacement again.
>> It was wrong to put a formula for CC cooperation/fairness into hardware.
> [SM] I actually agree that both redefining what response is expected for a single CE mark and the imprecise coupling between two queues are obvious flaws of L4S, that could and should have been avoided. However IMHO the solution really is to actually signal the piece of information end points need to adapt to the situation at the bottleneck, and that IMHO would be a multibit signal about the bottleneck state sent in every individual packet. Flows can then reflect this information and adjust their rate according to the veridical dynamics of the bottleneck load.  That also seems to be the way the data center is going (see e.g. https://www.usenix.org/conference/nsdi23/presentation/wang-weitao) abandoning the approach of emulating a true per-packet signal with a rate code of the same state bit essentially toggling a single bit.
>
So, since this is an IETF list, let's be clear. To my knowledge, the
IETF has not to this date decided to standardise any multi-bit approach
to signalling experienced congestion. That, I suggest, is an engineering
decision that has been revisited a number of times over the years, most
recently - from the Internet perspective (i.e. tsvwg list) - in the
currently recommended specification for the design of L4S.

>>   It may have a small positive effect on the transition period to isolate new traffic from legacy,
> [SM] IMHO isolation really is the key, but just separating traffic based on the expected differential CE response as L4S does is really just the bare minimum, and is mostly owed to the fact that these two responses are inherently incompatible and that normal internet traffic expects/requires deeper queues.
>
>> But BBR has surpassed half of the traffic already and resolved this problem differently.
> [SM] Mmmh, but has it? BBRv3, as far as I know actually evaluates and responds to L4S style CE marks.
From the IETF perspective, BBR is documented in a draft of ICCRG, and -
I think - continues to be of interest there, and I believe the draft 
authors plan to continue to update the spec. There have been ICCRG 
presentations that talk about the interaction with CE marks, and likely 
more will be welcome.
>>   Especially in the situation when the formula was analytically derived from the wrong assumptions.
>> It is a time of errata for RFC 9332 where RENO and CUBIC dependencies are claimed on the square root of marking probability.
> [SM] Maybe it is time you present a proof of the claim that the square root 'law' is not 'good enough'? CoDel (https://datatracker.ietf.org/doc/html/rfc8289) is built on that assumption and in practice works pretty well. To me this implies that the theory is 'good enough'.
>
>>   By the way, RFC 9332 references PI2 publication which claims different formulas for RENO and CUBIC. CUBIC is cubical as anybody could guess – very different from RENO.
> [SM] Indeed, but depending on the regime cubic is in, it is close to Reno, and IIRC was designed to peacefully coexist with Reno.
>
>> CUBIC has almost replaced RENO on the market, but RFC 9330 and RFC 9332 propose to use a formula from RENO. It is not logical at all.
> [SM] I seem to recall that at the time Reno was the only TCP variant that had a published standard RFC, so honouring this as precedent seems totally expected for the IETF, no?
>
>> Even if nobody spotted that analytical formulas are wrong, why the formula from RENO was chosen for DualQ, not for CUBIC?
> [SM] See above, as Reno has RFC status while cubic has not, and cubic tries to accommodate Reno.
>
>> It is an additional errata for the RFC 9332.
>>   Opensource has won IETF again and unfortunately, I suspect why.
> [SM] Excuse me? That comes quite unexpected, what do you mean with opensource here, and what is your exact criticism, it is not that RFCs based on proprietary commercial work are always perfect either...
>
>> This silence is a good sign of what is going on in TSVWG.
>> Nobody wants to discuss the problem in essence. Everything has already been decided by politics.
> [SM] As much as I agree with the view that politics has a stronger influence on IETF decisions than the self-description of the IETF's processes would imply, I also consider that to be pretty much expected when a larger number of folks need to agree on things... But typically well-made arguments will elicit a response in this mailing list by actual experts (and not just the peanut gallery visitors like me) but that can take a bit, so maybe have a bit more patience?
>
> Regards
> 	Sebastian

Although there may not be many comments on the thread, please be aware
this discussion is going to many, many people who are interested in
developing IETF standards. You're welcome to have these thoughts on
what you think should be, but please try to use the list to discuss the
specifications of adopted work or work that is planned to be brought to
the IETF.

Best wishes

Gorry

>> Eduard
>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Vasilenko Eduard
>> Sent: Tuesday, April 16, 2024 17:06
>> To: tsvwg@ietf.org
>> Subject: [tsvwg] Why L4S need a separate queue?
>>   Hi Experts,
>> For sure, some balancing is needed between the new CCA against the old CCA to achieve the fairness. It may have a complex formula – no problem with that.
>> But why this formula has been put on routers?
>> BBR (without ECN feedback) has finally developed a good enough formula to play nicely with CUBIC.
>> I have not seen a test for BBR with ECN against CUBIC with ECN or against CUBIC without ECN. If anybody saw it – please share.
>> No doubt that such a formula could be developed for BBR with ECN (if it is not a part of BBR already).
>> Any CCA MUST have fairness with CUBIC in one queue – the world has a huge number of already installed hardware (with fixed schedulers).
>>   Then the only thing that would be needed from routers is to support ECN instead (or in addition) of drop.
>> And ECN(1) would not be needed.
>>   I do not see a value in the requirement for an additional scheduler type on routers.
>> It would not affect the performance. Because any new CCA MUST have fairness for one queue anyway for adoption strategy.
>> It just additionally complicated L4S adoption (with rip&replace approach).
>>   PS: everything above is my personal opinion.
>>   Best Regards
>> Eduard Vasilenko
>> Senior Architect
>> Network Algorithm Laboratory
>> Tel: +7(985) 910-1105
