Re: [IPv6] 6MAN: looking for feedback to draft-eckert-6man-qos-exthdr-discuss (Re: New Version Notification for ...)

Hi Toerless,
I hope you are aware that Tofino is history. That business model (a really flexible data plane) has failed.

I have to disappoint you: latency does not depend so much on AQM (Active Queue Management). It depends more on the CCA (Congestion Control Algorithm).
For example, CUBIC (still the default in almost all OSes since 2006) would consume *ALL* buffers on the bottleneck link, driving the so-called "bufferbloat" (spotted around 2010-2012, with big noise on the Internet).
If your minimum RTT is 5ms (speed of light) but the buffer on the bottleneck link is 50ms, then you would get latency close to 10x the minimum. Not much can be done on routers in transit; it depends more on CUBIC.
In the latest presentation, Google showed a test where the bottleneck link buffer was 2.5s. And yes, CUBIC did drive the latency to 2.5s!
Google started to solve "CUBIC bufferbloat" in 2013 and has finished just now.
I am not aware of a good concise overview of the result (BBRv2); it is spread over many documents on the Internet (primarily presentations at different IETF meetings).
But Google's intermediate achievement (BBRv1 in 2016 solved only Google's problems; BBRv1 was not general enough) is published here: https://dl.acm.org/doi/10.1145/3012426.3022184
Read it to understand why the CCA has much more influence on many QoS parameters than AQM. Definitely, latency and throughput depend more on the CCA.
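
To put numbers on the buffer arithmetic above, here is a minimal sketch. It assumes a sender that keeps the bottleneck queue full (as described for CUBIC) and ignores everything else on the path. The function names are hypothetical and only for illustration.

```python
def buffer_depth_ms(buffer_bytes: int, link_bps: float) -> float:
    """Time needed to drain a full buffer at the bottleneck link rate."""
    return buffer_bytes * 8 * 1000.0 / link_bps


def loaded_rtt_ms(min_rtt_ms: float, buffer_ms: float) -> float:
    """RTT once a buffer-filling sender keeps the bottleneck queue full."""
    # Assumption: queueing delay simply adds to the propagation RTT.
    return min_rtt_ms + buffer_ms


def inflation_factor(min_rtt_ms: float, buffer_ms: float) -> float:
    """How many times larger the loaded RTT is than the minimum RTT."""
    return loaded_rtt_ms(min_rtt_ms, buffer_ms) / min_rtt_ms


if __name__ == "__main__":
    # 5 ms minimum RTT, 50 ms of buffer on the bottleneck link.
    print(loaded_rtt_ms(5.0, 50.0))
    print(inflation_factor(5.0, 50.0))
    # A 625,000-byte buffer on a 100 Mbit/s link, expressed in ms.
    print(buffer_depth_ms(625_000, 100e6))
```

With the 5ms/50ms figures the loaded RTT comes out at 55ms, that is 11x the minimum once the base RTT is counted, in line with the "close to 10x" estimate.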

Or you could look at how Microsoft has modified CCA to get lossless transmission: https://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p523.pdf
They did touch AQM too (ECN is mandatory for the solution), but the primary art is on the CCA's side.

Hence, the discussion of latency or packet loss or throughput in the context of AQM-only looks funny.

BBRv2 is finishing the world transition: https://www.rbftpnetworks.com/blog/bbr-congestion-control-for-improved-transfer-throughput-complete-guide/
It would be good to hear how C-SCORE, TCQF, TFQ, EDF, CSQF, gLBF, DPS, LBF, or whatever else
would help BBRv2 to improve latency, throughput, or whatever.

Of course, anybody is welcome to propose something better than BBRv2 in a package with a new super-AQM.
IMHO: AQM discussion separated from CCA is pretty useless.

The market is reluctant to implement even the bare minimum: ECN is still disabled by default on the majority of OSes, despite the IETF call to activate it (RFC 7567).
ECN could permit BBR to stop guessing the buffer load from jitter; the buffer load could be clearly signaled.
But even this minimum has not happened yet.

> But also remember that QoS is not only about CC. DetNet is about admission controlled forwarding
I do not understand the value of a scheduler in admission control.
Admission control was always a policer.
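
For reference, a minimal sketch of what "policer" means here: a single-rate token bucket that only decides conform/exceed and never queues or reorders packets. The class name, parameters, and explicit `now` argument are hypothetical choices for illustration, not taken from any particular product.

```python
class TokenBucketPolicer:
    """Single-rate token-bucket policer: decides conform/exceed, never queues."""

    def __init__(self, rate_bps: float, burst_bytes: int) -> None:
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = float(burst_bytes)
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last_time = 0.0

    def conforms(self, now: float, packet_bytes: int) -> bool:
        """Return True if the packet is within the admitted rate/burst."""
        # Refill tokens for the time elapsed since the previous packet.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # out of profile: drop or remark, no scheduling involved


if __name__ == "__main__":
    policer = TokenBucketPolicer(rate_bps=8000, burst_bytes=1500)
    for t in (0.0, 0.1, 1.6):
        print(t, policer.conforms(t, 1500))
```

The point of the sketch is that the decision is purely per packet against a rate and burst; there is no queue and no departure ordering.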

Eduard
-----Original Message-----
From: Toerless Eckert <tte@cs.fau.de> 
Sent: Tuesday, March 5, 2024 21:08
To: Vasilenko Eduard <vasilenko.eduard@huawei.com>
Cc: ipv6@ietf.org; draft-eckert-6man-qos-exthdr-discuss@ietf.org
Subject: Re: [IPv6] 6MAN: looking for feedback to draft-eckert-6man-qos-exthdr-discuss (Re: New Version Notification for ...)

On Tue, Mar 05, 2024 at 07:41:44AM +0000, Vasilenko Eduard wrote:
> Hi Toerless, a few comments.
> 
> 1. The first comment is on scale.
> I never understood all these attempts to greatly expand the DSCP scale. What is the point of having more bits for QoS signaling if it would not be supported in hardware?

- More than 80% of IP networks are not just for Internet access/transit but are for all
  types of controlled network purposes - enterprise, IoT, media, industrial, converged SP cores.

- Advanced QoS functions would initially primarily target those networks

- I am currently looking mostly into high-speed, but of course especially with mesh/radio networks
  there is also a lot of interest in low-speed (e.g.: RAW-WG merged into DETNET-WG for this purpose).

- The new QoS functions are meant to be used because they are then supported by HW.

> 6 bits permit 64 schedulers/queues per virtual interface (BRAS could have 1/4M of schedulers per line card but it is still 8+ schedulers per subscriber virtual interface).
> Do you know any switch or router that has more schedulers per virtual or physical interface?

There are various queuing schemes originating from IEEE/TSN that go beyond the traditional queuing of the typical SP routers you are referring to. Those are all supported/supportable at 1/10Gbps, which is currently the core focus of TSN. DetNet is looking into higher speed (>= 100Gbps) and wide-area.

For example, one of our proposals to DetNet, draft-eckert-detnet-tcqf (TCQF), was hardware validated with standard metro routers (100Gbps interface) in a 2000km deployment. The draft has references to those HW/deployment validations and also to scale simulation papers.

More generically, we are evolving into the time of programmable QoS.

At speeds <= 100Gbps, we do see this with SmartNICs in DC, which today of course are mostly used for stuff like iOAM and just acceleration of existing QoS like CC, but it also enables broader QoS which would then benefit from new headers.

In FPE, we are getting to the point of scalable programmable QoS driven by packet metadata. The FPE can, with simple calculations (often validated via P4 on Tofino), calculate scheduling parameters such as a target departure time. "Push-In-First-Out" queues can then support that flexible scheduling.
This approach can hence programmably provide for a wide range of previously one-off HW scheduling algorithms.
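
As a rough software illustration of the FPE-plus-PIFO split described above: a per-packet function computes a rank (here a target departure time) and a PIFO inserts by rank and only ever dequeues from the head. This is a sketch under assumptions: a binary heap stands in for the hardware PIFO, and the rank function (arrival time plus a latency budget carried as metadata) is a hypothetical example, not one of the draft's methods.

```python
import heapq
import itertools


class PIFO:
    """Push-In-First-Out queue: insert by rank, dequeue only from the head."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()  # FIFO tie-break for equal ranks

    def push(self, rank: float, packet: str) -> None:
        heapq.heappush(self._heap, (rank, next(self._seq), packet))

    def pop(self) -> str:
        _rank, _seq, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self) -> int:
        return len(self._heap)


def target_departure_time(arrival_time: float, latency_budget: float) -> float:
    """Hypothetical rank: arrival time plus a budget taken from packet metadata."""
    return arrival_time + latency_budget


if __name__ == "__main__":
    q = PIFO()
    # (packet, arrival time, per-packet latency budget from the header)
    for name, arrival, budget in (("A", 0.0, 5.0), ("B", 1.0, 1.0), ("C", 2.0, 3.0)):
        q.push(target_departure_time(arrival, budget), name)
    while len(q):
        print(q.pop())
```

Swapping in a different rank function changes the scheduling behaviour without touching the queue, which is the flexibility argument made above.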

> Any plans to produce it? Why? What is the business case?

The HW for TCQF is AFAIK available, but given the absence of agreement on a header, i am not sure if there is a feature offering in the router products with that HW. Likewise, i would really like to be able to do all of TSN QoS in IP networks, as opposed to having all those workarounds for how to tunnel IP on top of an L2-only 802.1x ethernet-switching (and bridging ;-) network. That would mean bringing all this functionality to the IP level. A lot of that could likely work without new extension headers, but IMHO not all.

More generally: We did provide extension headers to allow new work/innovation to happen. But the gap between rfc8200 and getting to a workable extension header when you're just an industry/router expert for QoS is too big. I think we need to close that gap with such a document to make it easier for multiple per-industry, per-research QoS methods to be definable.

> Middlebox could be software-based. It could have an unlimited number of schedulers. But it is just 1 hop from 7.
> What is the value if only the middlebox would have >64 schedulers per virtual interface?

One of the generic schemes is to not have per-flow state. A lot of the solutions with many queues end up derailing into per-flow queues, and often this requires per-queue management plane work, which does not scale. So many of these newer algorithms work on achieving per-flow benefits without per-flow state. See for example the 23-year-old DPS i included in the draft. It replaces a per-flow WFQ with one additional parameter in an appropriate extension header and a more intelligent single-queue algorithm, which i think today could also be done in hardware.
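
To make the "one additional parameter, single queue" idea concrete, here is a minimal sketch. It assumes the core-stateless fair queueing style of using DPS: the edge labels each packet with its flow's estimated rate, and a core router compares that label with its current fair-share estimate and drops probabilistically, keeping no per-flow state. The function names are hypothetical, and how the fair share itself is estimated is left out.

```python
import random


def drop_probability(labeled_rate: float, fair_share: float) -> float:
    """Drop probability for a packet whose header carries its flow's rate."""
    if labeled_rate <= 0:
        return 0.0
    # Flows at or below the fair share are never dropped.
    return max(0.0, 1.0 - fair_share / labeled_rate)


def expected_forwarded_rate(labeled_rate: float, fair_share: float) -> float:
    """Average rate of a flow after the probabilistic drop."""
    return labeled_rate * (1.0 - drop_probability(labeled_rate, fair_share))


def forward(labeled_rate: float, fair_share: float, rng: random.Random) -> bool:
    """Per-packet decision using only the packet label and one router variable."""
    return rng.random() >= drop_probability(labeled_rate, fair_share)


if __name__ == "__main__":
    fair_share = 5.0  # assumed fair-share estimate, same unit as the labels
    for rate in (2.0, 10.0, 20.0):
        print(rate, drop_probability(rate, fair_share),
              expected_forwarded_rate(rate, fair_share))
```

Flows labeled above the fair share are cut back to it on average, and flows below it pass untouched, which is the per-flow benefit that WFQ would otherwise need per-flow queues for.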

> 2. The second comment on the functionality.
> Indeed, some schedulers could benefit from additional information (like "packet depreciation time").
> Looking at the draft, it looks like functionality is the reason for the DSCP extension.
> IMHO: this reason is valid.
> But I have a doubt that it is possible to guess what particular type of schedulers the market would accept. The strong business case is needed for data plane extension.

One core point of my proposal is to make experimentation easier. Programmable FPE plus PIFO in the ASIC seems like one very flexible next step, but there can of course be others.
I just don't want IP itself to be the roadblock for innovation, which wrt. QoS i think it currently is.

"I think there is a world market for maybe 8 bit of QoS markings" - IETF

(noo... we want more ;-))

> And I do not see any discussion on BBR/CUBIC/DCQCN in the document.

Yes, right now most of my examples are from the DetNet side. I'd be happy to add more from the congestion control side, ideally from folks wanting to see them included. But i have not tracked what the recent ideas are for per-packet metadata to improve them.

Last time i checked:

- more than 1 bit of congestion feedback (ECN) helps faster CC
  (but not sure what the latest thinking is here - i only remember research experiments as old as 10
   years showing this).

- Metadata to support upspeeding faster would be lovely (aka: when there is now more free bandwidth than before)
  (but likewise, i am not sure what good proposals here are. I know we did experiment with
   some indicators for available bandwidth, but that was in CPU forwarding).

> QoS is more dependent on the host's CCA than on the router's AQM, but both have influence.

Depends on where the bottleneck is i guess.

But also remember that QoS is not only about CC. DetNet is about admission controlled forwarding, and managing latency can also include CC or admission controlled aspects. 

Finally, there is also all the loss protection QoS, which we call PREOF in DetNet, and which also requires additional metadata in packet headers - see the draft.

> It is strange to see only an AQM discussion without any reference to the CCA assumed. IMHO: such discussion has no value.

As said above: happy to include other references/examples where additional metadata in packet headers is beneficial. The DPS example i include is such a CCA.

> 3. New IPv6 header.
> The only motivation is that some routers would bypass the QoS header.

The motivation is for the header to hold the additional required metadata for the CCA if it's a CCA the header is supporting. Else it's supporting latency control or loss control algorithms.
Aka it's all parameters for per-hop QoS algorithms (or in other cases end-to-end algorithms).

> Looks strange because it would additionally complicate CCA-to-AQM interaction if AQM would be different on different routers.

No, not different on different routers. Different for different type of packets.
A packet (or likely all packets of a class of flows) will use one particular method of the new QoS header, and the method defines the metadata. All routers along the path should support that method. Depending on the algorithm, this must be full support hop-by-hop or, with simpler goals (like CCA), it could be partial support. For example, ECN does not need full support on every hop to be useful. Likewise, metadata that carries a more-than-one-bit congestion indication would be beneficial with incremental deployment.

> IMHO: The thing that looks reasonable is the request to "add headers in transit". Indeed, tunnels inside tunnels do not look good.

Well. That's a generic issue with IPv6 extension headers. I tried to stay away from this discussion when it was fought over SRH. I do agree it would be nice to be able to add such headers in transit, but for QoS i think there are a lot of use-cases where the inability to do this with current rfc8200 rules is not a problem. So if someone wants to take the lead in expanding the ability of IPv6 to add headers, i'll be happy to support it, but i don't think i need to drive it for this QoS purpose.

Cheers
   Toerless

> Eduard
> -----Original Message-----
> From: ipv6 <ipv6-bounces@ietf.org> On Behalf Of Toerless Eckert
> Sent: Monday, March 4, 2024 23:37
> To: ipv6@ietf.org
> Cc: draft-eckert-6man-qos-exthdr-discuss@ietf.org
> Subject: [IPv6] 6MAN: looking for feedback to draft-eckert-6man-qos-exthdr-discuss (Re: New Version Notification for ...)
> 
> Dear 6MAN-WG:
> 
> I have just posted an extremely rough draft draft-eckert-6man-qos-exthdr-discuss, to help start a discussion about common IPv6 extension headers for (mostly) stateless QoS beyond what we can do with just DSCP.
> Right now this is a discussion draft not intended to become RFC because it's my impression that the 6MAN community might benefit from some useful summary of how DetNet (and potentially other WGs) might use this work, but this would not be part of a final spec draft, and likewise i have a wide range of open questions instead of answers, and i included those questions into the draft seeking for feedback from 6MAN. 
> 
> Overall, i didn't want to go down a possible rabbit hole of working on details of the spec if it just turns out to involve insurmountable IETF process obstacles to go this route. For example, we could continue to standardize all advanced forwarding functions only into MPLS and ignore IPv6 as DetNet has done so far (*mumble ;-).
> 
> The lack of such extension headers has IMHO held back innovation into better (stateless) QoS, especially in many controlled networks, for at least 25 years, for example when draft-stoica-diffserv-dps was abandoned because it was too painful trying to get through all the IETF IPv6 bureaucracy - for just one algorithm, when there are so many that would deserve experimentation in specific networks. But given the good recent/ongoing work, for example on I-D.ietf-6man-hbh-processing, i would hope that we're closer now to actually wanting our extensibility of IPv6 to actually be used by the industry (instead of all this happening only in MPLS).
> 
> With DetNet we are also in the situation that we have multiple candidates on the table, and IMHO it will not be very useful trying to run a lottery for a single "winner" and standardize just that.
> 
> I have seen a lot more success in the industry by just letting different algorithms compete with each other in products and letting the market decide. A lot of that was happening in e.g. packet scheduling in routers at least since the end of the 90s, when in my impression every new hardware forwarding router implemented its own new packet scheduler based on the just-hired lead engineer's PhD thesis. And over a period of 20 years, a lot of commonality and industry knowledge evolved in that space. For this type of scheduling, this innovation was possible because it did not require new packet headers, but just a lot of (ab)use of DSCP and/or more or less horrendous QoS configurations. But for those solutions that do require additional in-packet QoS metadata, we never created a viable method where it was easy for the innovators/implementers to concentrate on the novelties of the algorithm in question and have all the knucklehead "how to packetize and what generic requirements/functionalities" work be provided as much as possible by an existing framework/RFC.
> 
> So, i'd be very happy to find interest to help progress this work, aka: writing something that ultimately would become a draft-ietf-6man-common-qos-exthr or the like. I have tentatively asked for a slot for IETF119 6MAN to present and get feedback, if you think that would be time well spent, pls. chime in.
> 
> Cheers
>     Toerless, for the authors
> 
> On Mon, Mar 04, 2024 at 12:30:53PM -0800, internet-drafts@ietf.org wrote:
> > A new version of Internet-Draft
> > draft-eckert-6man-qos-exthdr-discuss-00.txt
> > has been successfully submitted by Toerless Eckert and posted to the 
> > IETF repository.
> > 
> > Name:     draft-eckert-6man-qos-exthdr-discuss
> > Revision: 00
> > Title:    Considerations for common QoS IPv6 extension header(s)
> > Date:     2024-03-04
> > Group:    Individual Submission
> > Pages:    27
> > URL:      https://www.ietf.org/archive/id/draft-eckert-6man-qos-exthdr-discuss-00.txt
> > Status:   https://datatracker.ietf.org/doc/draft-eckert-6man-qos-exthdr-discuss/
> > HTMLized: https://datatracker.ietf.org/doc/html/draft-eckert-6man-qos-exthdr-discuss
> > 
> > 
> > Abstract:
> > 
> >    This document is written to start a discussion and collect opinions
> >    and answers to questions raised in this document on the issue of
> >    defining IPv6 extension headers for DETNET-WG functionality with
> >    IPv6.
> > 
> > 
> > 
> > The IETF Secretariat
> 

--
---
tte@cs.fau.de