Re: [iccrg] Will Fair Queuing change Congestion Control?

Hi Maximilian,

> On Jan 3, 2023, at 23:18, Maximilian Bachl <maximilian.bachl@gmail.com> wrote:
> 
> Hi Vidhi,
> 
> Thank you for your comments! 
> 
> Few things to note, FQ expands to Flow Queue (not fair queueing) which means every flow (4 tuple) has a different queue. Priorities are implemented based on service classes, for example. Also, Apple devices are not the most common bottleneck on an end to end path.
> I assumed "flow queuing" and "fair queueing" to be mostly synonymous, as for example the fq_codel manual page in Linux calls it “fair queuing” (https://man.archlinux.org/man/tc-fq_codel.8.en). To clarify, what I meant is “flow queuing”. 

	[SM] I would argue that both flow and fair are commonly used expansions of the F in FQ; one could even argue for calling this flow-fair queueing ;). It just turns out that a number of folks seem to be allergic to the term "fair", hence the recent tendency to call this flow queueing (which admittedly is more precise anyway).

> 
> Even if we assume there is enough deployment of flow queueing in the network, flows can hurt themselves by not responding to growing latency. This problem has been addressed in L4S drafts and I highly recommend reading them. 
> Endpoints could use delay-minimizing congestion control if they could be certain that fair queuing is deployed at the bottleneck of a path. Thus, I think they wouldn’t necessarily hurt themselves. 

	[SM] Here is the rub: for delay-based CC to work, flows need to be sure they are not in competition with "drop-based" CCs, as otherwise they will lose. What L4S actually tries to deliver is a separation of "old" and "new" CCs so that new signaling and CC-response behavior can be used without getting into competition with old-style CC.
However, while in principle this sounds like a viable way forward, the way this was designed and implemented again confounds old and new CC-signaling, as L4S, in a hare-brained move worthy only of ridicule, decided to re-use the existing ECN signaling, and especially to redefine what a CE mark means in a non-backward-compatible manner. IMHO this is not a sign of solid engineering, but what do I know, I am a wet-ware biologist... 

> 
> ECN combined with an immediate AQM combined with Prague congestion control is basically the idea of L4S.
> I think that L4S can improve how congestion control is done. However, from what I understand, it doesn't necessarily isolate flows.

	[SM] Indeed, in its default form it only classifies traffic into old- and new-style CC and schedules flows in one of two queues, in which all packets of a class are essentially in a FIFO: so class-isolation, but no flow-separation. Historically the L4S drafts have been full of arguments why flow-queueing sucks* and needs to be avoided at all costs. These philippics have been excised in later versions and replaced with text that describes FQ as one way to implement an L4S AQM. IMHO this is clearly an improvement over earlier drafts, but given my experience with SQM I would argue that L4S should heavily recommend FQ scheduling and only offer dual-queue scheduling as a band-aid for situations where FQ is impossible for one reason or another. (My rationale for that change is that the most likely deployment of L4S schedulers is at the ISP to end-user edge, where, as the SQM project has shown, FQ-scheduling is a viable option that actually works even when just using old-style "drop-based" CC).

*) Mostly boiling down to the argument that an attacker who can split his/her traffic into multiple flows can get more than his/her fair share of the capacity, a failure mode that I would argue FQ shares with a FIFO. So yes, this is an attack surface FQ does not remedy, but hardly a novel attack surface... To be explicit, this is IMHO not a convincing argument, but again, as a biologist I apparently have little insight into what "engineers" find convincing.
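
To put numbers on the flow-splitting argument, here is a minimal sketch of an idealised flow-fair scheduler in which every backlogged flow gets an equal slice of the capacity. The function name and the sender labels are made up for illustration, and the equal-slice model is an assumption (real FQ implementations and real traffic are messier). The same arithmetic roughly applies to a FIFO if one assumes equally aggressive flows converge to similar rates.

```python
from fractions import Fraction


def per_flow_shares(flows_per_sender):
    """Idealised flow-fair split (an assumption, not a model of any real qdisc).

    flows_per_sender maps a sender label to its number of backlogged flows.
    Every flow gets 1/total of the capacity, so a sender's share is simply
    its flow count divided by the total flow count.
    """
    total = sum(flows_per_sender.values())
    return {sender: Fraction(n, total) for sender, n in flows_per_sender.items()}


# Four senders with one flow each: everybody gets a quarter.
honest = per_flow_shares({"A": 1, "B": 1, "C": 1, "D": 1})

# Sender A splits its traffic into eight flows: A's share grows,
# although no single flow gets more than any other flow.
split = per_flow_shares({"A": 8, "B": 1, "C": 1, "D": 1})

print("A without splitting:", honest["A"])
print("A with 8 flows:     ", split["A"])
```

In this toy model the scheduler is perfectly fair between flows in both cases; what changes is only how many flows each sender brings.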

> Thus a bad faith actor could get an unfairly large share of bandwidth, which couldn't happen when FQ were deployed.

	[SM] Mostly yes, except if the attacker can split his/her traffic into multiple different flows. Cake offers an elegant way around this dilemma by offering an additional layer of isolation, where it first splits capacity equitably between internal IP addresses, thereby restricting the flow-inflation problem to the traffic mix sent to individual computers.
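
The two-level idea can be sketched as follows: capacity is first divided equally between active internal hosts, and only then between the flows of each host. This is an idealised illustration under that assumption, not Cake's actual algorithm; the function name, host names and sender labels are all hypothetical.

```python
from fractions import Fraction


def host_then_flow_shares(flows_by_host):
    """Idealised two-level split: equal per internal host, then equal per flow.

    flows_by_host maps an internal host to {sender: number_of_flows}.
    Returns {host: {sender: share_of_total_capacity}}.
    Hosts without any flows are ignored.
    """
    active = {h: s for h, s in flows_by_host.items() if sum(s.values()) > 0}
    host_share = Fraction(1, len(active))
    result = {}
    for host, senders in active.items():
        flows = sum(senders.values())
        result[host] = {
            sender: host_share * Fraction(n, flows)
            for sender, n in senders.items()
        }
    return result


# An attacker opens 8 flows towards the laptop, competing with one video flow
# there, while the phone has a single VoIP flow.
shares = host_then_flow_shares({
    "laptop": {"attacker": 8, "video": 1},
    "phone": {"voip": 1},
})

for host, senders in shares.items():
    for sender, share in senders.items():
        print(host, sender, share)
```

In this model the phone keeps half of the capacity no matter how many flows are aimed at the laptop; the flow inflation only hurts the other traffic destined to the laptop itself.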

> With FQ being deployed on a path, each flow could also choose its own congestion control and whether it wants to prioritize low delay or maximum throughput. That's why I think that exciting possibilities open up when more vendors deploy FQ like Apple did. 

	[SM] However, Apple's fq_codel seems to differ from RFC fq_codel in several ways, so I am not sure (in that I personally do not know the details) whether Apple's deployment actually helps.

Regards
	Sebastian

> 
> Regards,
> Max
> 
> On Tue 3. Jan 2023 at 20:45, Vidhi Goel <vidhi_goel@apple.com> wrote:
> Hello Max,
> 
> 
> Few things to note, FQ expands to Flow Queue (not fair queueing) which means every flow (4 tuple) has a different queue. Priorities are implemented based on service classes, for example. Also, Apple devices are not the most common bottleneck on an end to end path.
> 
> Even if we assume there is enough deployment of flow queueing in the network, flows can hurt themselves by not responding to growing latency. This problem has been addressed in L4S drafts and I highly recommend reading them. ECN combined with an immediate AQM combined with Prague congestion control is basically the idea of L4S.
> 
> Vidhi
> 
>> On Jan 3, 2023, at 9:20 AM, Maximilian Bachl <maximilian.bachl@gmail.com> wrote:
>> 
>> Apple introduced fair queuing (FQ) on more than a billion devices in recent years (https://blog.cerowrt.org/post/state_of_fq_codel/). I wonder if this could have an impact on congestion control. When there’s fair queuing at the bottleneck link of a path, the endpoints can use delay-minimizing congestion control and don’t have to worry about more aggressive senders. 
>> 
>> However, currently, there’s no established mechanism for endpoints to know of the existence of fair queuing on a path. There are some possibilities, though, how endpoints can be made aware of the existence of fair queuing: 
>> 	• A dedicated bit in some packet header. For example, some ECN or DSCP bits could be repurposed to indicate the existence of fair queuing at the bottleneck. Drawback: It’s hard to convince people to repurpose/introduce new bits and switches/routers have to support it. 
>> 	• An end-host-based mechanism to detect fair queuing without the involvement of switches/routers. I created a proof-of-concept implementation of such a mechanism, which can detect the presence/absence of fair queuing with an accuracy of > 95% (https://arxiv.org/abs/2206.10561).
>> 	• Closed ecosystems in which all components are aware that there is fair queuing: As a hypothetical example, let’s assume a user plays a game on nvidia’s cloud-gaming platform on an iPhone. The phone is connected to T-Mobile’s 5G network. Both the client on the iPhone as well as the server in nvidia’s network know that there’s fair queuing on the path between the client and the server, since they have an agreement with T-Mobile, which controls the entire path. No extra bits in packet headers are needed, just mutual agreement between the user’s device, the cellular network provider as well as the server on the internet. 
>> 
>> If you have some opinions on whether fair queuing is going to change how we do congestion control, I’d be curious. 
>> 
>> Regards,
>> Max
>> _______________________________________________
>> iccrg mailing list
>> iccrg@irtf.org
>> https://www.irtf.org/mailman/listinfo/iccrg