Re: [tsvwg] I-D Action: draft-ietf-tsvwg-nqb-15.txt

Sebastian Moeller <moeller0@gmx.de> Thu, 30 March 2023 10:14 UTC

From: Sebastian Moeller <moeller0@gmx.de>
Date: Thu, 30 Mar 2023 12:13:43 +0200
To: Alex Burr <alex.burr@ealdwulf.org.uk>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>, Greg White <g.white@cablelabs.com>
Message-Id: <B8D8C664-EF35-4F97-86BE-6AD45F20068C@gmx.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/2MCxrUuYOyPIItGxRZmhPrN7BdU>
Subject: Re: [tsvwg] I-D Action: draft-ietf-tsvwg-nqb-15.txt

Hi Alex,

Thanks for the detailed response. See my [SM] comments below (probably too verbose; feel free to address only the points you consider valid or interesting and ignore the rest).

On 30 March 2023 00:24:59 CEST, Alex Burr <alex.burr@ealdwulf.org.uk> wrote:
> Sebastian,
> 
> You write: " NQB very much prioritizes NQB traffic over QB traffic, see my ample posts about how this is, and why this is the ONLY way to deliver lower delay and jitter",  and "Anybody disagreeing that scheduling under load is a zero-sum game please point me to references showing that this is wrong."
> 
> The following thought experiment should illustrate that moving from a single queue, to two queues with a scheduler between them, can be positive sum for latency/delay:

[SM] I respectfully agree. However, that is not exactly the question I asked, and it is not what the NQB PHB mainly aims to achieve:
https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-nqb-17#name-non-queue-building-sender-r
"The purpose of this NQB PHB is to provide a separate queue that enables smooth, low-data-rate, application-limited traffic flows, which would ordinarily share a queue with bursty and capacity-seeking traffic, to avoid the latency, latency variation and loss caused by such traffic. "

	[SM] This seems to describe a system aimed specifically at treating two traffic classes differently, with the goal of "isolating" one from the other. All in all, I maintain that such isolation requires, on average, prioritization of one class over the other, but I concede immediately that "ONLY" is a phrasing that sets me up for a "black swan" counterexample. So I do deserve the one I get delivered here. ;)


	[SM] But even for the intra-queue 'scheduling' (FIFO in our thought experiments), on a drop the dropped packet pays the ultimate price, while packets queued behind it see lower queuing delay. So the zero-sum game happens inside that queue. This is, however, supposedly a rare condition and not the regime NQB is intended for.

> 
> We have two traffic types A and B. In the "control group" case, they are in the same queue. In the test case, we separate them into two queues. The queue for type A offers the same treatment the original single queue. The queue for type B has a different treatment with a shallow queue.
> 
> The scheduler between the two is  Maxwell's Demon. The demon knows exactly which time slots the packets from each type *would have used* if they had been moving through as single queue. It schedules as follows:

	[SM] I would have called that an oracle scheduler, but I like your Maxwell's Demon model better.


> * When a packet of type A would have exited the single queue, the demon lets through a packet from queue A. 
> * When a packet of type B would have exited the single queue, the demon lets through a packet from queue B.
> * If a packet from the appropriate queue is not available, it lets the slot go empty.

	[SM] [Tangent] So this is a non-work-conserving scheduler. Do these actually get used much on the open internet?
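	[SM] The non-work-conserving behaviour is easy to see in a toy simulation (a hypothetical Python sketch of my own, not anything from the draft; function names, packet names, and the slot layout are all made up). The demon serves a slot only if the queue that "owns" it in the counterfactual single-queue run has a packet; otherwise the slot idles:

```python
from collections import deque

def demon_schedule(slot_types, queue_a, queue_b):
    """Non-work-conserving 'Maxwell's demon' scheduler sketch.

    slot_types: which queue would have owned each slot in the
    counterfactual single-queue run ('A' or 'B').
    Returns the packet served in each slot, or None when the
    owning queue is empty and the slot goes idle.
    """
    qa, qb = deque(queue_a), deque(queue_b)
    served = []
    for t in slot_types:
        q = qa if t == 'A' else qb
        served.append(q.popleft() if q else None)  # idle slot if empty
    return served

# Queue B discarded its first packet ('b1', shallow queue), so a later
# B packet moves up into b1's old slot and the last B slot goes idle,
# while the A packets keep exactly their original slots.
out = demon_schedule(['A', 'B', 'A', 'B'],
                     queue_a=['a1', 'a2'],
                     queue_b=['b2'])
print(out)  # ['a1', 'b2', 'a2', None]
```

This matches the theorem: type A's departure times are unchanged by construction, yet b2 still leaves earlier than it would have, and the cost shows up as an idle slot rather than as extra delay for A.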

> 
> Theorem 1: By construction, because Queue A treats packets as before, each packet of type A leaves the scheduler at exactly the same time in both the control case and the test case.
> 
> Your claim, that latency is zero sum, implies that Theorem 1 entails that packets of type B can gain no latency advantage, since packets of type A have had no latency loss. But this is not true.

	[SM] I disagree; in this case we are playing an intra-queue zero-sum game. However, I agree that here the additional latency decrease for the non-dropped B-queue packets does not come at the expense of type-A traffic. (In a work-conserving scheduler the latency reduction would be spread over all queued-up packets that will not be dropped later, so over both the A and B queues.)

	[SM] However, wouldn't the same also be true for e.g. an EF PHB where EF is given 20% of the rate, and both EF and DF have queues sized to the adjusted BDP (a 20/80 queue-size ratio, compared to a single-queue BDP of 100 arbitrary units)? And even in a single-queue FIFO this intra-queue "scheduling" still happens (even in tail-drop queues, although there we do not affect packets already in the queue but those that will be enqueued in the near future).

	[SM] In both cases, EF/DF and NQB/QB, this intra-queue mechanism is in action, and it is very much not the intended mode of operation for achieving lower latency/jitter for one class over another (if only because the state where the queue is in dropping mode should be rare).
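	[SM] To make the intra-queue point concrete (a toy arithmetic sketch of my own, with illustrative packet names and a unit service time; nothing here is from either draft): in a FIFO drained at a fixed rate, dropping one packet reduces the sojourn time only of the packets queued behind it, while everything ahead of it, and everything in any other queue, is untouched.

```python
def sojourn_times(packets, service_time=1.0):
    """Sojourn time of each packet in a FIFO drained at a fixed rate:
    the i-th packet waits for i+1 service times."""
    return {p: (i + 1) * service_time for i, p in enumerate(packets)}

before = sojourn_times(['p1', 'p2', 'p3', 'p4'])
after = sojourn_times(['p1', 'p3', 'p4'])  # 'p2' was dropped

# The drop is zero-sum inside the queue: 'p2' pays the ultimate price,
# packets behind it each gain one service time, 'p1' is unchanged.
assert after['p1'] == before['p1']
assert after['p3'] == before['p3'] - 1.0
assert after['p4'] == before['p4'] - 1.0
```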


> 
> Most obviously, suppose that in the test case Queue B simply discards packets after a time shorter than the sojourn time in Queue A. 

	[SM] OK. Note the same would happen in the 20/80 EF/DF example (assuming there is no sufficient admission control; but such admission control really just moves the drop/no-drop decision out of the EF scheduler to the admission controller, and for the affected flow it seems to matter little where exactly a packet was dropped, no?).

> 
> Lets take the example of an unresponsive, but low bandwidth voip flow. It will build up a short backlog in Queue B, some
> of the earliest packets will be discarded, and then some later packets will start being scheduled in the slots that earlier
> type B packets would have used, and so have lower latency than it had before, all without making any difference to the packets
> of type A.

	[SM] Yes, that would be true. Please note that this is not how NQB is designed, though. The NQB queue is scheduled equally with the QB queue, resulting in our (naturally paced) VoIP flow building no queue until it approaches/exceeds 50% of the capacity (assuming no other traffic in B).

https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-nqb-17#name-relationship-to-the-diffser
"Also, the NQB traffic is to be given a separate queue with priority equal to Default traffic and given no reserved bandwidth other than the bandwidth that it shares with Default traffic. As a result, the NQB PHB does not aim to meet specific application performance requirements. Instead, the goal of the NQB PHB is to provide statistically better loss, latency, and jitter performance for traffic that is itself only an insignificant contributor to those degradations."

	[SM] So I think it is safe to say that intra-queue "prioritization" is not the main way NQB intends to deliver lower latency and jitter.
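	[SM] As a back-of-the-envelope check on the 50% point above (a toy sketch of my own; the function name, the rates, and the equal 50/50 share are illustrative assumptions, not figures from the draft): with an equal-priority two-queue scheduler, the NQB queue is guaranteed half the link whenever the other queue is busy, so a smooth application-limited flow only starts building a standing queue once it exceeds that share.

```python
def nqb_backlog_grows(flow_rate_mbps, link_rate_mbps, share=0.5):
    """With an equal-priority two-queue scheduler, the NQB queue's
    guaranteed share is share * link_rate; a smooth, paced flow only
    starts queuing once it exceeds that share (assuming the other
    queue keeps the link busy)."""
    return flow_rate_mbps > share * link_rate_mbps

assert not nqb_backlog_grows(0.1, 100)  # ~100 kbps VoIP on 100 Mbps: no queue
assert nqb_backlog_grows(60, 100)       # 60 Mbps > the 50 Mbps share: queues
```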

> 
> Similarly, an L4S flow will have a shorter sojourn time in Queue B than it would have had in Queue A, if Queue B applies earlier congestion  signals. 

[SM] Yes, in the dropping regime you can argue that. For L4S traffic what counts is less the congestion signal (necessary) and more the appropriate and timely response (sufficient), but that is nitpicking. However, this is a process internal to the B queue that works in addition to the conditional scheduling between the queues; it would also be in effect in a single queue.

> 
> Now of course, we can't build a scheduler using Maxwell's demon, because it is using counterfactual information. 

[SM] We likely would not want to; as a non-work-conserving scheduler, it could result in lower utilization than possible. It would also not achieve the goals of the NQB PHB, as it essentially levels the jitter inside the B queue without isolating it from the latency and jitter of the A queue.

> But the insight we gain is applicable to real systems: later packets can get earlier time slots without disturbing other flows. 

	[SM] That seems strictly true only if we posit that each queue in your example houses a single flow. Even if the dropped packet and the packet scheduled instead are from the same flow, all other packets in the queue will profit from that "jump at the queue head". Unless you are talking about the L4S signaling, which in the context of NQB seems less relevant.

> We only need counterfactual information to make *exactly* no difference to packets of type A, for the purposes of exposition.

[SM] Yes. But 'no difference' is not a prediction of 'zero-sum game'. Zero-sum game here really only denotes that an egress transmit opportunity can be taken by at most one packet (logically; some link technologies schedule/transmit batches, let's ignore that).
HOWEVER, as I said above, by hyperbolically claiming "ONLY" I set myself up for a correction, and I think you managed to deflate my "ONLY" claim; I concede that point to you.


> 
> 
> Your claim might be true if packets were conserved. But packets are not conserved across a change in queuing treatment, because they can be discarded when they were not before, or the sender can throttle and send fewer packets based on different congestion signals.

	[SM] I do not disagree about the consequences of packet drops, but I argue that they do not substantially change the zero-sum nature of dequeuing/transmission. But again, that is not what I strictly claimed with my "ONLY" above.

> 
> Because latency is not zero sum,

	[SM] See above. Scheduling the transmission opportunity (in a work-conserving fashion) is IMHO still a zero-sum game; you have only two options:
a) select one candidate
b) select another (there might be multiple candidates)

We cannot magically schedule more than one... With a non-work-conserving scheduler we leave the definition of a zero-sum game, in that we can forgo the transmission opportunity completely (thereby having "losses" for all candidates summing to != 0). But I do not see either NQB or, for that matter, L4S proposing non-work-conserving schedulers.
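A minimal sketch of that zero-sum claim (my own toy Python, with made-up packet names; not an implementation of NQB or L4S): a work-conserving two-queue scheduler hands out exactly one packet per backlogged slot, so whatever slot one queue gains, the other necessarily loses.

```python
from collections import deque

def serve(slots, queue_a, queue_b, prefer_b):
    """Work-conserving two-queue scheduler: every slot with backlog
    serves exactly one packet; prefer_b decides which queue wins."""
    qa, qb = deque(queue_a), deque(queue_b)
    out = []
    for _ in range(slots):
        if prefer_b and qb:
            out.append(qb.popleft())
        elif qa:
            out.append(qa.popleft())
        elif qb:
            out.append(qb.popleft())
    return out

# Identical backlog under two policies: B's gain is exactly A's loss.
a_first = serve(4, ['a1', 'a2'], ['b1', 'b2'], prefer_b=False)
b_first = serve(4, ['a1', 'a2'], ['b1', 'b2'], prefer_b=True)
assert a_first == ['a1', 'a2', 'b1', 'b2']
assert b_first == ['b1', 'b2', 'a1', 'a2']
# b1 moves up two slots only because a1 moves back two slots.
```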


> I think your effort to get the NQB doc to say that it prioritises NQB packets is actually the opposite of what you want, if you want to avoid prioritisation. 

	[SM] You might misunderstand my point. I want the draft to drop the claim 'without prioritization'; I do not argue for writing 'using mostly/mild prioritization', even though I think the latter claim is closer to reality. Just look at the DOCSIS spec, which IIRC describes this as 'conditional priority' (in DOCSIS, NQB traffic is recommended to be sorted into the "Low Latency Service Flow").


CM-SP-MULPIv4.0-I05-220328 :
7.7.3.2 Inter-SF Scheduler
As the Dual Queue Coupled AQM architecture provides only one-way coupling from the Classic Service Flow to the Low Latency Service Flow, it relies on the Inter-SF Scheduler to balance this by ensuring that conditional priority is given to the Low Latency Service Flow within the ASF. "Conditional priority" means that traffic of the Low Latency Service Flow will be serviced with a priority, yet without the Classic Service Flow being starved. Weighted Round Robin (WRR) is a simple scheduler that achieves the desired results, and is recommended in [draft-ietf-tsvwg-aqm-dualq-coupled].

For Upstream ASFs, the CMTS MUST implement a weighted scheduler between the Low Latency Service Flow and the Classic Service Flow within the Aggregate Service Flow. Since the WRR algorithm acts on variable-length packets, and the CMTS schedules Upstream Service Flows in terms of minislots, this specification requires a simple "Weighted" scheduler for upstream that assigns minislots for the two Service Flows according to the configured weight.

For Downstream ASFs, the CMTS SHOULD implement a WRR scheduler between the Low Latency Service Flow and the Classic Service Flow within the Aggregate Service Flow.

As discussed in Section 7.7.4.4, the Traffic Priority values for the Classic Service Flow and Low Latency Service Flow do not influence the Inter-SF Scheduler.
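The "conditional priority" WRR in the spec text above can be sketched as follows (a toy Python sketch of my own; the weights, names, and slot count are illustrative assumptions, not values from the DOCSIS spec or the dualq draft). The Low Latency queue is favoured by its larger weight, but the Classic queue is guaranteed its slots and so is never starved:

```python
from collections import deque

def wrr(slots, low_latency, classic, weight_ll=3, weight_c=1):
    """Weighted round robin between a Low Latency and a Classic queue.
    Within each cycle, LL owns weight_ll slots and Classic owns
    weight_c, so LL gets 'conditional priority' without starving
    Classic; empty-queue slots are lent to the other queue
    (work-conserving)."""
    ll, cl = deque(low_latency), deque(classic)
    cycle = ['LL'] * weight_ll + ['C'] * weight_c
    out, i = [], 0
    while len(out) < slots and (ll or cl):
        q = ll if cycle[i % len(cycle)] == 'LL' else cl
        if not q:  # owning queue empty: lend the slot to the other
            q = cl if q is ll else ll
        out.append(q.popleft())
        i += 1
    return out

out = wrr(8, [f'll{i}' for i in range(6)], [f'c{i}' for i in range(6)])
# Classic gets its weighted share (2 of 8 slots) despite LL backlog,
# while LL packets are, on average, served sooner: conditional priority.
assert sum(1 for p in out if p.startswith('c')) == 2
```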


	[SM] So again, not to diminish your successful deflation of my hyperbolic claim: given that the likely first and most widely deployed NQB-aware scheduling solution states that it operates via 'conditional priority', it seems odd to also claim that NQB operates 'without priority'.
	All I am asking for is to correct the claim in the abstract, which could mean e.g.:
a) dropping that claim without replacement
b) changing it to 'without strict prioritization', as conditional priority is not strict priority
c) by adding a definition of what "prioritization" means in the context of the NQB draft

The current text seems sub-optimal to me. I had not expected this request to turn into a long discussion; after all, there are three easy ways to solve this with pure verbiage...

My point is simply that treating one class to lower delay on average than another class is very much the definition of what prioritization entails (which is about deciding on the temporal sequence of actions). So I argue that you cannot achieve the goal without that method: NQB cannot "avoid prioritization" because at its heart it IS a sort of prioritization.


> Because then the specification will  discourage, or even disallow, alternative implementations that are more to your liking.

[SM] This seems to be a pretty strong claim about what dropping 'without prioritization' from the abstract would cause... I am not sure an alternative implementation is going to happen... And yes, even replacing this with an FQ scheduler will result in prioritization of flows below their equitable share over flows at or above their share. So I fail to see alternative implementations that will, on average, deliver "avoid the latency, latency variation and loss caused by such traffic" without at least partly prioritizing other traffic over "such traffic", as I see no other way of isolating these traffic classes. Your argument above is sufficient to shoot down my needlessly hyperbolic "ONLY" claim, but it does not show an alternative implementation that achieves NQB's goals without relying mostly on prioritization, at least according to the dictionary definition of prioritization.

See:
https://dictionary.cambridge.org/de/worterbuch/englisch/prioritize
"to decide which of a group of things are the most important so that you can deal with them first:"

https://www.collinsdictionary.com/dictionary/english/prioritize
"2. If you prioritize the tasks that you have to do, you decide which are the most important and do them first."

https://www.merriam-webster.com/dictionary/prioritize
": to list or rate (projects, goals, etc.) in order of priority"

These definitions seem to fit well with what happens in a 50/50 scheduler selecting from a deep and a shallow queue. Also note that none of these definitions distinguishes between strict and conditional priority.



Sebastian



> 
> Alex
>