Re: [tsvwg] TSVWG: WG adoption of draft-white-tsvwg-nqb!

Sebastian Moeller <moeller0@gmx.de> Fri, 13 September 2019 08:46 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <C3FEFBF9-C090-4232-981F-2DD02F116D31@cablelabs.com>
Date: Fri, 13 Sep 2019 10:46:16 +0200
Cc: "Ruediger.Geib@telekom.de" <Ruediger.Geib@telekom.de>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <F0C44520-543D-47B8-93C2-966772BF3258@gmx.de>
References: <CE03DB3D7B45C245BCA0D24327794936306BBE54@MX307CL04.corp.emc.com> <AA4DBFC5-8D8F-4F43-80E4-BB9BA7F53486@cablelabs.com> <LEJPR01MB1178B6C102455F1F9886D49A9CBB0@LEJPR01MB1178.DEUPRD01.PROD.OUTLOOK.DE> <F351D86E-DCE4-45CD-9B08-2E0C11090BF1@cablelabs.com> <LEJPR01MB11789EE6D8B7C732393BD1439CB70@LEJPR01MB1178.DEUPRD01.PROD.OUTLOOK.DE> <B11AB47E-7E35-4599-A85B-DB0EF883E2B2@cablelabs.com> <BDF260C9-881C-4ACC-AF92-8E99C1CB07E0@gmx.de> <4B5C14EE-B3CF-455B-86C9-67D6E9BAEF4C@cablelabs.com> <40417573-1036-4238-A451-BFA6D8310B20@gmx.de> <E437444E-B896-4BD8-BC3B-01A535A6858D@cablelabs.com> <E35C0C36-9C33-40EF-B7F4-1D3FB508E4CB@gmx.de> <C3FEFBF9-C090-4232-981F-2DD02F116D31@cablelabs.com>
To: Greg White <g.white@cablelabs.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/lgWSXtEV28HoyiqYx9RQn8a62_k>

Hi Greg,



> On Sep 13, 2019, at 01:34, Greg White <g.white@cablelabs.com> wrote:
> 
> Sebastian,
> 
> Since it seems we’re going in circles (or maybe it's a very slowly converging spiral), why don't I make the edits that I think you want to see, and then we can go from there.

	[SM] I am all for that approach. I want to repeat that I realize I am not the arbiter in this matter: convincing me (while nice from my perspective) is not important; convincing the chairs that my objections are baseless or handled well matters more.
	Since I seem to be the main person in this discussion wearing the "how does it look from the receiving end/end-user perspective" hat, I will try to stay maximally critical (but let me repeat that I like the idea in general, modulo the side effects I predict from a few of the detail choices in the draft). That might be annoying on an interpersonal level, but it hopefully leads to a better technical solution.


> 
> Based on your arguments, I've agreed to add some detail in a section on "Applicability" that indicates more concretely that the intent of the NQB marking is not for capacity-seeking traffic.  That text will supplement the existing language in section 3 of the draft that currently reads:
> 
>  There are many applications that send traffic at relatively low data
>   rates and/or in a fairly smooth and consistent manner such that they
>   are highly unlikely to exceed the available capacity of the network
>   path between source and sink.  These applications do not make use of
>   network buffers, but nonetheless can be subjected to packet delay and
>   delay variation as a result of sharing a network buffer with those
>   that do make use of them.  Many of these applications are negatively
>   affected by excessive packet delay and delay variation.  Such
>   applications are ideal candidates to be queued separately from the
>   capacity-seeking applications that are the cause of queue buildup,
>   latency and loss.
> 
>   These Non-queue-building (NQB) flows are typically UDP flows that
>   send traffic at a lower data rate and don't seek the capacity of the
>   link (examples: online games, voice chat, DNS lookups).  Here the
>   data rate is essentially limited by the Application itself.  In
>   contrast, Queue-building (QB) flows include traffic which uses the
>   Traditional TCP or QUIC, with BBR or other TCP congestion
>   controllers.
> 
> If you can point out specifically what in that text led you to believe it was intended for capacity-seeking traffic, I would appreciate it.

	[SM] Given how much language lawyering is applied when parsing RFCs, I would like to see more precision for the following terms:

"relatively low data rates": as an application developer, how do I decide what is relatively low? Or, put differently, relative to what exactly? Also, if I have a paced-UDP based video delivery application that does application-level switches between different bitrates based on experienced buffering/capacity/delay (like a paced DASH equivalent over UDP, or, given that the sending rate for each segment is essentially fixed, even standard paced DASH), does this qualify for NQB or not? I believe the draft should make it crystal clear what can and what cannot be considered NQB material.

"such that they are highly unlikely to exceed the available capacity of the network path between source and sink": while I understand the thought behind this, it is simply impossible for an application to know that, given that "available capacity" is not fixed (also, a sending application will have very little information about the available path capacity when it starts). Maybe I am being too pedantic here, but starting with a wrong premise irks me.

"(examples: online games, voice chat, DNS lookups)": this still includes DNS lookups, which collides with RFC 8325. I understand the appeal of giving DNS lookups low-latency treatment (DNS latency translates directly into browsing latency, albeit only for uncached entries), but keeping DNS here has consequences for the wifi section IMHO.

"Queue-building (QB) flows include traffic which uses the Traditional TCP": why the qualifier? As far as I understand, TCP Prague would also be classified into the QB queue by virtue of being capacity-seeking. In a sense Prague is included in the "other TCP congestion controllers" set, but I would prefer less ambiguous wording like:

"In contrast, Queue-building (QB) flows include all flows which seek to utilize the full capacity of the link, like all TCP flows, as well as many SCTP and UDP flows, like QUIC"
If you want, you can add "all TCPs independent of the used congestion controller".
If you disagree and believe TCP Prague should be allowed in the NQB queue, then at least say so explicitly (but as far as I understand you really want to exclude all TCPs).

Or, put differently, I would like to see a description of the behavior precise enough that I can easily deduce from a packet capture whether a given flow meets the NQB requirements: in a sense, an enumerated list of expected behaviors and of behaviors that disqualify a flow from NQB marking/treatment.
I believe that would be in agreement with the stated goal that "the NQB designation and marking would be intended to convey verifiable traffic behavior, not needs or wants".
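
One way to make "verifiable from a packet capture" concrete is a peak-rate check over a short sliding window. The sketch below is only an illustration: the draft defines no numeric threshold, so max_rate_bps, window_s and the function names are hypothetical, and a real criterion would likely need more than a single rate bound.

```python
from typing import List, Tuple

# One captured packet of a single flow: (timestamp in seconds, size in bytes).
Packet = Tuple[float, int]


def peak_rate_bps(packets: List[Packet], window_s: float) -> float:
    """Highest rate (bits/s) over any window of window_s seconds that starts at a packet."""
    pkts = sorted(packets)
    best_bytes = 0
    window_bytes = 0
    end = 0
    for start in range(len(pkts)):
        # Grow the window to cover every packet in [t_start, t_start + window_s).
        while end < len(pkts) and pkts[end][0] < pkts[start][0] + window_s:
            window_bytes += pkts[end][1]
            end += 1
        best_bytes = max(best_bytes, window_bytes)
        # Drop the first packet before sliding the window start forward.
        window_bytes -= pkts[start][1]
    return best_bytes * 8 / window_s


def looks_nqb(packets: List[Packet], max_rate_bps: float, window_s: float = 0.1) -> bool:
    """Hypothetical criterion: the flow never exceeds max_rate_bps within any window."""
    return peak_rate_bps(packets, window_s) <= max_rate_bps


# A paced flow: 200-byte packets every 20 ms for one second.
paced = [(i * 0.02, 200) for i in range(50)]
# A bursty flow: 100 full-size packets sent back to back.
burst = [(0.0, 1500) for _ in range(100)]

print(looks_nqb(paced, max_rate_bps=200_000))
print(looks_nqb(burst, max_rate_bps=200_000))
```

With the assumed 200 kbit/s bound, the paced flow passes and the burst does not; the point is only that such a rule can be evaluated mechanically from a capture.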



	Conceptually, my next issue is that just trusting the end-points to mark correctly is too optimistic. Any scheduler giving special treatment according to NQB-ness must be required to verify that each individual flow behaves according to the NQB requirements, and ideally re-mark non-compliant flows to CS0 to protect downstream devices (if NQB maps to anything but AC_BE). This requirement is mainly driven by the side effects NQB will have on wifi, so if the special wifi power of NQB goes away, this issue becomes less important to me (also, there are wifi scenarios where queue protection as proposed upstream will not be sufficient).


	Final and strongest objection: incompatibility between current wifi gear and NQB. 

I hope we can all agree that:

a) Wifi is a shared system, not only between up- and downstream traffic between stations and APs, but also because the same frequency bands are often shared between different wifi networks (either directly by using the same center frequency band, or by overlap of the side-bands).

b) no currently deployed wifi gear (NICs and APs) supports special treatment for NQB flows.

c) most deployed gear supports access classes and will in all likelihood use the default DSCP to AC mapping you described in a different post. (as far as I can tell WMM is mandatory for wifi >= 802.11n)

d) The AC priority system is essentially a (weak) precedence system (lower ACs might sneak in a tx_op even with higher AC queues full, but that is going to be rare and can result in multi-second delays for packets in the lower ACs, as observed in my home network). So, in traditional thinking, using ACs requires some sort of access control/rate limiting to avoid undesired starvation issues.

e) For quite a number of end-users, the wifi rate is what limits internet access, not the access link itself (at least for higher-bandwidth internet plans in crowded RF environments like apartment buildings).

f) NQB is not intended to reduce the "performance of QB flows" but rather to isolate well-behaved sparse and/or paced fixed-rate traffic from the transient queue-building effects of non-paced, non-rate-limited flows.

g) NQB is intended to be designed such "that malicious or badly configured nodes can't abuse it."

h) NQB traffic is not rate limited or rate policed. Even if each individual flow is self rate-limiting the aggregate clearly is not intended to be.

i) For QB flows, the QB queue provides better performance (considering latency, loss and throughput) than the NQB queue

j) NQB-marking (hopefully) will one day survive end to end.

Now, from e) it follows that the ISP's upstream NQB-aware AQM will not trigger in quite a number of instances, so making sure that NQB does no harm is up to the wifi gear. Due to h) it is clear that NQB traffic will be able to saturate the wifi link. Together with c) and d) this means that the mapping of NQB to AC by the default rules will define the precedence level of the saturating traffic. Due to a), good wifi citizens try to be considerate and not hog a channel for themselves (and due to CSMA/CA, users on the same channel should stochastically share tx_ops "fairly" in each AC). Most wifi traffic currently uses AC_BE, but due to j) and h) that might change in the future.

I now argue that using anything but AC_BE for NQB will violate f) and g). Essentially in this not unlikely scenario NQB will confer a super-power (lower latency and higher bandwidth) without any checks and balances (incompatible with g)). 
To make it explicit, with ACs > BE in play, AC_BE will essentially only get the left-over bandwidth, which runs counter to i) as now QB flows need to mark themselves such that they also get scheduled into the same AC as NQB to get a level playing field.
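
The "left-over bandwidth" effect can be illustrated with an idealized strict-priority model. This is a simplification made for illustration: real EDCA contention is statistical (the "weak" precedence of d)), and the capacity and load figures below are made-up numbers, not measurements.

```python
from typing import Dict

# Access categories from highest to lowest precedence.
AC_ORDER = ["AC_VO", "AC_VI", "AC_BE", "AC_BK"]


def served_mbps(capacity_mbps: float, offered_mbps: Dict[str, float]) -> Dict[str, float]:
    """Idealized strict priority: each AC is served only from what higher ACs leave over."""
    remaining = capacity_mbps
    served = {}
    for ac in AC_ORDER:
        grant = min(remaining, offered_mbps.get(ac, 0.0))
        served[ac] = grant
        remaining -= grant
    return served


# Hypothetical 50 Mbit/s wifi link; NQB-marked traffic mapped to AC_VI saturates it.
saturated = served_mbps(50.0, {"AC_VI": 60.0, "AC_BE": 40.0})
# Same link with only light traffic in AC_VI.
light = served_mbps(50.0, {"AC_VI": 10.0, "AC_BE": 100.0})

print(saturated)
print(light)
```

In the saturated case AC_BE is granted nothing in this model, which is the starvation scenario described above; in the light case it keeps whatever AC_VI leaves over.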


For me that obviously means that the choice of DSCP for NQB needs to ensure that the above scenario cannot happen.
So I vote to follow Ruediger's proposal for a 000xx1 codepoint for the intended PHB such that the status quo is not negatively affected. With the added bonus that this now has a chance of actually surviving end to end today.
NQB aware APs/NICs can then implement proper NQB handling with or without selective priority boosting to different ACs while making sure no other flows/APs are starved.
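
To illustrate why a 000xx1 codepoint leaves the status quo untouched, here is a minimal sketch. It assumes the common default in which wifi gear derives the 802.11 User Priority (UP) from the three most significant DSCP bits and then applies the standard WMM UP-to-AC table; individual devices may map differently, and the function name default_ac is hypothetical.

```python
# Standard WMM mapping from 802.11 User Priority (0-7) to access category.
UP_TO_AC = {
    0: "AC_BE", 1: "AC_BK", 2: "AC_BK", 3: "AC_BE",
    4: "AC_VI", 5: "AC_VI", 6: "AC_VO", 7: "AC_VO",
}


def default_ac(dscp: int) -> str:
    """Assumed default: UP is the top three bits of the 6-bit DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return UP_TO_AC[dscp >> 3]


# All codepoints matching the bit pattern 000xx1.
candidates = [d for d in range(64) if (d >> 3) == 0 and (d & 1) == 1]

for dscp in candidates:
    print(f"DSCP {dscp:2d} ({dscp:06b}) -> {default_ac(dscp)}")
print(f"DSCP 46 (EF)     -> {default_ac(46)}")
```

Under this assumption DSCPs 1, 3, 5 and 7 all land in AC_BE, while a codepoint such as EF (46) lands in AC_VI.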


Since I have made this point several times without convincing you, I would be happy to hear your rationale for why this scenario is unlikely enough to still justify aiming for a DSCP that maps to an AC > BE.
Ideally that rationale would be more refined than just noting that this kind of abuse is already possible today; the aim should be to at least "do no harm", not just "do not make it significantly worse".


Best Regards
	Sebastian