Re: [tsvwg] TSVWG: WG adoption of draft-white-tsvwg-nqb!

Sebastian Moeller <moeller0@gmx.de> Thu, 12 September 2019 08:20 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <E437444E-B896-4BD8-BC3B-01A535A6858D@cablelabs.com>
Date: Thu, 12 Sep 2019 10:19:45 +0200
Cc: "Ruediger.Geib@telekom.de" <Ruediger.Geib@telekom.de>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <E35C0C36-9C33-40EF-B7F4-1D3FB508E4CB@gmx.de>
References: <CE03DB3D7B45C245BCA0D24327794936306BBE54@MX307CL04.corp.emc.com> <AA4DBFC5-8D8F-4F43-80E4-BB9BA7F53486@cablelabs.com> <LEJPR01MB1178B6C102455F1F9886D49A9CBB0@LEJPR01MB1178.DEUPRD01.PROD.OUTLOOK.DE> <F351D86E-DCE4-45CD-9B08-2E0C11090BF1@cablelabs.com> <LEJPR01MB11789EE6D8B7C732393BD1439CB70@LEJPR01MB1178.DEUPRD01.PROD.OUTLOOK.DE> <B11AB47E-7E35-4599-A85B-DB0EF883E2B2@cablelabs.com> <BDF260C9-881C-4ACC-AF92-8E99C1CB07E0@gmx.de> <4B5C14EE-B3CF-455B-86C9-67D6E9BAEF4C@cablelabs.com> <40417573-1036-4238-A451-BFA6D8310B20@gmx.de> <E437444E-B896-4BD8-BC3B-01A535A6858D@cablelabs.com>
To: Greg White <g.white@CableLabs.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ysv-0Jc9TVzb7mE7TqDht4N3aF4>
Subject: Re: [tsvwg] TSVWG: WG adoption of draft-white-tsvwg-nqb!

Hi Greg,


> On Sep 12, 2019, at 07:21, Greg White <g.white@CableLabs.com> wrote:
> 
> See [GW2]
> 
> On 9/11/19, 5:44 AM, "Sebastian Moeller" <moeller0@gmx.de> wrote:
>> [GW] What validation is done today for applications that mark their packets as AC_VO or AC_VI?    
>    	[SM] As far as I know almost no validation, but due to the traditional DSCP bleaching over the internet and the reluctance of most applications to twiddle with DSCP bits, most traffic will be in AC_BE, admittedly by chance rather than by design. In addition the IEEE's default DSCP to AC mappings are quite conservative (like mapping EF to AC_VI, not AC_VO), making high volume traffic in the higher ACs relatively unlikely with such DSCPs that might survive internet passage. But with NQB's desirable approach to be transmitted e2e, that is going to change.
> 
> 
> [GW2] According to one recent study, roughly 50% of ASes bleach DSCP, so yes a decent percentage of traffic that traverses an interconnect before reaching the WiFi link will end up in AC_BE, but other traffic will not.  

	[SM2] Mmmh, let's say a typical path traverses at least 2-3 ASs; with three ASs the chance of a marking surviving is 100*0.5*0.5*0.5 = 12.5 %, meaning that 87.5% of packets will be bleached.
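
The arithmetic above can be sketched as follows. This is a minimal illustration, assuming that each AS on the path bleaches independently and with the same probability (the study's 50% figure is a fraction of ASes, so treating it as a per-hop probability is itself an assumption); the function name is hypothetical.

```python
def dscp_survival_probability(bleach_fraction: float, as_hops: int) -> float:
    """Probability that a DSCP marking survives `as_hops` ASes.

    Assumes every AS bleaches independently with probability
    `bleach_fraction` (an illustrative simplification).
    """
    if not 0.0 <= bleach_fraction <= 1.0:
        raise ValueError("bleach_fraction must be within [0, 1]")
    if as_hops < 0:
        raise ValueError("as_hops must be non-negative")
    return (1.0 - bleach_fraction) ** as_hops


if __name__ == "__main__":
    for hops in (1, 2, 3):
        survive = dscp_survival_probability(0.5, hops)
        print(f"{hops} AS hop(s): {survive:.1%} survive, {1 - survive:.1%} bleached")
```

With two ASs instead of three, the same assumption gives 25% surviving, so the 2-3 AS range corresponds to roughly 75-87.5% of markings being bleached.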

> None of the traffic generated by local devices on the WiFi network gets bleached.  

	[SM2] Sure, but almost no user knows how to actually modify DSCPs for sending applications, and very few applications use something other than BE. I know this anecdotally because I help support sqm-scripts users who run into exactly these issues when they actively want to change the DSCPs.

> All OSes that I'm aware of provide open APIs for applications to select their desired DSCP.

	[SM2] But only a few applications take advantage of that.
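
For reference, the kind of API call under discussion looks roughly like this on POSIX-like systems. It is a sketch only: the helper name is hypothetical, it covers IPv4 UDP only, and whether the marking is honored or silently ignored depends on the OS and its policy (see the Windows remarks in this thread).

```python
import socket


def open_udp_socket_with_dscp(dscp: int) -> socket.socket:
    """Return an IPv4 UDP socket whose outgoing packets request `dscp`.

    The DSCP occupies the upper six bits of the former TOS byte, so it
    is shifted left by two; the two ECN bits are left at zero.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be within 0..63")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
    return sock


if __name__ == "__main__":
    s = open_udp_socket_with_dscp(46)  # 46 is EF
    print("TOS byte now:", s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))
    s.close()
```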

>  In the case of Windows, applications (unless they are privileged) are limited to selecting from a set of 4 DSCPs that map to the 4 WMM Access Categories.  

	[SM2] A privileged user can override that on the command line, but knowledge about that fact is scarce (in a Windows domain, the policy editor allows configuration via GUI, but typical end users need to rely on the CLI).

> Applications do use these, without validation or policing (at least in residential environments, in enterprise WiFi systems it may be a different story).

	[SM2] In competently managed enterprise environments I would assume that the admin uses the qos_map feature to tell clients which DSCP to AC mapping to use, but for home networks it is going to be the defaults, as no commercial router I know allows editing qos_map (nor does OpenWrt).

>  In my experience approximately 10% of packets are sent as non-AC_BE (~4% BK, ~1% VI, ~5% VO) in current WiFi networks, though I'm sure there is wide variability depending on the applications in use.  

	[SM2] So as long as NQB is not going to change that ratio considerably, not much should change. Unfortunately the current NQB draft does not contain language to assure that this is the case. Changing these ratios significantly will basically take us into untested territory where new, unexpected pathologies abound (in my testing the macbook's AC_VO completely dominated the AP's attempts to get packets into AC_VO, potentially indicating different interpretations of the WMM spec). I want to stress that I base my reservations on real data here, not on mere theoretical considerations.


> Moreover, I would find it hard to believe that either RFC8325 or the default mappings were defined under the assumption that ALL traffic arriving at a WiFi AP on its WAN port would be guaranteed to be default DSCP.  

	[SM2] I agree, but I also do not see in rfc8325 links to any empirical studies on the effect of the proposed DSCP to AC mapping on aggregate throughput, fairness and robustness against starvation. But rfc8325 does seem to recommend that ACs > BE be used cautiously and that their use be controlled:

"8.2.  Security Recommendations for WLAN QoS
   The wireless LAN presents a unique DoS attack vector, as endpoint
   devices contend for the shared media on a completely egalitarian
   basis with the network (as represented by the AP).  This means that
   any wireless client could potentially monopolize the air by sending
   packets marked to preferred UP values (i.e., UP values 4-7) in the
   upstream direction.  Similarly, airtime could be monopolized if
   excessive amounts of downstream traffic were marked/mapped to these
   same preferred UP values.  As such, the ability to mark/map to these
   preferred UP values (of UP 4-7) should be controlled.
[...]
Finally, it should be noted that the recommendations put forward in
   this document are not intended to address all attack vectors
   leveraging QoS marking abuse.  Mechanisms that may further help
   mitigate security risks of both wired and wireless networks deploying
   QoS include strong device- and/or user-authentication, access-
   control, rate-limiting, control-plane policing, encryption, and other
   techniques; however, the implementation recommendations for such

The whole section seems relevant in that it explicitly tackles the starvation issues I mentioned and recommends taking active measures against getting into pathological conditions.


> Given this, I see no justifiable reason for NQB to map to AC_BE in traditional WiFi gear.  

	[SM2] Well, I fail to see in NQB the mitigation techniques that rfc8325 discusses, so I do not see it meriting anything but AC_BE, given that you explicitly do not want to bandwidth-constrain the NQB queue and that you want to lump quite different traffic types into that class.


> The traffic expected to be labeled NQB is sparse unresponsive traffic or smooth low-data rate flows, which is compatible with AC_VO.  

	[SM2] Great, I believe the "low-data rate" aspect has not been sufficiently stressed in the draft yet. That still leaves the issue that NQB type traffic is potentially allowed 100% of the bandwidth, and in that case each individual flow might be justified in using AC_VI but the whole aggregate will not be.


> We've proposed a DSCP that, conservatively, aligns with the mapping of CS4, AF41, AF42, AF43, CS5, VA and EF to AC_VI in default mapping equipment, and aligns with the mapping of VA, EF and CS6 to AC_VO in RFC8325 equipment.  

	[SM2] For end users there is no RFC8325 equipment; these devices so far default to the IEEE defaults (or typically the Linux defaults). I am not saying RFC8325 is bad, but it certainly is not well tested in the real world.
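
To make the "default mapping" concrete, here is a small sketch. It assumes the common default in which the three most significant DSCP bits are used as the 802.11 User Priority (UP), followed by the standard WMM UP-to-AC grouping; the RFC8325 mapping is deliberately not modeled, and all names in the code are illustrative.

```python
# WMM grouping of 802.11 User Priorities into Access Categories.
UP_TO_AC = {
    0: "AC_BE", 3: "AC_BE",
    1: "AC_BK", 2: "AC_BK",
    4: "AC_VI", 5: "AC_VI",
    6: "AC_VO", 7: "AC_VO",
}

# Standard code point values for the DSCPs named in this thread.
DSCP = {
    "DF": 0, "CS1": 8, "CS4": 32, "AF41": 34, "AF42": 36, "AF43": 38,
    "CS5": 40, "VA": 44, "EF": 46, "CS6": 48, "CS7": 56,
}


def default_access_category(dscp: int) -> str:
    """Map a DSCP to a WMM AC via its top three bits (assumed default)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be within 0..63")
    user_priority = dscp >> 3  # keep only the precedence bits
    return UP_TO_AC[user_priority]


if __name__ == "__main__":
    for name, value in DSCP.items():
        print(f"{name:>4} ({value:2d}) -> {default_access_category(value)}")
```

Under this assumption CS4, AF41-AF43, CS5, VA and EF all land in AC_VI, while only CS6 and CS7 reach AC_VO, which matches the mappings quoted above.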


> As I said previously, I don't have a strong view on VI vs VO for RFC8325, but since RFC8325 (unfortunately) maps some capacity-seeking bulk data traffic to AC_VI, I believe that it is more appropriate that NQB sparse traffic be VO in those systems.  

	[SM2] I can understand this from your vantage point, but that is rather selfish and will lead to an arms race to AC_VO quickly.


> The draft has a recommendation that all devices supporting NQB implement a queue protection mechanism for NQB traffic, which for RFC8325 devices supplements all of the detailed recommendations* on protecting QoS in section 8 of RFC8325, which still remain intact.  So, my view is that AC_VI for NQB in RFC8325 equipment is defensible.  

	[SM2] This you got backwards. No wifi device out in the field will do anything of that sort, but most will use the default mappings, so the only conservative approach is to use a DSCP that maps to AC_BE and teach new NQB-aware wifi devices to map NQB to something else. You are basically justifying a lot of the properties of NQB by declaring it is not a priority-based scheduling system, but on existing wifi it turns into a semi-strict priority scheme (well, it is almost a precedence-like scheme with some residual starvation avoidance), which, as we all agree, will have pathological issues without proper access control.

> 
> *Note that RFC8325 makes all of these protections RECOMMENDED, not REQUIRED, in alignment with the NQB draft's handling of queue protection.

	[SM2] In case someone wonders, that is exactly the kind of attitude which makes me completely oppose Bob's quest to avoid a MUST for queue protection... rfc2119 says:

3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course.

I am not convinced that your rationale above fulfills these requirements yet. Admittedly that is not my call to make, but I feel entitled to voice my concerns and leave the final arbitration to the experts and chairs.




> 
> [GW2]  A point to be repeated here is that this proposed mapping of DSCP to WMM was not driven by the goal of giving priority (either bandwidth priority or latency priority) to NQB traffic, but rather to have it queued separately from QB traffic.

	[SM2] I applaud that goal, but IMHO the real world effect trumps the intention here, and WMM is a priority/precedence scheme, so once wifi becomes the choking point, NQB will look like a priority scheme from the end-user's perspective.


>  In existing WiFi gear, that isn't possible without using different ACs and we have to work within the limitations of the equipment.

	[SM2] That is one option; the other option would be to realize the impedance mismatch and opt for the conservative alternative.

>   Perhaps the NQB draft should make this more clear.  In other words, the draft should clearly require that WiFi APs that claim compliance with the NQB PHB definition have a separate queue for NQB traffic.  

	[SM2] +1; IMHO wifi gear that explicitly supports NQB and has the appropriate safety valves in place (anti-starvation/fairness measures) can go all AC_VO on NQB traffic, but that is not the currently rolled-out wifi gear. I am mainly concerned about not making things worse compared to the status quo.

> If that separate queue is then scheduled alongside a QB queue, both using AC_BE, that would be compliant.

	[SM2] ??? If you have your second NQB queue then you can (with some safeties) go wild and use AC_VO, as you will still be able to share airtime between QB and NQB equally. But in the current situation you cannot do that, so you need to be careful when using ACs > BE to maintain roughly equal sharing (or at least reasonable starvation avoidance). Again I am voicing an opinion, and the decision is not mine to make.

> 
> 
>> [GW] Perhaps we've left it too ambiguous in the draft, so that needs some review.  It was not the intent that capacity-seeking flows (even L4S) mark their packets as NQB.
>    	[SM] Great, maybe this can be made more explicit?
> 
> [GW2] Yes, I have this on my to do list.   I was modeling the NQB draft on some of the PHB RFCs that were very explicit about PHB requirements, but fairly vague on what applications should use it. 
> 
> 
>> [GW] The target flows would be ones where the sender has a degree of confidence that it will not exceed the available capacity of the path.  
>    	[SM] With a variable bandwidth/rate path like wifi, the sender's confidence might not be the best measure? I would guess the hops operating the RF link would be in a better position to predict the path capacity in the immediate future?
> 
> [GW2]  There are many applications that can a) be very confident in their belief that they will not exceed the available capacity of the path,

	[SM2] That is obviously wrong, as path capacity in general is not fixed but variable; we can haggle about what you mean by "very confident" but that is beside my point, especially in the context of variable rate wifi where the actual rate is set individually for each tx slot.

> and b) that would benefit from not being grouped with capacity-seeking traffic that is causing latency and loss in the network.   Keep in mind that the concept of segregating QB & NQB traffic isn't so that we can guarantee deterministic latency for NQB traffic.  This isn't DETNET.  The goal is to allow NQB applications to have statistically much better latency characteristics than they do today. 

	[SM2] That part I agree with.

> 
> 
>> [GW] Other links (including WiFi) could implement it as well.   
>    	[SM] I fear I am less optimistic, and would argue that without this implemented wifi should be very careful to prioritize NQB (unless we can agree on NQB qualifying behavior that makes harm very unlikely).
> 
> [GW2] I'm confident that we can come up with language to more clearly describe NQB traffic in a manner that is compatible with the descriptions of other DSCPs that share an AC with it.  I assume this (perhaps in addition to something like I mentioned above about segregation being the only requirement for the PHB) would satisfy you, no? 

	[SM2] If the NQB definition is compatible with RFC8325 definitions or goes the extra queue for NQB traffic route, my objections go away (but again, my objections are non-binding anyway).

> 
> 
>> [GW] For example, WiFi APs or Stations could monitor queue depth or queue latency for the AC_VI (or AC_VO, whichever NQB maps to) queue and re-direct NQB traffic to AC_BE, either in a flow-aware way (probably preferred) or possibly even in a flow-blind way.  In other situations it may not be necessary (e.g. in controlled environments or on links that support FQ), or it may not be feasible.  
>    	[SM] All good ideas; the question to me is more in light of the lack of such mechanisms would it not be safer to default to AC_BE and only selectively elevate NQB traffic if the radio controller/AP is NQB aware? 
>    	I ask because I just did a flent rrul_CS8 (that is bi-directional greedy TCP traffic with one flow dscp-marked for each CS) test-run on my home net, and saw my macbook's sent CS6 and CS7 flows almost starve all other traffic, including CS6/CS7 marked traffic from the AP; I am not confident, that the hardware out there is ready for significant traffic volumes on AC_VO/AC_VI yet... (happy to share the flent data file on request).
>    	Now, I realize that NQB is not intended for bulk flows, but as long as the marking is not actively verified (and maybe even bleached on detected mis-marking) the intention does not matter that much, NQB-mismarking will give a considerable bandwidth and latency improvement. With access links getting faster and faster, wifi is becoming more and more a (transient) bottleneck making this issue sensitive.
> 
> [GW2]  I hope we're past this in the discussion now, but just in case we're not:   I don't think that NQB materially changes this situation.  Today (effectively) no traffic has its WMM selection policed.  It is totally up to the application to decide if it wants to mark its packets CS7 or not, and no one is mandated by the IETF to stop them.  Don't get me wrong, I am not opposed to WiFi devices implementing queue protection (or the QoS protection features in RFC8325), in fact they SHOULD do so (if they want to comply with the draft anyway).  But as long as the traffic types that are recommended to be marked NQB (i.e. sparse/smooth relatively low data rate flows) are compatible with the types of traffic that are recommended to be marked EF, CS6, etc. then I think we're covered.  

	[SM2] I think that by making it even clearer in the draft that the NQB requirement is "sparse/smooth relatively low data rate flows", with some clearer language on what "relatively low" actually translates to, we will have removed one of the obstacles; the other is aggregate traffic volume. I agree that there is nothing stopping people from abusing ACs today, but I observe in my network that this is not happening in significant quantity. If NQB catches on as a method to improve latency for "real-timish" applications this is going to change, and since that is predictable we should tackle this issue today. (IMHO there are only two options: either NQB is not adopted in real life (unlikely due to DOCSIS), in which case nothing really matters, or it will see quantitative use, in which case the aggregate traffic volume issue should also be discussed in the draft.) Again, I am not an IETF member, so all I do here is voice my personal opinions and try to get the domain experts to chime in and make the final judgements.

> 
> [GW2] I really don’t follow your insistence on the IETF mandating that implementers build in safeguards like this.  Recommendations are great, and I support them. 

	[SM2] Well, reading rfc2119 indicates that even a SHOULD/RECOMMENDED is basically already a mandate, just one that allows an escape route as long as that has a strong rationale behind it. See above about your RFC8325 alignment issue (where you argue that since rfc8325 only RECOMMENDS protections we are good without them) and Bob's stance on queue protection in general for why I am wary about being lax in enforcement ;)

> 
> [GW2] I know you are a fan of fq_codel.  What do you implement in your router that detects and prevents an application from sharding its traffic across multiple queues?  If you aren't preventing this simple and powerful exploit, then maybe your router should be banned from the internet!  (this is a joke, in case anyone misunderstands)

	[SM2] I guess I need to sharpen my pitch fork and try to get rfc8290 changed ;)


Best Regards
	Sebastian

P.S.: Not that it matters, I am not opposed to the idea behind the NQB PHB in general, and I am also not opposed to a careful increase in the use of wifi ACs > BE, but I would like to see this done in a conservative way that treads lightly and accounts for the fact that there is little operational experience with how already deployed wifi gear is going to behave in a high AC_V[I/O] environment. This is especially important because, as I indicated before, the effect of using the higher ACs will be felt in the whole RF neighbourhood, so side-effects will not be restricted to the user requesting the NQB marked traffic but will also hit quite a lot of "innocent" bystanders.