Re: [tsvwg] Traffic protection as a hard requirement for NQB (was: WG adoption of draft-white-tsvwg-nqb!)

"Black, David" <David.Black@dell.com> Tue, 03 September 2019 21:49 UTC

From: "Black, David" <David.Black@dell.com>
To: Sebastian Moeller <moeller0@gmx.de>, Bob Briscoe <in@bobbriscoe.net>
CC: "tsvwg@ietf.org" <tsvwg@ietf.org>
Thread-Topic: [tsvwg] Traffic protection as a hard requirement for NQB (was: WG adoption of draft-white-tsvwg-nqb!)
Thread-Index: AQHVYZ1NPeFk69HaOEuWBOzpYxY3eqcZ79WAgACFUTA=
Date: Tue, 03 Sep 2019 21:46:25 +0000
Message-ID: <CE03DB3D7B45C245BCA0D24327794936306D4F3F@MX307CL04.corp.emc.com>
References: <CE03DB3D7B45C245BCA0D24327794936306BBE54@MX307CL04.corp.emc.com> <56b804ee-478d-68c2-2da1-2b4e66f4a190@bobbriscoe.net> <AE16A666-6FF7-48EA-9D15-19350E705C19@gmx.de>
In-Reply-To: <AE16A666-6FF7-48EA-9D15-19350E705C19@gmx.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/xn2wPAKu1RZIlCHIfqHWgopHN4E>
Subject: Re: [tsvwg] Traffic protection as a hard requirement for NQB (was: WG adoption of draft-white-tsvwg-nqb!)

[Writing as an individual]

The core of my current views is this:

> ... “traffic protection” (e.g., “queue
> protection” or a suitably configured FQ AQM) appears to be necessary in
> general to keep queue-building traffic out of the NQB traffic aggregate, as
> allowing such traffic degrades the properties of the NQB PHB.

I see serious potential here for a "tragedy of the commons" outcome where lots of QB traffic is sent marked NQB because the resulting latency is (at least initially) lower.
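To make the "traffic protection" idea concrete, here is a minimal sketch of a per-flow check that pushes queue-building flows out of the NQB aggregate. It is a plain per-flow leaky bucket standing in for the real thing; it is not the DOCSIS queue protection algorithm and not an FQ AQM. The class name, the byte limit and the drain rate are all assumptions chosen for illustration.

```python
from collections import defaultdict


class TrafficProtector:
    """Toy per-flow leaky bucket: flows that keep adding backlog lose NQB treatment.

    All names and numbers here are illustrative assumptions, not values
    taken from the NQB draft or from DOCSIS.
    """

    def __init__(self, score_limit_bytes=3000, drain_bytes_per_tick=1500):
        self.limit = score_limit_bytes
        self.drain = drain_bytes_per_tick
        self.score = defaultdict(int)  # per-flow backlog estimate, in bytes

    def tick(self):
        """Advance time by one interval and drain every flow's score."""
        for flow_id in list(self.score):
            self.score[flow_id] = max(0, self.score[flow_id] - self.drain)

    def classify(self, flow_id, size_bytes, marked_nqb):
        """Return the queue ("NQB" or "QB") that this packet should join."""
        if not marked_nqb:
            return "QB"
        self.score[flow_id] += size_bytes
        # A flow whose estimated backlog exceeds the limit is treated as
        # queue-building, whatever its DSCP marking says.
        if self.score[flow_id] > self.limit:
            return "QB"
        return "NQB"


protector = TrafficProtector()

# A sparse flow (one small packet per tick) keeps its NQB treatment.
sparse = []
for _ in range(5):
    sparse.append(protector.classify("sparse", 200, marked_nqb=True))
    protector.tick()

# A bulk flow mismarked as NQB sends a burst of full-size packets in one tick.
bulk = [protector.classify("bulk", 1500, marked_nqb=True) for _ in range(5)]
```

In this sketch the mismarked bulk flow is redirected after its third back-to-back packet, while the sparse flow is never touched.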

> > I think we should be wary of making traffic protection a hard requirement
> at such an early stage in our knowledge of the NQB behaviour. I believe this
> is a case where the market, not the IETF, ought to decide whether protection
> is required.

I want to see clear and convincing evidence that the "tragedy of the commons" outcome is highly unlikely, which this text ...

> > The draft claims incentives can be aligned by an implementation being
> arranged to ensure that NQB traffic benefits from NQB marking and QB
> traffic benefits from QB marking. Incentives are hard to guess, so that may or
> may not be true. However, I don't think we (the tsvwg/IETF) can state
> categorically that it is not true.

... is not.

The above text not only reverses the burden of proof but also changes the question from whether a "tragedy of the commons" is possible to whether it will always (categorically) happen - the resulting straw proposition is then easy to demolish.  Nice try, no sale ... or as Sebastian writes:

> 	[SM] I would have thought it would be on those claiming a laxer
> enforcement model to prove that that model is sufficient?

I agree.  As for requirements language ...

> > While there is no operational experience of NQB deployments, I think the
> market (i.e. most early-adopting operators) will want the warm feeling of
> some form of traffic protection. But as we get more experience we might
> find incentives really are aligned. And we might find accidents and malice are
> not a significant problem.
> 
> 	[SM] @ the chairs, how hard would it be to retroactively change from
> a SHOULD to  a MUST? If it is easy I agree with Bob that a SHOULD would be
> nice, but if it is hard I would vote for a MUST as that seems to be the safer
> option.

... a common IETF approach to this sort of uncertain situation is "MUST implement, SHOULD use" with discussion of when the "SHOULD" may not apply and/or possible consequences of ignoring the "SHOULD."  The underlying rationale is that the functionality will be available for use if/when the need for it is discovered in a fashion that surprises the network operator.  It is much easier to relax "MUST implement" based on implementation experience than it is to upgrade "SHOULD implement" to "MUST implement", as relaxing a requirement doesn't have the potential to break any "running code" implementations, whereas tightening one can make existing implementations non-compliant.
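As a rough illustration of what "MUST implement, SHOULD use" could look like in an implementation, the sketch below always ships the protection function, turns it on by default, and lets an operator turn it off at the cost of a warning. The class and method names are hypothetical, and nothing here comes from the NQB draft.

```python
import warnings


class NqbNode:
    """Hypothetical node that always contains traffic protection (MUST implement)."""

    def __init__(self):
        # SHOULD use: protection is enabled unless the operator opts out.
        self.traffic_protection_enabled = True

    def set_traffic_protection(self, enabled):
        """Operator knob. Turning protection off is allowed but flagged."""
        if not enabled:
            warnings.warn(
                "NQB traffic protection disabled: queue-building traffic "
                "marked NQB can degrade the NQB aggregate",
                RuntimeWarning,
            )
        self.traffic_protection_enabled = enabled


node = NqbNode()
default_state = node.traffic_protection_enabled
```

The point of the design is that an operator who is surprised by mismarked traffic can re-enable protection with a configuration change rather than waiting for new equipment or firmware.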

> > As an analogy, when TCP congestion control was first developed, it was
> known that end-systems could run a subverted TCP algorithm or just use
> unresponsive UDP. At that time, the view could have been taken that per-
> flow scheduling would have to be a 'MUST' requirement for all Internet
> bottlenecks.   As it has turned out, the Internet does have /per-user/
> scheduling at bottlenecks,

I understand the TCP congestion control history, and the even-more-relevant ECN for TCP history, where there was a serious potential for a "tragedy of the commons" outcome - ignoring ECN congestion indications was a potential new way to cause congestion collapse of the Internet (fortunately, that did not happen).  There are at least a couple of aspects that seem rather different in the current situation:
- At the time, the IETF community had strong influence over the important TCP stacks.
	Network stacks have become much more diverse since then, and a lot of NQB traffic
	can be expected to come from applications that have a much weaker relationship to the IETF community.
- Switching away from TCP to UDP was a high hurdle, e.g., because TCP uses stream sockets whereas UDP uses datagram sockets, and ECN functionality was in the core of TCP implementations.
	The hurdle here is much lower, particularly for applications that already use UDP.
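To show how low that hurdle is for an application that already uses UDP, here is a sketch of marking a datagram socket with a DSCP. The codepoint value 45 is an assumption used purely as a placeholder, since this message does not name one, and the helper names are hypothetical. IP_TOS is a standard socket option on Linux and similar systems, though some platforms ignore or restrict it.

```python
import socket

# Placeholder codepoint, assumed for illustration only.
NQB_DSCP_ASSUMED = 45


def dscp_to_tos(dscp, ecn=0):
    """Build the 8-bit TOS/Traffic Class byte: 6 DSCP bits above 2 ECN bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must fit in 2 bits")
    return (dscp << 2) | ecn


def open_marked_udp_socket(dscp=NQB_DSCP_ASSUMED):
    """Open an IPv4 UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

Nothing in the marking step checks whether the application's traffic is actually non-queue-building: the sender changes one socket option and the network either verifies the behavior or trusts the mark.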

> > In the TCP case, it turned out that a delicate balance of incentives proved
> sufficient to allow most Internet equipment to be simpler and cheaper.
> There is a poorly understood balance of incentives in the NQB case. So let's
> not require equipment to be more complex than it might need to be, at least
> not yet.

I concur with Sebastian that the onus is on the proposers of the complexity/cost savings to demonstrate that the resulting designs are safe for the Internet.  I'm reminded by analogy of something that my software engineering professor said back when I was in grad school - "I can make the code run arbitrarily fast if it doesn't have to be correct." [Mary Shaw]

Thanks, --David

> -----Original Message-----
> From: Sebastian Moeller <moeller0@gmx.de>
> Sent: Tuesday, September 3, 2019 5:15 AM
> To: Bob Briscoe
> Cc: Black, David; tsvwg@ietf.org
> Subject: Re: [tsvwg] Traffic protection as a hard requirement for NQB (was:
> WG adoption of draft-white-tsvwg-nqb!)
> 
> 
> Dear Bob,
> 
> allow me to chime in.
> 
> > On Sep 2, 2019, at 16:47, Bob Briscoe <in@bobbriscoe.net> wrote:
> >
> > David,
> >
> > Thanks for your closing remarks on the NQB adoption call.
> > You say your last point is open for discussion, so I will dive straight in to start
> that discussion.
> >
> > On 30/08/2019 16:40, Black, David wrote:
> >> 	• [snip]
> >> 	• The criticisms on this list of the “queue protection” requirement in
> the draft are largely accurate.   The draft needs at least an Editor’s Note that
> this material will be revised, as while the DOCSIS mechanism is an example of
> how to do queue protection, it is not appropriate to require implementation
> of that mechanism.   A plausible plan that I have discussed with the authors is
> to write a set of functional/behavioral requirements for NQB “traffic
> protection” that can be satisfied by a “queue protection” mechanism such as
> the DOCSIS mechanism, or by a suitably configured FQ AQM implementation.
> [snip]
> >>
> >> In addition, related to item 2), my expectation (which is open to further
> discussion) that “traffic protection” will be a “MUST” requirement, perhaps
> with some well-specified exceptions (including explanations of why the
> exceptions are ok).   This is because “traffic protection” (e.g., “queue
> protection” or a suitably configured FQ AQM) appears to be necessary in
> general to keep queue-building traffic out of the NQB traffic aggregate, as
> allowing such traffic degrades the properties of the NQB PHB.
> > I think we should be wary of making traffic protection a hard requirement
> at such an early stage in our knowledge of the NQB behaviour. I believe this
> is a case where the market, not the IETF, ought to decide whether protection
> is required.
> 
> 	[SM] The draft says "... it is worthwhile to note that the NQB
> designation and marking would be intended to convey verifiable traffic
> behavior, not needs or wants." In the light of this requirement it seems
> obvious that a hop willing to honor that DSCP should/must actually verify the
> traffic behavior, no? Requiring behavior according to a set of requirements
> but not enforcing these requirements seems very very optimistic.
> 
> 
> >
> > The draft claims incentives can be aligned by an implementation being
> arranged to ensure that NQB traffic benefits from NQB marking and QB
> traffic benefits from QB marking. Incentives are hard to guess, so that may or
> may not be true. However, I don't think we (the tsvwg/IETF) can state
> categorically that it is not true.
> 
> 	[SM] I would have thought it would be on those claiming a laxer
> enforcement model to prove that that model is sufficient?
> 
> >
> > The draft makes the point that, even if incentives are aligned, queue-
> building traffic could be mismarked as NQB, either accidentally or maliciously.
> That's a sound reason for an implementer to include traffic protection, but I
> don't think it's a good reason for us (the tsvwg/IETF) to require them to.
> >
> > While there is no operational experience of NQB deployments, I think the
> market (i.e. most early-adopting operators) will want the warm feeling of
> some form of traffic protection. But as we get more experience we might
> find incentives really are aligned. And we might find accidents and malice are
> not a significant problem.
> 
> 	[SM] @ the chairs, how hard would it be to retroactively change from
> a SHOULD to  a MUST? If it is easy I agree with Bob that a SHOULD would be
> nice, but if it is hard I would vote for a MUST as that seems to be the safer
> option.
> 
> >
> > So I think the current 'SHOULD' is the right call. It could be beefed up with
> warnings on the risks of not providing protection - not least the risk no early
> adopter will want to use such an implementation.
> >
> >
> >
> > As an analogy, when TCP congestion control was first developed, it was
> known that end-systems could run a subverted TCP algorithm or just use
> unresponsive UDP. At that time, the view could have been taken that per-
> flow scheduling would have to be a 'MUST' requirement for all Internet
> bottlenecks.
> > As it has turned out, the Internet does have /per-user/ scheduling at
> bottlenecks,
> 
> 	[SM] Question is this universally true? I am not 100% sure. My access
> link is limited by my contracted rate, but due to oversubscription there is no
> guarantee that on a congested link between my home and my ISP's traffic-
> shaper/the wider internet I get a share that reflects my "per-user" fraction of
> the shared capacity.
> 	I fail to see any scheduler beyond my ISPs traffic shaper that has a
> holistic-enough view to classify traffic based on "user" (think NATed IPv4
> address, but variable length IPv6 prefixes, plus a router's ipv6/64), could you
> elaborate please?
> 
> > but there has been little need for /per-flow/ scheduling for capacity sharing
> (yes, FQ exists, but it's not needed for capacity sharing).
> >
> > In the TCP case, it turned out that a delicate balance of incentives proved
> sufficient to allow most Internet equipment to be simpler and cheaper.
> There is a poorly understood balance of incentives in the NQB case. So let's
> not require equipment to be more complex than it might need to be, at least
> not yet.
> 
> 	[SM] I believe for malicious actors NQB will be an attractive DSCP as it
> promises to allow doing harm even at low bandwidth (just send a low
> average rate in lumpy bursts and I would expect an NQB-honoring L4S
> scheduler to get into trouble*, without requiring a large offensive traffic
> load, keeping it cheap and hard to detect yet sufficiently disruptive to rob the
> low latency queue of its intended functionality).
> 
> 
> *) This is a hypothesis I have not confirmed in any way, but it seems to be in
> line with how I understand dual queue aqm to work (so it might be more of a
> reflection of my level of understanding and not so much of dual queue aqm).
> Please correct me if this seems wrong.
> 
> >
> > u
> >
> > Bob
> >
> >
> >>
> >> Thanks, --David (TSVWG co-chair, will be shepherd for NQB draft).
> >> ----------------------------------------------------------------
> >> David L. Black, Senior Distinguished Engineer
> >> Dell EMC, 176 South St., Hopkinton, MA  01748
> >> +1 (774) 350-9323 New    Mobile: +1 (978) 394-7754
> >> David.Black@dell.com
> >> ----------------------------------------------------------------
> >>
> >
> > --
> >
> > ________________________________________________________________
> > Bob Briscoe
> > http://bobbriscoe.net/