Re: [tsvwg] David Black (individual) on safety of L4S for the Internet

"alex.burr@ealdwulf.org.uk" <alex.burr@ealdwulf.org.uk> Thu, 14 May 2020 17:04 UTC

Date: Thu, 14 May 2020 17:03:56 +0000
From: "alex.burr@ealdwulf.org.uk" <alex.burr@ealdwulf.org.uk>
Reply-To: "alex.burr@ealdwulf.org.uk" <alex.burr@ealdwulf.org.uk>
To: "tsvwg@ietf.org" <tsvwg@ietf.org>, "Black, David" <David.Black@dell.com>
Message-ID: <75435344.181735.1589475836060@mail.yahoo.com>
In-Reply-To: <MN2PR19MB4045DBC270D70DECE5F2B4AC83A20@MN2PR19MB4045.namprd19.prod.outlook.com>
References: <MN2PR19MB4045DBC270D70DECE5F2B4AC83A20@MN2PR19MB4045.namprd19.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/8QKRjC2LQgXjSsPxRI6AClPMHpA>

Hi David,

You write: "In contrast to reliance on TCP Prague, that combination of FQ and bleaching looks like it could provide a robust safety case."

Some of those who have responded to the consensus call seem to assume that a safety case should not necessitate any action on the part of third parties, e.g., operators who are running RFC3168 AQMs. That you reject bleaching not on that ground, but because it would nobble the use of ECT(1), makes me wonder if your position is more nuanced.

Some responders are evidently more relaxed about third parties needing to act in order to maintain effective congestion control on the Internet, as evidenced by their willingness to contemplate ECT(1) bleaching.

If a safety case is allowed to depend on action by third parties, as long as that action is sufficiently easy, then it seems possible that an action could be found which makes L4S safe without making ECT(1) bleaching likely, or which at least limits its scope. For example, maybe there is a simple way for operators to signal to endpoints that RFC3168 AQMs are in use. So I was wondering what your position is on third-party actions being necessary as part of a safety case: what limitations should there be on the third-party actions that a safety case may necessitate?

Alex



On Friday, May 8, 2020, 9:10:18 PM GMT+1, Black, David <david.black@dell.com> wrote:

This is the longer explanation of why I find myself in group 2 on the ECT(1) question because “If you believe L4S will not be safe for the internet without significant architectural changes, you are in this group.”

 

Current practice is not to mix DCTCP-like traffic (1/p-class congestion control) with TCP-like traffic (1/sqrt(p)-class congestion control, e.g., NewReno) in the same queue because the DCTCP-like traffic outcompetes the TCP-like traffic to the point of starvation of the latter.  By using ECT(1) as the only classifier for low-latency traffic, L4S causes mixing of those two classes of traffic at network nodes that have not been modified to use ECT(1) as a classifier.  Hence the L4S experiment proponents are responsible for the safety of that traffic mixing at unmodified nodes.  I understand the L4S approach to achieving safety at unmodified nodes to be use of TCP Prague, specifically its heuristics that detect RFC 3168 bottlenecks.  As previously discussed, this traffic mixing does not cause problems at otherwise unmodified nodes that use FQ.
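
As a rough illustration of the 1/p versus 1/sqrt(p) distinction (this sketch is not from the thread), the snippet below compares textbook steady-state window approximations for the two classes when both see the same marking probability p in a shared queue. The constants are assumptions: sqrt(3/(2p)) is the usual Reno approximation, and 2/p is a commonly quoted form for a DCTCP-like response. The function names are hypothetical.

```python
import math


def reno_window(p: float) -> float:
    """Assumed Reno-style steady-state window in packets: sqrt(3 / (2p))."""
    return math.sqrt(1.5 / p)


def scalable_window(p: float) -> float:
    """Assumed DCTCP-like steady-state window in packets: 2 / p."""
    return 2.0 / p


def window_ratio(p: float) -> float:
    """How many times larger the 1/p-class window is at the same p."""
    return scalable_window(p) / reno_window(p)


if __name__ == "__main__":
    # Equal RTTs are assumed, so the window ratio is also the rate ratio.
    for p in (0.1, 0.01, 0.001):
        print(f"p={p}: reno={reno_window(p):.1f} "
              f"scalable={scalable_window(p):.1f} "
              f"ratio={window_ratio(p):.1f}")
```

Under these assumed constants the ratio is about 1.63/sqrt(p), so the imbalance grows as the marking probability falls; at p = 0.01 it is roughly 16 to 1.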

 

For the proposed Internet-wide L4S experiment, the technology to be tested is necessarily experimental, but the safety of the experiment needs to be fundamental to avoid serious damage if things go wrong.  I do not currently see a strong safety case for L4S+TCP Prague – an example of such a strong safety case can be found in QUIC congestion control, where the fact that it is based on NewReno congestion control provides strong assurance that QUIC congestion control is safe for the Internet.   In contrast, TCP Prague is still research (it’s great research and I applaud the people who have invested their time and effort in it for what has been achieved), but a safety case ought not to be based on not-yet-stable research, and I don’t think another 6 months of testing and bug fixing will leave TCP Prague as anything other than research (ruling out group 3 for me).  Further, the L4S low-latency service will likely be used by protocols other than TCP Prague, which may not receive the level of scrutiny and testing that has been applied to TCP Prague.  The safety case needs to come from elsewhere.

 

If ECT(1)-marked traffic starts causing problems, network operators are not going to wait for TCP Prague or other protocol implementations to be patched – they’re going to take action, e.g., as Neal Cardwell describes (and I’ve heard about this sort of potential consequence in other conversations, so please don’t blame Neal for this):

 

> - L4S flows potentially causing unfairness in RFC3168 ECN bottlenecks has been mentioned as a potential concern. However, a robust RFC3168 ECN bottleneck
> should already have a mechanism to avoid unfairness caused by flows that are marked as ECT(0|1) and yet not performing RFC3168 responses. In particular,
> many of the large sources of known deployments of RFC3168 -- Linux fq_codel and cake -- are already deployed with fair queueing. In such bottlenecks L4S
> traffic should not cause harm to other non-L4S flows. Furthermore, if there really are ISPs with deployments of RFC3168 bottlenecks that have neither FQ nor
> any other protection from non-RFC3168-ECT(1) flows, then they can bleach incoming ECT(1) code points to Not-ECT and treat L4S as Not-ECT (ISPs typically
> already transform the DSCP byte at their ingress anyway). So I do not see harm to RFC3168 ECN bottlenecks as a prohibitive concern.
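
For readers unfamiliar with what bleaching ECT(1) to Not-ECT means at the packet level, here is a minimal sketch (not from the thread, and not any operator's actual configuration). It relies only on the RFC 3168 encoding of the two low-order bits of the IPv4 TOS / IPv6 Traffic Class byte: 00 is Not-ECT, 01 is ECT(1), 10 is ECT(0), 11 is CE. The function name `bleach_ect1` is hypothetical.

```python
ECN_MASK = 0b11   # low two bits of the TOS / Traffic Class byte
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def bleach_ect1(tos: int) -> int:
    """Return the byte with ECT(1) rewritten to Not-ECT.

    The upper six bits (the DSCP) are preserved. ECT(0), CE and
    Not-ECT packets pass through unchanged.
    """
    if tos & ECN_MASK == ECT_1:
        return tos & ~ECN_MASK & 0xFF
    return tos


if __name__ == "__main__":
    for tos in (0b10111001, 0b10111010, 0b10111011, 0b10111000):
        print(f"{tos:08b} -> {bleach_ect1(tos):08b}")
```

One limitation of a rule like this is that a packet already marked CE upstream no longer reveals whether it started out as ECT(0) or ECT(1), so it cannot be selectively bleached.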

 

In contrast to reliance on TCP Prague, that combination of FQ and bleaching looks like it could provide a robust safety case.  It would be most unfortunate if things wind up depending on this, especially bleaching, as the fact that ECT(1) is not bleached at network boundaries was a major reason for selection of ECT(1) as the identifier for low-latency L4S traffic.  Among the implications is that if the L4S experiment fails, that bleaching could be around much longer, making it infeasible to use ECT(1) for anything else Internet-wide.  In my view, it is the responsibility of the L4S proponents to design the L4S  technology and structure the experiment in a fashion that makes ECT(1) bleaching unlikely and/or likely to be limited in scope if it happens.

 

Thanks, --David

----------------------------------------------------------------
David L. Black, Sr. Distinguished Engineer, Dell EMC
Dell Technologies, Infrastructure Systems Group
176 South St., Hopkinton, MA  01748
+1 (774) 350-9323           Mobile: +1 (978) 394-7754
David.Black@dell.com
----------------------------------------------------------------