[tsvwg] Update to Position Statement on ECT(1)

"Holland, Jake" <jholland@akamai.com> Fri, 08 May 2020 19:42 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/vW8eEt7V58SWPoLIJ5dklCG4WqM>

Hi tsvwg,

As promised in my original position statement[1], I am posting an update
because my views on the ECT(1) question have substantially changed.

It has come to my attention that a technical fix is possible for my safety
concerns with "ECT(1) as an input" by changing the signaling scheme.

Although I still stand by all my claims in the original position statement,
the existence of a safe signaling scheme that uses ECT(1) as a network input
has changed my conclusion on the input/output question.

New position:
- I slightly prefer ECT(1) as an input (with qualifiers, given below)

(Note that this position should not be taken as an endorsement of L4S's
safety as currently proposed.)

The new signaling scheme that drove my change of position is this:

- ECT(1) is set by the sender after negotiating endpoint support
- On-path devices change ECT(1)->ECT(0) to signal low levels of congestion
- On-path devices continue to use CE, as in RFC3168, to signal high levels
  of congestion, resulting in a required multiplicative decrease response.
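
A minimal sketch of the three rules above, for illustration only. The
codepoint values are the RFC 3168 ones; the congestion levels ("none",
"low", "high"), the function names, and the size of the gentle reduction
on ECT(0) are hypothetical and not part of any spec or of this proposal.

```python
from enum import Enum


class Ecn(Enum):
    """ECN field codepoints as defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def remark(codepoint: Ecn, congestion: str) -> Ecn:
    """On-path device behavior under the proposed scheme.

    `congestion` is a hypothetical three-level input: "none", "low", "high".
    """
    if codepoint is Ecn.NOT_ECT:
        # Not ECN-capable: a real device would use drop, not marking.
        return codepoint
    if congestion == "high":
        # High congestion is signaled with CE, as in RFC 3168.
        return Ecn.CE
    if congestion == "low" and codepoint is Ecn.ECT1:
        # Low congestion is signaled by changing ECT(1) to ECT(0).
        return Ecn.ECT0
    return codepoint


def sender_response(cwnd: float, echoed: Ecn) -> float:
    """Sender reaction to the codepoint echoed back by the receiver."""
    if echoed is Ecn.CE:
        # Required multiplicative decrease (halving is an assumption here).
        return max(1.0, cwnd * 0.5)
    if echoed is Ecn.ECT0:
        # Gentle reduction per low-congestion signal (hypothetical amount).
        return max(1.0, cwnd - 0.5)
    return cwnd
```

Note that a packet already marked CE stays CE, and a packet sent as ECT(0)
by a classic sender is only ever changed to CE, so an RFC 3168 sender sees
nothing new.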

Under this scheme, for any path with a single AQM that's dualq, ECT(1)
remains a very good classifier for that AQM.  Since this covers most
relevant paths that aren't within a controlled environment like datacenters,
it has a low downside.

Under this scheme, ECT(0) becomes the 1/p signal for dualq+TCP Prague, and
CE becomes the 1/sqrt(p) signal from the classic queue if the LL queue
overflows; CE results in multiplicative decrease by the sender.
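
To make the "1/p" and "1/sqrt(p)" shorthand concrete, here is a small
sketch of the steady-state window each kind of signal implies. The
constants 2 and 1.22 are the commonly quoted approximations for
DCTCP-like and Reno-like congestion controls; they are assumptions for
illustration and do not come from this message.

```python
import math


def scalable_window(p: float) -> float:
    """Approximate steady-state window (packets) for a 1/p response."""
    return 2.0 / p


def classic_window(p: float) -> float:
    """Approximate steady-state window (packets) for a 1/sqrt(p) response."""
    return 1.22 / math.sqrt(p)


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        print(f"p={p}: scalable={scalable_window(p):.1f} "
              f"classic={classic_window(p):.1f}")
```

At the same signal probability the two responses settle at very different
windows, which is why the two signals need to stay distinguishable on the
wire.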

This would make L4S compatible with RFC 3168 without relying on a fragile
classic queue detection algorithm, so it would address my safety concerns.

As with all available signaling schemes, I acknowledge that this approach
is not perfect, and comes with tradeoffs.  A few of the known tradeoffs
would include (with thanks to Bob, Koen, and Kyle for explaining some of
these to me offline):
- existing tunneling decapsulation specs would often lose non-CE signals
- the existing accecn spec would often lose non-CE signals
- For paths with multiple AQMs, the classifier partially loses integrity in
  later AQMs when earlier AQMs are loaded.  (Note also the worse downside
  that increasing deployment of new AQMs potentially reduces the fidelity
  further.)
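
The multiple-AQM tradeoff in the last item can be sketched numerically.
This assumes, purely for illustration, that each AQM on the path
independently changes ECT(1) to ECT(0) with some probability and that CE
marking is ignored; the function name and the probabilities are
hypothetical.

```python
def ect1_fraction_at_each_hop(remark_probs):
    """Fraction of a flow's packets still carrying ECT(1) on arrival at
    each AQM, given each AQM's ECT(1)->ECT(0) re-marking probability."""
    surviving = 1.0
    fractions = []
    for p in remark_probs:
        # Record what this AQM sees before it applies its own marking.
        fractions.append(surviving)
        surviving *= 1.0 - p
    return fractions


if __name__ == "__main__":
    # Three AQMs in sequence; the first two are loaded.
    print(ect1_fraction_at_each_hop([0.1, 0.2, 0.0]))
```

Under these assumptions the first AQM always sees an intact classifier,
and each loaded AQM reduces the share of packets that later AQMs can still
classify by ECT(1).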

In spite of the downside from these tradeoffs (and the work that would be
necessary to fix the specs and their deployment to capture the most value
from L4S), a signaling scheme with the backward compatibility that this
approach provides is what would make the key difference between a safely
deployable L4S and not, IMO.

As I said, I still stand by my previous claims.  In particular, I still
believe that DSCP is a reasonable and appropriate classifier for L4S
traffic at this stage of its maturity.

However, I also acknowledge that there's value in getting a quicker wide
deployment, as long as it can be done safely.  Since I believe ECT(1) with
the above signaling scheme can do so, I now think it's as reasonable a
choice as DSCP, and one that carries substantial benefits.

Since this approach would give almost the same benefits as "ECT(1) as
output", and also provides a classifier that can serve dualq's needs well
in most of the deployment scenarios, "ECT(1) as input" is my current
preference, because of my new belief that it can be made compatible
with RFC 3168 queues and still mostly get the classification job done.

I remain opposed to moving L4S forward in a way that's not compatible with
RFC 3168 queues, as it's currently proposed.

I also remain skeptical that it's possible to get the classic queue
detection working robustly; I think that's probably a dead end. And I
have become more skeptical of the viability of the queue protection
mechanisms mentioned, because those seem to require access to the layer 4
packet contents, which has been flagged as too hard to be practical.

So I remain skeptical of the safety stories told so far for the current
L4S proposal, because it has no MD fallback signal except loss or detection.

Best,
Jake

PS: I also have some mostly-supportive comments on Kyle's remarks about
the input/output question that might be relevant.  Thread is here and should
soon get a new message:
https://mailarchive.ietf.org/arch/msg/tsvwg/VhgCiE9dF6F2Z-eN9wkpeVG2LX0/

[1] Jake's original position on ECT(1):
https://mailarchive.ietf.org/arch/msg/tsvwg/Zrk7Up6g9BwfnJLjKD44K0riAg4/