Re: [tsvwg] start of WGLC on L4S drafts

"Holland, Jake" <jholland@akamai.com> Sat, 21 August 2021 20:11 UTC

From: "Holland, Jake" <jholland@akamai.com>
To: Wesley Eddy <wes@mti-systems.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Date: Sat, 21 Aug 2021 20:11:47 +0000
Message-ID: <C220377C-0A9A-4A0E-989A-2A8D19DE7475@akamai.com>
References: <7dd8896c-4cd8-9819-1f2a-e427b453d5f8@mti-systems.com>
In-Reply-To: <7dd8896c-4cd8-9819-1f2a-e427b453d5f8@mti-systems.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/nCEByvOoaL0xmgZt3vbKo_kiwC0>
Subject: Re: [tsvwg] start of WGLC on L4S drafts

Overall:
The documents are in much better shape than the last time I reviewed
them, thanks for all the improvements.

I'm reviewing l4s-id and l4s-arch but haven't had time to get to
dualq-aqm yet.  I wanted to make sure these comments got posted, since
I might not have more time before the deadline.

----

draft-ietf-tsvwg-ecn-l4s-id-19:
There are a few major issues.  These really should be fixed before
sending it to the IESG, IMO, but with these fixed I'd be happy to
see it shipped.

Major:

1. This should be proposed standard, not experimental.

This assigns a new and incompatible meaning to an IP header codepoint
in a way that can't be rolled back or kept isolated, with declared
intent to roll it out widely.  This should be considered a proposed
standard action, and it should update RFCs 3168 and 8311 to clearly
state that any existing marking queues should be changed where
possible to treat ECT(1) as either L4S or Not-ECT, even if the device
is not participating in an L4S-related experiment.

Such an update would align the specs with the stated expectations
that monitoring requirements should be in place mainly as a legacy
workaround, and would provide appropriate and potentially actionable
standards track mitigations that are available to on-path devices
implicated in any problems that are observed, whereas the specs as
currently written have none.

I understand that RFC 8311 was originally intended to cover this, but
that was published back when the L4S proposal still intended to make
the ECT(1)-marking compatible with classic marking via a robust
detection scheme, and RFC 8311 embeds that assumption throughout.  Now
that the proposal has changed to rolling this out without solving the
coexistence problem, RFC 8311's provisions are no longer adequate to
cover this IMO, and must be updated if we intend to have a self-
consistent set of specs published.

(Alternatively, a separate standards-track doc could recommend changing
ECT(1) handling to Not-ECT, but I really think there's no point
separating them, and such a doc would still have to be a prerequisite
for moving this forward IMO.  I suggest that the most expedient
path as well as the better choice here is to make l4s-id standards
track and to add the Not-ECT option for ECT(1) here as a reasonable
problem mitigation strategy available to marking bottlenecks.)
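
As a rough illustration of the ECT(1) handling suggested in point 1,
here is a minimal Python sketch; it is not taken from the drafts.  The
codepoint values are those of RFC 3168.  The names Ecn, ecn_of and
congestion_signal, and the strings returned, are hypothetical labels
made up for this example, and the CE case is simplified.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """The four values of the 2-bit IP ECN field (RFC 3168)."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def ecn_of(tos_byte: int) -> Ecn:
    """Extract the ECN field: the low 2 bits of the TOS/Traffic Class byte."""
    return Ecn(tos_byte & 0b11)


def congestion_signal(codepoint: Ecn, l4s_aware: bool) -> str:
    """Hypothetical policy: how a congested marking queue signals this packet.

    Assumption for illustration: a queue that is not L4S-aware treats
    ECT(1) exactly like Not-ECT, so it drops instead of CE-marking.
    """
    if codepoint is Ecn.ECT_1:
        return "l4s-mark" if l4s_aware else "drop"
    if codepoint is Ecn.ECT_0:
        return "classic-mark"
    if codepoint is Ecn.CE:
        # Simplification: an already-marked packet is forwarded unchanged.
        return "already-marked"
    return "drop"  # Not-ECT can only be signalled by loss


if __name__ == "__main__":
    for tos in (0x00, 0x01, 0x02, 0x03):
        cp = ecn_of(tos)
        print(cp.name, congestion_signal(cp, l4s_aware=False))
```

The point of the sketch is only that the Not-ECT option gives a
non-participating bottleneck a defined behavior for ECT(1) that never
applies classic marking to L4S traffic.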

2. This text from Section 4.3 should be strengthened to recommend
detection of paths that might be classic-marking L4S traffic (shared
queue or not):

   o  In uncontrolled environments, monitoring MUST be implemented to
      support detection of problems with an ECN-capable AQM at the path
      bottleneck that appears not to support L4S and might be in a
      shared queue.

Even in fq, the standing queue that will be built by a scalable cc with
classic marking is contrary to the "1-2ms latency at 99th percentile"
goal of L4S, even leaving aside the harm to colliding flows (though of
course the harm to colliding flows, including from hash collisions in
fq, is another reason to make this change).

3. This text from Section 4.3 should be strengthened to require that
sending L4S traffic in uncontrolled environments does not happen when
classic marking of L4S traffic is detected for a shared queue, and at
least recommend that it not happen even for fq:

   o  In uncontrolled environments, monitoring MUST be implemented to
      support detection of problems with an ECN-capable AQM at the path
      bottleneck that appears not to support L4S and might be in a
      shared queue.  Such monitoring SHOULD be applied to live traffic
      that is using Scalable congestion control.  Alternatively,
      monitoring need not be applied to live traffic, if monitoring has
      been arranged to cover the paths that live traffic takes through
      uncontrolled environments. 

The current text requires monitoring, but only gives a single SHOULD for
live traffic, plus non-normatively permits one alternative.  This allows
operators to monitor without ever cutting off L4S sending (so this
requirement as currently written would be satisfied by write-only
logging, for instance, with the SHOULD easily dismissible with an
"implementation complexity" hand-wave while still following the spec).

Suggested alternative, feel free to edit:

   o  In uncontrolled environments, L4S traffic MUST NOT be sent without
      monitoring to detect marking of L4S traffic by non-L4S bottlenecks.
      This monitoring can for example be performed on live traffic, or
      can rely on monitoring that covers the paths live traffic takes
      through uncontrolled environments.  Where non-L4S bottlenecks are
      observed marking L4S traffic, L4S sending MUST be disabled if the
      bottleneck is a shared queue, and SHOULD be disabled if it is FQ.
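
To make the intended requirement levels concrete, the suggested text
could be read roughly as the following decision function.  This is a
minimal Python sketch of my reading of the wording above, not an
implementation of any existing monitoring system.  The function name,
its parameters, and the "shared"/"fq" labels are hypothetical, and how
the bottleneck type gets detected is out of scope here.

```python
def l4s_sending_requirement(monitored: bool,
                            non_l4s_marking_seen: bool,
                            bottleneck: str = "") -> str:
    """Map the suggested Section 4.3 wording to a requirement level.

    All names are illustrative assumptions.  `bottleneck` is "shared"
    or "fq" and is only consulted when non-L4S marking was observed.
    """
    if not monitored:
        # "L4S traffic MUST NOT be sent without monitoring ..."
        return "MUST NOT send"
    if not non_l4s_marking_seen:
        return "may send"
    if bottleneck == "shared":
        # "... MUST be disabled if the bottleneck is a shared queue"
        return "MUST disable"
    if bottleneck == "fq":
        # "... and SHOULD be disabled if it is FQ"
        return "SHOULD disable"
    # The suggested text does not cover an unclassified bottleneck.
    raise ValueError(f"unknown bottleneck type: {bottleneck!r}")


if __name__ == "__main__":
    print(l4s_sending_requirement(False, False))
    print(l4s_sending_requirement(True, False))
    print(l4s_sending_requirement(True, True, "shared"))
    print(l4s_sending_requirement(True, True, "fq"))
```

Unlike the current draft text, every branch here ends in an action on
sending, so logging alone would not satisfy it.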

4. Although l4s-arch claims that l4s-id satisfies the RFC 4774
requirements, it's hard to tell whether it does so.  Specifically:

4.a. From section 7 of RFC 4774:
   Specifications of alternate ECN semantics must clearly state how they
   address the issues raised in this document, particularly the issues
   discussed in Section 2.

I don't see how issues 2 and 3 from section 2 are covered in l4s-id.
From section 4 of RFC 4774:
   (2) How does the possible presence of old routers affect the
       performance of the alternate-ECN connections?

   (3) How does the possible presence of old routers affect the
       coexistence of the alternate-ECN traffic with competing traffic
       on the path?

   When alternate semantics are defined for the ECN field, it is
   necessary to ensure that there are no problems caused by old routers
   along the path that don't understand the alternate ECN semantics.

4.b. Section 4 goes on to describe 3 options for how alternate ECN
semantics should be treated.  I don't see a claim in the L4S docs
specifying which of the 3 options the L4S spec for alternate ECN semantics
matches, but it implicitly appears to be trying to say it's option 3
(unsafe) and asserting that the detection and adaptive response satisfies
the requirement for isolation on option 3, I think?

Maybe there are sections in l4s-id that intend to cover these points,
and there just needs to be text listing what they are, but I don't think
the link is obvious enough to satisfy the "clearly state" requirement
from RFC 4774.  So it would be very helpful to add a list of references
to sections that are intended to address these RFC 4774 requirements.


Nits:
- the list in section 7.1 has a weird formatting problem for the sub-
  list:

   o  Did use of L4S over the Internet result in improvements to the
      following metrics:

   o

      *  queue delay (mean and 99th percentile) under various loads;

----

draft-ietf-tsvwg-l4s-arch-10:
Summary: Almost ready

+1 to Gorry's comments here, especially regarding the use of "all traffic":
https://mailarchive.ietf.org/arch/msg/tsvwg/vMMsQpXs65lk1E7NpV5RlmpyqdI/

Minor:
1. l4s-arch section 1:
   It has been demonstrated that, once access network bit rates reach
   levels now common in the developed world, increasing capacity offers
   diminishing returns if latency (delay) is not addressed.
- This needs a reference.

Editorial:
1. l4s-arch 4.2 section a:
- This is a confusing wall of text.  I think it would be better to
  give a much briefer summary here with a reference.  Exposition
  like "the obvious part" and "the less obvious part" is a minus
  here--I don't think the obviousness claims made here generalize
  well.

2. All the uses of underlining for emphasis are a minus.  The
  places where it seems necessary or useful are a good hint that
  the text on its own is not adequately capturing the intended
  meaning.
  Leaning on this kind of toned emphasis makes assumptions about
  connotations that don't hold very well even for native English
  speakers and break down entirely for non-native speakers, so it
  is generally out of place in a technical document that will need
  to be correctly interpreted by an international audience with many
  non-native speakers, IMO.

----

I'm going to be too pressed for time to do a more detailed review,
but I wanted to get the above comments in.

As a final aside, I'd like to see this happen.  The only goal I'm
pursuing at this point wrt this work is avoiding preventable harm
from pushing this out in a way that's likely to cause confusion
when and if problems are encountered in production.  (In particular,
I have given up on efforts to improve the signaling design, since
the authors have rejected all such suggestions and this work needs
to get moved out of the wg one way or another.)

Best,
Jake


On 07-29, 9:18 AM, "Wesley Eddy" <wes@mti-systems.com> wrote:

This message is starting a combined working group last call on 3 of the 
L4S drafts:

- Architecture: https://datatracker.ietf.org/doc/draft-ietf-tsvwg-l4s-arch/ 

- DualQ: 
https://datatracker.ietf.org/doc/draft-ietf-tsvwg-aqm-dualq-coupled/ 

- ECN ID: https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/ 

The WGLC will last through 4 weeks from today, and then we'll see what 
to do next.  Please submit any comments you have on these to the TSVWG 
list in that timeframe.

The chairs are considering a possible virtual interim following the 
close in order to work through feedback received.

The work on the L4S operational guidance draft is continuing in 
parallel, but that draft is not being last called yet.