[tcpm] Comments on draft-ietf-tcpm-generalized-ecn

"Scharf, Michael (Nokia - DE/Stuttgart)" <michael.scharf@nokia.com> Sun, 03 December 2017 19:17 UTC

To: "draft-ietf-tcpm-generalized-ecn@ietf.org" <draft-ietf-tcpm-generalized-ecn@ietf.org>, "tcpm@ietf.org" <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/XxNrBz80UuLcXP2Cm_XI0plTvXE>
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>

Hi all,

I have read draft-ietf-tcpm-generalized-ecn-02. Some high-level thoughts:

- I believe that this experiment should be specified independently from other experiments, including AccECN. I believe this is by and large possible. A separation may perhaps require moving the SYN handling to a new I-D that is more closely tied into AccECN, but at first sight I don't see a major problem with this. For instance, I could think of scenarios in which it may make sense to *only* add ECN support to TCP control packets without deploying any other experiment.

- Apart from the unnecessary complexity resulting from the dependencies on AccECN, I think this document has a lot of other text (most notably in Section 4) that is not really relevant to implementers. Maybe one could consider moving more non-normative discussion to an appendix.

- I would suggest moving all discussion of implementation guidance, such as caching strategies, to one (non-normative) section.

Please find below my comments marked as [ms]. I have read the document independently of the review from Gorry. I apologize if there is duplication.

Thanks

Michael (with no hat on)


******************************

* Abstract

   This document describes an experimental modification to ECN when used
   with TCP.  It allows the use of ECN on the following TCP packets:
   SYNs, pure ACKs, Window probes, FINs, RSTs and retransmissions.

[ms] Obsoletion of RFC 5562 should probably be mentioned here. To me, it is not super-important to have the SYN specification in this document; it could, e.g., be moved into a separate document.


* 1.  Introduction

   RFC 5562 [RFC5562] is an experimental modification to ECN that enables
   ECN support for TCP SYN-ACK packets.

[ms] If the RFC is obsoleted, it should probably be mentioned here, too.

   ECN++ is a sender-side change.  It works whether the two ends of the
   TCP connection use classic ECN feedback [RFC3168] or experimental
   Accurate ECN feedback (AccECN [I-D.ietf-tcpm-accurate-ecn]).
   Nonetheless, if the client does not implement AccECN, it cannot use
   ECN++ on the one packet that offers most benefit from it - the
   initial SYN.  Therefore, implementers of ECN++ are RECOMMENDED to
   also implement AccECN.

[ms] I don't think that this coupling of this experiment and the AccECN experiment is future-proof. We have to foresee, e.g., the case that only one of the two experiments succeeds, e.g., that ECN gets enabled on control packets w/o AccECN. As a result, I think this specification should be independent of AccECN. This may require some changes to the description of the SYN processing, e.g., moving the SYN handling to a separate document. Such a smaller, focused document may possibly depend on this experiment and AccECN. In a nutshell, I disagree with this section and the following content that creates a dependency between this document and AccECN.

   ECN++ is designed for compatibility with a number of latency
   improvements to TCP such as TCP Fast Open (TFO [RFC7413]), initial
   window of 10 SMSS (IW10 [RFC6928]) and Low latency Low Loss Scalable
   Transport (L4S [I-D.ietf-tsvwg-l4s-arch]), but they can all be
   implemented and deployed independently.

[ms] I don't understand why this text is then needed.

   [I-D.ietf-tsvwg-ecn-experimentation] is a standards track procedural
   device that relaxes requirements in RFC 3168 and other standards
   track RFCs that would otherwise preclude the experimental
   modifications needed for ECN++ and other ECN experiments.

[ms] I am not sure if such tutorial text is really needed.


* 1.1.  Motivation

   The absence of ECN support on TCP control packets and retransmissions
   has a potential harmful effect.  In any ECN deployment, non-ECN-
   capable packets suffer a penalty when they traverse a congested
   bottleneck.

[ms] I don't understand why this is true in "any" ECN deployment. To me, how non-ECN-capable packets are handled, e.g., during congestion, is a matter of network policy, and at least theoretically different options are possible. I would rephrase this, e.g., as follows: "There is a risk that non-ECN-capable packets suffer a penalty...".

   Non-ECN control packets particularly harm performance in environments
   where the ECN marking level is high.  For example, [judd-nsdi] shows
   that in a controlled private data centre (DC) environment where ECN
   is used (in conjunction with DCTCP [RFC8257]), the probability of
   being able to establish a new connection using a non-ECN SYN packet
   drops to close to zero even when there are only 16 ongoing TCP flows
   transmitting at full speed.  The issue is that DCTCP exhibits a much
   more aggressive response to packet marking (which is why it is only
   applicable in controlled environments).  This leads to a high marking
   probability for ECN-capable packets, and in turn a high drop
   probability for non-ECN packets.  Therefore non-ECN SYNs are dropped
   aggressively, rendering it nearly impossible to establish a new
   connection in the presence of even mild traffic load.

[ms] I think this section describes a particular network deployment case, and not all assumptions are clearly spelt out. For instance, if the DCTCP traffic was assigned to another DiffServ class, it is not clear if the problem would be exactly the same. Instead of describing this specific case, I think the introduction could more generally emphasize the benefit of ECT in different cases, beyond DCTCP.

   Finally, there are ongoing experimental efforts to promote the
   adoption of a slightly modified variant of DCTCP (and similar
   congestion controls) over the Internet to achieve low latency, low
   loss and scalable throughput (L4S) for all communications
   [I-D.ietf-tsvwg-l4s-arch].  In such an approach, L4S packets identify
   themselves using an ECN codepoint [I-D.ietf-tsvwg-ecn-l4s-id].  With
   L4S and potentially other similar cases, preventing TCP control
   packets from obtaining the benefits of ECN would not only expose them
   to the prevailing level of congestion loss, but it would also
   classify control packets into a different queue with different
   network treatment, which may also lead to reordering, further
   degrading TCP performance.

[ms] It is understood that the L4S experiment may benefit from this I-D. For this, it would be sufficient to refer to this document from the L4S spec. We don't know the outcome of the various experiments, and as a result I prefer that published RFCs are valid independently of other experiments.


* 1.2.  Experiment Goals

[ms] This section looks good, but I suggest merging it with the "MEASUREMENT NEEDED" discussion. Is it not possible to have a *single* list of what experiments and measurements are needed, in one place in the I-D?


* 1.3.  Document Structure

[ms] This section can hopefully be just removed. The fact that it is present shows that the document may still have room for editorial improvement, and, most notably, shortening.


* 2.  Terminology

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

[ms] RFC 8174?

   ECT: ECN-Capable Transport.  One of the two codepoints ECT(0) or
   ECT(1) in the ECN field [RFC3168] of the IP header (v4 or v6).  An
   ECN-capable sender sets one of these to indicate that both transport
   end-points support ECN.  When this specification says the sender sets
   an ECT codepoint, by default it means ECT(0).  Optionally, it could
   mean ECT(1), which is in the process of being redefined for use by
   L4S experiments [I-D.ietf-tsvwg-ecn-experimentation]
   [I-D.ietf-tsvwg-ecn-l4s-id].

[ms] I don't understand why this has to be mentioned here. To me, it seems sufficient to reference related work at one single place in the document, in order to avoid dependencies.
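For readers less familiar with the field the quoted definition refers to, here is a minimal sketch of the RFC 3168 ECN codepoints, which occupy the two low-order bits of the IPv4 TOS / IPv6 Traffic Class octet. The names ECN, ecn_of and set_ecn are illustrative only and do not come from the draft.

```python
from enum import IntEnum


class ECN(IntEnum):
    """The four ECN codepoints of RFC 3168 (two low-order bits of the octet)."""
    NOT_ECT = 0b00  # sender does not claim ECN capability
    ECT_1 = 0b01    # ECN-Capable Transport, codepoint ECT(1)
    ECT_0 = 0b10    # ECN-Capable Transport, codepoint ECT(0), the default here
    CE = 0b11       # Congestion Experienced, set by the network


def ecn_of(tos_octet: int) -> ECN:
    """Extract the ECN codepoint from a TOS / Traffic Class octet."""
    return ECN(tos_octet & 0b11)


def set_ecn(tos_octet: int, codepoint: ECN) -> int:
    """Return the octet with its ECN bits replaced, leaving the DSCP bits as-is."""
    return (tos_octet & 0xFC) | int(codepoint)


if __name__ == "__main__":
    octet = set_ecn(0xB8, ECN.ECT_0)  # 0xB8 is an arbitrary example DSCP value
    print(hex(octet), ecn_of(octet).name)
```

"Setting ECT" in the quoted text then simply means writing ECT(0) (or, optionally, ECT(1)) into those two bits.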


* 3.  Specification

* 3.1.  Network (e.g.  Firewall) Behaviour

[ms] This could perhaps be moved to a dedicated section targeting switch/router/firewall/... implementers, who may not look for guidance in a section targeting TCP stack implementers. Just a random thought.

   This is likely to only involve
   a firewall rule change in a fraction of cases (at most 0.4% of paths
   according to the tests reported in Section 4.2.2).

[ms] This sort of data can easily get outdated and does not really fit into a normative section, IMHO.

* 3.2.  Endpoint Behaviour

   It can be seen that the sender can set ECT in all cases, except if it
   is not requesting AccECN feedback on the SYN.  Therefore it is
   RECOMMENDED that the experimental AccECN specification
   [I-D.ietf-tcpm-accurate-ecn] is implemented (as well as the present
   specification), because it is expected that ECT on the SYN will give
   the most significant performance gain, particularly for short flows.
   Nonetheless, this specification also caters for the case where AccECN
   feedback is not implemented.

[ms] As mentioned before, I would prefer that *this* document only discusses the case without AccECN. Possibly this requires that the SYN handling is moved to a separate document. To me, such a split would both be more future-proof and result in simpler documents. For instance, table 1 would be much simpler in this document. Such a change would impact quite a number of the following sections, but at first sight I don't see a reason that would prevent such a reorganization.
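To illustrate how little of the rule quoted above actually depends on AccECN, here is a minimal sketch of the sender-side decision as I read it from that paragraph. The function name, the packet-type labels and the requests_accecn flag are hypothetical; the packet types are the ones listed in the abstract plus the SYN-ACK.

```python
# Hypothetical labels for the packet types covered by the draft.
CONTROL_PACKETS = {
    "SYN", "SYN-ACK", "PURE-ACK", "WINDOW-PROBE", "FIN", "RST", "RETRANSMISSION",
}


def may_set_ect(packet_type: str, requests_accecn: bool) -> bool:
    """Return True if the sender may set ECT on this packet type.

    Per the quoted text, ECT is allowed in all cases except a SYN that
    does not request AccECN feedback.
    """
    if packet_type not in CONTROL_PACKETS:
        raise ValueError(f"unknown packet type: {packet_type}")
    if packet_type == "SYN":
        return requests_accecn
    return True


if __name__ == "__main__":
    for ptype in sorted(CONTROL_PACKETS):
        print(ptype, may_set_ect(ptype, requests_accecn=False))
```

Only the SYN branch looks at AccECN; every other packet type is handled identically with or without it, which is the part that could stay in this document after a split.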


* 3.2.1.  SYN

[ms] As mentioned already, that whole section could move to another I-D. I am not sure if SYN would then be discussed in detail in this document. The statement in this I-D could perhaps be as simple as "Therefore, a TCP initiator MUST NOT set ECT on a SYN unless it also negotiates a mechanism for feedback to CE marks on SYNs. An example for such a feedback scheme is draft-ietf-tcpm-accurate-ecn. The specification can be found in draft-ietf-tcpm-foo."


* 3.2.1.1.  Setting ECT on the SYN

   With classic [RFC3168] ECN feedback, the SYN was never expected to be
   ECN-capable, so the flag provided to feed back congestion was put to
   another use (it is used in combination with other flags to indicate
   that the responder supports ECN).  In contrast, Accurate ECN (AccECN)
   feedback [I-D.ietf-tcpm-accurate-ecn] provides two codepoints in the
   SYN-ACK for the responder to feed back whether or not the SYN arrived
   marked CE.

   Therefore, a TCP initiator MUST NOT set ECT on a SYN unless it also
   attempts to negotiate Accurate ECN feedback in the same SYN.

[ms] Well, I don't think that Accurate ECN would be the only potential experiment to deal with CE marks on SYNs. For instance, one could design another feedback scheme for CE marks on SYNs that leaves other parts of ECN as they are today. ECT-on-SYNs clearly requires experimentation, but that specific protocol aspect is to me by and large orthogonal to the other TCP control packets. This is why I believe the spec depending on AccECN belongs in a separate I-D.


* 3.2.1.2.  Caching Lack of AccECN Support for ECT on SYNs

[ms] The whole caching discussion IMHO belongs in one (non-normative) place and is confusing so early in the document. The text can also be shortened.


* 3.2.1.3.  SYN Congestion Response

   If the SYN-ACK returned to the TCP initiator confirms that the server
   supports AccECN, it will also indicate whether or not the SYN was CE-
   marked.  If the SYN was CE-marked, the initiator MUST reduce its
   Initial Window (IW) and SHOULD reduce it to 1 SMSS (sender maximum
   segment size).

   If ECT has been set on the SYN and if the SYN-ACK shows that the
   server does not support AccECN, the TCP initiator MUST conservatively
   reduce its Initial Window and SHOULD reduce it to 1 SMSS.  A
   reduction to greater than 1 SMSS MAY be appropriate (see
   Section 4.2.1).  Conservatism is necessary because a non-AccECN SYN-
   ACK cannot show whether the SYN was CE-marked.

[ms] This congestion control algorithm may be one example where experimentation is needed prior to a proposed standard, and as congestion control evolves, it is possible that a PS would have different guidance. But this experiment would be quite independent from the other TCP control packets, which raise far fewer questions about the congestion control interaction. To me, this is yet another reason to separate the experiments.
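For discussion purposes, the two quoted paragraphs can be read as the following decision rule. This is only a sketch under stated assumptions: the function and parameter names are hypothetical, windows are counted in units of SMSS, and the sketch always picks the SHOULD value of 1 SMSS although the text permits a larger reduced window in some cases.

```python
def initiator_initial_window(
    default_iw: int,
    ect_on_syn: bool,
    server_supports_accecn: bool,
    syn_was_ce_marked: bool,
) -> int:
    """Initial window (in SMSS) for the TCP initiator after the SYN-ACK arrives."""
    reduced_iw = 1  # assumption: always use the SHOULD value of 1 SMSS

    if not ect_on_syn:
        # A not-ECT SYN cannot have been CE-marked; nothing to respond to.
        return default_iw
    if not server_supports_accecn:
        # A non-AccECN SYN-ACK cannot show whether the SYN was CE-marked,
        # so the initiator has to be conservative.
        return reduced_iw
    if syn_was_ce_marked:
        # AccECN feedback reports the CE mark on the SYN.
        return reduced_iw
    return default_iw


if __name__ == "__main__":
    # Example: an IW10 client whose ECT SYN reached a non-AccECN server.
    print(initiator_initial_window(10, True, False, False))
```

The second branch is the one that couples the congestion response to the feedback scheme, which is why I see it as belonging with the SYN-specific experiment.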

* 3.2.1.4.  Fall-Back Following No Response to an ECT SYN

   An ECT SYN might be lost due to an over-zealous path element (or
   server) blocking ECT packets that do not conform to RFC 3168.  Some
   evidence of this was found in a 2014 study [ecn-pam], but in a more
   recent study using 2017 data [Mandalari18] extensive measurements
   found no case where ECT on TCP control packets was treated any
   differently from ECT on TCP data packets.  Loss is commonplace for
   numerous other reasons, e.g. congestion loss at a non-ECN queue on
   the forward or reverse path, transmission errors, etc.

[ms] Such data easily gets outdated. IMHO it does not belong in normative sections of an RFC (the same comment also applies elsewhere). Typically, measurements can be discussed in an appendix.

   Other fall-back strategies MAY be adopted where applicable (see
   Section 4.2.2 for suggestions, and the conditions under which they
   would apply).

[ms] Given that, a lot of content seems to be not normative and can probably be moved elsewhere (or even be removed).


* 3.2.2.1.  Setting ECT on the SYN-ACK

   Some classic ECN implementations might ignore a CE-mark on a SYN-ACK,
   or even ignore a SYN-ACK packet entirely if it is set to ECT or CE.
   This is a possibility because an RFC 3168 implementation would not
   necessarily expect a SYN-ACK to be ECN-capable.

      FOR DISCUSSION: To eliminate this problem, the WG could decide to
      prohibit setting ECT on SYN-ACKs unless AccECN has been
      negotiated.  However, this issue already came up when the IETF
      first decided to experiment with ECN on SYN-ACKs [RFC5562] and it
      was decided to go ahead without any extra precautionary measures

[ms] As there seem to be widely deployed "congestion control" algorithms that decide to ignore loss, I am not too worried about some cases where a TCP endpoint may ignore a single CE-mark. I may miss something, but I don't see how that could really result in congestion collapse or significant unfairness in today's Internet. A router that is really congested will drop *many* packets and thereby trigger the congestion control. Of course, I am here assuming that any transport protocol indeed reacts to loss by reducing the load immediately.

* 3.2.2.2.  SYN-ACK Congestion Response

   A host that sets ECT on SYN-ACKs MUST reduce its initial window in
   response to any congestion feedback, whether using classic ECN or
   AccECN.  It SHOULD reduce it to 1 SMSS.

[ms] I think this document can be simplified by avoiding this sort of distinction wherever there is no difference.

   This is different to the behaviour specified in an earlier
   experiment that set ECT on the SYN-ACK [RFC5562].  This is justified
   in Section 4.3.

[ms] If RFC 5562 is obsoleted, I think a better wording would be of the form "This experiment updates the behavior defined in RFC 5562 as follows: ..."


* 3.2.2.3.  Fall-Back Following No Response to an ECT SYN-ACK

   This fall-back strategy attempts to use ECT one more time than the
   strategy for ECT SYN-ACKs in [RFC5562] (which is made obsolete, being
   superseded by the present specification).

[ms] I suggest simplifying such text by explicitly stating how RFC 5562 is updated.

   Other fall-back strategies
   MAY be adopted if found to be more effective, e.g. fall-back to not-
   ECT on the first retransmission attempt.

[ms] In general, implementation guidance e.g. on heuristics could IMHO be moved to one place in the document, outside the normative part.


* 3.2.3.  Pure ACK

   For the experiments proposed here, the TCP implementation will set
   ECT on pure ACKs.  It can ignore the requirement in section 6.1.4 of
   RFC 3168 to set not-ECT on a pure ACK.

   A host that sets ECT on pure ACKs MUST reduce its congestion window
   in response to any congestion feedback, in order to regulate any data
   segments it might be sending amongst the pure ACKs. {ToDo: Write-up
   reconsideration of this requirement in the light of WG comments.} It
   MAY also implement AckCC [RFC5690] to regulate the pure ACK rate, but
   this is not required.  Note that, in comparison, TCP Congestion
   Control [RFC5681] does not require a TCP to detect or respond to loss
   of pure ACKs at all; it requires no reduction in congestion window or
   ACK rate.

   The question of whether the receiver of pure ACKs is required to feed
   back any CE marks on them is a matter for the relevant feedback
   specification ([RFC3168] or [I-D.ietf-tcpm-accurate-ecn]).  It is
   outside the scope of the present specification.  Currently AccECN
   feedback is required to count CE marking of any control packet
   including pure ACKs.  Whereas RFC 3168 is silent on this point, so
   feedback of CE-markings might be implementation specific (see
   Section 4.4.1).

      DISCUSSION: An AccECN deployment or an implementation of RFC 3168
      that feeds back CE on pure ACKs will be at a disadvantage compared
      to an RFC 3168 implementation that does not.  To solve this, the
      WG could decide to prohibit setting ECT on pure ACKs unless AccECN
      has been negotiated.  If it does, the penultimate sentence of the
      Introduction will need to be modified.

[ms] For what it is worth, I would personally go for a more liberal specification, albeit I may here be on the rough side of the consensus. Will the world end if a TCP sender ignores a CE mark on a pure ACK? We somehow know that ignoring ACK loss in congestion control will not immediately cause congestion collapse in the Internet... So in which case is this really a relevant problem? But, sure, I may miss something here...


* 3.2.8.  General Fall-back for any Control Packet or Retransmission

[ms] I don't understand why this content is needed in a normative section. (And I also struggle with understanding the need for Section 4.9.)


* 4.  Rationale

[ms] This is a very long section with limited added value to implementers. An alternative would be to move the arguments to an appendix, and to clearly separate content that would matter to implementers.

* 4.1.  The Reliability Argument

[ms] What I somehow miss in this section (and the following ones) is the question of whether a sender really has to react to a CE mark in the same way as to a loss. There is ongoing experimentation in this space (draft-ietf-tcpm-alternativebackoff-ecn). Of course, we don't know the outcome of that other experiment, and I certainly don't want to create a dependency. Anyway, my high-level feeling is that for an experiment, this document is pretty conservative compared to what the Internet is today. But this is just my observation; this comment does not ask for any text change.

* 4.2.  SYNs

[ms] To me, this discussion is too much tied into AccECN. As mentioned before, for example, if we defined another way to feed back CE marks on SYNs in future, would this section still apply?

* 4.2.1.  Argument 1a: Unrecognized CE on the SYN

[ms] I have quite some doubts on this section and later subsections, but it may be better to sort out the high-level questions on SYNs first. As an editorial nit, I am not sure if the explanation for "S3" should dig into different data center designs.


* 5.  Interaction with popular variants or derivatives of TCP

   The following subsections discuss any interactions between setting
   ECT on all packets and using the following popular variants of TCP:
   IW10 and TFO.  It also briefly notes the possibility that the
   principles applied here should translate to protocols derived from
   TCP.  This section is informative not normative, because no
   interactions have been identified that require any change to
   specifications.  The subsection on IW10 discusses potential changes
   to specifications but recommends that no changes are needed.

   The designs of the following TCP variants have also been assessed and
   found not to interact adversely with ECT on TCP control packets: SYN
   cookies (see Appendix A of [RFC4987] and section 3.1 of [RFC5562]),
   TCP Fast Open (TFO [RFC7413]) and L4S [I-D.ietf-tsvwg-l4s-arch].

[ms] These two paragraphs can IMHO be removed without any loss of information.


* 5.1.  IW10

   If the initiator implements IW10, it seems rather over-conservative
   to reduce IW from 10 to 1 just in case a congestion marking was
   missed.  Nonetheless, the reduction to 1 SMSS will rarely harm
   performance, because:

   o  as long as the initiator is caching failures to negotiate AccECN,
      subsequent attempts to access the same server will not use ECT on
      the SYN anyway, so there will no longer be any need to
      conservatively reduce IW;

   o  currently it is not common for a TCP initiator (client) to have
      more than one data segment to send {ToDo: evidence/reference?} -
      IW10 is primarily exploited by TCP servers.

[ms] As IW10 seems pretty widely deployed, I wonder if that statement is indeed true for the broad set of use cases of TCP, most notably outside the WWW. I don't know if such text will indeed convince implementers. For sure we can publish whatever requirement we want in an RFC, but I see the risk that implementers will consider this guidance overly restrictive and e.g. just ignore this wording entirely, if it just focuses e.g. on the WWW.

   If a responder receives feedback that the SYN-ACK was CE-marked,
   Section 3.2.2.2 mandates that it reduces its initial window to 1
   SMSS.  When the responder also implements IW10, it is particularly
   important to adhere to this requirement in order to avoid overflowing
   a queue that is clearly already congested.

[ms] If the queue was "clearly already congested", it might drop packets instead of marking them. I think it depends on the AQM whether one can infer a risk of "overflowing a queue" from an ECN mark. Instead of just stating normative limits, I think it could be valuable to implementers to explain that if the initial window is not reduced, there will be a significant risk of packet drops in that first window, with the corresponding impact on performance. Reacting to the CE mark reduces this risk and may thus result in better performance for this connection.


* 5.3.  TCP Derivatives

[ms] I think this can be removed.


* 6.  Security Considerations

[ms] Aren't there privacy implications, e.g., for fingerprinting?