Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

Bob Briscoe <> Thu, 25 July 2019 20:51 UTC

To: Sebastian Moeller <>
Cc: "De Schepper, Koen (Nokia - BE/Antwerp)" <>, "Black, David" <>, "" <>, "" <>, Dave Taht <>
From: Bob Briscoe <>
Date: Thu, 25 Jul 2019 16:51:45 -0400


The protocol ID identifies the wire protocol, not the congestion control 
behaviour. If we had used a different protocol ID for each congestion 
control behaviour, we'd have run out of protocol IDs long ago (semi 
serious ;)

This is a re-run of a debate that has already been had (in Jul 2015 - 
Nov 2016), which is recorded in Appendix B.4 of ecn-l4s-id, quoted and 
annotated below:

> B.4.  Protocol ID
>     It has been suggested that a new ID in the IPv4 Protocol field or the
>     IPv6 Next Header field could identify L4S packets.  However this
>     approach is ruled out by numerous problems:
>     o  A new protocol ID would need to be paired with the old one for
>        each transport (TCP, SCTP, UDP, etc.);
>     o  In IPv6, there can be a sequence of Next Header fields, and it
>        would not be obvious which one would be expected to identify a
>        network service like L4S;

In particular, the protocol ID / next header stays next to the upper 
layer header as a PDU gets encapsulated, possibly many times. So the 
protocol ID is not necessarily (rarely?) in the outer header, 
particularly in IPv6, and it might be encrypted in IPsec.
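
To illustrate how far from the outer header the upper-layer protocol ID can sit, here is a minimal Python sketch that walks an IPv6 Next Header chain. It is only an illustration, not taken from any draft: the helper names (`fixed_header`, `ext`, `upper_layer_protocol`) are hypothetical, only the common extension headers are handled, and non-first fragments and nested tunnels are ignored.

```python
# Next Header values for the common IPv6 extension headers.
HOP_BY_HOP, ROUTING, FRAGMENT, ESP, AH, DEST_OPTS = 0, 43, 44, 50, 51, 60
TCP = 6


def fixed_header(next_header: int) -> bytes:
    """Hypothetical helper: a 40-byte IPv6 fixed header with zeroed addresses."""
    return bytes([0x60, 0, 0, 0, 0, 0, next_header, 64]) + bytes(32)


def ext(next_header: int) -> bytes:
    """Hypothetical helper: a minimal 8-byte extension header (Hdr Ext Len = 0)."""
    return bytes([next_header, 0]) + bytes(6)


def upper_layer_protocol(packet: bytes):
    """Return (protocol, headers_skipped); protocol is None if hidden by ESP."""
    next_header = packet[6]   # Next Header field of the fixed header
    offset = 40               # first byte after the fixed header
    skipped = 0
    while True:
        if next_header in (HOP_BY_HOP, ROUTING, DEST_OPTS):
            length = (packet[offset + 1] + 1) * 8   # units of 8 octets
        elif next_header == FRAGMENT:
            length = 8                              # fixed size
        elif next_header == AH:
            length = (packet[offset + 1] + 2) * 4   # units of 4 octets
        elif next_header == ESP:
            return None, skipped                    # the rest is encrypted
        else:
            return next_header, skipped             # upper layer (or a tunnel)
        next_header = packet[offset]
        offset += length
        skipped += 1


packet = fixed_header(HOP_BY_HOP) + ext(DEST_OPTS) + ext(TCP)
print(upper_layer_protocol(packet))
print(upper_layer_protocol(fixed_header(ESP) + bytes(16)))
```

A node that wanted to classify on the protocol ID would have to do this walk for every packet, and it gets no answer at all once ESP is in the chain.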

>     o  A new protocol ID would rarely provide an end-to-end service,
>        because it is well-known that new protocol IDs are often blocked
>        by numerous types of middlebox;
>     o  The approach is not a solution for AQMs below the IP layer;

That last point means that the protocol ID is not designed to always 
propagate to the outer header on encap and back from the outer header 
on decap, whereas the ECN field is (and it's the only field that is).
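
The difference can be sketched in a few lines of Python. This is a toy model under stated assumptions, not an implementation: the `Header` class and the function names are hypothetical, the tunnel is plain IP-in-IP (protocol 4), and the decapsulation rule follows the combination table of RFC 6040 as I understand it, so check the RFC before relying on any particular cell.

```python
from dataclasses import dataclass, replace
from typing import Optional

# ECN codepoints (the two-bit ECN field in the IP header).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11
# IP protocol numbers used in this sketch.
TCP, IP_IN_IP = 6, 4


@dataclass(frozen=True)
class Header:
    """Hypothetical, heavily reduced IP header: protocol ID, ECN, payload."""
    protocol: int
    ecn: int
    inner: Optional["Header"] = None


def encapsulate(pkt: Header) -> Header:
    """Tunnel ingress: the ECN field is copied out, the protocol ID is not."""
    return Header(protocol=IP_IN_IP, ecn=pkt.ecn, inner=pkt)


def congestion_mark(pkt: Header) -> Header:
    """An AQM that only sees the outer header marks CE on ECN-capable packets."""
    return replace(pkt, ecn=CE) if pkt.ecn != NOT_ECT else pkt


def decapsulate(outer: Header) -> Optional[Header]:
    """Tunnel egress: fold the outer ECN field back into the inner header."""
    inner = outer.inner
    if inner is None:
        raise ValueError("not an encapsulated packet")
    if inner.ecn == NOT_ECT:
        # A CE mark on a packet whose inner header is not ECN-capable: drop.
        return None if outer.ecn == CE else inner
    if outer.ecn == CE or (outer.ecn == ECT1 and inner.ecn == ECT0):
        return replace(inner, ecn=outer.ecn)
    return inner


l4s_packet = Header(protocol=TCP, ecn=ECT1)
in_tunnel = encapsulate(l4s_packet)
print(in_tunnel.protocol, in_tunnel.ecn)    # outer shows the tunnel, but ECT(1)
delivered = decapsulate(congestion_mark(in_tunnel))
print(delivered.protocol, delivered.ecn)    # inner protocol intact, CE propagated
```

Inside the tunnel, a classifier looking at the outer header sees protocol 4 rather than TCP, yet it still sees ECT(1), and its CE mark survives decapsulation.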


On 21/07/2019 16:48, Sebastian Moeller wrote:
> Dear Bob,
>> On Jul 21, 2019, at 21:14, Bob Briscoe <> wrote:
>> Sebastian,
>> On 21/07/2019 17:08, Sebastian Moeller wrote:
>>> Hi Bob,
>>>> On Jul 21, 2019, at 14:30, Bob Briscoe <>
>>>>   wrote:
>>>> David,
>>>> On 19/07/2019 21:06, Black, David wrote:
>>>>> Two comments as an individual, not as a WG chair:
>>>>>> Mostly, they're things that an end-host algorithm needs
>>>>>> to do in order to behave nicely, that might be good things anyways
>>>>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>>>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>>>>> ones you think are too rigid ... maybe they can be loosened?
>>>>> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>>>>> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
>>>> As you know, we have been at pains to address every concern about L4S that has come up over the years, and I thought we had addressed this one to your satisfaction.
>>>> The reliable transports you are concerned about require ordered delivery by the underlying fabric, so they can only ever exist in a controlled environment. In such a controlled environment, your ECT1+DSCP idea (below) could be used to isolate the L4S experiment from these transports and their firmware/hardware constraints.
>>>> On the public Internet, the DSCP commonly gets wiped at the first hop. So requiring a DSCP as well as ECT1 to separate off L4S would serve no useful purpose: it would still lead to ECT1 packets without the DSCP sent from scalable congestion controls (which is behind Jonathan's concern in response to you).
>>> 	And this is why IPv4's protocol field / IPv6's next header field are the classifier you actually need... You are changing a significant portion of TCP's observable behavior, so it can be argued that TCP-Prague is TCP in name only; this "classifier" still lives in the IP header, so no deeper layers need to be accessed; it is non-leaky in that the classifier is unambiguously present independent of the value of the ECN bits; and it is also compatible with SCE-style ECN signaling. Since I believe the most/only likely roll-out of L4S is going to be at the ISPs' access nodes (BRAS/BNG/CMTS/whatever), middleboxes should not be an insurmountable problem, as ISPs control their own middleboxes and often even the CPEs, so protocol ossification is not going to be a showstopper for this part of the roll-out.
>>> Best Regards
>>> 	Sebastian
>> I think you've understood this from reading an abbreviated description of the requirement on the list, rather than the spec. The spec. solely says:
>> 	A scalable congestion control MUST detect loss by counting in time-based units
>> That's all. No more, no less.
>> People call this the "RACK requirement", purely because the idea came from RACK. There is no requirement to do RACK, and the requirement applies to all transports, not just TCP.
> 	Fair enough, but my argument was not really about RACK at all, it more-so applies to the linear response to CE-marks that ECT(1) promises in the L4S approach. You are making changes to TCP's congestion controller that make it cease to be "TCP-friendly" (for arguably good reasons). So why insist on pretending that this is still TCP? So give it a new protocol ID already and all your classification needs are solved. As a bonus you do not need to use the same signal (CE) to elicit two different responses, but you could use the re-gained ECT(1) code point similarly to SCE to put the new fine-grained congestion signal into... while using CE in the RFC3168 compliant sense.
>> It then means that a packet with ECT1 in the IP field can be forwarded without resequencing (no requirement - it just /can/ be).
> 	Packets always "can" be forwarded without resequencing, the question is whether the end-points are going to like that...
> And IMHO even RACK, with its reordering window of at most one RTT, gives intermediate hops not much to work with: without knowing the full RTT, a cautious hop might allow itself one retransmission slot (so its own contribution to the RTT), but as far as I can tell they do that already. And tracking the RTT will require keeping per-flow statistics, which also seems like it can get computationally expensive quickly... (I probably misunderstand how RACK works, but I fail to see how it will really allow more re-ordering, but that is also orthogonal to the L4S issues I try to raise).
>> This is a network layer 'unordered delivery' property, so it's appropriate to flag at the IP layer.
> 	But at that point you are multiplexing multiple things into the poor ECT(1) codepoint: the promise of a certain "linear" back-off behavior on encountered congestion AND an "allow relaxed ordering" flag ("detect loss by counting in time-based units" does not seem to be fully equivalent to a generic tolerance of 'unordered delivery', as far as I understand). That seems to be asking too much of a simple number...
> Best Regards
> 	Sebastian
>> Bob
>> -- 
>> ________________________________________________________________
>> Bob Briscoe

Bob Briscoe