Re: [tsvwg] Deprecating RFC 3168 for future ECN experimentation

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Mon, 29 March 2021 09:49 UTC

To: "Bless, Roland (TM)" <roland.bless@kit.edu>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/2udlB13GomGH2F5CIrJISn3Ch7A>

If you have good deployment data on devices with ECN enabled by default, 
then please share it.

It would be extremely helpful to also know if the home routers you 
mention were deployed using FQ.

Gorry

On 29/03/2021 09:26, Bless, Roland (TM) wrote:
> Hi,
>
> On 27.03.21 at 15:41 Pete Heist wrote:
>> I agree overall. If we want to introduce a proposal that's incompatible
>> with RFC3168, we should first make it historic.
>
> I just want to add that RFC 3168 is a proposed standard, whereas we
> want to introduce L4S as an _experiment_. All the recent work of the
> AQM WG would also be void, and I thought that vendors are actually
> implementing this in home routers. I think that we must be really sure
> about RFC3168 (non-)deployment before deprecating RFC3168. Even RFC 8311
> still stated: "Forwarding behavior as described in RFC 3168 remains the
> preferred approach for routers that are not involved in ECN 
> experiments,".
>
>> Before we do that though, we should make sure that the current CE is
>> not actually useful. Figure 5 in this paper suggests some benefit to
>> two bits of signal as opposed to one:
>> http://buffer-workshop.stanford.edu/papers/paper34.pdf
>>
>> A second signal provides a harder backoff without packet loss, for
>> example during capacity changes or flow introductions. It wouldn't be
>> ideal to deprecate RFC3168, only to find out that another bit of signal
>> in line with CE, along with ABE in RFC8511, or something similarly
>> deployable with today's equipment, is still useful.
>
> I'm also in favor of having better (i.e., more fine grained)
> explicit congestion signals as end-systems may use them to
> come up with better decisions. I expressed this clearly in the
> ECT(1) input vs. output signal discussion. Now, if we come up
> with a DSCP as additional safeguard qualifier for L4S,
> which I support, then it makes sense to revisit the earlier
> WG decision and use ECT(1) as additional congestion signal.
>
>> It's also my position that we can't ignore existing RFC3168
>> bottlenecks, not just for safety but also for performance. The recent
>> ISP study we did suggested RFC3168 AQMs may be present on ~10% of
>> Internet paths there. Prior to that we heard 5% elsewhere. Whatever the
>> number is exactly, these AQMs do exist and mark in response to both
>> ECT(0) and ECT(1). If you introduce traffic that backs off much less in
>> response to CE, the AQMs may operate sub-optimally, since they weren't
>> designed with that kind of traffic in mind
>> (https://github.com/heistp/l4s-tests/#intra-flow-latency-spikes)
>
> In case home routers are using RFC 3168 and congestion is in the
> downstream direction, one would only be able to detect this by seeing
> ECE in the other direction. Maybe servers from large content providers
> could also provide data about this...
>
> Regards,
> Roland
>
>> On Fri, 2021-03-26 at 13:01 -0400, Steven Blake wrote:
>>> A lot (not all) of the recent arguments revolve around the assumption
>>> by some that RFC 3168 ECN deployment barely exists in the Internet, and
>>> the few networks where it does can be safely ignored, or cleaned out,
>>> or be expected to take proactive measures to protect themselves, which
>>> may in practice require them to lobby their router vendors to spin
>>> patch releases to enable (some of) the mitigation measures detailed in
>>> -l4ops-02 Sec. 5.
>>>
>>> If that is the WG consensus, then I *strongly urge* the WG to do the
>>> following:
>>>
>>> 1. Push to move RFC 3168 ECN to Historic
>>>
>>> 2. Adopt the following "New ECN" signals for future ECN
>>> experimentation:
>>>
>>> - Not-ECT
>>> - ECT
>>> - CE-a
>>> - CE-b
>>>
>>> This second step would allow for two sets of experiments. The semantics
>>> of CE-a and CE-b for the first set of experiments would be as follows:
>>>
>>> - CE-a: "Decelerate"
>>> - CE-b: "Decelerate harder" (multiplicative decrease)
>>>
>>> The exact behavior elicited by the "Decelerate" signal would be the
>>> subject of investigation. Since we are certain that any remaining RFC
>>> 3168 deployments can be safely ignored, then ECT/CE-a/CE-b can be used
>>> as unambiguous signals to steer packets into a low-latency queue, if
>>> desired.
>>>
>>> The semantics of CE-a and CE-b for the second set of experiments would
>>> be as follows:
>>>
>>> - CE-a: "Decelerate"
>>> - CE-b: "Accelerate"
>>>
>>> An aggressive fraction (100%?) of CE-b marked packets traversing a
>>> queue not in "Accelerate" state would be re-marked to either CE-a or
>>> ECT. Any packet discard (or detection of high delay variation?) must
>>> disable the transport's "Accelerate" mechanism for some interval and
>>> should cause the transport to revert to "TCP-friendly" behavior for
>>> some (different?) interval. The exact behaviors of "Accelerate" and
>>> "Decelerate" signals would be the subject of investigation. Again,
>>> since we are certain that any remaining RFC 3168 deployments can be
>>> safely ignored, then ECT/CE-a/CE-b can be used as unambiguous signals
>>> to steer packets into a low-latency queue.
>>>
>>> The differences between these two sets of experiments hinge on whether
>>> there is more utility in an "Accelerate" signal coupled with a
>>> "Decelerate" signal, or with two separate levels of "Decelerate"
>>> signals. Since it is WG consensus that the RFC 3168 ECN experiment
>>> failed after two decades, we probably only get one more chance to get
>>> this right, so careful and exhaustive experimentation which explores
>>> the design space is in order.
>>>
>>> Obviously, both sets of experiments cannot be run simultaneously on
>>> intersecting parts of the Internet. I leave the options for safely
>>> isolating these experiments as an exercise for the reader. Since we are
>>> certain that any remaining RFC 3168 ECN deployments can be safely
>>> ignored, I suggest choosing bit assignments for the four signals that
>>> induce maximum pain in the obstinate minority that might still deploy
>>> RFC 3168 ECN.
>>>
>>> Now, *if it is not WG consensus* that any existing RFC 3168 ECN
>>> deployments can be safely ignored, then I *strongly urge* the WG *to
>>> not adopt* experimental proposals that place burden and/or risk on
>>> networks that have deployed it.
>>>
>>>
>>> TL;DR: Either RFC 3168 ECN exists in the Internet, or it doesn't.
>>> Decide, and act appropriately.
>>>
>>>
>>> Regards,
>>>
>>> // Steve
>>>
>>>
>>>
>>
>>
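
For readers following the thread, here is a minimal sketch of the RFC 3168
marking behaviour the messages above refer to: an RFC 3168 AQM applies CE to
ECT(0) and ECT(1) packets alike, and drops Not-ECT packets instead. The
codepoint values are those defined in RFC 3168. The function names and the
boolean `congested` input are assumptions made for illustration; a real AQM
decides probabilistically when to signal, and this is not code from any
router implementation.

```python
from enum import IntEnum
from typing import Optional


class ECN(IntEnum):
    """ECN field codepoints from RFC 3168 (low two bits of the TOS /
    Traffic Class byte)."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def ecn_of(tos: int) -> ECN:
    """Extract the ECN codepoint from a TOS / Traffic Class byte."""
    return ECN(tos & 0b11)


def rfc3168_mark(tos: int, congested: bool) -> Optional[int]:
    """Hypothetical helper: return the outgoing TOS byte, or None for a drop.

    `congested` stands in for the AQM's own decision to signal congestion
    on this packet (an assumption; real AQMs compute this themselves).
    """
    if not congested:
        return tos  # no signal needed, forward unchanged
    if ecn_of(tos) is ECN.NOT_ECT:
        return None  # sender is not ECN-capable, so signal by dropping
    # ECT(0), ECT(1) and CE all leave as CE; the DSCP bits are untouched.
    return tos | int(ECN.CE)
```

Because both ECT codepoints map to the same CE mark, such a bottleneck cannot
tell apart senders that expect different backoff semantics, which is the
coexistence concern raised in the thread.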