Re: [tsvwg] Deprecating RFC 3168 for future ECN experimentation

Pete Heist <pete@heistp.net> Tue, 30 March 2021 10:57 UTC

From: Pete Heist <pete@heistp.net>
To: "Bless, Roland (TM)" <roland.bless@kit.edu>, Steven Blake <slblake@petri-meat.com>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
Date: Tue, 30 Mar 2021 12:56:59 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/M0hxkVq5Z-BGuiKZ_wDgWtRdBu8>

...

On Mon, 2021-03-29 at 10:26 +0200, Bless, Roland (TM) wrote:
> Hi,
> 
> On 27.03.21 at 15:41 Pete Heist wrote:
> > I agree overall. If we want to introduce a proposal that's
> > incompatible with RFC3168, we should first make it historic.
> 
> I just want to add that RFC 3168 is a proposed standard, whereas we
> want to introduce L4S as an _experiment_. All the recent work of the
> AQM WG would also be void, and I thought that vendors are actually
> implementing this in home routers. I think that we must be really sure
> about RFC3168 (non-)deployment before deprecating RFC3168. Even RFC
> 8311 still stated: "Forwarding behavior as described in RFC 3168
> remains the preferred approach for routers that are not involved in
> ECN experiments,".

Just to clarify: although I think deprecating RFC3168 *would* be the
right approach if we wanted to redefine its semantics, I don't
personally support doing that. RFC3168's benefits to typical Internet
traffic may not be dramatic, but it is fully compatible with non-ECN
traffic and avoids some packet loss and retransmissions, while serving
as a reliable base to build upon.

I also think recent discussion suggests that there is enough support
for the integrity of RFC3168 that traffic from new proposals, if not
guarded by a DSCP, must be fully compatible with existing deployed
RFC3168 middleboxes, effectively including those with single-queue
AQMs.
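
To make "guarded by a DSCP" concrete, here is a minimal sketch in
Python of how a classifier might behave. It is only an illustration
under stated assumptions: the ECN codepoints are the RFC 3168 ones, but
GUARD_DSCP is a hypothetical value picked for the example, and the
queue names and the select_queue() helper are invented here, not taken
from any draft.

```python
# ECN field codepoints from RFC 3168: the low two bits of the
# IPv4 TOS / IPv6 Traffic Class byte.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

# Hypothetical guard DSCP, chosen only for this sketch.
GUARD_DSCP = 0b101010


def split_tos(tos):
    """Return (dscp, ecn) from the 8-bit TOS / Traffic Class byte."""
    return tos >> 2, tos & 0b11


def select_queue(tos, guard_dscp=GUARD_DSCP):
    """Pick a queue for a packet (illustrative policy, an assumption).

    Only packets that carry the guard DSCP *and* ECT(1) or CE are
    steered to the low-latency queue. Everything else, including
    ECT(1) without the guard DSCP, gets ordinary RFC 3168 treatment.
    """
    dscp, ecn = split_tos(tos)
    if dscp == guard_dscp and ecn in (ECT1, CE):
        return "low-latency"
    return "classic"
```

With this policy, a packet marked ECT(1) but lacking the guard DSCP
lands in the "classic" queue, which is the compatibility property
described above.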

> > Before we do that though, we should make sure that the current CE
> > is not actually useful. Figure 5 in this paper suggests some
> > benefit to two bits of signal as opposed to one:
> > http://buffer-workshop.stanford.edu/papers/paper34.pdf
> > 
> > A second signal provides a harder backoff without packet loss, for
> > example during capacity changes or flow introductions. It wouldn't
> > be ideal to deprecate RFC3168, only to find out that another bit of
> > signal in line with CE, along with ABE in RFC8511, or something
> > similarly deployable with today's equipment, is still useful.
> 
> I'm also in favor of having better (i.e., more fine-grained)
> explicit congestion signals as end-systems may use them to
> come up with better decisions. I expressed this clearly in the
> ECT(1) input vs. output signal discussion.

> Now, if we come up
> with a DSCP as additional safeguard qualifier for L4S,
> which I support, then it makes sense to revisit the earlier
> WG decision and use ECT(1) as additional congestion signal.

> > It's also my position that we can't ignore existing RFC3168
> > bottlenecks, not just for safety but also for performance. The
> > recent ISP study we did suggested RFC3168 AQMs may be present on
> > ~10% of Internet paths there. Prior to that we heard 5% elsewhere.
> > Whatever the number is exactly, these AQMs do exist and mark in
> > response to both ECT(0) and ECT(1). If you introduce traffic that
> > backs off much less in response to CE, the AQMs may operate
> > sub-optimally, since they weren't designed with that kind of
> > traffic in mind
> > (https://github.com/heistp/l4s-tests/#intra-flow-latency-spikes).
> 
> In case home routers are using RFC 3168 and congestion is in the
> downstream direction, one would only be able to detect this by seeing
> ECE in the other direction. Maybe servers from large content providers
> could also provide data about this...

That's also what we saw at the gateway in the ISP study: ECE is a more
reliable indicator of AQM deployment than CE:

https://www.ietf.org/archive/id/draft-heist-tsvwg-ecn-deployment-observations-02.html#name-tcp-initiated-from-lan-to-w
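
For illustration only, here is a minimal Python sketch of counting both
signals at an observation point. It assumes each packet has already
been reduced to a (TOS byte, TCP flags byte) pair; the function names
are made up for this example and this is not the tooling used in the
study. The flag bit values are the standard TCP ones, and ECE on a SYN
is skipped because there it negotiates ECN rather than echoing
congestion.

```python
# TCP flag bits in the flags byte of the TCP header.
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80

# RFC 3168 CE codepoint in the low two bits of the TOS byte.
CE = 0b11


def is_congestion_echo(tcp_flags):
    """True if ECE is set on a non-SYN segment.

    During the handshake, ECE (with CWR on the SYN) negotiates ECN,
    so it is not counted as a congestion echo here.
    """
    return bool(tcp_flags & ECE) and not (tcp_flags & SYN)


def count_signals(packets):
    """Count CE marks and ECE echoes.

    packets: iterable of (tos_byte, tcp_flags) pairs, an input format
    assumed for this sketch.
    """
    counts = {"ce": 0, "ece": 0}
    for tos, flags in packets:
        if tos & 0b11 == CE:
            counts["ce"] += 1
        if is_congestion_echo(flags):
            counts["ece"] += 1
    return counts
```

A monitor sitting upstream of the marking queue would never see the CE
mark itself in that direction, but would still see the ECE echoed back
by the receiver, which is why the "ece" count can be non-zero while
"ce" stays at zero.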

Pete

> Regards,
>   Roland
> 
> > On Fri, 2021-03-26 at 13:01 -0400, Steven Blake wrote:
> > > A lot (not all) of the recent arguments revolve around the
> > > assumption by some that RFC 3168 ECN deployment barely exists in
> > > the Internet, and the few networks where it does can be safely
> > > ignored, or cleaned out, or be expected to take proactive
> > > measures to protect themselves, which may in practice require
> > > them to lobby their router vendors to spin patch releases to
> > > enable (some of) the mitigation measures detailed in -l4ops-02
> > > Sec. 5.
> > > 
> > > If that is the WG consensus, then I *strongly urge* the WG to do
> > > the following:
> > > 
> > > 1. Push to move RFC 3168 ECN to Historic
> > > 
> > > 2. Adopt the following "New ECN" signals for future ECN
> > > experimentation:
> > > 
> > > - Not-ECT
> > > - ECT
> > > - CE-a
> > > - CE-b
> > > 
> > > This second step would allow for two sets of experiments. The
> > > semantics of CE-a and CE-b for the first set of experiments would
> > > be as follows:
> > > 
> > > - CE-a: "Decelerate"
> > > - CE-b: "Decelerate harder" (multiplicative decrease)
> > > 
> > > The exact behavior elicited by the "Decelerate" signal would be
> > > the subject of investigation. Since we are certain that any
> > > remaining RFC 3168 deployments can be safely ignored, then
> > > ECT/CE-a/CE-b can be used as unambiguous signals to steer packets
> > > into a low-latency queue, if desired.
> > > 
> > > The semantics of CE-a and CE-b for the second set of experiments
> > > would be as follows:
> > > 
> > > - CE-a: "Decelerate"
> > > - CE-b: "Accelerate"
> > > 
> > > An aggressive fraction (100%?) of CE-b marked packets traversing
> > > a queue not in "Accelerate" state would be re-marked to either
> > > CE-a or ECT. Any packet discard (or detection of high delay
> > > variation?) must disable the transport's "Accelerate" mechanism
> > > for some interval and should cause the transport to revert to
> > > "TCP-friendly" behavior for some (different?) interval. The exact
> > > behaviors of "Accelerate" and "Decelerate" signals would be the
> > > subject of investigation. Again, since we are certain that any
> > > remaining RFC 3168 deployments can be safely ignored, then
> > > ECT/CE-a/CE-b can be used as unambiguous signals to steer packets
> > > into a low-latency queue.
> > > 
> > > The differences between these two sets of experiments hinge on
> > > whether there is more utility in an "Accelerate" signal coupled
> > > with a "Decelerate" signal, or with two separate levels of
> > > "Decelerate" signals. Since it is WG consensus that the RFC 3168
> > > ECN experiment failed after two decades, we probably only get one
> > > more chance to get this right, so careful and exhaustive
> > > experimentation which explores the design space is in order.
> > > 
> > > Obviously, both sets of experiments cannot be run simultaneously
> > > on intersecting parts of the Internet. I leave the options for
> > > safely isolating these experiments as an exercise for the reader.
> > > Since we are certain that any remaining RFC 3168 ECN deployments
> > > can be safely ignored, I suggest choosing bit assignments for the
> > > four signals that induce maximum pain in the obstinate minority
> > > that might still deploy RFC 3168 ECN.
> > > 
> > > Now, *if it is not WG consensus* that any existing RFC 3168 ECN
> > > deployments can be safely ignored, then I *strongly urge* the WG
> > > *to not adopt* experimental proposals that place burden and/or
> > > risk on networks that have deployed it.
> > > 
> > > 
> > > TL;DR: Either RFC 3168 ECN exists in the Internet, or it doesn't.
> > > Decide, and act appropriately.
> > > 
> > > 
> > > Regards,
> > > 
> > > // Steve