Re: [tsvwg] Deprecating RFC 3168 for future ECN experimentation

Pete Heist <pete@heistp.net> Tue, 30 March 2021 19:43 UTC

From: Pete Heist <pete@heistp.net>
To: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
Date: Tue, 30 Mar 2021 21:43:07 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/U9HKze9_WdqErcOVsoFrV9ukVfc>

More comments below...

On Tue, 2021-03-30 at 12:15 +0100, Gorry Fairhurst wrote:
> Please see below:
> 
> On 30/03/2021 11:56, Pete Heist wrote:
> > ...
> > 
> > On Mon, 2021-03-29 at 10:26 +0200, Bless, Roland (TM) wrote:
> > > Hi,
> > > 
> > > On 27.03.21 at 15:41 Pete Heist wrote:
> > > > > I agree overall. If we want to introduce a proposal that's
> > > > > incompatible with RFC3168, we should first make it historic.
> > > > I just want to add that RFC 3168 is a proposed standard, whereas we
> > > > want to introduce L4S as an _experiment_. All the recent work of the
> > > > AQM WG would be also void and I thought that vendors are actually
> > > > implementing this in home routers. I think that we must be really
> > > > sure about RFC3168 (non-)deployment before deprecating RFC3168. Even
> > > > RFC 8311 still stated: "Forwarding behavior as described in RFC 3168
> > > > remains the preferred approach for routers that are not involved in
> > > > ECN experiments,".
> > Just to clarify, although I think it *would* be the right approach to
> > deprecate RFC3168 in order to redefine its semantics, I don't
> > personally support doing that. Its benefits to typical Internet
> > traffic may not be dramatic, but on the other hand it is fully
> > compatible with non-ECN traffic, and avoids some packet loss and
> > retransmissions, while serving as a reliable base to build upon.
> > 
> > I also think recent discussion suggests that there is enough support
> > for the integrity of RFC3168 that traffic from new proposals, if not
> > guarded by a DSCP, must be fully compatible with existing deployed
> > 3168 middleboxes, effectively including those with single queue AQMs.
> > > > > Before we do that though, we should make sure that the current CE
> > > > > is not actually useful. Figure 5 in this paper suggests some
> > > > > benefit to two bits of signal as opposed to one:
> > > > > http://buffer-workshop.stanford.edu/papers/paper34.pdf
> > > > > 
> > > > > A second signal provides a harder backoff without packet loss, for
> > > > > example during capacity changes or flow introductions. It wouldn't
> > > > > be ideal to deprecate RFC3168, only to find out that another bit of
> > > > > signal in line with CE, along with ABE in RFC8511, or something
> > > > > similarly deployable with today's equipment, is still useful.
> > > I'm also in favor of having better (i.e., more fine grained)
> > > explicit congestion signals as end-systems may use them to
> > > come up with better decisions. I expressed this clearly in the
> > > > ECT(1) input vs. output signal discussion.
> > > Now, if we come up
> > > with a DSCP as additional safeguard qualifier for L4S,
> > > which I support, then it makes sense to revisit the earlier
> > > WG decision and use ECT(1) as additional congestion signal.
> > > > > It's also my position that we can't ignore existing RFC3168
> > > > > bottlenecks, not just for safety but also for performance. The
> > > > > recent ISP study we did suggested RFC3168 AQMs may be present on
> > > > > ~10% of Internet paths there. Prior to that we heard 5% elsewhere.
> > > > > Whatever the number is exactly, these AQMs do exist and mark in
> > > > > response to both ECT(0) and ECT(1). If you introduce traffic that
> > > > > backs off much less in response to CE, the AQMs may operate
> > > > > sub-optimally, since they weren't designed with that kind of
> > > > > traffic in mind
> > > > > (https://github.com/heistp/l4s-tests/#intra-flow-latency-spikes).
> > > > In case home routers are using RFC 3168 and congestion is in
> > > > downstream direction one would only be able to detect this by seeing
> > > > ECE in the other direction. Maybe servers from large content
> > > > providers could also provide data about this...
> > That's also what we saw in the ISP study at the gateway, that ECE is a
> > more reliable indicator of AQM deployment than CE:
> > 
> >   
> > https://www.ietf.org/archive/id/draft-heist-tsvwg-ecn-deployment-observations-02.html#name-tcp-initiated-from-lan-to-w
> > 
> > Pete
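
The point about ECE above can be sketched in a few lines of Python. This
is only a minimal illustration, not the method used in the ISP study: the
flag values are the standard TCP header bits, while the function names
`is_congestion_echo` and `count_ece` are hypothetical. The one subtlety
it shows is that ECE on a SYN or SYN-ACK is part of ECN negotiation and
should not be counted as evidence of a marking AQM.

```python
from typing import Iterable

# TCP flag bits as they appear in the flags byte of the TCP header.
SYN = 0x02
ECE = 0x40  # ECN-Echo
CWR = 0x80  # Congestion Window Reduced


def is_congestion_echo(tcp_flags: int) -> bool:
    """Hypothetical helper: True if this segment echoes a CE mark.

    ECE on SYN / SYN-ACK segments only negotiates ECN support, so those
    segments are excluded.
    """
    return bool(tcp_flags & ECE) and not (tcp_flags & SYN)


def count_ece(flag_bytes: Iterable[int]) -> int:
    """Hypothetical helper: count segments that echo congestion."""
    return sum(1 for flags in flag_bytes if is_congestion_echo(flags))
```

Seen from the LAN side of a gateway, a non-zero count on upstream
segments would hint at CE marking somewhere on the downstream path, even
when no CE-marked packet crosses the observation point itself.
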
> 
> Thanks, I think I would agree on your summary of the "non dramatic"
> benefits for existing traffic using the currently standardised RFC-3168
> marking, and hopefully what you say is not far from the summary we
> arrived at in RFC 8087.

RFC8087 lists more benefits for RFC3168 than I did. What I also
appreciate about 3168 is that its compatibility with non-ECN traffic
makes it readily deployable, which is good engineering.
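
As a rough illustration of that compatibility, here is a minimal Python
sketch of the mark-or-drop decision an RFC 3168 AQM makes once it has
decided to signal congestion on a packet. The codepoint values are the
ones defined in RFC 3168; the function name `signal_congestion` and the
tuple it returns are assumptions made for this example, not any real
AQM's API.

```python
from typing import Tuple

# ECN field codepoints (low two bits of the IP TOS / Traffic Class byte),
# as defined in RFC 3168.
NOT_ECT = 0b00  # sender is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, ECT(1)
ECT_0 = 0b10    # ECN-capable transport, ECT(0)
CE = 0b11       # congestion experienced

ECN_MASK = 0b11


def signal_congestion(tos: int) -> Tuple[str, int]:
    """Hypothetical helper: handle one packet selected for a congestion
    signal. Returns (action, new_tos)."""
    ecn = tos & ECN_MASK
    if ecn == NOT_ECT:
        # Non-ECN traffic still gets the classic signal: a drop.
        return ("drop", tos)
    # ECT(0), ECT(1) and already-CE packets are forwarded with CE set.
    # A 3168 AQM does not distinguish ECT(0) from ECT(1).
    return ("mark", tos | CE)
```

Both kinds of traffic share the same queue and the same decision point,
which is why no coordination with non-ECN senders is needed.
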

> I see arguments in recent discussions for retaining backwards 
> compatibility with this and also arguments stressing other aspects.

That does seem to describe the situation. It also suggests that when
there are arguments "for and against" backwards compatibility, rough
consensus on any consequential change to RFC3168 is likely to be
difficult. If so, it will remain a proposed standard, deployments will
continue, and full 3168 compatibility remains important.

Pete

> 
> Gorry
> 
> > > Regards,
> > >    Roland
> > > 
> > > > On Fri, 2021-03-26 at 13:01 -0400, Steven Blake wrote:
> > > > > > A lot (not all) of the recent arguments revolve around the
> > > > > > assumption by some that RFC 3168 ECN deployment barely exists in
> > > > > > the Internet, and the few networks where it does can be safely
> > > > > > ignored, or cleaned out, or be expected to take proactive
> > > > > > measures to protect themselves, which may in practice require
> > > > > > them to lobby their router vendors to spin patch releases to
> > > > > > enable (some of) the mitigation measures detailed in -l4ops-02
> > > > > > Sec. 5.
> > > > > 
> > > > > > If that is the WG consensus, then I *strongly urge* the WG to do
> > > > > > the following:
> > > > > 
> > > > > 1. Push to move RFC 3168 ECN to Historic
> > > > > 
> > > > > 2. Adopt the following "New ECN" signals for future ECN
> > > > > experimentation:
> > > > > 
> > > > > - Not-ECT
> > > > > - ECT
> > > > > - CE-a
> > > > > - CE-b
> > > > > 
> > > > > > This second step would allow for two sets of experiments. The
> > > > > > semantics of CE-a and CE-b for the first set of experiments
> > > > > > would be as follows:
> > > > > 
> > > > > - CE-a: "Decelerate"
> > > > > - CE-b: "Decelerate harder" (multiplicative decrease)
> > > > > 
> > > > > > The exact behavior elicited by the "Decelerate" signal would be
> > > > > > the subject of investigation. Since we are certain that any
> > > > > > remaining RFC 3168 deployments can be safely ignored, then
> > > > > > ECT/CE-a/CE-b can be used as unambiguous signals to steer
> > > > > > packets into a low-latency queue, if desired.
> > > > > 
> > > > > > The semantics of CE-a and CE-b for the second set of experiments
> > > > > > would be as follows:
> > > > > 
> > > > > - CE-a: "Decelerate"
> > > > > - CE-b: "Accelerate"
> > > > > 
> > > > > > An aggressive fraction (100%?) of CE-b marked packets traversing
> > > > > > a queue not in "Accelerate" state would be re-marked to either
> > > > > > CE-a or ECT. Any packet discard (or detection of high delay
> > > > > > variation?) must disable the transport's "Accelerate" mechanism
> > > > > > for some interval and should cause the transport to revert to
> > > > > > "TCP-friendly" behavior for some (different?) interval. The
> > > > > > exact behaviors of "Accelerate" and "Decelerate" signals would
> > > > > > be the subject of investigation. Again, since we are certain
> > > > > > that any remaining RFC 3168 deployments can be safely ignored,
> > > > > > then ECT/CE-a/CE-b can be used as unambiguous signals to steer
> > > > > > packets into a low-latency queue.
> > > > > 
> > > > > > The differences between these two sets of experiments hinge on
> > > > > > whether there is more utility in an "Accelerate" signal coupled
> > > > > > with a "Decelerate" signal, or with two separate levels of
> > > > > > "Decelerate" signals. Since it is WG consensus that the RFC 3168
> > > > > > ECN experiment failed after two decades, we probably only get
> > > > > > one more chance to get this right, so careful and exhaustive
> > > > > > experimentation which explores the design space is in order.
> > > > > 
> > > > > > Obviously, both sets of experiments cannot be run simultaneously
> > > > > > on intersecting parts of the Internet. I leave the options for
> > > > > > safely isolating these experiments as an exercise for the
> > > > > > reader. Since we are certain that any remaining RFC 3168 ECN
> > > > > > deployments can be safely ignored, I suggest choosing bit
> > > > > > assignments for the four signals that induce maximum pain in the
> > > > > > obstinate minority that might still deploy RFC 3168 ECN.
> > > > > 
> > > > > > Now, *if it is not WG consensus* that any existing RFC 3168 ECN
> > > > > > deployments can be safely ignored, then I *strongly urge* the WG
> > > > > > *to not adopt* experimental proposals that place burden and/or
> > > > > > risk on networks that have deployed it.
> > > > > 
> > > > > 
> > > > > > TL;DR: Either RFC 3168 ECN exists in the Internet, or it
> > > > > > doesn't. Decide, and act appropriately.
> > > > > 
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > // Steve
> > > > > 
> > > > > 
> > > > > 
> > > > 
>