[tsvwg] Gorry Fairhurst Individual thoughts on choosing whether/how to advance ECN work.

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Fri, 15 May 2020 14:54 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ABx38lCExmPA1VAQKgmE8Qhsr9s>

Some people have asked me about various aspects of this debate, and 
towards the end of this process, I’d like to share (as an individual) 
some of my thoughts. Whatever the outcome, I think we have developed a 
lot of material to help us move forward as a group.

We do have a shortage of bits (not uncommon in the IETF), and there are 
many parts to this story:

1. A signal to the network to steer traffic towards or away from the new
marking: Flow queues avoid needing the endpoint reaction to be known by
the AQM, and allow each flow to have a separate treatment, but have to
consider the awkward question of aggregates of flows, and applications
that choose to shard over multiple flows.

2. Signal to the network to support network operations, especially for
wide-scale experiment: It would be good for an operator to know a flow
is expected to be treated differently in their network segment or those
network segments that follow, and whether they wish, for operational
reasons, to avoid that different treatment within their network (e.g.,
following a discovered configuration issue). L4S makes the two transport 
behaviours explicit. I expect some thought would be needed about how 
ECT(1). I think SCE makes it hard to explicitly exclude a flow from 
being a part of an experiment, so we need to be sure that any remarking 
is always benign or helpful. My own view is that using other classifiers 
such as flow IDs or DSCPs brings a lot of complexity to operations, 
because these are used for other purposes in various use-cases, so I am 
not yet clear that helps. L4S introduced an explicit signal (albeit 
on-trust) that the transport will gracefully react to CE marks in an 
appropriate way. I think L4S makes the two transport behaviours explicit
to observers on the path, and I liked that.

3. I don’t know the full set of uses for single queue AQMs that do ECN
marking: I would anticipate issues with any AQM that does not conform
to current transport expectations - just as I anticipate issues with 
routers that configure old-style RED-based ECN marking using large 
thresholds and measuring over deep queues. Guidance will be needed.

4. ECN transport receiver reaction: It would help to have a way to 
reliably tell the receiving endpoint(s) whether the marks they see 
conform to their transport expectations - i.e. whether a packet was 
marked in the new way. SCE nicely makes the marking behaviour explicit;
I like SCE from this perspective.

5. We need to engineer transport methods that work: This is tsvwg’s main 
focus - I would expect to be able to update behaviour as we move forward 
with any work. As an individual commentator, I expect significant work
will finally be needed to arrive there (DCTCP was not designed for the
Internet), but I suggest we need to decide now on the path to take.

6. We must be able to deploy, manage, and measure the methods for paths
involving tunnels and encapsulations: This is hard. Encaps/tunnels are
widespread, and tsvwg has devoted significant energy with INTAREA to
update IETF specs in this area. This was one of the biggest activities
that faced the proponents of deploying new ECN work, holding up the WG
while this was done. Designing to work well with tunnels is crucial.

7. We still need to define BCPs that support operation: The use of 
scheduling/queueing fits in this area, but methods depend on the number 
of flows and the way you aggregate - what is applicable in a home CPE or 
WiFi base station is unlikely to be applicable in a service aggregation 
router or an enterprise WiFi deployment. That's why RFC 7567 currently
calls out scheduling, policing and protection as useful, but optional
components. As an individual commentator, I can see some scenarios which
benefit from policers, scheduling and/or protection mechanisms, and
these may well become the accepted best practice for using ECN. I
hope transports will evolve to mitigate issues, but I would expect
details to remain scenario dependent.
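
To illustrate the sharding concern in point 1, here is a rough sketch of
how per-flow queueing divides capacity. Everything in it is an assumption
made for illustration: the function names, the hash-based classifier, and
the simplification that every occupied queue is backlogged and served with
equal weight. It is not taken from any AQM specification.

```python
import hashlib
from collections import defaultdict
from fractions import Fraction


def hash_classifier(flow, num_queues=1024):
    """Map a flow 5-tuple to a queue index (hypothetical classifier)."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_queues


def capacity_shares(flows_by_app, classify=hash_classifier):
    """Return each application's share of link capacity.

    Assumes every occupied queue gets an equal share, and that flows
    colliding in one queue split that queue's share evenly.
    """
    per_queue = defaultdict(list)
    for app, flows in flows_by_app.items():
        for flow in flows:
            per_queue[classify(flow)].append(app)
    if not per_queue:
        return {}
    queue_share = Fraction(1, len(per_queue))
    shares = defaultdict(Fraction)
    for apps in per_queue.values():
        for app in apps:
            shares[app] += queue_share / len(apps)
    return dict(shares)


# Application A uses one flow; application B shards over four flows.
flows = {
    "A": [("10.0.0.1", "10.0.0.9", 6, 5000, 443)],
    "B": [("10.0.0.2", "10.0.0.9", 6, port, 443)
          for port in (5001, 5002, 5003, 5004)],
}
for app, share in capacity_shares(flows).items():
    print(app, float(share))
```

Under these assumptions, an application that shards over several flows
collects several per-flow shares, which is why aggregates need separate
thought.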

Alas, I don’t see how we can do all of this using just the current ECN
field, and that’s why I say we have a shortage of bits.
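
For readers less familiar with the field, the sketch below shows how
little room there is: the ECN field is the two low-order bits of the IPv4
TOS / IPv6 Traffic Class byte, giving four codepoints as defined in
RFC 3168. The helper names are hypothetical and exist only for this
illustration.

```python
from enum import IntEnum
from typing import Tuple


class ECN(IntEnum):
    """The four ECN codepoints defined in RFC 3168."""
    NOT_ECT = 0b00  # transport is not ECN-capable
    ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
    ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
    CE = 0b11       # congestion experienced


def split_tos(tos: int) -> Tuple[int, ECN]:
    """Split a TOS / Traffic Class byte into (DSCP, ECN codepoint)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS / Traffic Class must fit in one byte")
    return tos >> 2, ECN(tos & 0b11)


def set_ecn(tos: int, ecn: ECN) -> int:
    """Return the byte with its ECN field replaced, DSCP unchanged."""
    return (tos & 0xFC) | int(ecn)


dscp, ecn = split_tos(0xB9)
print(dscp, ecn.name)              # 46 ECT_1
print(hex(set_ecn(0xB8, ECN.CE)))  # 0xbb
```

With only four values, and three already in use by existing ECN
transports, any new signal has to reuse or redefine a codepoint such as
ECT(1).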

My own personal viewpoint is that we need to decide on the benefits of
enabling ECN in the long term.

Best wishes,

Gorry Fairhurst
(not as a chair)