Re: [tsvwg] Follow-up to your DSCP and ECN codepoint comments at tsvwg interim

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Thu, 12 March 2020 12:19 UTC

To: Jonathan Morton <chromatix99@gmail.com>, Lars Eggert <lars@eggert.org>
Cc: tsvwg@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/QmQGCN8Ci1Xb4TQ4naumJNI7w3k>

This was to be an important topic of the face-to-face meeting in 
Vancouver, where we had been planning to allow time for different 
viewpoints to be aired and the consensus of the group to be sought. We 
still plan to do that, and are considering how best to proceed.

Comments in-line to show the expected direction of travel, but please do 
avoid growing this discussion thread for a week or so, until we have a 
plan to conclude it (see end of email).

On 12/03/2020 11:45, Jonathan Morton wrote:
>> On 12 Mar, 2020, at 10:18 am, Lars Eggert <lars@eggert.org> wrote:
>>
>> I'd much rather see some short answers on some basic questions, such as:
> I think the question posed by David Black's slides (as presented and briefly discussed at the interim meeting) is intended to elicit relevant answers to these sorts of questions.  But since you asked, I'll have a go at answering them standalone, for the SCE proposal.
>
>> Is the solution incrementally deployable?
> Yes.  Each component of SCE is inherently safe to deploy into an existing network, and will have no detrimental effect on conventional traffic.  SCE flows traversing an existing network will behave conventionally and in an RFC-compliant manner.  This is not merely aspirational, but baked fundamentally into the design.
This is exactly one of the topics on which we need to seek input.
>> Is it simple?
> Yes.  I could summarise the technicalities on two slides - or one, if I really had to.  Implementing SCE also does not require very intrusive or extensive changes to existing network stacks; a year ago, we got a basic proof of concept running in about a man-week.
Yes, we should capture those two slides for SCE! We will also ask the 
same for L4S.
>> Does it offer better performance, to participating and ideally non-participating flows?
> Yes - adding SCE to an existing ECN network usually improves throughput, peak latency and jitter for SCE flows, and improves total link goodput when at least one SCE flow is present.  Adding SCE to a dumb-FIFO network also brings significant benefits to conventional traffic, as any good AQM deployment would.
One slide on how SCE offers benefit will also be useful. Again we should 
obtain the same for L4S.
>> Is it at worst a no-op, or can it degrade performance for non-participating flows?
> Non-SCE flows see only an RFC-3168 compliant AQM on the path, and ignore the SCE signals.  SCE flows at worst have a tendency to defer to existing traffic - the opposite to what you fear - but with a well-designed bottleneck qdisc, this effect is limited or eliminated.  The SCE proposal presents several such qdiscs as examples.
>
... Or rewrite ECT(1), as I understand happens with some existing 
tunnels/encaps. I recall from the early L4S discussion that this 
consideration was also important, and hence we chartered work to try 
and improve ECN-marking in the IP layers.

>> Can a partially deployed solution be rolled back?
>> Can it stay deployed but become a no-op if a different mechanism wins out in the end?
> This is probably the most complex question to answer.  Un-deploying an experiment is always more difficult than deploying it in the first place, since it's harder to prove a negative.
>
> SCE consumes the spare ECN codepoint and one spare TCP header bit, without explicitly negotiating for them (beyond what is normal for RFC-3168 ECN), and adds a permitted ECN codepoint transition in middleboxes: ECT(0) -> ECT(1).  As far as conventional ECT flows are concerned, ECT(1) is semantically identical to ECT(0), and the spare header bits are ignored if received, so conventional traffic would not be affected.  The question is really about how to free up these consumed header resources, should that be necessary.
Yes. I would love to have such a brief explanation of how the experiment 
can be managed if the IETF wishes in the future to terminate it, as was 
done for ECN-NONCE.
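
As an aside, the codepoint usage described in the quoted paragraph can be 
written down compactly. The following is only a sketch: the two-bit values 
are those of RFC 3168, the extra ECT(0) -> ECT(1) transition is taken from 
the description above, and the names (ECN, transition_allowed, 
ecn_from_tos) are hypothetical, not from any implementation.

```python
from enum import IntEnum


class ECN(IntEnum):
    """Two-bit ECN field values as defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


# Changes a middlebox may make to the ECN field under RFC 3168.
RFC3168_TRANSITIONS = {(ECN.ECT0, ECN.CE), (ECN.ECT1, ECN.CE)}

# Extra transition that SCE adds, per the description quoted above.
SCE_EXTRA_TRANSITIONS = {(ECN.ECT0, ECN.ECT1)}


def ecn_from_tos(tos_byte):
    """Extract the ECN field from the low two bits of the TOS / Traffic Class byte."""
    return ECN(tos_byte & 0b11)


def transition_allowed(before, after, sce_enabled=False):
    """Return True if a middlebox may rewrite `before` to `after`."""
    if before == after:
        return True  # leaving the field untouched is always acceptable
    allowed = set(RFC3168_TRANSITIONS)
    if sce_enabled:
        allowed |= SCE_EXTRA_TRANSITIONS
    return (before, after) in allowed
```

With sce_enabled left False, the ECT(0) -> ECT(1) rewrite is rejected, which 
is the state an un-deployment would aim to return to.
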
> Reversing an SCE deployment thus means removing all sources of this new usage, which means both middleboxes and endpoints.  The reference implementation leaves SCE functionality switched off by default, but it is easy to automate switching it on and potentially then forget about it.
>
> The following probes can be used to seek out latent deployments, and ask their operators to disable them.  This does not require a new kernel to be compiled or installed, nor any downtime, only that the SCE functionality is configured off at runtime, e.g. through a sysctl or the iproute2 tc utility.
>
> SCE enabled middleboxes are easy to detect, by passing ECT(0) traffic through them and seeing if ECT(1) marks appear alongside (or before) CE marks.  If CE marks or an AQM-like (or tail-drop-like) pattern of drops appears but ECT(1) does not, it's not an SCE enabled middlebox.  This is probably sufficient to reclaim the spare ECN codepoint.
Probably worth writing down. I've found it hard to test for ECN-marking 
on operational paths, but insight into this is always going to be welcome.
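
To make the middlebox probe concrete, here is a minimal sketch of the 
decision rule as described in the quoted text. It assumes the prober sent 
only ECT(0) packets and recorded the ECN field of each packet that arrived. 
The function name and the result labels are hypothetical, and the sketch 
does not account for tunnels or encapsulations that rewrite ECT(1) for 
other reasons.

```python
# Two-bit ECN field values from RFC 3168.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def classify_middlebox(received_codepoints, drops_seen=False):
    """Classify a path that was probed with ECT(0)-only traffic.

    received_codepoints: ECN field values observed at the receiver.
    drops_seen: True if an AQM-like or tail-drop-like loss pattern was observed.
    """
    seen = set(received_codepoints)
    if ECT1 in seen:
        # ECT(1) appeared although only ECT(0) was sent.
        return "sce-like"
    if CE in seen or drops_seen:
        # Congestion was signalled, but never with ECT(1).
        return "not-sce"
    # No congestion signal at all, so the probe says nothing either way.
    return "inconclusive"
```

An "inconclusive" result would mean the probe did not load the bottleneck 
enough to provoke any signal, so it would need repeating at a higher rate.
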
> SCE enabled TCP receivers are also easy to detect, by probing them with ECT(1) traffic and seeing if the corresponding TCP header bit is used to echo them.  Some other transports may have ECT(1) feedback designed into them, which is not affected by SCE per se, and these can be ignored.  This is probably sufficient to reclaim the TCP header bit.
>
> SCE enabled senders are a little more difficult to detect unambiguously.  An indication can be obtained by probing them with the TCP header bit artificially set in all acks, and seeing if the resulting throughput is markedly less than if that is not done.  SCE senders behave essentially identically to normal ECN senders when the corresponding TCP header bit remains cleared in acks, so their continued presence is mostly benign, unless the header bit starts being used in some other experiment in a way that confuses the SCE sender.
>
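
The sender probe above comes down to comparing two throughput measurements. 
A rough sketch follows. The 0.5 ratio used to stand for "markedly less" is 
purely an assumption for illustration, as is the function name, and a real 
test would need repeated runs to allow for ordinary path variation.

```python
def looks_like_sce_sender(baseline_bps, marked_bps, ratio_threshold=0.5):
    """Compare throughput with and without the TCP header bit set in all acks.

    baseline_bps: throughput measured with the bit left clear in acks.
    marked_bps: throughput measured with the bit artificially set in all acks.
    ratio_threshold: assumed cut-off for "markedly less" (hypothetical value).
    """
    if baseline_bps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (marked_bps / baseline_bps) < ratio_threshold
```

This gives only an indication, in line with the caveat above that senders 
are harder to detect unambiguously than middleboxes or receivers.
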
> If considered necessary by the WG, it would be possible to associate SCE with an experimental-series DSCP, and require that DSCP be present for SCE functionality to be enabled.  This would however limit experiments to networks that do not bleach or otherwise disrupt that DSCP, which may inhibit exploration of behaviour on longer Internet paths (which we think SCE is much better suited for than L4S).  The upside would be easier reclamation of the header resources upon termination of the experiment.
>
> Our proposed use of DSCP is instead an optional addition, using SCE as part of providing a low-latency Diffserv PHB, but not requiring any particular DSCP to support SCE.
>
>   - Jonathan Morton

The TSVWG Chairs are now aware that we need to plan to make progress and 
will be in touch in the next week or so, with a plan to move forward in 
the absence of an IETF face-to-face meeting.

Best wishes,

Gorry,

TSVWG Chairs