Re: [tsvwg] Follow-up to your DSCP and ECN codepoint comments at tsvwg interim

Bob Briscoe <ietf@bobbriscoe.net> Sun, 08 March 2020 15:07 UTC

To: Jonathan Morton <chromatix99@gmail.com>
Cc: Sebastian Moeller <moeller0@gmx.de>, tsvwg IETF list <tsvwg@ietf.org>, Steven Blake <slblake@petri-meat.com>
References: <7409b3a3-ba14-eb6d-154b-97c9d2da707b@bobbriscoe.net> <451679EC-B6F7-4176-B497-8189D238AF03@gmx.de> <0FF50545-9EDC-4C4D-B918-EE3D8A21005E@gmail.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <b3503b03-d70b-3845-8820-9f483c02a9a4@bobbriscoe.net>
Date: Sun, 08 Mar 2020 15:07:33 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1
MIME-Version: 1.0
In-Reply-To: <0FF50545-9EDC-4C4D-B918-EE3D8A21005E@gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/nP19g9sxgDIOyAnsjrQPpFZd3w0>
Subject: Re: [tsvwg] Follow-up to your DSCP and ECN codepoint comments at tsvwg interim
Precedence: list

Jonathan,

On 08/03/2020 01:35, Jonathan Morton wrote:
>> On 7 Mar, 2020, at 9:16 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>>
>> I fully understand the challenges of using DSCPs end to end, and yet DSCPs still seem to be the IETF's endorsed intra-domain and end-to-end marking mechanism (see RFC 2474). Any other marking use-case (like the LE PHB) could lay claim to ECT(1) by the same rationale as L4S, as it would currently be more reliable. Sure, L4S could claim that it also functionally uses ECN and hence has a slightly stronger claim, but since the L4S arch draft explicitly mentions allowing non-responsive bounded-rate (and hence essentially non ECN-aware) flows in the LL queue, it seems clear that ECT(1), in addition to all its other issues, is not really fulfilling L4S's marking needs exhaustively.
>>
>> That said, the only real uses L4S is likely going to see any time soon, IMHO, are those that have an immediate monetary advantage for the ISP rolling out and operating the L4S infrastructure, and hence SLAs between content suppliers and ISP that make sure to conserve the L4S DSCPs seem like a natural solution to the issue.
>> 	In the spirit of walking before running, it would IMHO behoove the L4S effort not to try to solve all potential problems a priori, but rather to demonstrate experimentally that even a restricted-scope L4S deployment has actual merits in real life.
> I broadly agree with this argument.  And, as noted further down, the relatively small improvements in latency will only be significant on paths where the baseline RTT is already quite small, implying both a short geographic distance and a small number of routing hops between the endpoints.  These are exactly the conditions most likely to be able to deploy a Diffserv solution through the normal SLA process.
The benefits of removing, say, 20ms of tail latency easily apply in a 
country up to the size of the US (45ms RTT coast to coast in glass).
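
As a rough check on the 45ms figure, here is a minimal sketch of the 
arithmetic. The path length (about 4,500 km) and the refractive index of 
fibre (about 1.5) are assumed round numbers for illustration, not 
measurements, and real routes are longer than the straight-line distance.

```python
# Back-of-envelope RTT over fibre. All constants below are assumptions
# chosen for illustration, not measured values.
C_VACUUM_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBRE_INDEX = 1.5             # assumed refractive index of glass fibre
PATH_KM = 4_500               # assumed coast-to-coast fibre path length


def rtt_ms(path_km: float, index: float = FIBRE_INDEX) -> float:
    """Round-trip propagation delay in milliseconds over a fibre path."""
    speed_km_s = C_VACUUM_KM_S / index      # light travels slower in glass
    return 2 * path_km / speed_km_s * 1000  # there and back, in ms


def tail_share(tail_ms: float, base_rtt_ms: float) -> float:
    """Queuing tail latency as a fraction of the base propagation RTT."""
    return tail_ms / base_rtt_ms


if __name__ == "__main__":
    base = rtt_ms(PATH_KM)
    print(f"base RTT: {base:.1f} ms")
    print(f"20 ms tail relative to base: {tail_share(20, base):.0%}")
```

Under these assumptions the base RTT comes out at roughly 45ms, so 20ms 
of queuing tail is a large fraction of the total even at that distance.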

I assume that you are aware that there is more than 1 ISP in the US.
And more than 1 ISP in Europe for that matter.

Details are important. And being on the right planet is even more important.

>
> The challenge with L4S is that it relies on the classifier not only for low-latency service (a performance improvement, whose loss would typically be tolerable) but to configure the type of congestion signalling (a safety requirement, whose loss would be dangerous).  This is, I think, why they sought a classifier which would reliably traverse AS boundaries.  This design requirement ultimately stems from their early design decision to perpetuate the DCTCP signalling method, which changes the semantics of CE.
>
> Unfortunately this is not the only requirement exposed by that decision.  Not only must the classifier signal *survive* throughout the path, but every potential bottleneck node must *understand* the classifier and apply the different congestion signalling.  This puts L4S back in the position of having to limit deployment to controlled environments, only now without the Diffserv "containment" mechanism to help indicate when it strays outside of such an environment.  This in turn leads to the present attempt to work around the problem with "classic detection" algorithms, which I believe will show significant detection latency and will therefore be unable to switch modes as seamlessly as SCE has already demonstrated.
>
> With SCE we have approached the problem from a different direction from the start: Safety First.  The spare ECN codepoint is used as an additional output from the network, and CE retains its existing meaning and usage.  This means SCE can be deployed across any existing network with congestion safety, competing normally with conventional traffic - because on a conventional network an SCE flow is almost indistinguishable from a conventional flow.  The performance benefits begin to appear, seamlessly, when the bottleneck on the path is SCE enabled.
>
> And this means that if a Diffserv codepoint is used to request low-latency service for an SCE flow, it can be given by simply adjusting AQM parameters for that traffic, *without* impairing safety should this classifier be lost en route, and with completely equivalent performance.  I think this is a very important point that the L4S team has overlooked, in their continual attempts to discredit the SCE project.
Yes, all this is known.
SCE is not the only way of doing safety though.
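
For reference, the codepoints under discussion are the two ECN bits 
defined in RFC 3168, which sit below the six DSCP bits in the same 
header byte. The sketch below only decodes that byte; the function names 
are hypothetical, and the comments on how each proposal uses ECT(1) 
paraphrase the positions described in this thread rather than either 
specification.

```python
from enum import IntEnum


class ECN(IntEnum):
    """ECN field values from RFC 3168 (low two bits of the TOS byte)."""
    NOT_ECT = 0b00
    ECT_1 = 0b01   # L4S: set by the sender as a classifier (input to the network)
                   # SCE: set by the network as an extra signal (output from it)
    ECT_0 = 0b10
    CE = 0b11      # Congestion Experienced


def ecn_of(tos_byte: int) -> ECN:
    """Return the ECN codepoint carried in an IPv4 TOS / IPv6 Traffic Class byte."""
    return ECN(tos_byte & 0b11)


def dscp_of(tos_byte: int) -> int:
    """Return the 6-bit DSCP carried in the same byte."""
    return (tos_byte >> 2) & 0x3F


if __name__ == "__main__":
    tos = 0xB9  # example byte: DSCP 46 with ECT(1)
    print(dscp_of(tos), ecn_of(tos).name)
```

Because the two fields share a byte but are rewritten independently, a 
DSCP can be bleached at a domain boundary while the ECN bits pass 
through, which is the reliability difference being argued over above.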

>
> We are working on a demonstration of this capability, which we may be able to show at Vancouver (remotely), perhaps via the Hackathon.  Of course the fully-integrated code will also be publicly available at the time of demonstration, so you can replicate and play with the results.
As I said, we will publish results of another way of doing safety (that 
we also identified right from the start): Classic ECN AQM detection and 
fall-back. Probably early this week, but certainly also in Vancouver.

SCE shows that, if you hobble an experiment too much, you make it 
unworkable over real networks with real tunnels, real transport 
protocols and real applications.

For instance, is there any interest from the major end-system stacks in 
implementing SCE rather than L4S, given how often SCE is likely not to work?


Bob

>
>   - Jonathan Morton
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
