Re: [tsvwg] plan for L4S issue #29

Sebastian Moeller <moeller0@gmx.de> Wed, 30 September 2020 09:09 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\))
From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <968abd8f-ba18-23de-b9ca-9eb1a0aaba09@erg.abdn.ac.uk>
Date: Wed, 30 Sep 2020 11:09:30 +0200
Cc: Greg White <g.white@CableLabs.com>, "Rodney W. Grimes" <ietf@gndrsh.dnsmgr.net>, Mikael Abrahamsson <swmike=40swm.pp.se@dmarc.ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <03CDC52C-2783-45A3-85E8-4B1135D45384@gmx.de>
References: <202009291549.08TFnvFV068509@gndrsh.dnsmgr.net> <c7080365-233c-5f1e-ef5c-1f42c969042a@erg.abdn.ac.uk> <73562E45-3EE7-43D4-B26B-76478AE19AF8@cablelabs.com> <C68807B2-EF30-4263-BD66-29106C62261D@gmx.de> <968abd8f-ba18-23de-b9ca-9eb1a0aaba09@erg.abdn.ac.uk>
To: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ZHbemOzzMQXvgbGaAD0c2iYplb0>
Subject: Re: [tsvwg] plan for L4S issue #29
Precedence: list

Hi Gorry,


> On Sep 30, 2020, at 10:21, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
> 
> See below.
> 
> On 30/09/2020 09:00, Sebastian Moeller wrote:
>> Dear Greg,
>> 
>> more below in-line, prefixed with [SM]...
>> 
>>> On Sep 30, 2020, at 00:48, Greg White <g.white@CableLabs.com> wrote:
>>> 
>>> 
>>> 
>>> On 9/29/20, 10:05 AM, "tsvwg on behalf of Gorry Fairhurst" <tsvwg-bounces@ietf.org on behalf of gorry@erg.abdn.ac.uk> wrote:
>>> 
>>>    See below.
>>> 
>>>    On 29/09/2020 16:49, Rodney W. Grimes wrote:
>>>>> On Mon, 28 Sep 2020, Gorry (erg) wrote:
>>>>> 
>>>>>> At some point the working group needs to publish the spec. - This final
>>>>>> stage is taking longer than I would hope, and I do hope that will be
>>>>>> seeing a WGLC soon.
>>>>> Do we actually?
>>>    Yes.
>>> 
>>>>> I still haven't ruled out that we decide not to use these bits, for now,
>>>>> because we don't know enough how it will affect the entire Internet.
>>>    Still possible, if the WG as a whole decides that.
>>> 
>>> We're talking about WGLC to begin an experiment.
>> 	[SM] I believe that we are conflating two "experiments" here:
>> 
>> 1) the absolutely required experiments to what degree L4S will realize its claimed characteristics (or its promises) under a number of conditions that are relevant/prevalent in the real world.
>> 2) how to deploy L4S in the real world.
>> 
>> If we look carefully at these two experiments, it becomes obvious that an RFC in the experimental track seems necessary for 2), but 1) can and should proceed long before 2) is addressed. To be blunt, if experiment 1) does not clearly demonstrate improvements over the state of the art, then 2) becomes moot.
>> 
>> @CHAIRS: Could you please let me know, if you agree?
>> 
>>> The interest by the community in achieving the expected benefits of L4S was well represented at the consensus call. I believe that the WG needs to honor the consensus position and move forward with planning the experiment.
>> 	[SM] That is a call for experiment 1), but the drafts really aim for experiment 2).
>> 
>> 
>>> With appropriate guidance I believe the experiment can commence and we can begin to understand whether any of the concerns raised are real, and if they are, how to resolve them.
>> 	[SM] I disagree; at the current time, neither of the drafts (L4S and operational guidance) is properly tailored for type 1) experiments.
>> 
>>> For the L4S Operational Guidance draft, I would appreciate constructive input from those who are interested in making the experiment a success (thanks already to Sebastian for his comments).  Important aspects include: If a sender wishes to test an L4S congestion control algorithm (Prague or otherwise) what should they be monitoring to understand how much of a benefit are they getting, what impact are they causing to non-L4S flows, etc.? If a network operator wishes to test an L4S compliant network element, what should they monitor?  What actions should either entity take with the resulting information?
>> 	[SM] These are all good points. Let me add a few thoughts here.
>> 
>> Let's assume an endpoint wants to participate in testing; it will require a few things:
>> 1) an L4S-compatible transport that sets ECT(1), L4S-incompatible transports that set ECT(1), as well as standards-compliant transports (ECT(0) / no ECN)
>> 2) one or more L4S-compatible remote endpoints
>> 3) an L4S AQM at a bottleneck that allows monitoring all traffic (the only way endpoints will be able to assess the side-effects of L4S traffic is to actually terminate all such flows).
>> 
>> Point 3) alone makes it clear that for this kind of endpoint-driven experiment only small-scale AQMs will be suitable, like AQMs on an end-user internet access link that exclusively manage that link's traffic and do not interact with other access links' AQM instances.
>> 
>> And given these constraints, all that seems required for type 1) experiments IMHO is to put the access link AQMs under end-user control (preferably with three states: AQM off, all traffic is treated like ECT(0), full L4S-compliant behavior). If such an AQM defaults to AQM off, this is already safe to deploy today, without requiring any additional RFC...
>> To be explicit, this opt-in approach allows us to control the fall-out quite well; any side-effects of the L4S AQM misbehaving during testing will be mostly restricted to the end-node that opted in. Sure, that is not 100% safe, but it will allow the crucial type 1) experiments to proceed without having to commit to ECT(1) at the IETF RFC level.
>> 
>> Now, realistically, nothing so far would have made that course of action impossible, even today, so I wonder why such a setup has not already been used to run the robustness and reliability tests that L4S has been lacking (for almost a decade now?)
>> 
>> 
>> As is, what I predict is going to happen is that ECT(1)/L4S will end up as an RFC without proper robustness and reliability testing, heck, without even a clear indication that its promises will be realized over anything but short-RTT, low hop-count links, by sheer conservation of momentum.
>> 
>> Some comments by the chairs seem to indicate that we are already well along that path... but there is still time to do it right. L4S has been slow enough that waiting another X years can not be a showstopper, not that the required type 1) experiments necessarily would take years...
>> 
>> 
>> Best Regards
>>         Sebastian
>> 
>> 
>> 
>>> In my opinion, Issue 29 can be closed. RFC3168 detection should continue to evolve and should be referenced in the Operational Guidance draft, but fallback should not be required in the experimental protocol drafts for the reasons outlined in Issue 29, and on the mailing list.
>>> 
>>> -Greg
>>> 
>>> [snip]
>>> 
>>> 
> Let me try as one chair who has seen several TSVWG experiments over the years:
> 
> The WG chose to adopt this some time ago, presumably because they saw the potential to offer benefit over the state of the art.

	[SM] Please elaborate. In the ECT(0) versus ECT(1) discussion I did not have the impression that the audience had deep knowledge about either L4S or the state of the art... I base this on the comments that talked about the potential benefits, which, without first running the type 1) experiments, really are just wishful thinking. Don't get me wrong, often an initial gut feeling is a good impulse to start an experiment on, but one needs to stay open to the real possibility that a nice idea might crumble once it comes into contact with reality.


> We now have a set of working group documents.

	[SM] As I indicated in the past, the way the WG acted has set us on an almost unstoppable path to actually publish these drafts as RFCs, come hell or high water, without any substantial changes. All that based on the feeling that it would be nice if something like that would work over the internet. Color me not impressed.

> 
> The primary consideration for progression of this type of specification is understanding of the safety of the method for at-scale deployment.

	[SM] I would argue that the first question should be whether at-scale deployment is actually a viable course of action, which IMHO requires more diverse testing first. Neither of the current drafts gives any consideration to what happens if the experiment fails and how to roll it back. Following that course we will end up with L4S AQMs deployed (but potentially not activated) and the ECT(1) codepoint consumed, independent of whether L4S will actually see any meaningful use.
So the LEAST I would expect, if we insist upon publishing these RFCs before doing the safety experiments, is for the drafts to describe how to roll back each of them and to state explicitly in what time frame the scarce resource (IP header codepoint) can be returned for other uses.
	The way I see it, either first make sure that no roll-back is likely to be needed (by performing type 1) experiments to convince ourselves that L4S as currently designed is up to the job) or have an actionable plan for how to clean up the fall-out of a failed experiment, and that action plan needs to be part of the internet draft/RFC, especially with regard to the future use of ECT(1) in case of failure or lack of market adoption of L4S.

@CHAIRS: Since this seems so obvious to me, but also quite contrary to what the consensus in the WG is, could you please explain what I am missing and why the current approach, "no proper experiments before an RFC and no documented rules how to undo the experiment", is actually aligned with the WG's aims and goals?



>  Specifically, for such an experiment, appropriate controls need to be in place to actively prevent congestion collapse, and avoid security-related pitfalls. The potential set of issues needs to be teased out for network operators and the WG to understand how to evaluate this.

	[SM] I would agree, but as said before, this is well possible right now without any of the drafts progressing to RFC status.
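
	To make the user-controlled three-state access-link AQM from my earlier mail (quoted above) a bit more concrete, here is a rough sketch of the classification step only. The ECN codepoint values are the ones from RFC 3168; sending ECT(1) and CE to the L4S queue is my reading of the l4s-id draft. The names AqmMode and select_queue and the queue labels are made up for illustration and do not correspond to any existing implementation.

```python
from enum import Enum

# ECN field = low two bits of the IPv4 TOS / IPv6 Traffic Class byte (RFC 3168).
NOT_ECT = 0b00
ECT1 = 0b01
ECT0 = 0b10
CE = 0b11


class AqmMode(Enum):
    """Hypothetical end-user switch with the three states discussed above."""
    OFF = "off"               # default: no AQM, plain FIFO
    CLASSIC_ONLY = "classic"  # all traffic is treated like ECT(0)
    L4S = "l4s"               # full L4S-style classification


def select_queue(tos_byte: int, mode: AqmMode) -> str:
    """Return the (illustrative) queue label for a packet's TOS byte."""
    ecn = tos_byte & 0b11
    if mode is AqmMode.OFF:
        return "fifo"
    if mode is AqmMode.CLASSIC_ONLY:
        # ECT(1) gets no special treatment in this state.
        return "classic"
    # Assumption: in L4S mode, ECT(1) and CE go to the low-latency queue.
    return "l4s" if ecn in (ECT1, CE) else "classic"
```

	With the default of AqmMode.OFF nothing changes for users who did not opt in, which is the whole point of the opt-in approach; only an end-user who flips the switch exposes their own link to L4S behavior.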

> 
> Technical evaluation of performance, latency, fairness, etc is always welcome - including discussion of published papers and presentations of results (e.g., in ICCRG). This is however, not the purpose of the TSVWG specification.

	[SM] IMHO this is backwards: without something demonstrably working robustly and reliably (over the existing internet), writing and publishing an RFC seems quite speculative, and in this case actually dangerous, as https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-10#page-20 states:

"6.  L4S Experiments

   [I-D.ietf-tsvwg-aqm-dualq-coupled] sets operational and management
   requirements for experiments with DualQ Coupled AQMs.  General
   operational and management requirements for experiments with L4S
   congestion controls are given in Section 4 and Section 5 above, e.g.
   co-existence and scaling requirements, incremental deployment
   arrangements.

   The specification of each scalable congestion control will need to
   include protocol-specific requirements for configuration and
   monitoring performance during experiments. Appendix A of [RFC5706]
   provides a helpful checklist.

   Monitoring for harm to other traffic, specifically bandwidth
   starvation or excess queuing delay, will need to be conducted
   alongside all early L4S experiments.  It is hard, if not impossible,
   for an individual flow to measure its impact on other traffic.  So
   such monitoring will need to be conducted using bespoke monitoring
   across flows and/or across classes of traffic."

There is exactly zero mention of how to undo the L4S experiment in case of the expected failure*. There is also zero mention of how ECT(1) is going to be recycled. To be explicit, if the consensus is to allow the L4S experiment to spoil the ECT(1) codepoint for years to come, this absolutely MUST be spelled out explicitly in the RFC, so that everybody knows up front what is at stake.



*) I say expected failure because, by now, I take the lack of data demonstrating robust and reliable functionality of L4S over rather normal internet conditions (outside of the overtested "close CDN to end-host" condition) as an indirect sign that this data is hard to produce, indicating that L4S will not keep its promises over longer-haul paths while we will be left to carry its considerable cost.



> 
> In the future, those choosing to implement and deploy the specification will provide most useful input to any future progression along the standards track, and to inform any resulting best current practice - or indeed, for this WG to understand whether the experiment is ultimately deemed successful or unsuccessful.

	[SM] If we nominally still allow for data to convince us that the experiment (whatever the experiment actually is) was unsuccessful, I believe the drafts need explicit sections discussing in detail what should happen on experiment termination. This has been brought up in the past by others but has not made it into the drafts as far as I can see.

Best Regards
	Sebastian


> 
> Gorry
>
