Re: [tsvwg] draft-white-tsvwg-l4sops-02.txt proposa for additional sections

Sebastian Moeller <moeller0@gmx.de> Wed, 05 May 2021 21:34 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.20\))
From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <25905C45-2F9E-43EC-B89D-74BBF413DC00@cablelabs.com>
Date: Wed, 05 May 2021 23:34:06 +0200
Cc: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>, tsvwg IETF list <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <31990E51-05A7-41ED-9C24-5B40F5B4CC32@gmx.de>
References: <C2BD1FC5-673F-4C29-8FFE-16CA367F388A@gmx.de> <HE1PR0701MB2299CEAEED3FB9C1CCEA5385C24B9@HE1PR0701MB2299.eurprd07.prod.outlook.com> <3D9E48B1-6CA2-44A8-B4AB-E705E0799F33@gmx.de> <HE1PR0701MB2299D0906E6C4C4692337D0EC24B9@HE1PR0701MB2299.eurprd07.prod.outlook.com> <25905C45-2F9E-43EC-B89D-74BBF413DC00@cablelabs.com>
To: Greg White <g.white@CableLabs.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/K84YmCQ7_SdRxbeN06yymWr2DwI>
Subject: Re: [tsvwg] draft-white-tsvwg-l4sops-02.txt proposa for additional sections
Precedence: list

Hi Greg,


> On May 5, 2021, at 22:32, Greg White <g.white@CableLabs.com> wrote:
> 
> Hi Sebastian and Ingemar,
> 
> I didn't see a resolution to this on the mailing list.   
> 
> Sebastian, I agree that a section like you described would be appropriate, but in regards to the proposed text for 6.2, since the actual steps (and IETF recommendations) for discontinuing the experiment might depend on the situation at hand (what the implementations are, who deployed them, how much, where, how controlled, etc.) it seems to me to be premature (and unnecessary) to predict the specific actions that participants or non-participants should take, particularly actions like bleaching or blackholing traffic.

	[SM] I disagree: if we want to avoid a situation like ECT(1) as nonce again, we should make sure that on failure we make continued use of ECT(1) as unlikely as possible. That requires sending a strong signal to the protocol end points...


>   What seems more germane for this document is a reiteration of the requirements on implementations that would be important for concluding the experiment, i.e. that L4S congestion controllers be configurable to shut off L4S support and that L4S network elements be configurable to treat ECT1 the same as ECT0.

	[SM] That is the very least one would hope for, that nobody is going to deploy an experimental AQM without an off-switch ;)


> Here is a suggestion on what we add to L4Sops (I also did some wordsmithing of your text prior to 6.2):
> 
> 
> 6. Conclusion of the L4S experiment
> 
> This section gives guidance on how L4S-deploying networks and endpoints should respond to either of the two possible outcomes of the IETF-supported L4S experiment.
> 
> 6.1 Successful termination of the L4S experiment
> 
> If the L4S experiment is deemed successful, the IETF would be expected to move the L4S specifications to standards track.  Networks would then be encouraged to continue/begin deploying L4S-aware nodes and to replace all non-L4S-aware RFC3168 AQMs already deployed as far as feasible, or at least restrict RFC3168 AQM to interpret ECT(1) equal to NotECT. Networks that participated in the experiment would be expected to track the evolution of the L4S standards and adapt their implementations accordingly (e.g. if as part of switching from experimental to standards track, changes in the L4S RFCs become necessary).
> 
> 6.2 Unsuccessful termination of the L4S experiment
> 
> If the L4S experiment is deemed unsuccessful due to lack of deployment,

	[SM] I do not consider passive deployment numbers to be a great metric for success and failure, sorry. Success should be measured against the list of goals of the L4S experiment; the very least would be "active deployment", meaning that if the AQMs are known to be deployed but no traffic actually exercises the ECT(1) AQM path, the number of AQM nodes seems irrelevant as a success measure.


> it might need to be terminated: any L4S network nodes should then be un-deployed and the ECT(1) codepoint usage should be released/recycled as quickly as possible, recognizing that this process may take some time. To facilitate this potential outcome, [draft-ecn-l4s-id] requires L4S hosts to be configurable to revert to non-L4S congestion control, and networks to be configurable to treat ECT(1) the same as ECT(0).

	[SM] As above, if we want to expedite recycling of ECT(1) we should use stronger measures here... The thing is the AQM nodes themselves  might be under control of able network experts that change configuration in lock-step with what happens in RFC-space, but we need to get to all end-nodes that still use ECT(1) as well. 
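
	As a rough illustration of the ingress treatments mentioned in this thread (re-marking ECT(1) to NotECT, or dropping ECT(1) packets), here is a minimal Python sketch. The names Ect1Policy and apply_ingress_policy are hypothetical and come from no draft or implementation; only the ECN codepoint values are taken from RFC 3168. It operates on the TOS / Traffic Class byte alone; a real forwarder would also have to update the IPv4 header checksum after rewriting that byte.

```python
from enum import Enum
from typing import Optional

# ECN codepoints: the two low-order bits of the IPv4 TOS / IPv6 Traffic Class byte (RFC 3168).
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11
ECN_MASK = 0b11


class Ect1Policy(Enum):
    """Hypothetical ingress policies for ECT(1) after a terminated experiment."""
    PASS = "pass"      # leave ECT(1) packets untouched
    BLEACH = "bleach"  # re-mark ECT(1) to Not-ECT
    DROP = "drop"      # discard ECT(1) packets


def apply_ingress_policy(tos: int, policy: Ect1Policy) -> Optional[int]:
    """Return the (possibly rewritten) TOS byte, or None if the packet is dropped.

    Only packets carrying ECT(1) are affected; Not-ECT, ECT(0) and CE pass unchanged.
    """
    if tos & ECN_MASK != ECT_1:
        return tos
    if policy is Ect1Policy.BLEACH:
        # Clear the ECN bits, keep the six DSCP bits.
        return tos & ~ECN_MASK & 0xFF
    if policy is Ect1Policy.DROP:
        return None
    return tos
```

	Note that neither treatment touches CE-marked packets, since a CE mark does not reveal whether the packet was originally sent as ECT(0) or ECT(1).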

Best Regards
	Sebastian

> 
> 
> 
> -Greg
> 
> 
> 
> 
> 
> 
> 
> 
> On 4/17/21, 12:56 PM, "Ingemar Johansson S" <ingemar.s.johansson@ericsson.com> wrote:
> 
>    Hi Sebastian
>    Please see inline [IJ]
>    /Ingemar
> 
>> -----Original Message-----
>> From: Sebastian Moeller <moeller0@gmx.de>
>> Sent: den 17 april 2021 16:55
>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>> Cc: tsvwg IETF list <tsvwg@ietf.org>; Greg White <g.white@cablelabs.com>
>> Subject: Re: [tsvwg] draft-white-tsvwg-l4sops-02.txt proposa for
>    additional
>> sections
>> 
>> Hi Ingemar,
>> 
>> in essence you seem to argue against adding such a section, do i
>    understand
>> that correctly?
> 
>    [IJ] Not yet. But I am interested to know if there is any precedent in this
>    area in IETF, i.e., is there any experimental RFC that dictates in detail how
>    the experimental RFC is undone? What I have seen is that RFCs are declared
>    historic, but I have a hard time seeing how the exit of an RFC is
>    formulated in detail, especially if other SDOs adopt the RFC. Should IETF
>    oversee that endpoints are updated? You have yourself mentioned how hard it is
>    to update e.g. RFC3168 edge routers.
> 
>> 
>> 
>>> On Apr 17, 2021, at 16:21, Ingemar Johansson S
>> <ingemar.s.johansson@ericsson.com> wrote:
>>> 
>>> Hi Sebastian..
>>> Unfortunately the proposed section 6.2 sounds a lot like the famous
>>> Mission Impossible one-liner "This tape will self-destruct in five
>>> seconds. Good luck, Jim."
>> 
>> 	[SM] How that? Section 6 is requiring network operators that
>> deployed L4S during the experiment to take active steps to incentivize
>    end-
>> points to stop using L4S-signaling ASAP, there is no "self-destruction"
>> involved at all; I am not sure whether that is a fitting analogy. How
>    about
>> commenting upon the specifics of my proposal instead?
>    [IJ] I see it as some kind of requirement for a self-destruction mechanism
>    in the sense that nodes that implement L4S should exit L4S at a given day or
>    otherwise put some requirement on network operators to terminate it. How
>    many will follow these guidelines, and based on what decision: a special IETF
>    interim that declares "now we close the shop, everybody out!"? What if
>    other SDOs have picked up L4S?
>    The only thing I see is that experimental RFCs fade away because of lack of
>    interest, poor performance or whatever other reason that may exist.
> 
>> 
>> 
>>> The ECN Nonce was experimental and was declared historic a
>>> few years ago; literally speaking, RFC3540 did not self-destruct, it
>>> just faded away.
>> 
>> 	[SM] Sure, but I note that the Nonce had no negative side-effects on
>> standards compliant AQMs (unlike L4S): if after a terminated experiment
>    L4S
>> signaling is still employed by end-points this will still wreak havoc with
>    rfc 3168
>> AQM and strongly violate the expected sharing behaviour at those nodes.
>> Now at that point in time, rfc3168 will still have PS status and there are
>    no
>> motions to change that I am aware of, while L4S will have failed and the
>    RFC
>> needs to be made historic; I would humbly argue that rfc3168 operators
>> should be able to expect no delayed side-effects from a terminated
>> experiment, no?
>> 
>> 
>>> I don't see anywhere in RFC3540 about what end hosts and networks
>>> should do when RFC3540 is declared historic.
>> 
>> 	[SM] Given the lack of side-effects there is/was no need to do so.
>> 
>> 
>>> And I guess that the way
>>> forward, in case the whole L4S experiment falls flat, is that the L4S
>>> ID RFC is declared historic, I guess this should be enough.
>> 
>> 	[SM] How is that going to help one iota with the situation that will
>> arise then: end-points will continue to negotiate L4S-signaling and
>    rfc3168
>> AQM on the path will take a hit from that.
>> 	Yes,  I agree that this is a consequence of a design decision in L4S
>> (one I consider sub-optimal), but it has become clear, that team L4S is
>    not
>> going to entertain the idea of actually changing the L4S design one bit.
>    So if
>> that dangerous signal stays part of L4S, we need a way to un-deploy L4S
>> signaling from end-points. My proposal does not achieve that by itself,
>    but it
>> will
>> 
>>> But let's imagine that detailed information is needed to address the
>>> case that L4S fails. This leads to the question.
>>> Are there any other examples in IETF's history where an experimental
>>> RFC has been accompanied with a detailed instruction on how to clean
>>> up after the experiment ?
>> 
>> 	[SM] While I am as much a fan of precedent as the next guy, I am
>> not interested in changing/improving IETF processes in general. I am
>> interested however in keeping the blast range of the L4S experiment as
>    small
>> as possible. Data so far strongly indicates failure is likely and hence
>    prudent
>> engineering requires taking precautions.
>    [IJ] Sure, I understand that you believe that L4S will be a big fail, and
>    you want some exit method other than declaring it historic. But perhaps I
>    misunderstand how IETF works and your intentions? I have too few years in
>    the IETF to know how all these processes work.
> 
>> 
>> BUT Ingemar, how about you propose an alternative text for the L4S-ops
>> draft how to handle possible actions after the experiment has run its
>    course
>> that acknowledges that L4S is one of the riskier experiments the IETF has
>    ever
>> started? Maybe my choice of wording was perceived as too adversarial, and
>> we agree in principle?
>> Or is your argument that you reject adding section 6 to the OPs ID?
>> 
>> Best Regards
>> 	Sebastian
>> 
>>> 
>>> /Ingemar
>>> 
>>>> -----Original Message-----
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Sebastian Moeller
>>>> Sent: den 16 april 2021 22:57
>>>> To: tsvwg IETF list <tsvwg@ietf.org>; Greg White
>>>> <g.white@cablelabs.com>
>>>> Subject: [tsvwg] draft-white-tsvwg-l4sops-02.txt proposa for
>>>> additional sections
>>>> 
>>>> Hi Greg, hi list,
>>>> 
>>>> I think that the L4S ops draft could be improved by saying something
>>> explicit
>>>> about the end of the L4S experiment and what differential actions are
>>>> recommended for participants at that point in time. I made this
>>>> argument already at the end of
>>>> https://mailarchive.ietf.org/arch/msg/tsvwg/NXEACkPX1pFriOtqsoifPzoI8
>>>> Sk/ but I believe that it might have been overlooked there at the end
>>>> of a
>>> long
>>>> email.
>>>> 
>>>> So, here is my proposal to that regard as a new section 6:
>>>> 
>>>> 
>>>> 6. What L4S-deploying networks should do at the end of the L4S
>>>> experiment
>>>> 
>>>> This section gives guidance on how L4S-deploying networks should respond
>>>> to either of the two possible outcomes of the IETF-supported L4S
>> experiment.
>>>> 
>>>> [COMMENT: I believe that there needs to be a proper definition how
>>>> the L4S experiment is supposed to be evaluated at the end to decide
>>>> between success and failure, but the L4S-ops document seems to be the
>>>> wrong place for that, so I propose to add this to one of the L4S core
>>>> IDs, preferably
>>> to
>>>> section 6 of
>>>> https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/
>>>> which already contains discussion about what the experiment is
>>>> supposed to figure out, so adding a section how to evaluate the
>>>> outcome there seems a natural fit; IMHO the deciding factor here
>>>> could be whether L4S is
>>> promoted
>>>> to proposed standard status after the experiment, if and only if that
>>> happens
>>>> L4S should be considered a success... I base this on the already
>>>> known problematic side-effects of L4S and the fact that L4S will
>>>> consume a
>>> rather
>>>> scarce resource in an IP-level code-point that exists in both IPv4 and
>>> IPv6,
>>>> which IMHO should be released ASAP if L4S fails to succeed]
>>>> 
>>>> 6.1 Successful termination of the L4S experiment
>>>> 
>>>> If the L4S experiment is deemed successful, participating networks
>>>> are encouraged to continue deploying L4S-aware nodes and if possible
>>>> replace
>>> all
>>>> non-L4S-aware rfc3168 AQM already deployed as far as feasible, or at
>>>> least restrict rfc3168 AQM to interpret ECT(1) equal to NotECT.
>>>> Participants are
>>> also
>>>> expected to track the evolution of the L4S standards and adapt their
>>>> implementations accordingly (e.g. if as part of switching from
>>> experimental
>>>> to standards track, changes in the L4S RFCs become necessary).
>>>> 
>>>> 
>>>> 6.2 Unsuccessful termination of the L4S experiment
>>>> 
>>>> If the L4S experiment is deemed unsuccessful, it might need to be
>>>> terminated: L4S network nodes should be un-deployed and the ECT(1)
>>>> codepoint usage should be released/recycled quickly.
>>>> 	To achieve the former, participants of the L4S experiment are
>>>> expected to configure their L4S-aware network nodes such that either
>>>> the LL-queue of a dual queue AQM gets completely disabled, or that
>>>> the LL- queue is switched from issuing CE-marks to pure drops;
>>>> thereby removing L4S-signaling from the network. To achieve the
>>>> latter, all endpoint hosts need to stop negotiating L4S-congestion
>>>> signaling. For nodes under control
>>> of
>>>> participating networks that should only require re-configuration of
>>>> the endpoint hosts. For nodes not under control of participants of
>>>> the experiment that will be a considerable challenge. To give a
>>>> strong signal
>>> to
>>>> such out-of-network endpoints drastic measures might be required,
>>>> like re- marking ECT(1) packets to NotECT or even dropping all ECT(1)
>>>> packets on network ingress, as these will give clear and strong
>>>> incentives to
>>> endpoints to
>>>> stop using ECT(1)/L4S-signaling. But to avoid "burning" the ECT(1)
>>> codepoint
>>>> indefinitely, such measures should be restricted to a limited
>>>> duration
>>> (e.g. 24
>>>> months) after an unsuccessful termination of the L4S experiment, to
>>>> allow
>>> the
>>>> ECT(1) codepoint to be recycled for other uses/experiments afterwards.
>>>> 
>>>> 
>>>> 
>>>> I am sure that this is not going to be complete and that there will
>>>> be
>>> differing
>>>> opinions on these sections, but I believe something similar in scope
>>> should
>>>> be added.
>>>> 
>>>> Best Regards
>>>> 	Sebastian
> 
>
