Re: [tsvwg] plan for L4S issue #29

Bob Briscoe <ietf@bobbriscoe.net> Tue, 22 September 2020 17:42 UTC

To: Jonathan Morton <chromatix99@gmail.com>, Wesley Eddy <wes@mti-systems.com>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <ca8ede0e-53a2-f4ff-751d-f1065cf5e795@mti-systems.com> <4FE5E2A4-7853-487E-82E7-7B74AA2B6FC4@gmail.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <5ebf850e-631f-4293-2ec8-7c80349e6a02@bobbriscoe.net>
Date: Tue, 22 Sep 2020 18:41:56 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/Y0-mXzb-NM8j_xyuRIDeIodpLzA>

Jonathan, Wes, list,

Sorry, I've been dead to the L4S conversations on this list for weeks, 
but I've resolved to get involved again. I stopped interacting 
on L4S over the summer because I became burned out by the L4S 
conversation being dominated by the same two or three people who do not 
want the agreed deliverables at all. I have never had to work on a 
technology before where I am faced with a continual stream of blocking 
messages, rather than constructive criticism or solutions being offered.

It has always been recognized that there are not enough bits in the IP 
header for a perfect encoding scheme, so that we will have to choose the 
least worst compromise. We had a consensus call in April, in which the 
questions of Classic ECN fallback (#16) and false positives in the 
current algo (#29) were well understood by all concerned. The result was 
a decision not to go with the alternative solution (SCE) that inherently 
addresses issues #16 & #29, but instead to continue with the solution 
(L4S) that partially addresses these issues and fully addresses other 
concerns. The L4S design team were asked to prepare a new ops guidelines 
draft as part of that decision.

I became disheartened about working further on the Classic ECN AQM 
fallback algo when all the work we'd done was just dismissed as futile. 
It became apparent that no amount of work on it would ever satisfy the 
few people who have been vocally opposing L4S as a whole. Now that's a 
shame, because I/we had worked out an additional heuristic to address 
the large bulk of the false positives that appeared at low link capacity 
(note that is different from low flow rate). I admit that this solution 
would not have addressed the few cases that arise in apparently random 
circumstances. And I admit that, without having tried it, it might not 
work at all.

I now prefer to work on Classic ECN AQM detection more than fallback. My 
logic for this is that L4S can be /unfair/ against Classic flow(s) in a 
Classic ECN AQM, but it is not /unsafe/. Unsafe means starvation, but 
Classic flows always still progress against L4S. So, it should be 
sufficient for L4S senders to /monitor/ for Classic AQMs, at least in 
the initial stages of the L4S experiment.

This can then become part of the monitoring infrastructure that the L4S 
ops guidance proposes for the L4S experiment. It can highlight where 
Classic ECN AQMs might have been deployed. If it produces false 
positives, further digging should discover that there is not actually 
any Classic ECN AQM where it says there is. So, when detection is only 
used for monitoring, false positives at least don't harm performance.
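
A minimal sketch of the difference between monitoring and in-band 
fall-back, in Python. Every name here (Mode, Sender, on_detection) is 
hypothetical and chosen only for illustration; no detection heuristic is 
implemented, and the detector's verdict is simply passed in as a boolean. 
It is not the Prague implementation, just a way to see why a false 
positive is harmless in one mode and costly in the other.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"    # report suspected Classic ECN AQMs, leave the sender alone
    FALLBACK = "fallback"  # additionally switch the sender to a Classic response


@dataclass
class Sender:
    # Hypothetical sender state: which congestion response is in use,
    # plus the paths flagged for later investigation.
    response: str = "l4s"
    reports: list = field(default_factory=list)


def on_detection(sender: Sender, path: str, classic_suspected: bool, mode: Mode) -> None:
    """Handle one verdict from an assumed Classic ECN AQM detector."""
    if not classic_suspected:
        return
    # In both modes the suspicion is recorded so someone can dig further.
    sender.reports.append(path)
    if mode is Mode.FALLBACK:
        # Only in-band fall-back changes behaviour, so only here can a
        # false positive cost the L4S flow its performance.
        sender.response = "classic"


# Usage: a (possibly false) positive in monitoring mode.
monitored = Sender()
on_detection(monitored, "path-A", classic_suspected=True, mode=Mode.MONITOR)
```

After the call, the monitored sender still uses its L4S response, and 
"path-A" is merely queued for investigation.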

If it turns out that there are a lot of Classic ECN AQMs out there, it 
then becomes important to improve the detection algo so it can be used 
in-band for fall-back. On the other hand, if we find there are very 
few, it becomes preferable to use the network configuration 
techniques described in the ops guidance to alter the Classic ECN AQMs 
to isolate Classic ECN and L4S ECN traffic.

Regards



Bob

PS. There are also the cases where L4S flows don't perform well 
(particularly around start-up behaviour) in a Classic ECN AQM within an 
FQ scheduler (e.g. the experiments Pete did with L4S over FQ_CoDel and 
with Cake). Over the summer Joakim and I have been working on faster 
response to marking in the Prague algo, and we need to combine that with 
delay measurements for these cases.

I would also like to see a commitment to improving the CoDel and Cobalt 
control laws to increase ECN marking more rapidly in response to 
consistently increasing delay, because this has always been a problem 
with unresponsive flows, irrespective of whether L4S had ever appeared 
on the scene. I'd be willing and interested to help with that sort of 
work, assuming my day job leaves sufficient time for it.


On 31/07/2020 19:03, Jonathan Morton wrote:
>> On 31 Jul, 2020, at 6:41 pm, Wesley Eddy <wes@mti-systems.com> wrote:
>>
>> Hello, ticket #29 for the L4S documents is about classic bottleneck detection misidentifying L4S queues as classic ECN queues.
>>
>> https://trac.ietf.org/trac/tsvwg/ticket/29
>>
>> In contrast to other issues, it doesn't seem like this should block a WGLC on the L4S drafts.
>>
>> 	• It is specific to classic bottleneck detection algorithm, which is planned to be worked on in the Prague ICCRG draft.
>> 	• The result is sometimes failing to achieve the best possible L4S behavior, but doesn't seem to be an Internet safety issue.  This resulting in people turning off classic bottleneck detection would be a different issue, and something maybe the operator guidelines would address.
>> 	• It seems like it can be worked on further in the course of L4S experimentation, without negative effects to others.
>> So, I believe we should track this work in the ICCRG, and close the ticket here.  Please let me know in the next week if I've misunderstood any aspect of this and it should remain open.
> Presently, the Prague congestion control algorithm is unsafe on the Internet without a properly functioning classic-bottleneck detection heuristic.  Additionally, the classic-bottleneck detection heuristic that has been published and demonstrated does not function properly.  This combination of facts absolutely *is* a blocker for WGLC.
>
> Therefore, this issue should not be closed until it has been concretely addressed.
>
>   - Jonathan Morton
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/