Re: [tsvwg] path forward on L4S issue #16

Sebastian Moeller <moeller0@gmx.de> Thu, 18 June 2020 07:27 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <1931e544-a8a5-bbb9-8795-58fb40c638db@mti-systems.com>
Date: Thu, 18 Jun 2020 09:27:38 +0200
Cc: tsvwg@ietf.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <466B3486-4FE0-4DC1-A83A-0AD48E33DAE0@gmx.de>
References: <202006171419.05HEJClG085550@gndrsh.dnsmgr.net> <d5c08e8f-134b-6ca2-c490-27b574688d16@mti-systems.com> <8CC077A6-D6D6-4ECF-AC66-08CFC93E9665@gmx.de> <1931e544-a8a5-bbb9-8795-58fb40c638db@mti-systems.com>
To: Wesley Eddy <wes@mti-systems.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/jHGDWxkvr-vJZdd95HkfqFObOaI>
Subject: Re: [tsvwg] path forward on L4S issue #16

Hi Wes,

More below in-line.


On 17 June 2020 22:51:53 CEST, Wesley Eddy <wes@mti-systems.com> wrote:
> (I'm at least for now ignoring the parts of this not related to this 
> thread on issue #16.)

       [SM] Certainly an option; it will not make the objections go away, though, but you know that.


> 
> On 6/17/2020 3:25 PM, Sebastian Moeller wrote:
>>> (and if more emerge, they can be dispositioned and added if needed).
> This particular thread is about agreeing on a path forward on the one
> labelled "issue #16" in the tracker.  Suggestions and discussion about
> that are good and welcome; repeatedly accusing others of bad intentions
> is not.
>> 	[SM] In the spirit of progress, how about team L4S either accepts
>> rfc3168 compatibility to be a hard requirement and we start to discuss
>> what kind of properties the L4S solution needs to achieve (degree of
>> flow imbalance, temporal fidelity of the detection, false negative
>> rates, ...) or team L4S lays out a plan how to assess the position
>> Ingemar (and Bob in the past) seem to take that rfc3168 AQMs are
>> sufficiently rare in the real world to allow ignoring them. I would
>> expect that we then discuss how to measure this** and decide on what
>> thresholds we consider for significance before hand.
> 
> This is already a requirement.  In the L4S I-D draft, there is a 
> requirement:
> 
> A.1.4 Fall back to Reno-friendly congestion control on classic ECN
>           bottlenecks
> 
>    Description: A scalable congestion control needs to react to ECN
>    marking from a non-L4S, but ECN-capable, bottleneck in a way that
>    will coexist with a TCP Reno congestion control [RFC5681].

       [SM] You seem to be ignoring Bob's and Ingemar's push to simply have that requirement voided due to the perceived insignificance of the existing rfc3168 deployment, in spite of the fact that a) nobody has deployment numbers and b) even with numbers it would be hard to assess significance. E.g. there are far fewer inter-AS routers than there are home routers, so even if all peering routers employed rfc3168 AQMs this would result in a relatively tiny number, but I assume everybody would agree that that would be a highly significant deployment, no?
Do I understand your position correctly that this requirement will not be relaxed/given up?


> 
> Again, the point of this thread is to figure out what the working group
> needs to do to achieve this.  Looking at it as a co-chair, I'm seeing
> basic agreement that the classic bottleneck detection algorithm will
> likely never be perfect.  I don't think there's agreement (or even very
> specific suggestions) about how good it needs to be.

       [SM] Hence my proposal for the WG to work out the requirements in terms of what fraction of an RTT the detection needs to trigger by and what level of false negatives (misclassified true rfc3168 paths) is deemed acceptable.
	I will start out with a proposal that the L4S team should like:
temporal response period: 1 ms
correct detection rate: 99.999 %

But realistically how about a more relaxed:

temporal response period:  <= 4 RTTs after congestion signals appeared

rationale: it takes <= 1 RTT to feed back the first CE to the sender, so the sender has 3 RTTs to figure out whether the path is likely rfc3168 compliant or not. That limits the fallout of misclassifications significantly. If the transport layer later accumulates sufficient evidence to declare the detection a false positive, it should be allowed to re-evaluate its decision and try 1/p CE interpretation again, still with a 4 RTT fall-back window.
Ideally TCP Prague SHOULD start out assuming all paths are rfc3168 and switch to 1/p style CE-interpretation only after sufficient evidence has been gathered. But that would amount to safety by design, an approach clearly not preferred by the L4S effort.
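
To make the timing concrete, here is a minimal sketch of how a sender could track such a bounded fall-back window per connection. It is an illustration only: the class, its parameters and the "verdict" input (the output of some rfc3168 detection heuristic) are hypothetical names, not taken from TCP Prague or any L4S draft, and falling back when the detector is still undecided at the deadline is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    CLASSIC = "rfc3168"   # respond to CE like a loss (Reno-friendly)
    SCALABLE = "1/p"      # respond in proportion to the CE marking rate


class PathClassifier:
    """Hypothetical per-connection tracker for a bounded fall-back window."""

    def __init__(self, window_rtts: int = 4, start_classic: bool = False):
        self.window_rtts = window_rtts
        # start_classic=True models the "assume rfc3168 first" variant.
        self.mode = Mode.CLASSIC if start_classic else Mode.SCALABLE
        self.rtts_in_window = 0   # 0 means no window is currently open

    def on_rtt(self, ce_feedback: bool, verdict: Optional[Mode]) -> Mode:
        """Call once per RTT.

        ce_feedback: CE feedback reached the sender during this RTT.
        verdict: the detector's current belief, or None while undecided.
        """
        if self.mode is Mode.SCALABLE:
            if self.rtts_in_window == 0 and ce_feedback:
                # The first CE took up to 1 RTT to be fed back: count it.
                self.rtts_in_window = 1
            elif self.rtts_in_window > 0:
                self.rtts_in_window += 1

            deadline = self.rtts_in_window >= self.window_rtts
            if verdict is Mode.CLASSIC or (deadline and verdict is None):
                # Detected rfc3168, or still undecided at the deadline
                # (sketch assumption): fall back to the classic response.
                self.mode = Mode.CLASSIC
                self.rtts_in_window = 0
            elif deadline:
                # Detector says the path is not rfc3168: close the window.
                self.rtts_in_window = 0
        elif verdict is Mode.SCALABLE:
            # Re-evaluation: enough evidence that the fall-back was a false
            # positive, so retry 1/p; a new window opens on the next CE.
            self.mode = Mode.SCALABLE
            self.rtts_in_window = 0
        return self.mode
```

With the defaults, a sender that sees CE feedback but gets no verdict from its detector stays in 1/p mode for RTTs 1 to 3 of the window and falls back at RTT 4; a later verdict of Mode.SCALABLE moves it back to 1/p with a fresh window.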

correct detection rate: 99.9 % (and no persistent wrong classification of individual paths)

rationale: these numbers will be arbitrary, but for all intents and purposes 99.9 % should sufficiently take the sting out of false negatives. BUT that only holds if we do not permit persistent misclassifications, as those can easily lead to starvation of non-1/p flows traversing the same rfc3168-compliant hop.
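
As a rough feel for what such a rate means in practice, the following sketch computes the expected number of false negatives and the chance of seeing at least one, for a given number of detection events on true rfc3168 paths. It assumes that detection events are independent, which is exactly what persistent per-path misclassification would violate; the function names and the sample counts are illustrative only.

```python
def expected_false_negatives(n_detections: int, correct_rate: float) -> float:
    """Expected number of rfc3168 paths misclassified in n_detections tries."""
    return n_detections * (1.0 - correct_rate)


def prob_any_false_negative(n_detections: int, correct_rate: float) -> float:
    """Probability of at least one misclassification, assuming independence."""
    return 1.0 - correct_rate ** n_detections


# Compare the two proposed rates for a few illustrative event counts.
for rate in (0.999, 0.99999):
    for n in (1_000, 1_000_000):
        print(
            f"rate={rate} n={n}: "
            f"expected={expected_false_negatives(n, rate):.2f} "
            f"p_any={prob_any_false_negative(n, rate):.4f}"
        )
```

If misclassification is persistent rather than independent, the same 0.1 % would instead describe a fixed set of paths that are wrong every time, which is the starvation case described above.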

I am happy to discuss other explicit proposals.


> There might be 
> agreement that in any case, operators should not expect that all flows 
> using ECT(1) will succeed in being nice to classic flows. 

	[SM] It is worth mentioning that we typically take adversarial traffic into account, i.e. traffic that maliciously tries to game the system to its advantage, but here we are talking about technically well-meaning flows that are simply hobbled by the shortcomings of their underlying technology. Which is ironic, given that none of this is deployed yet and hence the shortcomings might simply be engineered away pro-actively instead.


> Along the 
> lines of some of Paul Vixie's comments in earlier threads, protection 
> from unfriendly or lying flows is important, and this should be true 
> even without regard to whether an operator deploys L4S in their network,
> or if traffic is using ECT(1).  David has suggested to focus on
> operational guidelines that would help to avoid ECT(1) bleaching, and I
> think many agree that such bleaching would be undesirable.

	[SM] As undesirable as it is unavoidable, given the approach to testing that has been, and currently is, used. If you leave only one remedy, please do not be sad when the real world uses that one remedy. As an operator, would you really consider it good advice to ignore ECT(1) packets in your RFC3168 AQM and still happily pass such booby-trapped packets on to your paying customers, or would you pro-actively clean up that mess, since you need to touch all of these packets anyway? Bleaching ECT(1) (or bleaching both ECN bits if an ISP does not use ECN) surely is the trustworthy, safe-by-default method.


> 
> So, it seems at the moment like we should be working on this in parallel
> to any improvements on classic bottleneck detection, because it will be
> important in any case.

	[SM] Well, I am really interested to read proposals for what that would look like and how that would work out with operators that are not constantly reading internet drafts.

Best Regards
	Sebastian