Re: [tsvwg] plan for L4S issue #29

Pete Heist <pete@heistp.net> Tue, 29 September 2020 08:26 UTC

Message-ID: <6ec53e8416422916913ae6f6a08bbe2aae061276.camel@heistp.net>
From: Pete Heist <pete@heistp.net>
To: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>, Sebastian Moeller <moeller0@gmx.de>
Cc: Wesley Eddy <wes@mti-systems.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Date: Tue, 29 Sep 2020 10:26:44 +0200
In-Reply-To: <AM0PR07MB6114EFE292E21A019F712CCCB9350@AM0PR07MB6114.eurprd07.prod.outlook.com>
References: <ca8ede0e-53a2-f4ff-751d-f1065cf5e795@mti-systems.com> <D0D3EDCE-3633-4E37-A167-3F1E09148ED9@heistp.net> <AM0PR07MB6114EDA6F2E8DCCB3D86D082B9200@AM0PR07MB6114.eurprd07.prod.outlook.com> <92c056567b3ad7af08777829314673ed66f5a96b.camel@heistp.net> <AM0PR07MB61140549F3BCAA65BBE6AD24B9380@AM0PR07MB6114.eurprd07.prod.outlook.com> <1F797C57-6284-4FA7-93F1-0CFCA903CC3C@gmx.de> <AM0PR07MB6114EFE292E21A019F712CCCB9350@AM0PR07MB6114.eurprd07.prod.outlook.com>
Content-Type: text/plain; charset="UTF-8"
User-Agent: Evolution 3.36.5
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/uBE5u-5O1Kpz0h7cuxIJqLew9-0>
Subject: Re: [tsvwg] plan for L4S issue #29
Precedence: list

On Mon, 2020-09-28 at 09:54 +0000, De Schepper, Koen (Nokia -
BE/Antwerp) wrote:
> Hi Sebastian,
> 
> > > 	How confident are we that the "wide variety of conditions"
> > > under which this supposedly works is actually representative for
> > > the existing internet?
> 
> Only real world deployments will answer this question. I guess any
> other answer will be purely speculation...
> 
> > > 	rfc3168 present; rfc3168 detected:     hit (true positive)
> > > 	rfc3168 present; rfc3168 not-detected: miss (false negative)
> > > 	rfc3168 absent;  rfc3168 detected:     false alarm (false positive)
> > > 	rfc3168 absent;  rfc3168 not-detected: correct rejection (true negative)
> 
> Thanks for this. I like this classification and the tabular
> representation a lot. I guess it is ok then to use in the draft
> "missed detection" and "false detection"? I assume the other correct
> cases will be less referred to.

Personally, "missed detection" and "false detection" sound fine to me,
as I get the right sense of both the bottleneck and the result in both
cases.
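
As an illustration only, the four outcomes and the two proposed draft
terms can be written down as a small lookup, with the miss and false
alarm rates computed from a set of test runs. This is a sketch under
stated assumptions: the names OUTCOMES, classify and rates are
hypothetical and are not taken from the drafts or from the TCP Prague
code.

```python
from collections import Counter

# Hypothetical lookup: (rfc3168_present, rfc3168_detected) -> term.
# The "missed detection" / "false detection" names are the ones
# proposed for the draft in this thread.
OUTCOMES = {
    (True, True): "hit (true positive)",
    (True, False): "miss (false negative): missed detection",
    (False, True): "false alarm (false positive): false detection",
    (False, False): "correct rejection (true negative)",
}


def classify(rfc3168_present: bool, rfc3168_detected: bool) -> str:
    """Name the outcome of one detection attempt."""
    return OUTCOMES[(rfc3168_present, rfc3168_detected)]


def rates(trials):
    """Return (miss_rate, false_alarm_rate) for (present, detected) pairs.

    A rate is reported as 0.0 when no trial of that kind was run.
    """
    counts = Counter(trials)
    present = counts[(True, True)] + counts[(True, False)]
    absent = counts[(False, True)] + counts[(False, False)]
    miss_rate = counts[(True, False)] / present if present else 0.0
    false_alarm_rate = counts[(False, True)] / absent if absent else 0.0
    return miss_rate, false_alarm_rate
```

Feeding rates() a list of (present, detected) pairs from a test
campaign gives one number per error type, which keeps missed
detections and false detections from being conflated.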

> > > 	That is why I believe that rfc3168 detection in TCP Prague is a
> > > red herring that distracts from fixing L4S's true issues. Like
> > > demonstrating that the current implementation actually performs
> > > as intended over long multi-hop, high-RTT, high-bandwidth links,
> > > over asymmetric links, and over uni- and bidirectionally saturated
> > > links.
> 
> These are all known problems that also exist in Classic TCP on FIFO
> queues. Solutions are known for several (all?) of them, which can as
> well be implemented for L4S (Prague CCs and L4S AQMs). So I wouldn't
> blame L4S for those problems or expect that a reference
> implementation is including all solutions to all known problems.
> 
> Koen.
> 
> -----Original Message-----
> From: Sebastian Moeller <moeller0@gmx.de> 
> Sent: Wednesday, September 23, 2020 7:47 PM
> To: De Schepper, Koen (Nokia - BE/Antwerp) <
> koen.de_schepper@nokia-bell-labs.com>
> Cc: Pete Heist <pete@heistp.net>; Wesley Eddy <wes@mti-systems.com>; 
> tsvwg@ietf.org
> Subject: Re: [tsvwg] plan for L4S issue #29
> 
> Hi Koen,
> 
> 
> > On Sep 23, 2020, at 17:46, De Schepper, Koen (Nokia - BE/Antwerp) <
> > koen.de_schepper@nokia-bell-labs.com> wrote:
> > 
> > Hi Pete,
> > 
> > I don't think the goal is to wait for a perfect RFC3168 detection
> > solution and we already have an implementation that works well
> > under a wide variety of conditions without additional
> > configuration.
> 
> 	How confident are we that the "wide variety of conditions"
> under which this supposedly works is actually representative for the
> existing internet?
> 
> 
> > This also means that it can still be improved, which I think can be
> > done more appropriately when facing real world issues. The goal of
> > the operational guidelines is to provide a larger set of tools for
> > solving problems with classic ECN bottlenecks, exactly to avoid the
> > need to rely on perfect end-host detection mechanism only.
> 
> 	I grudgingly agree that trying to fix L4S's AQM shortcomings
> by mandating fancy heuristics in the end points seems to be a losing
> proposition. 
> 
> > Related to the "false negatives" and "false positives" naming, I
> > agree that it is very confusing. As the goal is to detect Classic
> > ECN network AQM behavior, maybe better and shorter names could be
> > "false-detect" and "false-non-detect"? 
> 
> 	I do not think that that is helpful; false-positive and
> false-negative have much better well-known definitions that actually
> apply here. So if other nomenclature should be used, let's make it
> clearly different or better yet use already established terms. 
> 
> 	We could switch to signal detection theory terms (which is just
> one way to think about a classifier with just two options), then the
> four combinations of truth and classification for our RFC3168
> detector would be described with the following terms:
> 
> rfc3168 present; rfc3168 detected:     hit (true positive)
> rfc3168 present; rfc3168 not-detected: miss (false negative)
> rfc3168 absent;  rfc3168 detected:     false alarm (false positive)
> rfc3168 absent;  rfc3168 not-detected: correct rejection (true negative)
> 
> these terms are pretty much standard for similar detection problems
> (and also offer a decent approach to assess the effectiveness of the
> detector).
> 
> Let's stick to some already established nomenclature, whether that is
> the true/false positive/negative one or the DST one, please (see e.g.
> https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers).
> 
> The bigger issue I have with this is that this is still just a
> "fudge" to be able to write in the internet draft "transports are
> required to implement RFC3168 detection, as demonstrated in TCP
> Prague" even though everybody here should know that that is not going
> to happen. People will mark whatever with ECT(1) and we will have to
> live with the fall-out. IMHO this is a sufficient reason to put the
> smarts in the AQM, instead of the current approach of assuming that
> end-points will just do this out of the goodness of their hearts.
> 
> That is why I believe that rfc3168 detection in TCP Prague is a red
> herring that distracts from fixing L4S's true issues. Like
> demonstrating that the current implementation actually performs as
> intended over long multi-hop, high-RTT, high-bandwidth links, over
> asymmetric links, and over uni- and bidirectionally saturated links.
> 
> Best Regards
> 	Sebastian
> 
> > Koen.
> > 
> > -----Original Message-----
> > From: Pete Heist <pete@heistp.net>
> > Sent: Wednesday, September 23, 2020 2:19 PM
> > To: De Schepper, Koen (Nokia - BE/Antwerp) 
> > <koen.de_schepper@nokia-bell-labs.com>; Wesley Eddy 
> > <wes@mti-systems.com>
> > Cc: tsvwg@ietf.org
> > Subject: Re: [tsvwg] plan for L4S issue #29
> > 
> > Hi Koen,
> > 
> > I can definitely understand the need to turn bottleneck detection
> > on and off for testing, or for additional knobs during development.
> > 
> > Overall, I suspect that there will be more questions about
> > potential problems if bottleneck detection is not a MUST for
> > implementations in the draft, or not baked into the final
> > implementation in a way that works well under a wide variety of
> > conditions without additional configuration.
> > 
> > On an easier topic, I wonder if we shouldn't change the "false
> > negatives" and "false positives" terminology to something clearer,
> > like "mis-identification of RFC 3168 bottlenecks as L4S", or
> > "mis-identification of L4S bottlenecks as RFC 3168", respectively.
> > I might have opened up a can of worms there in trying to save a few
> > words. :)
> > 
> > Pete
> > 
> > On Tue, 2020-09-15 at 12:43 +0000, De Schepper, Koen (Nokia -
> > BE/Antwerp) wrote:
> > > Hi Wes, Pete,
> > > 
> > > I think to make progress on avoiding both false negatives and
> > > false positives, a good view on the conditions that cause problems
> > > is needed. So we better have the means to detect the real life
> > > impact of Classic-ECN-FIFO deployments. This means we need to be
> > > able to switch off Classic ECN detection (under controlled or even
> > > known conditions).
> > > 
> > > Another point is that it would be useful also to have all control
> > > variables of the existing implementation configurable for everyone
> > > willing to further experiment (without necessarily needing to
> > > change code). As I understood, the right tuning of these can bring
> > > a lot of further improvement opportunities. Also depending on a
> > > typical deployment, these parameters could be tuned for that
> > > specific targeted case.
> > > 
> > > So the resolution of this issue is exactly to facilitate further
> > > improving the detection algorithm (preferably via tuning), and
> > > being able to disable it when conditions are controlled or safe to
> > > avoid these false negatives.
> > > 
> > > I think these are topics that can be covered by the Operational 
> > > Guidelines draft.
> > > 
> > > Regards,
> > > Koen.
> > > 
> > > -----Original Message-----
> > > From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Pete Heist
> > > Sent: Friday, July 31, 2020 8:53 PM
> > > To: Wesley Eddy <wes@mti-systems.com>
> > > Cc: tsvwg@ietf.org
> > > Subject: Re: [tsvwg] plan for L4S issue #29
> > > 
> > > Hi Wesley,
> > > 
> > > > One thing I noticed during testing was that the current
> > > > implementation of TCP Prague in Linux allows disabling bottleneck
> > > > detection through the prague_ecn_fallback kernel module parameter
> > > > (https://github.com/L4STeam/linux/blob/0e7cf8acb318873c3f61084453f8da15b2e398be/net/ipv4/tcp_prague.c,
> > > > line 158). I don’t know if that was left in only for testing.
> > > 
> > > > In section 6.3.3 of l4s-arch, there is discussion around classic
> > > > bottleneck detection. Since I don’t see an explicit MUST that it
> > > > remain enabled (although I do see the text “an L4S sender will
> > > > have to fall back to…”), it’s not completely clear to me if it’s
> > > > actually required to be implemented and permanently enabled in
> > > > all implementations. If it is, I suppose the implementation
> > > > should reflect that also.
> > > 
> > > > While I feel it best that detection identifies both types of
> > > > queues accurately, if bottleneck detection were both an explicit
> > > > MUST in the text *and* not possible to disable in any
> > > > implementation, I think that would make the misidentification of
> > > > L4S queues as classic ECN queues less of a safety concern, since
> > > > it would be impossible to turn off. It would remain an issue for
> > > > the architecture overall though.
> > > 
> > > Hope that helps...
> > > 
> > > Pete
> > > 
> > > > On Jul 31, 2020, at 5:41 PM, Wesley Eddy <wes@mti-systems.com>
> > > > wrote:
> > > > 
> > > > > Hello, ticket #29 for the L4S documents is about classic
> > > > > bottleneck detection misidentifying L4S queues as classic ECN
> > > > > queues.
> > > > 
> > > > https://trac.ietf.org/trac/tsvwg/ticket/29
> > > > 
> > > > > In contrast to other issues, it doesn't seem like this should
> > > > > block a WGLC on the L4S drafts.
> > > > 
> > > > > 	• It is specific to classic bottleneck detection algorithm,
> > > > > 	  which is planned to be worked on in the Prague ICCRG draft.
> > > > > 	• The result is sometimes failing to achieve the best
> > > > > 	  possible L4S behavior, but doesn't seem to be an Internet
> > > > > 	  safety issue.  This resulting in people turning off classic
> > > > > 	  bottleneck detection would be a different issue, and
> > > > > 	  something maybe the operator guidelines would address.
> > > > > 	• It seems like it can be worked on further in the course of
> > > > > 	  L4S experimentation, without negative effects to others.
> > > > > 
> > > > > So, I believe we should track this work in the ICCRG, and close
> > > > > the ticket here.  Please let me know in the next week if I've
> > > > > misunderstood any aspect of this and it should remain open.
> > > > 
> > > >
