Re: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)

Sebastian Moeller <moeller0@gmx.de> Thu, 03 December 2020 21:37 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <35560310-023f-93c5-0a3d-bd3d92447bcc@bobbriscoe.net>
Date: Thu, 3 Dec 2020 22:36:19 +0100
Cc: Ingemar Johansson S <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org>, tsvwg IETF list <tsvwg@ietf.org>, tsvwg-chairs@ietf.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <67E95ACD-2C38-4852-A86C-A330C85C4547@gmx.de>
References: <MN2PR19MB4045A76BC832A078250E436483E00@MN2PR19MB4045.namprd19.prod.outlook.com> <HE1PR0701MB2876A45ED62F1174A2462FF3C2FF0@HE1PR0701MB2876.eurprd07.prod.outlook.com> <56178FE4-E6EA-4736-B77F-8E71915A171B@gmx.de> <0763351c-3ba0-2205-59eb-89a1aa74d303@bobbriscoe.net> <CC0517BE-2DFC-4425-AA0A-0E5AC4873942@gmx.de> <35560310-023f-93c5-0a3d-bd3d92447bcc@bobbriscoe.net>
To: Bob Briscoe <ietf@bobbriscoe.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/RMstBlCmgLWS20OEwOIwprSRWZA>
Subject: Re: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)

Bob,

see [SM2] for comments below.


> On Dec 3, 2020, at 17:17, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> Sebastian,
> 
> On 03/12/2020 13:40, Sebastian Moeller wrote:
>> Bob,
>> 
>> more below in-line, prefixed [SM].
>> 
>>> On Dec 3, 2020, at 13:19, Bob Briscoe <in@bobbriscoe.net> wrote:
>>> 
>>> Sebastian, Jonathan,
>>> 
>>> Encrypted tunnels not revealing the component flows in the payload is a correct property of layering, which has been a principle that guides protocol design since long before the Internet protocols adopted it. I would ask you to imagine standing up (virtually) in an IETF plenary (or any networking forum) and blame layering (and/or layered encryption) for FQ not being able to isolate component flows. You would be flamed to a crisp.
>> 	[SM] Sure, but by that same token passing of ECN codepoints between inner and outer is also a layering violation, if we did not do this, we would have no issue. But still "moving" ECN codepoints around at e- and de-tunneling is "endorsed" by the IETF. The same IETF now should make sure that these recommendations do not lead to negative outcomes, like some flows taking over most of the tunnel, by say consciously redefining the response to those ECN signals by transport protocols.
> 
> [BB] I don't know which school you learned networking at,

	[SM2] @chairs, @Bob, really? There is nothing technical in that question and nothing advancing our discussion or civility. 

> but passing information into a layer at one end or out of it at the other is how layers work. ECN was designed to be fully in line with layering

	[SM2] Is it? In an encrypted tunnel it can easily be argued that passing information between inner and outer layers is a layering violation. And as far as I can tell that argument has been made and recognized. Sure, there can be value in passing ECN codepoints between inner and outer layers, but it certainly is not strict maintenance of layers. I am sure a similar discussion can be had about flow labels in IPv6-in-IPv6 tunnels: copying them from inner to outer on encapsulation can have value, but needs to be weighed against the risk of establishing an information exfiltration channel. But that is really a whole different discussion than the question of why L4S deals badly with rfc3168 signalling. 


> 
> The IETF has no obligation to make functions work that violate layering (like FQ). Layer violations have to be considered as an optimization that works when the higher layer is accessible and doesn't when it isn't. Same with other similar mechanisms like firewalls. Remember FQ-CoDel is not standards track.

	[SM2] FQ is a red herring here, the issue is L4S's failure to equitably share over single queue rfc3168 bottlenecks. I have stated that explicitly twice in the e-mail you replied to. As an aside, last time I looked L4S' internet drafts also aim for experimental track, like e.g. FQ_CoDel...


> 
> 
>> 
>>> Certainly, a flow (such as L4S) that needs to be isolated from certain other flows could be seen as the cause of the need for FQ. But it is not. It is the cause of the need for isolation, not specifically /flow/ isolation. It is FQ that chooses to rely on identifiers that violate layering in order to provide that isolation.
>> 	[SM] Well, as long as all flows behave TCP-friendly (as they arguably should) that isolation-failure is pretty benign, no?
> 
> [BB] FQ's failure to isolate flows that end up in a tunnel together overrides any flows that are trying to be TCP-friendly and forces them not to be TCP-friendly.

	[SM2] How so? Inside each of stochastic FQ's bins, flows can and will share the same congestion signal probability, just as they would in a lower queue count AQM (say 2 queues). Whether a flow is TCP-friendly or not is not a function of the bottleneck AQM, but of the flow's own congestion response. 
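
	As a rough illustration of that last point, here is a toy model (not one of the tests cited in this thread, and not an implementation of any real congestion control). Two flows share one queue and receive the same CE signal in the same rounds; one halves its window per marked round, the other only backs off by a small fixed amount. The capacity, the floor of 2 packets and the 2-packet back-off are all assumptions chosen for illustration.

```python
def simulate(rounds=2000, capacity=100.0):
    """Toy model: two flows see identical CE marks but respond differently.

    classic  -- halves its window on a marked round (rfc3168-style MD)
    scalable -- hypothetical small fixed back-off of 2 packets per marked round
    Both add one packet per unmarked round. All numbers are illustrative.
    """
    classic = scalable = 10.0
    for _ in range(rounds):
        if classic + scalable > capacity:
            # Single shared queue: both flows are marked in the same round.
            classic = max(classic * 0.5, 2.0)
            scalable = max(scalable - 2.0, 2.0)
        else:
            classic += 1.0
            scalable += 1.0
    return classic, scalable


if __name__ == "__main__":
    classic, scalable = simulate()
    print(f"classic={classic:.1f} scalable={scalable:.1f}")
```

	In this toy model the flow with the gentler response ends up with most of the capacity, although the queue signals both flows identically. The imbalance comes from the responses, not from the AQM.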

	The issue here is, to repeat for your education, that L4S's designed CC response does not react to rfc3168 CE marks in a TCP-friendly fashion, nothing more and nothing less; FQ or no FQ on the bottleneck. Except that for non-tunneled flows FQ can actually recover some level of equitable sharing in spite of L4S' inappropriate CC response to rfc3168 CEs. But if a tunnel hides the information that FQ needs to isolate flows, that behavior regresses to the normal behavior for non-FQ rfc3168 AQMs, sure.
	If I look at it, FQ is not a complete solution for all network issues (and nobody claimed it was), but it helps the normal situation quite a lot. Especially if the FQ-AQM is close to the ISP to end-user transition, where tunneled traffic is going to be rare (or can be treated specially by the end-user).
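
	To make the tunnel case concrete, here is a minimal sketch of why flow isolation disappears inside a tunnel. The hash below is a stand-in and not fq_codel's actual flow hash; the bucket count, addresses and ports are made up for illustration.

```python
import hashlib

N_BUCKETS = 1024  # assumed number of flow queues, for illustration only


def bucket(five_tuple, n_buckets=N_BUCKETS):
    """Map a (src, dst, proto, sport, dport) tuple to a queue index.

    Hypothetical stand-in for a flow hash; any deterministic hash of the
    header fields visible to the AQM shows the same effect.
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % n_buckets


def encapsulate(inner, outer=("203.0.113.5", "203.0.113.9", 17, 51820, 51820)):
    """An encrypted tunnel exposes only its own outer header to the AQM."""
    return outer


# Two inner flows with made-up addresses and ports.
prague_flow = ("10.0.0.2", "192.0.2.10", 6, 40000, 443)
cubic_flow = ("10.0.0.2", "198.51.100.7", 6, 40001, 443)

# Untunneled, the flows are hashed on their own headers and usually land in
# different queues (hash collisions aside). Tunneled, both are hashed on the
# identical outer header and therefore always share one queue.
same_queue_in_tunnel = bucket(encapsulate(prague_flow)) == bucket(encapsulate(cubic_flow))

if __name__ == "__main__":
    print("share a queue inside the tunnel:", same_queue_in_tunnel)
```

	Once both flows sit in the same queue, the AQM behaves towards them like any single-queue rfc3168 AQM, and the outcome depends only on how each flow responds to CE.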


> You don't seem to have grokked that there are always two sides to this argument. I put the other side and recognized your side.
> Then I gave the principle (layering) that sits as the judge between the two - determining the side that the IETF and the networking industry as a whole will always take.

	[SM2] That seems true, and pretty much orthogonal to the discussion we are having. 

> 
> You can't keep playing the game of dressing up this argument from only one viewpoint, as if that somehow proves the other viewpoint isn't valid.

	[SM2] Which view-point again? As far as I am concerned, you are still trying to derail the discussion of L4S's inherent lack of safety into a discussion about anything else. 



> 
> 
>> The issue here is that L4S proposes a new way of ECN-signalling that has exciting new TCP-unfriendly failure modes, like a shared tunnel with rfc3168 flows over a rfc3168 bottleneck. IMHO, if somebody uses a tunnel, they deserve to be treated like a single flow, what is not okay, is to have some mis-designed protocols take over the tunnel for themselves.
>> 	IMHO, looking at essential IP and TCP/UDP header fields for flow definition is the only sane way of doing it (we can haggle which to include), any field that can be gamed, will be gamed, so flows need to be defined as little game-able as possible, and if that means looking into protocol headers so be it.
>> 
>>> The root cause of the problem can be objectively determined. FQ doesn't correctly schedule flows within encrypted tunnels, whether L4S is present or not. Therefore L4S cannot be the root cause of this problem.
>> 	[SM] No, arguably FQ is doing what it is supposed to: if something presents itself to the network as a single flow it gets treated as such, and if the source wanted different treatment, it knows what to do....
> 
> [BB] An application endpoint is not in control of whether its flow is aggregated with others in a tunnel.

	[SM2] "Something" is the entity that encapsulated flows into a tunnel, obviously. But this is a bit theoretical, as none of the choke points that see large-scale tunneled traffic will see any deployment of AQMs anytime soon anyway. But humor me and show how Juniper, Cisco and co. are close to releasing DualQ implementations for their big iron routers... The reason why tunneling came up again is simply that arguably the rfc3168 AQM with the widest deployment is fq_codel, and team L4S (not team FQ, whoever that is) keeps claiming that FQ will make L4S' re-definition of the CE response "safe"; but in the light of tunneling that claim simply is not true. Also, this is not really a reason to go off on an unprovoked rant about FQ and layer violations.

> 
> The whole of networking works by encapsulation - typically multiple layers of it. In networking, if you want network equipment to do something with a flow ID, you put another transport header inside the outer (typically UDP encapsulation). For instance, see how GUE ensures equal cost multipath routing works.

	[SM2] Again, that seems to be orthogonal to the L4S issue. I still have a hard time believing that an ISP savvy enough for that level of traffic engineering would at the same time set up an AQM with known deficits for such encapsulated traffic, and I would expect these tunnels to end at transitions between ASs, and not start at the ISP to customer edge. I will not enter a theoretical discussion about the pros and cons of tunneling in general, but restrict myself to the part where L4S demonstrates un-safe behavior. Existing tunnels do exactly that: they exist already, but L4S is still in the pre-deployment phase and can (and should) be fixed before roll-out is attempted.



> 
> 
>> But again it is L4S that turns this into a problem, because in the condition we currently discuss, it fails to follow the principle of TCP-fairness... So you basically claim, if you unfairly tackle someone while the referee isn't watching that constitutes no foul. Not sure I agree with that position.
>> 
>> 	IMHO, the root cause of this issue is still L4S' hare-brained design of thinking that re-defining the meaning of CE is a safe proposition (see how I picked up your Leporidae-theme from below?). The tunnel with the rfc3168 bottleneck is only the example that demonstrates that just saying anybody who cares should use an FQ-AQM is not a solution, but honestly, that is a position team L4S introduced to the discussion as a means to keep declaring the fall-out from re-defining CE "not a big deal".
>> Again, to be clear, the problem is that L4S is designed to be actively hostile towards/over real rfc3168 bottlenecks, and the ideal fix would be to change L4S here. There have been several proposals for how to achieve that, e.g. by Jake Holland, that all have been essentially ignored by team L4S (sure, such a change would require some re-engineering, so the argument that engineering would be required is IMHO not a real blocker for Jake's proposal).
> 
> [BB] I worked with Jake to write-up all the pros and cons, which we put in the ecn-l4s-id draft after Jake approved it. You will see one pro and many, many cons.

	[SM2] You might have realized, that I do not agree with all/most of your judgement calls. These lists of yours tend to tell more about your "political" stance than about hard technical limits...


> 
>> 
>>> It is instructive to see how the world looks to people who are stuck so far down the FQ rabbit-hole
>>> that the walls of the FQ burrow and the other rabbits down there give the comforting feeling that FQ is the whole world. Except for that small chink of light in the sky when looking up out of the rabbit-hole, which is all that is visible of the rest of the universe of networking.
>> 	[SM] Does this actually contain actionable content you want an answer to? If so please clarify, otherwise I will ignore that as an unwelcome attempt at derailing the discussion.
> 
> [BB] When you produce any actionable content, please let us know.

	[SM2] Classy, I would say, classy. But also pretty much expected; by now I have seen your debating style often enough to realize that this is a sign that you realize that your arguments do not hold much water. The next step will likely be that you fall silent... and abandon this sub-thread, fine with me.


> 
> 
>> 	I understand, that you dislike FQ (you have not been shy or subtle about that position), but the issue at hand is really an issue about L4S's proposed schemes inherent lack of safety.
> 
> [BB] You will have seen that I have now put FQ on equal footing with DualQ in the latest L4S drafts, which I also explained in my status update talk in tsvwg at IETF-109.  Although FQ is contrary to the e2e argument and violates layering, I always see networking as a tussle between bell heads and net heads, and try to accommodate both where I can.

	[SM2] Oh, FQ seems to be your personal nemesis, but leave me out of this, please. That is not my battle; out of experience I can tell you that FQ-AQMs at the end-customer ISP edge work pretty well. But I also already tried to express the issue that neither FQ-AQMs nor L4S AQMs actually offer the targeted unfairness that end-users seem to desire once more approximate (as already claimed by dualpi2) or stricter (as in stochastic fair queueing) flow fairness does not solve the issues. In that case typically the important traffic that merits un-equal treatment does not have simple characteristics like shortest RTT but really depends on the user's judgement. 

	No, I am concerned that L4S as designed and implemented does not meet its promises and frankly does not work well enough to consider deployment for anything but the over-tested short RTT, low-hop count fast-track for which it was designed. I am also concerned that team L4S tries to paper over catastrophic failure in the declared reference AQM by adding an increasing list of requirements onto L4S compatible protocols, without actually enforcing the required behavior at run time. And as TCP Prague demonstrates, the best effort at coming up with an L4S compatible transport protocol falls noticeably short of what can be considered a transport protocol for general purpose internet transport. This last point has been driven home prominently by Koen's proposal to have TCP Prague fall back to a CUBIC type response for RTTs > 80 ms... after fudging TCP Prague's CC response dynamics to make up for DualPI2's equitable sharing break-down at short RTTs. 

	How about you go and demonstrate safe, reliable and robust functionality of L4S AQM and transport protocol over the existing internet, before wasting more time on telling me about how networking should have been taught at my school, okay? Humor me and demonstrate that my concerns and objections are unfounded, and L4S is truly the best invention in networking since sliced bread.

Regards
	Sebastian


> 
> 
> 
> Bob
> 
>> 
>> Best Regards
>> 	Sebastian
>> 
>> 
>> 
>>> 
>>> Bob
>>> 
>>> On 20/11/2020 07:56, Sebastian Moeller wrote:
>>>> Ingemar,
>>>> 
>>>> encrypted tunnels not revealing the individual component flows in their payload is a feature of encryption and not a failure of flow isolation... Arguably an encrypted tunnel that disguises as a single flow should not allow propagation of ECN codepoints between the inner and outer layer at all, but then that is in the hand of the tunnel operator, not the AQM node.
>>>> 	It is quite interesting though, how tunneling is brought as an argument against the SCE proposal (only CE is guaranteed to be passed between layers*) and the very moment L4S shows issues with tunneling this is interpreted as someone else's problem. This constant application of double standards alone should be reason to reject the L4S drafts....
>>>> 
>>>> 
>>>> Best Regards
>>>> 	Sebastian
>>>> 
>>>> 
>>>> *) one of the original sins in regards to ECN and tunnels seems to have been not simply requiring complete, unconditional copying of inner ECN bits to outer ECN bits on en-capsulation and outer to inner on de-capsulation and letting the end-points deal with any accidental fall-out (to be complete, a tunnel should either not do any ECN propagation in any direction, or the one described). For rfc3168, I can fully understand why that route was not chosen, but years later for rfc6040 that decision is much harder to rationalize.
>>>> 
>>>> 
>>>> On 20 November 2020 07:04:56 CET, Ingemar Johansson S <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org> wrote:
>>>> Hi David, Pete
>>>>  I am trying to understand what this scenario is about, and somehow I see it more as a flow isolation problem that makes FQ non-functional rather than an L4S problem?
>>>>  There is of course a possibility that VPNs do not implement RFC6040 properly. I guess for software VPNs it is only an update cycle away, more hard/firmware VPNs can of course be a different story but I guess that, similar to the discussion on home gateways and ECN a few months ago, they can be upgradeable too  ?
>>>>  /Ingemar
>>>>    From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Black, David
>>>> Sent: den 19 november 2020 22:20
>>>> To: Pete Heist <pete@heistp.net>
>>>> Cc: tsvwg IETF list <tsvwg@ietf.org>
>>>> Subject: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)
>>>>  [posting as an individual]
>>>>  
>>>>> I'll leave it to the WG to come up with examples of what types of tunnels and traffic scenarios could lead to this,
>>>>> but one example is a user who has a privacy VPN on their PC, and fq_codel on their home gateway.
>>>>> Let's say one flow connects to an L4S capable server, and another flow to a non-L4S, conventional server.
>>>>> The L4S flow will dominate the non-L4S one (whether it's ECN capable or not), probably causing some level
>>>>> of poor service, perhaps for a video stream, download, or whatever.
>>>>  It’s more than home gateways – there will be increasing use of VPNs with public or shared WiFi to block snooping by other WiFi devices and/or the access point infrastructure.  In that case, the WiFi access point and nodes between the access point and the VPN gateway can only look at the outer IP header applied by the VPN.  If the VPN preserves packet boundaries and complies with RFC 6040, then ECT(1) in the inner header will show up in the outer header, but not all VPNs do both of those.
>>>>  Thanks, --David
>>>>  From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Pete Heist
>>>> Sent: Thursday, November 19, 2020 3:03 PM
>>>> To: Gorry Fairhurst
>>>> Cc: tsvwg IETF list
>>>> Subject: Re: [tsvwg] Reasons for WGLC/RFC asap
>>>>  [EXTERNAL EMAIL]
>>>> 
>>>> On Thu, 2020-11-19 at 16:34 +0000, Gorry Fairhurst wrote:
>>>> On 19/11/2020 16:22, Pete Heist wrote:
>>>>  Hi Koen,
>>>>  Rather than thinking of this as advantages and disadvantages to waiting, I see it as an engineering process. It was decided earlier this year that the L4S proposal has enough support to continue, so we're on that path now. Part of that decision, as I understood it, also recognized that there are valid safety concerns around compatibility with existing AQMs, and some solution needs to be devised.
>>>>  RFC3168 bottleneck detection was added to TCP Prague, which appears to be difficult to do reliably when there is jitter or cross-flow traffic, and it has since been disabled in the reference implementation. The l4s-ops draft was started, but isn't complete yet and may need WG adoption as part of a LC. We can then decide how effective the proposed mitigations are against the risks and prevalence.
>>>>  To start a WGLC now would circumvent that earlier recognition that a safety case needs to be made. Meanwhile, since testing showed that tunnels through RFC3168 FQ AQMs are a straightforward path to unsafe flow interaction, along with other issues relative to the goals, it doesn't seem like the engineering process is done just yet.
>>>>  By the way, I liked your data - and it helped me a lot to look at this, thanks very much for doing this.
>>>> 
>>>> I'm glad, as I think we're at our best when we're doing engineering and producing data. I wish it were easier to do!
>>>> It would help me if you clarify what you mean by  "unsafe" - to me "safety" relates to traffic unresponsive to drop, as in CBR traffic, etc. I've not understood how CE-marked traffic can erode safety, but maybe I missed something?
>>>> 
>>>> Sure, so the existing RFC3168 CE signal in use on the Internet today indicates an MD (multiplicative decrease), whereas the redefined CE signal in L4S indicates an AD (additive decrease). Two congestion controls responding to CE in a different way, or one that responds to CE with an AD and one that responds only to drop (i.e. all standard congestion controls that advertise Not-ECT), will not interact safely in the same RFC3168 signaling queue. We're probably on the same page here already, but I'll refer to section 5 of RFC8257.
>>>>  That is one of the reasons why ECT(1) is used in L4S to place L4S flows in the L queue- to keep them separate from conventional flows in the C queue. As long as flows have advertised their capability correctly, that works.
>>>>  However, existing RFC3168 queues do not have knowledge of L4S, therefore will not know that ECT(1) means that traffic needs to be segregated and signaled in a different way. They will signal a Prague flow, which sets ECT(1), with CE, expecting the flow to respond with an MD, rather than AD. Meanwhile they'll signal an RFC3168 or non-ECN flow with either CE or drop, and in either case the flow will respond with an MD, causing conventional flows to yield to Prague flows to varying degrees depending on the AQM in use.
>>>>  Here's an example of CUBIC and Prague when they end up in the same fq_codel queue:
>>>> http://sce.dnsmgr.net/results/l4s-2020-11-11T120000-final/l4s-s6-rfc3168-1q/l4s-s6-rfc3168-1q/l4s-s6-rfc3168-1q-ns-prague-vs-cubic-fq_codel-50Mbit-20ms_tcp_delivery_with_rtt.svg
>>>>  Here's a more extreme example of Reno and Prague sharing a single PIE queue with ECN enabled (less common):
>>>> http://sce.dnsmgr.net/results/l4s-2020-11-11T120000-final/l4s-s6-rfc3168-1q/l4s-s6-rfc3168-1q/l4s-s6-rfc3168-1q-ns-prague-vs-reno-pie-50Mbit-20ms_tcp_delivery_with_rtt.svg
>>>>  In the example with PIE, Reno appears to be driven at or close to minimum cwnd. In the fq_codel example, the steady state throughput of Prague:CUBIC is around 19:1. We've seen a range in the Codel case from around 12:1 to 20:1. In my opinion, we could use the word "unsafe" here in both cases.
>>>> I'm not sure why "tunnels" have crept in here. There have always been side-effects with classification (and hence scheduling), but I don't see new issues relating to "tunnels" with ECN.
>>>> 
>>>> Tunnels are relevant because they provide an easy practical path to the unsafe flow interaction described above. The widely used fq_codel qdisc has ECN enabled by default. Fortunately, because it has flow-fair queueing, Prague flows and conventional flows are usually placed in a separate queue (hash collisions aside), causing Prague to only affect itself with additional delay (TCP RTT). However, a tunnel's encapsulated packets all share the same fq_codel queue because they all have the same 5-tuple, so there is unsafe interaction between the tunnel's flows. Here we use Wireguard through fq_codel:
>>>>  http://sce.dnsmgr.net/results/l4s-2020-11-11T120000-final/l4s-s5-tunnel/l4s-s5-tunnel-phys-wireguard-prague-vs-cubic-fq_codel-50Mbit-20ms_tcp_delivery_with_rtt.svg
>>>>  I'll leave it to the WG to come up with examples of what types of tunnels and traffic scenarios could lead to this, but one example is a user who has a privacy VPN on their PC, and fq_codel on their home gateway. Let's say one flow connects to an L4S capable server, and another flow to a non-L4S, conventional server. The L4S flow will dominate the non-L4S one (whether it's ECN capable or not), probably causing some level of poor service, perhaps for a video stream, download, or whatever.
>>>> I'm not commenting on when the Chairs think a WGLC will provide useful information, we'll say that in due course.
>>>> 
>>>> Ok, I trust that we'll engage enough disinterested people into congestion control who will add their input.
>>>>  Thanks Gorry for looking this over. :)
>>>> Best wishes,
>>>> 
>>>> Gorry
>>>> 
>>>>  Regards,
>>>> Pete
>>>>  On Wed, 2020-11-18 at 10:31 +0000, De Schepper, Koen (Nokia - BE/Antwerp) wrote:
>>>>  Hi all,
>>>>  To continue on the discussions in the meeting, a recap and some extra thoughts. Did I miss some arguments?
>>>>  Benefits to go for WGLC/RFC asap:
>>>> 	• There is NOW a big need for solutions that can support Low Latency for new Interactive applications
>>>> 	• The big L4S benefits were a good reason to justify the extra network effort to finally implement ECN in general and AQMs in network equipment
>>>> 	• Timing is optimal now: implementations in NW equipment are coming and deployment can start now
>>>> 	• Deployment of L4S support will include deployment of Classic ECN too! So even for the skeptics among us, that consider that the experiment can fail due to CCs not performing to expectations, we will fall back to having Classic ECN support
>>>> 	• Current drafts are about the network part, and are ready and stable for a very long time now.
>>>> 	• Only dependency to CCs in the drafts are the mandatory Prague requirements (only required input/review from future CC developers: are they feasible for you)
>>>> 	• We have a good baseline for a CC (upstreaming to Linux is blocked by the non-RFC status)
>>>> 	• Larger scale (outside the lab) experiments are blocked by non-RFCs status
>>>> 	• It will create the required traction within the CC community to come up with improvements (if needed at all for the applications that would benefit from it; applications that don’t benefit from it yet, can/will not use it)
>>>> 	• NW operators have benefits now (classic ECN and good AQMs) and in the future can offer their customers better Low Latency experience for the popular interactive applications
>>>> 	• When more L4S CCs are developed, the real independent evaluation of those can start
>>>>  Disadvantages to wait for WGLC/RFC:
>>>> 	• We’ll get stuck in an analysis paralysis (aren’t we already?)
>>>> 	• Trust in L4S will vanish
>>>> 	• No signs that we can expect more traction in CC development; trust and expectations of continuous delays will not attract people working on it, as there will be plenty of time before deployments are materializing
>>>> 	• Product development of L4S will stall and die due to uncertainty on if L4S will finally materialize
>>>> 	• Product development of Classic ECN will stall and die due to uncertainty on how L4S will finally materialize
>>>>  What are the advantages to wait? Do they overcome these disadvantages?
>>>>  Regards,
>>>> Koen.
>>>>      
>>> -- 
>>> ________________________________________________________________
>>> Bob Briscoe
>>> http://bobbriscoe.net/
>>> 
>>>                PRIVILEGED AND CONFIDENTIAL
>>> 
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/