[tsvwg] On coupled CC

Sebastian Moeller <moeller0@gmx.de> Sat, 19 August 2023 16:18 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <9DD1F7A9-8087-4898-9618-802FDBDA4607@ifi.uio.no>
Date: Sat, 19 Aug 2023 18:17:45 +0200
Cc: tsvwg <tsvwg@ietf.org>
Message-Id: <C5F60631-7CA6-4BD6-80F7-615D42B8659C@gmx.de>
To: Michael Welzl <michawe@ifi.uio.no>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/a8hU0ett4AdIXzNW6jjq89Wm29g>

Hi Michael,

thanks for your time, see [SM2] below.

> On Aug 19, 2023, at 17:00, Michael Welzl <michawe@ifi.uio.no> wrote:
> 
> Hi !
> 
> I’ll point out that we have converged on what the draft should say, and cut + paste this up here:
> 
>>> It lets endpoints make a conscious decision between load balancing and cc coupling, and, when used, it increases the chance for NATs to keep their state intact. What’s not to like?
>> 
>> 	[SM] Oh, I think we might agree more than it looks, I think the draft could simply recommend to use a "fixed" source port if a coupled CC is used, otherwise making the src port reflect the underlying TCP flow identity should be the recommended action. 
> 
> I’m ok with that - so we could stop here.
> I will draw an ornamental line to make that clear.
> 
> =======================================================================================
> §§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§
> * * * * * * * * * * * * * * * * *                               * * * * * * * * * * * * * * * * * * * * *                               * * * * * * * * * * * * * *
> §§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§
> =======================================================================================
> 
> 
> However, just cutting off the conversation like this seems impolite to me - so I’ll still try to summarize the points below, maybe then we can get a more constructive outcome of the discussion altogether.
> Since we both now have agreed on wording, your arguments against using the same UDP 5-tuple for multiple connections must be related to my original, somewhat stronger proposal, of the same 5-tuple being the default rather than the exception. Well, I don’t insist on this, but let’s take it from there anyway. I asked for reasons *against* using the same 5-tuple, and you wrote:
> 
> >	[SM] I thought that was clear, for a flow queuing scheduler to work best it needs to see individual flows… 
> and:
> >	[SM] Well, if this travels over a fq scheduler the whole tunneled traffic will appear as a single flow and under congestion (and that is what this all about) it will only get a single flow's share of bottleneck capacity... that is a disadvantage for coupled CC traffic, and it also counteracts the actual flow isolation at the scheduler (which, assuming your coupled CC scheduler is decent might not matter that much).
> 
> ….to which I say: are you telling me that, if I open 10 connections and you open 1, I *should* get 10 times more capacity than you?

	[SM] Yes, in essence I am. Not because this is the best absolute strategy, but simply because it is one of the few strategies an intermediate node can take, without needing additional information, that results in minimal starvation of any connection. However, as has been argued before by folks way brighter than me, depending on where in the network we are looking, 5-tuple might not be the ideal granularity. Say, if you are after per-user fairness, 2- or 3-tuple might be better. (IMHO it typically does not matter all that much; the big advantage of 5-tuple is that it allows attaching one AQM instance per flow, which in practice works reasonably well*.)

*) As I stated before, in home networks pure 5-tuple fairness often works quite well, but just as often works less well, especially when applications are in play that use many parallel flows (download managers, torrent clients/servers, ...). What we opted for in cake (or, to be precise, what Jonathan Morton came up with) is to first share capacity by IP address (typically the internal address, so 1-tuple?) and then by flow within each IP address's share. That basically restricts the fall-out from flow-sharding to those IPs running software that does so... And if this is more to your liking, cake also offers to do only src- or dst-host IP fairness or host-pair isolation, so basically something for everybody... see man tc-cake (assuming you "do Linux"; the BSDs do not have cake as far as I know).
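
To make that two-level idea a bit more concrete, here is a minimal Python sketch of the scheduling decision. This is emphatically NOT cake's actual algorithm (which is a DRR++ variant with per-flow AQM living in the kernel); the class and names below are mine and purely illustrative. Capacity is first rotated between internal hosts, and only within a host's turn between that host's 5-tuple flows:

  from collections import deque, defaultdict

  # Toy two-level fair scheduler: packet-by-packet round robin over internal
  # hosts, and within the selected host over that host's 5-tuple flows.
  # (No byte accounting, no AQM - just the sharing structure.)
  class TwoLevelScheduler:
      def __init__(self):
          self.queues = defaultdict(lambda: defaultdict(deque))  # host -> flow -> packets
          self.host_rr = deque()                # round-robin order of active hosts
          self.flow_rr = defaultdict(deque)     # per-host round-robin of active flows

      def enqueue(self, host, flow_key, packet):
          q = self.queues[host][flow_key]
          if not q:                             # flow becomes active
              self.flow_rr[host].append(flow_key)
          if host not in self.host_rr:          # host becomes active
              self.host_rr.append(host)
          q.append(packet)

      def dequeue(self):
          if not self.host_rr:
              return None
          host = self.host_rr[0]; self.host_rr.rotate(-1)              # next host's turn
          flow = self.flow_rr[host][0]; self.flow_rr[host].rotate(-1)  # next flow of that host
          packet = self.queues[host][flow].popleft()
          if not self.queues[host][flow]:       # flow drained -> deactivate it
              self.flow_rr[host].remove(flow)
          if not self.flow_rr[host]:            # host drained -> deactivate it
              self.host_rr.remove(host)
          return packet

In tc-cake terms this corresponds roughly to the dual-srchost/dual-dsthost (and triple-isolate) isolation keywords; the point is simply that flow-sharding by one host no longer eats into another host's share.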


> 
> What a flow queuing scheduler can do, for multiple separate flows originating from the same host, is to protect them from each other.

	[SM2] Yes, and once they are isolated they can get individual AQM instances, which admittedly is more helpful for divergent connections than for parallel ones between the same end-points.


> However, this is a way of making a network element do the Operating System’s job - the host is in a much better position to get this right, and it has more information available.

	[SM2] The network element still needs to arbitrate between hosts though, as game theory tells us that hosts are likely not to play nice with each other if left to their own devices. Also note how part of the problem, flow sharding by BitTorrent, precedes fq schedulers and is a recognized problem even with dumb FIFOs (pet peeve: talking down FQ schedulers because they do not fix all possible issues, even issues that also exist for FIFOs, but I digress).


> “Fixing” things in-between a host’s own flows in the network is really the wrong place to do it,

	[SM2] Yes, if OSs were up to that task I would fully agree; but I am not sure that current OSs are there yet (at least those that I use)...

> as it causes pathologies like the capacity share being a function of the number of open connections.

	[SM2] This is a pathology that already exists with a pure FIFO... as FIFOs tend to give rough per-flow fairness (after all, that is a consequence of the "don't starve flows" requirement inherent in several IETF RFCs).
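
Toy numbers for that pathology under rough per-flow sharing, whether enforced by an FQ scheduler or emerging from N competing TCPs in a FIFO (figures purely illustrative):

  # Host A opens 10 flows, host B opens 1, on a 100 Mbit/s bottleneck.
  # With roughly equal per-flow shares, capacity follows the flow count.
  link_mbps = 100.0
  flows_a, flows_b = 10, 1
  per_flow = link_mbps / (flows_a + flows_b)
  print("host A:", round(flows_a * per_flow, 1), "Mbit/s")   # ~90.9
  print("host B:", round(flows_b * per_flow, 1), "Mbit/s")   # ~9.1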


> That’s what the congestion manager proposal tried to fix so many years ago (http://www.nms.lcs.mit.edu/cm/, and 3124). 

	[SM2] Yes... "tried to fix" translates for me into "did not actually take hold in deployed OSs", am I correct?


>  “Individual flows” at the network layer should ideally be one per host, not one per application.  

	[SM2] Here I respectfully disagree. And I even read your proposal as "one per host pair"...


> So, if anything, lumping more connections together under the same 5-tuple makes flow queuing schedulers work *better*, not worse!

	[SM2] Not really. If we combine, say, X responsive flows and even a single non-responsive (but high enough rate to matter) flow into one aggregate, an isolating scheduler worth its salt needs to bring the whole aggregate down... Sure, you can say that is the failure of the connection manager and hence that the hosts' users can go pound sand, but I think this is rather harsh ;). And that, BTW, is something we already run into if we get too many flows and hence hash-bin/AQM sharing. The only time aggregating more flows will work better is if the involved hosts go out of their way to manage these as aggregates, like in your coupled CC papers.
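
To illustrate with made-up numbers why the whole aggregate gets hurt (a grossly simplified toy model, not a simulation of any real AQM):

  # 100 Mbit/s link, 4 responsive flows plus 1 unresponsive 60 Mbit/s flow.
  # Simplification: responsive flows track whatever congestion signal their
  # queue emits, the unresponsive flow ignores it entirely.
  link = 100.0
  unresponsive = 60.0
  n_responsive = 4

  # Per-5-tuple FQ: each flow has its own queue and gets an equal share;
  # the unresponsive flow's excess is dropped in its own queue only.
  fair_share = link / (n_responsive + 1)
  print("isolated:", fair_share, "Mbit/s per responsive flow,",
        "unresponsive clamped to", min(unresponsive, fair_share))

  # One shared queue/AQM for the whole aggregate: the AQM can only signal
  # the aggregate; the responsive flows back off, the unresponsive one does
  # not, so it keeps its rate and the rest share what is left.
  leftover = link - unresponsive
  print("aggregated:", leftover / n_responsive, "Mbit/s per responsive flow,",
        "unresponsive keeps", unresponsive)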


>  (also, perhaps a minor point: fewer “flows” (tuples) = fewer hash collisions).

	[SM2] True, but that is really not a value in itself, it just makes an implementation less costly... However, there is a limit to the reasonable number of parallel flows and hence hash bins, namely that the round-robin delay of servicing all active flows should stay acceptable (I know that is ill-defined).
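
For what it is worth, my back-of-the-envelope version of that "acceptable round-robin delay" limit (numbers purely illustrative):

  # Rough upper bound on the gap between two service opportunities for one
  # flow: every other active flow gets to send about one quantum first.
  def rr_delay_ms(active_flows, link_mbps, quantum_bytes=1514):
      bits_per_round = active_flows * quantum_bytes * 8
      return bits_per_round / (link_mbps * 1e6) * 1e3

  for flows in (32, 256, 1024):
      print(flows, "flows @ 50 Mbit/s:", round(rr_delay_ms(flows, 50), 1), "ms")
  # 32 -> ~7.8 ms, 256 -> ~62 ms, 1024 -> ~248 ms; somewhere along that curve
  # the per-flow scheduling delay stops being "acceptable".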


> 
> Now, just for completeness, we can discuss the research that you think would be advisable - also because I do appreciate the request for data to prove a point, in general.
> 
> 
> 1) My misunderstanding:
> ===================
> I thought you meant that it would be advisable to investigate single-path coupled cc in the face of multiple network bottlenecks. Investigating that is what I called “nonsense”, but I now understand that this is not what you meant. Sorry!

	[SM2] Let me join in the apology, I should have made this much clearer. I agree that in that situation coupling the CCs will not work terribly well, and if it does, then only by pure chance.


> 
> 
> 2) What you really meant:
> ====================
> You have made it clear that you’re not convinced that traversing different paths necessarily also means traversing different bottlenecks. I agree with that!  I’ll quote your suggestion:
> 
> "start by pretending fate is shared, and this will work more or less well for short flows as well, assuming that this speculative initial fate-sharing was correct or incorrect and whether coupled CC is tolerant to some participating flows not really sharing the same fate. That is why I ask for how important is that fate-sharing for coupled CC to work. Given the above, I am not convinced that load balancers actually are that much of a problem (unless the bottlenecks happen only after pathes split after the load balancers)."
> 
> Right; it’s not a bad idea!  Is this research worthwhile to do **in support of the "TCP-in-UDP uses the same 5-tuple idea”**, however?  I say no, because:
> (note, the asterisks stress the focus on this design idea alone - please bear with me: I do think it’s relevant research in a more general sense, see item 3 below).
> 
> a) such an approach will never be 100% reliable, whereas using the same 5-tuple is (100% here meaning: “yielding the same behavior as seen by a single cc instance today”), at no perceivable disadvantage (see above for why I think you’re wrong about flow queuing schedulers).

	[SM2] Here is my objection in a nutshell: if all TCP-in-UDP implementations default to synthesizing the UDP src port from the SRC/DST IP addresses and "server" port alone, then fq schedulers will have less information to work on, even for hosts not operating a coupled CC (which effectively today are likely all/most hosts). If we only argue about enabling this for hosts with coupled CC, I happily agree (though I am not sure whether both hosts would need to run coupled CC for this).
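
To spell out what I mean by "synthesizing" the src port, a hypothetical sketch (the function, the hashing and the port-range choice are mine, not from any draft): reflect the inner TCP flow identity into the outer UDP source port, unless the sender really runs coupled CC over the bundle, in which case pin one fixed port so the bundle shares a single 5-tuple.

  import hashlib

  EPHEMERAL_MIN, EPHEMERAL_MAX = 49152, 65535

  def outer_udp_src_port(inner_src_ip, inner_dst_ip, inner_src_port,
                         inner_dst_port, coupled_cc, bundle_port=49152):
      # Coupled CC across the bundle: one fixed port -> one outer 5-tuple,
      # so ECMP/load balancers and FQ schedulers see a single flow.
      if coupled_cc:
          return bundle_port
      # Otherwise derive the outer port from the inner flow identity, so
      # FQ schedulers (and ECMP) still see the individual TCP flows.
      key = f"{inner_src_ip}|{inner_dst_ip}|{inner_src_port}|{inner_dst_port}"
      digest = hashlib.sha256(key.encode()).digest()
      span = EPHEMERAL_MAX - EPHEMERAL_MIN + 1
      return EPHEMERAL_MIN + int.from_bytes(digest[:4], "big") % span

  # Two inner TCP flows to the same server map to two different outer ports:
  print(outer_udp_src_port("192.0.2.1", "198.51.100.7", 41000, 443, False))
  print(outer_udp_src_port("192.0.2.1", "198.51.100.7", 41001, 443, False))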

> 
> b) if the point is to convince people, with data, that it would work, then I already have the experience from the RFC 9040 discussions that such data wouldn’t convince e.g. Google (and probably, similarly, it wouldn’t convince other big companies).

	[SM2] Good point. I do not work for, nor am I, a big company, but a mere end-user, and hence come to this with a different perspective. I would argue that most of the internet is directly or indirectly financed by end-users, so I do not feel the end-user perspective is any less important than that of big companies (note I am not saying companies have no stake in this discussion, just that there are multiple parties involved; and yes, big companies surely have more leverage in standardization organizations like the IETF than end-users do).


> We didn’t have that data back then, but it just became clear that, with or without data, they wouldn’t want such a more complex machinery that *might* sometimes fail in their servers (remember, this is a sender-side operation).

	[SM2] I clearly do not want to second-guess what big tech would or would not consider convincing, but I would guess they use some KPI method to assess their performance and accept that occasionally everything goes pear-shaped, so I had hoped that if one could show a statistically significant and robust improvement from doing this, they would just use it. It is not as if, e.g., BBR was without its teething problems and failures, and yet Google seems pretty determined...

> 
> c) such an approach will necessarily have to be more conservative than a design where one can rely on the same 5-tuple. Using my example again, an existing flow could have a cwnd of e.g. 100 packets and when a new flow joins, it could even be assigned e.g. 90 of these 50 in one go,

	[SM2] 90 of these 100, perhaps?


> depending on how priorities are set. That’s a massive leap of the congestion window, which is surely too risky when one cannot be really certain about sharing the same bottleneck.

	[SM2] That is the nature of speculative methods: occasionally one needs to wind back and do some clean-up... but honestly, look at the RTT, and if that is close enough just assume a shared bottleneck, at least for traffic directed at end-users ;)
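
And to make the "close enough RTT" hand-waving slightly less hand-wavy, something along these lines (the 20% tolerance and the greedy grouping are pulled out of thin air):

  # Speculatively group flows whose smoothed RTTs lie within a tolerance of
  # each other; such a group would share one coupled-CC instance, and a flow
  # whose RTT later drifts away would be evicted again.
  def group_by_rtt(srtts_ms, tolerance=0.20):
      """srtts_ms: dict flow_id -> smoothed RTT in ms; returns a list of groups."""
      groups = []
      for flow, rtt in sorted(srtts_ms.items(), key=lambda kv: kv[1]):
          for group in groups:
              if abs(rtt - group["ref_rtt"]) <= tolerance * group["ref_rtt"]:
                  group["flows"].append(flow)
                  break
          else:
              groups.append({"ref_rtt": rtt, "flows": [flow]})
      return groups

  print(group_by_rtt({"a": 21.0, "b": 23.5, "c": 24.0, "d": 80.0}))
  # -> a/b/c end up in one speculative group, d stays on its own.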


> 
> 
> 3) Is research on “single-path coupled cc on traffic that **may** not actually traverse the same bottleneck” worthwhile, in general?
> =================================================================================================
> Simply yes. Mainly, I see interesting possibilities for the outgoing traffic of a household: even when it goes to different destinations, it might all share the same bottleneck, and perhaps even congestion controllers on different hosts could be coupled (since the latency within the household should be very low).  

	[SM2] I am way too cautious for that, plus I see no way my household's park of internet-attached devices (macOS of different vintages, tvOS, Raspbian, Ubuntu, Android, whatever OS runs on a Nintendo Switch) would play ball... Having looked at your papers I am convinced that coupled CC is a great idea, so my hang-up here is more like: get this going for individual OSs/hosts first before attempting to do it across hosts ;).


> That could yield quite large gains.  But now we’re talking about highly experimental research ideas,

	[SM2] Nah, I am just arguing for hosts in typical end-user networks allowing speculative aggressive coupling (everything TCP with roughly the same RTT); I am not talking about cross-host coupling ;)


> quite far from the engineering that TCP-in-UDP is, and this is in fact one of my project proposals that never got funded… so…  

	[SM2] Next time let's try to get me on the review board first; I fully endorse the idea, including the pie-in-the-sky cross-host coupling as a research project ;)

> this is not happening, at least not for me. If someone else wants to do it and is interested in collaborating, get in touch  :-)
> 
> Altogether, many thanks for your interest and the inspiring points you shared; I hope that I managed to clear things up a little.

	[SM2] Indeed, while we are still not fully on the same page, I understand your position (and I hope I made it clear enough that I consider the concept of coupled CC pretty great).


> 
> 
> Cheers,
> Michael
> 
> 
> 
> 
> 
>> On Aug 18, 2023, at 3:43 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>> 
>> 
>>> On Aug 18, 2023, at 11:07, Michael Welzl <michawe@ifi.uio.no> wrote:
>>> 
>>> 
>>> 
>>>> On 18 Aug 2023, at 10:15, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> 
>>>> Hi Michael,
>>>> 
>>>> 
>>>>> On Aug 18, 2023, at 09:59, Michael Welzl <michawe@ifi.uio.no> wrote:
>>>>> 
>>>>> 
>>>>>> On 18 Aug 2023, at 08:24, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Aug 17, 2023, at 21:18, Michael Welzl <michawe@ifi.uio.no> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Aug 17, 2023, at 9:15 PM, Tom Herbert <tom@herbertland.com> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Aug 17, 2023, 12:09 PM Michael Welzl <michawe@ifi.uio.no> wrote:
>>>>>>>> Hi !
>>>>>>>> 
>>>>>>>> About the flow label:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Within the network, the flow label serves the same function as how devices are using the ports in UDP encapsulation- in both cases they are use to mark packet as belonging to the same flow.
>>>>>>>>> 
>>>>>>>>> A "flow" in this context is purposely ill defined, it does not have to correspond one to one to a transport flow. So you're idea of combining TCP flows into a mega flow for purposes of network visibility is a valid use case;
>>>>>>>> 
>>>>>>>> … but it doesn’t work. Some routers do hash over transport ports + flow label + IP addresses (and who knows what else), and so we saw that, between the same host pair, packets using different ports but the same flow label can take different paths.
>>>>>>>> 
>>>>>>>> That's up to the router. Some routers do use flow labels, some packets don't even have port numbers or they're too deep in the packet.
>>>>>>>> 
>>>>>>>> We only need to define how things like flow label and port numbers are set, not how they must be used by intermediate nodes.
>>>>>>> 
>>>>>>> Well yes, but because of that, one just cannot rely on the flow label alone as a way to “pin down” the route. Equal UDP ports for different encapsulated TCP connections *are* needed for this to work. Combined congestion control is about traversing the same bottleneck.
>>>>>> 
>>>>>> 	[SM] Why? The endpoint running the connection manager surely can aggregate different TCP connections into one shared cwin aggregate, no? After all the flows need to start and terminate at the same IP addresses so will be identifiable... as far as I can see same outer tunnel flow ID can be a helpful shortcut, but seems not to be a strict requirement for coupled CC?
>>>>> 
>>>>> The reason is that a single path is only guaranteed (as much as it’s “guaranteed”, and hence assumed by, all single-path congestion control - of course paths can still change, etc.) when packets have a common 5-tuple. Indeed we put multiple connections together into a shared cwnd aggregate, but this only makes sense if they traverse the same bottleneck.
>>>> 
>>>> 	[SM] Well, perfect being the enemy of good (enough),
>>> 
>>> That’s not what this is:
>>> 
>>> 
>>>> so this looks like a field were more research is advisable.
>>> 
>>> No, because it’s just totally wrong. Look, out of 3 packets, one can traverse bottleneck 1, one can traverse bottleneck 2, one can traverse bottleneck 3. A single congestion control instance just doesn’t make any sense for that, and research on nonsense is not advisable.
>> 
>> 	[SM] Yes, such divergent paths seem theoretically possible; my question is how likely this scenario is, given that a considerable number of internet users are mostly limited by their own internet access (so the bottleneck will already be predicted by the NATed IPv4 address and IPv6 prefix)... I might be wrong, but I think what mainly determines a flow's cwin and cwin's dynamics over a congested/limited path is the bottleneck capacity share of that flow and the RTT; the actual endpoint should not really matter all that much. So your quest of avoiding load balancing really just serves as a proxy for these flows sharing a common bottleneck, correct? 
>> A load balancer that happens on either side of the bottleneck should not really matter (unless it affects the RTT, but that should be trivial to check, after all TCPs need to maintain individual RTT estimates, no?).
>> 
>> I respectfully maintain, that more research seems desirable about how coupled CCs operate under "normal" existing-internet conditions.
>> 
>> 
>> 
>>> 
>>> 
>>>> So how does coupled CC work when the assumption "single-path" is not fully correct. Which as you state is never fully guaranteed anyway.
>>> 
>>> And, load balancing is happening plenty when ports are different - surely not hard to dig up measurement papers that show this.
>> 
>> 	[SM] How prevalent load balancing is, is not my question; my question is how much a realistic level of load balancing compromises the utility of coupled CC. This is IMHO a relevant research question that proponents of coupled CC might want to consider. The answer might well be that this is catastrophic and hence a fully deterministic shared outer flow ID is required. I expect, however, that it will take more than that for coupled CC to lose its usefulness (given my limited understanding of what should affect cwin dynamics).
>> 
>> 
>> 
>>> 
>>> Here’s a different angle to this: RFC 9040 is about coupling information across connections too, but not at the same level as coupled cc (instead, only to initialize). We (authors) tried to lobby for more coupling because this is beneficial when it works, and colleagues from Google were strongly opposed to this because of load balancing, and the reality that “connections with different ports take different paths”.
>>> 
>>> So, quite simply, without the same ports, we really can’t do this, period.
>> 
>> 	[SM] Which I, again with all respect, am not convinced of. Unless you already tried and it failed, in which case I will follow the data.
>> 
>>> 
>>> 
>>>>> An alternative to using the same 5-tuple is to measure whether there is a common bottleneck - we have also done work on this. Our latest and most thorough paper on this topic is:
>>>>> David Hayes, Michael Welzl, Simone Ferlin, David Ros, Safiqul Islam: "Online Identification of Groups of Flows Sharing a Network Bottleneck", IEEE/ACM Transactions on Networking 28(5), pp. 2229-2242, Print ISSN: 1063-6692, Online ISSN: 1558-2566 October 2020. DOI 10.1109/TNET.2020.3007346.
>>>>> https://ieeexplore.ieee.org/document/9161279?source=authoralert
>>>>> Preprint: https://folk.universitetetioslo.no/michawe/research/publications/sbd_ton.pdf
>>>>> 
>>>>> … and there’s also RFC 8382.  However: this is not fully reliable, and it requires connections to be relatively long - which is perhaps appropriate for WebRTC (which RFC 8382 was written for), but is not at all the case for most other Internet traffic. With (and only with) a common 5-tuple, single-path coupled cc. can be instantly applied.
>>>> 
>>>> 	Maybe... I think a common 2-tuple (src/dst address) will already be quite deterministic, the question is, is this not already good enough for coupled CC to deliver on its promises? Say, start by assuming fate sharing by 2-tuple and run the "Online Identification of Groups" to confirm whether that initial decision was good enough or not... if not, de-share the congestion control again?
>>> 
>>> That’s exactly the argument that didn’t fly for RFC 9040. See my next statement for a reason:
>>> 
>>> 
>>>> That said, for a fully coupled CC world an FQ scheduler would essentially operate on 3-tuples, something that has been argued as a suitable "flow-granularity" for deeper network nodes... but it really puts the burden on the coupledCC implementation to not screw things up regarding flow mixing and inter-flow scheduling.
>>> 
>>> Coupled cc won't get enough information to ever be able to do the right thing like this for short flows, when these flows (as is the case for the large majority of Internet connections) terminate in slow start, without experiencing congestion. Yet, without coupled cc., they may easily waste more round-trips than would have been needed. It’s really nothing that more research can fix.
>> 
>> 	[SM] As I said, start by pretending fate is shared, and this will work more or less well for short flows as well, depending on whether this speculative initial fate-sharing was correct or incorrect and whether coupled CC is tolerant of some participating flows not really sharing the same fate. That is why I ask how important that fate-sharing is for coupled CC to work. Given the above, I am not convinced that load balancers actually are that much of a problem (unless the bottlenecks happen only after paths split after the load balancers).
>> 
>> 
>>> 
>>> On the other hand, why are you even opposed to using the same 5-tuple?
>> 
>> 	[SM] I thought that was clear, for a flow queuing scheduler to work best it needs to see individual flows... 
>> 
>>> It’s reliable, easy with TCP-in-UDP, and I can’t see any disadvantage with it anyway.
>> 
>> 	[SM] Well, if this travels over a fq scheduler the whole tunneled traffic will appear as a single flow and under congestion (and that is what this all about) it will only get a single flow's share of bottleneck capacity... that is a disadvantage for coupled CC traffic, and it also counteracts the actual flow isolation at the scheduler (which, assuming your coupled CC scheduler is decent might not matter that much).
>> 
>> 
>>> 
>> 
>> 
>>> 
>>> 
>>>> Tangent: for home networks one of cake's recommended isolation-modes is one where first capacity is shared equitably between active internal IP addresses and only then (within each IP's capacity share) based on 5-tuple flows. That mode would give coupled CC meta-flows a more "equitable capacity share" than a pure 5-tuple flow isolation. That however is so far unique to cake and fq_codel does not implement that at all.
>>> 
>>> What a handful of devices do is irrelevant. Even if many devices would do it, it would be irrelevant:
>> 
>> 	[SM] You are missing my point, I think. This tangent shows how coupled CC does not need to suffer unduly even on a fq-scheduler assuming that scheduler does not do strict capacity sharing based on 5-tuple information.
>> 
>> 
>> 
>>> as long as there is a non-negligible number of routers out there that carry out load balancing using the 5-tuple, one cannot use single-path cc. coupling with multiple ports.
>> 
>> 	[SM] Assuming that coupled CC can not tolerate the expected level of load balancing (where the load balancing needs to happen before bottlenecks). I wonder for an on-path bottleneck, does path diversion after the bottleneck really matter? 
>> 
>> 
>> Regards
>> 	Sebastian
>> 
>> 
>>> 
>>> Cheers,
>>> Michael
>