Re: [iccrg] [Stackevo-discuss] New Version Notification for draft-welzl-irtf-iccrg-tcp-in-udp-00.txt

Michael Welzl <michawe@ifi.uio.no> Fri, 25 March 2016 08:54 UTC

From: Michael Welzl <michawe@ifi.uio.no>
In-Reply-To: <56F427D9.9030208@isi.edu>
Date: Fri, 25 Mar 2016 09:53:57 +0100
Message-Id: <9826517E-9237-4EBB-AC0D-54388EBD4E5B@ifi.uio.no>
References: <A741874C-0E2C-4905-9FD3-D29B4B94A9C0@ifi.uio.no> <56F3212B.5020408@isi.edu> <20F3E6FF-DED4-46BD-BFD5-C76F8A6A8D40@ifi.uio.no> <56F32C47.6080707@isi.edu> <271375F3-2B9D-4C61-9C6E-468E6423A1A4@ifi.uio.no> <56F427D9.9030208@isi.edu>
To: Joe Touch <touch@isi.edu>
Archived-At: <http://mailarchive.ietf.org/arch/msg/iccrg/CaCVDumEw_DvH-R-55N9QRAo8fU>
Cc: iccrg@irtf.org
Subject: Re: [iccrg] [Stackevo-discuss] New Version Notification for draft-welzl-irtf-iccrg-tcp-in-udp-00.txt

So now I’m answering the congestion control bits here. I’ll reorder things a little to answer points that belong together more effectively; I hope you don’t mind.

Side note: while this is about TCP using multiple paths, we have so far left MPTCP out of the discussion, and that probably makes sense. MPTCP pretty clearly reflects “one congestion controller per path” thinking, I would say, which makes it come at the problem from a different angle than what we’re discussing here: a single end-to-end congestion controller facing a network of multiple paths, not a single path.


> On 24. mar. 2016, at 18.46, Joe Touch <touch@isi.edu> wrote:
> 
> 
> 
> On 3/24/2016 1:20 AM, Michael Welzl wrote:
>> 
>>> On 24. mar. 2016, at 00.52, Joe Touch <touch@isi.edu> wrote:
> ...
> 
>>>>> I can't imagine what combined congestion control of TCP would mean
>>>>> inside the network - we had a hard enough time defining it for either
>>>>> TCB sharing or the Congestion Manager just at the endpoints.
>>>> 
>>>> I don’t get this either; it sounds like a misunderstanding?
>>>> 
>>>> So the encapsulation here is just meant to force packets to take the 
>>>> same path - else the Congestion Manager may be doing something
>>>> pretty wrong.
>>> 
>>> The CM is supposed to react to E2E congestion, which is the sum of all
>>> effects of the different paths that packets can take.
>> 
>> I’ll have to disagree. The CM is one congestion control instance, 
>> and E2E TCP-like congestion control is based on the assumption of a
>> single path. Things go wrong if this assumption is broken, e.g. when
>> packets from a single connection are sent in different directions
>> inside the network. This is not only due to possible reordering, but
>> also because e.g. a congestion event on path 1 would make the
>> controller reduce its rate, even if path 2 still has 3 times more
>> capacity available.
> 
> TCP has always handled reordering.

In one way or another, yes…  I (and I’m sure, you) know of various attempts to deal with it. First, raising the duplicate-ACK threshold above the standard 3, which is a trade-off between tolerating more reordering and making the loss response slower. Second, trying to survive even when you got it wrong - i.e., detecting spurious loss events and then having to undo the response. Yes, good TCP implementations can cope with these things to some degree, but repeatedly seeing and undoing spurious loss events isn’t exactly what I’d call TCP’s normal operating conditions. It’s also implementation dependent - this Ph.D. thesis:
http://folk.uio.no/paalh/students/DominikKaspar-phd.pdf
finds that Linux TCP handles reordering better than a fully standards-conforming implementation would.
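
To make the threshold trade-off concrete, here is a little Python sketch - mine, purely illustrative, not from any real stack - of dupACK counting; DUPTHRESH = 3 is the RFC 5681 default, and "stretching" it is exactly what buys reordering tolerance at the cost of a slower loss response:

    # Illustrative sketch of dupACK-based loss detection; not a real TCP stack.
    # A larger DUPTHRESH tolerates more reordering but delays fast retransmit.
    DUPTHRESH = 3  # RFC 5681 default; raising it = "stretching the threshold"

    def on_ack(state, ack_no):
        if ack_no == state["last_ack"]:
            state["dupacks"] += 1
            if state["dupacks"] == DUPTHRESH:
                return "fast_retransmit"  # spurious if the segment was only reordered
        else:
            # New data ACKed; if we retransmitted spuriously, a good stack
            # now has to detect that (e.g. via DSACK) and undo the cwnd cut.
            state["last_ack"] = ack_no
            state["dupacks"] = 0
        return "no_loss"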

Bottom line: I agree TCP handles it, but I wouldn’t say these are its “normal” (preferred) operating conditions. Anyway, I mainly mentioned reordering because it’s the more prominent problem one gets when using multiple paths. The other problem I mentioned above seems more severe to me. See below:


> TCP congestion control treats the network as a system - if the net as a
> whole drops packets, it backs off. If the net as a whole does not, it
> increases. That's consistent with TCP CC. You don't need to assume a
> single path.
(..)
> TCP CC absolutely allows us to assume different paths - it reacts to the
> *network* as a transfer mechanism, i.e., where the "path" is through the
> blob we call a network. The only real problem is when that blob's
> properties vary, but that's always a problem for every feedback system.

So then how do you solve the 2-path problem I explained above? Let me make it clearer with a simple, very static example (the only dynamics here come from a single TCP connection):
Consider an empty network. Path 1 has capacity X. Path 2 has capacity 3*X. Some device in the network round-robin schedules the packets of a single TCP connection across paths 1 and 2.
Congestion will first appear on path 1, once the connection’s total rate reaches 2*X (round-robin puts half the packets on each path). TCP will react, halving its cwnd. Most of the capacity of path 2 will always remain unused that way.
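
A toy AIMD loop (numbers assumed, purely illustrative) shows the effect: the flow oscillates around a total rate of 2*X while the network could carry 4*X, so path 2 never sees more than about a third of its capacity:

    # Toy AIMD simulation of one flow round-robin striped over two paths.
    # Capacities are assumptions for illustration: path 1 = X, path 2 = 3*X.
    X = 10.0        # capacity unit, e.g. packets per RTT
    c1, c2 = X, 3 * X
    rate = 1.0      # total sending rate; round-robin puts rate/2 on each path

    for rtt in range(200):
        if rate / 2 > min(c1, c2):  # the slower path congests first
            rate /= 2               # one loss event halves the *whole* flow
        else:
            rate += 1.0             # additive increase

    print(rate, "vs. aggregate capacity", c1 + c2)
    # Oscillates around 2*X = 20: path 2 never carries more than about X
    # of its 3*X, which is the underutilization described above.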

How do you solve this with e2e congestion control?


>> (*) Researchy side note here: I’m all for fixing this, but this
>> would require putting congestion controls inside the network,
> 
> No, it really doesn't. It just means we assume that the net is the
> "path", not that we know or have control over a specific set of links.

The only solution I can see to the problem above is not to schedule traffic as I described it: obviously, the scheduler should send more packets on path 2 than on path 1. Doing this correctly requires knowing the capacities of these links. In a more dynamic network, it requires knowing where the bottleneck currently is and how much capacity is available per link. That is a control loop. If you put such a control loop in the network, you may want to give it a different name than I do, but this is what I meant when I said “this would require putting congestion controls inside the network”.
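
For illustration only, here is a hypothetical capacity-weighted scheduler (names and numbers are mine, not from the draft); the point is that its weights *are* the state such an in-network control loop would have to measure and keep current:

    import random

    # Hypothetical capacity-weighted scheduler. To avoid the round-robin
    # problem, the node needs per-path capacity estimates - i.e. its own
    # measurement/control loop - which is the "congestion control inside
    # the network" referred to above.
    def pick_path(capacity_estimates):
        paths = list(capacity_estimates)
        weights = [capacity_estimates[p] for p in paths]
        return random.choices(paths, weights=weights, k=1)[0]

    # With estimates {X, 3*X}, about 75% of packets take path 2, matching
    # the 1:3 capacity ratio; stale estimates reintroduce the problem.
    sent = [pick_path({"path1": 10.0, "path2": 30.0}) for _ in range(10000)]
    print(sent.count("path2") / len(sent))   # ~0.75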


> NOTE: threads along these lines have been popping up on V6OPS recently
> too; it'd be useful to check there. I'm not the only one who thinks we
> should stop trying to act like we can pin paths merely by picking IP
> addresses, flow labels, or doing DPI - all we ought to assume is that
> everything within a flow is "treated the same way", which does not mean
> that every packet is treated IDENTICALLY.

As per my example above, this would mean that you’d need something quite similar to congestion control inside the network, or the available capacity on all these paths must be equal all the way to the receiver; otherwise you’ll end up underutilizing your network.

Cheers,
Michael