Re: [multipathtcp] Multipath deployment and fate sharing

Iljitsch van Beijnum <iljitsch@muada.com> Mon, 14 December 2009 12:06 UTC

Return-Path: <iljitsch@muada.com>
X-Original-To: multipathtcp@core3.amsl.com
Delivered-To: multipathtcp@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id D80243A681C for <multipathtcp@core3.amsl.com>; Mon, 14 Dec 2009 04:06:13 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level:
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DxDrQQecN7ex for <multipathtcp@core3.amsl.com>; Mon, 14 Dec 2009 04:06:12 -0800 (PST)
Received: from sequoia.muada.com (unknown [IPv6:2001:1af8:2:5::2]) by core3.amsl.com (Postfix) with ESMTP id F38B63A68AB for <multipathtcp@ietf.org>; Mon, 14 Dec 2009 04:05:44 -0800 (PST)
Received: from claw.it.uc3m.es (claw.it.uc3m.es [163.117.139.224]) (authenticated bits=0) by sequoia.muada.com (8.13.3/8.13.3) with ESMTP id nBEC4qfs037731 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Mon, 14 Dec 2009 13:04:52 +0100 (CET) (envelope-from iljitsch@muada.com)
Mime-Version: 1.0 (Apple Message framework v1077)
Content-Type: text/plain; charset="us-ascii"
From: Iljitsch van Beijnum <iljitsch@muada.com>
In-Reply-To: <4B0D52D5.9090403@isi.edu>
Date: Mon, 14 Dec 2009 13:05:16 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <58B2A4B2-F276-4E96-87FB-F8273FD78ED0@muada.com>
References: <E9EE0C1A-C9D3-4EBC-97FD-E1B1628CD2E7@iki.fi> <3c3e3fca0911090542h54e45784qbdbf1f338a4c3e90@mail.gmail.com> <E03D46E1-51EA-4273-A8A7-4F37F88B2E92@iki.fi> <20091123.092214.140617438.nishida@sfc.wide.ad.jp> <48DA092B-F3BC-432E-A199-B265DDED39DA@iki.fi> <3c3e3fca0911240434p4d95ec7an34615ae218faa4f@mail.gmail.com> <C622F375-EE67-46AE-AC28-6617CFEF6D12@lurchi.franken.de> <BF345F63074F8040B58C00A186FCA57F1C65FB29B9@NALASEXMB04.na.qualcomm.com> <4B0C607A.6060503@isi.edu> <BF345F63074F8040B58C00A186FCA57F1C65FB29C0@NALASEXMB04.na.qualcomm.com> <4B0C6590.9010000@isi.edu> <BF345F63074F8040B58C00A186FCA57F1C65FB29C6@NALASEXMB04.na.qualcomm.com> <4B0C6C13.2060103@isi.edu> <BF345F63074F8040B58C00A186FCA57F1C65FB29CB@NALASEXMB04.na.qualcomm.com> <4B0C6FBE.40003@isi.edu> <alpine.DEB.2.00.0911250211520.23603@ayourtch-lnx.stdio.be> <alpine.DEB.2.00.0911250428590.23603@ayourtch-lnx.stdio.be> <BE063B0C-BB34-4C75-A57E-1BAB0BE6780A@cs.ucl.ac.uk> <4B0D52D5.9090403@isi.edu>
To: Joe Touch <touch@ISI.EDU>
X-Mailer: Apple Mail (2.1077)
Cc: "multipathtcp@ietf.org List" <multipathtcp@ietf.org>
Subject: Re: [multipathtcp] Multipath deployment and fate sharing
X-BeenThere: multipathtcp@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Multi-path extensions for TCP <multipathtcp.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/multipathtcp>, <mailto:multipathtcp-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/multipathtcp>
List-Post: <mailto:multipathtcp@ietf.org>
List-Help: <mailto:multipathtcp-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/multipathtcp>, <mailto:multipathtcp-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 14 Dec 2009 12:06:14 -0000

[I haven't been keeping up with this list, apologies if I make points that were discussed previously.]

On 25 nov 2009, at 16:52, Joe Touch wrote:

>> Coming back to fate sharing of multipath connection and initial
>> subflow: I would assume fate sharing will be a sysctl flag on the host,

That can only set the default; we need a socket option to control it per application.
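To make the "default plus per-application override" point concrete, here is a minimal pure-Python sketch; the sysctl name and option name are invented for illustration and nothing here is a real kernel or sockets API:

```python
# Hypothetical sketch: a sysctl-style global default that a per-socket
# option can override. All names are invented for illustration.
SYSCTL = {"net.mptcp.fate_sharing": 0}   # global default: don't care

class Sock:
    def __init__(self):
        self.opts = {}                   # per-socket option store

    def setsockopt(self, name, value):
        self.opts[name] = value          # per-application override

    def fate_sharing(self):
        # The socket option wins; otherwise fall back to the sysctl default.
        return self.opts.get("fate_sharing",
                             SYSCTL["net.mptcp.fate_sharing"])

s = Sock()
print(s.fate_sharing())          # 0: inherits the sysctl default
s.setsockopt("fate_sharing", 1)
print(s.fate_sharing())          # 1: this application opted in
```

The sysctl alone can only move the default for every application at once; the per-socket lookup is what lets one application deviate from it.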

>> a) connection doesn't go down with first subflow
>> b) connection dies when local interface dies
>> c) connection dies when subflow times out (this implies b)

> These are interesting configuration parameters from an implementation
> view, but from a protocol view, if we're talking about TCP, there are
> only two cases:

> 	a) connection goes down when the whole set of subflows goes down

Yes, you still need to have a _session_ timeout.

Somewhat related issue: do we explicitly close a session, or implicitly by closing all subflows? The latter leads to long delays when a subflow is down.

> 	b) connection goes down when the first subflow goes down

> What makes a subflow go down shouldn't be defined separately;

We need to avoid sending data over broken subflows, though.
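The two protocol-level cases Joe lists can be modeled as a tiny state machine; this is only an illustrative sketch of the two policies, not real MPTCP code:

```python
# Hypothetical model of the two fate-sharing cases:
#   (a) connection goes down when the whole set of subflows goes down
#   (b) connection goes down when the first subflow goes down
class MptcpSession:
    def __init__(self, fate_share_first=False):
        # fate_share_first=True models case (b); False models case (a).
        self.fate_share_first = fate_share_first
        self.subflows = []          # True = up, False = down
        self.alive = True

    def add_subflow(self):
        self.subflows.append(True)

    def subflow_down(self, idx):
        self.subflows[idx] = False
        if self.fate_share_first and idx == 0:
            self.alive = False      # case (b): losing the first subflow is fatal
        elif not any(self.subflows):
            self.alive = False      # case (a): the whole set is gone

# Case (a): losing the first subflow alone is not fatal.
s = MptcpSession(fate_share_first=False)
s.add_subflow(); s.add_subflow()
s.subflow_down(0)
print(s.alive)   # True: the second subflow still carries the session
s.subflow_down(1)
print(s.alive)   # False: whole set is down
```

With `fate_share_first=True` the same sequence would kill the session at the first `subflow_down(0)`.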

> However, anything that would cause the original subflow to fail even
> when tunneled over other interfaces still, IMO, needs to cause the whole
> set to go down. The key example is loss of the IP address associated
> with the original flow - either when the interface goes down, or when
> it's lost via failed DHCP renewals, or when it's otherwise explicitly
> reconfigured.


There are several issues here. But before we delve in: why didn't we have this discussion in shim6, where the exact same problem is present? And what about mobility?

I agree that if a host loses an address, it should no longer send packets that use that address in some way. So in the multiaddress MPTCP case, that means a subflow that uses that address will have to be terminated when the address is removed. Note, though, that some systems remove addresses when the interface goes down. That in itself is no reason to release the address: if a DHCP server or RA indicates that an address may be used for X amount of time, you get to keep it for X amount of time, even if connectivity goes away, temporarily or definitively, at some point.

But then the question is: what happens to other subflows?

My answer: "if a tree falls in the forest, and all the people in the forest are wearing headphones at maximum volume, did the tree fall?"

The problem with a socket option to indicate care is that it's going to take a long time for all apps to adopt it, so we still need a default behavior. I don't think it makes sense to make the default behavior "care", because then you have to go out of your way to not care, which by definition people don't do. Especially since, for portable applications, it requires thinking about how to set a socket option that will be unknown on some systems.

However, we can deduce that applications care if they bind to a specific address, or look up the address tied to the remote end of an incoming session. If neither happens, subflows can continue without trouble.
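The "did the app bind to a specific address?" half of this heuristic is visible through the ordinary sockets API. A minimal Python illustration (the MPTCP policy it would drive is of course hypothetical, only the bind-inspection part is real):

```python
import socket

def binds_to_specific_address(sock):
    # Heuristic from the text: an app bound to a wildcard address has
    # expressed no interest in any particular address, so its subflows
    # could survive an address change; an app bound to a specific
    # address implicitly "cares".
    addr = sock.getsockname()[0]
    return addr not in ("0.0.0.0", "::")

agile = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
agile.bind(("", 0))                 # wildcard bind: address-agile
pinned = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
pinned.bind(("127.0.0.1", 0))       # specific bind: assume the app cares

print(binds_to_specific_address(agile))    # False
print(binds_to_specific_address(pinned))   # True
```

The getpeername/lookup half of the heuristic is harder to observe this way, which is also why (as noted below) it can happen after the handshake.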

By the way: we need to communicate the address loss event to the other side, so that the other side can tear down the session in the case where we lose our address and don't care, but the other side does care. Note also that although binding to a listening address happens before the three-way handshake, looking up the remote address may happen afterwards.

Another class of applications is those that care about the addresses but are not bothered by change. For instance, I may log that I'm talking to 192.0.2.1, but then when that address goes down and communication continues using 10.0.0.1, I simply log the address addition/change and all is well.

So I think this is what we need:

- mechanism to indicate that a session is bound to an address
- mechanism to indicate that an address has been removed
- mechanism to indicate "no address agility" (default = agility)
- mechanism to indicate "no MPTCP" (default = MPTCP)
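The interaction of these four mechanisms can be sketched as a small model; every name here is hypothetical and purely illustrative, and the addresses are the ones from the logging example above:

```python
# Pure-Python sketch of the four mechanisms listed above; nothing here
# is a real kernel or sockets API.
class Session:
    def __init__(self, address):
        self.addresses = {address}      # addresses with live subflows
        self.bound_to = None            # mechanism 1: bound address, if any
        self.address_agility = True     # mechanism 3: default = agility
        self.use_mptcp = True           # mechanism 4: default = MPTCP
        self.alive = True

    def bind_address(self, address):
        # Mechanism 1: the session is tied to this one address.
        self.bound_to = address

    def add_address(self, address):
        # New subflow addresses are only accepted with MPTCP and agility on.
        if self.use_mptcp and self.address_agility:
            self.addresses.add(address)

    def remove_address(self, address):
        # Mechanism 2: an address is gone; the session dies if it was
        # bound to that address, or if no addresses remain.
        self.addresses.discard(address)
        if self.bound_to == address or not self.addresses:
            self.alive = False

s = Session("192.0.2.1")
s.add_address("10.0.0.1")
s.remove_address("192.0.2.1")
print(s.alive)   # True: agile session survives on 10.0.0.1

t = Session("192.0.2.1")
t.bind_address("192.0.2.1")
t.add_address("10.0.0.1")
t.remove_address("192.0.2.1")
print(t.alive)   # False: session was bound to the lost address
```

The defaults encode the proposed direction: agility and MPTCP unless an application explicitly opts out.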

Note that "no address agility" combined with MPTCP means that you can still have additional subflows using additional addresses, but only as long as the primary addresses remain reachable. I think this is similar to MIPv6, where you need to return to the HoA at certain intervals.

I don't think it's appropriate to explicitly track interface status in TCP. If the packets get there, you're up; if they don't, you eventually time out, and then presumably you're down.