Re: [multipathtcp] Fwd: New Version Notification for draft-paasch-mptcp-loadbalancer-00.txt

Olivier Bonaventure <Olivier.Bonaventure@uclouvain.be> Wed, 09 September 2015 20:39 UTC

References: <20150908035557.GB73228@Chimay.local>
To: Christoph Paasch <cpaasch@apple.com>, MultiPath WG <multipathtcp@ietf.org>
From: Olivier Bonaventure <Olivier.Bonaventure@uclouvain.be>
Message-ID: <55F098DB.8020201@uclouvain.be>
Date: Wed, 09 Sep 2015 22:38:51 +0200
In-Reply-To: <20150908035557.GB73228@Chimay.local>
Archived-At: <http://mailarchive.ietf.org/arch/msg/multipathtcp/pqP36sdCoUoCVKkTAmvnqOnlYVc>
Precedence: list
Reply-To: Olivier.Bonaventure@uclouvain.be

Christoph,
>
> we have submitted a draft discussing the issues when using MPTCP behind
> layer-4 loadbalancers.

Thanks for bringing this to the list. This is indeed an important 
problem. As I might be late during tomorrow's telco, let me try to 
answer by email.

A first point that we might want to clarify in the document is how ECMP 
works with Multipath TCP and regular TCP.

With regular TCP, some load balancers behave in a purely stateless 
manner and compute a hash over the five-tuple in each packet to 
distribute the load among a set of servers.

ECMP --- server1
   +----- server2


With single-homed and single-stack clients, this ECMP solution ensures 
that all the connections sent by one host (i.e. one IP address) will be 
received by the same server if the hashing is independent of the source 
port. Otherwise, connections from the same host might hit different servers.

If the clients are either multihomed (e.g. smartphones) or dual-stack, 
then even hash-based load balancing that ignores the source port does 
not ensure that all the connections from one client will hit the same 
server, since the client may use several source addresses.

With Multipath TCP, a single connection may use different subflows and 
each subflow has its own five-tuple. When ECMP is used, then subflows 
from the same Multipath TCP connection might hit different servers.
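
To make the stateless behaviour concrete, here is a minimal sketch of a 
hash-based selection in Python. The names (FiveTuple, ecmp_pick, SERVERS) 
and the use of SHA-256 are assumptions made for illustration only; real 
ECMP implementations use their own hash functions.

```python
import hashlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    proto: int
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


SERVERS = ["server1", "server2"]  # hypothetical server pool


def ecmp_pick(ft: FiveTuple, include_src_port: bool = True) -> str:
    """Stateless choice: hash selected header fields, reduce modulo the pool size."""
    fields = [ft.proto, ft.src_ip, ft.dst_ip, ft.dst_port]
    if include_src_port:
        fields.append(ft.src_port)
    digest = hashlib.sha256(repr(fields).encode()).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


# Two subflows of one Multipath TCP connection: same client, different
# source addresses (e.g. WiFi and cellular). Nothing ties the two results
# together, so they may or may not select the same server.
wifi = FiveTuple(6, "192.0.2.10", 40000, "203.0.113.1", 80)
cellular = FiveTuple(6, "198.51.100.7", 51000, "203.0.113.1", 80)
print(ecmp_pick(wifi), ecmp_pick(cellular))
```

The hash sees only header fields, so it has no way to learn that the two 
five-tuples belong to the same Multipath TCP connection.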

Let us now consider stateful load balancers (LB). For simplicity, we 
assume that the load balancer maintains a table that maps each TCP 
connection, identified by its five-tuple, to a given server. The table 
might also be used to perform NAT. The LB can use any technique to build 
this table (random, RR, load-based solutions, ...). As an example, let 
us consider a load balancer that resides in front of N servers as shown 
below:


LB ----- server1
  +------- server2
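
As a rough sketch of such a table, assuming round-robin assignment (one 
of the techniques listed above) and hypothetical names (StatefulLB, 
on_syn, on_segment), the LB state could look like this in Python:

```python
import itertools


class StatefulLB:
    """Maps each TCP connection (five-tuple) to a server, chosen round-robin."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)
        self.table = {}  # five-tuple -> server

    def on_syn(self, five_tuple):
        # Create the entry on the first SYN; a retransmitted SYN reuses it.
        if five_tuple not in self.table:
            self.table[five_tuple] = next(self._next_server)
        return self.table[five_tuple]

    def on_segment(self, five_tuple):
        # Later segments follow the existing entry (None if unknown).
        return self.table.get(five_tuple)


lb = StatefulLB(["server1", "server2"])
conn = ("tcp", "192.0.2.10", 40000, "203.0.113.1", 80)
lb.on_syn(conn)
print(lb.on_segment(conn))
```

A table keyed only by the five-tuple treats every Multipath TCP subflow 
as an unrelated connection.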


When the LB receives a SYN with the MP_CAPABLE option from a client, it 
must insert an entry for the corresponding five-tuple inside its table. 
For regular TCP, this would be sufficient. For Multipath TCP, this is 
not enough since it needs to ensure that all the subflows that belong to 
this connection will be forwarded to the same server. In your draft, you 
propose to change the way tokens are computed and discuss different ways 
to perform the handshake. Let me suggest a different approach that might 
be worth comparing.

To support Multipath TCP efficiently in this setup, we need to ensure 
the following properties for the tokens that are used (on the server side):
  - tokens are unique (at time t, server1 and server2 cannot use the 
same token for different connections)
  - tokens are known by both the LB and the server that is handling 
the connection

Instead of computing the keys (and thus indirectly the token) on the 
servers, why not select the keys for the MP_CAPABLE option in the 
load balancer?

Let us consider the following scenario:

      client                       LB              server

  SYN + MP_CAPABLE (K_A)
----------------------->
(client sends normal key)


                        SYN + MP_CAPABLE (K_A,K_B)
                        ----------------------->
                      (LB computes key for server)


                                    SYN/ACK + MP_CAPABLE (K_B)
                                  <---------------------------
                                   (normal MPTCP segment)


ACK + MP_CAPABLE_ACK (K_A, K_B)
-------------------------------->
(no change to this ack)


Since the LB computes the key that will be used by the server, it can 
ensure that the token derived from the key is unique and also 
remember it so that future subflows are pinned to the right server. There 
are probably different techniques that the LB could use to generate 
these keys on behalf of the server, but I don't think that they need to 
be specified in an IETF document. It's up to each implementor to find an 
efficient way to do that.
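
Purely as an illustration of one possible technique, and not a proposal 
for the draft, the sketch below draws random 64-bit keys and retries 
until the token is unused. The token computation follows RFC 6824 (the 
most significant 32 bits of the SHA-1 hash of the key); the class and 
method names are hypothetical.

```python
import hashlib
import secrets


def token_from_key(key: bytes) -> int:
    """RFC 6824 token: most significant 32 bits of SHA-1(key)."""
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


class KeyAssigningLB:
    """Hypothetical LB that picks the server's key and remembers its token."""

    def __init__(self):
        self.token_to_server = {}  # token -> server handling the connection

    def assign_key(self, server: str) -> bytes:
        # Draw random 64-bit keys until the derived token is not in use,
        # so that no two connections behind this LB share a token.
        while True:
            key = secrets.token_bytes(8)
            token = token_from_key(key)
            if token not in self.token_to_server:
                self.token_to_server[token] = server
                return key

    def server_for_join(self, token: int):
        # The SYN of an additional subflow carries the server's token in
        # MP_JOIN, which lets the LB pin the subflow to the right server.
        return self.token_to_server.get(token)


lb = KeyAssigningLB()
k_b = lb.assign_key("server1")
print(lb.server_for_join(token_from_key(k_b)))
```

An entry would also have to be removed when the connection ends; that 
part is omitted here.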

One issue that needs to be discussed is how the LB can convey the key to 
the server in the SYN packet. Several techniques are possible and we 
should probably select one to ensure interoperability between an LB and 
servers:
- add K_B to the MP_CAPABLE option, but this is likely to exceed the 
40 bytes available for TCP options once the other SYN options are included
- add K_B in the payload of the SYN segment (but we'll need to check the 
impact on TFO)
- encapsulate the SYN segment inside something, like a UDP segment with 
a TLV format that has enough space for a SYN segment and the additional key
- invent a hack to encode the 64-bit key inside some "unused" fields of 
the IP/TCP headers (say IP id, TCP ack number, IPv6 flow label, TCP 
timestamp echo, ...)
- ...

At this stage, I'd opt for encapsulation, which seems cleaner to me. 
It also forces the servers to be aware of the presence of the LB, 
which seems a useful feature to me.
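
To give an idea of what such an encapsulation could look like, here is a 
small Python sketch of a TLV payload that would be carried in a UDP 
datagram from the LB to the server. The layout (1-byte type, 2-byte 
length, value) and the type codes are invented for this example; nothing 
of the sort is specified anywhere.

```python
import struct

TLV_SERVER_KEY = 1   # hypothetical type code: 64-bit key chosen by the LB
TLV_SYN_SEGMENT = 2  # hypothetical type code: the original SYN, unmodified


def pack_tlv(tlv_type: int, value: bytes) -> bytes:
    # 1-byte type, 2-byte length in network byte order, then the value.
    return struct.pack("!BH", tlv_type, len(value)) + value


def encapsulate(syn_segment: bytes, server_key: bytes) -> bytes:
    """Build the payload sent by the LB to the server."""
    if len(server_key) != 8:
        raise ValueError("MPTCP keys are 64 bits long")
    return pack_tlv(TLV_SERVER_KEY, server_key) + pack_tlv(TLV_SYN_SEGMENT, syn_segment)


def decapsulate(payload: bytes) -> dict:
    """Parse the payload on the server into {type: value}."""
    fields = {}
    offset = 0
    while offset < len(payload):
        if offset + 3 > len(payload):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BH", payload, offset)
        offset += 3
        value = payload[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        fields[tlv_type] = value
        offset += length
    return fields


syn = bytes(20)  # placeholder bytes standing in for a real SYN segment
fields = decapsulate(encapsulate(syn, bytes(range(8))))
print(fields[TLV_SERVER_KEY].hex())
```

The server would use the key from the first TLV when it builds its 
SYN/ACK, and process the inner SYN as if it had arrived directly.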

Comments are, of course, welcome.


Olivier
