Re: [multipathtcp] A question related to MPTCP control overhead

Sayee Kompalli Chakravartula <> Wed, 12 April 2017 13:03 UTC

From: Sayee Kompalli Chakravartula <>
To: "Sargent, Matthew T. (GRC-LCA0)[Peerless Technologies]" <>

Hi Matt,
I can think of two ways in which a TCP connection can go down: gracefully or ungracefully. Assume that there are two subflows, SF1 and SF2, and that SF2 is to be brought down. If SF2 goes down ungracefully, we have no clean way of informing the receiver that the sender is disabling DSS, which leaves us with two choices: (i) continue to use DSS on SF1, or (ii) define a new Option Subtype, or use a reserved bit of the DSS Option, through which the sender can explicitly inform the receiver that it is disabling DSS. I believe that the latter choice is the cleaner way of disabling the DSS option. When SF2 goes down ungracefully, we of course need to retransmit the outstanding bytes on SF1, similar to what the MPTCP specification requires.

The less clean way of doing things is fragile and convoluted, so I will not discuss it here.

If the connection is closed using the RST bit: after sending the RST, the sender immediately closes the socket, so the MPTCP sender has no way of knowing whether the receiver actually received the RST. In fact, the RST segment may be lost in transit. I would put this case in the category of ungraceful shutdown.

If the connection is closed using the FIN bit: Let us represent a sequence number (SN) as x_{i,j}, where the subscript i denotes the type of sequence space (SSN or DSN) and the subscript j gives the subflow ID. Let x_{DSN, SF2} be the DSN of the last byte transmitted before the FIN is sent on SF2, and let x_{DSN, SF1} be the last DSN mapped to SF1 before the sender disables the DSS Option. We need to consider two possibilities: x_{DSN, SF1} < x_{DSN, SF2} and x_{DSN, SF1} >= x_{DSN, SF2}.

First consider the case x_{DSN, SF1} < x_{DSN, SF2}. When a data byte that is not covered by a DSN arrives on SF1, its subflow sequence number tells us where to place it relative to the data byte whose DSN is x_{DSN, SF1}. However, because data bytes must eventually be ordered by DSN before they are released to the application layer, we still need to decide where to place this byte at the connection level relative to x_{DSN, SF2}. This is where the protocol runs into ambiguity: should the byte be placed before or after x_{DSN, SF2}? This will lead to protocol deadlock.

Next, consider the case x_{DSN, SF1} >= x_{DSN, SF2}. Here the ambiguity of the previous case does not arise. The data byte received on SF1 has to be placed to the right of both x_{DSN, SF1} and x_{DSN, SF2}, and its actual placement on SF1 is determined by its SSN.

The above analysis helps us decide how and when the DSS Option can be disabled. After sending the FIN on SF2, the sender continues sending the DSS option on SF1 until the largest DSN used on SF1 is at least as large as x_{DSN, SF2} and the state variable NUM_SUBFLOW has transitioned to 1, and then disables the DSS Option. On the receiver side, the receiver expects to keep receiving data bytes covered by a data sequence mapping on SF1 until the largest DSN mapped to SF1 is at least as large as x_{DSN, SF2}. After that, starting with the first received byte that is not covered by a DSN, the receiver will understand that the sender has disabled data sequence mapping on SF1.
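To make the rule concrete, here is a minimal Python sketch of the sender and receiver checks described above. It is only an illustration under stated assumptions: the class and function names are hypothetical, DSNs are treated as plain integers (sequence-number wraparound is ignored), and a segment is reduced to "the last DSN of its mapping, or None if it carries no mapping".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SenderState:
    num_subflow: int   # NUM_SUBFLOW state variable
    max_dsn_sf1: int   # largest DSN mapped to SF1 so far
    x_dsn_sf2: int     # DSN of the last byte sent on SF2 before its FIN


def sender_may_disable_dss(s: SenderState) -> bool:
    # Both conditions from the text: SF2 is gone, and SF1 has caught up
    # with (or passed) the last DSN that was used on SF2.
    return s.num_subflow == 1 and s.max_dsn_sf1 >= s.x_dsn_sf2


@dataclass
class ReceiverState:
    max_dsn_sf1: int   # largest DSN seen in a mapping on SF1
    x_dsn_sf2: int     # last DSN received on SF2 before its FIN
    dss_disabled: bool = False


def receiver_on_segment(r: ReceiverState, dsn_end: Optional[int]) -> str:
    """Classify a segment arriving on SF1 (hypothetical helper)."""
    if dsn_end is not None:
        # Segment still carries a mapping: track the largest DSN on SF1.
        r.max_dsn_sf1 = max(r.max_dsn_sf1, dsn_end)
        return "mapped"
    if r.max_dsn_sf1 >= r.x_dsn_sf2:
        # Unmapped data after SF1 caught up: the sender has disabled DSS,
        # and the SSN alone determines placement.
        r.dss_disabled = True
        return "dss-disabled"
    # Unmapped data while x_{DSN,SF1} < x_{DSN,SF2}: the ambiguous case.
    return "ambiguous"
```

The "ambiguous" result corresponds to the x_{DSN, SF1} < x_{DSN, SF2} case, which the sender rule is meant to prevent from ever occurring.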

At this point, I believe that defining a new Option Subtype or using an unused bit in the DSS option would be worthwhile.

Except in some specific scenarios, like a data center, I don't think a device can know in advance whether a second interface will become available during the connection. So if the user goes with TCP, he is at a disadvantage because he cannot utilize additional interfaces when they become available. If he opens an MPTCP connection hoping that new interfaces may become available in the future, then until that happens the connection incurs control overhead due to the redundant DSS Option.

As a first-cut solution, I think we should allow an MPTCP connection not to use the DSS option until it opens the second subflow. To keep things simple, we may say that once the DSS Option is enabled it will be enforced until the MPTCP connection is closed.
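This first-cut policy amounts to a one-way latch. A minimal sketch in Python, assuming a hypothetical DssPolicy class that is told the current subflow count whenever it changes:

```python
class DssPolicy:
    """One-way latch: DSS stays off until a second subflow exists,
    then stays on for the rest of the connection (hypothetical sketch)."""

    def __init__(self) -> None:
        self.dss_enabled = False

    def on_subflow_count(self, num_subflow: int) -> bool:
        # Enable DSS the first time the connection has two or more subflows.
        if num_subflow >= 2:
            self.dss_enabled = True
        # Never reset: dropping back to one subflow keeps DSS enabled.
        return self.dss_enabled
```

With this policy a connection that never opens a second subflow pays no DSS overhead, and the harder question of disabling DSS again is avoided entirely.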


-----Original Message-----
From: Sargent, Matthew T. (GRC-LCA0)[Peerless Technologies] [] 
Sent: Tuesday, April 11, 2017 10:03 PM
To: Sayee Kompalli Chakravartula
Subject: Re: [multipathtcp] A question related to MPTCP control overhead

Hi Sayee,

I am not quite sure I understand how you are keeping sequence numbers straight for the case where you drop from 2 subflows to 1 subflow and want to stop using the DSS. Could you provide some details? What do you do when there is outstanding data on the subflow that goes away?


> On Apr 11, 2017, at 9:33 AM, Sayee Kompalli Chakravartula <> wrote:
> Dear Olivier,
> I read through the paper "Are TCP Extensions Middlebox-proof?", and especially focused on Section 3.3 on Multipath TCP and middleboxes. My comments are as follows:
> 1. Regarding middleboxes removing options from non-SYN segments:
> Currently, MPTCP handles this issue by attaching an MPTCP option to every segment in the first window's worth of data. Appropriate behaviour is defined for the sender and receiver to fall back to TCP in case a middlebox removes the MPTCP option.
> In my proposal, instead of appending the MPTCP option to segments in the first window's worth of data, we defer appending it until just before we establish the second subflow, as described below.
> We will define an additional state variable DSS_RCV at both the sender and the receiver. At the beginning, this state variable is initialized to false at both ends. Now assume that the state variables are NUM_SUBFLOW = 1 and DSS_RCV = false at the sender side, and that the sender has sent the DSS option for the first time. When the receiver gets a DSS option while its own state variables are NUM_SUBFLOW = 1 and DSS_RCV = false, it will understand that the sender has decided to use the DSS option, so it will set DSS_RCV = true and ACK the segment that carried the DSS option with its own DSS option. When the sender receives an ACK containing the DSS option, it will set its state variable DSS_RCV = true. If instead a middlebox removes the DSS option included by the sender, the receiver will acknowledge the segment without a DSS option. Because the ACK segment does not carry a DSS option, the sender will fall back.
> 2. To cope with sequence number randomizers:
> The same approach works with my proposal too, with one small observation. Whenever the state variable NUM_SUBFLOW transitions from two to one, all the DSS-related information is discarded, i.e., the space of DSNs and the mapping are removed from the protocol control block. But when the state variable NUM_SUBFLOW again transitions from one to two, the space of DSNs is created again to establish the mapping between SSNs and DSNs, with one small difference: unlike when the MPTCP connection is established, for subflow 1 we will not map the first DSN to subflow sequence number zero but to the running subflow SN at that time.
> 3. Regarding middlebox performing segment splitting or coalescing:
> This middlebox behaviour poses as much risk to my approach as it does to the existing MPTCP specification. Because the existing mitigation does not depend on whether we continue to support DSNs on an MPTCP connection consisting of just one subflow, the same solution will continue to work whether or not we support DSS for such a connection.
> 4. ALG modifying the data stream by adding or removing bytes from the payload:
> This issue is relevant when an MPTCP option is included in a segment, and the same protocol behaviour will work in my case too.
> I haven't prepared a detailed write-up to share with IETF folks, which I will do once I am fairly confident that I have worked out all corner cases.
> Sayee
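
The DSS_RCV exchange described in item 1 of the quoted message can be sketched as a small Python simulation. This is an illustration only, under stated assumptions: the Endpoint class, the dictionary-based segment, and the helper names are all hypothetical, and the middlebox is modelled as stripping the option on the sender-to-receiver path only, which is the single case the message discusses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Endpoint:
    dss_rcv: bool = False      # DSS_RCV state variable, initially false
    fallen_back: bool = False  # True once the endpoint falls back


def middlebox(segment: dict, strips_options: bool) -> dict:
    # A stripping middlebox forwards the segment without its DSS option.
    if strips_options:
        return {**segment, "dss": False}
    return segment


def receiver_handle(rx: Endpoint, segment: dict) -> dict:
    # First DSS option seen: record it and echo a DSS option in the ACK.
    if segment["dss"] and not rx.dss_rcv:
        rx.dss_rcv = True
    return {"ack": True, "dss": rx.dss_rcv}


def sender_handle_ack(tx: Endpoint, ack: dict) -> None:
    if ack["dss"]:
        tx.dss_rcv = True       # receiver confirmed it saw the DSS option
    else:
        tx.fallen_back = True   # option was lost on the path: fall back


def first_dss_exchange(strips_options: bool) -> Tuple[Endpoint, Endpoint]:
    tx, rx = Endpoint(), Endpoint()
    segment = middlebox({"dss": True}, strips_options)  # forward path only
    ack = receiver_handle(rx, segment)
    sender_handle_ack(tx, ack)
    return tx, rx
```

Calling first_dss_exchange(False) leaves DSS_RCV true at both ends, while first_dss_exchange(True) leaves it false at both ends and marks the sender as fallen back. What should happen if the option is stripped only on the ACK path is not covered by the message and is not modelled here.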