Re: [multipathtcp] RFC6824bis edits based on implementation feedback

Yoshifumi Nishida <> Tue, 11 February 2020 09:05 UTC

From: Yoshifumi Nishida <>
Date: Tue, 11 Feb 2020 01:05:37 -0800
Message-ID: <>
To: V Anil Kumar <>
Cc: Alan Ford <>, multipathtcp <>

Hi folks,
I guess we might want to know the behavior of existing implementations such
as Linux.
When an MPTCP stack tries to send a data packet and finds that there is not
enough option space for the data mapping, what will it do?
Will it split the packet to create more option space?
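
To make the question concrete, here is a minimal sketch of the option-space arithmetic a sender would face. It assumes the 40-byte TCP option limit, the 12-byte timestamp and 10-byte single-block SACK sizes mentioned further down in this thread, and the DSS field sizes as I read them in 6824bis (4-byte header, 4- or 8-byte Data ACK, 4- or 8-byte DSN, 4-byte subflow sequence number, 2-byte data-level length, optional 2-byte checksum). The function names are hypothetical, padding to 4-byte boundaries is ignored, and none of this is taken from an existing stack.

```python
TCP_MAX_OPTION_BYTES = 40   # 60-byte maximum TCP header minus the 20-byte fixed header
TIMESTAMP_BYTES = 12        # 10-byte timestamp option plus two bytes of padding
SACK_ONE_BLOCK_BYTES = 10   # 2-byte option header plus one 8-byte block


def dss_length(data_ack_bytes=0, dsn_bytes=0, checksum=False):
    """Return the size in bytes of a DSS option (hypothetical helper).

    data_ack_bytes and dsn_bytes are 0 (field absent), 4, or 8.
    """
    if data_ack_bytes not in (0, 4, 8) or dsn_bytes not in (0, 4, 8):
        raise ValueError("field sizes must be 0, 4, or 8 bytes")
    length = 4                      # kind, length, subtype, and flags
    length += data_ack_bytes        # optional Data ACK
    if dsn_bytes:
        length += dsn_bytes + 4 + 2  # DSN, subflow sequence number, data-level length
        if checksum:
            length += 2             # checksum, only if negotiated
    return length


def fits(other_option_bytes, dss_bytes):
    """True if the DSS option fits next to the other options in one header."""
    return other_option_bytes + dss_bytes <= TCP_MAX_OPTION_BYTES


if __name__ == "__main__":
    others = TIMESTAMP_BYTES + SACK_ONE_BLOCK_BYTES
    full_dss = dss_length(data_ack_bytes=8, dsn_bytes=8, checksum=True)
    print("space left:", TCP_MAX_OPTION_BYTES - others)   # 18
    print("full 64-bit DSS:", full_dss)                   # 28
    print("fits:", fits(others, full_dss))                # False
```

Under these assumptions, a DSS option with a 64-bit DSN, an 8-byte Data ACK, and a checksum needs 28 bytes, which is more than the 18 bytes left next to a timestamp and a single SACK block.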

On Thu, Feb 6, 2020 at 10:10 AM V Anil Kumar <> wrote:

> Hi Alan,
> Thank you for your reply. I have two points to clarify. Please see them in
> line.
> ------------------------------
> *From: *"Alan Ford" <>
> *To: *"V Anil Kumar" <>
> *Cc: *"Yoshifumi Nishida" <>, "multipathtcp" <>
> *Sent: *Thursday, February 6, 2020 2:49:29 AM
> *Subject: *Re: [multipathtcp] RFC6824bis edits based on implementation
> feedback
> Hi Anil,
> This would not be forbidden if the mapping was carried on a pure ACK with
> no data.
> As far as I understand, delivery of DSM through pure ACK is not in the
> scope of the current draft. If we intend to do this, we may need an
> approach similar to the one proposed for delivery of ADD_ADDR option in
> pure ACK with reliability feature.  So, packing DSM on pure ACK does not
> seem to be an option at this stage.
> I do see the point here: if a packet of 1000 bytes contains 500 bytes of
> one mapping and 500 bytes of another mapping, then only one DSM would
> appear on one packet, leaving the mapping for the second 500 bytes to be
> carried somewhere else - the only option being a pure ACK. But this kind of
> scenario would be extremely rare
> Yes, I do agree that the scenario you mentioned above would be extremely
> rare. In fact, I wonder whether such a situation (i.e., need to cover the
> bytes in a single packet with two different maps) would ever arise.
> Probably there are some corner cases that I am not seeing right now.
> More importantly, the scenario that Yoshi and I are referring to is
> totally different: the one I gave as part of my comments in response
> to the proposed change 2. In this scenario, which you can see in the
> trailing mail, data segment-1 is transmitted without including its map,
> something which is permitted in the MPTCP framework (6824 bis). I make
> this inference from the text below in 6824 bis:
> "...even if a mapping does not exist from the subflow space to the data-
> level space, the data SHOULD still be ACKed at the subflow (if it is
> in-window). This data cannot, however, be acknowledged at the data level
> (Section 3.3.2) because its data sequence numbers are unknown.
> Implementations MAY hold onto such unmapped data for a short while in the
> expectation that a mapping will arrive shortly."
> One option the data sender has at this stage is to include the map for
> the data in segment-1 in segment-3 (please see the two-subflow example I gave
> in response to the change 2 you had proposed). But the proposed change does
> not permit this. What would happen to subflow-1 in this case?
> With regards,
> Anil
> and I would imagine any implementation would just split into two 500 byte
> segments each with the PSH flag set. I don’t think we need to spell this
> out in the spec however.
> Regards,
> Alan
> On 5 Feb 2020, at 15:09, V Anil Kumar <> wrote:
> Hi Yoshi,
> Please see in line.
> ------------------------------
> *From: *"Yoshifumi Nishida" <>
> *To: *"V Anil Kumar" <>
> *Cc: *"alan ford" <>, "multipathtcp" <>
> *Sent: *Tuesday, February 4, 2020 1:08:45 PM
> *Subject: *Re: [multipathtcp] RFC6824bis edits based on implementation
> feedback
> Hi Anil,
> Thanks for pointing it out. I overlooked this one.
> This looks like an interesting point.
> It seems to me that, according to the text, whether an RST happens or not
> depends on the size of the receive window.
> If the receive window size is big enough to accommodate segment 1 and
> segment 3, the text "Implementations MAY hold onto such unmapped data for
> a short while in the expectation that a mapping will arrive shortly. " can
> be applied to the segment 1. As a result, the segment 1 won't be discarded.
> *Yes.  So, segment 1 may be kept in the data receiver's buffer in
> expectation that its mapping will arrive shortly. And in the example that
> we are referring to, the data sender will not be able to include the map
> for the data in segment 1 in segment 3 or any higher segment.*
> *Regards,*
> *Anil*
> However, this might contradict the new text Alan proposed? Or am
> I missing something?
> Thanks,
> --
> Yoshi
> On Sun, Feb 2, 2020 at 8:42 AM V Anil Kumar <> wrote:
>> Hi Yoshi,
>> Thanks for this point. In fact, I had initially not thought of a
>> scenario, where the map is being delivered through a retransmitted data
>> packet while its first transmission did not include the map. Now I am just
>> seeing the document (RFC 6824-bis) in this context.
>> My understanding is that in scenarios like what I described in my
>> previous mail, RST is likely to happen whether we explicitly state so or
>> not. Please see the paragraph containing the below text in RFC 6824-bis.
>> "If a mapping for that subflow-level sequence space does not arrive
>> within a receive window of data, that subflow SHOULD be treated as broken,
>> closed with a RST, and any unmapped data silently discarded."
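
A minimal sketch of how a receiver might apply the quoted rule, assuming "within a receive window of data" means that a full window of subflow bytes has arrived since the first unmapped byte. That reading, the function name, and the parameters are assumptions for illustration. Sequence-number wraparound is ignored.

```python
def subflow_should_reset(first_unmapped_ssn, rcv_nxt, rcv_window):
    """Return True once a receive window of data has passed without a mapping.

    first_unmapped_ssn: subflow sequence number of the oldest unmapped byte
    rcv_nxt:            next subflow sequence number expected by the receiver
    rcv_window:         receive window size in bytes
    """
    bytes_since_unmapped = rcv_nxt - first_unmapped_ssn
    return bytes_since_unmapped >= rcv_window


if __name__ == "__main__":
    # 400 subflow bytes have arrived since the unmapped byte at sequence 1.
    print(subflow_should_reset(1, 401, 65535))  # large window: keep waiting
    print(subflow_should_reset(1, 401, 300))    # small window: treat as broken
```

With a large window the unmapped data can be held for a long time. With a small one the subflow is declared broken quickly, so the outcome depends on the window size.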
>> If we assume that the map is included while retransmitting the data (even
>> though the first transmission did not contain the map for some reason),
>> we could argue that RST could be avoided provided that the retransmission
>> is triggered within a receive window of data. But the question here would
>> be how and when the retransmission will take place. In this case, the
>> subflow may not initiate the retransmission of data on its own (i.e., no
>> retransmission due to three duplicate ACKs or RTO expiry at subflow level)
>> as there is no segment loss in the subflow-level sequence space. So there
>> is a high possibility of the RST happening even before the map is delivered
>> through retransmission.
>> With regards,
>> Anil
>> ------------------------------
>> *From: *"Yoshifumi Nishida" <>
>> *To: *"V Anil Kumar" <>
>> *Cc: *"alan ford" <>, "multipathtcp" <>
>> *Sent: *Saturday, February 1, 2020 3:39:51 AM
>> *Subject: *Re: [multipathtcp] RFC6824bis edits based on implementation
>> feedback
>> Hi Anil,
>> I have a question about your proposed text.
>> I am actually wondering if we really want to terminate the connection here.
>> The packets without proper mappings will be treated as invalid and will
>> be discarded.
>> If an implementation fails to attach a proper mapping for some reason
>> (e.g. option space), it might be able to attach the proper one when it
>> retransmits the packets. This also looks OK to me.
>> I don't have a strong preference on this. But do we have a reason to
>> terminate the connection?
>> Thanks,
>> --
>> Yoshi
>> On Mon, Jan 13, 2020 at 10:28 AM V Anil Kumar <> wrote:
>>> Hi,
>>> I have some points related to the modifications (Change 2) being
>>> proposed on the data sequence map. Please see them inline. Though I am
>>> putting forward the points below, if the consensus is in favour of the
>>> proposed change for reducing implementation complexity, I am OK with
>>> that as well.
>>> ------------------------------
>>> *From: *"alan ford" <>
>>> *To: *
>>> *Sent: *Thursday, January 2, 2020 4:21:32 AM
>>> *Subject: *[multipathtcp] RFC6824bis edits based on implementation
>>> feedback
>>> Hi all,
>>> We’d love to get this to a state of completion as soon as possible, and
>>> to this end I am starting a new thread on this topic. In discussion with
>>> the chairs, it *is *possible to make the desired changes in AUTH48 as
>>> long as there is WG consensus. The discussion so far has been fairly
>>> limited in terms of participation.
>>> I would ask the chairs please if it was possible to specify a time bound
>>> for this discussion and a default conclusion.
>>> Regarding the changes, in summary, there are two areas where changes
>>> have been requested by the implementation community. As we are the IETF we
>>> obviously have strong focus on “running code” and so ease of implementing
>>> standards-compliant code is strongly desirable. However, we do not wish to
>>> reduce functionality agreed by the IETF community if it is considered a
>>> required feature by the community.
>>> *Change 1*
>>> Change the sentence reading:
>>> *   If B has data to send first, then the reliable delivery of the ACK +
>>> MP_CAPABLE can be inferred by the receipt of this data with an MPTCP Data
>>> Sequence Signal (DSS) option (Section 3.3). *
>>> To:
>>> *   If B has data to send first, then the reliable delivery of the ACK +
>>> MP_CAPABLE is ensured by the receipt of this data with an MPTCP Data
>>> Sequence Signal (DSS) option (Section 3.3) containing a DATA_ACK for the
>>> MP_CAPABLE (which is the first octet of the data sequence space).*
>>> What this means:
>>> The current text is concerned only with ensuring a path is MPTCP
>>> capable, and so only cares that DSS option occurs on a data packet.
>>> However, the MP_CAPABLE option is defined to occupy the first octet of data
>>> sequence space and thus, if analogous to TCP, must be acknowledged. From
>>> an implementation point of view it would make sense not to have this
>>> hanging around forever and instead define it is acknowledged at the
>>> connection level as soon as received. This change ensures the first data
>>> packet also DATA_ACKs this MP_CAPABLE octet.
>>> *Change 2*
>>> Change the sentence reading:
>>> *   A Data Sequence Mapping does not need to be included in every MPTCP
>>> packet, as long as the subflow sequence space in that packet is covered by
>>> a mapping known at the receiver.*
>>> To:
>>> *   The mapping provided by a Data Sequence Mapping MUST apply to some
>>> or all of the subflow sequence space in the TCP segment which carries the
>>> option. It does not need to be included in every MPTCP packet, as long as
>>> the subflow sequence space in that packet is covered by a mapping known at
>>> the receiver.*
>>> What this means:
>>> The current text does not place any restrictions on where a mapping
>>> could appear. In theory a sender could define a thousand different mappings
>>> up front, send them all, and expect a receiver to store this and reassemble
>>> data according to these mappings as it arrives. Indeed, this was never
>>> explicitly disallowed since it “might have been useful”. The implementation
>>> community, however, has expressed concerns over the difficulty of
>>> implementing this open-endedly. How many mappings is it reasonable to
>>> store? Is there a DoS risk here? Instead, it has been requested that the
>>> specification restricts the placement of the DSS option to being within the
>>> subflow sequence space to which it applies.
>>> Below are my comments on this. I had shared some of these points in a
>>> previous thread that you had initiated in the same context.
>>> Transmitting a large number of non-contiguous data sequence maps could be
>>> a misbehaviour (map-flooding), though it is not clear whether this can go
>>> to the extent of causing a potential DoS to the data receiver. So some sort
>>> of restriction on this could be useful. One approach could be to insist
>>> that the data sender should ensure that the map being transmitted is for
>>> in-window data, as per the receiver advertised window. A receiver should
>>> anyhow be willing to store the maps for in-window data to deal with packet
>>> loss. For example, when a window of data segments (say 1 to 64) is
>>> transmitted, each carrying its corresponding map, and segment-1 is lost,
>>> the maps for the remaining 63 need to be stored till the lost segment is
>>> retransmitted. Of course, in this case the maps will be stored at the
>>> receiver side along with their corresponding data. But the need to store
>>> multiple maps for in-window data would still be there.
>>> The problem with the proposed change (restriction) is that a data sender
>>> may find it difficult if a need arises to slightly delay the map
>>> delivery by a few segments, i.e., sending some data first without a map, and
>>> then sending the corresponding map in a later segment, as shown below:
>>> subflow-1:   segment-1         segment-3         segment-4         segment-7
>>>              bytes: 1-100      bytes: 201-300    bytes: 301-400    bytes: 601-700
>>>              no map            map for 1-100     map for 201-400   map for 601-700
>>>
>>> subflow-2:   segment-2         segment-5         segment-6         segment-8
>>>              bytes: 101-200    bytes: 401-500    bytes: 501-600    bytes: 701-800
>>>              map for 101-200   map for 401-600   no map            map for 701-800
>>> In the above case, segment-1 goes without a map and its map is included
>>> later in segment-3, the next data segment in the same subflow. Further, in
>>> the above scheduling pattern, a single map in segment-3 cannot cover the
>>> data in both segment-1 and segment-3, as some data in between (segment-2) is
>>> transmitted through another subflow. With the proposed change, the map in
>>> segment-3 will become invalid and this will eventually break subflow-1,
>>> though this could be a corner case.
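
A minimal sketch of the check that the proposed MUST implies, applied to subflow-1 in the example above. It assumes subflow sequence numbers on subflow-1 start at 1, so segment-1 occupies subflow bytes 1-100, segment-3 occupies 101-200, and segment-4 occupies 201-300. The function name is hypothetical, and the special case of a zero data-level length is not handled.

```python
def mapping_applies_to_segment(map_ssn, map_len, seg_ssn, seg_len):
    """True if the mapped subflow range overlaps the carrying segment's range.

    map_ssn, map_len: start and length of the mapping in subflow sequence space
    seg_ssn, seg_len: start and length of the segment that carries the option
    """
    map_end = map_ssn + map_len   # exclusive end of the mapped range
    seg_end = seg_ssn + seg_len   # exclusive end of the segment payload
    return map_ssn < seg_end and seg_ssn < map_end


if __name__ == "__main__":
    # segment-3 (subflow bytes 101-200) carries the map for segment-1 (1-100).
    print(mapping_applies_to_segment(1, 100, 101, 100))    # False: violates the MUST
    # segment-4 (subflow bytes 201-300) carries a map for subflow bytes 101-300.
    print(mapping_applies_to_segment(101, 200, 201, 100))  # True: allowed
```

Under the proposed text, the delayed map in segment-3 fails this check because it covers none of the bytes in the segment that carries it.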
>>> The question at this stage is why segment-1 would be transmitted without
>>> its map. In the case of bidirectional data transfer, there could be a need
>>> to pack both timestamp and SACK options in a data segment, i.e.,
>>> piggybacking of SACK with data. If we consider that the timestamp takes 12
>>> bytes and SACK, even with a single block, takes another 10 bytes, the
>>> remaining 18 bytes of option space are not adequate to carry a data sequence
>>> signal with a map, especially when the DSN is 64 bits long. So the delivery
>>> of one of the two (SACK or map) would be delayed.
>>> As far as I understand, RFC 2018 (TCP Selective Acknowledgement Options)
>>> implies that SACK should not be delayed. It states "If sent at all, SACK
>>> options SHOULD be included in all ACKs which do not ACK the highest
>>> sequence number in the data receiver's queue". It also says "If data
>>> receiver generates SACK options under any circumstance, it SHOULD generate
>>> them under all permitted circumstances". So, as part of meeting the RFC
>>> 2018 requirements, if the combination of SACK and timestamp is given
>>> preference over DSS, data segments could be transmitted without their map.
>>> Another case of delaying the map could arise if the data sender prefers to
>>> send the ADD_ADDR option, instead of the map, in a data segment. It is nice
>>> that the ADD_ADDR option can be delivered reliably in a pure ACK, but I think
>>> this is not the case with DSS at present.
>>> If we adopt the proposed change, I think it might also be helpful to
>>> spell out how the receiver is supposed to behave if it gets maps not
>>> meeting the MUST condition in the proposed change. For example, it could
>>> terminate the subflow with the MP_TCPRST option (section 3.6 in RFC
>>> 6824-bis), with an appropriate reason code and T flag value to inform the
>>> data sender of the cause of the subflow termination.
>>> With regards,
>>> Anil
>>> Please can members of the WG express whether they are happy with these
>>> changes, or concerned.
>>> Best regards,
>>> Alan
>>> _______________________________________________
>>> multipathtcp mailing list