Re: [multipathtcp] MPTCP implementation feedback for RFC6824bis

Alan Ford <alan.ford@gmail.com> Wed, 04 December 2019 21:51 UTC

From: Alan Ford <alan.ford@gmail.com>
Cc: MultiPath TCP - IETF WG <multipathtcp@ietf.org>, Yoshifumi Nishida <nsd.ietf@gmail.com>, Philip Eardley <philip.eardley@bt.com>, Mirja Kuehlewind <ietf@kuehlewind.net>, mptcp Upstreaming <mptcp@lists.01.org>
To: Christoph Paasch <cpaasch@apple.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/multipathtcp/_59IH96eWMEaZMzkDa2DacNHJQU>
Subject: Re: [multipathtcp] MPTCP implementation feedback for RFC6824bis

Hi Christoph,

Thank you for the clarifications. I revisited the text to see how they could be incorporated, but I find myself unsure of the need; please see my comments below:


Section 3.1 clarification

Towards the end of Section 3.1 we actually say the following:

   The SYN with MP_CAPABLE occupies the first octet of data sequence space, although this does not need to be acknowledged at the connection level until the first data is sent (see Section 3.3).

This would seem to cover exactly this concern. However, if you still feel further clarification is required, we could also add it to the sentence you suggest, i.e.:

   If B has data to send first, then the reliable delivery of the ACK + MP_CAPABLE can be inferred by the receipt of this data with an MPTCP Data Sequence Signal (DSS) option (Section 3.3). 

This could change to:

   If B has data to send first, then the reliable delivery of the ACK + MP_CAPABLE can be inferred by the receipt of this data with an MPTCP Data Sequence Signal (DSS) option (Section 3.3) containing a DATA_ACK for the MP_CAPABLE (which is the first octet of the data sequence space). Furthermore, when A receives a DATA_ACK from B it is a signal of the reliable delivery of A's MP_CAPABLE.

Please confirm if you still feel this is necessary, given the quote I provide.
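
To make the inference concrete, here is a minimal sketch of the check a host could apply to an incoming DATA_ACK. It assumes 64-bit data sequence numbers, that the MP_CAPABLE occupies the octet numbered by the initial data sequence number (called idsn here), and that DATA_ACK is cumulative and names the next expected octet. The function name and the serial-number style comparison are illustrative only and are not taken from the draft.

```python
MOD64 = 1 << 64  # assumed width of the data sequence space


def acks_mp_capable(idsn: int, data_ack: int) -> bool:
    """Return True if a cumulative DATA_ACK acknowledges the octet at idsn.

    Hypothetical helper: idsn is assumed to be the data sequence number
    occupied by the MP_CAPABLE, so any DATA_ACK strictly ahead of it
    (modulo 2**64) signals that the MP_CAPABLE was delivered.
    """
    distance = (data_ack - idsn) % MOD64
    # Strictly ahead, and within half the space so stale values are rejected.
    return 0 < distance < (MOD64 >> 1)
```

For example, with idsn 1000 a DATA_ACK of 1001 acknowledges the MP_CAPABLE, whereas a DATA_ACK of 1000 does not.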


Early Mapping

   A Data Sequence Mapping does not need to be included in every MPTCP
   packet, as long as the subflow sequence space in that packet is
   covered by a mapping known at the receiver.  This can be used to
   reduce overhead in cases where the mapping is known in advance.  One
   such case is when there is a single subflow between the hosts, and
   another is when segments of data are scheduled in larger-than-packet-
   sized chunks.

I would suggest simply adding a sentence at the beginning saying “A Data Sequence Mapping MUST appear on a TCP segment which is covered by the mapping”.
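
As a rough illustration of that rule, the sketch below tests whether a mapping shares at least one octet with the segment that carries it. It assumes "covered" means at least partially covered, ignores sequence number wraparound, and treats both ranges as half-open intervals in absolute subflow sequence numbers; the function and parameter names are hypothetical. A DATA_FIN without data would need separate handling and is not modelled here.

```python
def mapping_covers_segment(seg_seq: int, seg_len: int,
                           map_ssn: int, map_len: int) -> bool:
    """Return True if the mapping overlaps the carrying segment.

    Segment octets: [seg_seq, seg_seq + seg_len)
    Mapped octets:  [map_ssn, map_ssn + map_len)
    Simplification: no wraparound, absolute sequence numbers.
    """
    if seg_len <= 0 or map_len <= 0:
        # Pure ACKs and empty mappings carry no overlapping octet.
        return False
    return map_ssn < seg_seq + seg_len and seg_seq < map_ssn + map_len
```

Applied to the cases discussed in this thread, and assuming a 1000-octet segment where no length was given: a segment at sequence number 1 carrying a mapping for SSN 1 and length 10000 passes; a segment at 100 with length 100 carrying a mapping for 151-250 passes; a segment at 1 carrying a mapping for SSN 1001 and length 100 fails.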

Regarding late mapping, we say:

   Implementations MAY hold onto such unmapped data for a
   short while, in the expectation that a mapping will arrive shortly.
   Such unmapped data cannot be counted as being within the connection-
   level receive window because this is relative to the data sequence
   numbers, so if the receiver runs out of memory to hold this data, it
   will have to be discarded.  If a mapping for that subflow-level
   sequence space does not arrive within a receive window of data, that
   subflow SHOULD be treated as broken, closed with a RST, and any
   unmapped data silently discarded.

Note the last sentence: we already bound this suggestion to a single subflow receive window of data, and provide a mechanism to reject it (RST and silently discard). If we add the above text regarding early mapping, it would also apply in this case and meet your requirement that the mapping lands on a segment covered by the mapping.
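
The bound described in that last sentence could be tracked along the following lines. This is only a sketch under stated assumptions: the class, its fields and the returned action strings are invented for illustration, the receive window is taken as a fixed octet count, and a mapping is assumed to cover all octets held so far.

```python
from dataclasses import dataclass


@dataclass
class UnmappedBuffer:
    """Hypothetical per-subflow tracker for data that arrived without a mapping."""
    rcv_window: int       # subflow receive window in octets (assumed fixed)
    held: int = 0         # unmapped octets currently being held
    broken: bool = False  # set once the subflow has been given up on

    def on_unmapped_data(self, nbytes: int) -> str:
        """Return the action for newly arrived unmapped octets."""
        if self.broken:
            return "discard"
        if self.held + nbytes > self.rcv_window:
            # No mapping arrived within a receive window of data:
            # treat the subflow as broken, RST it, and drop what was held.
            self.broken = True
            self.held = 0
            return "reset"
        self.held += nbytes
        return "hold"

    def on_mapping(self) -> int:
        """A mapping covering the held octets arrived; release them."""
        released, self.held = self.held, 0
        return released
```

With a 3000-octet window, two unmapped 1000-octet segments are held; a mapping arriving then releases 2000 octets, while a further 3001 octets without any mapping would trigger the reset.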

Any thoughts?

Best regards,
Alan

> On 2 Dec 2019, at 17:27, Christoph Paasch <cpaasch@apple.com> wrote:
> 
> Hello Alan,
> 
> On 29/11/19 - 21:13:38, Alan Ford wrote:
>>> On 28 Nov 2019, at 19:49, Christoph Paasch <cpaasch@apple.com> wrote:
>>>> On Nov 28, 2019, at 8:16 AM, Alan Ford <alan.ford@gmail.com> wrote:
>>>>> On 27 Nov 2019, at 19:29, Christoph Paasch <cpaasch@apple.com> wrote:
>>>>> Section 3.3.1, page 32 & 33, "A data sequence mapping does not need..."
>>>>> 
>>>>> This paragraph states that it is permissible to send a mapping in advance. Late mapping is specified a bit higher, around the sentence
>>>>> "Implementations MAY hold onto such unmapped data for a short while in the expectation that a mapping will arrive shortly"
>>>>> 
>>>>> These kinds of early/late mapping announcements are difficult to handle in an implementation. The Linux Kernel implementation of multipath-tcp.org <http://multipath-tcp.org/> has always disallowed such mappings. That is, whenever a DSS-option is received such that the range specified by the relative subflow-sequence number in the DSS-option and the data-length does not (partially) cover the TCP sequence number of the packet itself, the subflow will be killed with a TCP-RST. The problem with handling such early/late mappings is that it is unclear how long the stack needs to remember these mappings (in the early-mapping case), or how long it needs to hold on to the data (in the late-mapping case).
>>>>> 
>>>>> We thus suggest changing this to the following:
>>>>> Whenever a DSS-option is received on a packet such that the mapping of the subflow-sequence space does not partially cover the TCP-sequence number of the packet itself, the host MUST discard this mapping and MAY destroy the subflow with a TCP-RST. It should be noted that a DATA_FIN that does not come with data has a relative subflow-sequence number of 0 and thus should be handled separately.
>>>> 
>>>> This one I do have an issue with:
>>>> 
>>>> - It is a technical change
>>>> - Wording to this effect has been in the document since pretty much the beginning
>>>> - It is a MAY which might as well say “there is no guarantee this would work”
>>> 
>>> The problem with the MAY is that the sender can't really know if the receiver accepts it (more regarding this below)
>>> 
>>>> Most importantly, the replacement text seems not to address this issue at all. If I read it correctly, it says that the data sequence mapping option MUST partially cover the subflow sequence space of the packet itself. But that has nothing to do with late or early mapping, both could partially cover the subflow sequence space and preceding data.
>>>> 
>>>> Can you clarify exactly what you want to permit and prevent, here?
>>> 
>>> Let me try to clarify what exactly we mean with early/late mapping so that we are all on the same terms here:
>>> 
>>> Early mapping:
>>> 
>>> A TCP-segment with sequence-number 1 holds a DSS-option with subflow-sequence number 1001 and data-length 100. This means we need to allocate space to store this DSS-option so that when the TCP-segment with seqno 1001 arrives we know the mapping. More of these DSS-options may follow, all of which need to be stored in allocated memory. It is unclear what the limit to this is, and there is no way to communicate this limit to the sender.
>> 
>> I don’t think we have ever intended to support a mapping like this. If the text is not clear here then yes, we might have an issue.
> 
> Yes, I do think that the text is not very clear on that  - we should clarify
> that.
> 
>> We intend only to support: A TCP segment with sequence-number 1 holds a DSS option for SSN 1 and length 10000 (so multiple segments in the future).
> 
> Sounds good!
> 
>> Or, slightly more convoluted, a TCP segment with SN 100, length 100, which holds a DSS option for 151-250, and bytes 50-150 were already covered by a previous mapping.
> 
> Also good.
> 
>> I do not believe we have ever intended mappings on data segments where the segments do not include any of the mapped data. We did, however, intend to support mappings on pure ACKs in order to avoid any option space limits.
> 
> Mappings on a pure ACK are an unlikely use-case. The problem is that in the
> end the mapping needs to reliably make it to the receiver as otherwise he
> needs to throw the data away. Combining it with data implicitly makes it
> reliable.
> 
> In case of option-space limits, it is better to send the options that do
> consume a lot of space (ADD_ADDR,...) on the pure ACK.
> 
>> 
>>> Late mapping:
>>> 
>>> The receiver receives data without DSS-options with TCP-sequence 1 to 1001. The corresponding DSS-option, however, arrives with the TCP-segment with seqno 2001. Here, the receiver needs to hold on to this data, waiting for the TCP-segment with the DSS-option. At some point the receiver needs to drop the data due to memory limits. Again, the sender has no way of knowing what this limit is.
>> 
>> So this is slightly more problematic, and this is what the MAY in the text is designed to discuss.
>> 
>> I believe we had intended to support a situation where you could have a segment 1-1000 without a DSS option, and 1001-2000 also without, and then 2001-3000 with the option then providing a mapping for 1 to 2001 or higher.
> 
> I think this scenario is fine to some extent, with a slight change that the
> provided mapping should be for 1 to 2002 or higher (thus, including the byte
> of the 2001-3000 segment).
> 
>> This would allow a sender to start pushing out data before it knew what a mapping might look like.
>> It does, however, seem an unlikely situation but you as a receiver could of course reduce the subflow window size in order to limit buffering. The text does recognise the memory issue and point out that this won’t be DATA ACKed and as such a sender should soon realise and retransmit on a separate subflow and then this subflow may eventually be closed.
> 
> Yes, it seems like an unlikely situation and I'm not sure about the
> use-case for doing this.
> 
>> Another situation would involve two mappings on the same TCP segment. If the first 50 bytes of a segment are covered by the mapping provided in that segment, but there are then also another 50 bytes, then the mapping can’t be provided until the next segment.
> 
> Yes, two mappings on a single packet are problematic simply because of the
> lack of option-space.
> 
>> I guess my initial thought here is that this was intended to cover a number of corner cases but if it does not work for you then there are several compliant ways of dealing with it.
>> 
>>> When the DSS-option comes together with the corresponding TCP-sequence it is straight-forward to store it together with the data. There are no issues with memory-allocation,... as all of this is accounted together with the announced window (yes, the memory is not counted against the window, but the receiver can foresee the DSS-option overhead when computing what window he should announce).
>>> 
>>> When a receiver gets data without a DSS-option, he can store it for up to 64KB of data as that is the maximum data-level length and the last segment of the 64KB-train could be holding the DSS-option. After that he has to drop the data.
>>> 
>>> When the mapping partially covers the segment it also isn't a problem as the unmapped part can safely be dropped and the mapped part can be passed on to the MPTCP-layer.
>>> 
>>> All of this does not imply that every segment of a mapping needs to hold the DSS-option. Just one of them needs to have it.
>> 
>> That last statement aligns with what was intended in the text. But are you really saying that, or are you saying that the first one needs to have it? Because that would be a change.
> 
> No, the first one does not need to have it. Just one of them.
> 
> 
> Cheers,
> Christoph
>