Re: DISCUSS's on rfc1981bis - MK Comments

"Mirja Kuehlewind (IETF)" <ietf@kuehlewind.net> Tue, 23 May 2017 11:20 UTC


Hi Gorry,

my thinking was that the IP layer should actually not indicate a new PMTU until the probing is completed and as such only probing packets should have a larger size than indicated by the current PMTU. However, I actually don’t know how this is implemented in reality.

I think your text below is better. Now that I understand what it wants to say I can follow it, but I'm still not sure it is clear to other readers. Let me give it a try:

"A packetization layer that determines that a probe packet is lost needs to adapt the segment size of the retransmission. Using the size reported in the last Packet Too Big message, however, can lead to further losses, as there might be smaller PMTU limits at routers further along the path. This would lead to loss of all retransmitted segments and therefore cause unnecessary congestion and amplification each time a new router announces a smaller MTU. The packetization layer therefore should not send any packets with a new PMTU that do not belong to the probing itself until the probing is completed and at least one packet with the new PMTU was successfully received at the other end. During the probing phase the packetization layer should either continue to use the previous PMTU or the minimum PMTU for all packets that are not probe packets, including retransmissions."

I guess we could even use SHOULD and SHOULD NOT…?

What do you think?

Mirja
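
The rule in the proposed paragraph above (only probe packets may use an unconfirmed size; everything else, including retransmissions, stays at the previous or minimum PMTU) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `PmtuState` and all of its method names are hypothetical and are not taken from rfc1981bis or from any real stack, and the fallback to 1280 bytes when a Packet Too Big message reports less than the confirmed size is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

IPV6_MIN_MTU = 1280  # minimum link MTU that every IPv6 path has to support


@dataclass
class PmtuState:
    """Hypothetical per-path state kept by a packetization layer."""
    confirmed_pmtu: int = IPV6_MIN_MTU    # last size known to reach the peer
    candidate_pmtu: Optional[int] = None  # size currently being probed

    def start_probe(self, size: int) -> None:
        self.candidate_pmtu = size

    def on_packet_too_big(self, reported_mtu: int) -> None:
        # The reported size is unconfirmed: a later hop may be smaller still,
        # so it only updates the probe candidate.
        reported = max(reported_mtu, IPV6_MIN_MTU)
        if self.candidate_pmtu is None or reported < self.candidate_pmtu:
            self.candidate_pmtu = reported
        # Assumption of this sketch: if the path shrank below the confirmed
        # size, fall back to the minimum rather than trust the reported value.
        if reported < self.confirmed_pmtu:
            self.confirmed_pmtu = IPV6_MIN_MTU

    def on_probe_acked(self, size: int) -> None:
        # A packet of this size was received at the other end.
        self.confirmed_pmtu = max(self.confirmed_pmtu, size)
        if self.candidate_pmtu == size:
            self.candidate_pmtu = None  # probing completed

    def size_for(self, is_probe: bool) -> int:
        # Only probe packets may use the unconfirmed candidate size.
        if is_probe and self.candidate_pmtu is not None:
            return self.candidate_pmtu
        return self.confirmed_pmtu


state = PmtuState(confirmed_pmtu=1400)
state.start_probe(8192)
state.on_packet_too_big(4096)
print(state.size_for(is_probe=True))   # next probe size: 4096
print(state.size_for(is_probe=False))  # data and retransmissions: 1400
```

In this sketch a Packet Too Big message never raises the size used for ordinary data; only an acknowledged probe does.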

> Am 23.05.2017 um 09:50 schrieb Gorry Fairhurst <gorry@erg.abdn.ac.uk>:
> 
> Here is my proposed replacement text. It focusses on retransmission, because other clauses deal with other aspects of handling PTB:
> 
> "Note: A packetization layer that determines a probe packet is lost needs to avoid retransmission using a packet size that may induce multiple retransmissions. This could occur if it retransmitted the data segmented into packets of the size reported in the last Packet Too Big message and there were several successively smaller PMTU limits at the routers along the path. The resulting sequence of probe packets would require multiple retransmissions (with retransmitted probes also dropped by a router later along the path), with unnecessary delay of the data and transmission of superfluous packets that contribute to the network load under congestion. Any packetization layer that uses retransmission is therefore also responsible for congestion control of its retransmissions [RFC8085]."
> 
> 
> Gorry
> 
> On 22/05/2017, 16:02, Mirja Kuehlewind (IETF) wrote:
>> Yes, please propose new text.
>> 
>>> Am 22.05.2017 um 17:01 schrieb Gorry Fairhurst<gorry@erg.abdn.ac.uk>:
>>> 
>>> On 22/05/2017 15:31, Mirja Kuehlewind (IETF) wrote:
>>>> Hi Gorry,
>>>> 
>>>> thanks for the example below. Now I understand. My thinking was that you could also just retransmit the first segment in response to the Packet Too Big message with the newly indicated packet size and hold the rest of the to-be-retransmitted packets until the MTU probing has terminated. However, the point here is actually not about retransmitting. I guess what you actually need to say is that you should only send one MTU probe at a time. So if you want to retransmit all segments at once, you need to use the confirmed PMTU that was used before the probing started, right?
>>>> 
>>>> Mirja
>>>> 
>>> OK - I also don't particularly mind if we change the text.
>>> 
>>> I'd suggest not even retransmitting the probe packet as another probe (if this data needs to be sent reliably). If there is more data, that later data can be used for the probe. I think there is really no need to hurry this procedure and be vulnerable to multiple losses and re-transmits.
>>> 
>>> So, without trying to sketch a definition of a method - which I think is outside of the spirit of this update, can we simply identify what is the key thing to highlight?
>>> 
>>> Gorry
>>> 
>>>>> Am 22.05.2017 um 13:17 schrieb Gorry Fairhurst<gorry@erg.abdn.ac.uk>:
>>>>> 
>>>>> On 22/05/2017 10:49, Mirja Kuehlewind (IETF) wrote:
>>>>>> See below.
>>>>>> 
>>>>>> 
>>>>>>> Am 20.05.2017 um 17:11 schrieb Gorry Fairhurst<gorry@erg.abdn.ac.uk>:
>>>>>>> 
>>>>>>> On 19/05/2017, 15:51, Mirja Kuehlewind (IETF) wrote:
>>>>>>>> Hi Gorry, hi all,
>>>>>>>> 
>>>>>>>> thanks for the work and the update. All comments from Gorry seemed fine to me and I will try to review the changes in the updated doc soon. One more minor comment on this:
>>>>>>>> 
>>>>>>>>> Am 09.05.2017 um 08:07 schrieb Gorry Fairhurst<gorry@erg.abdn.ac.uk>:
>>>>>>>>> 
>>>>>>>>>> I don't understand the following paragraph. Can this be removed?
>>>>>>>>>> "Note: A packetization layer must not retransmit in response to
>>>>>>>>>> every Packet Too Big message, since a burst of several oversized
>>>>>>>>>> segments will give rise to several such messages and hence several
>>>>>>>>>> retransmissions of the same data. If the new estimated PMTU is
>>>>>>>>>> still wrong, the process repeats, and there is an exponential
>>>>>>>>>> growth in the number of superfluous segments sent."
>>>>>>>>>> 
>>>>>>>>> GF: I think the example is important. I'm not sure why that would be removed.
>>>>>>>> The problem is I don’t understand this example. Is there an assumption that all MTU probe packets have been ‚generated‘ on the same packet? If so, that makes sense but must be spelled out somewhere!
>>>>>>>> 
>>>>>>>> Mirja
>>>>>>>> 
>>>>>>>> 
>>>>>>> I think the wording is maybe jumbled, but the point is OK. Perhaps something like this would be clearer?
>>>>>>> 
>>>>>>> "Note: A packetization layer ought to avoid retransmitting the data in a probe packet
>>>>>>> using the size reported in the last Packet Too Big message.
>>>>>>> This is to avoid an exponential growth in the number of superfluous segments
>>>>>>> that would be sent when a path encounters several successive smaller PMTU limits.
>>>>>>> (Each new estimated PMTU would result in retransmission of the data in a smaller
>>>>>>> packet that may itself fail if it encounters a still smaller PMTU in a device further
>>>>>>> along the same path)."
>>>>>>> 
>>>>>>> Gorry
>>>>>>> 
>>>>>> I still don’t get it. Each time a packet is dropped, you have to
>>>>>> retransmit it (if your transport is reliable)
>>>>>> because otherwise the other end will not have it.
>>>>>> 
>>>>> Sure. This says nothing about how you find out what segments need to
>>>>> be retransmitted, only about the *SIZE* of the packets you use to perform that transmission.
>>>>>> 
>>>>>> I also still don’t see how there can be an exponential growth.
>>>>>> 
>>>>>> 
>>>>>> Sorry if I miss something but maybe you can explain again this case using different words…?
>>>>>> 
>>>>>> Mirja
>>>>>> 
>>>>> 
>>>>> Before I try different words, we should make sure the topic is understood.
>>>>> 
>>>>> Suppose we wish to send a probe for 8KB, and suppose the path supports
>>>>> 3 router MTU sizes: a hop with 4KB MTU and then a later hop with 2KB MTU, and finally one of just 1400B.
>>>>> 
>>>>> Largest probe size = 8KB
>>>>> 
>>>>> The first probe sends 8KB as the Segment Size
>>>>> ->------------------------------X
>>>>> Dropped by first router along path.
>>>>> In example case PTB indicates next hop =  4KB.
>>>>> Data not reliably sent.
>>>>> 
>>>>> Largest probe size = 4KB
>>>>> 
>>>>> Retransmit 2 packets for same segment of data, each size 4KB
>>>>> ------------->------------------------------X
>>>>> ------------->------------------------------X
>>>>> Dropped by router later along path.
>>>>> In example case PTB indicates next hop =  2KB.
>>>>> Data still not reliably sent.
>>>>> 
>>>>> Largest probe size = 2KB
>>>>> 
>>>>> Retransmit 4 packets for same segment of data.
>>>>> ------------->------------------------------>---X
>>>>> ------------->------------------------------>---X
>>>>> ------------->------------------------------>---X
>>>>> ------------->------------------------------>---X
>>>>> In example case PTB indicates next hop =  1400B.
>>>>> Data still not reliably sent.
>>>>> 
>>>>> Largest probe size = 1400B
>>>>> 
>>>>> Data finally transmitted in 6 packets all less than PMTU.
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> 
>>>>> All data delivered. PMTU=1400B is confirmed.
>>>>> 
>>>>> Total sent packets = 13 packets
>>>>> No of re-transmissions = 12 (6 failed re-transmissions in 3 re-transmission attempts)
>>>>> 
>>>>> Now let's try a method that heeds the warning in the text and re-transmits the probe data using a "safe" MTU:
>>>>> 
>>>>> The first probe sends 8KB as the Segment Size
>>>>> ->------------------------------X
>>>>> 
>>>>> Dropped by first router along path.
>>>>> In example case PTB indicates next hop =  4KB.
>>>>> Data not reliably sent.
>>>>> 
>>>>> Largest probe size = 4KB
>>>>> 
>>>>> Sender resends the 8KB segment using a known workable PMTU (in the above 1500B, if that were cached, or even 1280B).
>>>>> 
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> ------------->------------------------------>-------------->
>>>>> 
>>>>> Total sent packets = 7 packets
>>>>> No of re-transmissions = 6 (all successful, 1 re-transmission)
>>>>> 
>>>>> And afterwards separately resend a single probe packet to detect the largest PMTU using the 4KB probe.
>>>>> 
>>>>> Obviously, there are many combinations of paths, and not all cases lead
>>>>> to exponential growth, but all cases lead to multiple probes, and
>>>>> multiple re-transmissions.
>>>>> 
>>>>> Gorry
>
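
The packet counts in the worked example quoted above can be reproduced with a small simulation. This is a rough sketch only: it ignores header overhead, takes 1KB as 1024 bytes, and assumes the "safe" size is 1400B so that the segment fits in 6 packets as drawn in the example. The function names are hypothetical.

```python
import math


def packets_needed(data_bytes: int, size: int) -> int:
    """Packets required to carry data_bytes at the given packet size."""
    return math.ceil(data_bytes / size)


def sent_using_reported_size(data_bytes: int, hop_mtus: list) -> int:
    """Resend the whole segment at each PTB-reported size until it fits."""
    size = data_bytes  # the first probe carries the whole segment
    total = 0
    while True:
        total += packets_needed(data_bytes, size)
        # The first hop with an MTU below the packet size drops the packets
        # and reports its own MTU in the Packet Too Big message.
        blocking = next((mtu for mtu in hop_mtus if mtu < size), None)
        if blocking is None:
            return total  # every hop forwarded the packets
        size = blocking


def sent_using_safe_size(data_bytes: int, hop_mtus: list, safe_size: int) -> int:
    """One failed probe, then one retransmission at a known-good size."""
    if safe_size > min(hop_mtus):
        raise ValueError("safe_size must fit every hop on the path")
    return 1 + packets_needed(data_bytes, safe_size)


path = [4096, 2048, 1400]  # hop MTUs from the example, in path order
print(sent_using_reported_size(8192, path))    # 13
print(sent_using_safe_size(8192, path, 1400))  # 7
```

The first strategy sends 1 + 2 + 4 + 6 packets, of which only the last 6 arrive; the second sends the lost probe plus 6 packets that all arrive, and leaves further probing to later data.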