Re: Update to draft QUIC DPLPMTUD text i draft-ietf-tsvwg-datagram-plpmtud

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Fri, 15 May 2020 18:27 UTC

From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Mime-Version: 1.0 (1.0)
Subject: Re: Update to draft QUIC DPLPMTUD text i draft-ietf-tsvwg-datagram-plpmtud
Date: Fri, 15 May 2020 19:27:13 +0100
Message-Id: <F34E7194-424A-4A61-A1CC-052D6DB9C2DD@erg.abdn.ac.uk>
References: <5aaaaaa7-92c1-a4bf-921c-08d476b46fdc@huitema.net>
Cc: Magnus Westerlund <magnus.westerlund@ericsson.com>, "martin.h.duke@gmail.com" <martin.h.duke@gmail.com>, "tuexen@fh-muenster.de" <tuexen@fh-muenster.de>, "quic@ietf.org" <quic@ietf.org>, "draft-ietf-tsvwg-datagram-plpmtud.all@ietf.org" <draft-ietf-tsvwg-datagram-plpmtud.all@ietf.org>
In-Reply-To: <5aaaaaa7-92c1-a4bf-921c-08d476b46fdc@huitema.net>
To: Christian Huitema <huitema@huitema.net>
X-Mailer: iPad Mail (17D50)
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/WCCAfQEjHqHqerBodVK6LtW9AkM>

Thanks Christian, 

I think the search algorithm is interesting, and there are probably multiple ways to organise the search. I like your analysis of the cost before probing higher. There is also a need to decide what to probe when black holes are detected and the results are inconsistent, and I don’t know whether the performance of all the algorithms might be highly dependent on path properties. We intentionally tried to avoid going beyond what we were sure about, expecting that if the framework worked and could be deployed in simple ways for robust black hole detection, we could then determine good algorithms.

Actually, I think your thought is really cool, and I know a couple of people who would be really keen to work on the selection of algorithms (including how to deal with really awkward cases). Thinking and measuring could stimulate some useful research. I have huge hopes that we come around to some great algorithm, or set of algorithms, and we might then be able to publish an RFC on that part too.


> On 15 May 2020, at 18:40, Christian Huitema <huitema@huitema.net> wrote:
> 
> 
>> On 5/15/2020 8:56 AM, Magnus Westerlund wrote:
>> Hi,
>> 
>> So what Gorry says appears correct. It is a constant, but in the worst case its
>> value could be not only interface dependent but actually path dependent, because
>> the lower layers may add headers in specific cases.
>> 
>> Although use of IPv6 extension headers is not problem free, there might be cases
>> where such a header is added for some paths but not others. Thus, specifying the
>> MIN_PLPMTU correctly to be stable across interfaces and paths could be
>> problematic. But for QUIC, as Gorry says, the stake is already in the ground, and
>> it is in fact reasonable, and maybe more precise, to say that a default
>> MIN_PLPMTU of 1200 bytes can be used, as we don't expect QUIC to work over paths
>> that don't support this.
> 
> Quic servers will only accept an Initial packet if it is at least 1200
> bytes long. This pretty much enforces a min MTU requirement of 1200
> bytes. Quic will not work on a network path that does not support that.
> 
>> We might get some interesting questions on the padded handshake packets when
>> some want to run QUIC as an IoT transport over radios such as those the LPWAN WG
>> discusses, where it is very costly to send 1200-byte packets. The radio blocks
>> are more in the range of 32-64 bytes if I remember correctly.
> 
> It is pretty hard to design an end-to-end transport that works well with
> a very small end-to-end MTU. There are just too many trade-offs. For
> example, the 1200 minimum MTU in Quic was chosen to mitigate DDOS
> attacks. That's the kind of trade-off mandated by the requirement to
> work end-to-end over the Internet. I assume that IOT devices with
> limited networking capability will also have limited connectivity, for
> example to a local relay.
> 
>> 
>> So I think this discussion suggests that we should actually move Section
>> 6.3 from draft-ietf-tsvwg-datagram-plpmtud into draft-ietf-quic-transport so
>> that the details can still be polished.
> 
> I am comparing the specification in the paper with my early
> implementation of PMTU discovery in Quic (Picoquic). They are pretty
> much aligned, except for one important point regarding the sending of
> probes. As the draft says, the Picoquic implementation will send probes
> to discover the Path MTU, "by selecting step sizes from a table of
> common PMTU sizes". But there is an additional step in the Picoquic
> algorithm, a decision to probe or not as a trade-off between potential
> gains in efficiency and potential costs in probing.
> 
> End-to-end probes consume resources. Sending a 1500-byte probe consumes
> about the same resources as sending 1500 bytes of data using the already
> discovered packet size. Nodes send such probes because they expect a
> gain in efficiency. If they discover that MTU size is 1500 bytes instead
> of 1200 bytes, they can send a 1.5 MB file in 1000 packets instead of
> 1250. There will be 20% fewer packet headers, 20% less packet
> processing, 20% fewer packet acknowledgements. That seems "worth the
> try". But these gains depend on how many bytes the nodes expect to send
> during the course of the connection.
> 
> Suppose now that instead of 1.5MB of data the node only anticipates
> sending 15KB. A successful probe means sending 10 packets of 1500 bytes
> instead of 13 packets of 1200 bytes. The typical overhead per Quic
> packet would be about 60 bytes, between link header, IPv6 header, UDP
> header and Quic header. Sending larger packets will save 180 bytes in
> packet overhead, which is less than the 1500 bytes of sending a probe.
> The node will achieve better throughput by not bothering with PMTU probes.
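
The two transfer sizes above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the quoted text: the function names are made up for illustration, the 60-byte per-packet overhead is the rough figure quoted above, and, as in the text, the packet size is treated as the amount of data carried per packet.

```python
def packets_needed(data_bytes: int, packet_size: int) -> int:
    """Number of packets needed to carry data_bytes (integer ceiling division)."""
    return (data_bytes + packet_size - 1) // packet_size


def overhead_saved(data_bytes: int, current_size: int = 1200,
                   probed_size: int = 1500, overhead_per_packet: int = 60) -> int:
    """Header bytes saved by using probed_size instead of current_size.

    overhead_per_packet is the assumed ~60 bytes of link, IPv6, UDP and
    Quic headers mentioned in the text.
    """
    fewer_packets = (packets_needed(data_bytes, current_size)
                     - packets_needed(data_bytes, probed_size))
    return fewer_packets * overhead_per_packet


PROBE_COST = 1500  # bytes spent on one 1500-byte probe

for data_bytes in (15_000, 1_500_000):
    saved = overhead_saved(data_bytes)
    verdict = "probe" if saved > PROBE_COST else "do not probe"
    print(f"{data_bytes} bytes to send: {saved} bytes saved -> {verdict}")
```

For 15 KB this saves 180 bytes, less than the 1500-byte probe. For 1.5 MB it saves 15000 bytes, so the probe pays for itself.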
> 
> Of course, the numbers in the examples above are arbitrary. But there is
> a principle there: at a given time, nodes send probes because they
> anticipate a gain in efficiency over the remainder of the connection, if
> the probe is successful. This gives us a simple test: only send probes
> if the anticipated gain is larger than the cost of sending the probe:
> 
>     if (anticipated_gain * probability_of_success > cost_of_probe)
> 
>         then: send probe
> 
>         else: keep current MTU
> 
> That inequality informs the decision to send a probe or not. For
> example, suppose that discovery so far has assessed that the PMTU is at
> least 1400 bytes but lower than 1500. The table of step sizes says that
> the most likely value between 1400 and 1500 is 1430. Is it a good idea
> to try 1430? Maybe, but the efficiency gain will be just over 2%. In
> case of success, the cost of probing will only be amortized after
> sending 50 packets. If the probability of success is about 50%, the node
> should only send the probe if it anticipates sending at least 100 packets.
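
The break-even point in the 1400 to 1430 example can be sketched as below. The model is an assumption made for illustration: each packet sent at the larger size carries (probed size minus current size) extra bytes, the probe costs its own size in bytes, and the expected gain is scaled by the probability of success. The function name is hypothetical.

```python
import math


def break_even_packets(current_size: int, probed_size: int,
                       probability_of_success: float = 1.0) -> int:
    """Packets that must follow a probe before its expected gain covers its cost."""
    gain_per_packet = probed_size - current_size  # extra bytes carried per packet
    if gain_per_packet <= 0:
        raise ValueError("probed_size must be larger than current_size")
    if not 0.0 < probability_of_success <= 1.0:
        raise ValueError("probability_of_success must be in (0, 1]")
    # Cost of the probe divided by the expected gain per subsequent packet.
    return math.ceil(probed_size / (gain_per_packet * probability_of_success))


print(break_even_packets(1400, 1430))        # probe certain to succeed
print(break_even_packets(1400, 1430, 0.5))   # 50% chance of success
```

Under these assumptions the results are 48 and 96 packets, in line with the round figures of 50 and 100 used above.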
> 
> The inequality also informs the decision of which probe to send. For
> example, assume that the discovery so far has assessed that PMTU is at
> least 1200 bytes. Will the node probe for 1500 bytes or 1400 bytes? To
> make the decision, the node can assess the potential gains and the
> probability of success. 1400 bytes might be a safer bet if the node
> anticipates just sending a small amount of data, 1500 would provide a
> higher gain if the node anticipates sending a large amount. Trying 8K
> might be even more attractive if both peers support that size, as the
> lower probability of success is compensated by a much higher gain.
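
One way to picture the choice between candidate sizes is to compute the expected net gain of each and pick the largest positive one. Everything numeric below is an assumption for illustration: the success probabilities are invented, not measured, the gain is counted as per-packet overhead saved (60 bytes per packet, the rough figure quoted earlier), and the function names are hypothetical.

```python
from typing import Optional

# Hypothetical prior probabilities that a probe of each size succeeds.
ASSUMED_SUCCESS_PROBABILITY = {1400: 0.95, 1500: 0.7, 8192: 0.2}


def packets_needed(data_bytes: int, packet_size: int) -> int:
    """Integer ceiling division."""
    return (data_bytes + packet_size - 1) // packet_size


def expected_net_gain(data_bytes: int, current_size: int, candidate: int,
                      probability: float, overhead_per_packet: int = 60) -> float:
    """probability * anticipated gain - cost of probe, all in bytes."""
    fewer_packets = (packets_needed(data_bytes, current_size)
                     - packets_needed(data_bytes, candidate))
    return probability * fewer_packets * overhead_per_packet - candidate


def choose_probe_size(data_bytes: int, current_size: int = 1200) -> Optional[int]:
    """Return the candidate with the best positive expected net gain, or None."""
    best_size, best_gain = None, 0.0
    for candidate, probability in ASSUMED_SUCCESS_PROBABILITY.items():
        gain = expected_net_gain(data_bytes, current_size, candidate, probability)
        if gain > best_gain:
            best_size, best_gain = candidate, gain
    return best_size


for data_bytes in (15_000, 250_000, 1_000_000, 100_000_000):
    print(data_bytes, choose_probe_size(data_bytes))
```

With these made-up probabilities the sketch skips probing for 15 KB, prefers 1400 for 250 KB, 1500 for 1 MB, and 8192 for 100 MB, which matches the qualitative argument above. Different probabilities would move the crossover points.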
> 
> The inequality also informs the decision on whether to repeat a probe.
> Suppose that a node tried a 1500-byte probe and failed. There are two
> explanations: either the path MTU is lower than 1500 bytes, or the probe
> was lost for other reasons. Applying Bayesian logic, the node can
> reevaluate the probability of success of a 1500-byte probe. It may well
> be that sending a 1400-byte probe will now be more interesting than
> trying 1500 bytes again.
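
The Bayesian re-evaluation after a lost probe could look like the sketch below. It assumes a probe is always lost when the path does not support its size, and is lost with some random loss rate when the path does support it. The prior and the loss rate are illustrative inputs, and the function name is hypothetical.

```python
def posterior_after_lost_probe(prior_supported: float, loss_rate: float) -> float:
    """P(path supports the probed size | probe was lost), by Bayes' rule.

    prior_supported: belief, before probing, that the path supports the size.
    loss_rate: assumed probability of losing a packet for unrelated reasons.
    """
    lost_and_supported = prior_supported * loss_rate        # lost by bad luck
    lost_and_unsupported = (1.0 - prior_supported) * 1.0    # always lost
    total = lost_and_supported + lost_and_unsupported
    if total == 0.0:
        raise ValueError("a lost probe is impossible under these inputs")
    return lost_and_supported / total


# Example: 70% prior belief that 1500 bytes works, 2% random loss.
print(round(posterior_after_lost_probe(0.7, 0.02), 4))
```

With a 70% prior and 2% random loss, one lost probe drops the belief that 1500 bytes works to roughly 4.5%, which is why a 1400-byte probe may become the better next step.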
> 
> It may be that this discussion is too much detail for the plpmtud draft.
> I could of course submit a draft discussing the probabilistic approach
> to PMTUD. But it still might be a good idea to add a general
> consideration in the description of the algorithm, asking implementations
> to take into account probability of success and expected gains before
> sending a probe.
> 
> -- Christian Huitema
> 
>