Re: Update to draft QUIC DPLPMTUD text i draft-ietf-tsvwg-datagram-plpmtud

Timo Völker <timo.voelker@fh-muenster.de> Tue, 26 May 2020 12:32 UTC

From: Timo Völker <timo.voelker@fh-muenster.de>
To: Christian Huitema <huitema@huitema.net>
CC: Magnus Westerlund <magnus.westerlund@ericsson.com>, "gorry@erg.abdn.ac.uk" <gorry@erg.abdn.ac.uk>, "martin.h.duke@gmail.com" <martin.h.duke@gmail.com>, Michael Tuexen <tuexen@fh-muenster.de>, "quic@ietf.org" <quic@ietf.org>, "draft-ietf-tsvwg-datagram-plpmtud.all@ietf.org" <draft-ietf-tsvwg-datagram-plpmtud.all@ietf.org>
Subject: Re: Update to draft QUIC DPLPMTUD text i draft-ietf-tsvwg-datagram-plpmtud
Date: Tue, 26 May 2020 12:32:43 +0000
Message-ID: <DA4E9B23-9A52-44F7-841B-13F5EFA4DC46@fh-muenster.de>
References: <HE1PR0702MB3772606AA3C6D808992C5DF595BE0@HE1PR0702MB3772.eurprd07.prod.outlook.com> <CAM4esxRZBehUOHpEg6E_p7bvY8CpK4oya7rfv4JVTkmAKfQ7jw@mail.gmail.com> <2643A40F-F0DC-45FF-A780-975D5568BE5B@fh-muenster.de> <03b92d83-b3f9-2fb3-c21c-5bf3fda767cb@erg.abdn.ac.uk> <425504a704c5b5ee6ea71fd23e9490cf8d79251a.camel@ericsson.com> <5aaaaaa7-92c1-a4bf-921c-08d476b46fdc@huitema.net>
In-Reply-To: <5aaaaaa7-92c1-a4bf-921c-08d476b46fdc@huitema.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/Qv2VvnF7SWyVomHEUSoJMLQb6CA>

Hi Christian,

thanks for sharing your experience, especially the inequality.

I wonder how to determine the values.

anticipated_gain
It is easy if we know how much data we are going to send over a connection, but how can QUIC figure that out in advance?
Also, a discovered PMTU could be reused in the next connection to the same destination. How can we anticipate whether and how often we will reconnect in the short term?

probability_of_success
I think measurements can tell (or have told?) us.

cost-of_probe
What do you mean by cost? The extra load on the network or the consequences for the endpoints?
It seems to me that the cost for a sender is quite high in QUIC. If we follow the spec, a sent probe packet consumes cwnd and, if it gets lost (for whatever reason), it reduces the (New Reno) cwnd, right? Did you include exceptions for probe packets?

Timo

> On 15. May 2020, at 19:31, Christian Huitema <huitema@huitema.net> wrote:
> 
> 
> On 5/15/2020 8:56 AM, Magnus Westerlund wrote:
>> Hi,
>> 
>> So what Gorry says appears correct. It is a constant, but its value could in
>> the worst case depend not only on the interface but also on the path. The
>> reason is that the lower layers may add headers in specific cases.
>> 
>> Although the use of IPv6 extension headers is not problem free, there might be
>> cases where such a header is added for some paths but not others. Thus, specifying
>> the MIN_PLPMTU correctly to be stable across interfaces and paths could be
>> problematic. But for QUIC, as Gorry says, the stake is already in the ground, and
>> it is in fact reasonable, and perhaps more precise, to say that a default
>> MIN_PLPMTU of 1200 bytes can be used, as we don't expect QUIC to work over paths
>> that don't support this.
> 
> Quic servers will only accept an Initial packet if it is at least 1200
> bytes long. This pretty much enforces a min MTU requirement of 1200
> bytes. Quic will not work on a network path that does not support that.
> 
>> We might get some interesting questions on the padded handshake packets when
>> some want to run QUIC as an IoT transport over radios such as those the LPWAN WG
>> discusses, where it is very costly to send 1200-byte packets. The radio blocks
>> are more in the range of 32-64 bytes if I remember correctly.
> 
> It is pretty hard to design an end-to-end transport that works well with
> a very small end-to-end MTU. There are just too many trade-offs. For
> example, the 1200 minimum MTU in Quic was chosen to mitigate DDOS
> attacks. That's the kind of trade-off mandated by the requirement to
> work end-to-end over the Internet. I assume that IOT devices with
> limited networking capability will also have limited connectivity, for
> example to a local relay.
> 
>> 
>> So I think this discussion suggests that we should actually move Section
>> 6.3 from draft-ietf-tsvwg-datagram-plpmtud into draft-ietf-quic-transport so
>> that the details can still be polished.
> 
> I am comparing the specification in the paper with my early
> implementation of PMTU discovery in Quic (Picoquic). They are pretty
> much aligned, except for one important point regarding the sending of
> probes. As the draft says, the Picoquic implementation will send probes
> to discover the Path MTU, "by selecting step sizes from a table of
> common PMTU sizes". But there is an additional step in the Picoquic
> algorithm, a decision to probe or not as a trade-off between potential
> gains in efficiency and potential costs in probing.
> 
> End-to-end probes consume resources. Sending a 1500 bytes probe consumes
> about the same resource as sending 1500 bytes of data using the already
> discovered packet size. Nodes send such probes because they expect a
> gain in efficiency. If they discover that MTU size is 1500 bytes instead
> of 1200 bytes, they can send a 1.5 MB file in 1000 packets instead of
> 1250. There will be 20% fewer packet headers, 20% less packet
> processing, 20% fewer packet acknowledgements. That seems "worth the
> try". But these gains depend on how many bytes the nodes expect to send
> during the course of the connection.
> 
> Suppose now that instead of 1.5MB of data the node only anticipates
> sending 15KB. Successful probe means sending 10 packets of 1500 bytes
> instead of 13 packets of 1200 bytes. The typical overhead per Quic
> packet would be about 60 bytes, between link header, IPv6 header, UDP
> header and Quic header. Sending larger packets will save 180 bytes in
> packet overhead, which is less than the 1500 bytes of sending a probe.
> The node will achieve better throughput by not bothering with PMTU probes.
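
The arithmetic in the two examples above can be checked with a short sketch. It follows the simplification used in the text: the packet size is treated as the amount of data carried per packet, units are decimal (1.5 MB = 1,500,000 bytes), and the per-packet overhead is the rough 60-byte figure quoted above. The function names are made up for illustration and are not taken from Picoquic.

```python
def packets_needed(total_bytes: int, packet_size: int) -> int:
    """Packets required to carry total_bytes, rounding up (integer ceiling)."""
    return (total_bytes + packet_size - 1) // packet_size


def overhead_saved(total_bytes: int, small: int, large: int,
                   per_packet_overhead: int = 60) -> int:
    """Header bytes saved by sending `large` packets instead of `small` ones.

    per_packet_overhead=60 is the approximate figure from the text
    (link + IPv6 + UDP + QUIC headers), not a measured value.
    """
    saved_packets = packets_needed(total_bytes, small) - packets_needed(total_bytes, large)
    return saved_packets * per_packet_overhead


PROBE_COST = 1500  # bytes spent on one 1500-byte probe

for total in (1_500_000, 15_000):
    saved = overhead_saved(total, 1200, 1500)
    print(total, "bytes to send:", saved, "bytes of overhead saved,",
          "probe pays off" if saved > PROBE_COST else "probe does not pay off")
```

Under these assumptions the large transfer saves 250 packets' worth of overhead, far more than one probe costs, while the small transfer saves only 3 packets' worth.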
> 
> Of course, the numbers in the examples above are arbitrary. But there is
> a principle there: at a given time, nodes send probes because they
> anticipate a gain in efficiency over the remainder of the connection, if
> the probe is successful. This gives us a simple test: only send probes
> if the anticipated gain is larger than the cost of sending the probe:
> 
>     if (anticipated_gain * probability_of_success > cost-of_probe)
> 
>         then: send probe
> 
>         else: keep current MTU
> 
> That inequality informs the decision to send a probe or not. For
> example, suppose that discovery so far has assessed that the PMTU is at
> least 1400 bytes but lower than 1500. The table of step sizes says that
> the most likely value between 1400 and 1500 is 1430. Is it a good idea
> to try 1430? Maybe, but the efficiency gain will be just over 2%. In
> case of success, the cost of probing will only be amortized after
> sending 50 packets. If the probability of success is about 50%, the node
> should only send the probe if it anticipates sending at least 100 packets.
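
One way to turn the inequality into code is sketched below. The text does not say in which unit gain and cost are measured, so this sketch assumes both are counted in bytes: the gain is the extra data each remaining packet could carry, and the cost is the size of the probe itself. With those assumptions the 1400 to 1430 example comes out close to the "50 packets" and "100 packets" figures quoted above. All names are illustrative.

```python
def anticipated_gain(packets_remaining: int, current_mtu: int, probe_mtu: int) -> int:
    """Extra bytes carried over the rest of the connection if the probe succeeds.

    Assumption: every remaining packet would be filled to the new size.
    """
    return packets_remaining * (probe_mtu - current_mtu)


def should_probe(gain: float, probability_of_success: float, cost_of_probe: float) -> bool:
    """The test from the text: probe only if the expected gain exceeds the cost."""
    return gain * probability_of_success > cost_of_probe


def break_even_packets(current_mtu: int, probe_mtu: int, probability_of_success: float) -> int:
    """Smallest number of remaining packets for which should_probe() is true."""
    packets = 1
    while not should_probe(anticipated_gain(packets, current_mtu, probe_mtu),
                           probability_of_success, probe_mtu):
        packets += 1
    return packets


print(break_even_packets(1400, 1430, 1.0))  # certain success
print(break_even_packets(1400, 1430, 0.5))  # 50% chance of success
```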
> 
> The inequality also informs the decision of which probe to send. For
> example, assume that the discovery so far has assessed that PMTU is at
> least 1200 bytes. Will the node probe for 1500 bytes or 1400 bytes? To
> make the decision, the node can assess the potential gains and the
> probability of success. 1400 bytes might be a safer bet if the node
> anticipates just sending a small amount of data, 1500 would provide a
> higher gain if the node anticipates sending a large amount. Trying 8K
> might be even more attractive if both peers support that size, as the
> lower probability of success is compensated by a much higher gain.
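
The choice between candidate probe sizes can be sketched the same way, by picking the size with the highest expected net gain. The success probabilities below are invented purely for illustration (the text gives none), and gain and cost are again assumed to be counted in bytes.

```python
from typing import Dict, Optional


def best_probe(current_mtu: int, packets_remaining: int,
               candidates: Dict[int, float]) -> Optional[int]:
    """Return the probe size with the largest positive expected net gain.

    candidates maps probe size -> assumed probability of success.
    Returns None when no candidate is expected to pay for itself.
    """
    best_size, best_net = None, 0.0
    for size, p_success in candidates.items():
        gain = packets_remaining * (size - current_mtu)  # bytes gained on success
        net = gain * p_success - size                    # expected gain minus probe cost
        if net > best_net:
            best_size, best_net = size, net
    return best_size


# Hypothetical probabilities: 1400 is the safer bet, 1500 the bigger prize.
candidates = {1400: 0.95, 1500: 0.65}

print(best_probe(1200, 15, candidates))    # short connection
print(best_probe(1200, 1000, candidates))  # long connection
print(best_probe(1200, 2, candidates))     # too little data left to probe at all
```

With these made-up numbers the short connection favors 1400, the long one favors 1500, and a connection with almost nothing left to send skips probing.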
> 
> The inequality also informs the decision on whether to repeat a probe.
> Suppose that a node tried a 1500 bytes probe and failed. There are two
> explanations: either the path MTU is lower than 1500 bytes, or the probe
> was lost for other reasons. Applying Bayesian logic, the node can
> reevaluate the probability of success of a 1500 bytes probe. It may well
> be that sending a 1400 bytes probe will now be more interesting than
> trying 1500 bytes again.
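
The Bayesian re-evaluation can be sketched as follows. The model is an assumption, not something stated in the text: a probe always fails if the path does not support its size, and otherwise fails only through unrelated loss at a fixed rate. The prior and loss rate are example values.

```python
def posterior_path_supports(prior: float, loss_rate: float) -> float:
    """P(path supports the size | one probe of that size was lost), by Bayes' rule.

    prior     -- belief that the path supports the size before probing
    loss_rate -- probability that a packet is lost for unrelated reasons
    """
    lost_and_supported = prior * loss_rate   # path is fine, probe was unlucky
    lost_and_unsupported = (1.0 - prior)     # path too small, probe always lost
    return lost_and_supported / (lost_and_supported + lost_and_unsupported)


def retry_success_probability(prior: float, loss_rate: float) -> float:
    """Chance that a repeated probe of the same size gets through."""
    return posterior_path_supports(prior, loss_rate) * (1.0 - loss_rate)


# Example values only: 65% prior belief in a 1500-byte path, 2% random loss.
print(posterior_path_supports(0.65, 0.02))
print(retry_success_probability(0.65, 0.02))
```

On a path with little random loss, a single lost probe drops the estimate sharply, which is why a smaller probe may become the better next step. On a very lossy path the same failure says much less.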
> 
> It may be that this discussion is too much detail for the plpmtud draft.
> I could of course submit a draft discussing the probabilistic approach
> to PMTUD. But it still might be a good idea to add a general
> consideration in the description of the algorithm, asking implementations
> to take into account probability of success and expected gains before
> sending a probe.
> 
> -- Christian Huitema
> 
> 
>