Re: [tram] I-D Action: draft-ietf-tram-stun-pmtud-20.txt

Marc Petit-Huguenin <> Sun, 29 August 2021 15:40 UTC

Hi Magnus,

As I said in my previous email, I am working on an analysis of RFC 8899.  Attached is a picture of a complete model of RFC 8899 as a Timed Coloured Petri Net, shown here after 10 (virtual) hours of simulation.  I learned a few things preparing that model, which can be summarized as: RFC 8899 is a bad description of an OK algorithm.

It is a bad description because a lot of information is missing from it, information that I had to painstakingly infer from clues spread all over the document.  In some cases I had to make stuff up; that's particularly visible around the Complete and Error states.

It is an OK algorithm, not a great one, and there is room for improvement.  I even took some notes on a parallel version of that algorithm that would make it acceptable as a basis for draft-ietf-tram-stun-pmtud.  Remember that draft-ietf-tram-stun-pmtud started 13 years ago and, in spite of all the efforts of the transport people to make it as irrelevant as possible, was always meant to implement whatever algorithm the implementer wanted under the constraints of RFC 4821.  I always felt that RFC 8899 was a step back from that ideal.  Now, with that model, not only do I know that it is, but I know precisely why and what can be done to fix it.

Note that the work is not done as I still have to complete the validation of the model and go through every single line of RFC 8899 to find the deviations between it and the reality of a provably working model.  But the question now is what to do with that knowledge.

One possibility is to submit an I-D that summarizes all the issues I found and proposes some fixes, and then continue to ignore RFC 8899 until someone publishes an RFC 8899bis that applies them.

Another is to simply transcribe the better state machine I have in mind directly in draft-ietf-tram-stun-pmtud -- which will be a far easier task when doing it from a model like the one I built.

Finally, the simplest is to remove any reference to RFC 8899 from draft-ietf-tram-stun-pmtud altogether and continue the publication as is.

Building that model probably took 6 days (half of that spent trying to make sense of the RFC 8899 text) spread over a month, which is not a small thing, especially when not paid to do it, so my personal preference is the second option.

Anyway, thank you for your reviews and efforts to progress that draft.


On 8/26/21 5:20 AM, Magnus Westerlund wrote:
> Hi Marc and TRAM WG,
> My intention was definitely not to bully you. And we know that RFC 8899 has
> some issues. These issues are due to the problem of how to write a
> straightforward algorithm that handles all the corner cases the Internet will
> throw at the algorithm. RFC 8899 is weak in those areas where the STUN PMTUD
> ID is even weaker, namely the description of how one uses the mechanism and
> arrives at a result for the supported Path MTU that is really reliable and
> can handle dynamic changes quickly and precisely.
> So STUN PMTUD includes a novel algorithm for running multi packet sampling which
> clearly can improve the consistency of the results. However, there is no
> algorithm specifying your probe construction for this, how you convert
> these measurements into a result, and how you deal with the results that
> sampling provides.
> So if you want to make this the better solution, then please can you actually
> write up how to use the protocol mechanism in a way that provides better
> results, or at least similar results, for those applications where STUN PMTUD
> style probing works better and has less impact on the application. I
> definitely see how RFC 8899 works better for fully ACKed protocols like TCP,
> SCTP and QUIC in comparison to RTP/UDP for example.
> My push for making this into a probe specification for RFC 8899 has been
> motivated by getting this work done during my term as AD. However, it clearly
> wasn't, and the progress was very slow. Making the document into an RFC 8899
> usage appears to me to simplify the document as a mechanism that would work in
> the context of RTP and other STUN-using protocols; however, some more advanced
> features would clearly be lost. It would nevertheless mean a streamlining of
> the document and avoid several areas where the current document is lacking in
> specification, namely the algorithm to arrive at a result and how to actually
> send probes in a way that is network safe and not a significant risk of being
> used as a denial of service tool.
> I don't any longer have any stakes in this as I am no longer the AD. I just
> wanted to make clear what my position was at the end of the AD term. Going
> forward I would review this work solely from this perspective: is it providing
> something complementary to RFC 8899 and defined usages of that algorithm for
> other protocols, is it functioning, safe to deploy, and can it be implemented
> from this specification with less implementor tuning or configuration than RFC
> 8899.
> I hope that clarifies my view and what I see as the considerations here,
> enabling you and the WG to decide in what direction you want to take this work.
> Cheers
> Magnus
> On Sun, 2021-08-08 at 13:40 -0700, Marc Petit-Huguenin wrote:
>> Hi Magnus,
>> Sorry for the delay -- an unexpected issue restricted my time in front of a
>> computer until a few days ago.
>> I think that your most important point is that you (and others before) want
>> this draft to follow RFC 8899.  I am starting to feel bullied into accepting
>> that RFC 8899 is the only way to do DPLPMTUD which, as explained multiple
>> times, it is not.
>> These explanations have been conveniently ignored, so I decided to prove what
>> I think are the weaknesses of RFC 8899 by doing a formal analysis of that
>> protocol.  If I can prove that then I will publish my findings and continue
>> using a better algorithm than RFC 8899.  If I can't then I'll do the
>> modifications requested.
>> Now a formal analysis is not a small task so it will take some time to do that
>> especially as my plate is already quite full until the end of the year.  But I
>> should be able to spare half a day each week for that, so expect periodic
>> updates here.
>> Thanks.
>> On 7/18/21 7:16 AM, Marc Petit-Huguenin wrote:
>>> Hi Magnus,
>>> Thank you very much for that review.
>>> I'll work on a response and an update to the draft in the next two weeks,
>>> probably to be uploaded after the IETF meeting.
>>> On 7/14/21 8:08 AM, Magnus Westerlund wrote:
>>>> Hi,
>>>> I have reviewed -20 and have the following feedback.
>>>> Some of the issues have been resolved. However, my individual conclusion is
>>>> that this document would be shorter and have fewer issues if only the simple
>>>> probing was retained and defined as an RFC 8899 PL solution for UDP based
>>>> application protocols that want DPLPMTUD and do not use protocol internal
>>>> methods. This could shorten section 4 significantly. I think the alternative
>>>> is to simply declare the document dead.
>>>> However, I would take a look at the examples of RFC 8899 PL definitions that
>>>> exist in RFC 8899, RFC 9000 (QUIC) and for UDP Options
>>>> (
>>>> /) and rewrite Section 4 based on that and only keep the simple method.
>>>> Then rewrite intro to adjust it as RFC 8899 based, and also clarify the STUN
>>>> server on the port for the whole session aspect and the demultiplexing need.
>>>> Then section 7 also can be deleted. It would be good to have a clearer rate
>>>> limiting specification for the probe packets in section 4, as the STUN
>>>> retransmission timer gives exponential back-off, and the ICE is not really
>>>> applicable here. The probe sending implementation will have an RTT estimate
>>>> when some response has been received. Based on that one can limit the probes
>>>> to be sent only every n*RTT, possibly with a MAX(n*RTT, Minimal_Interval).
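As an illustration of the rate limit suggested in the review text above, the following Python sketch spaces probes by MAX(n*RTT, Minimal_Interval). The names min_probe_interval and ProbePacer, and the default values n=2 and a 1 second minimal interval, are assumptions made for this example; neither the draft nor RFC 8899 defines them.

```python
def min_probe_interval(rtt, n=2, minimal_interval=1.0):
    """Minimum spacing between probes, in seconds: MAX(n*RTT, Minimal_Interval).

    n and minimal_interval are illustrative defaults, not values from any spec.
    """
    if rtt < 0 or n <= 0 or minimal_interval < 0:
        raise ValueError("rtt and minimal_interval must be >= 0, n must be > 0")
    return max(n * rtt, minimal_interval)


class ProbePacer:
    """Hypothetical helper that allows at most one probe per interval."""

    def __init__(self, n=2, minimal_interval=1.0):
        self.n = n
        self.minimal_interval = minimal_interval
        self._last_sent = None  # time of the last probe, None before the first

    def may_send(self, now, rtt):
        """Return True if a probe may be sent at time `now` given the RTT estimate."""
        if self._last_sent is None:
            return True
        interval = min_probe_interval(rtt, self.n, self.minimal_interval)
        return now - self._last_sent >= interval

    def record_sent(self, now):
        """Remember when the last probe was sent."""
        self._last_sent = now
```

With a small RTT the minimal interval dominates, so a short path cannot turn the prober into a high-rate sender; with a large RTT the n*RTT term dominates.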
>>>> The main reason I write this is that I think RFC 8899 has resolved some
>>>> corner cases that could cause issues in a naïve implementation that thinks
>>>> that just getting the probe through means that one should immediately update
>>>> the MTU. And as RFC 8899 is improved this usage could also get that
>>>> improvement without a need for update.
>>>> In addition, as can be seen, there are some unclarities that I think make
>>>> implementation challenging from the current spec.
>>>> Section 1:
>>>>      The Packetization Layer Path MTU Discovery (PMTUD) specification
>>>>      [RFC4821] describes a method to discover the Path MTU, but does not
>>>>      describe a practical protocol to do so with UDP.  Many application
>>>>      layer protocols based on the transport layer protocol UDP do not
>>>>      implement the Path MTU discovery mechanism described in [RFC4821].
>>>> Wouldn't it be better to rewrite this in relation to RFC 8899? Which doesn't
>>>> have a PL definition for UDP, and the only "competing" proposal is based on
>>>> UDP Options, which has no real deployment yet and requires OS level changes.
>>>> Section 1:
>>>>      These application layer protocols can make use of the probing
>>>>      mechanisms described in this document instead of designing their own
>>>>      adhoc extension.  These probing mechanisms are implemented with
>>>>      Session Traversal Utilities for NAT (STUN), but their usage is not
>>>>      limited to STUN-based protocols.
>>>> Yes, UDP based protocols that previously haven't been using STUN can use
>>>> this mechanism, but they need to be compatible and accept the multiplexing
>>>> solution that is implied. I think that could be clarified. They also need
>>>> the STUN Server to be deployed in the peer endpoint, which could be made
>>>> clearer. It is a given for any ICE supporting application, but I find no
>>>> reference to this. I would also note that ICE (RFC 8445), once it concludes,
>>>> allows the peers to stop responding to STUN requests, thus this method needs
>>>> to be clear that the STUN Server needs to be maintained during the whole
>>>> session lifetime to enable DPLPMTUD.
>>>> Section 1:
>>>>      Complementary techniques can be used to discover additional network
>>>>      characteristics, such as the network path (using the STUN Traceroute
>>>>      mechanism described in [I-D.martinsen-tram-stuntrace]) and bandwidth
>>>>      availability (using the mechanism described in
>>>>      [I-D.martinsen-tram-turnbandwidthprobe]).
>>>> Is this text really relevant as written? Neither of these individual
>>>> proposals has been updated since 2015.
>>>> Section 2:
>>>> When a
>>>>      probe succeeds with a larger size than the current PMTU, the PMTU is
>>>>      increased.
>>>> There is a point to verification. If one looks at the search algorithm
>>>> behavior, it updates its probe sizes, but it doesn't conclude its searching
>>>> nor update the upper layer's MTU until the search has concluded. There are
>>>> good reasons for doing it this way and thus I would suggest reformulation.
>>>> Section 4.1:
>>>>     The Simple Probing Mechanism uses only STUN Requests/Responses, which
>>>>      are subject to the congestion control mechanism in [RFC8489] section
>>>>      6.2.1.  The default Rc and Rm values may be defined differently for a
>>>>      combination of the Simple Probing Mechanism and the protocol running
>>>>      on the same port.
>>>> I would not call it a congestion control mechanism, rather a retransmission
>>>> timer mechanism. Thus a rate limiting mechanism makes more sense.
>>>> Section 4.1.1:
>>>>      The client adds a PADDING attribute with a length that, when added to
>>>>      the IP and UDP headers and the other STUN components, is equal to the
>>>>      Selected Probe Size, as defined in [RFC4821] Section 7.3.
>>>> Why reference RFC 4821, when RFC 8899 has a simpler and clearer interface
>>>> suitable for datagrams, and where the simple probe is an excellent match to
>>>> just define as a PL for probing?
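The length computation described in the quoted Section 4.1.1 text can be sketched as follows. This is a minimal sketch that assumes an IP header without options or extension headers, plus the 20-byte STUN header, 4-byte attribute header and 4-byte attribute alignment of RFC 8489. The function name padding_value_length and the parameter other_attrs_len (the total size of whatever other attributes the probe carries) are hypothetical.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
IPV6_HEADER = 40   # bytes, assuming no extension headers
UDP_HEADER = 8     # bytes
STUN_HEADER = 20   # bytes (RFC 8489)
ATTR_HEADER = 4    # bytes: 2-byte attribute type + 2-byte attribute length


def padding_value_length(probe_size, other_attrs_len, ipv6=False):
    """Return the PADDING value length that makes the datagram `probe_size` bytes.

    other_attrs_len is the on-the-wire size of all other STUN attributes
    (headers and alignment included); it is an assumption of this sketch.
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    fixed = ip_header + UDP_HEADER + STUN_HEADER + other_attrs_len + ATTR_HEADER
    remaining = probe_size - fixed
    if remaining < 0:
        raise ValueError("probe size is smaller than the fixed overhead")
    if remaining % 4 != 0:
        # STUN attribute values are padded to a 4-byte boundary on the wire,
        # so only such sizes can be matched exactly.
        raise ValueError("probe size cannot be matched exactly")
    return remaining
```

For example, a 1500-byte probe over IPv4 carrying 8 bytes of other attributes leaves 1440 bytes for the PADDING value, and 1420 bytes over IPv6.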
>>>> Section 4.1.3:
>>>>      A client receiving a Probe Response MUST process it as specified in
>>>>      section 6.3.3 of [RFC8489] and MUST ignore the PADDING attribute.  If
>>>>      a response is received this is interpreted as a Probe Success, as
>>>>      defined in [RFC4821] Section 7.6.1.
>>>> More reliance on RFC 4821 rather than RFC 8899.
>>>> RFC 8899 is not perfect but it handles a number of corner cases that can
>>>> occur and should produce more stable results, and fewer updates to the MTU,
>>>> than RFC 4821 does.
>>>> Section 4.2:
>>>>      The Simple Probing Mechanism uses STUN indications, which are not
>>>>      subject to the congestion control mechanism in [RFC8489] section
>>>>      6.2.1.  As it will have to be intricately related to the protocol
>>>>      that runs on the same port, each implementation of the Complete
>>>>      Probing Mechanism in association MUST define the congestion control
>>>>      that will be applied to the STUN Indications.  The default Rc and Rm
>>>>      values for the STUN Requests/Responses may be defined differently for
>>>>      a combination of the Simple Probing Mechanism and the protocol
>>>>      running on the same port.
>>>> Once more a full blown congestion control is not really needed here. The
>>>> point is that the PMTUD probe traffic will be a small fraction of the
>>>> application traffic, or alternatively the application has such a low rate
>>>> that it is extremely unlikely that the probe will overload any network. I
>>>> would note that RFC 8899 does specify some normative statements about this
>>>> in Section 3, Bullet 7.
>>>> In addition this does not really give an implementor a clear answer to what
>>>> they should implement. Some basic rate limiting would be simpler to
>>>> implement.
>>>> My high level comment is that I don't see what the benefits of the complete
>>>> method are compared to running simple probes as the PL probes in the
>>>> algorithm of RFC 8899. The only potential benefit is that one sometimes will
>>>> get an indication of a burst loss across a probe when a prior or following
>>>> application protocol packet as well as the probe is lost, indicating
>>>> potentially having loss for other reasons. I think RFC 8899 probing
>>>> multiple times gives a significantly high probability that congestion or
>>>> random loss of probes will rarely affect the DPLPMTUD results.
>>>> Section 4.2.3:
>>>>      The server creates a Report Response and adds an IDENTIFIERS
>>>>      attribute that contains the chronologically ordered list of all
>>>>      identifiers received so far.  The server MUST add the FINGERPRINT
>>>>      attribute.  The server then sends the response to the client.
>>>> This doesn't discuss what a server should do if the IDENTIFIERS attribute
>>>> does not fit in the packet.
>>>> Section 5.1:
>>>> Based on that the peer needs to keep a STUN server following this spec
>>>> running on the ports being used. Isn't the need for explicit signaling more
>>>> clear? Or is the inclusion of any STUN PMTUD attribute a sufficient
>>>> indication that the peer will not remove its STUN Server when ICE concludes?
>>>> Cheers
>>>> Magnus Westerlund

Marc Petit-Huguenin