Re: [tram] I-D Action: draft-ietf-tram-stun-pmtud-20.txt

Marc Petit-Huguenin <> Wed, 20 October 2021 12:13 UTC

To: Gonzalo Camarillo <>, Magnus Westerlund <>, "" <>, "" <>, Magnus Westerlund <>
Cc: Martin Duke <>

Hi Gonzalo,

I am working on a plan that would remove the blocking point in the draft, but it requires another draft (draft-petithuguenin-formal-fsm is a precursor of that future draft).  Unfortunately we are again victims of the availability curse: I have a window of availability until November to finish this, but my co-author does not.  And starting in December I'll be the one without any time available for the IETF, probably for 6 months.  I am not optimistic about solving that conundrum.

On 10/20/21 12:51 AM, Gonzalo Camarillo wrote:
> Marc, Magnus,
> as the responsible AD for TRAM, Martin is going to close down the WG so that we finish this draft as an AD-sponsored document with me as its shepherd. Please let us know what the next steps are at this point, and your timeline. Thanks.
> Cheers,
> Gonzalo
>> -----Original Message-----
>> From: Marc Petit-Huguenin <>
>> Sent: Tuesday, October 12, 2021 16:24
>> To: Magnus Westerlund <>; draft-ietf-tram-stun-
>> Subject: Re: [tram] I-D Action: draft-ietf-tram-stun-pmtud-20.txt
>> I needed a state machine description language to be able to concisely
>> describe the state machine in RFC 8899 (and others), so I designed one:
>> Questions, comments, and suggestions can be sent in the
>> mailing-list or directly to my email address.  I am also available on
>> Thanks.
>> On 8/30/21 2:19 AM, Magnus Westerlund wrote:
>>> Hi Marc,
>>> I find it great that you have been working on validating the RFC 8899
>>> model. I do wish that this had happened earlier, when your feedback
>>> could more easily have been addressed.
>>> Independently, the current I-D for stun-pmtud does not contain, either
>>> by reference or by explicit definition, methods for how one arrives at
>>> a current estimate of the MTU. Whether the WG wants to define such
>>> methods and algorithms itself, or use RFC 8899 or a replacement, doing
>>> one of these is necessary from my perspective. However, that choice of
>>> path should also be considered carefully. I think improving RFC 8899
>>> would be great. I also think it can be a shorter path to completion.
>>> However, if the WG thinks one really should be able to take care of
>>> the multi-packet reporting functionality of this probing, a new
>>> algorithm or at least adaptations may be necessary.
>>> From my perspective, I think we should strive for collaborations that
>>> make things better. The people interested enough in this problem space
>>> to want to standardize it are not many.
>>> I am, however, still of the opinion that approving just a
>>> specification for a probing format, without some limitations on how it
>>> is applied, is quite negative. First, because if people think this is
>>> a trivial problem and implement an algorithm that is too simple and
>>> works poorly, it does DPLPMTUD a disservice. Second, it is hard to
>>> verify the negative consequences or the potential for security threats
>>> from this format without some bounds on expected usage behaviors.
>>> Cheers
>>> Magnus Westerlund
>>>> -----Original Message-----
>>>> From: Marc Petit-Huguenin <>
>>>> Sent: den 29 augusti 2021 17:41
>>>> To: Magnus Westerlund <>; draft-ietf-
>>>> Subject: Re: I-D Action: draft-ietf-tram-stun-pmtud-20.txt
>>>> Hi Magnus,
>>>> As I said in my previous email, I am working on an analysis of RFC
>>>> 8899.  You can see attached a picture of a complete model of RFC 8899
>>>> as a Timed Coloured Petri Net, here after 10 (virtual) hours of
>>>> simulation.  I learned a few things preparing that model, which can
>>>> be summarized as RFC 8899 being a bad description of an OK algorithm.
>>>> It is a bad description because there is a lot of missing information
>>>> in it, information that I had to painstakingly infer from clues
>>>> spread all over the document.  In some cases I had to make stuff up,
>>>> that's particularly visible around the Complete and Error states.
>>>> This is an OK algorithm, not a great one, and there is room for
>>>> improvement.  I even took some notes on a parallel version of that
>>>> algorithm that would make it acceptable as a basis for
>>>> draft-ietf-tram-stun-pmtud.  Remember that
>>>> draft-ietf-tram-stun-pmtud started 13 years ago and, in spite of all
>>>> the efforts of the transport people to make it as irrelevant as
>>>> possible, was always meant to implement whatever algorithm the
>>>> implementer wanted under the constraints of RFC 4821.  I always felt
>>>> that RFC 8899 was a step back from that ideal; now, with that model,
>>>> not only do I know that it is, but I know precisely why and what can
>>>> be done to fix it.
>>>> Note that the work is not done as I still have to complete the
>>>> validation of the model and go through every single line of RFC 8899
>>>> to find the deviations between it and the reality of a provably
>>>> working model.  But the question now is what to do with that knowledge.
>>>> One possibility is to submit an I-D that summarizes all the issues I
>>>> found, proposes some fixes, and then continue to ignore RFC 8899 until
>>>> someone publishes an RFC8899bis that does that.
>>>> Another is to simply transcribe the better state machine I have in
>>>> mind directly in draft-ietf-tram-stun-pmtud -- which will be a far
>>>> easier task when doing it from a model like the one I built.
>>>> Finally the simplest is to remove altogether any reference to RFC
>>>> 8899 in draft-ietf-tram-stun-pmtud and continue the publication as is.
>>>> Building that model took probably 6 days (half of it trying to make
>>>> sense of the RFC 8899 text) over a month, which is not a small thing
>>>> especially when not paid to do that, so my personal preference is the
>>>> second option.
>>>> Anyway, thank you for your reviews and efforts to progress that draft.
>>>> Thanks.
>>>> On 8/26/21 5:20 AM, Magnus Westerlund wrote:
>>>>> Hi Marc and TRAM WG,
>>>>> My intention was definitely not to bully you. And we know that RFC
>>>>> 8899 has some issues. These issues are due to the problem of how to
>>>>> write a straightforward algorithm that handles all the corner cases
>>>>> the Internet will throw at the algorithm. RFC 8899 is weak in those
>>>>> areas where the STUN PMTUD I-D is even weaker, namely the description
>>>>> of how one arrives at a result for the supported Path MTU that is
>>>>> really reliable and can handle dynamic changes quickly and precisely.
>>>>> So STUN PMTUD includes a novel algorithm for running multi-packet
>>>>> sampling, which clearly can improve the consistency of the results.
>>>>> However, there is no algorithm specifying your probe construction
>>>>> for this, how you convert these measurements into a result, and how
>>>>> you deal with the results that sampling provides.
>>>>> So if you want to make this the better solution, then please can you
>>>>> actually write up how to use the protocol mechanism in a way that
>>>>> provides better results, or at least similar results, for those
>>>>> applications where STUN PMTUD style probing works better and has
>>>>> less impact on the application. I definitely see how RFC
>>>>> 8899 works better for fully ACKed protocols like TCP, SCTP and QUIC
>>>>> in comparison to RTP/UDP, for example.
>>>>> My push for making this into a probe specification for RFC 8899 has
>>>>> been motivated by getting this work done during my term as AD.
>>>>> However, it clearly wasn't, and the progress was very slow. Making the
>>>>> document into an RFC 8899 usage appears to me to simplify the
>>>>> document into a mechanism that would work in the context of RTP and
>>>>> other STUN-using protocols, although some more advanced features
>>>>> would clearly be lost. It would mean a streamlining of the
>>>>> document and avoid several areas where the current document is
>>>>> lacking in specification, namely the algorithm to arrive at a result
>>>>> and how to actually send probes in a way that is network safe and
>>>>> not a significant risk of being used as a denial of service tool.
>>>>> I no longer have any stake in this as I am no longer the AD.
>>>>> I just wanted to make clear what my position was at the end of the
>>>>> AD term. Going forward I would review this work solely from these
>>>>> perspectives: is it providing something complementary to RFC 8899 and
>>>>> defined usages of that algorithm for other protocols, is it
>>>>> functioning, is it safe to deploy, and can it be implemented from this
>>>>> specification with less implementor tuning or configuration than RFC
>>>>> 8899.
>>>>> I hope that clarifies my view and what I see as the considerations
>>>>> here, enabling you and the WG to decide in what direction you want
>>>>> to take this work.
>>>>> Cheers
>>>>> Magnus
>>>>> On Sun, 2021-08-08 at 13:40 -0700, Marc Petit-Huguenin wrote:
>>>>>> Hi Magnus,
>>>>>> Sorry for the delay -- an unexpected issue restricted my time in front
>>>>>> of a computer until a few days ago.
>>>>>> I think that your most important point is that you (and others
>>>>>> before) want this draft to follow RFC 8899.  I am starting to feel
>>>>>> bullied into accepting that RFC 8899 is the only way to do DPLPMTUD
>>>>>> which, as explained multiple times, it is not.
>>>>>> These explanations have been conveniently ignored, so I decided to
>>>>>> prove what I think are the weaknesses of RFC 8899 by doing a formal
>>>>>> analysis of that protocol.  If I can prove that then I will publish
>>>>>> my findings and continue using a better algorithm than RFC 8899.
>>>>>> If I can't then I'll do the modifications requested.
>>>>>> Now a formal analysis is not a small task so it will take some time
>>>>>> to do that especially as my plate is already quite full until the
>>>>>> end of the year.  But I should be able to spare half a day each
>>>>>> week for that, so expect periodic updates here.
>>>>>> Thanks.
>>>>>> On 7/18/21 7:16 AM, Marc Petit-Huguenin wrote:
>>>>>>> Hi Magnus,
>>>>>>> Thank you very much for that review.
>>>>>>> I'll work on a response and an update to the draft in the next two
>>>>>>> weeks, probably to be uploaded after the IETF meeting.
>>>>>>> On 7/14/21 8:08 AM, Magnus Westerlund wrote:
>>>>>>>> Hi,
>>>>>>>> I have reviewed -20 and have the following feedback.
>>>>>>>> Some of the issues have been resolved. However, my individual
>>>>>>>> conclusion is that this document would be shorter and have fewer
>>>>>>>> issues if only the simple probing were retained and defined as an
>>>>>>>> RFC 8899 PL solution for UDP-based application protocols that want
>>>>>>>> DPLPMTUD and do not use protocol-internal methods. This could
>>>>>>>> shorten section 4 significantly. I think the alternative is to
>>>>>>>> simply declare the document dead.
>>>>>>>> However, I would take a look at the examples of RFC 8899 PL
>>>>>>>> definitions that exist in RFC 8899, RFC 9000 (QUIC), and for UDP
>>>>>>>> Options (s-dplpmtud/), and rewrite Section 4 based on those,
>>>>>>>> keeping only the simple method.  Then rewrite the intro to make it
>>>>>>>> RFC 8899 based, and also clarify the aspect of the STUN server
>>>>>>>> being on the port for the whole session and the demultiplexing need.
>>>>>>>> Then section 7 can also be deleted. It would be good to have a
>>>>>>>> clearer rate limiting specification for the probe packets in
>>>>>>>> section 4, as the STUN retransmission timer gives exponential
>>>>>>>> back-off, and the ICE is not really applicable here. The probe
>>>>>>>> sending implementation will have an RTT estimate once some
>>>>>>>> response has been received. Based on that, one can limit the
>>>>>>>> probes to be sent only every n*RTT, possibly with a
>>>>>>>> MAX(n*RTT, Minimal_Interval).
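
The pacing rule suggested above, at most one probe every MAX(n*RTT, Minimal_Interval), can be sketched as follows. This is a minimal illustration, not text from the draft or from RFC 8899: the names `probe_interval` and `ProbePacer`, and the defaults of n=2 and a 1-second minimal interval, are assumptions chosen for the example.

```python
from typing import Optional


def probe_interval(rtt: float, n: int = 2, minimal_interval: float = 1.0) -> float:
    """Minimum spacing between probes, in seconds: MAX(n*RTT, Minimal_Interval).

    n and minimal_interval are illustrative defaults (assumptions), not values
    taken from any specification.
    """
    if rtt < 0 or n < 1 or minimal_interval < 0:
        raise ValueError("rtt and minimal_interval must be >= 0, n must be >= 1")
    return max(n * rtt, minimal_interval)


class ProbePacer:
    """Hypothetical helper that decides whether a new probe may be sent."""

    def __init__(self, n: int = 2, minimal_interval: float = 1.0) -> None:
        self.n = n
        self.minimal_interval = minimal_interval
        self._last_sent: Optional[float] = None  # time of the last probe, if any

    def may_send(self, now: float, rtt: float) -> bool:
        # The first probe is always allowed; later ones wait out the interval.
        if self._last_sent is None:
            return True
        interval = probe_interval(rtt, self.n, self.minimal_interval)
        return now - self._last_sent >= interval

    def record_send(self, now: float) -> None:
        self._last_sent = now
```

With a short RTT the fixed floor dominates, and with a long RTT the n*RTT term does, so the probe rate stays bounded in both cases without needing a full congestion controller.
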
>>>>>>>> The main reason I write this is that I think RFC 8899 has
>>>>>>>> resolved some corner cases that could cause issues in a naïve
>>>>>>>> implementation that thinks that just getting the probe through
>>>>>>>> means that one should immediately update the MTU. And as RFC 8899
>>>>>>>> is improved, this usage could also get that improvement without a
>>>>>>>> need for an update.
>>>>>>>> In addition, as can be seen, there are some points of unclarity that
>>>>>>>> I think make implementation from the current spec challenging.
>>>>>>>> Section 1:
>>>>>>>>        The Packetization Layer Path MTU Discovery (PMTUD) specification
>>>>>>>>        [RFC4821] describes a method to discover the Path MTU, but does not
>>>>>>>>        describe a practical protocol to do so with UDP.  Many application
>>>>>>>>        layer protocols based on the transport layer protocol UDP do not
>>>>>>>>        implement the Path MTU discovery mechanism described in [RFC4821].
>>>>>>>> Wouldn't it be better to rewrite this in relation to RFC 8899?
>>>>>>>> It doesn't have a PL definition for UDP, and the only
>>>>>>>> "competing" proposal is based on UDP Options, which has no real
>>>>>>>> deployment yet and requires OS-level changes.
>>>>>>>> Section 1:
>>>>>>>>        These application layer protocols can make use of the probing
>>>>>>>>        mechanisms described in this document instead of designing their own
>>>>>>>>        adhoc extension.  These probing mechanisms are implemented with
>>>>>>>>        Session Traversal Utilities for NAT (STUN), but their usage is not
>>>>>>>>        limited to STUN-based protocols.
>>>>>>>> Yes, UDP-based protocols that previously haven't been using STUN
>>>>>>>> can use this mechanism, but they need to be compatible with and
>>>>>>>> accept the multiplexing solution that is implied. I think that could
>>>>>>>> be clarified. They also need the STUN server to be deployed in the
>>>>>>>> peer endpoint, which could be made clearer. It is a given for any
>>>>>>>> ICE-supporting application, but I find no reference to this. I
>>>>>>>> would also note that ICE (RFC 8445), once it concludes, allows the
>>>>>>>> peers to stop responding to STUN requests, thus this method needs
>>>>>>>> to be clear that the STUN server needs to be maintained during the
>>>>>>>> whole session lifetime to enable DPLPMTUD.
>>>>>>>> Section 1:
>>>>>>>>        Complementary techniques can be used to discover additional network
>>>>>>>>        characteristics, such as the network path (using the STUN Traceroute
>>>>>>>>        mechanism described in [I-D.martinsen-tram-stuntrace]) and bandwidth
>>>>>>>>        availability (using the mechanism described in
>>>>>>>>        [I-D.martinsen-tram-turnbandwidthprobe]).
>>>>>>>> Is this text really relevant as written? Neither of these
>>>>>>>> individual proposals has been updated since 2015.
>>>>>>>> Section 2:
>>>>>>>>        When a probe succeeds with a larger size than the current PMTU,
>>>>>>>>        the PMTU is increased.
>>>>>>>> There is a point to verification. If one looks at the search
>>>>>>>> algorithm behavior, it updates its probe sizes, but it doesn't
>>>>>>>> update the upper layer's MTU until the search has concluded.
>>>>>>>> There are good reasons for doing it this way, and thus I would
>>>>>>>> suggest a reformulation.
>>>>>>>> Section 4.1:
>>>>>>>>        The Simple Probing Mechanism uses only STUN Requests/Responses, which
>>>>>>>>        are subject to the congestion control mechanism in [RFC8489] section
>>>>>>>>        6.2.1.  The default Rc and Rm values may be defined differently for a
>>>>>>>>        combination of the Simple Probing Mechanism and the protocol running
>>>>>>>>        on the same port.
>>>>>>>> I would not call it a congestion control mechanism, rather a
>>>>>>>> retransmission timer mechanism. Thus a rate limiting mechanism
>>>>>>>> makes more sense.
>>>>>>>> Section 4.1.1:
>>>>>>>>        The client adds a PADDING attribute with a length that, when added to
>>>>>>>>        the IP and UDP headers and the other STUN components, is equal to the
>>>>>>>>        Selected Probe Size, as defined in [RFC4821] Section 7.3.
>>>>>>>> Why reference RFC 4821 when RFC 8899 has a simpler and clearer
>>>>>>>> interface suitable for datagrams, and the simple probe is an
>>>>>>>> excellent match to define as the PL for probing?
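
The sizing rule in the quoted Section 4.1.1 text (PADDING plus the IP and UDP headers plus the other STUN components equals the Selected Probe Size) can be illustrated with a small calculation. This is a sketch under stated assumptions, not the draft's procedure: the function name `padding_value_length` is hypothetical, the IP header sizes assume no IPv4 options and no IPv6 extension headers, and the caller is assumed to know the on-wire size of the other attributes.

```python
STUN_HEADER_LEN = 20   # fixed STUN message header (RFC 8489)
ATTR_HEADER_LEN = 4    # attribute type (2 bytes) + attribute length (2 bytes)
UDP_HEADER_LEN = 8
IPV4_HEADER_LEN = 20   # assumption: no IPv4 options
IPV6_HEADER_LEN = 40   # assumption: no extension headers


def padding_value_length(probe_size: int,
                         ip_header_len: int = IPV4_HEADER_LEN,
                         other_attrs_len: int = 0) -> int:
    """Length of the PADDING attribute value needed to reach probe_size bytes.

    other_attrs_len is the on-wire size (attribute headers and alignment
    included) of every other attribute in the message, e.g. 8 for FINGERPRINT.
    """
    overhead = (ip_header_len + UDP_HEADER_LEN + STUN_HEADER_LEN
                + other_attrs_len + ATTR_HEADER_LEN)
    value_len = probe_size - overhead
    if value_len < 0:
        raise ValueError("probe size is smaller than the fixed overhead")
    if value_len % 4 != 0:
        # STUN attribute values are padded to a 4-byte boundary on the wire,
        # so other sizes cannot be matched exactly.
        raise ValueError("probe size cannot be matched exactly")
    return value_len
```

For instance, a 1200-byte IPv4 probe carrying only FINGERPRINT besides PADDING would be computed with `padding_value_length(1200, other_attrs_len=8)`.
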
>>>>>>>> Section 4.1.3:
>>>>>>>>        A client receiving a Probe Response MUST process it as specified in
>>>>>>>>        section 6.3.3 of [RFC8489] and MUST ignore the PADDING attribute.  If
>>>>>>>>        a response is received this is interpreted as a Probe Success, as
>>>>>>>>        defined in [RFC4821] Section 7.6.1.
>>>>>>>> More reliance on RFC 4821 rather than RFC 8899.
>>>>>>>> RFC 8899 is not perfect but it handles a number of corner cases
>>>>>>>> that can occur and should produce more stable, and fewer updates
>>>>>>>> to the MTU than what RFC 4821 does.
>>>>>>>> Section 4.2:
>>>>>>>>        The Simple Probing Mechanism uses STUN indications, which are not
>>>>>>>>        subject to the congestion control mechanism in [RFC8489] section
>>>>>>>>        6.2.1.  As it will have to be intricately related to the protocol
>>>>>>>>        that runs on the same port, each implementation of the Complete
>>>>>>>>        Probing Mechanism in association MUST define the congestion control
>>>>>>>>        that will be applied to the STUN Indications.  The default Rc and Rm
>>>>>>>>        values for the STUN Requests/Responses may be defined differently for
>>>>>>>>        a combination of the Simple Probing Mechanism and the protocol
>>>>>>>>        running on the same port.
>>>>>>>> Once more, a full-blown congestion control is not really needed
>>>>>>>> here. The point is that the PMTUD probe traffic will be a small
>>>>>>>> fraction of the application traffic, or else come from such a low
>>>>>>>> rate application that it is extremely unlikely that the probes
>>>>>>>> will overload any network. I would note that RFC 8899 does specify
>>>>>>>> some normative statements about this in Section 3, Bullet 7.
>>>>>>>> In addition, this does not really give an implementor a clear
>>>>>>>> answer to what they should implement. Some basic rate limiting
>>>>>>>> would be simpler to implement.
>>>>>>>> My high-level comment is that I don't see what the benefits of
>>>>>>>> the complete method are compared to running simple probes as the PL
>>>>>>>> probes in the algorithm of RFC 8899. The only potential benefit
>>>>>>>> is that one will sometimes get an indication of a burst loss
>>>>>>>> across a probe, when a prior or following application protocol
>>>>>>>> packet is lost as well as the probe, indicating that the loss
>>>>>>>> potentially has other causes. I think RFC 8899 probing
>>>>>>>> multiple times gives a significantly high probability that congestion
>>>>>>>> or random loss of probes will rarely affect the DPLPMTUD results.
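
The argument for probing multiple times can be made concrete with a small calculation. Assuming, purely for illustration, that each probe is lost independently with probability p for reasons unrelated to the MTU, the chance that all k probes of a given size are lost (so that the size is wrongly judged too large) is p**k. The function name and the sample values below are assumptions, and real losses are often correlated, so this is an optimistic estimate rather than a property of RFC 8899.

```python
def prob_all_probes_lost(p: float, k: int) -> float:
    """Probability that k independent probes are all lost, each with loss rate p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return p ** k


if __name__ == "__main__":
    # Show how quickly the false-failure probability shrinks with repeated
    # probes, using an assumed 2% random loss rate.
    for k in (1, 2, 3):
        print(k, prob_all_probes_lost(0.02, k))
```

Each additional probe multiplies the false-failure probability by p, which is why a handful of retries is usually enough when losses are close to independent.
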
>>>>>>>> Section 4.2.3:
>>>>>>>>        The server creates a Report Response and adds an IDENTIFIERS
>>>>>>>>        attribute that contains the chronologically ordered list of all
>>>>>>>>        identifiers received so far.  The server MUST add the FINGERPRINT
>>>>>>>>        attribute.  The server then sends the response to the client.
>>>>>>>> This doesn't discuss what a server should do if the IDENTIFIERS
>>>>>>>> attribute does not fit in the packet.
>>>>>>>> Section 5.1:
>>>>>>>> Based on that, the peer needs to keep a STUN server following this
>>>>>>>> spec running on the ports being used. Doesn't this make the need
>>>>>>>> for explicit signaling clearer? Or is the inclusion of any STUN
>>>>>>>> PMTUD attribute a sufficient indication that the peer will not
>>>>>>>> remove its STUN server when ICE concludes?

Marc Petit-Huguenin