Re: [tram] I-D Action: draft-ietf-tram-stun-pmtud-20.txt

Marc Petit-Huguenin <marc@petit-huguenin.org> Tue, 12 October 2021 14:24 UTC

To: Magnus Westerlund <magnus.westerlund=40ericsson.com@dmarc.ietf.org>, "draft-ietf-tram-stun-pmtud@ietf.org" <draft-ietf-tram-stun-pmtud@ietf.org>, "tram@ietf.org" <tram@ietf.org>
References: <161697408224.3594.210086487138582847@ietfa.amsl.com> <HE1PR0702MB3772BED6E2CE35F50D015D4195139@HE1PR0702MB3772.eurprd07.prod.outlook.com> <a8dc8afe-3e8f-2d6f-7879-e7945d83f949@petit-huguenin.org> <f61b0dd4-a800-420c-dc75-620ee3aaf086@petit-huguenin.org> <7bf6070cfe31ec3f3cb7aa4b62fb7026a05611b3.camel@ericsson.com> <8226ff9c-722b-8b7f-d058-304b71275514@petit-huguenin.org> <HE1PR0702MB377267A46FAE8C68A8BF13E995CB9@HE1PR0702MB3772.eurprd07.prod.outlook.com>
From: Marc Petit-Huguenin <marc@petit-huguenin.org>
Message-ID: <5710b3b5-aaa8-6b8b-eacb-9fe412b9d735@petit-huguenin.org>
Date: Tue, 12 Oct 2021 07:23:52 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0
MIME-Version: 1.0
In-Reply-To: <HE1PR0702MB377267A46FAE8C68A8BF13E995CB9@HE1PR0702MB3772.eurprd07.prod.outlook.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/tram/T6rqGNugaJgL2Y8xHzYV1j-gX14>
Subject: Re: [tram] I-D Action: draft-ietf-tram-stun-pmtud-20.txt

I needed a state machine description language to be able to concisely describe the state machine in RFC 8899 (and others), so I designed one:

https://datatracker.ietf.org/doc/draft-petithuguenin-formal-fsm/

Questions, comments, and suggestions can be sent to the fdt@ietf.org mailing list or directly to my email address.  I am also available on hallway@jabber.ietf.org/MPH.

Thanks.

On 8/30/21 2:19 AM, Magnus Westerlund wrote:
> Hi Marc,
> 
> I find it great that you have been working on validating the RFC 8899 model. I do wish that this had happened earlier, when your feedback could more easily have been addressed.
> 
> Independently, the current I-D for stun-pmtud does not contain, either by reference or by explicit definition, methods for how one arrives at a current estimate of the MTU. From my perspective it is necessary that the WG either defines such methods and algorithms or uses RFC 8899 or a replacement. However, that choice of path should also be considered carefully. I think improving RFC 8899 would be great. I also think it can be a shorter path to completion. However, if the WG thinks one really should be able to take care of the multi-packet reporting functionality of this probing, a new algorithm or at least adaptations may be necessary.
> 
> From my perspective, I think we should strive for collaborations that make things better. Not many people are interested enough in this problem space to want to standardize it.
> 
> I am, however, still of the opinion that approving just a specification for a probing format without some limitations on how it is applied is quite negative. First of all, if people think this is a trivial problem and implement an algorithm that is too simple and works poorly, it does DPLPMTUD a disservice. Secondly, it is hard to verify the negative consequences or the potential for security threats from this format without some bounds on expected usage behaviors.
> 
> Cheers
> 
> Magnus Westerlund
> 
>> -----Original Message-----
>> From: Marc Petit-Huguenin <marc@petit-huguenin.org>
>> Sent: den 29 augusti 2021 17:41
>> To: Magnus Westerlund <magnus.westerlund@ericsson.com>;
>> draft-ietf-tram-stun-pmtud@ietf.org; tram@ietf.org
>> Subject: Re: I-D Action: draft-ietf-tram-stun-pmtud-20.txt
>>
>> Hi Magnus,
>>
>> As I said in my previous email, I am working on an analysis of RFC 8899.  You
>> can see attached a picture of a complete model of RFC 8899 as a Timed
>> Coloured Petri Net, here after 10 (virtual) hours of simulation.  I learned a
>> few things preparing that model, which can be summarized as RFC 8899 being
>> a bad description of an OK algorithm.
>>
>> It is a bad description because there is a lot of missing information in it,
>> information that I had to painstakingly infer from clues spread all over the
>> document.  In some cases I had to make stuff up, that's particularly visible
>> around the Complete and Error states.
>>
>> This is an OK algorithm, not a great one, and there is space for
>> improvements.  I even took some notes on a parallel version of that
>> algorithm that would make it acceptable as a basis for draft-ietf-tram-stun-
>> pmtud.  Remember that draft-ietf-tram-stun-pmtud started 13 years ago
>> and, in spite of all the efforts of the transport people to make it as irrelevant
>> as possible, was always meant to implement whatever algorithm the
>> implementer wanted under the constraints of RFC 4821.  I always felt that
>> RFC 8899 was a step back from that ideal, now with that model not only I
>> know that it is, but I know precisely why and what can be done to fix it.
>>
>> Note that the work is not done as I still have to complete the validation of the
>> model and go through every single line of RFC 8899 to find the deviations
>> between it and the reality of a provably working model.  But the question
>> now is what to do with that knowledge.
>>
>> One possibility is to submit an I-D that summarizes all the issues I found
>> and proposes some fixes, and then continue to ignore RFC 8899 until someone
>> publishes an RFC 8899bis that does that.
>>
>> Another is to simply transcribe the better state machine I have in mind
>> directly in draft-ietf-tram-stun-pmtud -- which will be a far easier task when
>> doing it from a model like the one I built.
>>
>> Finally the simplest is to remove altogether any reference to RFC 8899 in
>> draft-ietf-tram-stun-pmtud and continue the publication as is.
>>
>> Building that model took probably 6 days (half of it trying to make sense of
>> the RFC 8899 text) over a month, which is not a small thing especially when
>> not paid to do that, so my personal preference is the second option.
>>
>> Anyway, thank you for your reviews and efforts to progress that draft.
>>
>> Thanks.
>>
>> On 8/26/21 5:20 AM, Magnus Westerlund wrote:
>>> Hi Marc and TRAM WG,
>>>
>>> My intention was definitely not to bully you. And we know that RFC
>>> 8899 has some issues. These issues are due to the problem of how to
>>> write a straightforward algorithm that handles all the corner cases
>>> the Internet will throw at the algorithm. RFC 8899 is weak in those
>>> areas where the STUN PMTUD I-D is even weaker, namely the description
>>> of how one uses the mechanism and arrives at a result for the supported
>>> Path MTU that is really reliable and can handle dynamic changes quickly
>>> and precisely.
>>>
>>> So STUN PMTUD includes a novel algorithm for running multi-packet
>>> sampling, which clearly can improve the consistency of the results.
>>> However, there is no algorithm specifying how you construct the probes
>>> for this, how you convert these measurements into a result, and how
>>> you deal with the results that sampling provides.
>>>
>>> So if you want to make this the better solution, then please can you
>>> actually write up how to use the protocol mechanism in a way that
>>> provides better results, or at least similar results, for those
>>> applications where STUN PMTUD style probing works better and has less
>>> impact on the application. I definitely see how RFC
>>> 8899 works better for fully ACKed protocols like TCP, SCTP and QUIC in
>>> comparison to RTP/UDP for example.
>>>
>>> My push for making this into a probe specification for RFC 8899 has
>>> been motivated by getting this work done during my term as AD.
>>> However, it clearly wasn't, and the progress was very slow. Making the
>>> document into an RFC 8899 usage appears to me to simplify the document
>>> into a mechanism that would work in the context of RTP and other
>>> STUN-using protocols; however, some more advanced features would
>>> clearly be lost. It would also mean a streamlining of the document and
>>> avoid several areas where the current document is lacking in
>>> specification, namely the algorithm to arrive at a result and how to
>>> actually send probes in a way that is network safe and not a
>>> significant risk of being used as a denial of service tool.
>>>
>>> I no longer have any stake in this as I am no longer the AD. I
>>> just wanted to make clear what my position was at the end of the AD
>>> term. Going forward I would review this work solely from this
>>> perspective: is it providing something complementary to RFC 8899 and
>>> defined usages of that algorithm for other protocols, is it
>>> functioning, is it safe to deploy, and can it be implemented from this
>>> specification with less implementer tuning or configuration than RFC 8899?
>>>
>>> I hope that clarifies my view and what I see as the considerations
>>> here, enabling you and the WG to decide in what direction you want to
>>> take this work.
>>>
>>> Cheers
>>>
>>> Magnus
>>>
>>>
>>> On Sun, 2021-08-08 at 13:40 -0700, Marc Petit-Huguenin wrote:
>>>> Hi Magnus,
>>>>
>>>> Sorry for the delay -- an unexpected issue restricted my time in front
>>>> of a computer until a few days ago.
>>>>
>>>> I think that your most important point is that you (and others
>>>> before) want this draft to follow RFC 8899.  I am starting to feel
>>>> bullied into accepting that RFC 8899 is the only way to do DPLPMTUD
>>>> which, as explained multiple times, it is not.
>>>>
>>>> These explanations have been conveniently ignored, so I decided to
>>>> prove what I think are the weaknesses of RFC 8899 by doing a formal
>>>> analysis of that protocol.  If I can prove that then I will publish
>>>> my findings and continue using a better algorithm than RFC 8899.  If
>>>> I can't then I'll do the modifications requested.
>>>>
>>>> Now a formal analysis is not a small task so it will take some time
>>>> to do that especially as my plate is already quite full until the end
>>>> of the year.  But I should be able to spare half a day each week for
>>>> that, so expect periodic updates here.
>>>>
>>>> Thanks.
>>>>
>>>> On 7/18/21 7:16 AM, Marc Petit-Huguenin wrote:
>>>>> Hi Magnus,
>>>>>
>>>>> Thank you very much for that review.
>>>>>
>>>>> I'll work on a response and an update to the draft in the next two
>>>>> weeks, probably to be uploaded after the IETF meeting.
>>>>>
>>>>> On 7/14/21 8:08 AM, Magnus Westerlund wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have reviewed -20 and have the following feedback.
>>>>>>
>>>>>> Some of the issues have been resolved. However, my individual
>>>>>> conclusion is that this document would be shorter and have fewer
>>>>>> issues if only the simple probing was retained and defined as an RFC
>>>>>> 8899 PL solution for UDP-based application protocols that want
>>>>>> DPLPMTUD and do not use protocol-internal methods. This could shorten
>>>>>> section 4 significantly. I think the alternative is to simply
>>>>>> declare the document dead.
>>>>>>
>>>>>> However, I would take a look at the examples of RFC 8899 PL
>>>>>> definitions that exist in RFC 8899, RFC 9000 (QUIC) and for UDP
>>>>>> Options
>>>>>> (https://datatracker.ietf.org/doc/draft-fairhurst-tsvwg-udp-options-dplpmtud/)
>>>>>> and rewrite Section 4 based on that and only keep the simple method.
>>>>>> Then rewrite the intro to adjust it as RFC 8899 based, and also
>>>>>> clarify the STUN server on the port for the whole session aspect and
>>>>>> the demultiplexing need.
>>>>>> Then section 7 also can be deleted. It would be good to have a
>>>>>> clearer rate limiting specification for the probe packets in
>>>>>> section 4, as the STUN retransmission timer gives exponential
>>>>>> back-off, and the ICE is not really applicable here. The probe
>>>>>> sending implementation will have an RTT estimate when some response
>>>>>> has been received. Based on that one can limit the probes to be
>>>>>> sent only every n*RTT, possibly with a MAX(n*RTT, Minimal_Interval).
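
A minimal sketch of the rate limit suggested above, in Python. The names `probe_interval` and `ProbeRateLimiter` are hypothetical, and the defaults for n and Minimal_Interval are illustrative assumptions only; the thread does not give values for them.

```python
def probe_interval(rtt: float, n: int = 4, minimal_interval: float = 1.0) -> float:
    """Spacing between two probes, in seconds: MAX(n*RTT, Minimal_Interval).

    n and minimal_interval are assumed tuning knobs, not values taken from
    the draft or from RFC 8899.
    """
    if rtt < 0 or n <= 0 or minimal_interval < 0:
        raise ValueError("rtt and minimal_interval must be >= 0, n must be > 0")
    return max(n * rtt, minimal_interval)


class ProbeRateLimiter:
    """Allows a probe only when the previous probe's interval has elapsed."""

    def __init__(self, n: int = 4, minimal_interval: float = 1.0) -> None:
        self.n = n
        self.minimal_interval = minimal_interval
        self._next_allowed = 0.0  # earliest time the next probe may be sent

    def may_send(self, now: float, rtt: float) -> bool:
        """Return True (and reserve the slot) if a probe may be sent at `now`."""
        if now < self._next_allowed:
            return False
        self._next_allowed = now + probe_interval(rtt, self.n, self.minimal_interval)
        return True
```

With a small RTT the floor dominates (one probe per Minimal_Interval); with a large RTT the n*RTT term dominates, so the probe rate falls as the path gets slower.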
>>>>>>
>>>>>> The main reason I write this is that I think RFC 8899 has resolved
>>>>>> some corner cases that could cause issues in a naïve implementation
>>>>>> that thinks that just getting the probe through means that one
>>>>>> should immediately update the MTU. And as RFC 8899 is improved this
>>>>>> usage could also get that improvement without a need for update.
>>>>>>
>>>>>> In addition, as can be seen, there are some ambiguities that I think
>>>>>> make implementation from the current spec challenging.
>>>>>>
>>>>>>
>>>>>> Section 1:
>>>>>>
>>>>>>       The Packetization Layer Path MTU Discovery (PMTUD) specification
>>>>>>       [RFC4821] describes a method to discover the Path MTU, but does not
>>>>>>       describe a practical protocol to do so with UDP.  Many application
>>>>>>       layer protocols based on the transport layer protocol UDP do not
>>>>>>       implement the Path MTU discovery mechanism described in [RFC4821].
>>>>>>
>>>>>> Wouldn't it be better to rewrite this in relation to RFC 8899,
>>>>>> which doesn't have a PL definition for UDP? The only
>>>>>> "competing" proposal is based on UDP Options, which have no real
>>>>>> deployment yet and require OS level changes.
>>>>>>
>>>>>>
>>>>>> Section 1:
>>>>>>
>>>>>>       These application layer protocols can make use of the probing
>>>>>>       mechanisms described in this document instead of designing their own
>>>>>>       adhoc extension.  These probing mechanisms are implemented with
>>>>>>       Session Traversal Utilities for NAT (STUN), but their usage is not
>>>>>>       limited to STUN-based protocols.
>>>>>>
>>>>>> Yes, UDP-based protocols that previously haven't been using STUN
>>>>>> can use this mechanism, but they need to be compatible with and accept
>>>>>> the multiplexing solution that is implied. I think that could be
>>>>>> clarified. They also need the STUN server to be deployed in the
>>>>>> peer endpoint, which could be made clearer. It is a given for any
>>>>>> ICE-supporting application, but I find no reference to this. I
>>>>>> would also note that ICE (RFC 8445), once it concludes, allows the
>>>>>> peers to stop responding to STUN requests; thus this method needs to
>>>>>> be clear that the STUN server needs to be maintained during the whole
>>>>>> session lifetime to enable DPLPMTUD.
>>>>>>
>>>>>>
>>>>>> Section 1:
>>>>>>
>>>>>>       Complementary techniques can be used to discover additional network
>>>>>>       characteristics, such as the network path (using the STUN Traceroute
>>>>>>       mechanism described in [I-D.martinsen-tram-stuntrace]) and bandwidth
>>>>>>       availability (using the mechanism described in
>>>>>>       [I-D.martinsen-tram-turnbandwidthprobe]).
>>>>>>
>>>>>> Is this text really relevant as written? Neither of these
>>>>>> individual proposals has been updated since 2015.
>>>>>>
>>>>>>
>>>>>> Section 2:
>>>>>>
>>>>>>       When a probe succeeds with a larger size than the current PMTU, the
>>>>>>       PMTU is increased.
>>>>>>
>>>>>> There is a point to verification. If one looks at the search
>>>>>> algorithm behavior, it updates its probe sizes, but it doesn't
>>>>>> conclude its searching nor update the upper layer's MTU until the
>>>>>> search has concluded. There are good reasons for doing it this way and
>>>>>> thus I would suggest reformulation.
>>>>>>
>>>>>> Section 4.1:
>>>>>>
>>>>>>       The Simple Probing Mechanism uses only STUN Requests/Responses, which
>>>>>>       are subject to the congestion control mechanism in [RFC8489] section
>>>>>>       6.2.1.  The default Rc and Rm values may be defined differently for a
>>>>>>       combination of the Simple Probing Mechanism and the protocol running
>>>>>>       on the same port.
>>>>>>
>>>>>> I would not call it a congestion control mechanism, rather a
>>>>>> retransmission timer mechanism. Thus a rate limiting mechanism
>>>>>> makes more sense.
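
To illustrate why the [RFC8489] section 6.2.1 mechanism reads as a retransmission timer, here is a small Python sketch of its schedule: the RTO doubles after each transmission, at most Rc requests are sent, and the transaction times out Rm*RTO after the last one. The function name is hypothetical; the defaults (RTO of 500 ms, Rc = 7, Rm = 16) are the ones given in RFC 8489.

```python
from typing import List, Tuple


def stun_retransmission_schedule(
    rto_ms: int = 500, rc: int = 7, rm: int = 16
) -> Tuple[List[int], int]:
    """Return (send times in ms, transaction timeout in ms).

    Each request is followed by a wait equal to the current RTO, which then
    doubles.  After the last (Rc-th) request the client waits Rm * RTO
    (using the initial RTO) before giving up.
    """
    if rto_ms <= 0 or rc < 1 or rm < 1:
        raise ValueError("rto_ms, rc and rm must be positive")
    send_times = []
    t = 0
    interval = rto_ms
    for _ in range(rc):
        send_times.append(t)
        t += interval      # wait one RTO before the next transmission
        interval *= 2      # exponential back-off
    timeout = send_times[-1] + rm * rto_ms
    return send_times, timeout


if __name__ == "__main__":
    times, timeout = stun_retransmission_schedule()
    print(times)    # [0, 500, 1500, 3500, 7500, 15500, 31500]
    print(timeout)  # 39500
```

The schedule only spaces out retries of a single transaction; it does not adapt to observed loss or bound the aggregate probe rate, which is the job a separate rate limit would do.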
>>>>>>
>>>>>>
>>>>>> Section 4.1.1:
>>>>>>
>>>>>>       The client adds a PADDING attribute with a length that, when added to
>>>>>>       the IP and UDP headers and the other STUN components, is equal to the
>>>>>>       Selected Probe Size, as defined in [RFC4821] Section 7.3.
>>>>>>
>>>>>> Why reference RFC 4821, when RFC 8899 has a simpler and clearer
>>>>>> interface suitable for datagrams, and where the simple probe is an
>>>>>> excellent match to just define as a PL for probing?
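
A rough Python sketch of the PADDING sizing described in the quoted Section 4.1.1 text. It assumes an IP header without options or extension headers, and it rounds the PADDING value down to a multiple of 4 so that the STUN message stays 32-bit aligned; both are simplifying assumptions of this sketch, and the function name is hypothetical.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
IPV6_HEADER = 40   # bytes, assuming no extension headers
UDP_HEADER = 8
STUN_HEADER = 20
ATTR_HEADER = 4    # attribute type (2 bytes) + length (2 bytes)


def padding_value_length(probe_size: int, other_attrs_len: int,
                         ipv6: bool = False) -> int:
    """Length of the PADDING value so the IP packet is close to probe_size.

    other_attrs_len is the total size of the other STUN attributes, headers
    and alignment padding included (for example 8 for a lone FINGERPRINT).
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    overhead = ip_header + UDP_HEADER + STUN_HEADER + other_attrs_len + ATTR_HEADER
    remaining = probe_size - overhead
    if remaining < 0:
        raise ValueError("probe size is smaller than the fixed overhead")
    # Assumption: round down so the packet never exceeds probe_size.
    return remaining - (remaining % 4)


if __name__ == "__main__":
    # 1500-byte IPv4 probe carrying only FINGERPRINT besides PADDING.
    print(padding_value_length(1500, other_attrs_len=8))  # 1440
```

Because of the 4-byte alignment, a probe built this way can be up to 3 bytes smaller than the selected probe size.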
>>>>>>
>>>>>>
>>>>>> Section 4.1.3:
>>>>>>
>>>>>>       A client receiving a Probe Response MUST process it as specified in
>>>>>>       section 6.3.3 of [RFC8489] and MUST ignore the PADDING attribute.  If
>>>>>>       a response is received this is interpreted as a Probe Success, as
>>>>>>       defined in [RFC4821] Section 7.6.1.
>>>>>>
>>>>>> More reliance on RFC 4821 rather than RFC 8899.
>>>>>>
>>>>>> RFC 8899 is not perfect but it handles a number of corner cases
>>>>>> that can occur and should produce more stable, and fewer updates to
>>>>>> the MTU than what RFC 4821 does.
>>>>>>
>>>>>> Section 4.2:
>>>>>>
>>>>>>       The Simple Probing Mechanism uses STUN indications, which are not
>>>>>>       subject to the congestion control mechanism in [RFC8489] section
>>>>>>       6.2.1.  As it will have to be intricately related to the protocol
>>>>>>       that runs on the same port, each implementation of the Complete
>>>>>>       Probing Mechanism in association MUST define the congestion control
>>>>>>       that will be applied to the STUN Indications.  The default Rc and Rm
>>>>>>       values for the STUN Requests/Responses may be defined differently for
>>>>>>       a combination of the Simple Probing Mechanism and the protocol
>>>>>>       running on the same port.
>>>>>>
>>>>>> Once more, a full-blown congestion control is not really needed
>>>>>> here. The point is that the PMTUD probe traffic will be a small
>>>>>> fraction of the application traffic, or alternatively the application
>>>>>> has such a low rate that it is extremely unlikely that the probe will
>>>>>> overload any network. I would note that RFC 8899 does specify some
>>>>>> normative statements about this in Section 3, Bullet 7.
>>>>>>
>>>>>> In addition, this does not really give an implementer a clear answer
>>>>>> to what they should implement. Some basic rate limiting would be
>>>>>> simpler to implement.
>>>>>>
>>>>>> My high level comment is that I don't see what the benefits of the
>>>>>> complete method are compared to running simple probes as the PL probes
>>>>>> in the algorithm of RFC 8899. The only potential benefit is that
>>>>>> one sometimes will get an indication of a burst loss across a probe
>>>>>> when a prior or following application protocol packet as well as
>>>>>> the probe is lost, indicating potentially having loss for other
>>>>>> reasons. I think RFC 8899 probing multiple times gives a
>>>>>> significantly high probability that congestion or random loss of
>>>>>> probes will rarely affect the DPLPMTUD results.
>>>>>>
>>>>>> Section 4.2.3:
>>>>>>
>>>>>>       The server creates a Report Response and adds an IDENTIFIERS
>>>>>>       attribute that contains the chronologically ordered list of all
>>>>>>       identifiers received so far.  The server MUST add the FINGERPRINT
>>>>>>       attribute.  The server then sends the response to the client.
>>>>>>
>>>>>> This doesn't discuss what a server should do if the IDENTIFIERS
>>>>>> attribute does not fit in the packet.
>>>>>>
>>>>>>
>>>>>> Section 5.1:
>>>>>>
>>>>>> Based on that, the peer needs to keep a STUN server following this
>>>>>> spec running on the ports being used. Isn't the need for explicit
>>>>>> signaling more clear? Or is the inclusion of any STUN PMTUD
>>>>>> attribute a sufficient indication that the peer will not remove its
>>>>>> STUN server when ICE concludes?
>>>>>>
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> Magnus Westerlund
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>


-- 
Marc Petit-Huguenin
Email: marc@petit-huguenin.org
Blog: https://marc.petit-huguenin.org
Profile: https://www.linkedin.com/in/petithug