Re: [tsvwg] [Technical Errata Reported] RFC7141 (7237)

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Fri, 04 November 2022 13:04 UTC

Message-ID: <f02cfbb6-9a14-0c70-4986-358b9226033f@erg.abdn.ac.uk>
Date: Fri, 04 Nov 2022 13:03:45 +0000
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:102.0) Gecko/20100101 Thunderbird/102.4.1
To: Sebastian Moeller <moeller0@gmx.de>
Cc: RFC Errata System <rfc-editor@rfc-editor.org>, bob.briscoe@bt.com, jukka.manner@aalto.fi, martin.h.duke@gmail.com, Zaheduzzaman.Sarker@ericsson.com, david.black@dell.com, martenseemann@gmail.com, tsvwg@ietf.org
References: <20221104094005.747A455F68@rfcpa.amsl.com> <4aef3037-fae5-68c9-661f-4ce89b1ce7e7@erg.abdn.ac.uk> <273A82C1-E675-4950-A7E0-E8C564B09834@gmx.de> <6672b32e-19b6-b295-1460-904481de2c83@erg.abdn.ac.uk> <1351054E-7647-40CA-B2FA-7A566DE09E24@gmx.de>
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
In-Reply-To: <1351054E-7647-40CA-B2FA-7A566DE09E24@gmx.de>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/dimYJfpJG-0ohEcDbhzyzmyWaEY>
Subject: Re: [tsvwg] [Technical Errata Reported] RFC7141 (7237)

On 04/11/2022 12:42, Sebastian Moeller wrote:
> Hi Gorry,
>
>
>> On Nov 4, 2022, at 11:56, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>>
>> On 04/11/2022 10:43, Sebastian Moeller wrote:
>>> Hi Gorry,
>>>
>>> See [SM] below.
>>>
>>> On 4 November 2022 11:20:56 CET, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>>>> Commenting as an individual on the Errata filing:
>>>>
>>>> On 04/11/2022 09:40, RFC Errata System wrote:
>>>>> The following errata report has been submitted for RFC7141,
>>>>> "Byte and Packet Congestion Notification".
>>>>>
>>>>> --------------------------------------
>>>>> You may review the report below and at:
>>>>> https://www.rfc-editor.org/errata/eid7237
>>>>>
>>>>> --------------------------------------
>>>>> Type: Technical
>>>>> Reported by: Sebastian Moeller <moeller0@gmx.de>
>>>>>
>>>>> Section: 2
>>>>>
>>>>> Original Text
>>>>> -------------
>>>>> 2.2.  Recommendation on Encoding Congestion Notification
>>>>>
>>>>>      When encoding congestion notification (e.g., by drop, ECN, or PCN),
>>>>>      the probability that network equipment drops or marks a particular
>>>>>      packet to notify congestion SHOULD NOT depend on the size of the
>>>>>      packet in question.
>>>>> [...]
>>>>> 2.3.  Recommendation on Responding to Congestion
>>>>>
>>>>>      When a transport detects that a packet has been lost or congestion
>>>>>      marked, it SHOULD consider the strength of the congestion indication
>>>>>      as proportionate to the size in octets (bytes) of the missing or
>>>>>      marked packet.
>>>>>
>>>>>      In other words, when a packet indicates congestion (by being lost or
>>>>>      marked), it can be considered conceptually as if there is a
>>>>>      congestion indication on every octet of the packet, not just one
>>>>>      indication per packet.
>>>>>
>>>>>      To be clear, the above recommendation solely describes how a
>>>>>      transport should interpret the meaning of a congestion indication, as
>>>>>      a long term goal.  It makes no recommendation on whether a transport
>>>>>      should act differently based on this interpretation.  It merely aids
>>>>>      interoperability between transports, if they choose to make their
>>>>>      actions depend on the strength of congestion indications.
>>>>>
>>>>> Corrected Text
>>>>> --------------
>>>>> I am not sure the text is actually salvageable, as there appears to be a logical disconnect at the core of the recommendations.
>>>>>
>>>>> Notes
>>>>> -----
>>>>> The recommendations seem not self consistent:
>>>>> A) Section 2.2.  recommends that CE marking should be made independent of packet size, so a CE-mark carries no information about packet size.
>>>> I did not understand that it needed to. I think this RFC was intended to be independent of the transport.  I see the transport sender as responsible for determining the packetisation of the transport segments, and the (S)ACKs can often identify segments, hence the sender can determine the segments that have been acknowledged or times when ECN marking was seen.
>>> [SM] This assumes that the relevant segment size does not change along the path, which generally is not true. Just think of fragmentation: if the sender sends a packet that gets fragmented along the path and only a single fragment gets CE marked, the sender will see this as the whole packet being marked. Or from the other side of the issue, if say a Linux router uses GRO/GSO and queues a larger meta packet and CE marks that, receiver and sender at best see a sequence of CE marked packets. So the recommendation would need to be changed to calculate the consecutive sequence of CE marked octets and take these as a correlate for congestion strength. So no, the sender really has no reliable knowledge about the size of the data unit the marking node marked.
>>>
>> I suggest IETF transports treat all IP fragments as one unit of retransmission/congestion at the transport layer.
> 	[SM2] But what if the re-segmentation does not happen at the receiver, but say a fragmenting and CE-marking path tries to act transparently. According to the rules both in RFC3168 and RFC7141 a re-segmented packet containing even a single CE-marked fragment is to be CE-marked (or dropped). So the AQM might have marked a 576 octet segment but all the endpoint sees is a marked ~1460 octet segment.
> 	This also illustrates how section 2.4 of RFC7141 proposes a method that does not achieve its aim of giving veridical "number of marked octets" information. It simply is impossible to do so generally (often it will work, but the endpoints cannot even know when it was correct and when not).
> Section 2.4 has more issues BTW: it tries to give recommendations on how to deal with splitting and merging, but fails to achieve its goal of giving a veridical account of the marked octets:
>
> Let's see what happens when applying the proposed counter method with regard to the number of marked octets under the conditions this section addresses.
> Here let's look at a toy problem with 20 byte headers and a total payload of 1200 octets that is split into or merged out of 3 fragments/segments with 400 octets payload each.
>
> Merging multiple segments pre-marking:
> (20+400) + (20+400)-20 + (20+400)-20 -> 1220 total 1200 payload + CE
> -> AQM marks 1220 or 1200 octets
> (20+1200)+CE
> receiver sees 1200 octets with CE and ACKs these with ECE
> sender can assume 1200 octets were marked
> CORRECT
>
> Merging multiple segments post-marking:
> -> AQM marks segment 2 of 420 or 400 octets
> (20+400) + (20+400+CE) + (20+400)
> (20+400)+(20+400+CE)-20+(20+400)-20 = 1220 total 1200 payload + CE
> receiver sees 1200 octets with CE and ACKs these with ECE
> sender must assume 1200 octets were marked
> FALSE
>
> Fragmenting a segment pre-marking
> 1220 -> (20+400) + (20+400) + (20+400)
> -> AQM marks segment 2 of 420 or 400 octets
> (20+400) + (20+400+CE) + (20+400)
> Resegmentation happens before protocol sees marking
> (20+400) + (20+400+CE)-20 + (20+400)-20 -> 1220 total 1200 payload + CE
> receiver sees 1200 octets with CE and ACKs these with ECE
> sender must assume 1200 octets were marked
> FALSE
>
> Fragmenting a segment post-marking
> (20+1200)
> -> AQM marks 1220 or 1200 octets
> (20+1200)+CE
> fragmentation happens:
> (20+400+CE) + (20+400+CE) + (20+400+CE)
> Resegmentation happens before protocol sees marking
> (20+400+CE) + (20+400+CE)-20 + (20+400+CE)-20 -> 1220 total 1200 payload + CE
> receiver sees 1200 octets with CE and ACKs these with ECE
> sender must assume 1200 octets were marked
> CORRECT
>
> So only in two out of four conditions does the proposed method actually achieve its goal.
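
The four conditions above can be tallied in a few lines of Python. This is only a sketch of the toy problem as described in the message, not an implementation of RFC7141 section 2.4: the names (sender_view, cases, results) are made up for illustration, and the "merge, then mark" case is modelled as if all three pieces carried CE, since the AQM marks the already merged packet.

```python
# Toy problem from the message: 20-octet headers (not needed for the payload
# tally below), 3 pieces of 400 payload octets each.
HDR, PAYLOAD, PIECES = 20, 400, 3
TOTAL_PAYLOAD = PAYLOAD * PIECES  # 1200

def sender_view(piece_flags):
    """Payload octets the sender infers as marked.

    The merged/reassembled packet carries CE if any piece did, so the
    sender has to treat all 1200 payload octets as marked.
    """
    return TOTAL_PAYLOAD if any(piece_flags) else 0

# Per case: payload octets the AQM really marked, and the CE flags on the
# three pieces at the point of merging/reassembly (an assumed model).
cases = {
    "merge, then mark":    (1200, [True, True, True]),
    "mark, then merge":    (400,  [False, True, False]),
    "fragment, then mark": (400,  [False, True, False]),
    "mark, then fragment": (1200, [True, True, True]),
}

results = {}
for name, (truth, flags) in cases.items():
    seen = sender_view(flags)
    results[name] = (truth, seen, truth == seen)

for name, (truth, seen, ok) in results.items():
    verdict = "CORRECT" if ok else "FALSE"
    print(f"{name:20s} AQM marked {truth:4d}, sender assumes {seen:4d} -> {verdict}")
```

The sender's view is 1200 octets in every case, so it matches the AQM's action only when the marked unit happened to be the full 1200 octets.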
>
> Now add the complication that the RFC fails to mention what it considers marked octets, just the payload or payload+headers.
> This is important as the sum of payload + headers of X fragments is larger than the sum payload + header of the single packet re-constituted out of these fragments. So the de-fragmenting process arguably needs to only look at payload size, but RFC7141 section 2.4 does not make that explicit.
> If an implementation actually uses the full size instead of the payload size now the last condition also gets it wrong:
>
> Fragmenting a segment post-marking
> (20+1200)
> -> AQM marks 1220 or 1200 octets
> (20+1200)+CE
> fragmentation happens:
> (20+400+CE) + (20+400+CE) + (20+400+CE)
> Resegmentation happens before protocol sees marking
> (20+400+CE) + (20+400+CE) + (20+400+CE) -> 1220 total 1200 payload + CE
> but (20+400)*3 = 1260 marked octets
> receiver sees 1200 octets with CE and ACKs these with ECE
> sender must assume 1200 octets were marked
> CORRECT
> But now the left over 40 bytes in the marked-octet budget will result in CE feedback for the next (re-assembled) packet.
> FALSE
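
The leftover budget in this last case is plain arithmetic, sketched below under the same toy assumptions (20-octet headers, three 400-octet fragments). The function name leftover_marked_octets and the count_headers switch are hypothetical and only serve to contrast the two possible readings of "marked octets".

```python
HDR, PAYLOAD, PIECES = 20, 400, 3

def leftover_marked_octets(count_headers):
    """Marked-octet budget left after marking the reassembled packet.

    count_headers=True  counts header + payload of every marked fragment,
    count_headers=False counts payload only.
    """
    hdr = HDR if count_headers else 0
    credited = PIECES * (hdr + PAYLOAD)      # octets credited by the 3 marked fragments
    reassembled = hdr + PIECES * PAYLOAD     # size of the single reassembled packet
    return credited - reassembled

print(leftover_marked_octets(count_headers=True))   # 1260 - 1220
print(leftover_marked_octets(count_headers=False))  # 1200 - 1200
```

With headers counted, 40 octets remain in the budget and spill over onto the next packet; with payload only, nothing is left over.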
>
> For RFC3168 that will not matter much as ECE is sustained until CWR is received anyway, but L4S style signaling has now acquired an erroneous CE mark.
Network fragmentation, be it in tunnels, extension headers or IPv4 
fragments, is indeed fraught with all manner of issues. Nothing new - 
the IETF has long recommended the unit of loss/marking to be the same as 
the end to end PDU. PMTU is tricky, but does have benefits:-)
>
>> GSO/GRO and variants would/could change the fragmentation, that is true and need to be considered.
> 	[SM2] I am confused. How do GRO/GSO affect fragmentation? IMHO these two will cause larger aggregates that exist only locally (Linux will segment meta-packets in the sending process and will not send out say a large 64K TCP packet in fragments, but will re-segment the meta-packet into a neat sequence of complete self-sustained TCP packets). IMHO they primarily affect the unit size the AQM might CE-mark on, in a way that is not transparent to the end points. My point is that the unit size an AQM acts on is generally not precisely knowable by the end-points. At which point making the end-points pretend that congestion strength somehow correlates with the size of marked packets really stops making sense.
>
The segment delivered can be a different size to the unit of 
transmission. This is an implementation optimisation - if this is done 
without regard to the marking, then the results will be different and 
likely will not deliver what is expected - optimisations need to 
understand what they optimise.
>>>>> B) Section 2.3 then recommends to use the size of marked packets as direct indicators of congestion strength.
>>>>>
>>>>> C) Section 2.3 then later clarifies that transports should interpret the size of CE-marked packets as correlate for congestion strength but are in no way required to take this interpretation into account when acting based on the congestion signal.
>>>>>
>>>>>
>>>>> This has several problems:
>>>>> 1) A) and B) are in direct contradiction to each other. If we ask marking nodes to ignore packet size while marking, but end nodes to take it into account we basically create random congestion strength "information" by the pure chance of a specific packet of a specific size "catching" a CE mark. At which point we might as well simply draw a random number at the end-point to interpret congestion strength (except that packet sizes are not distributed randomly).
>>>>>
>>>>> 2) Asking endpoints to interpret CE-marks in this way but not act on it is hardly actionable advice for potential implementers. If we cannot recommend a specific way, we should refrain from offering recommendations at all to keep things as simple as reasonably possible.
>>>> This doesn't appear to be textual errata, it seems more like the request is for more clarification or motivating an alternative?
>>> [SM] What alternatives to changing incorrect text exist? I do not think changing the status to historic is a realistic option, in spite of the text recommending the impossible.
>>>
>> Put simply:
>>
>> An Erratum would normally specify either:
>>
>>      a direct change of text to fix a mistake in production, but not a change of the spec from the original intended method;
>>
>>      or specify something to inform a future revision.
>>
>> An update in a new RFC is needed to change the method, or a process request to mark an RFC as historic.
> 	[SM2] Would it also be possible to request to re-classify as informative? This RFC with its impossible recommendations is causing issues with other RFCs and I think it would help if this could be ameliorated by moving away from BCP status.
>
If there is consensus an RFC shouldn't be associated with a BCP, we can 
examine what to do. The first thing is to write a (short) ID and see if 
you can gain sufficient attention from the WG to enable this to be 
discussed.

Gorry

>> Gorry
>>
>>>>> Instructions:
>>>>> -------------
>>>>> This erratum is currently posted as "Reported". If necessary, please
>>>>> use "Reply All" to discuss whether it should be verified or
>>>>> rejected. When a decision is reached, the verifying party
>>>>> can log in to change the status and edit the report, if necessary.
>>>>>
>>>>> --------------------------------------
>>>>> RFC7141 (draft-ietf-tsvwg-byte-pkt-congest-12)
>>>>> --------------------------------------
>>>>> Title               : Byte and Packet Congestion Notification
>>>>> Publication Date    : February 2014
>>>>> Author(s)           : B. Briscoe, J. Manner
>>>>> Category            : BEST CURRENT PRACTICE
>>>>> Source              : Transport Area Working Group
>>>>> Area                : Transport
>>>>> Stream              : IETF
>>>>> Verifying Party     : IESG