Re: [tsvwg] Progress with draft-ietf-tsvwg-ecn-encap-guidelines

Sebastian Moeller <moeller0@gmx.de> Fri, 22 September 2023 11:32 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <e80e355b-098b-ede4-71cd-560f2b480538@bobbriscoe.net>
Date: Fri, 22 Sep 2023 13:31:17 +0200
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <3C618140-7B42-4E4D-A107-BD02D3A228AD@gmx.de>
References: <23c00fae-e6a6-072c-0513-1c0d5c637c17@bobbriscoe.net> <3d442824-722f-90af-8d04-916b29bafca4@erg.abdn.ac.uk> <D2E7D6FA-C39D-44B4-BC27-8897CE24145C@gmx.de> <bdc9685f-77b1-50f6-63b7-8b167d850148@bobbriscoe.net> <C4D0E327-32E9-42E8-850F-DFF579612CD0@gmx.de> <52b12bcc-1aa7-9069-2b21-aeb00e9e39db@bobbriscoe.net> <646319E7-78D3-4BBF-9EC7-F069CE7124BA@gmx.de> <e80e355b-098b-ede4-71cd-560f2b480538@bobbriscoe.net>
To: Bob Briscoe <ietf@bobbriscoe.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/VbWCCTGY7FEp5ymgAq66CeuAZbE>

Hi Bob,


> On Sep 21, 2023, at 18:45, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> Sebastian,
> 
> On 13/09/2023 11:47, Sebastian Moeller wrote:
>> Hi Bob,
>> 
>> 
>> 
>>> On Sep 12, 2023, at 17:19, Bob Briscoe <in@bobbriscoe.net>
>>>  wrote:
>>> 
>>> Sebastian,
>>> 
>>> The draft is now going forward.
>>> 
>> 	[SM] As expected (and announced by the chairs). Had I not mentioned that method 2 was incorrect/incomplete in the past we would have ratified this long ago, in spite of it being incorrect. This does not fill me with confidence about our process.
>> 
>> 
>> 
>>> But I will still respond...
>>> 
>>> On 08/09/2023 08:23, Sebastian Moeller wrote:
>>> 
>>>> Bob,
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Sep 7, 2023, at 16:50, Bob Briscoe <in@bobbriscoe.net>
>>>>> 
>>>>>  wrote:
>>>>> 
>>>>> Sebastian,
>>>>> 
>>>>> Due to the impasse between the views of two 'camps', some time ago the chairs asked me (as editor of the draft) to write both goals in the draft without stating a preference for either. And two descriptions of example ways they might be implemented.
>>>>> 
>>>>> 
>>>> 	[SM] Yes, based on the observation that the same method was already sketched out in an earlier BCP. However, I question the validity of doing so given that:
>>>> 
>>>> 
>>> [BB] I've addressed your point (c) first, because it is the root of the misunderstanding.
>>> 
>>> 
>>>> c) it flies in the face of how a flow should IMHO operate, it should try to generate the best possible estimate of the real state of congestion along the path and then react appropriately. And since IMHO no AQM marks based on packet-size (and hence does not effectively mark individual octets), it makes no sense to propagate ECN marks in an octet preserving fashion, as that does add noise to the data, making it harder to get a veridical estimate of what happened. 
>>>> 
>>> [BB] Just because the word 'octet' appears in the technique, doesn't make its marking size-dependent.
>>> 
>> 	[SM] Yes that is the point, rfc7141 recommends to mark size independent, but to interpret the marking size dependent, the whole rationale behind your method 2 hence must be to make this size dependent interpretation possible by conserving this in spite of the re-framing.
> 
> [BB2] The two ends are different (size-independent and size-dependent). So, there's no point trying to infer what "the whole rationale ... must be" by picking one end or the other.

	[SM] Well, unless you want to be able to calculate something like the fraction of marked octets instead of the fraction of marked packets*, this is rather irrelevant. But humor me, please demonstrate that octet accounting here results in a better performance/fairness**/complexity trade-off.


*) Side note, most flows use as large packets as possible for bulk transfers, so these two ratios will essentially be pretty similar under most/many realistic scenarios, making this insistence on untested goal 2 really somewhat surprising, adding considerable complexity for little gain seems not what the IETF should recommend. UNLESS we could show that in reality goal 2 works much better we should apply something like Occam's razor and stick to the simplest methods that work.


**) I roughly summarize the rationale from rfc7141 quoted below as: to increase fairness between flows of different packet sizes:

3.3.  Transport-Independent Network

   TCP congestion control ensures that flows competing for the same
   resource each maintain the same number of segments in flight,
   irrespective of segment size.  So under similar conditions, flows
   with different segment sizes will get different bit rates.

   To counter this effect, it seems tempting not to follow our
   recommendation, and instead for the network to bias congestion
   notification by packet size in order to equalise the bit rates of
   flows with different packet sizes.  However, in order to do this, the
   queuing algorithm has to make assumptions about the transport, which
   become embedded in the network.  [...]


> 
>> I understand that the chairs of tsvwg have already asked you to write up a draft of your arguments against RFC7141, if you have any. This is the constructive way expected at the IETF. 
>> 
> 	[SM] I opted to create an erratum for rfc 7141 instead, and the response to that (and lack thereof) convinced me that writing a new draft is going to be an exercise in futility, but I digress.
> 
> [BB2] I've just responded to your erratum on the tsvwg list.
> Subject: [Technical Errata Reported] RFC7141 (7237)
> 
> So pls read that before continuing with this thread.

	[SM] I will respond to this (or not) separately.

>> 
>>> On the contrary, after decap, preserving marked octets preserves size-independence better than preserving the presence of marks. I'll try to explain that at the end of this response (in A3). 
>>> 
>> 	[SM] See, the real challenge here is to figure out post hoc how the marking entity would have decided, had it seen the IP packets individually... as this does not really seem achievable, I see the best approach as coming up with something that is simple and still covers the gist of the congestion signaling.
>> 
> 
> [BB2] Again, see thread about your erratum to RFC7141, and below.
> TL;DR: it is achievable.

	[SM] Well, the lack of an implementation that does so makes your claim appear to be rather hypothetical so far. As before, if you show real data convincingly demonstrating the validity of your claim I will retract my objection. Please bear with me though, that "it is achievable" is not the kind of data I would like to see. 


> 
>> 
>>> Before that, my first assertion might have raised a question in your mind:
>>>     Q1. In the draft, why do I say goal 1 is to preserve proportion, but the example preserves octets? Especially given that using octets seems to cause controversy.
>>> 
>> 	[SM] Why is it confused to assume that this method was motivated by rfc7141's recommendation to take the size of marked packets into account when "interpreting" congestion marks? 
> 
> [BB2] RFC7141 isn't actually written like that. It is broken into sections, where each is about a different part of the process: Encoding, Responding, or Splitting / Merging packets. RFC7141 never says congestion notifications can or should be 'interpreted' universally across all these stages (without specifying what is doing the interpreting).

	[SM] It is rather odd that you claim that rfc7141 consists of 3 sections that should be read in isolation; I had assumed the IETF would have published three different documents if that had been the goal. Honestly, these things are interlinked and hence looking at the problem holistically seems to be the obvious thing to do. 
To be rather frank here, this is not a good faith argument you seem to be bringing here.



> 
> (Except for that sentence in §2.4 that I already said (further down the last round of this thread) that I would now disown;

	[SM] Well, write an erratum and see whether it gets accepted. Technically RFCs/BCPs are not supposed to be editorials of an individual's opinions but to document the (rough) consensus in the IETF, so "disowning" (just like adopting or ratifying) would probably require buy-in from the WG/IETF community, no?


> where it wrongly says that octet preservation for splitting/merging is based on the principle of responding to congestion as if every octet of a marked packet is marked.)
> 
>> Rfc7141 mainly talks about proportionality to packet size, not proportionality to the proportion of marked bytes. (With the possible exception of Appendix B.1: 
>>   "Packet-mode drop actually gives flows sufficient information to
>>    measure their loss rate in bits per second, if they choose, not just
>>    packets per second.  Each flow can count the size of a lost or marked
>>    packet and scale its rate response in proportion (as TFRC-SP does).)"
>> 
>> If the justification to add method 2 is truly rfc7141's precedent then I would expect that method 2 actually is a consequence of rfc7141, which strictly does not seem to be the case.
>> 
>> Side-note: looking at TFRC-SP I get:
>> In TFRC-SP, the loss event rate is calculated by counting at most one
>>    loss event in loss intervals longer than two round-trip times, and by
>>    counting each packet lost or marked in shorter loss intervals.
>> 
>> This implies that TFRC-SP does indeed not look at the size of marked/lost packets when registering marks... so not sure the TFRC-SP reference here is useful.


	[SM] Ping? I would like your rationale why I must be mis-understanding this issue, please?


>> 
>>> That might lead to a second question, which I'll also answer below:
>>>     Q2. Why preserve marking proportion anyway? Is it "the best possible estimate of the real state of congestion along the path"?
>>> 
>>> A1. Why preserving octets preserves proportion
>>> 
>>> 
>>> Imagine a stream of packets with their headers all run back-to-back as a stream of octets then cut up into frame payloads at L2. 
>>>     For instance, consider scenario a) at: 
>>> https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing.svg
>>>  (which excludes frame headers)
>>> 
>>> Now, take any window of data, e.g. the first 3 frame payloads (top left). 1/3 are marked pink.
>>> Now count the packets encapsulated inside those 3 frames. There are about 15. So proportion preserving wants 1/3 of 15 = 5 to be marked. The marked octet preserving technique in the Goal2 row marks 6 packets, which is near enough, given it deliberately rounds up.
>>> 
>> 	[SM]  Looking at your example, I immediately note that you did NOT depict the variant to achieving goal 1 by marking all packets somehow "touched" by the marked L2-frame (aka method 1a)... this will result in something in between goal 1b and goal 2 ...
> 
> [BB2] I didn't depict method 1a, because I do not believe it implements Goal1


	[SM] Ah, fair enough, however it would have helped if you had simply pro-actively mentioned that... 



> and I'm arguing against Goal1 here (which I believe only method 1b implements). Nonetheless, as you know, the chairs asked me (as editor) to include both methods 1a & 1b verbatim in the draft, in order to document the lack of consensus. 

	[SM] My understanding here is that removing any of these sections was and probably still is a viable option. In my reading (which might be incorrect) the chairs did not insist upon including both goal 1 and goal 2 but tried to split the difference of opinions by offering the compromise of including both. But again, I might be wrong. 



> However, as you brought it up, here's the problem with method 1a:
> Consider scenario b) at: https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing.svg
> If there were a Method 1a row, an episode with infrequent frames being marked would translate into a run of 100% packet marking (or near-100%); similar to the right-hand episode in the Goal1 row of scenario b). Thus limiting the dynamic range available for frame marking (why that's such a problem is explained further below wrt method 1b).

	[SM] Yes, that can indeed happen, and that is also the one condition where the results of methods 1a and 2 would differ noticeably. However, the question arises how likely this scenario is going to be and what the consequence of this slight over-marking would be on the CC loops of the affected flows. My gut feeling is: not often and not much, but that should be settled via real data, not gut feeling.


> So, having already shown (below) that Method 1b is problematic, both methods that purport to implement Goal 1 are problematic. That's because, as I've argued on this list, the rationale of Goal 1 is incorrect.

	[SM] Which did not even result in rough consensus if I recall correctly?


> 
>> [SM] ...with considerably less complexity than goal two (with its requirement for a timeout and to account for dropped frames/packets). My point is goal 2 seems pretty complex with very little to show for it in regard to data demonstrating that it is superior to methods 1a and 1b, which are both considerably simpler in scope and implementation.
> 
> [BB2] So, you say method 2 has "got very little to show for it",
> ... other than you seem to be tending towards agreeing that it's the only one that works robustly - you just don't like the implementation!?

	[SM] Thanks for rephrasing your understanding of my argument. No, I do not agree that method 2 is robust or reliable. My gripe with its "implementation" is the complete lack of an implementation, resulting in a clearly incorrect/incomplete description of how to reach goal 2 in earlier versions of the draft. But humor me, show me a real implementation and data demonstrating clearly superior "performance" over methods 1a and 1b... equal "performance" is IMHO not enough, given the considerably higher complexity.


> 
> But you have made this complexity pronouncement without having seen code for either method and you don't know the specifics of the protocols involved, or even whether it's for hardware or software. 

	[SM] Indeed, I will make a prediction that a method that relies on immediately propagating a mark from a frame to one/all related IP packets, compared to keeping several counters (that can roll over) and timers/timeouts, will be easier to implement, all else being equal. And yes, I will make that pronouncement without ever having seen the code. But humor me, and show me sane implementations for both methods, where method 2 does not come out considerably more complex than method 1a or 1b.


> 
> I'm not going to descend into "my pseudocode is less complex than yours."  

	[SM] Appreciated, I would like to see real code compared not pseudo code. But IMHO the complexity considerations make it moot to even look at pseudocode, method 2 has more moving parts and hence more potential to get it wrong than both method 1a and 1b.


> But I've given pseudocode for method 2, so that others can judge complexity, depending on their particular scenario and protocols involved:

	[SM] Which I am not going to dive into, following your "my pseudocode is less complex than yours." proposal... 


>     https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing_goal2_pseudo.c
> 
> Re your specific points on complexity:
>   * Accounting for dropping due to incompatible ECN fields:
>       o Method 2 doesn't need to because each ECN type is already classified into a separate stream of frames. 

	[SM] Which in itself comes with its own increase in complexity, that needs to be accounted as part of method 2....

>       o Methods 1a & 1b:
>          - will either need code branches to handle all the possible combinations of ECN types,
>          - or they will also have to rely on classification into a stream per type.

	[SM] This is irrelevant for 1a, as the goal is to pass the congestion signal (mark or drop) on to any related IP packet, so whether we mark or drop we do the right thing. For 1b, we would see indeed a potentially inflated signal propagation, if we do not account for dropped packets... IMHO I would be inclined to simply ignore this, as I do not read 1b as requiring strict adherence to 1 mark in one mark out, but different opinions are possible.


>   * Timeout: 
>       o I'll leave others to judge whether a single timeout is complex.

	[SM] With 4 ECN code-point queues/counters I would expect that you need a similar amount of timeouts as well. But yes requiring a timer/time-out at all increases the complexity considerably.

> 
> To preserve proportion (Goal2) I proposed method 2 in preference to other candidates because, during normal operation, there is no branching, making it ideal for pipelining, and I've shown how to avoid a lock by not sharing writing of the balance between in and out. Also, in method 2, there are only two state variables per ECN class, whereas both method 1a & 1b require ECN state per-packet.
> 
>> 
>>> Proportion is preserved because, when marking is approximately independent of frame and packet size:
>>>     marked frames / total frames ~= marked octets before decap / total octets
>>>     marked packets / total packets ~= marked octets after decap / total octets
>>> In both cases frame headers are excluded, but packet headers are included, which makes total octets the same.
>>> 
>> 	[SM] This is fine, the question is still, is this complexity actually warranted.
> 
> [BB2] Complexity... according to your assessment.

	[SM] Indeed... I would be interested in where the consensus on "complexity" falls in the WG.

> 
>> 
>>> By ensuring marked octets are preserved before and after decap, both the ratios on the right will be identical. This makes both the marking ratios on the left (frame and packet) approximately the same.
>>> 
>>> A2. Why preserve proportion?
>>> 
>>> For the same traffic and link scenario, irrespective of whether the decap preserves presence (Goal1) or proportion (Goal2), congestion control algorithms (CCAs) and the AQM will adjust so that the sum of the flows still fits into the link. 
>>> 
>>> The only reason to preserve marking proportion is to avoid the proportion of marks being shifted too far outside its normal operating range. I.e. not so high that it more often saturates at 100% and not so low that the AQM has to emit marks so far apart that control becomes very slack or jumpy.
>>> 
>> 	[SM] Then please show that methods 1a and/or 1b actually affect the proportion of marked bytes sufficiently strongly to make this more than a theoretical musing. 
> 
> [BB2] Please don't trivialise other people's arguments with swipes like this. 

	[SM] Well, "theoretical musing" was indeed a quite direct way of expressing my surprise that a decade after this was first proposed there is still no empirical test of the idea behind method 2. I note that you did not even try to answer my request, but redirected the discussion towards appropriate manners.


>> 
>>> For example, let's consider scenario a) in https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing.svg
>>>  again. With the Goal1 decap (2nd row assuming the second implementation bullet from the draft), even though there are about 3 packets in each marked frame, only one packet gets marked each time. 
>>> 
>> 	[SM] Yes, that is a direct consequence of the propagate the number of congestion marks policy that is at the heart of method 1b.
>> 
>> 
>> 
>>> Assuming the IP packets in a frame will often belong to separate flows, this means fewer flows see each L2 mark. So the system will adjust to increase the L2 marks, until enough flows respond. But the AQM cannot mark more than 100% of the frames, so it cannot mark more than about 1 in 3 of the packets.
>>> 
>> 	[SM] Yes, and that is why we have method 1a, which certainly will hit all flows in a marked frame...
> 
> [BB2] By resorting to method 1a, I think you're admitting that method 1b is problematic. 

	[SM] No, I accept that these methods (actually all three methods) are subtly or not so subtly different, but that IMHO is not a problem. The question is IMHO how well does each method do with real congestion signaling compared to its complexity/cost. As I think I mentioned before the real crux here is that we are trying to second guess the marking entity, ideally we would propagate the congestion marking as if the marking entity had seen the actual IP packets not just the non-aligned frames. And here IMHO it is far from clear which of the three currently discussed alternatives actually does best. But without knowing that comparing differences between the methods in isolation does not allow us to figure out which actually is "better".
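To make the differences between the three methods concrete, here is a minimal sketch in Python. It is not the draft's text nor the pseudocode linked above; it only encodes the descriptions used in this thread: method 1a marks every packet touched by a marked frame, method 1b marks the packet containing the start of each marked frame, and method 2 keeps a running balance of marked octets. Timeouts, drops and per-ECN-codepoint classification are deliberately left out, and all function names are hypothetical.

```python
from itertools import accumulate

def spans(sizes):
    """Return (start, end) octet offsets for units laid back-to-back."""
    ends = list(accumulate(sizes))
    return list(zip([0] + ends[:-1], ends))

def marked_spans(frames, frame_marks):
    """Octet spans of the frames that carry a congestion mark."""
    return [sp for sp, m in zip(spans(frames), frame_marks) if m]

def method_1a(frames, frame_marks, packets):
    """Mark every packet that overlaps any marked frame."""
    ms = marked_spans(frames, frame_marks)
    return [any(ps < fe and fs < pe for fs, fe in ms)
            for ps, pe in spans(packets)]

def method_1b(frames, frame_marks, packets):
    """Mark only the packet containing the first octet of each marked frame."""
    starts = [fs for fs, _ in marked_spans(frames, frame_marks)]
    return [any(ps <= s < pe for s in starts) for ps, pe in spans(packets)]

def method_2(frames, frame_marks, packets):
    """Simplified octet balance: credit marked octets, debit marked packets."""
    ms = marked_spans(frames, frame_marks)
    balance, out = 0, []
    for ps, pe in spans(packets):
        # Credit the marked octets that this packet carried inside frames.
        balance += sum(max(0, min(pe, fe) - max(ps, fs)) for fs, fe in ms)
        if balance > 0:
            out.append(True)
            balance -= pe - ps  # may go negative, i.e. marking "rounds up"
        else:
            out.append(False)
    return out

# Assumed toy scenario: three 1500 B frame payloads, the middle one marked,
# carrying ten 450 B packets whose boundaries do not align with the frames.
frames, marks, packets = [1500] * 3, [False, True, False], [450] * 10
for f in (method_1a, method_1b, method_2):
    print(f.__name__, sum(f(frames, marks, packets)), "of", len(packets))
```

In this toy scenario methods 1a and 2 both mark 4 of 10 packets and method 1b marks 1 of 10. That says nothing about which behaves better with real traffic; it only illustrates how the bookkeeping differs.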


> I've now shown (above) that method 1a is also problematic. And here's why method 1b is not just slightly problematic (addressing your earlier swipe)...

	[SM] Method 1b might not be perfect (see above why it can not) but it is quite simple to implement and might well be "good enough" already and hence seems like a decent pragmatic choice.

> 
> You can imagine a longer period of 100% frame marking (say 100 frames) that would still translate to about 1 in 3 packet marking. 

	[SM] For that combination of packet and frame sizes yes... but for an rfc3168 flow a single mark per flow per RTT is sufficient to elicit a rather large rate reduction. Now the per packet marking probability is 33% but we do not need more than a single mark per flow... Now, dctcp/L4S might be less reactive under such conditions, but they will also come around sooner or later...



> So, if congestion has got bad enough to cause about 33% packet marking, if more flows joined, congestion at the AQM would get worse,

	[SM] That is true... more flows especially in slow-start will make congestion harder, but that is independent of the mark propagation method.



> but it wouldn't be able to mark more than 33% of packets, because it can't mark more than 100% of frames. Now consider even further that the frames are 9,216 B and the packets are still 1500B, therefore the ratio is about 6:1, not 3:1. Then transports would be limited to a range of just 0-17% marking.

	[SM] Well, I very much think that single bit congestion signal is clearly sub optimal and giving more information per congestion signal is the way forward, that solves your problem as well, if some proxy of "magnitude" is encoded in the congestion signal. But that is orthogonal to the issue at hand (whether we should recommend purely theoretical concepts in official IETF RFCs/BCPs, IMHO we should not, and if we do we at least should clearly and explicitly mark such considerations as untested).

But sure, under your conditions one might test with real traffic and then potentially switch to method 1a, assuming that method 1b truly is insufficient.
(This is a bit of a unicorn discussion until we know how prevalent the situation we are discussing here actually is: an L2 transport or non-IP L3 tunnel that has "frames" not aligned with the encapsulated IP packets but that still employs some form of congestion signaling that can be meaningfully translated to ECN bits.)
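For reference, the ceiling arithmetic quoted above can be reproduced in a few lines. This is only a back-of-the-envelope sketch under the assumptions stated in the thread: every frame is marked, method 1b marks exactly one packet per marked frame, and all packets have the same size. The function name is hypothetical.

```python
def max_packet_marking_1b(frame_payload_bytes, packet_bytes):
    """Upper bound on the packet marking fraction under method 1b when
    100% of frames are marked and one packet is marked per frame."""
    packets_per_frame = frame_payload_bytes / packet_bytes
    return min(1.0, 1.0 / packets_per_frame)

for frame in (4500, 9216):
    print(frame, round(max_packet_marking_1b(frame, 1500), 3))
```

With 1500 B packets, a 4500 B frame payload (3:1) gives a ceiling of about 0.333, and a 9216 B frame (about 6:1) gives about 0.163, in line with the "33%" and "0-17%" figures above.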


> 
> Consider further that the 300 packets within those 100 frames might map to say 100 flows (assumed all equal rate for illustration), i.e. about 3 packets per flow. So on average about 1 packet per flow would be marked. In practice some flows get more marks and some less. So a number of the flows wouldn't even see a mark, even though the AQM is marking 100% of the frames. 

	[SM] Yes, that likely would work out well for rfc3168 ECN where even a low marking rate causes a significant effect. For L4S it really just highlights the conceptual challenge when using CE marks to transmit congestion severity via mark frequency in a non per-flow fashion... But let's face it if the rationale for method 2 is that it seems required to maintain appropriate marking rate for L4S then maybe explicitly state so in the draft instead of justifying this via rfc7141 (though I still would object to method 2).
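The "some flows would not even see a mark" statement can be given a rough number. The sketch below assumes, purely for illustration, that each packet is marked independently with the same probability; method 1b does not strictly behave that way (it marks one packet per frame), so treat the result as an order-of-magnitude estimate only. The function name is hypothetical.

```python
def p_flow_sees_no_mark(mark_fraction, packets_per_flow):
    """Probability that none of a flow's packets is marked, assuming
    independent per-packet marking with probability mark_fraction."""
    return (1.0 - mark_fraction) ** packets_per_flow

# Figures from the example above: about 1 in 3 packets marked,
# 300 packets spread evenly over 100 flows, i.e. 3 packets per flow.
p = p_flow_sees_no_mark(1 / 3, 3)
print(round(p, 3), "-> roughly", round(100 * p), "of 100 flows unmarked")
```

Under that independence assumption roughly 30 of the 100 flows would see no mark during the episode, even though every frame was marked.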

> 
> So you can see why I didn't want method 1b in the draft - it can severely limit the usable range of marking proportion.

	[SM] Well, here is the kicker, signaling congestion severity via rate modulation of the ECN bitfield over the temporal sequence of packets is simply not a good method (as in robust and reliable).


> 
> The point here is that the rationale for preserving presence/timing of congestion events doesn't carry over from IP fragmentation to L2 encapsulation, where the packets inside a large frame will generally belong to different flows. Then, the timing of a congestion event should be propagated to all the flows within the frame. 

	[SM] As I said, the problem is that we try to second-guess what the marking entity would have done had it seen the IP packets instead of the frames...


> 
> Method 1a, does that, but it also spreads to all partial packets covered by the frame, which can saturate packet marking, as explained earlier.

	[SM] Yepp, but again that just shows that marking proportion is simply not a robust and reliable way of signaling congestion magnitude over the internet. It can/will/does work very well in controlled environments, but generally, rather less well in my opinion.


> 
> 
>> 
>>> Conversely, in scenario b) with the Goal1 decap, if the AQM marks more than about 1 in 3 of the smaller frames, 100% of the packets will be marked. So, taking the system as a whole, the AQM will mark fewer frames before the CCAs slow down enough. This gives the AQM a smaller operating range and it is likely to make the system more jumpy, and less controllable.
>>> 
>>> 	[SM] As I said before here we are trying to second guess the AQM, we do NOT know how that AQM would have marked had it seen our actual IP packets, so it seems futile to figure out which of our approximations is "best", the goal should be "good enough" and as simple as possible.
>>> 
> 
> [BB2] The question for the ecn-encap draft was: which principle or goal to recommend. You seem to be back-tracking away from Goal1 and veering towards method 1a being an approximation that could be used for Goal2. Pls confirm whether you no longer subscribe to Goal 1?

	[SM] Nice choice of words there... "back tracking"... this is not a useful summary of my statement above, is it?
But, no, I will not confirm that I consider goal 1 worse than goal 2*, I am still convinced that method 2 does not offer compelling advantages over the considerably simpler methods 1a and 1b... as these seem to win in the "good enough" category, if you can show method 2 actually approximating what the AQM would have done had it seen the IP packets, please do so.

*) If we want to signal congestion severity we should absolutely do so via a per-packet unambiguous encoding, and not transmit it via 1-bit frequency modulation of the CE code-point, spraying this signal over packets independent of flow identity.


> Method 2 illustrates Goal2 precisely, and I /believe/ it can be implemented very simply, possibly with less complexity than the Goal1 methods.
> Nonetheless, for the purposes of a design guidelines draft, it is expected that a protocol designer (and implementers) will optimize for their particular case, which could involve methods that only approximate the goal.

	[SM] Indeed, the ONLY consequence of not following an IETF document is not being allowed to claim compliance with that document, so implementers clearly are free to do what they want. But that is still no convincing argument for recommending untested methods.

> 
>>> A3. Preserving size-independent marking
>>> 
>>> It might help to visualize the following using 
>>> https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing.svg
>>> 
>>> 
>>> If L2 frame boundaries are completely independent of L3 packet boundaries:
>>> 	• If the packet that includes the start of each marked frame is marked (as in each Goal1 row), packet marking will become size-dependent.
>>> 		• Reason: the start of each frame is equivalent to a point picked at random in the stream of packets.
>>> 		• any point picked at random within the stream is more likely to fall within a larger packet
>>> 	• With the Goal2 approach, the start of a congestion episode will tend to fall within a larger packet by the same reason as for Goal1 above
>>> 		• further packets in the congestion episode are marked depending on how much octet marking is left over from previous marking
>>> 			• so subsequent packets are marked whatever their size
>>> 		• however, the last packet to be marked in an episode will also tend to be a larger packet, 
>>> 			• because the last octet in a frame-marking episode is also a random point in the packet stream, like the start.
>>> 
>> 	[SM] As I said, this assumes a specific way the AQM would mark the IP packets in question 
> 
> [BB2] The only presumption (adopted from you) is that frame marking starts out size-independent before decap.
> BTW, the AQM is marking L2 frames, not IP packets.
> 
>> (say if variable size L2 frames would be used each only containing an individual IP packet).
> 
> [BB2] Er... that's the simple case with aligned frame and packet boundaries, which is outside the scope of this whole discussion.

	[SM] That was implied, yes.


>> 
>> 
>> 
>>> If the traffic stream contains idle periods, of course, the first L2 frame and L3 packet boundaries after each idle will coincide.
>>> 
>>> 
>>>> a) in the years since that BCP was ratified no known implementation of that method has come to see the light of day (and hence no real data exists about it working as intended)
>>>> 
>>> [BB] Before responding to that, with hindsight, I would make the following picky disagreement with one of my own sentences [in §2.4 of RFC7141]:
>>>    "This [octet preserving when splitting or merging packets] is based on the principle used above;
>>>    that an indication of congestion on a packet can be considered as an
>>>    indication of congestion on each octet of the packet."
>>> 
>>> 
>>> In the context of splitting or merging packets at decap, I would say today that octet preserving is an implementation technique not a principle. At decap, the appropriate principle is to preserve the proportion of marking (which happens to also preserve octets at any function where total octets are preserved - by the reasoning given earlier). 
>>> 
>> 	[SM] Yes, you would need to do that, given that the justification for method 2 is not really obvious in RFC7141... which IMHO is a problem in itself as it is odd to claim precedence by an RFC if that RFC actually says something different (however rfc7141 still wants octet preservation and method 2 will also deliver that).
>> 
>> 
>> 
>>> [BB] Now to your point,... 
>>> I'm not sure there have been many, if any, implementations of splitting or merging packets while propagating ECN since RFC7141. And I'm not sure how you know there haven't been any that follow §2.4 of BCP7141. (I admit though that, if you were omniscient, I wouldn't know that you were, because I'm not.)
>>> 
>> 	[SM] Oh, I asked here on the list and got back no response, which I think conservatively needs to be interpreted as non-existence for the scope of this discussion, the onus to show differently is on the proponents of method 2 IMHO.
> 
> [BB2] So presumably you also believe there have been no implementations of Goal1 either?

	[SM] This is not about what I believe (or should not be) but what is observable fact: I personally have seen no implementation of methods 1a, 1b or method 2 and hence consider it to be understandable not to recommend any, as these all seem to suffer from not-implemented-anywhere syndrome. My rationale against recommending on purely theoretical considerations cuts both ways and is independent of whether I predict a method to work more or less well. But that is why I asked, I do not know and do not want to claim otherwise.


> 
>> 
>> 
>> 
>>> In two cases that I am aware of (both protocol specs, rather than implementations), ECN has been disabled over an encapsulating tunnel in order to avoid having to propagate ECN at decap:
>>> 
>>> 
>>> https://datatracker.ietf.org/doc/html/rfc9347#section-3.1
>>> https://datatracker.ietf.org/doc/html/draft-ietf-masque-connect-ip-13#section-10.2
>> 	[SM] Which can be counted against all methods, and to be honest if there is no data showing methods 1a and 1b being used, I would propose to rip out that whole section as well, not only method 2.
> 
> [BB2] On the contrary, if we had sorted out which goal to recommend, they would have been able to follow the recommendation.

	[SM] But to do so we preferably would have looked at real data from implementations of both goals showing how well they perform over the existing internet... 


> 
>> 
>>>> b) you yourself got involved in specifying a protocol (TCP Prague) that also does not seem to follow that method
>>>> 
>>> [BB] When we were first using Linux DCTCP for L4S, it used acked_bytes to maintain the fraction of ce-marked bytes, which did follow the principle of treating all the octets in a marked packet as marked. See:
>>> 
>>> https://elixir.bootlin.com/linux/v3.18.9/source/net/ipv4/tcp_dctcp.c#L186
>>> 
>>> 
>>> But by the time Prague was forked off from DCTCP, someone had changed DCTCP to counting marked packets, probably for efficiency because the relevant variables (delivered_ce and delivered) were already maintained by the kernel. But none of us noticed until a while later. We should now make the change back to bytes, but haven't got around to arguing it through with everyone yet.
>>> 
>> 	[SM] Well, come back after you made that change? It seems rather odd that you are willing to stall a draft for almost a year to fight for a method that you did not bother to actually use (consistently) in your own protocol.
> 
> [BB2] ecn-encap draft stalled on a recommendation about non-aligned frame/packet boundaries, which is for lower-layer network infrastructure and therefore has much wider impact than one CCA (Prague), which can be changed very easily if responding to small packets becomes problematic.

	[SM] It stalled because you insist upon including method 2 at all costs... 


> Also, I think you're getting your RFCs and drafts confused. The recommendation about end systems is in RFC7141, not ecn-encap. And BTW the recommendation about end-system in RFC7141 is quite liberally worded (because it is primarily about avoiding self-harm, not inter-flow interaction).
> 
>> 
>> 
>> 
>>> As you saw in my response to the review from Neal, it doesn't much matter when only a few ECN-capable packets are smaller than the SMSS, because the proportions of marked packets and marked octets aren't often that different. But once ACKs are ECN-capable, it becomes important to get this right, particularly in connections with significant 2-way data flow.
>>> 
>> 	[SM] I would respectfully argue that in connections with significant 2-way data flow the ACK will often be piggy backed onto data packets, hence will not be all that small.
>> And then I ask: if we talk about pure reverse ACK flows what differential response to a marked ACK do you consider appropriate (keeping in mind that the AQM was supposed to mark packet based, hence marking probability should be decoupled from packet size)?
>> 
> 
> [BB2] The relevant scenario is where the direction of flow alternates, so you get a round-trip of pure ACKs at the end of each volley.

	[SM] Yes? Unless you are willing/prepared to reduce the ACK rate as response to congestion signaling in the current ACK direction there is little you can do here... this is something already mentioned in rfc3168, so is e.g. QUIC prepared to reduce its ACK frequency and how is this going to affect high fidelity congestion signal in the other direction (as that requires ACKs)? So CE marking pure ACKs might be worth researching, but it does not yet seem like a solved problem... This is interesting, but seems an orthogonal issue.



> 
>> 
>>>> d) I could poke severe holes into the described method (that seem fixed now) and it seems a clear indication that this method was truly never more than a sketch.
>>>> 
>>> [BB] Indeed, it /was/ meant to be a sketch to give implementers an idea of how they might implement the design goal.
>>> 
>> 	[SM] And that is exactly what I think should be avoided for RFCs/BCPs unless said sketch is based on real data.
> 
> [BB2] But I had worked out the details and considered alternatives before sketching it at high level.

	[SM] As you should and I applaud you for that.


> Because I (also) subscribe to the view that it has to be known to be possible to implement a policy, before that policy can be recommended. 


	[SM] Sure that seems like a decent necessary condition to require, but not a sufficient one; for that it needs to be quite certain that the policy actually ends up performing better than alternate simpler solutions.


> 
>>> At the time you pointed this out, I thought it was obvious that an implementer would not mix non-ECN and ECN codepoints together in the same frames, but I was happy to say that explicitly when you asked.
>>> 
>> 	[SM] "I thought it was obvious" is not the most robust and reliable approach to write recommendations. 
>> 
>> 
>> 
>>> Both the examples for how to achieve Goal1 suffer from the same problem.
>>> 
>> 	[SM] How so? The issue is that method 2 needs special accounting for dropped packets as otherwise the counters go out of sync
>> Let's look at method
>> 1a) Mark all IP packets related to a marked L2 frame:
>> This will not see dropped IP packets, but it does not matter as a drop in itself is a (slightly ambiguous) congestion signal, also all packets that can carry a mark will be marked, so this method is IMHO robust against that issue
>> 
> 
> [BB2] A Not-ECT packet covered by a marked frame would need to be dropped. That's admittedly not a counting problem, but there's a problem with large amounts of unnecessary drop

	[SM] I respectfully disagree, the amount of drops would be the same*, what might differ is where along the path they are dropped, at the AQM node or at the de-framer, if the AQM-node is severely under water doing the drops there would be better than delaying them to the de-framer, but both will have the same effect on the affected flows response...

*) assuming otherwise identical conditions when the respective pure or mixed Not-ECT hit the AQM node which wants to signal "slow-down".


> unless packets are pre-classified into ECN types, which again I assumed they would be for this approach.

	[SM] Well, I guess operators of such re-framing links would need to tell us what level of complexity they are willing to carry (at both ends of the frame link) to help out end-to-end congestion signaling. Many backbones, as far as I heard, side step this issue completely by simply making sure utilization stays below X% (often rumored to be ~80%) of capacity. So the relevant links would need to be either internal (leaf to back-bone) or external (back-bone to peerings/transits), how likely is it that such links employ a link technology that is affected by the issue we are discussing here?
	That said, if the en-framer makes sure frames are always single type then this method does not have that issue, if we assume such an en-framer for method 2 it seems fair to make the same assumption for method 1a as well. At which point we have no problem.



> 
>> 
>> 1b) propagate a single mark only to one IP packet out of the set related to a L2 frame:
>> So what happens here is that we end up potentially sending more congestion signals per frame and hence slightly violate the method. If a single IP packet was dropped one could argue the precise way forward would be to only propagate a mark to an eligible packet if no packet was dropped; but given that a single marked frame could contain multiple Not-ECT IP packets (that need to be dropped) this is unfixable. It also will not matter all that much, since no end2end protocol will depend on the mark propagation following this method strictly.
>> 
> 
> [BB2] I think we can conclude that you're trying to wriggle out of your previous assertions.

	[SM] That is certainly a way of interpreting this; a pretty impolite way that assumes to know quite a lot of my internal state, but a way nevertheless. What I am trying here is to go through the scenarios in an objective* fashion, and that implies that if a method has issues I mention these and then give my interpretation of the severity of those issues.
Side-note if we assume your ECN-sorting en-framer, this method will also have zero problems... 



*) Not saying I achieve this, but that I try.


> 
>> 
>> 
>> But since you say both goal 1 examples have the same issue, please elaborate how this affects 1a, and how it affects 1b in a relevant way. 
>> 
> 
> [BB2] Surely you just did.

	[SM] But did I? Method 1a above has no correctness issue with mixed ECN-frames as you seem to admit and method 1b is strictly unfixable with mixed ECN-frames (with separated ECN-frames 1b has no issues). But again, I do not believe that method 1b would fail to work well enough even when occasionally signaling more than a single packet per frame (but have no data to back this up).
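
A minimal sketch of methods 1a and 1b as worded in the draft may help to compare them on mixed ECN-frames. Everything here is an assumption for illustration: the `Frame` and `Packet` types, the `propagate` function, and in particular the "consistent choice" for 1b, which is taken to be the packet containing the first octet of the marked frame. A selected Not-ECT packet is dropped rather than marked, following rule 1 as quoted further down.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    size: int     # payload octets carried by the L2 frame
    marked: bool  # True if the L2 AQM congestion-marked this frame


@dataclass
class Packet:
    size: int  # octets of the reassembled IP packet
    ect: bool  # True if ECN-capable, False if Not-ECT


def spans(sizes):
    """Return (start, end) octet offsets of consecutive units in the stream."""
    result, pos = [], 0
    for size in sizes:
        result.append((pos, pos + size))
        pos += size
    return result


def propagate(frames, packets, method):
    """Return 'forward', 'mark' or 'drop' per packet for method '1a' or '1b'."""
    if method not in ("1a", "1b"):
        raise ValueError("method must be '1a' or '1b'")
    pkt_spans = spans(p.size for p in packets)
    selected = [False] * len(packets)
    for (f_start, f_end), frame in zip(spans(f.size for f in frames), frames):
        if not frame.marked:
            continue
        # Packets constructed, in whole or in part, from this frame.
        overlapping = [i for i, (p_start, p_end) in enumerate(pkt_spans)
                       if p_start < f_end and f_start < p_end]
        if method == "1b":
            # Assumed consistent choice: the first overlapping packet only.
            overlapping = overlapping[:1]
        for i in overlapping:
            selected[i] = True
    return ["forward" if not sel else ("mark" if pkt.ect else "drop")
            for sel, pkt in zip(selected, packets)]


frames = [Frame(1000, False), Frame(1000, True), Frame(1000, False)]
packets = [Packet(600, True), Packet(600, False), Packet(600, True),
           Packet(600, True), Packet(600, True)]
print(propagate(frames, packets, "1a"))
# ['forward', 'drop', 'mark', 'mark', 'forward']
print(propagate(frames, packets, "1b"))
# ['forward', 'drop', 'forward', 'forward', 'forward']
```

In this example the single marked frame overlaps three packets, one of them Not-ECT. Under 1a the signal reaches all three (one as a drop); under 1b with the assumed choice it is delivered only as the drop of the Not-ECT packet.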

> 
>> As far as I can see it is only the elaborate dual counter method that requires special care in this regard.
>> 
>> 
>> 
>> 
>> 
>> 
>>> But I didn't touch them 'cos I had been asked to include that wording verbatim (and like I said, I think it's obvious that one doesn't mix ECN codepoints within the same frame).
>>> 
>> 	[SM] "it's obvious that one doesn't mix ECN codepoints within the same frame" this as a policy will either introduce re-ordering at the en-framer, or will require variable sized frames (at which point one could do 1:1 frame to IP packet framing making things moot again), or partially empty frames (wasting utilisation); it will also make the en-framer (and de-framer) considerably more complex e.g. by requiring multiple queues...
> 
> [BB2] Classification by ECN codepoint shouldn't lead to data reordering.

	[SM] Yeah, but that "should" is hard to enforce in reality unless "partially empty frames" are used. Heck, consider the framing "by ECN codepoint" which implies that CE marks get their own special queue, but these are rare, so what do we do? The "obvious" solution is to not treat them differently from ECT(0) and ECT(1), but that only works if we treat ECT(0) and ECT(1) in one aggregate, but for L4S we especially do not do that (CE should go with ECT(1))... ceterum censeo that redefining what a CE means is a "present" that keeps on giving.


> Whatever, these sorts of issues all depend on the specific circumstances - degree of aggregation, frame sizing constraints, whether flows consist of packets all of the same ECN type, etc. which is why only high level examples are given in ecn-encap. 
> 
>> 
>>>> I understand that BCPs, in spite of what their name implies, are intended to inject some policy, but if such an injection attempt proves to be a complete dud, as here, I argue it is time to drop it.
>>>> 
>>>> 
>>> [BB] As above, the principle of preserving proportion is held by most people.
>>> 
>> 	[SM] I accept that this is your assumption. Given the amount of people that participated in this discussion I am more inclined to believe most folks do not really have an opinion on that.  
> 
> [BB2] You are seeing little discussion now, probably because you are raising stuff:
> * 17 years after all the discussion on RFC7141 started and 9 years after it was published
> * 12 years after all the discussion started on ecn-encap, and 4 years after the 2nd WGLC closed

	[SM] These are rather procedural arguments that do not seem to care about the actual content of the drafts... I care little when an RFC/draft was written/ratified, if it is incorrect or simply does not match very well any more how the internet developed it should be open to correction/change.


> 
>> 
>> 
>> 
>>> Having digested my arguments above, I hope you might join them.
>>> 
>> 	[SM] Not really, as I said, let's not second guess the AQM and come up with something simple enough to describe in 1-2 sentences so an implementer will do the right thing. 
>> 
>> 
>> 
>>> And octet preserving is one of the few ways to preserve proportion without delaying the signal. Other ways that measure proportion (e.g. by counting the packets between marks, or the way Koen suggested) all delay changes to the signal.
>>> 
>>> 
>>>>> This draft is not a protocol spec. It's design guidelines for adding congestion notification to a L2 protocol in the future. The question of whether there is an implementation can only be relevant at the time a spec is written for a specific protocol.
>>>>> 
>>>>> 
>>>> 	[SM] I disagree, any RFC should (IMHO, apparently this is not accepted commonly) ONLY recommend methods that are known to work, and the easiest way to demonstrate that is to implement it and show data. Again, the TSV wg does not require this, but I consider that to be the wrong approach.
>>>> 
>>> [BB] No-one can write an implementation of how an abstract unknown L2 protocol propagates ECN.
>>> 
>> 	[SM] This is why I would be satisfied to see an implementation for a known L2 protocol/framer... 
>> 
>> 
>> 
>>> The two approaches I'm aware of for propagating ECN between the layers (TRILL and for MPLS) are very different, because they are tailored to their protocols. When a requirement to propagate ECN with disjoint packet boundaries needs to be implemented, it will again be tailored to the protocol involved. 
>>> 
>> 	[SM] At which point we might do best not to mention any method at all...
>> 
>> 
>>>>> The question of which goal to aim for and which implementation to use will be decided when a specific protocol is designed. It's possible that the disagreements have been due to a difference in assumptions between the two 'camps'. Which assumptions are appropriate might become clearer when a specific protocol is on the table.
>>>>> 
>>>>> 
>>>> 	[SM] No, recommending an untested approach is not what an IETF document should do, at the very least it should clearly mark untested ideas as such, but that is not what the current draft does.
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> Hence, I believe the chairs are asking me to post the proposed wording because it's not relevant whether you or I or anyone else wants one or the other of the examples in the draft. You don't agree with goal 2. I don't agree with goal 1.
>>>>> 
>>>>> 
>>>> 	[SM] And there we have it, this whole thing is based on your personal dislike... is an RFC/BCP really the best place to voice personal opinions?
>>>> 
>>> [BB] Please don't twist my words.
>>> 
>> 	[SM] Twisting of words was not intended, this was a rephrasing of "I don't agree with goal 1".
>> 
>> 
>> 
>>>> Please show data that method 2 works better than method 1 (heck, show data that method 2 works at all) and we are talking. As far as I can tell, correct me if I am wrong, method one is actually implemented? If however neither method is implemented right now, the draft should clearly say that both are speculative and untested.
>>>> 
>>> [BB] As above, design goals aren't implementable without a specific protocol. So criticism of lack of implementation is just pointless negativity and applies to all methods anyway.
>>> 
>> 	[SM] Again, I am asking for one example implementation that shows the practical feasibility of goal 2, having to actually implement something tends to highlight areas of underspecification pretty quickly and after having one (tested and) working implementation the issues encountered during the implementation can help improve the recommendation.
>> 
>> 
>> 
>>> 
>>>>> But I have written both into the draft anyway, so the options are recorded, but the decision is essentially deferred until a specific protocol is written.
>>>>> 
>>>>> 
>>>> 	[SM] Well, how about ripping out both then? That clearly leaves even more freedom to implementers, without giving them wrong ideas.
> 
> [BB2] You may have noticed I prefer to be constructive. Because protocol designers need guidance on this point, as illustrated by the two recent RFCs that bypassed the issue.

	[SM] I consider saying nothing if we have no clear recommendation to make to be a better approach... 

> 
> If you now agree that Goal1 is incorrect, perhaps you could help persuade its proponents that it is incorrect and have it removed.

	[SM] Bob, I await the data with which you conclusively demonstrate that method 2 truly achieves better performance than the alternatives*. Once I have scrutinized that data, I am open to change my opinion, but until then I consider recommending method 2 as counter-productive. This "remove all" was intended in the spirit of a compromise, but I see you did not take it that way.


Regards
	Sebastian


*) Me observing that 1a and 1b are not flawless is not the same as me changing my opinion to method 2 being the bee's knees, and I am pretty convinced you know that.


> 
> 
> 
> 
> Bob
> 
> 
>>>> 
>>> [BB] No more changes. The draft is moving forward. 
>>> This email was just to pick up on points where I could see misconceptions.
>>> 
>> 	[SM] As I said, this is pretty much the outcome I expected.
>> 
>> Regards
>> 	Sebastian
>> 
>> 
>> 
>>> 
>>> Bob
>>> 
>>> 
>>>>> Bob
>>>>> 
>>>>> On 23/08/2023 13:39, Sebastian Moeller wrote:
>>>>> 
>>>>> 
>>>>>> Dear List,
>>>>>> 
>>>>>> this is not going as it should. We are still promoting a method that was essentially proposed in 2014* and has since apparently never been implemented/properly tested. If anybody has evidence of an actual implementation and data showing this implementation actually working, please come forward.
>>>>>> 
>>>>>> 
>>>>>> *) 2014 RFC7141 was ratified, the actual idea/method is probably older given that what resulted in rfc7141 was started in 2007.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Aug 23, 2023, at 10:21, Gorry Fairhurst <gorry@erg.abdn.ac.uk>
>>>>>>> 
>>>>>>>  wrote:
>>>>>>> 
>>>>>>> As promised at the meeting in San Francisco, we will be progressing with draft-ietf-tsvwg-ecn-encap-guidelines.
>>>>>>> 
>>>>>>> This (and its related dependency ID) have been on the Chair's action list since completion of WGLC quite some time ago.  At the last IETF meeting, the Chairs worked with the document editor to revise the text around the two possible design goals. The changes that we expect are summarised below and we are now expecting a new revision of this draft. This will allow us to complete a Shepherd writeup for these drafts.
>>>>>>> 
>>>>>>> Gorry
>>>>>>> (TSVWG Co-Chair)
>>>>>>> 
>>>>>>> ----
>>>>>>> BEFORE
>>>>>>> 
>>>>>>> Two possible design goals for propagating congestion indications, described in section 5.3 of [RFC3168] and section 2.4 of [RFC7141], are:
>>>>>>> 	• approximate preservation of the presence of congestion marks on the L2 frames used to construct an IP packet;
>>>>>>> 	• approximate preservation of the proportion of congestion marks arriving and departing.
>>>>>>> 
>>>>>>> In either case, an implementation SHOULD ensure that any new incoming congestion indication is propagated immediately, not held awaiting the possibility of further congestion indications to be sufficient to indicate congestion on an outgoing PDU [RFC7141]. Nonetheless, to facilitate pipelined implementation, it would be acceptable for congestion marks to propagate to a slightly later IP packet.
>>>>>>> 
>>>>>>> Concrete example implementations of goal #1 include (but are not limited to):
>>>>>>> 	• Every IP PDU that is constructed, in whole or in part, from an L2 frame that is marked with a congestion signal, has that signal propagated to it;
>>>>>>> 	• Every L2 frame that is marked with a congestion signal, propagates that signal to one IP PDU which is constructed, in whole or in part, from it. If multiple IP PDUs meet this description, the choice can be made arbitrarily but ought to be consistent.
>>>>>>> 
>>>>>>> Concrete example implementations of goal #2 include (but are not limited to):
>>>>>>> 	• A counter ('in') tracks octets arriving within the payload of marked L2 frames and another ('out') tracks octets departing in marked IP packets. While 'in' exceeds 'out', forwarded IP packets are ECN-marked. If 'out' exceeds 'in' for longer than a timeout, both counters are zeroed, to ensure that the start of the next congestion episode propagates immediately;
>>>>>>> 
>>>>>>> AFTER
>>>>>>> 
>>>>>>> Two possible design goals for propagating congestion indications, described in section 5.3 of [RFC3168] and section 2.4 of [RFC7141], are:
>>>>>>> 	• approximate preservation of the presence (and therefore timing) of congestion marks on the L2 frames used to construct an IP packet;
>>>>>>> 	• a) at high frequency of congestion marking, approximate preservation of the proportion of congestion marks arriving and departing;
>>>>>>> b) at low frequency of congestion marking, approximate preservation of the timing of congestion marks arriving and departing;
>>>>>>> 
>>>>>>> In either case, an implementation SHOULD ensure that any new incoming congestion indication is propagated immediately, not held awaiting the possibility of further congestion indications to be sufficient to indicate congestion on an outgoing PDU [RFC7141]. Nonetheless, to facilitate pipelined implementation, it would be acceptable for congestion marks to propagate to a slightly later IP packet.
>>>>>>> 
>>>>>>> 
>>>>>> 	[SM] 1 and 2.b already mention conservation of timing, which is essentially a direct consequence of the next paragraph "In either case, an implementation SHOULD ensure that any new incoming congestion indication is propagated immediately". Either this does not hold for 2.a) (which should be noted somewhere) or the addition of timing in 1.) and 2.b) seems redundant.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> At decapsulation in either case:
>>>>>>> 	• ECN marking propagation logically occurs before application of rule 1 in Section 4.4.  For instance, if ECN marking propagation would cause an ECN congestion indication to be applied to an IP packet that is a Not-ECN-PDU, then that IP packet is dropped in accordance with rule 1.
>>>>>>> 	• if a mix of frames with different types of ECN capability arrives to construct the same IP packet, that packet MUST be discarded. This requirement uses the generalization 'types of ECN capability', because the L2 ECN protocol might not map exactly to the three types in IP, which are Not-ECN-capable, ECT(0) and ECT(1) [RFC8311].
>>>>>>> 
>>>>>>> The following gives one way that goal #1 might be achieved, but it is not intended to be the only way:
>>>>>>> 	• Every IP PDU that is constructed, in whole or in part, from an L2 frame that is marked with a congestion signal, has that signal propagated to it;
>>>>>>> 	• Every L2 frame that is marked with a congestion signal, propagates that signal to one IP PDU which is constructed, in whole or in part, from it. If multiple IP PDUs meet this description, the choice can be made arbitrarily but ought to be consistent.
>>>>>>> 
>>>>>>> 
>>>>>> 	[SM] I am confused:
>>>>>> The first clause says: every IP PDU "inherits" the mark from a L2 frame. Which to me means all IP PDUs (even only partially) constructed from a marked L2 frame will inherit the mark.
>>>>>> The second clause says that mark is propagated to only one (consistently selected) IP PDU.
>>>>>> These appear to describe two mutually incompatible ways to achieve goal 1, not well described as "The following gives one way", no? So what am I missing here?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> The following gives one way that goal #2 might be achieved, but it is not intended to be the only way:
>>>>>>> 	• For each of the streams of frames encapsulating IP packets
>>>>>>> 
>>>>>>> 
>>>>>> 	[SM] "for each of the streams of the frames" so this now allows for multiple different frame types to arrive in the IP-decapsulator?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> of each IP-ECN codepoint,
>>>>>>> 
>>>>>>> 
>>>>>> 	[SM] There are arguably 4 IP-ECN codepoints, is this supposed to result in 4 counters? I had thought that we really only care about propagating L2 congestion events to ECN-CE marks here?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> a counter ('in') tracks octets arriving within the payload of marked L2 frames and another ('out') tracks octets departing in marked IP packets.
>>>>>>> While 'in' exceeds 'out', forwarded IP packets are ECN-marked. If 'out' exceeds 'in' for longer than a timeout,
>>>>>>> 
>>>>>>> 
>>>>>> 	[SM] In this condition we dequeued more "CE-bits" than we enqueued, if the next marked L2 frame results in less bits than out-in we will not immediately mark this and essentially "swallow" that mark. This now means that this "a timeout" will need to be pretty short to still obey the "SHOULD ensure that any new incoming congestion indication is propagated immediately" rule. So this timeout needs to be equivalent to not more than "slightly later"?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> both counters are zeroed, to ensure that the start of the next congestion episode propagates immediately. The 'out' counter includes octets in reconstructed IP packets that would have been marked, but had to be dropped because they were Not-ECN-PDUs (by rule 1 in Section 4.4).
>>>>>>> 
>>>>>>> 
>>>>>> 	[SM] What about packets that would be marked but were dropped for other reasons (e.g. queue full)? Such dropped packets will also send a "slow down" signal to the end-points so why still follow this up with more marking?
>>>>>> 
>>>>>> 
>>>>>> I really would like to see a working implementation of that method before putting it in a RFC/BCP*... yes this AFTER version is better than the BEFORE version, but I still think it would be prudent to drop this still speculative discussion of "method 2". Yes, this was essentially proposed in 2014 in rfc7141, but the apparent lack of implementations indicates lack of interest in the field.
>>>>>> 
>>>>>> Regards
>>>>>> 	Sebastian
>>>>>> 
>>>>>> *) Which is not a requirement in tsvwg, I just think it should be.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> -- 
>>>>> ________________________________________________________________
>>>>> Bob Briscoe                               
>>>>> 
>>>>> http://bobbriscoe.net/
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>> -- 
>>> ________________________________________________________________
>>> Bob Briscoe                               
>>> 
>>> http://bobbriscoe.net/
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               
> http://bobbriscoe.net/