Re: [tsvwg] Progress with draft-ietf-tsvwg-ecn-encap-guidelines

Sebastian Moeller <moeller0@gmx.de> Wed, 13 September 2023 10:48 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <52b12bcc-1aa7-9069-2b21-aeb00e9e39db@bobbriscoe.net>
Date: Wed, 13 Sep 2023 12:47:47 +0200
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Message-Id: <646319E7-78D3-4BBF-9EC7-F069CE7124BA@gmx.de>
References: <23c00fae-e6a6-072c-0513-1c0d5c637c17@bobbriscoe.net> <3d442824-722f-90af-8d04-916b29bafca4@erg.abdn.ac.uk> <D2E7D6FA-C39D-44B4-BC27-8897CE24145C@gmx.de> <bdc9685f-77b1-50f6-63b7-8b167d850148@bobbriscoe.net> <C4D0E327-32E9-42E8-850F-DFF579612CD0@gmx.de> <52b12bcc-1aa7-9069-2b21-aeb00e9e39db@bobbriscoe.net>
To: Bob Briscoe <in@bobbriscoe.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/A0D7qjFGKRNfrargKvfPAGpT0ac>
Subject: Re: [tsvwg] Progress with draft-ietf-tsvwg-ecn-encap-guidelines

Hi Bob,


> On Sep 12, 2023, at 17:19, Bob Briscoe <in@bobbriscoe.net> wrote:
> 
> Sebastian,
> 
> The draft is now going forward.

	[SM] As expected (and announced by the chairs). Had I not mentioned that method 2 was incorrect/incomplete in the past we would have ratified this long ago, in spite of it being incorrect. This does not fill me with confidence about our process.


> But I will still respond...
> 
> On 08/09/2023 08:23, Sebastian Moeller wrote:
>> Bob,
>> 
>> 
>> 
>>> On Sep 7, 2023, at 16:50, Bob Briscoe <in@bobbriscoe.net>
>>>  wrote:
>>> 
>>> Sebastian,
>>> 
>>> Due to the impasse between the views of two 'camps', some time ago the chairs asked me (as editor of the draft) to write both goals in the draft without stating a preference for either. And two descriptions of example ways they might be implemented.
>>> 
>> 	[SM] Yes, based on the observation that the same method was already sketched out in an earlier BCP. However, I question the validity of doing so given that:
>> 
> 
> [BB] I've addressed your point (c) first, because it is the root of the misunderstanding.
> 
>> c) it flies in the face of how a flow should IMHO operate, it should try to generate the best possible estimate of the real state of congestion along the path and then react appropriately. And since IMHO no AQM marks based on packet-size (and hence does not effectively mark individual octets), it makes no sense to propagate ECN marks in an octet preserving fashion, as that does add noise to the data, making it harder to get a veridical estimate of what happened. 
> 
> [BB] Just because the word 'octet' appears in the technique, doesn't make its marking size-dependent.

	[SM] Yes, that is the point: rfc7141 recommends marking size-independently but interpreting the marking size-dependently. The whole rationale behind your method 2 hence must be to make this size-dependent interpretation possible by conserving it in spite of the re-framing.


> On the contrary, after decap, preserving marked octets preserves size-independence better than preserving the presence of marks. I'll try to explain that at the end of this response (in A3). 

	[SM] See, the real challenge here is to figure out post hoc how the marking entity would have decided had it seen the IP packets individually... As this does not seem really achievable, I see the best approach as coming up with something that is simple and still covers the gist of the congestion signaling.


> 
> Before that, my first assertion might have raised a question in your mind:
>     Q1. In the draft, why do I say goal 1 is to preserve proportion, but the example preserves octets? Especially given that using octets seems to cause controversy.

	[SM] Why is it confused to assume that this method was motivated by rfc7141's recommendation to take the size of marked packets into account when "interpreting" congestion marks? Rfc7141 mainly talks about proportionality to packet size, not proportionality to the proportion of marked bytes. (With the possible exception of Appendix B.1: 
  "Packet-mode drop actually gives flows sufficient information to
   measure their loss rate in bits per second, if they choose, not just
   packets per second.  Each flow can count the size of a lost or marked
   packet and scale its rate response in proportion (as TFRC-SP does).")

If the justification for adding method 2 is truly rfc7141's precedent, then I would expect method 2 actually to be a consequence of rfc7141, which strictly does not seem to be the case.

Side-note: looking at TFRC-SP I get:
In TFRC-SP, the loss event rate is calculated by counting at most one
   loss event in loss intervals longer than two round-trip times, and by
   counting each packet lost or marked in shorter loss intervals.

This implies that TFRC-SP does indeed not look at the size of marked/lost packets when registering marks... so not sure the TFRC-SP reference here is useful.




> 
> That might lead to a second question, which I'll also answer below:
>     Q2. Why preserve marking proportion anyway? Is it "the best possible estimate of the real state of congestion along the path"?
> 
> A1. Why preserving octets preserves proportion
> 
> 
> Imagine a stream of packets with their headers all run back-to-back as a stream of octets then cut up into frame payloads at L2. 
>     For instance, consider scenario a) at: https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing.svg (which excludes frame headers)
> 
> Now, take any window of data, e.g. the first 3 frame payloads (top left). 1/3 are marked pink.
> Now count the packets encapsulated inside those 3 frames. There are about 15. So proportion preserving wants 1/3 of 15 = 5 to be marked. The marked octet preserving technique in the Goal2 row marks 6 packets, which is near enough, given it deliberately rounds up.

	[SM]  Looking at your example, I immediately note that you did NOT depict the variant of achieving goal 1 that marks all packets somehow "touched" by the marked L2 frame (aka method 1a)... This will result in something in between goal 1b and goal 2, with considerably less complexity than goal 2 (with its requirement for a timeout and for accounting for dropped frames/packets). My point is that goal 2 seems pretty complex, with very little data to show that it is superior to methods 1a and 1b, both of which are considerably simpler in scope and implementation.
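
For concreteness, the two Goal 1 variants (1a and 1b) can be sketched as follows. This is only an illustration under stated assumptions: packets and frame payloads are modelled as half-open octet ranges in one contiguous stream, the function names mark_1a and mark_1b are hypothetical, and for 1b the "consistent" choice is assumed to be the first packet touched by the marked frame. Nothing beyond the two bullets is taken from the draft.

```python
from typing import List, Set, Tuple

Range = Tuple[int, int]  # half-open octet range [start, end) in the stream


def overlaps(a: Range, b: Range) -> bool:
    """True if the two half-open ranges share at least one octet."""
    return a[0] < b[1] and b[0] < a[1]


def mark_1a(packets: List[Range], frames: List[Range], marked: Set[int]) -> Set[int]:
    """Method 1a: mark every packet built, in whole or in part, from a marked frame."""
    return {
        p
        for p, pkt in enumerate(packets)
        for f in marked
        if overlaps(pkt, frames[f])
    }


def mark_1b(packets: List[Range], frames: List[Range], marked: Set[int]) -> Set[int]:
    """Method 1b: each marked frame marks one packet (assumed: the first it touches)."""
    out: Set[int] = set()
    for f in sorted(marked):
        for p, pkt in enumerate(packets):
            if overlaps(pkt, frames[f]):
                out.add(p)
                break
    return out


# Three 500-octet frame payloads carrying twelve 120-octet packets.
# Boundaries are deliberately misaligned; the final 60 octets stay unused.
frames = [(i * 500, (i + 1) * 500) for i in range(3)]
packets = [(i * 120, (i + 1) * 120) for i in range(12)]

print(sorted(mark_1a(packets, frames, {1})))  # [4, 5, 6, 7, 8]
print(sorted(mark_1b(packets, frames, {1})))  # [4]
```

In this toy layout one marked frame out of three marks 5 of 12 packets under 1a and 1 of 12 under 1b; neither variant needs any state beyond the frame currently being decapsulated.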


> 
> Proportion is preserved because, when marking is approximately independent of frame and packet size:
>     marked frames / total frames ~= marked octets before decap / total octets
>     marked packets / total packets ~= marked octets after decap / total octets
> In both cases frame headers are excluded, but packet headers are included, which makes total octets the same.

	[SM] This is fine; the question is still whether this complexity is actually warranted.


> By ensuring marked octets are preserved before and after decap, both the ratios on the right will be identical. This makes both the marking ratios on the left (frame and packet) approximately the same.
> 
> A2. Why preserve proportion?
> 
> For the same traffic and link scenario, irrespective of whether the decap preserves presence (Goal1) or proportion (Goal2), congestion control algorithms (CCAs) and the AQM will adjust so that the sum of the flows still fits into the link. 
> 
> The only reason to preserve marking proportion is to avoid the proportion of marks being shifted too far outside its normal operating range. I.e. not so high that it more often saturates at 100% and not so low that the AQM has to emit marks so far apart that control becomes very slack or jumpy.

	[SM] Then please show that methods 1a and/or 1b actually affect the proportion of marked bytes sufficiently strongly to make this more than a theoretical musing. 


> For example, let's consider scenario a) in https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing.svg again. With the Goal1 decap (2nd row assuming the second implementation bullet from the draft), even though there are about 3 packets in each marked frame, only one packet gets marked each time. 

	[SM] Yes, that is a direct consequence of the "propagate the number of congestion marks" policy that is at the heart of method 1b.


> Assuming the IP packets in a frame will often belong to separate flows, this means fewer flows see each L2 mark. So the system will adjust to increase the L2 marks, until enough flows respond. But the AQM cannot mark more than 100% of the frames, so it cannot mark more than about 1 in 3 of the packets.

	[SM] Yes, and that is why we have method 1a, which certainly will hit all flows in a marked frame...


> Conversely, in scenario b) with the Goal1 decap, if the AQM marks more than about 1 in 3 of the smaller frames, 100% of the packets will be marked. So, taking the system as a whole, the AQM will mark fewer frames before the CCAs slow down enough. This gives the AQM a smaller operating range and it is likely to make the system more jumpy, and less controllable.

	[SM] As I said before, here we are trying to second-guess the AQM. We do NOT know how that AQM would have marked had it seen our actual IP packets, so it seems futile to figure out which of our approximations is "best"; the goal should be "good enough" and as simple as possible.


> 
> A3. Preserving size-independent marking
> 
> It might help to visualize the following using https://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-encap-reframing.svg
> 
> If L2 frame boundaries are completely independent of L3 packet boundaries:
> 	• If the packet that includes the start of each marked frame is marked (as in each Goal1 row), packet marking will become size-dependent.
> 		• Reason: the start of each frame is equivalent to a point picked at random in the stream of packets.
> 		• any point picked at random within the stream is more likely to fall within a larger packet
> 	• With the Goal2 approach, the start of a congestion episode will tend to fall within a larger packet by the same reason as for Goal1 above
> 		• further packets in the congestion episode are marked depending on how much octet marking is left over from previous marking
> 			• so subsequent packets are marked whatever their size
> 		• however, the last packet to be marked in an episode will also tend to be a larger packet, 
> 			• because the last octet in a frame-marking episode is also a random point in the packet stream, like the start.

	[SM] As I said, this assumes a specific way the AQM would mark the IP packets in question (say if variable size L2 frames would be used each only containing an individual IP packet).
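
The size bias in the quoted argument can be made concrete with a little arithmetic. As a sketch, assuming a purely hypothetical traffic mix (the packet sizes and counts below are illustrative and not from this thread): the chance that a uniformly random octet position lands in a packet of a given size is that size's share of the octets, not its share of the packets.

```python
from fractions import Fraction
from typing import Dict


def hit_probability(mix: Dict[int, int]) -> Dict[int, Fraction]:
    """Probability that a uniformly random octet lies in a packet of each size.

    `mix` maps packet size in octets to the number of packets of that size.
    """
    total_octets = sum(size * count for size, count in mix.items())
    return {size: Fraction(size * count, total_octets) for size, count in mix.items()}


# Hypothetical mix: equal numbers of 1500-octet and 100-octet packets.
mix = {1500: 10, 100: 10}
for size, prob in hit_probability(mix).items():
    print(size, prob)  # 1500 15/16, then 100 1/16
```

Half of the packets are large here, yet a frame boundary would land in a large packet 15 times out of 16. This only quantifies the geometric effect; it says nothing about how the AQM would have marked the individual IP packets.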


> If the traffic stream contains idle periods, of course, the first L2 frame and L3 packet boundaries after each idle will coincide.
> 
>> a) in the years since that BCP was ratified no known implementation of that method has seen the light of day (and hence no real data exists about it working as intended)
> 
> [BB] Before responding to that, with hindsight, I would make the following picky disagreement with one of my own sentences [in §2.4 of RFC7141]:
>    "This [octet preserving when splitting or merging packets] is based on the principle used above;
>    that an indication of congestion on a packet can be considered as an
>    indication of congestion on each octet of the packet."
> 
> 
> In the context of splitting or merging packets at decap, I would say today that octet preserving is an implementation technique not a principle. At decap, the appropriate principle is to preserve the proportion of marking (which happens to also preserve octets at any function where total octets are preserved - by the reasoning given earlier). 

	[SM] Yes, you would need to do that, given that the justification for method 2 is not really obvious in RFC7141... which IMHO is a problem in itself, as it is odd to claim precedent from an RFC if that RFC actually says something different (however, rfc7141 still wants octet preservation and method 2 will also deliver that).


> [BB] Now to your point,... 
> I'm not sure there have been many, if any, implementations of splitting or merging packets while propagating ECN since RFC7141. And I'm not sure how you know there haven't been any that follow §2.4 of BCP7141. (I admit though that, if you were omniscient, I wouldn't know that you were, because I'm not.)

	[SM] Oh, I asked here on the list and got back no response, which I think conservatively needs to be interpreted as non-existence for the scope of this discussion, the onus to show differently is on the proponents of method 2 IMHO.


> 
> In two cases that I am aware of (both protocol specs, rather than implementations), ECN has been disabled over an encapsulating tunnel in order to avoid having to propagate ECN at decap:
> 
> https://datatracker.ietf.org/doc/html/rfc9347#section-3.1
> https://datatracker.ietf.org/doc/html/draft-ietf-masque-connect-ip-13#section-10.2

	[SM] Which can be counted against all methods, and to be honest if there is no data showing methods 1a and 1b being used, I would propose to rip out that whole section as well, not only method 2.

> 
> 
>> b) you yourself got involved in specifying a protocol (TCP Prague) that also does not seem to follow that method
> 
> [BB] When we were first using Linux DCTCP for L4S, it used acked_bytes to maintain the fraction of ce-marked bytes, which did follow the principle of treating all the octets in a marked packet as marked. See:
> https://elixir.bootlin.com/linux/v3.18.9/source/net/ipv4/tcp_dctcp.c#L186
> 
> But by the time Prague was forked off from DCTCP, someone had changed DCTCP to count marked packets, probably for efficiency because the relevant variables (delivered_ce and delivered) were already maintained by the kernel. But none of us noticed until a while later. We should now make the change back to bytes, but haven't got around to arguing it through with everyone yet.

	[SM] Well, come back after you made that change? It seems rather odd that you are willing to stall a draft for almost a year to fight for a method that you did not bother to actually use (consistently) in your own protocol.


> 
> As you saw in my response to the review from Neal, it doesn't much matter when only a few ECN-capable packets are smaller than the SMSS, because the proportions of marked packets and marked octets aren't often that different. But once ACKs are ECN-capable, it becomes important to get this right, particularly in connections with significant 2-way data flow.

	[SM] I would respectfully argue that in connections with significant 2-way data flow the ACK will often be piggybacked onto data packets, and hence will not be all that small.
And then I ask: if we talk about pure reverse ACK flows, what differential response to a marked ACK do you consider appropriate (keeping in mind that the AQM was supposed to mark packet-based, hence marking probability should be decoupled from packet size)?
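
To illustrate the difference between the two accounting styles under discussion (counting CE-marked packets versus CE-marked octets), here is a minimal sketch. It is not the Linux DCTCP or Prague code; marked_fractions and the sample traffic are hypothetical and only show how far the two ratios can diverge once small packets are ECN-capable.

```python
from typing import Iterable, Tuple


def marked_fractions(pkts: Iterable[Tuple[int, bool]]) -> Tuple[float, float]:
    """Return (fraction of packets CE-marked, fraction of octets CE-marked).

    Each item is (size_in_octets, ce_marked).
    """
    items = list(pkts)
    n_marked = sum(1 for _, ce in items if ce)
    octets_marked = sum(size for size, ce in items if ce)
    octets_total = sum(size for size, _ in items)
    return n_marked / len(items), octets_marked / octets_total


# Hypothetical sample: 8 full-size segments (1 marked) and 8 pure ACKs (3 marked).
sample = [(1500, i == 0) for i in range(8)] + [(60, i < 3) for i in range(8)]
by_packets, by_octets = marked_fractions(sample)
print(f"by packets: {by_packets:.3f}, by octets: {by_octets:.3f}")
# by packets: 0.250, by octets: 0.135
```

With only full-size packets the two fractions coincide, so the choice matters only for mixes like the one assumed above.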


> 
>> d) I could poke severe holes into the described method (that seem fixed now), which seems a clear indication that this method was truly never more than a sketch.
> 
> [BB] Indeed, it /was/ meant to be a sketch to give implementers an idea of how they might implement the design goal.

	[SM] And that is exactly what I think should be avoided for RFCs/BCPs unless said sketch is based on real data.



> At the time you pointed this out, I thought it was obvious that an implementer would not mix non-ECN and ECN codepoints together in the same frames, but I was happy to say that explicitly when you asked.

	[SM] "I thought it was obvious" is not the most robust and reliable approach to write recommendations. 


> Both the examples for how to achieve Goal1 suffer from the same problem.

	[SM] How so? The issue is that method 2 needs special accounting for dropped packets, as otherwise the counters go out of sync.
Let's look at method 1:
1a) Mark all IP packets related to a marked L2 frame:
This will not see dropped IP packets, but that does not matter, as a drop in itself is a (slightly ambiguous) congestion signal. Also, all packets that can carry a mark will be marked, so this method is IMHO robust against that issue.

1b) propagate a single mark only to one IP packet out of the set related to a L2 frame:
So what happens here is that we end up potentially sending more congestion signals per frame and hence slightly violate the method. If a single IP packet was dropped one could argue the precise way forward would be to only propagate a mark to an eligible packet if no packet was dropped; but given that a single marked frame could contain multiple Not-ECT IP packets (that need to be dropped) this is unfixable. It also will not matter all that much, since no end2end protocol will depend on the mark propagation following this method strictly.


But since you say both goal 1 examples have the same issue, please elaborate on how this affects 1a, and how it affects 1b in a relevant way. As far as I can see, it is only the elaborate dual counter method that requires special care in this regard.
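
To show where that special care comes in, here is a minimal sketch of the dual-counter bookkeeping for a single stream of frames. It follows the draft's wording only loosely: the class name, method names, return strings and the way the timeout is tracked are all assumptions made for illustration, and packets dropped for other reasons (e.g. a full queue) are not modelled at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OctetPreservingDecap:
    """Illustrative dual-counter decapsulator state for one stream of frames."""

    timeout: float                          # assumed: seconds 'out' may exceed 'in'
    octets_in: int = 0                      # 'in': payload octets from marked frames
    octets_out: int = 0                     # 'out': octets in marked (or dropped) packets
    surplus_since: Optional[float] = None   # when 'out' first exceeded 'in'

    def frame_arrived(self, payload_octets: int, marked: bool, now: float) -> None:
        self._maybe_reset(now)
        if marked:
            self.octets_in += payload_octets

    def forward_packet(self, size: int, ect: bool, now: float) -> str:
        """Return 'forward', 'mark' or 'drop' for a reconstructed IP packet."""
        self._maybe_reset(now)
        if self.octets_in <= self.octets_out:
            return "forward"
        # The packet is due a mark, so its octets count towards 'out' even if
        # it is Not-ECT and has to be dropped instead of marked.
        self.octets_out += size
        if self.octets_out > self.octets_in and self.surplus_since is None:
            self.surplus_since = now
        return "mark" if ect else "drop"

    def _maybe_reset(self, now: float) -> None:
        if self.octets_out <= self.octets_in:
            self.surplus_since = None
        elif self.surplus_since is not None and now - self.surplus_since > self.timeout:
            # 'out' has exceeded 'in' for longer than the timeout: zero both.
            self.octets_in = self.octets_out = 0
            self.surplus_since = None


d = OctetPreservingDecap(timeout=0.1)
d.frame_arrived(500, marked=True, now=0.0)
# Six 120-octet packets; the second one is Not-ECT.
print([d.forward_packet(120, ect=(i != 1), now=0.0) for i in range(6)])
# ['mark', 'drop', 'mark', 'mark', 'mark', 'forward']
```

If the dropped Not-ECT packet were not added to 'out', one more packet would be marked than in this run. Note also that 'out' ends 100 octets ahead of 'in' here, so in this sketch a small marked frame arriving before the timeout expires would not trigger any mark.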





> But I didn't touch them 'cos I had been asked to include that wording verbatim (and like I said, I think it's obvious that one doesn't mix ECN codepoints within the same frame).

	[SM] "it's obvious that one doesn't mix ECN codepoints within the same frame": as a policy, this will either introduce re-ordering at the en-framer, or require variable-sized frames (at which point one could do 1:1 frame-to-IP-packet framing, making things moot again), or partially empty frames (wasting utilisation); it will also make the en-framer (and de-framer) considerably more complex, e.g. by requiring multiple queues...


>> I understand that BCPs, in spite of what their name implies, are intended to inject some policy, but if such an injection attempt proves to be a complete dud, as here, I argue it is time to drop it.
>> 
> 
> [BB] As above, the principle of preserving proportion is held by most people.

	[SM] I accept that this is your assumption. Given the number of people who participated in this discussion, I am more inclined to believe most folks do not really have an opinion on that.


> Having digested my arguments above, I hope you might join them.

	[SM] Not really, as I said, let's not second guess the AQM and come up with something simple enough to describe in 1-2 sentences so an implementer will do the right thing. 


> And octet preserving is one of the few ways to preserve proportion without delaying the signal. Other ways that measure proportion (e.g. by counting the packets between marks, or the way Koen suggested) all delay changes to the signal.
> 
>>> This draft is not a protocol spec. It's design guidelines for adding congestion notification to a L2 protocol in the future. The question of whether there is an implementation can only be relevant at the time a spec is written for a specific protocol.
>>> 
>> 	[SM] I disagree, any RFC should (IMHO, apparently this is not commonly accepted) ONLY recommend methods that are known to work, and the easiest way to demonstrate that is to implement it and show data. Again, the TSV wg does not require this, but I consider that to be the wrong approach.
> 
> [BB] No-one can write an implementation of how an abstract unknown L2 protocol propagates ECN.

	[SM] This is why I would be satisfied to see an implementation for a known L2 protocol/framer... 


> The two approaches I'm aware of for propagating ECN between the layers (TRILL and for MPLS) are very different, because they are tailored to their protocols. When a requirement to propagate ECN with disjoint packet boundaries needs to be implemented, it will again be tailored to the protocol involved. 

	[SM] At which point we might do best not to mention any method at all...

> 
>> 
>>> The question of which goal to aim for and which implementation to use will be decided when a specific protocol is designed. It's possible that the disagreements have been due to a difference in assumptions between the two 'camps'. Which assumptions are appropriate might become clearer when a specific protocol is on the table.
>>> 
>> 	[SM] No, recommending an untested approach is not what an IETF document should do, at the very least it should clearly mark untested ideas as such, but that is not what the current draft does.
>> 
>> 
>> 
>>> Hence, I believe the chairs are asking me to post the proposed wording because it's not relevant whether you or I or anyone else wants one or the other of the examples in the draft. You don't agree with goal 2. I don't agree with goal 1.
>>> 
>> 	[SM] And there we have it, this whole thing is based on your personal dislike... is an RFC/BCP really the best place to voice personal opinions?
> 
> [BB] Please don't twist my words.

	[SM] Twisting of words was not intended, this was a rephrasing of "I don't agree with goal 1".


> 
>> Please show data that method 2 works better than method 1 (heck, show data that method 2 works at all) and we are talking. As far as I can tell, correct me if I am wrong, method one is actually implemented? If however neither method is implemented right now, the draft should clearly say that both are speculative and untested.
> 
> [BB] As above, design goals aren't implementable without a specific protocol. So criticism of lack of implementation is just pointless negativity and applies to all methods anyway.

	[SM] Again, I am asking for one example implementation that shows the practical feasibility of goal 2, having to actually implement something tends to highlight areas of underspecification pretty quickly and after having one (tested and) working implementation the issues encountered during the implementation can help improve the recommendation.


> 
> 
>>> But I have written both into the draft anyway, so the options are recorded, but the decision is essentially deferred until a specific protocol is written.
>>> 
>> 	[SM] Well, how about ripping out both then? That clearly leaves even more freedom to implementers, without giving them wrong ideas.
>> 
> 
> [BB] No more changes. The draft is moving forward. 
> This email was just to pick up on points where I could see misconceptions.

	[SM] As I said, this is pretty much the outcome I expected.

Regards
	Sebastian


> 
> 
> Bob
> 
>> 
>>> Bob
>>> 
>>> On 23/08/2023 13:39, Sebastian Moeller wrote:
>>> 
>>>> Dear List,
>>>> 
>>>> this is not going as it should. We are still promoting a method that was essentially proposed in 2014* and has since apparently never been implemented/properly tested. If anybody has evidence of an actual implementation and data showing this implementation actually working, please come forward.
>>>> 
>>>> 
>>>> *) RFC7141 was ratified in 2014; the actual idea/method is probably older, given that the work that resulted in rfc7141 was started in 2007.
>>>> 
>>>> 
>>>> 
>>>>> On Aug 23, 2023, at 10:21, Gorry Fairhurst <gorry@erg.abdn.ac.uk>
>>>>>  wrote:
>>>>> 
>>>>> As promised at the meeting in San Francisco, we will be progressing with draft-ietf-tsvwg-ecn-encap-guidelines.
>>>>> 
>>>>> This (and its related dependency ID) have been on the Chair's action list since completion of WGLC quite some time ago.  At the last IETF meeting, the Chairs worked with the document editor to revise the text around the two possible design goals. The changes that we expect are summarised below and we are now expecting a new revision of this draft. This will allow us to complete a Shepherd writeup for these drafts.
>>>>> 
>>>>> Gorry
>>>>> (TSVWG Co-Chair)
>>>>> 
>>>>> ----
>>>>> BEFORE
>>>>> 
>>>>> Two possible design goals for propagating congestion indications, described in section 5.3 of [RFC3168] and section 2.4 of [RFC7141], are:
>>>>> 	• approximate preservation of the presence of congestion marks on the L2 frames used to construct an IP packet;
>>>>> 	• approximate preservation of the proportion of congestion marks arriving and departing.
>>>>> 
>>>>> In either case, an implementation SHOULD ensure that any new incoming congestion indication is propagated immediately, not held awaiting the possibility of further congestion indications to be sufficient to indicate congestion on an outgoing PDU [RFC7141]. Nonetheless, to facilitate pipelined implementation, it would be acceptable for congestion marks to propagate to a slightly later IP packet.
>>>>> 
>>>>> Concrete example implementations of goal #1 include (but are not limited to):
>>>>> 	• Every IP PDU that is constructed, in whole or in part, from an L2 frame that is marked with a congestion signal, has that signal propagated to it;
>>>>> 	• Every L2 frame that is marked with a congestion signal, propagates that signal to one IP PDU which is constructed, in whole or in part, from it. If multiple IP PDUs meet this description, the choice can be made arbitrarily but ought to be consistent.
>>>>> 
>>>>> Concrete example implementations of goal #2 include (but are not limited to):
>>>>> 	• A counter ('in') tracks octets arriving within the payload of marked L2 frames and another ('out') tracks octets departing in marked IP packets. While 'in' exceeds 'out', forwarded IP packets are ECN-marked. If 'out' exceeds 'in' for longer than a timeout, both counters are zeroed, to ensure that the start of the next congestion episode propagates immediately;
>>>>> 
>>>>> AFTER
>>>>> 
>>>>> Two possible design goals for propagating congestion indications, described in section 5.3 of [RFC3168] and section 2.4 of [RFC7141], are:
>>>>> 	• approximate preservation of the presence (and therefore timing) of congestion marks on the L2 frames used to construct an IP packet;
>>>>> 	• a) at high frequency of congestion marking, approximate preservation of the proportion of congestion marks arriving and departing;
>>>>> b) at low frequency of congestion marking, approximate preservation of the timing of congestion marks arriving and departing;
>>>>> 
>>>>> In either case, an implementation SHOULD ensure that any new incoming congestion indication is propagated immediately, not held awaiting the possibility of further congestion indications to be sufficient to indicate congestion on an outgoing PDU [RFC7141]. Nonetheless, to facilitate pipelined implementation, it would be acceptable for congestion marks to propagate to a slightly later IP packet.
>>>>> 
>>>> 	[SM] 1 and 2.b already mention conservation of timing, which is essentially a direct consequence of the next paragraph "In either case, an implementation SHOULD ensure that any new incoming congestion indication is propagated immediately". Either this does not hold for 2.a) (which should be noted somewhere) or the addition of timing in 1.) and 2.b) seems redundant.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> At decapsulation in either case:
>>>>> 	• ECN marking propagation logically occurs before application of rule 1 in Section 4.4.  For instance, if ECN marking propagation would cause an ECN congestion indication to be applied to an IP packet that is a Not-ECN-PDU, then that IP packet is dropped in accordance with rule 1.
>>>>> 	• if a mix of frames with different types of ECN capability arrives to construct the same IP packet, that packet MUST be discarded. This requirement uses the generalization 'types of ECN capability', because the L2 ECN protocol might not map exactly to the three types in IP, which are Not-ECN-capable, ECT(0) and ECT(1) [RFC8311].
>>>>> 
>>>>> The following gives one way that goal #1 might be achieved, but it is not intended to be the only way:
>>>>> 	• Every IP PDU that is constructed, in whole or in part, from an L2 frame that is marked with a congestion signal, has that signal propagated to it;
>>>>> 	• Every L2 frame that is marked with a congestion signal, propagates that signal to one IP PDU which is constructed, in whole or in part, from it. If multiple IP PDUs meet this description, the choice can be made arbitrarily but ought to be consistent.
>>>>> 
>>>> 	[SM] I am confused:
>>>> The first clause says: every IP PDU "inherits" the mark from a L2 frame. Which to me means all IP PDUs (even only partially) constructed from a marked L2 frame will inherit the mark.
>>>> The second clause says that the mark is propagated to only one (consistently selected) IP PDU.
>>>> These appear to describe two mutually incompatible ways to achieve goal 1, not well described as "The following gives one way", no? So what am I missing here?
>>>> 
>>>> 
>>>> 
>>>>> The following gives one way that goal #2 might be achieved, but it is not intended to be the only way:
>>>>> 	• For each of the streams of frames encapsulating IP packets
>>>>> 
>>>> 	[SM] "for each of the streams of the frames" so this now allows for multiple different frame types to arrive in the IP-decapsulator?
>>>> 
>>>> 
>>>> 
>>>>> of each IP-ECN codepoint,
>>>>> 
>>>> 	[SM] There are arguably 4 IP-ECN codepoints, is this supposed to result in 4 counters? I had thought that we really only care about propagating L2 congestion events to ECN-CE marks here?
>>>> 
>>>> 
>>>> 
>>>>> a counter ('in') tracks octets arriving within the payload of marked L2 frames and another ('out') tracks octets departing in marked IP packets.
>>>>> While 'in' exceeds 'out', forwarded IP packets are ECN-marked. If 'out' exceeds 'in' for longer than a timeout,
>>>>> 
>>>> 	[SM] In this condition we dequeued more "CE-bits" than we enqueued, if the next marked L2 frame results in less bits than out-in we will not immediately mark this and essentially "swallow" that mark. This now means that this "a timeout" will need to be pretty short to still obey the "SHOULD ensure that any new incoming congestion indication is propagated immediately" rule. So this timeout needs to be equivalent to not more than "slightly later"?
>>>> 
>>>> 
>>>>> both counters are zeroed, to ensure that the start of the next congestion episode propagates immediately. The 'out' counter includes octets in reconstructed IP packets that would have been marked, but had to be dropped because they were Not-ECN-PDUs (by rule 1 in Section 4.4).
>>>>> 
>>>> 	[SM] What about packets that would be marked but were dropped for other reasons (e.g. queue full)? Such dropped packets will also send a "slow down" signal to the end-points, so why still follow this up with more marking?
>>>> 
>>>> 
>>>> I really would like to see a working implementation of that method before putting it in an RFC/BCP*... yes, this AFTER version is better than the BEFORE version, but I still think it would be prudent to drop this still speculative discussion of "method 2". Yes, this was essentially proposed in 2014 in rfc7141, but the apparent lack of implementations indicates lack of interest in the field.
>>>> 
>>>> Regards
>>>> 	Sebastian
>>>> 
>>>> *) Which is not a requirement in tsvwg, I just think it should be.
>>>> 
>>>> 
>>>> 
>>>> 
>>> -- 
>>> ________________________________________________________________
>>> Bob Briscoe                               
>>> http://bobbriscoe.net/
>>> 
>>> 
>>> 
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               
> http://bobbriscoe.net/