Re: [tsvwg] ECN encapsulation draft - proposed resolution

Bob Briscoe <ietf@bobbriscoe.net> Sat, 05 June 2021 23:06 UTC

To: Markku Kojo <kojo@cs.helsinki.fi>, David Black <David.Black@dell.com>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <MN2PR19MB40454BC50161943BC33AAAD783289@MN2PR19MB4045.namprd19.prod.outlook.com> <43e89761-d168-1eca-20ce-86aa574bd17a@bobbriscoe.net> <de8d355d-08b6-34fb-a6cc-56755c9a11ee@bobbriscoe.net> <MN2PR19MB4045DB9D2C45066AEB0762DB83259@MN2PR19MB4045.namprd19.prod.outlook.com> <alpine.DEB.2.21.2106021717300.4214@hp8x-60.cs.helsinki.fi>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <290e1624-fa1e-21d7-95fb-90e284c27dd8@bobbriscoe.net>
Date: Sun, 06 Jun 2021 00:06:14 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1
MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.21.2106021717300.4214@hp8x-60.cs.helsinki.fi>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/EBwzSVnZ1ammiaYB3WKiEsQ1vto>

Markku, see inline [BB] (I'll reply to Jonathan's email yesterday 
separately)

On 03/06/2021 12:41, Markku Kojo wrote:
> Hi David, all,
>
> Catching up ...
>
> I'm afraid there is an open/ongoing discussion on this where we have 
> not reached consensus. Last message on the topic is here:
>
>  https://mailarchive.ietf.org/arch/msg/tsvwg/AtEI72QCFhOWOn9d6xNcrssTVzs/
>
> Note that while this discussion is much about splitting IP packets to 
> smaller fragments/frames/PDUs and reassembling/decoding these smaller 
> PDUs back to larger IP packets, it does, however, relate to the 
> problem in ECN encapsulation draft on what to do when framing 
> boundaries do not necessarily align with packet boundaries. This is 
> because the case where boundaries do not align often also INVOLVES the 
> case where IP packets are split into smaller L2 PDUs (or several 
> smaller IP packets are gathered into a larger L2 PDU). So, the 
> solution should be based on the same known-to-work method no matter 
> whether we are fragmenting/reassembling IP packets or 
> encoding/decoding IP packets to/from L2 PDUs. 

[BB] RFC3168 reassembly is not 'known to work', even for fragment 
reassembly that it is intended for. You acknowledge that it suffers from 
the fairness problem unless AQMs all use byte-mode marking (and none do 
- see later). RFC3168 reassembly does not even have any suggestion on 
what to do if framing boundaries overlap packet boundaries or where 
packets are smaller than frames - so it is not 'known-to-work' in any 
case, let alone in every case.
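
To put rough numbers on the fairness problem, here is a minimal sketch in
Python. It assumes a 1500 B packet sent as three 500 B fragments, an AQM
marking probability of 1%, and the RFC3168 rule that a reassembled packet
is marked if any of its fragments was marked. The function names and all
the numbers are illustrative assumptions, not taken from any RFC or draft.

```python
def mark_prob(p, size, mode, max_size=1500):
    """Per-PDU marking probability.
    'packet' mode ignores the PDU size; 'byte' mode scales the probability
    by size/max_size, as in the byte-mode option of RED."""
    if mode == "packet":
        return p
    if mode == "byte":
        return p * size / max_size
    raise ValueError(mode)

def reassembled_mark_prob(p, frag_size, n_frags, mode):
    """Probability that the reassembled packet ends up marked, given that
    it is marked if ANY of its fragments was marked."""
    q = mark_prob(p, frag_size, mode)
    return 1 - (1 - q) ** n_frags

p = 0.01                                     # assumed AQM marking probability
whole = mark_prob(p, 1500, "packet")         # unfragmented packet, same in either mode
frag_pkt_mode = reassembled_mark_prob(p, 500, 3, "packet")
frag_byte_mode = reassembled_mark_prob(p, 500, 3, "byte")
print(whole, frag_pkt_mode, frag_byte_mode)
```

With these assumed numbers, a packet-mode AQM gives the fragmented flow
roughly three times the marking probability of the unfragmented one,
whereas a byte-mode AQM would bring the two close together.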

> As the discussion referred to in the above message is longish, I'll try 
> to summarize the problem space at the end of this message.
>
> And, then there is also another thread that Bob initiated and I 
> seemingly have not replied although promised (I was away from any IETF 
> work in April due to family emergency and only resuming now, my 
> apologies). That thread is about the (additional but not independent) 
> problem where L3 packet and L2 PDU boundaries do not align. The last 
> message on the thread is here:
>
>  https://mailarchive.ietf.org/arch/msg/tsvwg/3la3kG5-JLU2OPx3zGxxmhWGYEo/
>
> I will send my notes on that very issue separately tomorrow.
>
> I believe that the best solution to allow ECN encap draft to move 
> forward is that the draft does not say anything on the topic except 
> points to a new draft (the one that has been envisioned to handle the 
> IP fragmentation problem and would include also the handling of not 
> aligned packet/frame boundaries) and we initiate such new draft before 
> ECN encap draft gets published.

[BB] Let us now start talking about the content of that.

Before we do, can we make sure we're all on the same page regarding some 
basics that I believe are /facts/ about preserving markings when PDU 
boundaries change? Do you agree with the following table that I asked 
about earlier:

                   | marked    marked
                   | PDUs      bytes
-------------------+------------------
preserving prop'n  |  ==        ==
preserving number  |  !=        ==


IOW, do you agree that the three that are tagged as '==' are equivalent 
ways of expressing the same thing, but different from the one tagged '!=' ?
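
As a worked example of the table (a sketch only: the sizes and the
one-in-ten marking ratio are assumed numbers, not from any draft), take 10
packets of 1500 B, one of them marked, each split into three 500 B frames.
Marking every frame derived from the marked packet preserves the
proportion of marked PDUs, the proportion of marked bytes and the number of
marked bytes. Preserving the number of marked PDUs (one frame) preserves
none of them.

```python
# Assumed, illustrative numbers only.
PKT_SIZE, FRAME_SIZE = 1500, 500
n_pkts, marked_pkts = 10, 1
frames_per_pkt = PKT_SIZE // FRAME_SIZE
n_frames = n_pkts * frames_per_pkt

def summarize(marked_pdus, n_pdus, pdu_size):
    """Return (proportion of marked PDUs, proportion of marked bytes,
    number of marked bytes) for equal-sized PDUs."""
    marked_bytes = marked_pdus * pdu_size
    return (marked_pdus / n_pdus,
            marked_bytes / (n_pdus * pdu_size),
            marked_bytes)

incoming = summarize(marked_pkts, n_pkts, PKT_SIZE)

# Option 1: mark every frame derived from a marked packet.
mark_all_frames = summarize(marked_pkts * frames_per_pkt, n_frames, FRAME_SIZE)

# Option 2: preserve the NUMBER of marked PDUs (one packet -> one frame).
same_pdu_count = summarize(marked_pkts, n_frames, FRAME_SIZE)

print("incoming        ", incoming)
print("mark all frames ", mark_all_frames)
print("same PDU count  ", same_pdu_count)
```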

>
> Below I try to summarize the problem with the two suggested paragraphs 
> with two SHOULDs.
>
> 1. The two paragraphs (SHOULDs) are contradictory: there is no 
> algorithm that has been shown to be able to correctly fulfill both 
> requirements (please see the first message referred above where I 
> explain why the algorithms that Bob has suggested do not work correctly).

[BB] Since draft-13 (May 2019) I haven't given any specific algorithms 
(because the chairs requested we don't discuss specifics).

Nonetheless, now that the drafts are progressing, I think we're allowed 
to discuss specifics. Although, let's not discuss implementation 
efficiency quite yet - let's keep to intent for now.

So here's one example of pseudocode for propagating ECN marking when 
frames are derived from an ordered byte stream of larger packets, but 
the frame and packet boundaries do not align:

----------------->+<---------------------------->+<------------------------------>+<----
       Pkt1       |             Pkt2             |              Pkt3              |
+-------------+-------------+-------------+-------------+-------------+-------------+---
|     Fr1     |     Fr2     |     Fr3     |     Fr4     |     Fr5     |     Fr6     |
+-------------+-------------+-------------+-------------+-------------+-------------+---


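/* Algorithm A =======================
  */
#define TIMEOUT 1ms    // (say) as a default for the public Internet
bool marked = FALSE;   // was the most recent frame to arrive marked?
bool pending = FALSE;  // has any frame arrived marked since the last packet left?
time expiry_time = 0;  // until this time, a pending mark alone does not trigger a mark

// On frame arrival
marked = ismarked(incoming_frame);
pending = pending || marked;

// On packet departure
if (marked || (pending && time(now) >= expiry_time)) {
     mark(outgoing_packet);
     expiry_time = time(now) + TIMEOUT;
}
pending = FALSE;
//\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\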

Explanation:
1) First consider the case where marking is frequent enough that 
outgoing packets are always marked more often than every 1ms. Then as 
each outgoing packet is made ready to depart, it will be marked if the 
last frame to arrive was marked. This is based on the idea Jonathan 
suggested for preserving the proportion of marked PDUs.
2) Now, what if marks arrive infrequently, so that the time since the 
last outgoing mark exceeds the TIMEOUT? Then the outgoing packet will be 
marked if /any/ frame that it consists of was marked, because the timer 
condition and 'pending' will both be true: pending is set as soon as any 
frame arrives marked, and it is cleared only after each packet is sent.

The value of TIMEOUT can be argued about for ever, but the intent is to 
ensure it's significantly smaller than most RTTs over the Internet.
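
For anyone who wants to experiment, here is a minimal Python transcription
of Algorithm A with a toy driver. It is a sketch under assumptions, not a
reference implementation: the class and function names are hypothetical,
times are in seconds, and for simplicity the driver feeds each packet
exactly three whole frames (i.e. it does not model misaligned boundaries).

```python
class AlgorithmA:
    """Transcription of the Algorithm A pseudocode (times in seconds)."""

    def __init__(self, timeout=0.001):
        self.timeout = timeout
        self.marked = False       # was the most recent frame marked?
        self.pending = False      # any marked frame since the last departure?
        self.expiry_time = 0.0

    def on_frame_arrival(self, frame_is_marked):
        self.marked = frame_is_marked
        self.pending = self.pending or frame_is_marked

    def on_packet_departure(self, now):
        """Return True if the outgoing packet should be marked."""
        mark = self.marked or (self.pending and now >= self.expiry_time)
        if mark:
            self.expiry_time = now + self.timeout
        self.pending = False
        return mark

def run(decap, packets):
    """packets: list of (departure_time, [mark flag of each frame])."""
    out = []
    for now, frame_marks in packets:
        for m in frame_marks:
            decap.on_frame_arrival(m)
        out.append(decap.on_packet_departure(now))
    return out

# Case 2 of the explanation: marks are rare, so the timer has expired and a
# mark on ANY constituent frame is propagated.
sparse = run(AlgorithmA(), [(0.010, [False, False, False]),
                            (0.020, [True, False, False]),
                            (0.030, [False, False, False])])

# Case 1: a mark went out less than TIMEOUT ago, so only the last frame
# to arrive decides whether the next packet is marked.
dense = run(AlgorithmA(), [(0.0001, [False, False, True]),
                           (0.0002, [True, False, False]),
                           (0.0003, [False, False, False])])
print(sparse, dense)
```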

Here's an alternative algorithm:

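/* Algorithm B =======================
  */
#define TIMEOUT 400μs    // (say) as a default for the public Internet
int balance = 0;         // marked bytes received minus marked bytes sent; negative = deficit
time expiry_time = 0;    // time after which an outstanding deficit starts to decay

// On frame arrival
if (ismarked(incoming_frame)) {
     balance += size(incoming_frame);
}

// On packet departure
if (balance > 0) {
     mark(outgoing_packet);
     balance -= size(outgoing_packet);
     expiry_time = time(now) + TIMEOUT;
} else if (time(now) >= expiry_time) {
     balance /= 2;       // decay the deficit; integer division rounds towards zero
}
//\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\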

Explanation:
While marked packets are arriving often enough, this keeps the 
proportion of outgoing marked bytes the same as the proportion incoming 
(it therefore also preserves the proportion of marked PDUs).
But if marks arrive more infrequently, it preserves the /timing/ and 
therefore the /number/ of marks.

More specifically, the algo works as follows:
* When a marked frame first arrives, the algo marks an outgoing packet 
immediately, even if there weren't enough incoming marked bytes for the 
outgoing packet size. But the algo holds the deficit in balance, so that 
if more marked bytes appear before the TIMEOUT, they will have to eat up 
the deficit before causing another outgoing mark. This is what ensures 
the proportion outgoing balances the proportion incoming.
* However, if the balance has been negative for more than the TIMEOUT, 
it is halved. And after another timeout the deficit is halved again, and 
so on, until eventually it will round down to zero. Then, with 
certainty, the next incoming marked frame, whatever its size, will 
trigger an outgoing mark immediately. In the intervening period, every 
time the deficit is halved, it becomes more likely that an incoming 
marked frame will cause an immediate outgoing mark, by taking the 
balance over zero.

I prefer the approach of gradually transitioning between preserving the 
proportion of marks and preserving the timing of marks, rather than a 
hard cut-off between the two. But the latter could be implemented simply 
by changing '/= 2' to '= 0' and using a longer TIMEOUT. Also note that 
the balance approach works whether some frames are larger than packets 
or some packets are larger than frames.
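
Here is a minimal Python sketch of Algorithm B with a toy driver, again
with hypothetical names and assumed numbers (500 B frames, 1500 B packets,
times in microseconds). One deliberate difference from the pseudocode,
which is an assumption on my part: the sketch restarts the timer after each
halving, so that successive halvings are a TIMEOUT apart, as the prose
above describes.

```python
class AlgorithmB:
    """Sketch of Algorithm B (sizes in bytes, times in microseconds)."""

    def __init__(self, timeout_us=400):
        self.timeout_us = timeout_us
        self.balance = 0          # marked bytes in credit; negative = deficit
        self.expiry_time = 0

    def on_frame_arrival(self, size, frame_is_marked):
        if frame_is_marked:
            self.balance += size

    def on_packet_departure(self, size, now_us):
        """Return True if the outgoing packet should be marked."""
        if self.balance > 0:
            self.balance -= size
            self.expiry_time = now_us + self.timeout_us
            return True
        if now_us >= self.expiry_time:
            # Halve towards zero. int() truncates, whereas Python's //
            # would floor and leave a deficit of -1 forever.
            self.balance = int(self.balance / 2)
            # ASSUMPTION (not in the pseudocode): wait a further TIMEOUT
            # before halving again.
            self.expiry_time = now_us + self.timeout_us
        return False

def run(decap, packets, frame_size=500, pkt_size=1500):
    """packets: list of (departure_time_us, [mark flag of each frame])."""
    out = []
    for now_us, frame_marks in packets:
        for m in frame_marks:
            decap.on_frame_arrival(frame_size, m)
        out.append(decap.on_packet_departure(pkt_size, now_us))
    return out

# Frequent marks: 1 frame in 3 arrives marked, packets leave every 100 us.
steady = run(AlgorithmB(), [(100 * i, [True, False, False])
                            for i in range(1, 10)])

# Infrequent marks: one marked frame, a long quiet period during which the
# deficit decays to zero, then another single marked frame.
sparse = run(AlgorithmB(),
             [(100, [True, False, False])]
             + [(1000 * i, [False, False, False]) for i in range(1, 11)]
             + [(11000, [True, False, False])])
print(steady, sparse)
```

In the frequent case one packet in three goes out marked, matching the one
third of incoming bytes that were marked. In the infrequent case the second
marked frame triggers an immediate outgoing mark, because the deficit left
by the first mark has decayed to zero in the meantime.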

By definition these algorithms resolve a compromise (between preserving 
fairness and preserving timeliness). So if you're a cup-half-empty sort 
of person, you will always be able to argue that they're not truly 
correct. But if you're a cup-half-full-sort of person you will be able 
to argue that they're never too far off being correct.


>
> 2. There is no actual algorithm in the draft for a potential 
> implementor that would allow for a proper implementation (nor is there 
> experimental evidence showing any such algorithm would work correctly).
> With the relatively vague first SHOULD, it is very unlikely anyone 
> would get it right, I think.

[BB] Can you not remember what happened between draft-13 (May 2019) and 
draft-14 (Nov 2020)? Because we couldn't agree, the chairs asked me to 
remove the specific algorithm that was in the draft, and solely state 
requirements (which eventually resulted in the two SHOULDs). The chairs 
wanted to defer this debate, and not to constrain Layer-2 solutions to a 
particular algorithm in the mean time.

If you don't agree with that approach, don't argue with me. The chairs' 
intent was that the question of how exactly to mark PDUs while reframing 
was controversial, so it should be solved in a separate draft, in order 
that this ecn-encap draft could go through quickly (ironic raised eyebrow).

>
> 3. RFC 7141 has been used as the guideline for the new proposed text
>    which makes resolving the issue even more problematic because the
>    recommendations in RFC 7141 are the origin of the problem on how
>    to handle fragmentation/reassembly as well as encapsulation/
>    decapsulation of IP packets/L2 frames when there is an AQM
>    marking/dropping fragmented/split PDUs. As said, this is not
>    independent of the problem with not aligned packet/frame boundaries.

[BB] RFC7141 is not "the origin of the problem". It described current 
AQM practice, which was not what researchers thought (and apparently 
still isn't). It is a BCP and many many people were involved in reaching 
consensus on it during the 6 years it progressed through the IETF. It 
explained why AQM practice at the time was not implementing byte-mode 
drop, and proposed a way to move forward with packet mode drop instead.

It included a survey which found that /none/ of the respondents had 
implemented byte-mode dropping (admittedly with only a 19% response rate 
out of 84 vendors contacted). We also checked Linux RED, which did not 
implement byte-mode drop either.
See https://datatracker.ietf.org/doc/html/rfc7141#appendix-A

RFC7567 is also a BCP, and in section 4.4 it also recommends against 
byte-mode marking and byte-mode drop.

Since RFC7141 and RFC7567 were published, a new round of AQMs appeared 
(CoDel and PIE). Contrary to what you say in the first linked email 
above, CoDel does not do byte-mode drop or byte-mode marking{Note 1}. 
And PIE doesn't either.

DOCSIS PIE does do a limited form of byte-mode drop{Note 2}, but it does 
not support ECN and it's not under 'IETF-change-control' anyway (the 
authors offered to document it as an informational RFC). The only other 
algorithm I know of that has implemented byte-mode drop is the RED used 
in the ns2 simulator.

Later you say byte-mode drop is common industry practice and that it's 
the way forward. It's not industry practice at all, so it's hardly going 
to be the way forward.


{Note 1}: With CoDel:
* The likelihood of marking a specific packet does not depend on that 
specific packet's size, only on the average size of all packets. That is 
packet-mode marking, not byte-mode marking.
* The control law reduces the time between each mark/drop on a 
preordained schedule, so the likelihood of any one packet being 
dropped/marked is higher and increases faster if the average packet rate 
is lower (which can be because packets are larger on average).
* However, just because CoDel happens to mark packets on a preordained 
time schedule is not why it's important for a decap to propagate a mark 
in a timely fashion. The motivation is not specific to CoDel - it's to 
minimize the delay around the control loop whatever the AQM.
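
The second bullet can be illustrated with a rough Python sketch of CoDel's
control law, in which successive marks are spaced by interval/sqrt(count)
while the queue stays above target. The 100 ms interval is CoDel's default.
The 500 ms window, the 12 Mb/s rate and the packet sizes are assumed purely
for illustration, and the sketch ignores everything else CoDel does (the
target sojourn time, leaving and re-entering the dropping state, etc.).

```python
import math

INTERVAL_MS = 100.0   # CoDel's default interval

def mark_times(window_ms):
    """Times (ms) of successive marks within the window, assuming the queue
    stays above target throughout, so the marking schedule is never reset."""
    times, t, count = [], 0.0, 1
    while t < window_ms:
        times.append(t)
        t += INTERVAL_MS / math.sqrt(count)   # gap shrinks as count grows
        count += 1
    return times

def fraction_marked(pkt_size_bytes, rate_bps=12e6, window_ms=500.0):
    """Fraction of packets marked in the window, for uniform packet size."""
    pkts_in_window = rate_bps * (window_ms / 1000.0) / (8 * pkt_size_bytes)
    return len(mark_times(window_ms)) / pkts_in_window

large = fraction_marked(1500)   # 1000 packets/s at the assumed rate
small = fraction_marked(500)    # 3000 packets/s at the assumed rate
print(len(mark_times(500.0)), large, small)
```

The number of marks in the window is fixed by the schedule, not by the
packets, so under these assumptions the flow of 1500 B packets sees three
times the per-packet marking likelihood of the flow of 500 B packets at the
same bit-rate. No individual packet's size enters into the decision.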

{Note 2}: DOCSIS PIE implements limited dependence on packet size, in 
order to reduce the chance of dropping small control packets (not 
because it considers that drop should depend on packet size). In the 
spec, it apologizes for not following RFC7567, but it includes 
safeguards against the consequential DDoS vulnerability that it opens up.


> The problems that we have at hand are as follows:
>
>  a) The original paper on RED suggested byte-mode operation where
>     byte-mode dropping/marking would adjust drop/mark probability
>     of smaller fragments (to have lower probability) such that with
>     the same level of bit-congestion the RED AQM would mark/drop
>     approximately at the same point a packet for a flow being
>     fragmented and for a flow with full-sized packets, i.e., it
>     would treat fragmented and non-fragmented traffic fairly.

[BB] Can you give a more precise description of the AQM scenario you 
have in your mind here? The words "at the same point" make me suspicious 
that you might be thinking of something different to byte-mode drop. 
It's possible you're talking about byte-mode queue measurement (see 
RFC7141 for definitions). So please describe exactly how you think this 
AQM is working. I.e. whether there are per-flow queues or a shared 
queue; in what units it measures the queue, whether it marks packets of 
different sizes within the same flow with different likelihoods, etc.


> That byte-mode drop together with RFC 3168 reassembly logic
>     results in fair and correct behavior for Standards track
>     congestion control, loss-based CC included.

[BB] Assuming for now that you are indeed talking about byte mode drop,...

The perfect complementarity you claim between byte-mode drop and RFC3168 
reassembly never existed, because no-one implemented byte-mode drop. 
RFC7141 did not /cause/ your perfect world to collapse. It investigated 
and discovered that byte-mode drop didn't exist, worked out why, 
articulated why, and proposed a way forward that would give the desired 
complementarity between network and hosts, without relying on byte-mode 
drop, which was clearly problematic to everyone.

You write the above as if byte-mode was deliberately introduced in RED 
so that it worked well with fragmentation. I'm afraid that's a 
rewrite of history in reverse. The interaction between ECN and 
fragmentation hadn't even been thought about when byte mode and packet 
mode were included in the original 1993 paper on RED. Byte and packet 
mode were two options, which Sally Floyd left configurable because it 
wasn't completely clear at that time which one was most suitable (see 
the [pktByteEmail] reference in RFC7141 to Sally's 1997 email about this).

6 years later, when ECN first became an experimental RFC, there was 
still no mention or thought of fragmentation in RFC 2481. I became 
involved during the late stages of the update from RFC 2481 to what 
became RFC 3168 - 8 years after the original RED paper. Jon Crowcroft 
and I pointed out that there was no mention of fragmentation, as part of 
our attempt to make a more principled division between the IP and TCP 
parts of RFC3168 - see our review at 
https://bobbriscoe.net/pubs.html#ECN-IP (we were coming at it from a 
real-time media perspective, not just TCP). The fragmentation approach 
that was subsequently added to what became RFC3168 was still highly 
TCP-specific, even though it was at the IP layer. However, I didn't feel 
I should hold up RFC3168 to argue any further about it (unlike the 
present situation in tsvwg, I took the view that I had arrived late at 
the party, so allowing others to make progress was more important than 
me continually making the same point over and over, even tho I thought I 
was right).

Then we reach the 2008-2014 time-frame over which RFC7141 was developed 
- when it was discovered that no-one was implementing byte-mode drop 
anyway (see back where I started the story above).


>
>  b) Recommendation in RFC 7141 to not do byte-mode drop but
>     instead use packet-mode drop (with equal drop/mark probability
>     regardless of PDU size) will treat fragmented (split)
>     traffic unfairly, yielding fragmented traffic suboptimal
>     performance (as Bob has indicated several times). Therefore,
>     packet-mode drop would need an algorithm at reassembly/decapsulation
>     that SHOULD approximately preserve the proportion of PDUs(/bytes)
>     with congestion indications arriving and leaving.
>
>     However, there is no known algorithm that could do this correctly
>     at reassembly/decapsulation as stated in item 1 above.
>     More importantly, even if such algorithm existed it cannot
>     work with non-ECT AQMs, i.e., with loss-based congestion control
>     for which no adjustment of congestion indications can be done
>     at the reassembly/decapsulator. But, one can achieve the correct
>     outcome in a single place: in the AQM algorithm itself by employing
>     byte-mode dropping/marking; it works correctly also for the majority
>     of the traffic today, that is, for traffic employing loss-based
>     congestion control.

[BB] This is nostalgia for a past that never was. When no-one has done 
what you think they should have done, it is important to try to 
understand why.

>
>     Moreover, doing adjustment at the reassembly/decapsulator is
>     architecturally not a very good solution because we would
>     need such an algorithm at several places (at receiving
>     endpoints, at tunnel egress, at L2 decapsulator) which
>     introduces quite unnecessary complexity in several places.

[BB] Quite the opposite. The whole approach is designed to minimize 
in-network processing, and move it to the end-systems.
See https://datatracker.ietf.org/doc/html/rfc7141#section-3.5

At most decaps (including all tunnels), the framing doesn't change. 
Fragmentation is deprecated. And few networks apply ECN marking where 
there is no IP-awareness.


>
>  c) Recommendation in RFC 7141 is not applicable with AQMs that do
>     not use probabilistic dropping/marking. E.g., it would result
>     in incorrect behavior with CoDel AQM that employs deterministic
>     dropping (please see more detailed explanation in the first
>     message referred above) and with any potential new AQM that
>     does not employ probabilistic dropping/marking.

[BB] Incorrect. See explanation earlier in this email.

>
>  d) Although RFC 7141 is BCP, the recommendation in it is not based
>     on any deployed mechanism (or at least I am not aware of any such
>     best practice) nor on any published/evaluated algorithm that has
>     been shown to work. On the other hand, there is quite a bit of
>     experimental evaluation on RFC 3168 reassembly & byte-mode
>     drop/mark (although more high-quality evaluation would be useful).

[BB] This is completely the reverse of reality. As explained above, 
RFC7141 was based on a survey of AQM implementations at the time, and it 
recorded industry practice at the time. That practice has continued 
since then in CoDel, PIE and now PI2, DCTCP, etc.

In contrast, AFAICT, the /only/ use of byte-mode marking has been in the 
research community.

Regards


Bob

>
>
> Thanks,
>
> /Markku
>
> On Tue, 25 May 2021, David Black wrote:
>
>>
>> As draft shepherd and a WG chair, I believe that these drafts resolve 
>> the last of the open issues from WG
>> Last Call.
>>
>>
>>
>> In the next week or two, I will prepare the shepherd writeups and 
>> submit these drafts to our AD for further
>> review towards IETF Last Call and publication as RFCs.
>>
>>
>>
>> Thanks, --David
>>
>>
>>
>> From: Bob Briscoe <ietf@bobbriscoe.net>
>> Sent: Tuesday, May 25, 2021 10:35 AM
>> To: Black, David; tsvwg@ietf.org
>> Cc: Donald Eastlake; John Kaippallimalil
>> Subject: Re: [tsvwg] ECN encapsulation draft - proposed resolution
>>
>>
>>
>> David, tsvwg list,
>>
>> As promised yesterday, we just posted a new rev of 
>> ecn-encap-guidelines-16, with text based on your
>> (David's) suggestions below. To add the references and to avoid some 
>> repetition, I twiddled the order round,
>> but otherwise kept the text intact.
>>
>> I also took the opportunity to post a new rev of rfc6040update-shim, 
>> 'cos I noticed Geneve has been
>> published as an RFC. There were also a couple of words edited in my 
>> local copy as agreed on the list a few
>> months ago.
>>
>> https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-encap-guidelines
>> https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-rfc6040update-shim
>>
>> Cheers
>>
>>
>>
>> Bob
>>
>> On 24/05/2021 13:50, Bob Briscoe wrote:
>>
>>       David, Thx for bringing this one up. See [BB] inline,
>>
>>       On 22/05/2021 01:02, Black, David wrote:
>>
>>       On another topic, I believe that I have good news to pass along on
>>       the ECN encapsulation drafts.
>>
>>       The current situation is that the 6040update-shim draft is ready for
>>       RFC publication to be requested, but there's an open issue in the
>>       ecn-encap draft on the contents of this paragraph in Section 4.6
>>       (Reframing and Congestion Markings),
>>       https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-encap-guidelines-15#section-4.6:
>>
>>          Congestion indications SHOULD be propagated on the basis that an
>>          encapsulator or decapsulator SHOULD approximately preserve the
>>          proportion of PDUs with congestion indications arriving and leaving.
>>
>>       Digging further, this area appears to be dealt with in greater length
>>       and detail by RFC 7141 (Byte and Packet Congestion Notification)
>>       Section 2.4 (Recommendation on Handling Congestion Indications When
>>       Splitting or Merging Packets),
>>       https://datatracker.ietf.org/doc/html/rfc7141#section-2.4.  The short
>>       summary is that the quoted sentence is generally correct with RFC 7141
>>       containing a more comprehensive discussion including an exception.  As
>>       RFC 7141 is a BCP, I suggest treating it as authoritative on this
>>       matter for now, leaving redesign in this area to a possible future
>>       draft (as we did in the 6040update-shim draft wrt RFC 3168 fragment
>>       reassembly requirements).
>>
>>       To carry this out, here's an initial ecn-encap draft text change
>>       suggestion (begins with last two sentences in second paragraph of
>>       Section 4.6):
>>
>>
>>
>>       OLD
>>
>>             Where framing boundaries do not necessarily align
>>          with packet boundaries, the following guidance will be needed.  It
>>          explains how to propagate ECN markings from layer-2 frame headers
>>          when they are stripped off and IP PDUs with different boundaries are
>>          reassembled for forwarding.
>>
>>          Congestion indications SHOULD be propagated on the basis that an
>>          encapsulator or decapsulator SHOULD approximately preserve the
>>          proportion of PDUs with congestion indications arriving and leaving.
>>
>>          The mechanism for propagating congestion indications SHOULD ensure
>>          that any incoming congestion indication is propagated immediately,
>>          not held awaiting the possibility of further congestion indications
>>          to be sufficient to indicate congestion on an outgoing PDU.
>>
>>       NEW
>>
>>             Where framing boundaries do not necessarily align
>>          with packet boundaries, the provisions of Section 2.4 of RFC 7141
>>          apply to propagation of ECN markings from layer-2 frame headers
>>          when they are stripped off and IP PDUs with different boundaries are
>>          reassembled for forwarding. Those provisions include: "The general
>>          rule to follow is that the number of octets in packets with
>>          congestion indications SHOULD be equivalent before and after merging
>>          or splitting." See RFC 7141 for the complete provisions and related
>>          discussion, including an exception to that general rule.
>>
>>          In addition to adhering to the provisions of RFC 7141 Section 2.4,
>>          the mechanism for congestion indication propagation SHOULD ensure
>>          that any incoming congestion indication is propagated immediately,
>>          and not held awaiting possible arrival of further congestion
>>          indications sufficient to indicate congestion for all of the octets
>>          of an outgoing IP PDU.
>>
>>       END
>>
>>
>> [BB] OK, this is indeed progress.
>>
>>
>>
>>
>>       RFC 7141 (a BCP) would be added as a normative reference.
>>
>>
>> [BB] I'll write that up now. And post a revised draft.
>>
>>
>> Bob
>>
>>
>>
>>
>>
>>       Comments?
>>
>>
>>
>>       Thanks, --David (as draft shepherd)
>>
>>
>>
>>       David L. Black, Sr. Distinguished Engineer, Technology & Standards
>>
>>       Infrastructure Solutions Group, Dell Technologies
>>
>>       mobile +1 978-394-7754 David.Black@dell.com
>>
>>
>>
>>
>>
>> -- 
>>
>> ________________________________________________________________
>>
>> Bob Briscoe                               http://bobbriscoe.net/ 
>>
>>
>>
>>
>>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/