Re: [tsvwg] Fragmentation & ECN encapsulation drafts, one more try

Bob Briscoe <ietf@bobbriscoe.net> Thu, 19 March 2020 19:03 UTC

To: "Rodney W. Grimes" <ietf@gndrsh.dnsmgr.net>, "Black, David" <David.Black@dell.com>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <202003181545.02IFjfQv002816@gndrsh.dnsmgr.net>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <f328c7c5-8824-f5d3-5f6a-9acf5a78f8dd@bobbriscoe.net>
Date: Thu, 19 Mar 2020 19:02:06 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/eJgXdsPLWIWu1VaaklgguvFzV1A>
Subject: Re: [tsvwg] Fragmentation & ECN encapsulation drafts, one more try

David, thx for this new email thread. I generally agree with the 
approach you have suggested.

Rod, I just ought to pick up on a couple of things you've said...

On 18/03/2020 15:45, Rodney W. Grimes wrote:
> David,
>
>> Going back to Bob's message on the 4 options (A-D) and looking for a path forward, I believe the WG is converging on something close to option C):
>>
>>> C) Say nothing about fragmentation and reassembly in rfc6040update-shim or ecn-encap-guidelines.
>>> Then use a later RFC to update them both (stds track and BCP) with a considered 'correct' approach.
>>> ecn-encap-guidelines would still include what it has always said about re-framing (which is a similar but different subject).
>> Given how we got to this point, I believe that the rough consensus of the TSVWG WG is not to use either of these drafts to update RFC 3168, and instead to use a new draft to propose changes to RFC 3168 for fairness across fragmented and non-fragmented flows.
>>
>> Another important reason for a new draft is that this WG and other impacted WGs need to discuss the implementation impacts of the proposed changes, as implementations can be expected to have complied with the "MUST" in section 5.3 of RFC 3168 that results in fragmented IPv4 flows receiving more congestion indications than similar-sized flows that are not fragmented.  That discussion and determination of what to do is going to take a while, including a broad WG Last Call that includes all the affected areas of the IETF - that's much better done via a new draft focused on that topic rather than the existing rfc6040update-shim draft.
>>
>> However, the path forward cannot just delete Section 5 of the rfc6040update-shim draft, thereby ignoring the concern, as we do have a WGLC comment to resolve that requires stating something about fragmentation.  I suggest that the overall goal should be to state as little as possible.  There appear to be 3 things that need to be dealt with:
>>
>>    1.  When a tunnel ingress fragments a packet, the ECN field in every fragment has the same value as the original packet.
>>       *   This was implicit in RFC 3168, at least as I read Section 5.3 of RFC 3168.
>>       *   It makes sense to state this explicitly with a MUST and have that statement update RFC 6040.  Such a statement would not be changed by the new draft to propose updates to RFC 3168.
>>    2.  When a packet is reassembled by a tunnel egress, and one of the fragments is marked with CE, RFC 3168 specifies what to do.
>>       *   Given the desire to change RFC 3168, that is all that should be stated here.
>>       *   That avoids repeating or reinforcing text from RFC 3168 for which changes may be proposed in the not too distant future.
>>    3.  When a packet is reassembled by a tunnel egress, and none of the fragments are marked with CE, RFC 3168 does not specify what to do.
>>       *   In fact, RFC 3168 even allows use of an ECN value that was in none of the fragments (see the last paragraph in Section 5.3 of RFC 3168), but I'd be surprised to find "running code" that behaves that way.
>>       *   I think I've seen some list discussion to the effect that RFC 3168 applies a bitwise "logical OR" across the ECN field values in the fragments - RFC 3168 does not do that.  The use of "logical OR" in the text below (from Bob) refers to case 2 above - if any fragment is marked with CE and the reassembled packet is forwarded, then RFC 3168 requires the reassembled packet to be marked with CE.
>   [RWG] I do not believe anyone was using "bitwise" as that would mangle an (ECT(1) | ECT(0)) into a CE mark.

[BB] Correct. AFAIK, no-one has ever intended "logical OR" to mean 
"bitwise OR". I certainly haven't. I hadn't even thought of that 
possible interpretation. Now I see the possible ambiguity, I'm going to 
stop using the phrase.

When I described RFC3168 CE reassembly behaviour as "logical OR", I was 
just describing the way RFC3168 says that the reassembled packet MUST be 
CE if any of the fragments are CE.
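
For illustration only (a sketch, not text from RFC 3168 or from either draft): the difference between the two readings of "logical OR" can be shown with the 2-bit ECN codepoints. The function names below are hypothetical; the codepoint values are those defined in RFC 3168.

```python
# ECN codepoints as 2-bit values (RFC 3168):
# Not-ECT = 00, ECT(1) = 01, ECT(0) = 10, CE = 11
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def bitwise_or(fragments):
    """The unintended reading: OR the raw 2-bit ECN fields together."""
    result = 0
    for ecn in fragments:
        result |= ecn
    return result


def any_ce(fragments):
    """The intended reading: is at least one fragment marked CE?"""
    return any(ecn == CE for ecn in fragments)


mixed = [ECT0, ECT1]  # no fragment carries CE
print(bitwise_or(mixed) == CE)  # True: 10 | 01 = 11, a spurious CE mark
print(any_ce(mixed))            # False: no congestion was signalled
```

The bitwise reading manufactures a CE mark from a mix of ECT(0) and ECT(1), which is the mangling Rod describes; the "any fragment is CE" reading does not.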


>
>>       *   Going back to earlier list discussion (October of last year), the clear outcome was to use the ECN value from "any" fragment in the reassembled packet, with the implementation choosing the fragment from which that value is obtained.
>   [RWG] I do believe it is possible to do slightly better than "any".  Certain combinations may even indicate a failure, ie how would it be possible to get a 00 fragment with any other fragments?
> Propose 3a.  When a packet is reassembled by a tunnel egress and all fragments contain the same ECN marking that ECN marking is applied.
> Propose 3b.  When a packet is reassembled by a tunnel egress and if any fragment contains ECN 00 the reassembled packet should be marked 00, (or dropped?).

[BB] 1) Not-ECT with CE

RFC3168 could be interpreted as saying that if CE and Not-ECT fragments 
need to be reassembled into the same packet, the packet MUST be dropped. 
But it's not particularly clear.

From the 2nd sentence onwards, I think 3168 was trying to say that if 
any fragment is CE, the congestion indication MUST be propagated, either 
as CE or as drop. But if there are both Not-ECT and CE, there is no 
marking that would safely propagate the congestion indication (because 
the Not-ECT might imply the transport will not understand CE). So the 
only alternative left is drop.

I would agree with that.

But, the point of saying as little extra as possible about RFC3168 is to 
unstick these drafts. So if there is any controversy now over whether 
(CE with Not-ECT => drop), I would say nothing about it in 
rfc6040update-shim, and continue the list discussion in parallel.

2) Not-ECT with ECT

3168 explicitly says it places no requirements on implementations in 
this case. I'm only semi-sure about Rod's suggestion of propagating 
Not-ECT. So again, if there's going to be continued discussion, let's 
continue in parallel, and say nothing about this case in rfc6040update-shim.

My concern is that, although passing on Not-ECT seems safe, Not-ECT and 
ECT fragments together could imply something on the path is mangling the 
ECN field. In the absence of the mangling, the correct ECN fields on 
each fragment might not have been either value.

I would rather we specified an action that would unambiguously highlight 
the problem to the endpoint or any downstream monitoring point (I'm not 
saying the endpoint has to always check for problems. I'm just saying 
that, if it is checking, I'd rather we revealed the problem for it to 
find). The two candidates are Not-ECT or drop, I believe, because it's 
not safe to pass on anything else anyway.

But, could passing on Not-ECT conceal this problem from the endpoint 
and/or from the downstream path? On balance, I think not. Not-ECT is 
indeed the codepoint most likely to be interpreted as highlighting a 
problem. Reasoning:

  * Not-ECT would only fail to highlight a problem if all the fragments
    were 00 before the mangling. But it's unlikely one 00 fragment but
    not another would be mangled to ECT. And I doubt non-ECN endpoints
    would be checking for ECN problems anyway.
  * If all the fragments were non-zero before the mangling, then Not-ECT
    should highlight problems to any transport watching for it.

However, there is a possibility that ECT1 is meant to mean a higher 
severity marking than ECT0 (PCN or a future scheme like SCE). So 
reverting to Not-ECT could be unsafe. We mustn't only think of SCE 
(where it would not be unsafe). We have to cater for other possible 
3-severity schemes in future. Whatever the scheme, CE plus Not-ECT will 
always be dropped. So ECT1 plus Not-ECT will always have a backstop if 
congestion gets worse. Hence I think forwarding Not-ECT is safe.

This is similar logic to that in RFC6040 (altho that was about decap not 
reassembly). RFC6040 says that an ECT1 outer on a Not-ECT inner ought 
not to be possible, but it MUST be forwarded from decap as Not-ECT. 
Because if congestion escalated to CE and was still mixed with Not-ECT, 
it would be dropped.

Summary: On balance, I think Not-ECT is safe enough.

Sorry for that long conversation with myself, but I like to think these 
things through.
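
For illustration only, the reassembly outcomes argued for above can be sketched as a small decision function. This is a sketch of the positions in this email, not agreed WG text: the function name and the DROP sentinel are hypothetical, and taking the head fragment's value for a mix of ECT(0) and ECT(1) is just one possible implementation choice.

```python
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11
DROP = None  # hypothetical sentinel: discard instead of reassembling


def reassembled_ecn(fragments):
    """Return the ECN codepoint for a packet reassembled from a
    non-empty list of fragment codepoints, or DROP."""
    kinds = set(fragments)
    if len(kinds) == 1:
        # RFC 3168: MUST NOT change the codepoint if all fragments agree.
        return fragments[0]
    if NOT_ECT in kinds:
        # CE with Not-ECT: no safe marking exists, so drop.
        # ECT with Not-ECT (no CE): forward as Not-ECT.
        return DROP if CE in kinds else NOT_ECT
    if CE in kinds:
        # RFC 3168 as it stands today: any CE fragment gives a CE packet.
        return CE
    # Only a mix of ECT(0) and ECT(1) remains; assumption: use the head
    # fragment's marking.
    return fragments[0]


print(reassembled_ecn([CE, NOT_ECT]) is DROP)       # True
print(reassembled_ecn([NOT_ECT, ECT1]) == NOT_ECT)  # True
```

The CE branch follows the current RFC 3168 rule; a probability-preserving alternative would replace only that branch.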

>
>   [RWG] This only leaves how to deal with packet fragments that contain a mixture of 01 and 10.  And I believe the present Internet would not produce this combination, so we should have a fairly easy time at sorting this case out.

[BB] Well, it depends whether you consider PCN signalling [RFC6660] as 
"the present Internet". It's standards track, implemented but not 
deployed AFAICT. When RFC6040 was written (which was when PCN marking 
[RFC5670] had just been implemented by Cisco, Huawei, Nortel, etc), the 
thinking was that it would do no harm for tunnel decaps to treat the 
severity of ECT(1) as:
     ECT(0) <= ECT(1) < CE
as long as that was not incompatible with the status quo, where ECT(1) 
and ECT(0) have equal severity.

It does no harm to continue that line of thinking. Hence 
RFC6040update-shim and ecn-encap-guidelines both recommend continuing to 
treat ECT(1) as RFC6040 did (as in the above inequality).
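
For illustration only, here is a sketch of how a tunnel decapsulator can apply the severity ordering above when combining inner and outer ECN fields. It reflects one reading of the RFC 6040 decapsulation behaviour and should be checked against that RFC; the function name, the SEVERITY table and the DROP sentinel are hypothetical.

```python
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11
DROP = None  # hypothetical sentinel: discard the packet

# Severity rank implementing ECT(0) <= ECT(1) < CE, with Not-ECT lowest.
SEVERITY = {NOT_ECT: 0, ECT0: 1, ECT1: 2, CE: 3}


def decap_ecn(inner, outer):
    """Outgoing ECN field after decapsulation (sketch)."""
    if inner == NOT_ECT:
        # The transport may not understand ECN: never expose an ECN
        # marking to it, and drop if the outer signals congestion.
        return DROP if outer == CE else NOT_ECT
    # Otherwise forward whichever of inner and outer is more severe.
    return max(inner, outer, key=SEVERITY.__getitem__)


print(decap_ecn(ECT0, ECT1) == ECT1)        # True: ECT(1) outranks ECT(0)
print(decap_ecn(NOT_ECT, ECT1) == NOT_ECT)  # True: reverts to Not-ECT
```

Because ECT(1) only ever overrides ECT(0), this stays compatible with deployments that treat the two codepoints as equivalent.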

> [RWG] It appears that making ECT(1) a priority override of ECT(0) would work well for either case of using ECT(1) as input or output and hence for both L4S and SCE and any other possible scheme.

[BB] Yes, a standards track RFC does not need to support SCE (or L4S). 
But we ought to support PCN, which might happen to be the same as 
supporting SCE.

>
> [RWG] Propose 4a.  When a packet is reassembled by a tunnel egress and any mixture of ECN 01 and 10 are present the reassembled packet should be marked ECT(1).

[BB] No. That's the same mistake as RFC3168 made with CE.

But I notice further down that you're prepared to accept a 
probability-preserving requirement in this case. IOW, for a mix of 01 
and 10, a good enough approach would be to pick the marking of one of 
the fragments for onward propagation.


>
>>       *   Trying to walk a fine line, I'd suggest crafting text that states that using the ECN value from one of the fragments is a suggested implementation that meets the requirements of RFC 3168, and leave it there without applying a "SHOULD" keyword to that behavior, lest that keyword have to be updated in the not too distant future by proposed changes to RFC 3168.
>>
>> I believe that the approach to the first two items is clear from WG discussion to date.  Item 3 is new.  Please comment.
>   [RWG] Agree on items 1 and 2 being clear.  Comments on 3 inline above.  My propose(ed) adds may need better ordering arrangement and the values 3a, 3b and 4a can be ignored.
> Keep reading as I have made some comments inline to Bob's email.

[BB] more...

>
>
>> Thanks, --David
>>
>> From: Bob Briscoe <ietf@bobbriscoe.net>
>> Sent: Tuesday, March 10, 2020 2:47 PM
>> To: Black, David; tsvwg@ietf.org
>> Subject: Re: [tsvwg] Status of ECN encapsulation drafts (i.e., stuck)
>>
>>
>> [EXTERNAL EMAIL]
>> David,
>>
>> I admit to curling up into a little ball and trying to ignore this controversy when it arose.
>> Let me try to sort this out now, for both ecn-encap-guidelines and rfc6040update-shim.
>>
>> Back in Sep '19 (quoted at the end) you asked me not to use rfc6040update-shim to update RFC3168's fragmentation behaviour, even if it's the "right thing" to do, given I was saying that there were problems with the RFC3168 approach.
>>
>> Background: Neither RFC3168 nor RFC6040 covered fragmentation & reassembly during encap and decap. So Joe Touch suggested rfc6040update-shim should fix that omission. Seems reasonable enough. However, it doesn't seem right to fix an omission by the stop-gap of:
>> 1. requiring the approach in RFC3168 that we know is potentially problematic.
>> 2. then planning to correct what we write, by updating it in a later RFC.
>>
>> Let's call that approach (A). I don't like that at all. What if step #2 never happens?
>> Fortunately, that's not the only way out of this. I can think of three other ways:
>> B) The compromise text I've drafted below, which states the high level intent of a good mechanism as a SHOULD, and gives an example of how to do it. Then also allows the RFC3168 mechanism as a "MAY".
>> C) Say nothing about fragmentation and reassembly in rfc6040update-shim or ecn-encap-guidelines. Then use a later RFC to update them both (stds track and BCP) with a considered 'correct' approach. ecn-encap-guidelines would still include what it has always said about re-framing (which is a similar but different subject).
>> D) Convince ourselves that fragmentation and reassembly during encap and decap is allowed to be different from fragmentation and reassembly without encapsulation.
>>
>> Last night, I took approach (B), but with too little time left to discuss it on the list. I scrubbed the offending paras from rfc6040update-shim and replaced them with those below (also at https://tools.ietf.org/html/draft-ietf-tsvwg-rfc6040update-shim-10#section-5 ).
>>
>> Thinking about it further since last night, I'm now inclining towards approach (C).
>> 5.  ECN Propagation and Fragmentation/Reassembly
>>
>>     The following requirements update RFC6040, which omitted handling of
>>     the ECN field during fragmentation or reassembly.  These changes
>>     might alter how many ECN-marked packets are propagated by a tunnel
>>     that fragments packets, but this would not raise any backward
>>     compatibility issues:
>>
>>     If a tunnel ingress fragments a packet, it MUST set the outer ECN
>>     field of all the fragments to the same value as it would have set if
>>     it had not fragmented the packet.
>
>>
>>
>>     During reassembly of outer fragments [I-D.ietf-intarea-tunnels], if
>>     the ECN fields of the outer headers being reassembled into a single
>>     packet consist of a mixture of Not-ECT and other ECN codepoints, the
>>     packet MUST be discarded.
>   [RWG] I draw the same conclusion as Bob here, though I make the MUST a possibility.  I am actually fine with either.

[BB] We've both now switched our positions in trying to agree with each 
other, so we're now disagreeing again!

If there's any CE with any Not-ECT, I still think drop.
But if Not-ECT with either ECT0 or ECT1 (and no CE), I now think Not-ECT 
is best (agreeing with what you said earlier in this email).

Agree?

>
>>
>>
>>     As a tunnel egress reassembles sets of outer fragments
>>     [I-D.ietf-intarea-tunnels] into packets, as long as no fragment
>>     carries the Not-ECT codepoint, it SHOULD propagate CE markings such
>>     that the proportion of reassembled packets output with CE markings is
>>     broadly the same as the proportion of fragments arriving with CE
>>     markings.
>   [RWG] This is addressed differently in David's item 2.  The text here by Bob is specifically a need for L4S, as it overrides the meaning of CE and needs the proportion of CE-marked bytes to be proper; this change would not be needed by SCE,

[BB] I have already said I am willing to just say that RFC3168 specifies 
what to do, and deal with updating RFC3168 fragment reassembly in a 
separate draft (indeed I suggested the approach that David has adopted).

David (and I) thought we had agreement late last year on 
probability-preserving for CE, which is why I wrote it into the latest 
rfc6040update-shim revision (and the tentative ecn-encap-guidelines that 
I failed to submit due to metadata errors):
http://www.bobbriscoe.net/projects/netsvc_i-f/consig/encap/draft-ietf-tsvwg-ecn-encap-guidelines-14-COULD-NOT-SUBMIT.txt

But if you're adamant about propagating CE if there are any CE 
fragments, then we're still stuck. That's the main reason we have to 
shunt off fragment reassembly into a separate process, in parallel to 
these drafts. Because we're not going to resolve it any time soon while 
you guys are treating anything I say with suspicion.

If you would be prepared to accept even the possibility that my 
explanation is unrelated to L4S v SCE, we might be able to make 
progress. But until then, there's no point me typing more and more words 
to try to explain, while you see 'L4S' written between every line, even 
though it's not actually there.



> [RWG] however SCE could use, but does not need, similar treatment of ECT(1) marking.

[BB] OK, if you'll accept that, we're converging. Admittedly for the 
wrong reasons ('cos no-one expects standards track behaviour to be 
changed to support experimental behaviour, let alone a draft like SCE 
that isn't even chartered IETF work, and if it was it would be 
experimental). But no matter why we agree. The result is the same.


>
>>     The above statement describes the approximate desired outcome, not
>>     the specific mechanism.  A simple way to achieve this outcome would be
>>     to leave a CE-mark on a reassembled packet if the head fragment is CE-
>>     marked, irrespective of the markings on the other fragments.
>>     Nonetheless, "SHOULD" is used in the above requirement to allow
>>     similar perhaps more efficient approaches that result in
>>     approximately the same outcome.
>>
>>     In RFC 3168 the approach to propagating CE markings during fragment
>>     reassembly required that a reassembled packet has to be CE-marked
>>     if any of its fragments is CE-marked.  This "logical OR" approach to
>>     CE marking during reassembly was intended to ensure that no
>>     individual CE marking is ever lost.  However, an unintended
>>     consequence is that the proportion of packets with CE markings
>>     increases.  For instance, with the logical OR approach, once a
>>     sequence of packets each consisting of 2 fragments, has been
>>     reassembled, the fraction of packets that are CE-marked roughly
>>     doubles (because the number of marks remains roughly the same, but
>>     the number of packets halves).
>   [RWG] I know there has been a long thread on this, but I am still grappling with how any unnecessary increase in CE marks occurs.  Sure, the proportion of packets with CE markings increases, but ONLY if additional congestion occurred in the tunnel and that SHOULD lead to additional CE marks.

[BB] Have you worked through the schematic that I put up here:
http://bobbriscoe.net/projects/netsvc_i-f/consig/encap/ecn-reassembly.pdf ?

Please, please, allow even the possibility that I might not be saying 
this for L4S alone. When thinking about probabilistic problems like the 
time between congestion events, even mathematicians can all convince 
themselves for years that one way is correct, until someone points out a 
gotcha. Please consider that there is a /possibility/ that you and 
Jonathan might have fallen into the same gotcha as everyone fell into 
when RFC3168 fragment reassembly was written. I think David has now seen it.

If you're prepared to approach this with an open mind, as a parallel 
process to drafting rfc6040update-shim, I don't mind trying to explain 
where the flaw in the logic is, with respect to RFC3168 CE reassembly 
for Reno-ECN and CoDel-ECN.
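
For illustration only, the arithmetic behind the "roughly doubles" statement in the quoted draft text can be sketched as follows. The sketch assumes each fragment is CE-marked independently with the same probability p, which is a simplification and not a claim about any real AQM; the function names are hypothetical.

```python
def or_marked_fraction(p, n):
    """Fraction of reassembled packets that are CE-marked when a packet
    is marked if ANY of its n fragments is CE (RFC 3168 rule)."""
    return 1 - (1 - p) ** n


def head_marked_fraction(p, n):
    """Fraction CE-marked when only the head fragment's marking is
    copied to the reassembled packet, whatever the value of n."""
    return p


p, n = 0.01, 2  # 1% of fragments marked, 2 fragments per packet
print(or_marked_fraction(p, n))    # about 0.0199, i.e. nearly 2 * p
print(head_marked_fraction(p, n))  # 0.01, same proportion as on arrival
```

Under that assumption the "any fragment" rule turns a 1% marking proportion on fragments into nearly 2% on reassembled packets, whereas copying the head fragment's marking leaves the proportion unchanged.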



[BB] Separately, I will draft up text for rfc6040update-shim and 
ecn-encap-guidelines in an email on this list, to get agreement, before 
committing to a new revision.



Bob
>
>>
>>
>>     This specification does not rule out the logical OR approach of RFC
>>     3168.  So a tunnel egress MAY CE-mark a reassembled packet if any of
>>     the fragments are CE-marked (and none are Not-ECT).  However, this
>>     approach could result in reduced link utilization, or bias against
>>     flows that are fragmented relative to those that are not.
>>
>> Regards
>>
>>
>> Bob
>> On 15/09/2019 22:07, Black, David wrote:
>> This email concerns draft-ietf-tsvwg-ecn-encap-guidelines and draft-ietf-tsvwg-rfc6040update-shim, which are being handled together for WG Last Call and RFC publication, and is posted in my role as shepherd and responsible WG chair for these drafts.  The current situation is that both drafts are stuck due to a problem with the fragmentation text added to the rfc6040update-shim draft.   Section 5 on ECN Propagation and Fragmentation/Reassembly was added to that draft in response to a WGLC comment, and it appears to have gone too far in the direction of trying to do the proverbial "right thing".
>>
>> The core of the problem is in these two paragraphs in Section 5 of that draft (https://tools.ietf.org/html/draft-ietf-tsvwg-rfc6040update-shim-09#section-5):
>>
>>     As a tunnel egress reassembles sets of outer fragments
>>     [I-D.ietf-intarea-tunnels] into packets, it SHOULD propagate CE
>>     markings on the basis that a congestion indication on a packet
>>     applies to all the octets in the packet.  On average, a tunnel egress
>>     SHOULD approximately preserve the number of CE-marked and ECT(1)-
>>     marked octets arriving and leaving (counting the size of inner
>>     headers, but not encapsulating headers that are being stripped).
>>     This process proceeds irrespective of the addresses on the inner
>>     headers.
>>
>>     Even if only enough incoming CE-marked octets have arrived for part
>>     of the departing packet, the next departing packet SHOULD be
>>     immediately CE-marked.  This ensures that CE-markings are propagated
>>     immediately, rather than held back waiting for more incoming CE-
>>     marked octets.  Once there are no outstanding CE-marked octets, if
>>     only enough incoming ECT(1)-marked octets have arrived for part of
>>     the departing packet, the next departing packet SHOULD be immediately
>>     marked ECT(1).
>>
>> Much as that may be the proverbial "right thing" to do, particularly with the benefit of 20/20 hindsight, that text is inconsistent with the following text from Section 5.3 of RFC 3168 (https://tools.ietf.org/html/rfc3168#section-5.3), as Markku Kojo has pointed out:
>>
>>
>>     ECN-capable packets MAY have the DF (Don't Fragment) bit set.
>>     Reassembly of a fragmented packet MUST NOT lose indications of
>>     congestion.  In other words, if any fragment of an IP packet to be
>>     reassembled has the CE codepoint set, then one of two actions MUST be
>>     taken:
>>
>>        * Set the CE codepoint on the reassembled packet.  However, this
>>          MUST NOT occur if any of the other fragments contributing to
>>          this reassembly carries the Not-ECT codepoint.
>>
>>        * The packet is dropped, instead of being reassembled, for any
>>          other reason.
>>
>>     If both actions are applicable, either MAY be chosen.  Reassembly of
>>     a fragmented packet MUST NOT change the ECN codepoint when all of the
>>     fragments carry the same codepoint.
>>
>> The 6040update-shim draft is intended to update RFC 6040, and a number of the tunnel protocol drafts, but it is not intended to update RFC 3168, and hence the above new text (albeit well-intentioned) is a showstopper.   Changing ECN fragmentation behavior should be done in a separate draft.
>>
>> Bob (as draft editor) - do you want to propose some new text to the list, possibly after private email discussion with Marco and me to figure out what it needs to say?
>>
>> Thanks, --David
>> ----------------------------------------------------------------
>> David L. Black, Senior Distinguished Engineer
>> Dell EMC, 176 South St., Hopkinton, MA  01748
>> +1 (774) 350-9323 New    Mobile: +1 (978) 394-7754
>> David.Black@dell.com<mailto:David.Black@dell.com>
>> ----------------------------------------------------------------
>>
>>
>>
>>
>> --
>>
>> ________________________________________________________________
>>
>> Bob Briscoe                               http://bobbriscoe.net/

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/