Re: [tsvwg] Fragmentation & ECN encapsulation drafts, one more try

"Rodney W. Grimes" <ietf@gndrsh.dnsmgr.net> Wed, 18 March 2020 15:46 UTC

From: "Rodney W. Grimes" <ietf@gndrsh.dnsmgr.net>
Message-Id: <202003181545.02IFjfQv002816@gndrsh.dnsmgr.net>
In-Reply-To: <MN2PR19MB40454A8F2A88B864C6A768BE83F70@MN2PR19MB4045.namprd19.prod.outlook.com>
To: "Black, David" <David.Black@dell.com>
Date: Wed, 18 Mar 2020 08:45:41 -0700
CC: Bob Briscoe <ietf@bobbriscoe.net>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/-RLSrimvRC__qdPBxoMJIFTSLvs>
Subject: Re: [tsvwg] Fragmentation & ECN encapsulation drafts, one more try

David,

> Going back to Bob's message on the 4 options (A-D) and looking for a path forward, I believe the WG is converging on something close to option C):
> 
> > C) Say nothing about fragmentation and reassembly in rfc6040update-shim or ecn-encap-guidelines.
> > Then use a later RFC to update them both (stds track and BCP) with a considered 'correct' approach.
> > ecn-encap-guidelines would still include what it has always said about re-framing (which is a similar but different subject).
> 
> Given how we got to this point, I believe that the rough consensus of the TSVWG WG is not to use either of these drafts to update RFC 3168, and instead to use a new draft to propose changes to RFC 3168 for fairness across fragmented and non-fragmented flows.
> 
> Another important reason for a new draft is that this WG and other impacted WGs need to discuss the implementation impacts of the proposed changes, as implementations can be expected to have complied with the "MUST" in section 5.3 of RFC 3168 that results in fragmented IPv4 flows receiving more congestion indications than similar-sized flows that are not fragmented.  That discussion and determination of what to do is going to take a while, including a broad WG Last Call that includes all the affected areas of the IETF - that's much better done via a new draft focused on that topic rather than the existing rfc6040update-shim draft.
> 
> However, the path forward cannot just delete Section 5 of the rfc6040update-shim draft, thereby ignoring the concern, as we do have a WGLC comment to resolve that requires stating something about fragmentation.  I suggest that the overall goal should be to state as little as possible.  There appear to be 3 things that need to be dealt with:
> 
>   1.  When a tunnel ingress fragments a packet, the ECN field in every fragment has the same value as the original packet.
>      *   This was implicit in RFC 3168, at least as I read Section 5.3 of RFC 3168.
>      *   It makes sense to state this explicitly with a MUST and have that statement update RFC 6040.  Such a statement would not be changed by the new draft to propose updates to RFC 3168.
>   2.  When a packet is reassembled by a tunnel egress, and one of the fragments is marked with CE, RFC 3168 specifies what to do.
>      *   Given the desire to change RFC 3168, that is all that should be stated here.
>      *   That avoids repeating or reinforcing text from RFC 3168 for which changes may be proposed in the not too distant future.
>   3.  When a packet is reassembled by a tunnel egress, and none of the fragments are marked with CE, RFC 3168 does not specify what to do.
>      *   In fact, RFC 3168 even allows use of an ECN value that was in none of the fragments (see the last paragraph in Section 5.3 of RFC 3168), but I'd be surprised to find "running code" that behaves that way.
>      *   I think I've seen some list discussion to the effect that RFC 3168 applies a bitwise "logical OR" across the ECN field values in the fragments - RFC 3168 does not do that.  The use of "logical OR" in the text below (from Bob) refers to case 2 above - if any fragment is marked with CE and the reassembled packet is forwarded, then RFC 3168 requires the reassembled packet to be marked with CE.

 [RWG] I do not believe anyone was using "bitwise" as that would mangle an (ECT(1) | ECT(0)) into a CE mark.
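
For concreteness, here is a minimal Python sketch of the point about "bitwise".  It assumes the two-bit ECN codepoint values from RFC 3168 (Not-ECT=00, ECT(1)=01, ECT(0)=10, CE=11); the constant names are illustrative only.

```python
# Two-bit ECN field codepoints as defined in RFC 3168.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11

# A bitwise OR of the two ECT codepoints produces the CE bit pattern,
# even though neither fragment carried a congestion mark.
bitwise_or = ECT_1 | ECT_0
print(bitwise_or == CE)
```

This is why the "logical OR" in the discussion can only sensibly mean "CE if any fragment is CE", not an OR of the field bits.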

>      *   Going back to earlier list discussion (October of last year), the clear outcome was to use the ECN value from "any" fragment in the reassembled packet, with the implementation choosing the fragment from which that value is obtained.

 [RWG] I do believe it is possible to do slightly better than "any".  Certain combinations may even indicate a failure, i.e., how would it be possible to get a 00 fragment together with fragments carrying any other codepoint?
Propose 3a.  When a packet is reassembled by a tunnel egress and all fragments contain the same ECN marking, that ECN marking is applied to the reassembled packet.
Propose 3b.  When a packet is reassembled by a tunnel egress and any fragment contains ECN 00, the reassembled packet should be marked 00 (or dropped?).

 [RWG] This only leaves how to deal with packet fragments that contain a mixture of 01 and 10.  I believe the present Internet would not produce this combination, so we should have a fairly easy time sorting this case out.  It appears that making ECT(1) a priority override of ECT(0) would work well whether ECT(1) is used as an input or an output, and hence for both L4S and SCE and any other possible scheme.

Propose 4a.  When a packet is reassembled by a tunnel egress and any mixture of ECN 01 and 10 is present, the reassembled packet should be marked ECT(1).
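
As a rough sketch only, the proposals 3a, 3b and 4a, combined with the RFC 3168 handling of CE (item 2), could be expressed in Python as below.  The function name, the drop_on_not_ect_mix flag (standing in for the open "or dropped?" question) and the use of None to mean "drop" are all assumptions made for illustration.  The choice to always drop when CE and Not-ECT are mixed is one reading of RFC 3168 Section 5.3, not something stated in the proposals.

```python
from typing import Iterable, Optional

# Two-bit ECN codepoints from RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def reassembled_ecn(fragments: Iterable[int],
                    drop_on_not_ect_mix: bool = False) -> Optional[int]:
    """Return the ECN codepoint for the reassembled packet, or None to drop.

    Hypothetical helper: `fragments` holds the outer ECN field of each fragment.
    """
    seen = set(fragments)
    if not seen:
        raise ValueError("at least one fragment is required")
    if len(seen) == 1:
        # 3a: every fragment agrees, so keep that codepoint.
        return next(iter(seen))
    if NOT_ECT in seen:
        # 3b: a 00 fragment mixed with anything else.
        # Assumption: if CE is also present the packet is dropped, since
        # RFC 3168 forbids both setting CE here and losing the CE mark.
        if CE in seen or drop_on_not_ect_mix:
            return None
        return NOT_ECT
    if CE in seen:
        # Item 2: RFC 3168 requires the CE mark to be preserved.
        return CE
    # 4a: only a mixture of ECT(0) and ECT(1) remains; ECT(1) wins.
    return ECT_1


print(reassembled_ecn([ECT_0, ECT_1]) == ECT_1)
```

The ordering of the checks is the "ordering arrangement" question: Not-ECT mixtures are tested before CE so that a CE mark is never written onto a packet that had a Not-ECT fragment.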

>      *   Trying to walk a fine line, I'd suggest crafting text that states that using the ECN value from one of the fragments is a suggested implementation that meets the requirements of RFC 3168, and leave it there without applying a "SHOULD" keyword to that behavior, lest that keyword have to be updated in the not too distant future by proposed changes to RFC 3168.
> 
> I believe that the approach to the first two items is clear from WG discussion to date.  Item 3 is new.  Please comment.

 [RWG] Agree on items 1 and 2 being clear.  Comments on 3 inline above.  My proposed additions may need a better ordering, and the labels 3a, 3b and 4a can be ignored.
Keep reading, as I have made some comments inline to Bob's email.


> Thanks, --David
> 
> From: Bob Briscoe <ietf@bobbriscoe.net>
> Sent: Tuesday, March 10, 2020 2:47 PM
> To: Black, David; tsvwg@ietf.org
> Subject: Re: [tsvwg] Status of ECN encapsulation drafts (i.e., stuck)
> 
> David,
> 
> I admit to curling up into a little ball and trying to ignore this controversy when it arose.
> Let me try to sort this out now, for both ecn-encap-guidelines and rfc6040update-shim.
> 
> Back in Sep '19 (quoted at the end) you asked me not to use rfc6040update-shim to update RFC3168's fragmentation behaviour, even if it's the "right thing" to do, given I was saying that there were problems with the RFC3168 approach.
> 
> Background: Neither RFC3168 nor RFC6040 covered fragmentation & reassembly during encap and decap. So Joe Touch suggested rfc6040update-shim should fix that omission. Seems reasonable enough. However, it doesn't seem right to fix an omission by the stop-gap of:
> 1. requiring the approach in RFC3168 that we know is potentially problematic.
> 2. then planning to correct what we write, by updating it in a later RFC.
> 
> Let's call that approach (A). I don't like that at all. What if step #2 never happens?
> Fortunately, that's not the only way out of this. I can think of three other ways:
> B) The compromise text I've drafted below, which states the high level intent of a good mechanism as a SHOULD, and gives an example of how to do it. Then also allows the RFC3168 mechanism as a "MAY".
> C) Say nothing about fragmentation and reassembly in rfc6040update-shim or ecn-encap-guidelines. Then use a later RFC to update them both (stds track and BCP) with a considered 'correct' approach. ecn-encap-guidelines would still include what it has always said about re-framing (which is a similar but different subject).
> D) Convince ourselves that fragmentation and reassembly during encap and decap is allowed to be different from fragmentation and reassembly without encapsulation.
> 
> Last night, I took approach (B), but with too little time left to discuss it on the list. I scrubbed the offending paras from rfc6040update-shim and replaced them with those below (also at https://tools.ietf.org/html/draft-ietf-tsvwg-rfc6040update-shim-10#section-5 ).
> 
> Thinking about it further since last night, I'm now inclining towards approach (C).
> 
> 5.  ECN Propagation and Fragmentation/Reassembly
> 
>    The following requirements update RFC 6040 <https://tools.ietf.org/html/rfc6040>, which omitted handling of
>    the ECN field during fragmentation or reassembly.  These changes
>    might alter how many ECN-marked packets are propagated by a tunnel
>    that fragments packets, but this would not raise any backward
>    compatibility issues:
> 
>    If a tunnel ingress fragments a packet, it MUST set the outer ECN
>    field of all the fragments to the same value as it would have set if
>    it had not fragmented the packet.


> 
> 
> 
>    During reassembly of outer fragments [I-D.ietf-intarea-tunnels], if
>    the ECN fields of the outer headers being reassembled into a single
>    packet consist of a mixture of Not-ECT and other ECN codepoints, the
>    packet MUST be discarded.

 [RWG] I draw the same conclusion as Bob here, though my proposal above makes the discard an option rather than a MUST.  I am actually fine with either.

> 
> 
> 
>    As a tunnel egress reassembles sets of outer fragments
>    [I-D.ietf-intarea-tunnels] into packets, as long as no fragment
>    carries the Not-ECT codepoint, it SHOULD propagate CE markings such
>    that the proportion of reassembled packets output with CE markings is
>    broadly the same as the proportion of fragments arriving with CE
>    markings.

 [RWG] This is addressed differently in David's item 2.  Bob's text here is specifically a need of L4S, which overrides the meaning of CE and so needs the proportion of CE-marked bytes to be right.  SCE would not need this change; SCE could use, but does not need, similar treatment of ECT(1) marking.

> 
>    The above statement describes the approximate desired outcome, not
>    the specific mechanism.  A simple way to achieve this outcome would be to
>    leave a CE-mark on a reassembled packet if the head fragment is CE-
>    marked, irrespective of the markings on the other fragments.
>    Nonetheless, "SHOULD" is used in the above requirement to allow
>    similar perhaps more efficient approaches that result in
>    approximately the same outcome.
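
A minimal Python sketch of the quoted draft-10 text (discard on a Not-ECT mixture, then the head-fragment example) might look like the following.  The function name and the use of None for "discard" are illustrative assumptions.  The draft text only speaks of CE on the head fragment; propagating whatever codepoint the head fragment carries when it is not CE is an assumption made here.

```python
from typing import Optional, Sequence

# Two-bit ECN codepoints from RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def head_fragment_ecn(fragments: Sequence[int]) -> Optional[int]:
    """Return the reassembled packet's ECN codepoint, or None to discard.

    `fragments` is assumed to be ordered by offset, head fragment first.
    """
    if not fragments:
        raise ValueError("at least one fragment is required")
    codepoints = set(fragments)
    if NOT_ECT in codepoints and len(codepoints) > 1:
        # Mixture of Not-ECT and other codepoints: MUST be discarded.
        return None
    # Head-fragment rule: the packet is CE-marked only if the head
    # fragment is, irrespective of the other fragments.
    return fragments[0]


print(head_fragment_ecn([ECT_0, CE]) == ECT_0)  # a CE on a later fragment is not propagated
```

Under this rule one packet in n, on average, carries a mark placed on one of its n fragments, which is how the proportion of marked packets stays close to the proportion of marked fragments.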
> 
> 
> 
>    In RFC 3168 <https://tools.ietf.org/html/rfc3168> the approach to propagating CE markings during fragment
>    reassembly required that a reassembled packet has to be CE-marked
>    if any of its fragments is CE-marked.  This "logical OR" approach to
>    CE marking during reassembly was intended to ensure that no
>    individual CE marking is ever lost.  However, an unintended
>    consequence is that the proportion of packets with CE markings
>    increases.  For instance, with the logical OR approach, once a
>    sequence of packets each consisting of 2 fragments, has been
>    reassembled, the fraction of packets that are CE-marked roughly
>    doubles (because the number of marks remains roughly the same, but
>    the number of packets halves).

 [RWG] I know there has been a long thread on this, but I am still grappling with how any unnecessary increase in CE marks occurs.  Sure, the proportion of packets with CE markings increases, but ONLY if additional congestion occurred in the tunnel and that SHOULD lead to additional CE marks.
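
The arithmetic behind both statements can be sketched as follows, under simplifying assumptions that are not from the thread: a fraction q of packets is already CE-marked before the tunnel ingress fragments them (so all of their fragments carry CE), and inside the tunnel each fragment is then independently CE-marked with probability p.  The function name is hypothetical.

```python
def or_marked_fraction(q: float, p: float, n: int) -> float:
    """Fraction of reassembled packets that are CE-marked under logical OR.

    q: fraction of packets CE-marked before fragmentation
    p: per-fragment marking probability inside the tunnel (independent)
    n: number of fragments per packet
    """
    # A packet leaves unmarked only if it arrived unmarked AND none of
    # its n fragments was marked inside the tunnel.
    return 1.0 - (1.0 - q) * (1.0 - p) ** n


# No congestion inside the tunnel: the marked fraction is unchanged.
print(or_marked_fraction(q=0.05, p=0.0, n=2))
# Marking only inside the tunnel: close to 2 * p for small p and n = 2.
print(or_marked_fraction(q=0.0, p=0.01, n=2))
```

With p = 0 the result equals q, which matches the observation that the increase appears only when marking happens inside the tunnel.  With q = 0 and small p the result is approximately n * p, which is the "roughly doubles" case for two fragments.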

> 
> 
> 
>    This specification does not rule out the logical OR approach of
>    RFC 3168 <https://tools.ietf.org/html/rfc3168>.  So a tunnel egress MAY CE-mark a reassembled packet if any of
>    the fragments are CE-marked (and none are Not-ECT).  However, this
>    approach could result in reduced link utilization, or bias against
>    flows that are fragmented relative to those that are not.
> 
> Regards
> 
> 
> Bob
> On 15/09/2019 22:07, Black, David wrote:
> This email concerns draft-ietf-tsvwg-ecn-encap-guidelines and draft-ietf-tsvwg-rfc6040update-shim, which are being handled together for WG Last Call and RFC publication, and is posted in my role as shepherd and responsible WG chair for these drafts.
> 
> The current situation is that both drafts are stuck due to a problem with the fragmentation text added to the rfc6040update-shim draft.   Section 5 on ECN Propagation and Fragmentation/Reassembly was added to that draft in response to a WGLC comment, and it appears to have gone too far in the direction of trying to do the proverbial "right thing".
> 
> The core of the problem is in these two paragraphs in Section 5 of that draft (https://tools.ietf.org/html/draft-ietf-tsvwg-rfc6040update-shim-09#section-5):
> 
>    As a tunnel egress reassembles sets of outer fragments
>    [I-D.ietf-intarea-tunnels] into packets, it SHOULD propagate CE
>    markings on the basis that a congestion indication on a packet
>    applies to all the octets in the packet.  On average, a tunnel egress
>    SHOULD approximately preserve the number of CE-marked and ECT(1)-
>    marked octets arriving and leaving (counting the size of inner
>    headers, but not encapsulating headers that are being stripped).
>    This process proceeds irrespective of the addresses on the inner
>    headers.
> 
>    Even if only enough incoming CE-marked octets have arrived for part
>    of the departing packet, the next departing packet SHOULD be
>    immediately CE-marked.  This ensures that CE-markings are propagated
>    immediately, rather than held back waiting for more incoming CE-
>    marked octets.  Once there are no outstanding CE-marked octets, if
>    only enough incoming ECT(1)-marked octets have arrived for part of
>    the departing packet, the next departing packet SHOULD be immediately
>    marked ECT(1).
> 
> Much as that may be the proverbial "right thing" to do, particularly with the benefit of 20/20 hindsight, that text is inconsistent with the following text from Section 5.3 of RFC 3168 (https://tools.ietf.org/html/rfc3168#section-5.3), as Markku Kojo has pointed out:
> 
> 
>    ECN-capable packets MAY have the DF (Don't Fragment) bit set.
>    Reassembly of a fragmented packet MUST NOT lose indications of
>    congestion.  In other words, if any fragment of an IP packet to be
>    reassembled has the CE codepoint set, then one of two actions MUST be
>    taken:
> 
>       * Set the CE codepoint on the reassembled packet.  However, this
>         MUST NOT occur if any of the other fragments contributing to
>         this reassembly carries the Not-ECT codepoint.
> 
>       * The packet is dropped, instead of being reassembled, for any
>         other reason.
> 
>    If both actions are applicable, either MAY be chosen.  Reassembly of
>    a fragmented packet MUST NOT change the ECN codepoint when all of the
>    fragments carry the same codepoint.
> 
> The 6040update-shim draft is intended to update RFC 6040, and a number of the tunnel protocol drafts, but it is not intended to update RFC 3168, and hence the above new text (albeit well-intentioned) is a showstopper.   Changing ECN fragmentation behavior should be done in a separate draft.
> 
> Bob (as draft editor) - do you want to propose some new text to the list, possibly after private email discussion with Marco and me to figure out what it needs to say?
> 
> Thanks, --David
> ----------------------------------------------------------------
> David L. Black, Senior Distinguished Engineer
> Dell EMC, 176 South St., Hopkinton, MA  01748
> +1 (774) 350-9323 New    Mobile: +1 (978) 394-7754
> David.Black@dell.com<mailto:David.Black@dell.com>
> ----------------------------------------------------------------
> 
> 
> 
> 
> --
> 
> ________________________________________________________________
> 
> Bob Briscoe                               http://bobbriscoe.net/

-- 
Rod Grimes                                                 rgrimes@freebsd.org