Re: [tsvwg] draft-ietf-tsvwg-rfc6040update-shim: Suggested Fragmentation/Reassembly text

Bob Briscoe <ietf@bobbriscoe.net> Sat, 20 March 2021 18:28 UTC

To: Markku Kojo <kojo@cs.helsinki.fi>
Cc: Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org>, "tsvwg-chairs@ietf.org" <tsvwg-chairs@ietf.org>, Joe Touch <touch@strayalpha.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <8ac0d6dd-1648-ee8d-d107-55ef7fe7695f@bobbriscoe.net>
Date: Sat, 20 Mar 2021 18:27:57 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/YuWJaMeJvIi-o-FXs8XCXDY8oVc>

Markku, all,

On 18/03/2021 14:18, Markku Kojo wrote:
> Hi Bob, all,
>
> apologies for the additional delay on this.
>
> Please see inline after Bob's suggestion, my 2 cents on this tricky 
> issue tagged [MK].
>
> On Mon, 8 Mar 2021, Bob Briscoe wrote:
>
>> Markku,
>>
>> Thx. Unfortunately, this draft is coming up in the mtg this afternoon.
>> I take some of the blame -  only re-posting the draft a couple of 
>> hours ago, which
>> presumably reminded you that you were going to think on this.
>>
>> inline tagged [BB]
>>
>> On 08/03/2021 13:48, Markku Kojo wrote:
>>       Hi Bob,
>>
>>       this issue and text on it seems acceptable to me.
>>
>>       However, the other issue with the two contradictory SHOULD's - 
>> that I
>>       now notice I have never replied to - seems not ok, I think.
>>
>>
>> [BB] The question is whether we have to solve this now.
>>
>> I have a solution to resolve the contradiction. At the risk of 
>> prolonging the
>> progress of this draft, I'll say it now. But if we can't resolve this 
>> in the next
>> couple of days, I think we should go ahead with the two contradictory 
>> SHOULDs.
>>
>> For the list, here's the two contradictory paras:
>>
>>    Congestion indications SHOULD be propagated on the basis that an
>>    encapsulator or decapsulator SHOULD approximately preserve the
>>    proportion of PDUs with congestion indications arriving and leaving.
>>
>>    The mechanism for propagating congestion indications SHOULD ensure
>>    that any incoming congestion indication is propagated immediately,
>>    not held awaiting the possibility of further congestion indications
>>    to be sufficient to indicate congestion on an outgoing PDU.
>>
>> Possible resolution of the contradiction: the "SHOULD approximately 
>> preserve the
>> proportion" is a rough long term average goal while "SHOULD ensure 
>> that incoming
>> congestion indication is propagated immediately" is a requirement for 
>> after there
>> has been some period (TBD) without any marking.
>>
>> The big question is where would an implementer set that timescale? It 
>> needs to be a
>> "typical RTT in the deployment environment" or some such get-out 
>> clause. I guess
>> this is best left to the implementer.
>>
>
> [MK]:
>
> My concerns were and still are mainly for traffic under Standards 
> Track congestion control, i.e., majority of the traffic foreseen for a 
> long time. So, if not otherwise mentioned, the comments apply for 
> handling properly the traffic under Standards Track CC.
>
> The subject line was for the shim draft, but the solution IMO should 
> be the same for all cases where "fragments" are decapsulated/reassembled
> and my comments therefore mainly address fragmentation & reassembly. 
> i.e., when small fragmented packets are under AQM drop and later 
> reassembled.

[BB] It's not enough to make ecn-encap the same as shim. The reassembly 
logic in RFC3168 is only defined when packets are reassembled from 
/smaller/ fragments. When an L2 frame is /larger/ than an IP packet, or 
/overlaps/ the boundary between IP packets, the reassembly logic in 
RFC3168 is undefined - it makes no sense.

For instance, some link layers treat IP packets as a continuous byte 
stream, then break the stream into the largest possible frames, like so:

----------------->+<---------------------------->+<------------------------------>+<----
       Fr1        |             Fr2              |              Fr3               |
+-------------+-------------+-------------+-------------+-------------+-------------+---
|    Pkt1     |    Pkt2     |    Pkt3     |    Pkt4     |    Pkt5     |    Pkt6     |
+-------------+-------------+-------------+-------------+-------------+-------------+---

Then, say Fr2 was marked. On decap should Pkt2, Pkt3 & Pkt4 be marked, 
or just Pkt3 & Pkt4?
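
To make the ambiguity concrete, here is a minimal sketch of the two candidate answers. It is only an illustration: the function name, the byte offsets and the 1400-byte packet size are hypothetical, and reading "just Pkt3 & Pkt4" as "packets whose first byte lies inside the marked frame" is an assumption about what that option means.

```python
def packets_touched(pkt_sizes, frame_start, frame_end):
    """Classify packets against the byte range [frame_start, frame_end).

    Returns two lists of 1-based packet numbers:
      overlapping - packets sharing at least one byte with the frame
      starting    - packets whose first byte lies inside the frame
    """
    overlapping, starting = [], []
    offset = 0
    for number, size in enumerate(pkt_sizes, start=1):
        start, end = offset, offset + size
        if start < frame_end and end > frame_start:
            overlapping.append(number)
        if frame_start <= start < frame_end:
            starting.append(number)
        offset = end
    return overlapping, starting


# Hypothetical layout resembling the diagram: six equal packets, with the
# marked frame beginning part-way through Pkt2 and ending inside Pkt4.
print(packets_touched([1400] * 6, 1800, 4900))  # -> ([2, 3, 4], [3, 4])
```

Neither policy is defined by RFC3168, which is the point: both are plausible and they give different answers.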

Section 4.6 of ecn-encap (where the contradictory SHOULDs are) covers 
re-framing and definitely does not cover fragmentation/reassembly.
(Fragmentation/reassembly is only covered by RFC3168 and by Section 5 of 
the shim draft.)

The scope of section 4.6 on reframing in ecn-encap never included 
fragmentation until I was asked to widen it for draft-13. But in 
subsequent conversation, it was agreed that fragmentation should be 
referred to RFC3168. So after draft-14 there was no longer any reason 
for a draft about L2 encapsulation to give any guidelines about 
fragmentation (which had never really been the intention).


>
> The suggestion above may work with high fidelity CC traffic like L4S 
> but unfortunately not with standard CC traffic. I have doubts about 
> correct behaviour with L4S traffic though.

[BB] L4S is irrelevant to this conversation. It is experimental and so 
has to fit in with the standards track it finds in the Internet.
The reframing section in ecn-encap-guidelines existed long before L4S 
was even thought of (indeed it was in the first ever ecn-encap draft in 
March 2011).

>
> There are two major problems:
>
> 1) The suggested approach assumes an AQM that uses probabilistic
>    dropping. All AQMs do not employ probabilistic dropping. If
>    the decapsulator/node doing reassembly does not know which
>    type of AQM marked the PDUs/fragments, it cannot do the right
>    decision to not apply the above approach of preserving
>    the proportion of PDUs with congestion indications arriving
>    and leaving in case of AQM that does not employ probabilistic
>    dropping.

[BB] It doesn't assume probabilistic at all. If you've got that 
impression from me saying in the example linked below "an AQM marks 2% 
of packets," that doesn't mean I'm saying the AQM is probabilistic. If a 
deterministic AQM like CoDel marks every 50th packet, that is still 
equivalent to 2% marking.

The average proportion of marks at a fully utilized bottleneck 
determines the average capacity share of standard congestion controls 
(recall the 1/sqrt(p) in the Reno equation). So, even if there were no 
probabilistic AQMs in the Internet, the two contradictory SHOULDs would 
still be necessary, because satisfying the second SHOULD alone 
(undelayed signal) would roughly double the long-term average proportion 
of marking of fragmented vs unfragmented packets (thus breaking the 
first SHOULD). I explained this in the posting to Jake here: 
https://mailarchive.ietf.org/arch/msg/tsvwg/Da0sagcLnvPzh6xKFdHFRUZ9w5o/
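
The "roughly double" claim can be checked with a few lines of arithmetic. This is a sketch under stated assumptions, not anything from the drafts: it assumes each fragment is marked independently with probability p, that a reassembled packet is CE if any of its fragments is CE (the RFC3168 behaviour), and it uses the widely quoted simplified Reno form, rate ~ (MSS/RTT) * sqrt(3/2) / sqrt(p). The function names and example numbers are made up for illustration.

```python
import math


def reno_rate(mss_bytes, rtt_s, p):
    """Approximate steady-state Reno throughput in bytes/s (simplified form)."""
    return (mss_bytes / rtt_s) * math.sqrt(1.5 / p)


def reassembled_mark_prob(p, fragments):
    """Probability that a reassembled packet carries CE when each fragment is
    marked independently with probability p and any CE fragment marks it."""
    return 1.0 - (1.0 - p) ** fragments


p = 0.02                                  # marking level at the AQM
p_frag = reassembled_mark_prob(p, 2)      # two fragments per packet
print(p_frag / p)                         # close to 2 for small p
print(reno_rate(1460, 0.05, p_frag) / reno_rate(1460, 0.05, p))
```

With p = 2% the fragmented flow sees about 3.96% marking after reassembly, so under this model its long-run rate falls to roughly 1/sqrt(2) of the unfragmented flow's rate.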

Nonetheless, the second SHOULD (undelayed signal) is useful when the AQM 
is only fully utilized intermittently.

Actually, you don't say what other type(s) of AQM you are thinking of. I 
imagine:
* Spacing based (like CoDel or PDPC)
* Threshold based
Or did you mean something else?

Strictly, I also don't know what approach you are saying won't work, 
because no approach is described in ecn-encap any more. Just the two 
contradictory SHOULDs. And I deliberately didn't describe a way to 
resolve the two SHOULDs in detail in my email.

Reason: we need to take this one step at a time. So I've proposed a 
first step where we reach consensus that both these contradictory 
requirements are necessary. If I propose a compromise between the two 
contradictory requirements, people seem to delight in pointing out that 
it's not perfect for half the compromise (which is the definition of a 
compromise!).

So first things first, do you accept there's a trade-off here (even if 
there was only standard TCP traffic on the Internet) and the two 
contradictory SHOULDs capture it?

>
> 2) Majority of the congestion controlled traffic today is
>    non-ECT traffic, i.e., traffic under loss-based CC (let's
>    put delay-based CC aside for now to keep things simple enough),
>    and the above solution does not work for it.
>
>    The guiding principle of the original ECN design was to treat
>    ECN traffic equitably with the loss-based traffic, i.e., not
>    to give preference to ECN traffic. If the congestion
>    indications marked on small fragments are reduced when
>    reassembled then the ECN traffic is preferred over loss-based
>    CC traffic because it is impossible to reduce lost fragments
>    but each lost fragment results in loss of the entire packet
>    at reassembly and therefore triggers a congestion indication
>    without exception.

[BB] That's true. Certainly, if there's an AQM in a tunnel, like the 
example in the email to Jake linked above, fragmented NECT packets 
running alongside non-fragmented will tend to experience twice the drop 
level. This is the same problem as there used to be when IP was carried 
over ATM and each cell loss amplified into a whole packet loss. But that 
doesn't mean we have to make ECN reassembly exactly mimic the 
rubbishness of drop.

Whatever, I'm not sure what you're criticizing here 'cos the shim draft 
now essentially does mimic the rubbishness of drop, by deferring to RFC3168:
https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-rfc6040update-shim-13#section-5

Meanwhile, the ecn-encap draft doesn't currently define an approach; it 
just gives the two contradictory SHOULDs.
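
For reference, the reassembly rules in Section 5 of the shim draft (the text is quoted further down this thread) can be sketched roughly as follows. This is an illustrative reading, not normative text: the function name and string constants are hypothetical, the "any CE fragment gives CE" rule is my understanding of RFC3168 Section 5.3, and defaulting to ECT(1) for an ECT(0)/ECT(1) mix is only the hint given in the rationale, since the draft allows either.

```python
NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"


def reassemble_ecn(fragment_fields):
    """Return the ECN field of the reassembled packet, or None to discard it."""
    seen = set(fragment_fields)
    if not seen:
        raise ValueError("no fragments to reassemble")
    if NOT_ECT in seen:
        # A mixture of Not-ECT and any other codepoint: discard the packet.
        return NOT_ECT if seen == {NOT_ECT} else None
    if CE in seen:
        # Assumed RFC3168 behaviour: any CE fragment marks the whole packet.
        return CE
    if seen == {ECT0, ECT1}:
        # Either is permitted; ECT(1) ranks higher in severity, so it is
        # chosen here as a default (an assumption, not a requirement).
        return ECT1
    return seen.pop()  # all fragments carried the same codepoint


print(reassemble_ecn([ECT0, CE]))       # CE
print(reassemble_ecn([NOT_ECT, ECT0]))  # None (discard)
```

Note that every CE fragment turns into a CE packet here, which is what "mimicking the rubbishness of drop" amounts to.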

My email suggestion to draw a line between the two approaches at a 
'typical RTT' was deliberately high-level. It would mimic the 
rubbishness of drop when the spacing of the markings seen by each flow 
was wider than the 'typical RTT' (i.e. intermittent congestion episodes) 
but it would ensure the proportions were the same during persistent 
congestion. Rationale: when the spacing is narrow (persistent 
congestion), the exact timing of each notification doesn't matter 
because it wasn't long since the last and won't be long before the next.
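
Purely to show that such a rule is implementable, here is one possible per-flow sketch. Everything in it is an assumption: the class and method names, the fractional "credit" counter used to preserve proportions, and the choice to measure the quiet period from the last incoming mark. It is not a mechanism proposed in either draft.

```python
import math


class RttGatedMarker:
    """Decide whether a reassembled packet should carry a congestion mark."""

    def __init__(self, typical_rtt_s):
        self.typical_rtt_s = typical_rtt_s  # configured, e.g. measured out of band
        self.last_mark_in = -math.inf       # arrival time of the last marked fragment
        self.credit = 0.0                   # fraction of a mark owed to the output

    def on_reassembled(self, now_s, marked_fragments, total_fragments):
        if marked_fragments == 0:
            return False
        quiet = (now_s - self.last_mark_in) > self.typical_rtt_s
        self.last_mark_in = now_s
        self.credit += marked_fragments / total_fragments
        if quiet or self.credit >= 1.0:
            # After a quiet period: propagate at once (undelayed signal).
            # Otherwise: mark only when a whole packet's worth has accrued,
            # which approximately preserves the proportion of marks.
            self.credit = max(0.0, self.credit - 1.0)
            return True
        return False


m = RttGatedMarker(typical_rtt_s=0.1)
print([m.on_reassembled(t, 1, 2) for t in (0.0, 0.01, 0.02, 1.0)])
# -> [True, False, True, True]
```

In the example, the first and last marks arrive after a quiet period and are passed on immediately, while the two closely spaced marks in between are merged into one.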


>
> The suggested solution also would be very hard to configure because 
> RTT is not known. The potential RTT range in the Internet is from 
> sub-millisecs to over 1 sec. In some specific environments the typical 
> RTT could be known but not for the general case and these docs are 
> targeting Standards Track, so the solution should apply for all 
> environments.

[BB] I didn't say 'the RTT' I said 'the typical RTT in the deployment 
environment'.

The typical RTT (the mode) in the environment of a particular link can 
be known. It can be measured (out of band), then configured. And it 
could be reviewed occasionally, although it's unlikely to change for 
many years at a time.

Also, it's easy to criticize a proposed compromise between the 
contradictory SHOULDs for not being perfect (again, that's the 
definition of a compromise). But you say at the end that you don't have 
a solution to offer. So how are we going to move forward?

I've continued to respond to all your points in the rest of this long 
email, but I'd like to try to focus on the contradictory SHOULD 
requirements first.


>
> In addition, when there are multiple flows sharing a tunnel, the 
> proposal would potentially concentrate marks to a smaller number of 
> some flows, i.e., ignore/move marks for/from some flows to a smaller 
> number of flows.
> This would be extremely undesirable in high load cases, e.g., when 
> several flows are in slow start. In such a case it is important to 
> mark packets for several flows in order to have fast and strong enough 
> CC reaction.
> If the AQM has successfully spread the marks over a number of flows, 
> but this gets suppressed to a smaller number of flows the overall 
> congestion control reaction at the bottleneck is inappropriately 
> diminished and postponed.

[BB] When an AQM applies marks to an aggregate, it randomly hits 
particular flows. Compare two flows in slow-start alongside each other, 
one with slightly larger packets that get fragmented, the other not. The 
fragments will have roughly half the size and twice the packet rate. So 
on average an AQM will mark twice as many packets in the fragmented 
flow. Then if the marks are close enough together for reassembly to 
preserve the proportion of marks, it will halve the absolute number of 
marks, thus leaving about the same number of marks in the two flows.

Assuming my time-based proposal could be implemented, if only one or two 
AQM marks hit each flow, the reassembly process will be more likely to 
preserve all the marks, because the time between them will be longer. 
Then the fragmented flow will end up with more marks than the 
unfragmented, which I think is what you want.


> After considering this problem, my conclusion is that the only working 
> approach I can come up with can be achieved by not applying any 
> congestion indication manipulation when reassembling 

[BB] What do you mean? I think you mean "taking the approach in RFC3168" 
(which is a congestion indication manipulation).
That is the approach now adopted in the shim draft. But the ecn-encap 
draft covers more general re-framing (and fragmentation/reassembly is 
out of its scope).

> and by taking different approach to packet drops with AQMs that employ 
> probabilistic dropping. 

[BB] Eh? Reassembly doesn't know what the AQM was, or even whether there 
was an AQM. Anyway, I've argued that an AQM that sets the spacing 
between marks in an aggregate is equivalent to probabilistic. Unless you 
are thinking of some different AQM that I'm not aware of.

> The correct approach for dropping had already been taken by the RED 
> design with byte-mode queue measurement and drop (note: with the error 
> in calculating the drop probability corrected).
>
> I'm very well aware of the strong case that RFC 7141 makes for 
> packet-mode drop, and it makes this problem even much trickier to solve. 

[BB] Reading ahead, I think you're saying that it's hard to solve the 
problem because you haven't really abandoned your preference for scaling 
down drop/marking of smaller packets ('byte-mode'). So you think that 
ought to be a good solution, but you know it's got its own problems. If 
you accept that independence from packet size (packet-mode) is 
preferable to byte-mode, it's easier to make everything consistent.

(BTW, it would be really hard if reassembly had to cater for some AQMs 
doing byte mode and others doing packet mode.)


> However, if we concentrate only on the problem of dropping small 
> fragments and reassembling them, the byte-mode drop together with 
> reassembly logic in RFC 3168 results in the correct outcome.
>
> Why? By giving lower drop probability to small fragments, the 
> byte-mode drop ensures that the congestion signal (mark or drop) is 
> given (approximately) at the same level of data queued (or: of queuing 
> delay), no matter whether the packets are small or large (i.e., on a 
> bit-congested bottleneck the operating range of AQM is entered at the 
> same time no matter what size the packets in the queue are). 
> Therefore, it is ok to use fractional cwnd decrease as Standards Track 
> TCP does.
>
> When the drops/marks at the bottleneck are targetted at small 
> fragments, it does not mean that the (TCP) sender operates on small 
> packets, but it sends MSS-sized segments and is basically unaware of 
> fragmentation. Therefore, the sender also increments its cwnd using 
> large MSS-sized units and there is no small packet bias in its 
> performance (because byte-mode drop at the bottleneck does correct 
> job). Small packets (fragments) do not go faster.
>
> The fact that the performance problem with small packets does not 
> originate from one reason only but from two main reasons: 1) how drops 
> are handled at the bottleneck device and 2) how cwnd is incremented at 
> the sender endpoint.

[BB] All this is true, but academic. Such AQMs are not at all common, 
and also deprecated for the many good reasons in RFC7141, including to 
prevent amplification of small packet flooding attacks (see my response 
to your point on this later).

The RED /design/ included packet-mode and byte-mode. But, according to 
this survey:
     https://tools.ietf.org/html/rfc7141#appendix-A
byte mode was rarely if ever implemented in production equipment (those 
respondents who gave reasons mostly said it was due to complexity).

DOCSIS PIE is the only widely deployed AQM I know of that implements 
marking dependent on packet size.
     https://tools.ietf.org/html/rfc8034#section-4.6
Nonetheless, to mitigate amplification of small packet flooding attacks, 
it sets a floor of 85% for the reduced drop probability for smaller 
packets, and anyway DOCSIS PIE does not support ECN.

>
> The research that RFC 7141 has used as basis when justifying the 
> choice with the drop mode seems not to take this fact properly into 
> account. Instead, it tries to solve the small packet problem entirely 
> at the AQM node, which of course is wrong kind of reverse engineering 
> that RFC 7141 states. 

[BB] Surely you mean the opposite - RFC 7141 does nothing about packet 
size at the AQM. It "tries to solve the small packet problem" solely at 
the end system. The AQM is explicitly /not/ reverse engineering what it 
thinks end systems might do. See section 3.3. of RFC7141 entitled 
"Transport Independent Network".
     https://tools.ietf.org/html/rfc7141#section-3.3
In particular: "

    When the network does not take packet size into account, it allows
    transport protocols to choose whether or not to take packet size into
    account.

"
Anyway, the "Appropriate Byte Counting" approach [RFC3465] is now 
formally recommended in standard TCP congestion control [RFC5681].

> However, solving only the problem 1) that an AQM node using 
> probabilistic dropping creates by dropping small packets faster at the 
> AQM node itself is not reverse engineering. 

[BB] Eh? An AQM node using probabilistic dropping doesn't drop smaller 
packets faster.

Imagine two senders, both sending at the same bit-rate, but one sending 
packets half the size of the other (and therefore at twice the packet 
rate). If an AQM drops 2% of packets randomly, it will drop twice as 
many packets from the flow with smaller packets. But it will therefore 
drop bits from both at the /same/ rate.

If the one sending smaller packets chooses to increase by one smaller 
packet per RTT, the AIMD process will certainly converge with it running 
at half the rate of the other. But the point RFC7141 makes is that the 
end system chose to do that. By dropping any size packet with the same 
probability, the network treats them both the same. It drops the same 
rate of bits from them both, if they both send at the same bit rate.

So the place to fix this disparity is in the end-system. If you tried to 
fix it in the network, an AQM faced with two flows running at the same 
bit rate would have to drop fewer bits from the flow that happened to 
divide up its bit-rate into smaller packets. This would create a perverse 
incentive for everyone to use smaller packets.
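
The arithmetic behind the last few paragraphs can be sketched as below. The function name and numbers are illustrative, and the byte-mode model (drop probability scaled linearly by packet size relative to a hypothetical 1500-byte reference) is a simplifying assumption rather than the exact behaviour of any deployed AQM.

```python
def dropped_bits_per_s(bit_rate, pkt_bytes, p, byte_mode=False, ref_bytes=1500):
    """Expected bits dropped per second from a flow sending at bit_rate."""
    pkt_rate = bit_rate / (8 * pkt_bytes)            # packets per second
    # Packet-mode: same probability for every packet.
    # Byte-mode (assumed model): probability scaled down for smaller packets.
    drop_prob = p * pkt_bytes / ref_bytes if byte_mode else p
    return pkt_rate * drop_prob * 8 * pkt_bytes


rate = 10e6  # both senders at 10 Mb/s
for size in (1500, 750):
    print(size,
          dropped_bits_per_s(rate, size, 0.02),
          dropped_bits_per_s(rate, size, 0.02, byte_mode=True))
```

In packet mode both flows lose the same number of bits per second even though the small-packet flow loses twice as many packets. In the byte-mode model the small-packet flow loses half as many bits, which is the incentive towards smaller packets described above.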

> Solving the problem with small packet senders, which increment cwnd in 
> smaller units, anywhere except at the endpoint is of course not the 
> way to go.

[BB] Yes. We seem to be agreeing now.


>
> I also do not fully agree with RFC 7141 that giving a lower drop 
> probability to smaller packets would notably amplify flooding attacks.
> AQMs are not designed to protect against flooding attacks and they 
> cannot. There are and need to be other tools for that. Having a higher 
> drop probability for smaller packets does not prevent small packet 
> attacker from (almost) fully utilizing the bottleneck link capacity 
> (AQMs are designed to drop excess packets). Sending unresponsive 
> floods will anyway push away almost all competing responsive traffic 
> as the responsive traffic reduces its sending rate to less than one 
> packet per RTT.

[BB] That's a reasonable argument. And I agree that AQMs aren't designed 
to protect against flooding attacks. But they shouldn't be designed to 
help them.

Also, your argument is that dropping smaller packets with lower 
probability doesn't amplify flooding attacks against responsive traffic 
as much as one might think. It still amplifies them. And it /does/ 
strongly amplify attacks against unresponsive flows (which includes 
semi-elastic flows and responsive flows once they have been squeezed 
down to their minimum rate or minimum window).


>
> So, unfortunately I'm not able to offer a quick and simple solution to 
> this question at hand.



>
> thanks,
>
> /Markku
>
>> Bob
>>
>>
>>       I'm now occupied for the next few hours, so I'll come back with 
>> more
>>       detailed reasoning after the tsvwg meeting today.
>>
>>       Cheers,
>>
>>       /Markku
>>
>>       On Mon, 8 Mar 2021, Bob Briscoe wrote:
>>
>>             Markku, chairs, all,
>>
>>             Having reached agreement on the text last Dec, I then went
>>             and dropped the ball and
>>             forgot all about this draft,... and ecn-encap-guidelines.
>>
>>             == ECN-ENCAP-GUIDELINES ==
>>
>>             I shall upload a new rev shortly with the following single
>>             diff that I noticed
>>             during the meeting last Nov, and said I would do at the next
>>             rev:
>>
>>              4.6.  Reframing and Congestion Markings
>>
>>                 The guidance in this section is worded in terms of
>>             framing
>>                 boundaries, but it applies equally whether the protocol
>>             data units
>>             -   are frames, cells, packets or fragments.
>>             +   are frames, cells or packets.
>>
>>             == RFC6040UPDATE-SHIM ==
>>
>>             I shall upload a new rev shortly, with the following 3 paras
>>             at the end of S.5 on
>>             ECN fragmentation/reassembly:
>>
>>             Para 1 is just moved up from the end but otherwise
>>             unchanged.
>>             Para 2 is unchanged.
>>             Para 3 is the text agreed on this list last Dec subject to
>>             further checking, with
>>             one exception: I removed the citation of RFC3168 after
>>             "equivalent".
>>                 Reason: The citation of RFC6040 at the end is the
>>             relevant one, 'cos it
>>             introduced the mechanism for ECT(0) and ECT(1) to be either
>>             equivalent or two
>>             severity levels. In this respect it updated RFC3168. So it
>>             would not be appropriate
>>             to cite RFC3168, which only said the two were equivalent. If
>>             we cited RFC3168 here,
>>             it could be interpreted as if we're saying two RFC give
>>             conflicting definitions.
>>
>>                 Section 5.3 of [RFC3168] defines the process that a
>>             tunnel egress
>>                 follows to reassemble sets of outer fragments
>>                 [I-D.ietf-intarea-tunnels] into packets.
>>
>>                 During reassembly of outer fragments
>>             [I-D.ietf-intarea-tunnels], if
>>                 the ECN fields of the outer headers being reassembled
>>             into a single
>>                 packet consist of a mixture of Not-ECT and other ECN
>>             codepoints, the
>>                 packet MUST be discarded.
>>
>>             +   If there is mix of ECT(0) and ECT(1) fragments, then the
>>             reassembled
>>             +   packet MUST be set to either ECT(0) or ECT(1). In this
>>             case,
>>             +   reassembly SHOULD take into account that the RFC series
>>             has so far
>>             +   ensured that ECT(0) and ECT(1) can either be considered
>>             equivalent,
>>             +   or they can provide 2 levels of congestion severity,
>>             where the
>>             +   ranking of severity from highest to lowest is CE,
>>             ECT(1), ECT(0)
>>             +   [RFC6040].
>>
>>             If any of this isn't acceptable, I'll have to post another
>>             rev, but I think it's
>>             what was agreed.
>>
>>             Cheers
>>
>>
>>
>>             Bob
>>
>>
>>             On 14/12/2020 15:44, Markku Kojo wrote:
>>                   Hi Bob, all,
>>
>>                   apologies for the delay, now catching up again.
>>
>>                   yes, handling mix of ECT(0)/ECT(1) like in the new
>>             proposed text below
>>                   seems reasonable choice (for now).
>>
>>                   I'll come back shortly with the issue in the other
>>             thread. It seems
>>                   less clear, actually seems quite difficult to handle
>>             correctly for all
>>                   foreseen cases.
>>
>>                   /Markku
>>
>>                   On Thu, 3 Dec 2020, Bob Briscoe wrote:
>>
>>                         Markku, all,
>>
>>                         I am also only now catching up with the list...
>>
>>                         On 17/11/2019 08:46, Markku Kojo wrote:
>>                               Hi Dave, Joe, All,
>>
>>                               Catching up ...
>>
>>                               I agree with the modified new text as well
>>             as
>>                         treatment of an ECT(0)/ECT(1)
>>                               mix as "any".
>>
>>
>>                         [BB] Thanks. For the list, the current text that
>>             Markku is
>>                         agreeing with is here:
>>
>> https://tools.ietf.org/html/draft-ietf-tsvwg-rfc6040update-shim-11#section-5
>>
>>                         Regarding reassembly of a mix of ECT(0)/ECT(1).
>>             I agree
>>                         with David that the current text
>>                         should handle this case that 3168 doesn't
>>             address.
>>                         And I agree with Joe that an interim way of
>>             handling it is
>>                         needed, not just punting until
>>                         later.
>>
>>                         I see that all of Jonathan, David and you Markku
>>             are happy
>>                         with reassembling a mix of
>>                         ECT(0) and ECT(1) to result in either ECT(0) or
>>             ECT(1).
>>                         (for now). I think we can go one
>>                         better than that, still without precluding a
>>             more specific
>>                         RFC later. Here's proposed
>>                         text:
>>
>>                         After the following para:
>>
>>                            During reassembly of outer fragments
>>                         [I-D.ietf-intarea-tunnels], if
>>                            the ECN fields of the outer headers being
>>             reassembled
>>                         into a single
>>                            packet consist of a mixture of Not-ECT and
>>             other ECN
>>                         codepoints, the
>>                            packet MUST be discarded.
>>
>>                         Add:
>>
>>                               If there is mix of ECT(0) and ECT(1)
>>             fragments, then
>>                         the reassembled packet
>>                               MUST be set to either ECT(0) or ECT(1). In
>>             this case,
>>                         reassembly SHOULD take
>>                               into account that the RFC series has so
>>             far ensured
>>                         that ECT(0) and ECT(1)
>>                               can either be considered equivalent
>>             [RFC3168], or
>>                         they can provide 2 levels
>>                               of congestion severity, where the ranking
>>             of severity
>>                         from highest to lowest
>>                               is CE, ECT(1), ECT(0) [RFC6040].
>>
>>
>>                         Rationale: This avoids constraining future RFCs,
>>             but at
>>                         least lays out all the
>>                         interoperability requirements we already have for
>>             handling
>>                         this mixture. Then if an
>>                         implementer wants to just default to choosing
>>             one, it hints
>>                         that they should choose
>>                         ECT(1).
>>
>>
>>
>>                               I also want to repeat my comment that
>>
>>  draft-ietf-tsvwg-ecn-encap-guidelines-13
>>
>>                               added similar new text that alters RFC
>>             3168, and it
>>                         should be modified
>>                               accordingly.
>>
>>
>>                         [BB] I'll start another thread for this, rather
>>             than make
>>                         this thread too unwieldy.
>>
>>
>>
>>                         Bob
>>
>>
>>
>>                               Thanks,
>>
>>                               /Markku
>>
>>                               PS. I missed Bob's response to my comment
>>             at the
>>                         time, but will reply it
>>                               separately at some point.
>>
>>
>>                               On Wed, 9 Oct 2019, David Black wrote:
>>
>>                                     At this juncture, for an
>>             ECT(0)/ECT(1) mix
>>                         across a set of
>>                                     fragments being reassembled, I would
>>             suggest
>>                         using "any" (i.e.,
>>                                     either is ok) at this juncture to
>>             avoid
>>                         constraining what we may
>>                                     do in the future; in particular,
>>             this allows
>>                         use of the value in
>>                                     the first or last fragment, both of
>>             which are
>>                         likely to be
>>                                     convenient approaches for some
>>             implementations.
>>
>>                                     Thanks, --David
>>
>>                                           -----Original Message-----
>>                                           From: Joe Touch <touch@strayalpha.com>
>>                                           Sent: Wednesday, October 9, 2019 10:29 AM
>>                                           To: Black, David
>>                                           Cc: Jonathan Morton; tsvwg@ietf.org
>>                                           Subject: Re: [tsvwg] draft-ietf-tsvwg-rfc6040update-shim:
>>                                           Suggested Fragmentation/Reassembly text
>>
>>
>>                                           Hi, all,
>>
>>                                           I disagree with the suggestion below.
>>
>>                                           Pushing this “under the rug” for an indeterminate later
>>                                           date only serves to undermine the importance of this issue.
>>
>>                                           At a MINIMUM, there needs to be direct guidance in place
>>                                           until a “better” solution can be developed. For now, that
>>                                           would mean one of the following:
>>                                           - use the max of the frag code point values
>>                                           - use the min of the frag code point values
>>                                           - use “any” of the frag code point values
>>                                           - pick some other way (first, the one in the initial
>>                                             fragment i.e., offset 0), etc.
>>
>>                                           One of these needs to be *included at this time*.
>>
>>                                           If a clean up doc needs to be issued, it can override
>>                                           individual “scattered” recommendations later.
>>
>>                                           Joe
>>
>>                                                 On Oct 9, 2019, at 6:33 AM, Black, David
>>                                                 <David.Black@dell.com> wrote:
>>
>>                                                       The one case this doesn't really cover is what
>>                                                       happens when a fragment set has a mixture of
>>                                                       ECT(0) and ECT(1) codepoints.  This probably
>>                                                       isn't very relevant to current ECN usage, but
>>                                                       may become relevant with SCE, in which
>>                                                       middleboxes on the tunnel path may introduce
>>                                                       such a mixture to formerly "pure" packets.
>>                                                       From my perspective, a likely RFC-3168 compliant
>>                                                       implementation of arbitrarily choosing one
>>                                                       fragment's ECN codepoint as authoritative (where
>>                                                       it doesn't conflict with other rules) is
>>                                                       acceptable, but this doesn't currently seem to
>>                                                       be mandatory.
>>
>>                                                       With the above language, it should be sufficient
>>                                                       to update RFC-3168 to cover this case at an
>>                                                       appropriate time, rather than scattering further
>>                                                       requirements in many documents.
>>
>>
>>                                                 I would concur that using a separate draft to cover
>>                                                 that case at the appropriate time would be the better
>>                                                 course of action.
>>
>>                                                 Thanks, --David
>>
>>                                                       -----Original Message-----
>>                                                       From: Jonathan Morton <chromatix99@gmail.com>
>>                                                       Sent: Tuesday, October 8, 2019 6:55 PM
>>                                                       To: Black, David
>>                                                       Cc: tsvwg@ietf.org
>>                                                       Subject: Re: [tsvwg]
>>                                                       draft-ietf-tsvwg-rfc6040update-shim:
>>                                                       Suggested Fragmentation/Reassembly text
>>
>>
>>                                                             On 8 Oct, 2019, at 10:51 pm, Black, David
>>                                                             <David.Black@dell.com> wrote:
>>
>>                                                             **NEW**: Beyond those first two paragraphs,
>>                                                             I suggest deleting the rest of Section 5 of
>>                                                             the rfc6040update-shim draft and
>>                                                             substituting the following paragraph:
>>
>>                                                               As a tunnel egress reassembles sets of
>>                                                               outer fragments [I-D.ietf-intarea-tunnels]
>>                                                               into packets, it MUST comply with the
>>                                                               reassembly requirements in Section 5.3 of
>>                                                               RFC 3168 in order to ensure that
>>                                                               indications of congestion are not lost.
>>
>>                                                             It is certainly possible to continue from
>>                                                             that text to paraphrase part or all of
>>                                                             Section 5.3 of RFC 3168, but I think the
>>                                                             above text crisply addresses the problem,
>>                                                             and avoids possibilities of subtle
>>                                                             divergence.  I do like the “reassembles
>>                                                             sets of outer fragments” lead-in text
>>                                                             (which I copied from the current
>>                                                             rfc6040shim-update draft) because that text
>>                                                             makes it clear that reassembly logically
>>                                                             precedes decapsulation at the tunnel egress.
>>
>>                                                             Comments?
>>
>>
>>                                                       Looks good to me.
>>
>>                                                       The one case this doesn't really cover is what
>>                                                       happens when a fragment set has a mixture of
>>                                                       ECT(0) and ECT(1) codepoints.  This probably
>>                                                       isn't very relevant to current ECN usage, but
>>                                                       may become relevant with SCE, in which
>>                                                       middleboxes on the tunnel path may introduce
>>                                                       such a mixture to formerly "pure" packets.
>>                                                       From my perspective, a likely RFC-3168 compliant
>>                                                       implementation of arbitrarily choosing one
>>                                                       fragment's ECN codepoint as authoritative (where
>>                                                       it doesn't conflict with other rules) is
>>                                                       acceptable, but this doesn't currently seem to
>>                                                       be mandatory.
>>
>>                                                       With the above language, it should be sufficient
>>                                                       to update RFC-3168 to cover this case at an
>>                                                       appropriate time, rather than scattering further
>>                                                       requirements in many documents.
>>
>>                                                       - Jonathan Morton
>>
>>
>>
>>
>>
>>                         --
>>
>> ________________________________________________________________
>>                         Bob Briscoe
>>                         http://bobbriscoe.net/
>>                                        PRIVILEGED AND CONFIDENTIAL
>>
>>
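
As a rough illustration of the reassembly options listed in the quoted
thread (max, min, "any" or first fragment for an ECT(0)/ECT(1) mix), here
is a minimal Python sketch. The function name reassembled_ecn and the
mixed_policy knob are hypothetical and do not come from any draft. The CE
handling follows my reading of Section 5.3 of RFC 3168. Returning Not-ECT
for a Not-ECT/ECT mix without CE is purely an assumption, as is ordering
"max" and "min" by numeric codepoint value.

```python
# ECN codepoints as the two-bit values defined in RFC 3168.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def reassembled_ecn(frags, mixed_policy="first"):
    """Return the ECN codepoint for a reassembled packet, or None to drop it.

    frags: ECN codepoints of the fragments, in offset order (offset 0 first).
    mixed_policy: hypothetical knob for an ECT(0)/ECT(1) mix, one of
                  "first", "max" or "min" (by numeric codepoint value).
    """
    if not frags:
        raise ValueError("no fragments to reassemble")
    if mixed_policy not in ("first", "max", "min"):
        raise ValueError("unknown mixed_policy: %r" % (mixed_policy,))

    seen = set(frags)
    if len(seen) == 1:
        # All fragments agree: reassembly must not change the codepoint.
        return frags[0]
    if CE in seen:
        # A congestion indication must not be lost. CE must not be set if
        # any fragment is Not-ECT, so the remaining option is to drop.
        return None if NOT_ECT in seen else CE
    if NOT_ECT in seen:
        # Assumption (not from RFC 3168): fall back to Not-ECT.
        return NOT_ECT
    # Only an ECT(0)/ECT(1) mix remains; apply the chosen policy.
    if mixed_policy == "max":
        return max(frags)
    if mixed_policy == "min":
        return min(frags)
    return frags[0]  # "first": the fragment at offset 0
```

With mixed_policy="first" the result is the codepoint of the offset-0
fragment, which is also one valid instance of the "any" option.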

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
