Re: [tsvwg] ecn-encap-guidelines reframing section

Bob Briscoe <ietf@bobbriscoe.net> Wed, 24 March 2021 09:49 UTC

To: "Black, David" <David.Black@dell.com>, Jonathan Morton <chromatix99@gmail.com>
Cc: Markku Kojo <kojo@cs.helsinki.fi>, Joe Touch <touch@strayalpha.com>, Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org>, "tsvwg-chairs@ietf.org" <tsvwg-chairs@ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <CE03DB3D7B45C245BCA0D243277949363076629A@MX307CL04.corp.emc.com> <CE03DB3D7B45C245BCA0D2432779493630768173@MX307CL04.corp.emc.com> <6D176D4A-C0A7-41BA-807A-5478D28A0301@strayalpha.com> <CE03DB3D7B45C245BCA0D24327794936307688C5@MX307CL04.corp.emc.com> <alpine.DEB.2.21.1911171041020.5835@hp8x-60.cs.helsinki.fi> <9024d91a-bb08-fb45-84f8-ce89ba90648d@bobbriscoe.net> <alpine.DEB.2.21.2012141735030.5844@hp8x-60.cs.helsinki.fi> <1e038b64-8276-3515-ac45-e0fc84e1c413@bobbriscoe.net> <alpine.DEB.2.21.2103081540280.3820@hp8x-60.cs.helsinki.fi> <3c778eb9-56dc-3d58-0de4-c6373d1090ec@bobbriscoe.net> <alpine.DEB.2.21.2103181233160.3820@hp8x-60.cs.helsinki.fi> <8ac0d6dd-1648-ee8d-d107-55ef7fe7695f@bobbriscoe.net> <CD5B98D1-9BAE-4B74-8751-A8AF293AEFC3@gmail.com> <MN2PR19MB4045C7AD9873F378FB542CF283659@MN2PR19MB4045.namprd19.prod.outlook.com> <10cb995d-7ac0-99c8-4013-5ea8a518e643@bobbriscoe.net> <MN2PR19MB40451E51462D82DF2F81D18183639@MN2PR19MB4045.namprd19.prod.outlook.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <6b9f2527-ccbd-af2a-caa3-8a0b7c234aa6@bobbriscoe.net>
Date: Wed, 24 Mar 2021 09:48:56 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1
MIME-Version: 1.0
In-Reply-To: <MN2PR19MB40451E51462D82DF2F81D18183639@MN2PR19MB4045.namprd19.prod.outlook.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/CbLy2bMp0i2LXiKMfBABUJ6VFCk>
Subject: Re: [tsvwg] ecn-encap-guidelines reframing section

David, see [BB2] inline...

On 24/03/2021 01:00, Black, David wrote:
> Bob,
>
>> [BB] Er...hum...
>> You seem to have forgotten that you are talking about just dumping the
>> point that I believe was missing from RFC3168. We came to a long-fought
>> agreement that we would not decide on this before publishing these
>> drafts. But now you are proposing we decide on this before publishing
>> these drafts.
> Well ... not exactly ... because it appears to me that the reframing section (4.6) of the ecn-encap-guidelines draft does not address IP packet reassembly from IP fragments, and hence falls outside the scope of RFC 3168.
>
> That said, I nonetheless concur with your sense that it would be good to preserve flexibility for addressing this area in a more specific fashion in the future.
>
>>> For replacement, my initial sense matches Jonathan's, in particular that a layer 2
>>> congestion mark ought not to result in congestion marking multiple IP packets:
>> [BB] The whole problem I identified with only thinking in terms of the
>> second SHOULD is that you end up with either inflated or deflated
>> marking, depending respectively whether frames are smaller or larger
>> than packets. That is the whole point of the need for the two
>> contradictory requirements.
> My current inclination is that we may be best served by postponing a decision on whether inflated/deflated marking is a bug or a feature, roughly analogous to what we've agreed to for the rfc6040update-shim draft.  The second "SHOULD" is clearly consistent with that approach, as propagating congestion indications quickly reduces the latency of transport protocol congestion response, which is a "good thing" on its own:
>
>>>      The mechanism for propagating congestion indications SHOULD ensure
>>>      that any incoming congestion indication is propagated immediately,
>>>      not held awaiting the possibility of further congestion indications
>>>      to be sufficient to indicate congestion on an outgoing PDU.
> Turning to the first "SHOULD," this paragraph from one of Markku's messages frames the conundrum well (at least for me):
>
> 	I'm very well aware of the strong case that RFC 7141 makes for
> 	packet-mode drop, and it makes this problem even much trickier to solve.
> 	However, if we concentrate only on the problem of dropping small fragments
> 	and reassembling them, the byte-mode drop together with reassembly logic
> 	in RFC 3168 results in the correct outcome.
>
> Of course, a significant aspect of the situation here is that reframing is not just about small fragments.
>
> It's ironic that you (Bob) as an author of RFC 7141 are advocating byte-mode in this context - that's not intended to imply self-contradiction, unsoundness of argument, etc., but rather to serve as an indication of the complexity and subtlety of this situation.  Definitively resolving this situation now appears to require digging in well beyond this high-level byte-mode vs. packet-mode discussion, a journey that I'd really like to avoid in the hope of landing these drafts in our AD's lap in the near future.

[BB2] The first SHOULD is not advocating byte-mode marking. It is 
designed to preserve the packet-mode marking applied by an AQM. See next 
response.

> So, in what may be an attempt to "have my cake and eat it too" I'd like to suggest rewriting the first SHOULD in terms of an observation that does not directly opine on byte-mode vs. packet-mode and does not use RFC 2119 keywords, e.g.:
>
> OLD
>       Congestion indications SHOULD be propagated on the basis that an
>       encapsulator or decapsulator SHOULD approximately preserve the
>       proportion of PDUs with congestion indications arriving and leaving.
> NEW
>       For environments in which protocol and/or application response to
>       congestion is sensitive to the number of bytes in IP packets with
>       congestion indications rather than the number of IP packets with
>       congestion indications, encapsulators and decapsulators ought to
>       approximately preserve the proportion of PDUs with congestion
>       indications arriving and leaving.  See RFC 7141 [RFC7141] for further
>       discussion.
>
> Would something like that text work?

[BB2] I'm afraid not, because this is not about some niche environment. 
Both the SHOULDs in the current draft are intended to apply to all known 
AQMs and all known congestion controls including standard TCP.

Before we discuss the requirement, can we make sure we're all on the 
same page regarding some basic facts about preserving markings when PDU 
boundaries change:

                   | marked    marked
                   | PDUs      bytes
-------------------+------------------
preserving prop'n  |  ==        ==
preserving number  |  !=        ==

For those who prefer writing, this means that, when the boundaries 
between PDUs change, preserving the proportion of marked PDUs, the 
proportion of marked bytes, and the number of marked bytes all mean the 
same thing. But preserving the number of marked PDUs is not the same as 
any of the others. And note that preserving the timing is the same as 
preserving the number of marked PDUs.
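
As a quick check of the table, here is a minimal Python sketch. The function name, the traffic figures (1000 PDUs of 500 bytes, 10 of them marked) and the 2:1 merge are illustrative assumptions, not taken from either draft. It compares two ways a decapsulator could carry marks across when it merges pairs of equal-sized PDUs:

```python
from fractions import Fraction

def after_merge(n_pdus, pdu_bytes, n_marked, merge, keep):
    """Merge every `merge` equal-sized PDUs into one and carry the marks across.

    keep="proportion": mark the same fraction of outgoing PDUs as arrived marked.
    keep="number":     emit one marked outgoing PDU per incoming marked PDU.
    (Both policies are hypothetical labels used only for this illustration.)
    """
    if n_pdus % merge or n_marked % merge:
        raise ValueError("illustration assumes counts divisible by `merge`")
    out_pdus = n_pdus // merge
    out_bytes = pdu_bytes * merge          # size of each outgoing PDU
    if keep == "proportion":
        out_marked = n_marked // merge
    elif keep == "number":
        out_marked = n_marked
    else:
        raise ValueError("keep must be 'proportion' or 'number'")
    return {
        "marked_pdus": out_marked,
        "pdu_fraction": Fraction(out_marked, out_pdus),
        "marked_bytes": out_marked * out_bytes,
        "byte_fraction": Fraction(out_marked * out_bytes, n_pdus * pdu_bytes),
    }

for keep in ("proportion", "number"):
    print(keep, after_merge(1000, 500, 10, 2, keep))
```

On arrival, 1/100 of the PDUs and 5000 bytes are marked. Under "proportion" the marked bytes stay at 5000 and both fractions stay at 1/100, while the count of marked PDUs drops from 10 to 5. Under "number" the count of marked PDUs stays at 10, but the marked bytes and both fractions double.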

Does everyone agree on these factual points, at least?

For instance, consider CoDel counting 200 PDUs between marks on the 
outer headers, then imagine that on decap the boundaries between the 
PDUs are changed to create half as many PDUs...
Then, if each single mark is preserved as a single marked PDU, it will 
result in only 100 PDUs between marks. This is because the total number 
of PDUs has changed, so you cannot preserve both the number of marked 
PDUs and the number of unmarked PDUs.
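
The same arithmetic can be replayed on an explicit mark sequence. This is only a sketch under stated assumptions: the helper names are hypothetical, the marker is modelled as marking exactly every 200th PDU (real CoDel spacing varies), and the decapsulator is assumed to merge adjacent pairs and mark the merged PDU if either half was marked.

```python
def mark_every(n_pdus, interval):
    """Outer PDUs, with every `interval`-th one marked (idealised marker)."""
    return [(i + 1) % interval == 0 for i in range(n_pdus)]

def merge_pairs(marks):
    """Merge adjacent PDUs pairwise; a merged PDU is marked if either half was."""
    return [marks[i] or marks[i + 1] for i in range(0, len(marks) - 1, 2)]

def spacing(marks):
    """Distances, in PDUs, between consecutive marked PDUs."""
    idx = [i for i, m in enumerate(marks) if m]
    return [b - a for a, b in zip(idx, idx[1:])]

outer = mark_every(2000, 200)
inner = merge_pairs(outer)

print(sum(outer), len(outer), set(spacing(outer)))
print(sum(inner), len(inner), set(spacing(inner)))
```

Each mark survives as exactly one marked PDU (10 before and 10 after), but the spacing falls from 200 to 100 PDUs, so the marked proportion doubles from 10/2000 to 10/1000.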

With all existing congestion controls that I know of:
* the instantaneous behaviour and responsiveness depends on the timing 
of individual marks (the second SHOULD).
* but the flow rate of long-running flows depends on the average 
proportion of marked packets (the first SHOULD). {Note 1}

It's up for debate how we solve this dilemma, but can people at least 
agree (or not) that this dilemma exists?


Bob

{Note 1} For instance, the average proportion of marked packets is 'p' 
in the well known Reno formula,
     cwnd_avg = sqrt(3/(2p))
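
To put numbers on Note 1, here is a small Python sketch of the formula read as sqrt(3/(2p)), with the window in packets. The function name and the two sample values of p (one mark per 200 packets and one per 100) are illustrative assumptions.

```python
import math

def reno_cwnd_avg(p):
    """Average Reno window in packets for marking proportion p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return math.sqrt(3 / (2 * p))

w_200 = reno_cwnd_avg(1 / 200)   # one mark per 200 packets
w_100 = reno_cwnd_avg(1 / 100)   # one mark per 100 packets
print(round(w_200, 2), round(w_100, 2), round(w_200 / w_100, 3))
```

Doubling p shrinks the average window by a factor of sqrt(2), which is why a change in the marked proportion at an encapsulator or decapsulator feeds through to the rate of a long-running flow.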




>
> Thanks, --David
>
> -----Original Message-----
> From: Bob Briscoe <ietf@bobbriscoe.net>
> Sent: Tuesday, March 23, 2021 7:24 PM
> To: Black, David; Jonathan Morton
> Cc: Markku Kojo; Joe Touch; Markku Kojo; tsvwg-chairs@ietf.org; tsvwg@ietf.org
> Subject: ecn-encap-guidelines reframing section
>
> David,
>
> On 22/03/2021 22:00, Black, David wrote:
>> ---------------------------------
>>
>> Moving onto the ecn-encap draft (Section 4.6), the text involved concerns
>> how to propagate layer 2 frame congestion marks to IP packets which might
>> be fragments.  As this text is not dealing with reassembly of IP fragments, it
>> cannot be in conflict with the reassembly text in RFC 3168, which has nothing
>> to say about layer 2 frame congestion marks:
>>
>>      Congestion indications SHOULD be propagated on the basis that an
>>      encapsulator or decapsulator SHOULD approximately preserve the
>>      proportion of PDUs with congestion indications arriving and leaving.
>>
>>      The mechanism for propagating congestion indications SHOULD ensure
>>      that any incoming congestion indication is propagated immediately,
>>      not held awaiting the possibility of further congestion indications
>>      to be sufficient to indicate congestion on an outgoing PDU.
>>
>> Bob initially suggested the following:
>>
>>> Possible resolution of the contradiction: the "SHOULD approximately preserve
>>> the proportion" is a rough long term average goal while "SHOULD ensure that
>>> incoming congestion indication is propagated immediately" is a requirement
>>> for after there has been some period (TBD) without any marking.
>> I'm going to go one step further and suggest removing the first "SHOULD" - the
>> whole notion of rate-based marking of IP packets reassembled from fragments
>> is what got us into the tarpit for the rfc6040update-shim draft, and the first
>> "SHOULD" appears to be headed into the same tarpit, only perhaps deeper
>> as the frames involved may contain multiple packets and/or fragments and/or
>> portions of packets and/or portions of fragments.  That's not exactly pretty ...
> [BB] Er...hum...
> You seem to have forgotten that you are talking about just dumping the
> point that I believe was missing from RFC3168. We came to a long-fought
> agreement that we would not decide on this before publishing these
> drafts. But now you are proposing we decide on this before publishing
> these drafts.
>
>> For replacement, my initial sense matches Jonathan's, in particular that a layer 2
>> congestion mark ought not to result in congestion marking multiple IP packets:
> [BB] The whole problem I identified with only thinking in terms of the
> second SHOULD is that you end up with either inflated or deflated
> marking, depending respectively whether frames are smaller or larger
> than packets. That is the whole point of the need for the two
> contradictory requirements.
>
>>> I would say that one mark applied at link layer should result in one mark applied
>>> to one IP packet.  Exactly which one doesn't really matter, as long as it has some
>>> tangible connection to the frame that was marked.  Word it that way, and we'll
>>> be fine.  In particular, this method should work for *both* conventional and
>>> high-fidelity sensitive traffic.
>> That also has the useful simplification of not asking the implementation of this draft
>> to roughly track a long term average in some fashion.
> [BB] No tracking of a long-term average is needed in the implementation,
> only in the /requirement/. One example implementation would be a single
> counter per aggregate (for the first SHOULD) and a timeout for the
> second SHOULD. The two override each other to create a compromise that
> addresses each requirement in the traffic scenarios where it is most
> applicable.
>
> If you want me to give example pseudocode in this email, I would love
> to. But I thought we agreed that we are not going to solve the dilemma
> in this text, we are just going to state the requirements. Having worked
> on this draft for so many years, and having developed what I believe is
> a solution, I find that highly unsatisfactory. But we agreed to it.
>
>
>
> Bob
>
>> Thanks, --David
>>
>> -----Original Message-----
>> From: Jonathan Morton <chromatix99@gmail.com>
>> Sent: Sunday, March 21, 2021 2:42 PM
>> To: Bob Briscoe
>> Cc: Markku Kojo; Joe Touch; Markku Kojo; tsvwg-chairs@ietf.org; tsvwg@ietf.org
>> Subject: Re: [tsvwg] draft-ietf-tsvwg-rfc6040update-shim:SuggestedFragmentation/Reassemblytext
>>
>>> On 20 Mar, 2021, at 8:27 pm, Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>>
>>> It's not enough to make ecn-encap the same as shim. The reassembly logic in RFC3168 is only defined when packets are reassembled from /smaller/ fragments. When a L2 frame is /larger/ than an IP packet, or /overlaps/ the boundary between IP packets, the reassembly logic in RFC3168 is undefined - it makes no sense.
>>>
>>> For instance, some link layers treat IP packets as a continuous byte stream, then break the stream into the largest possible frames, like so:
>>>
>>> ----------------->+<---------------------------->+<------------------------------>+<----
>>>           Fr1       |                Fr2           |             Fr3                |
>>> +-------------+-------------+-------------+-------------+-------------+-------------+---
>>> |   Pkt1      |    Pkt2     |    Pkt3     |   Pkt4      |    Pkt5     |   Pkt6      |
>>> +-------------+-------------+-------------+-------------+-------------+-------------+---
>>>
>>> Then, say Fr2 was marked. On decap should Pkt2, Pkt3 & Pkt4 be marked, or just Pkt3 & Pkt4?
>> I would say that one mark applied at link layer should result in one mark applied to one IP packet.  Exactly which one doesn't really matter, as long as it has some tangible connection to the frame that was marked.  Word it that way, and we'll be fine.  In particular, this method should work for *both* conventional and high-fidelity sensitive traffic.
>>
>>    - Jonathan Morton

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/