Re: [tsvwg] Update to Position Statement on ECT(1)

"Scheffenegger, Richard" <rs.ietf@gmx.at> Sun, 24 May 2020 06:50 UTC

To: tsvwg@ietf.org
References: <BE44EAE9-5CFB-4F5D-85B8-05AFA516C151@akamai.com> <CACL_3VEbUHB-Omwp1-g5Tq3G3J-kKj9N3jPZLcfruicw3X=AsA@mail.gmail.com> <2CBBD8CD-2088-4E41-B113-EED665853D3C@akamai.com> <CAM4esxSFCBcxXjz5JJJg1z6+wwfN3mTrtJ8bKiBsj2TeOmmFSw@mail.gmail.com> <93331803-e7db-95dc-a4ae-052c347c3c86@bobbriscoe.net> <MN2PR19MB4045568B4A794F1DCE6974BB83B90@MN2PR19MB4045.namprd19.prod.outlook.com>
From: "Scheffenegger, Richard" <rs.ietf@gmx.at>
Message-ID: <3d24de48-1475-152d-4a38-a06e584d75d8@gmx.at>
Date: Sun, 24 May 2020 08:50:19 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0
MIME-Version: 1.0
In-Reply-To: <MN2PR19MB4045568B4A794F1DCE6974BB83B90@MN2PR19MB4045.namprd19.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/52K_5gXC66Xxgl8D5Dc-lQ4RO0s>
Subject: Re: [tsvwg] Update to Position Statement on ECT(1)

David, Jake,

Just to be completely clear here:

The proposed signal is that a receiver can latch one codepoint,
indicating CE was received (currently ECE). That latch is
maintained until the sender successfully conveys that it did in
fact react to the CE signal (currently CWR)...

So any signal for fractional queue occupancy during this window can no
longer be signaled in the TCP header, only in the AccECN options.

And if the CWR-like codepoint is missed, ignored (like at least one stack
currently does at times), or dropped, another window of fractional feedback
is gone (in the TCP header) while another Multiplicative Decrease
reaction has to happen.
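
A minimal sketch of the latch behaviour described above, assuming the
RFC 3168 rules (ECE is echoed on every ACK once CE has been seen, and is
cleared only by a segment carrying CWR). The class and method names are
hypothetical and exist only for illustration; this is not any stack's
actual code.

```python
from dataclasses import dataclass

CE = 0b11  # ECN field value for Congestion Experienced (RFC 3168)


@dataclass
class LatchingReceiver:
    """Hypothetical model of a receiver that latches one 'CE seen' codepoint."""

    ece_latched: bool = False

    def on_data_segment(self, ip_ecn: int, cwr_flag: bool) -> None:
        # A CWR-like flag from the sender releases the latch ...
        if cwr_flag:
            self.ece_latched = False
        # ... unless the same packet is itself CE-marked, which sets it again.
        if ip_ecn == CE:
            self.ece_latched = True

    def ece_on_next_ack(self) -> bool:
        # While latched, every ACK carries the same single codepoint, so no
        # finer-grained information fits in that header codepoint.
        return self.ece_latched


rx = LatchingReceiver()
rx.on_data_segment(CE, cwr_flag=False)    # CE arrives: latch is set
rx.on_data_segment(0b10, cwr_flag=False)  # CWR missed or dropped: still latched
still_latched = rx.ece_on_next_ack()
rx.on_data_segment(0b10, cwr_flag=True)   # CWR finally arrives: latch released
released = not rx.ece_on_next_ack()
```

If the CWR in the third call were also lost, the latch would simply stay
set for another round trip, which is the lost window of fractional
feedback mentioned above.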


Maybe I missed this, but under the SCE scheme, I was assuming the 1/p
signal was ECT(1), not ECT(0). Or have we all agreed already, that
ECT(1) is the "new" network input signal (both SCE and L4S), but an AQM
may re-mark it into ECT(0) or CE?

I bring this up because I apparently missed the context around "and uses
ECT(0) [remarked from ECT(1)]".

Regards,
    Richard



On 19.05.2020 at 15:19, Black, David wrote:
> [still posting as an individual, not WG chair]
>
>
> Bob,
>
> > [BB] I don't know how you get to one codepoint in ACE being enough for
> > CE. The whole idea of AccECN is to be robust to ACK loss and thinning.
> > If it was as simple as you make out, we could have sorted out this
> > problem snip-snap and all be home in time for tea.
>
> I think you may have missed the point.  The topic under discussion is a
> 2-level marking system that retains the semantics of CE as calling for
> an RFC3168-style Multiplicative Decrease (MD), and uses ECT(0) [remarked
> from ECT(1)] as the queue occupancy mark for 1/p congestion control.
> With two signals (CE and ECT(0)) arriving at the receiver, there need to
> be ways to reflect both to the sender.
>
> The MD signal (CE) overrides the queue occupancy signal, as it calls for
> a much larger backoff.  ACK thinning should be less of a concern for MD
> because any arrival of CE (or detection of a drop) at the receiver is
> supposed to cause MD at the sender, and that sender MD reaction is
> limited to once per RTT.
>
> The underlying reasons why a single codepoint may suffice to signal MD
> support via the AccECN field are two-fold:
>
>   * The MD signal (CE) overrides the queue occupancy signal (ECT(0)), as
>     it calls for much more sender backoff, so there’s no point in AccECN
>     for queue occupancy if a CE has been received.
>   * The MD signal does not require the gradations of AccECN – it’s an
>     undifferentiated “Oh sh*t!” signal calling for MD backoff (and
>     reversion to RFC3168 sender behavior).
>
> That would still leave the rest of the AccECN codepoints to reflect the
> queue occupancy signal in the absence of CE arriving or a drop being
> detected.
>
> Thanks, --David
>
> *From:*tsvwg <tsvwg-bounces@ietf.org> *On Behalf Of *Bob Briscoe
> *Sent:* Tuesday, May 19, 2020 8:59 AM
> *To:* Martin Duke; Holland, Jake
> *Cc:* TSVWG
> *Subject:* Re: [tsvwg] Update to Position Statement on ECT(1)
>
> Martin,
>
> On 18/05/2020 23:11, Martin Duke wrote:
>
>     Jake,
>
>     I'm intrigued by this discussion of the ECT(1)->ECT(0) proposal, as
>     something that could definitively solve the safety concerns. I'll
>     make two unrelated points:
>
>     1) If the current L4S proposal is in need of an MD signal, there's
>     always dropping the packet.  Although packet loss is bad, maybe some
>     drops at the end of slow start is the tradeoff we have to make to
>     get low latency. Implementations really concerned about loss can be
>     less aggressive during slow start.
>
>     2) Clearly, there is no AccECN signaling problem for ECT(1)->ECT(0)
>     for QUIC, and for TCP paths where the option gets through. If this
>     is an issue of the three ACE bits, I think one codepoint in ACE
>     would be sufficient to indicate that a CE mark was received, which
>     IMO would trump whatever other feedback is in that header. Unless
>     there's some sort of performance cliff in not being able to encode 7
>     ECT(0) marks, this seems like a non-problem.
>
>
> [BB] I don't know how you get to one codepoint in ACE being enough for
> CE. The whole idea of AccECN is to be robust to ACK loss and thinning.
> If it was as simple as you make out, we could have sorted out this
> problem snip-snap and all be home in time for tea.
>
> As we all try to remove more and more latency, we are all going to find
> that the problems get harder and harder - little details matter a lot.
> Without really solid feedback precision, you lose the consistent latency
> of L4S.
>
> Richard Scheffenegger and I designed a 4-3-1 codepoint scheme for ACE at
> one point (4 for CE, 3 for ECT(1) and one for Not-ECT). See
> https://tools.ietf.org/html/draft-kuehlewind-tcpm-accurate-ecn-03#section-3.2.1
> Table 3. However, it was removed because it was "jack of all trades,
> master of none." By trying to cover all the codepoints, it didn't give
> sufficient robustness to any.
>
> Even using all 8 values of the 3-bit counter for feeding back just one
> codepoint is on the edge of what is needed today. I sometimes liken the
> ACE part of AccECN to a "stalking horse" - i.e. a vehicle that works for
> today, while also acting as a platform to get the larger AccECN Option
> deployed for the longer term - when the ACE field might have become too
> small for the possible future level of ACK thinning on some paths.
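
A minimal sketch of why a 3-bit counter is tight under ACK thinning,
assuming only that the ACE field carries a count of CE-marked packets
modulo 8. The function names are hypothetical, and the sketch ignores the
AccECN draft's initial counter value and any heuristics it may use to
disambiguate wraps.

```python
ACE_BITS = 3
ACE_MOD = 1 << ACE_BITS  # 8 distinct counter values


def ace_delta(prev_ace: int, new_ace: int) -> int:
    """Smallest non-negative number of CE marks consistent with two ACE values."""
    return (new_ace - prev_ace) % ACE_MOD


def observed_delta(ce_marks_between_acks: int, prev_ace: int = 0) -> int:
    """What the sender infers if this many CE marks arrive between two ACKs it sees."""
    new_ace = (prev_ace + ce_marks_between_acks) % ACE_MOD
    return ace_delta(prev_ace, new_ace)


# Up to 7 marks between surviving ACKs are reported exactly; 8 or more alias.
inferred = {n: observed_delta(n) for n in (3, 7, 8, 11)}
```

In this model, 8 marks between two surviving ACKs look like none at all,
and 11 look like 3, so every value taken away from the counter for
another purpose shrinks the amount of ACK loss or thinning it tolerates.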
>
> In fact, over the years {1} that we've been trying to get AccECN
> standardized, you will find all sorts of weird and wonderful compressed
> encodings, but in the end, as well as inadequate robustness, arguments
> for simplicity took precedence. Not just because complexity breeds bugs
> (which is not an insignificant consideration for TCP), but also because
> the semantics of ACE also have to be supported by segmentation offload
> hardware.
>
> When I highlighted the problems with tunnelling that the ECT1->0 scheme
> has, I only hinted at these problems with TCP, because I thought
> incompatibility with tunnelling should be a sufficient argument.
>
> In summary, if you think deploying a change to IP is easy, you've
> probably not absorbed the full breadth of the problem.
>
>
> Regards
>
>
>
> Bob
>
> {1} Since Oct 2005:
> https://tools.ietf.org/html/draft-briscoe-tsvwg-re-ecn-tcp-00#section-4.1
>
>
>     Martin (as an individual)
>
>     On Sun, May 10, 2020 at 5:09 PM Holland, Jake
>     <jholland=40akamai.com@dmarc.ietf.org> wrote:
>
>         Hi Mike,
>
>         From: "C. M. Heard" <heard@pobox.com>
>          > Yes, combinations marked (**) below would have to be changed
>          > from RFC 6040:
>         ....
>          > Similar changes would be needed for
>          > draft-ietf-tsvwg-rfc6040update-shim and
>          > draft-ietf-tsvwg-ecn-encap-guidelines.
>          >
>          > Clearly, the need to get such changes deployed would be a
>          > barrier to adoption.
>
>         Yes.  I think in a recent thread I heard it confirmed that
>         current tunnel
>         handling of these is kind of spotty today, specifically:
>
>            > as an endpoint we'll be dealing with weird inconsistencies
>         that basically
>            > never fully understood anything beyond "don't lose CE
>         marks" and maybe
>            > "loss is acceptable in confusing cases", if we're lucky.
>
>            [BB] See reply to Dave Taht just now, which pretty much
>         confirms what
>            you've just said.
>
>         that's from here:
>         https://mailarchive.ietf.org/arch/msg/tsvwg/QudqLu1RTQZCVnS4HNf8jnZ_kIY/
>
>         (I think the Dave Taht reply he was referencing was this one:
>         https://mailarchive.ietf.org/arch/msg/tsvwg/2ElPK72IiFg2gHJZ_rLMUKTDfCI/
>         )
>
>         I'm beginning to think the reason I've come down differently
>         than the
>         L4S team on the judgement call for this approach being better,
>         in light
>         of the state of tunneling encapsulation deployments, might boil
>         down to
>         a disagreement over the answer to a question like:
>            Which is better:
>            - losing the safety response of MD from a loaded classic
>         queue, or
>            - losing some of the reliability on the low-latency response
>         when there
>              is a dualq on-path?
>
>         I'm beginning to think we might be stuck with one of those
>         options for
>         tunneled paths until tunnel decap implementations can be widely
>         upgraded
>         in deployment, however this lands.
>
>          > - the existing accecn spec would often lose non-CE signals
>          >
>          > Actually, I would go farther and say that something rather
>          > different from the existing AccECN draft would be needed. AccECN
>          > provides accurate feedback of the number of CE marks observed.
>          > Under the proposed scheme L4S would need to get accurate
>          > feedback of the number of ECT(0) (pre-congestion / some
>          > congestion) marks. AccECN would need to be re-worked to provide
>          > both that and, in addition, either the existing ECE/CWR
>          > handshake or something else that performs the equivalent
>          > function. The most obvious solution would be to repurpose NS and
>          > one or more currently reserved flag bits (or use other ideas
>          > from RFC 7560 Sec 5.2) and leave ECE/CWR unchanged. I note in
>          > passing the SCE proposal would have to do something along the
>          > same lines (though AFAICT that has not yet been fully fleshed out).
>
>         Agreed, I think a different feedback than AccECN would be
>         smarter if the
>         ECT(1)->ECT(0) approach goes forward, and I like the NS
>         reflection approach
>         that SCE's implementation started with.  (Although it might lose
>         fidelity
>         from some ack aggregation responses, I'd expect usually that the
>         marking
>         rate maintains proportionality on the low-congestion signal, and
>         where that
>         fails, the standard ECE response is at least reliable, so it
>         covers the
>         safety considerations the same way as classic marking.)
>
>         However, I also think for anyone who disagrees, other viable
>         approaches
>         for the feedback might be possible.  But in that direction, I do
>         think
>         it probably would need to differ from AccECN.  This would of
>         course need to
>         be nailed down in the end, though I didn't get into it in the
>         first email.
>         But the potential complexity here is one of the reasons I rate the
>         suggestion as perhaps a major architectural change for L4S.
>
>         I tend to think that the per-ECT(0) reflection in NS is the best
>         way,
>         but I don't think it would change the rest of the argument if
>         that position
>         turned out wrong.
>
>          > - For paths with multiple AQMs, the classifier partially loses
>          >   integrity in later AQMs when earlier AQMs are loaded.  (Note
>          >   also the worse downside that increasing deployment of new AQMs
>          >   potentially reduces the fidelity further.)
>          >
>          > If I understand what is being said, this is because ECT(0)
>          > would become ambiguous, as it can appear either on an L4S packet
>          > with a pre-congestion marking, or a non-L4S packet.  Doesn't the
>          > same issue exist with the current L4S proposal for CE-marked
>          > packets?
>
>         Yes, but the L4S specs go over this and walk through the
>         reasoning for
>         why they landed on classifying CE into the LL queue, and the net
>         result
>         in that case is  that the ECT(1)->CE marking strategy that L4S
>         currently
>         follows will keep the 1/p packet signals in the LL queue for the
>         later
>         hops.
>
>         (The potential problems were mostly limited to mis-classified marked
>         classic traffic, which will tend to be fewer in number and also less
>         severe given that the flow is slated to back off anyway, plus a
>         review of
>         the main implementations suggested they wouldn't be doing
>         double-backoffs
>         if the CE packet was out of order, IIRC.)
>
>         It might be better to phrase this not as "loses integrity", but
>         rather as
>         "might systematically increase the latency experienced" for the
>         1/p signal,
>         since when there are multiple dualqs in line, those packets (but
>         not others
>         from L4S flows) will land in the classic queue on the later
>         dualqs.  This
>         is arguably a worse downside than the classification failure
>         from putting
>         CE-marked classic traffic in the LL queue.
>
>          > Actually, it seems to me that this approach would yield
>          > exactly the same congestion signaling capability as using ECT(1)
>          > as a pre-congestion / some congestion mark. All that has been
>          > done is to reverse the role of ECT(1) and ECT(0) compared to
>          > what the SCE draft and RFC 6040 envisioned. In other words:
>          >
>          >      +-----+-----+
>          >      | ECN FIELD |
>          >      +-----+-----+
>          >         0     0        Not-ECT
>          >         0     1        ECT(1) - L4S/SCE Capable AND No Congestion
>          >         1     0        ECT(0) - Some Congestion OR RFC 3168 ECN Capable
>          >         1     1        CE
>
>         Yes, that's my understanding.  I think the whole proposal can
>         reasonably be
>         summarized as "SCE with the bits flipped".
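
A minimal sketch of the re-marking implied by the quoted table, assuming
an AQM that distinguishes three hypothetical congestion levels ("none",
"some", "md"). The function name, the level names, and the choice to
leave Not-ECT packets untouched (a real AQM would drop them instead) are
illustration-only assumptions, not part of any draft.

```python
from enum import IntEnum


class Ecn(IntEnum):
    NOT_ECT = 0b00
    ECT1 = 0b01  # L4S/SCE capable AND no congestion (per the quoted table)
    ECT0 = 0b10  # some congestion OR RFC 3168 ECN capable
    CE = 0b11


def remark(current: Ecn, level: str) -> Ecn:
    """Return the ECN field after one hypothetical AQM hop at the given level."""
    if level not in ("none", "some", "md"):
        raise ValueError(f"unknown congestion level: {level}")
    if current in (Ecn.NOT_ECT, Ecn.CE) or level == "none":
        return current  # nothing to mark, or already at the strongest mark
    if level == "md":
        return Ecn.CE  # any ECN-capable packet can carry the MD signal
    # level == "some": only ECT(1) can be downgraded to ECT(0); an ECT(0)
    # packet already looks marked and stays as it is.
    return Ecn.ECT0 if current is Ecn.ECT1 else current


# After one lightly loaded hop, an L4S packet is indistinguishable from a
# classic RFC 3168 packet that saw no congestion at all.
l4s_after_hop = remark(Ecn.ECT1, "some")
classic_after_hop = remark(Ecn.ECT0, "none")
ambiguous = l4s_after_hop == classic_after_hop
```

The final comparison is the ECT(0) ambiguity raised earlier in the
thread: a later AQM on the path cannot tell which kind of packet it is
looking at.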
>
>          > Jake, you said that the three issues discussed above --
>          > tunnels, AccECN, and multiple AQMs in the path are "a few of the
>          > known tradeoffs." What are the others?
>
>         These are all the ones I know of yet, but I think Bob and Koen
>         might have
>         some others they already know.  I'm not sure I got the whole
>         story yet.
>
>         Thanks for your comments.
>
>         Best regards,
>         Jake
>
>
>
> --
>
> ________________________________________________________________
>
> Bob Briscoe http://bobbriscoe.net/
>