Re: [tsvwg] Update to Position Statement on ECT(1)

Martin Duke <martin.h.duke@gmail.com> Wed, 20 May 2020 00:51 UTC

References: <BE44EAE9-5CFB-4F5D-85B8-05AFA516C151@akamai.com> <CACL_3VEbUHB-Omwp1-g5Tq3G3J-kKj9N3jPZLcfruicw3X=AsA@mail.gmail.com> <2CBBD8CD-2088-4E41-B113-EED665853D3C@akamai.com> <CAM4esxSFCBcxXjz5JJJg1z6+wwfN3mTrtJ8bKiBsj2TeOmmFSw@mail.gmail.com> <1D8D2AF8-F805-4BAC-8126-355A8337D830@akamai.com> <CAM4esxSMELAi0BMBRynYTx44iY6f-yLEWng4QQ2Pxt9J-haxFg@mail.gmail.com> <DE770902-CA1E-405C-A944-F12114AF2C3B@akamai.com>
In-Reply-To: <DE770902-CA1E-405C-A944-F12114AF2C3B@akamai.com>
From: Martin Duke <martin.h.duke@gmail.com>
Date: Tue, 19 May 2020 17:51:09 -0700
Message-ID: <CAM4esxQTyDNfNiAFhiHL9Zb3OPr9jivkrD2u8DtvhsMw_2Yv-g@mail.gmail.com>
To: "Holland, Jake" <jholland@akamai.com>
Cc: "C. M. Heard" <heard@pobox.com>, TSVWG <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/dr_fiChDiYlbCmMaFnQzBbcLpO4>
Subject: Re: [tsvwg] Update to Position Statement on ECT(1)

Hi Jake,

First, thanks to Bob for the RFC 6040 reference. That cleared up my
confusion about the tunnel problem right away.

Jake, I like your framing of the trade-off between losing the L4S signal
and losing the 3168 signal. However, let me add yet another concern about
ECT(1)->ECT(0): by design, packets so marked may arrive *much* later than
neighbors not marked, which means they are likely to be declared lost
anyway. Do you see this as a problem, or is the latency difference after
the bottleneck likely to be small?

Martin

On Tue, May 19, 2020, 13:13 Holland, Jake <jholland@akamai.com> wrote:

> Hi Martin,
>
>
>
> New responses with <JH2></JH2>
>
>
>
> *From: *Martin Duke <martin.h.duke@gmail.com>
> *Date: *Tuesday, May 19, 2020 at 10:21 AM
>
> To me, the problem with loss as the only (viable) MD signal is the
> existing devices that don’t drop until well after exceeding reasonable
> fairness bounds from an unresponsive flow, so for TCP Prague to maintain
> compatibility with classic queues, it would still need a successful classic
> queue detection mechanism.
>
>
>
> I probably misunderstood. I thought the desired signal was a MD signal in
> the low-latency queue. Yes, for 3168 queues you'd have to essentially treat
> ECT(1) as Not-ECT. So that may not be a productive direction.
>
>
>
> <JH2>
>
> As I understand it, this is the current L4S MD signal if the LL queue
> overflows into the low latency queue and has congestion there.
>
>
>
> I agree this works as far as it goes, but the safety problem that seems to
> need an MD signal is the interaction with existing queue implementations
> that know nothing about L4S, and have a pre-existing interpretation of
> ECT(1).
>
>
>
> (Even for FQ, the interaction is very poor, but shared queues are where the
> poor interaction is arguably a major safety concern.  Mostly nobody seems
> to care much about the occasional hash collision in FQ that would have the
> same problem, because most FQs are 1024 slots, so after birthday paradox
> considerations it’s only 1/32 or so of the competing flows.)
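
The FQ hash-collision estimate in the paragraph above can be checked with a
short calculation. This is only a sketch under stated assumptions: flows are
hashed uniformly and independently into 1024 buckets, and the "1/32 or so"
figure is read as the chance that one given flow shares its bucket when
roughly 32 other flows are competing. That reading is an assumption, not
something the thread states, and the function names are made up for
illustration.

```python
def collision_probability(n_flows: int, n_buckets: int = 1024) -> float:
    """Chance that one given flow shares its bucket with at least one other.

    Assumes uniform, independent hashing (an idealisation of real FQ hashes).
    """
    if n_flows < 1 or n_buckets < 1:
        raise ValueError("n_flows and n_buckets must be positive")
    # Each of the other (n_flows - 1) flows misses our bucket with
    # probability (1 - 1/n_buckets).
    return 1.0 - (1.0 - 1.0 / n_buckets) ** (n_flows - 1)


def any_collision_probability(n_flows: int, n_buckets: int = 1024) -> float:
    """Classic birthday-paradox chance that at least two flows collide."""
    if n_flows > n_buckets:
        return 1.0  # pigeonhole: more flows than buckets
    p_all_distinct = 1.0
    for i in range(n_flows):
        p_all_distinct *= 1.0 - i / n_buckets
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (8, 33, 128):
        print(n, round(collision_probability(n), 4),
              round(any_collision_probability(n), 4))
```

With 33 flows the per-flow figure comes out close to 1/32, while the chance
that some pair of flows collides is considerably higher, which is the usual
birthday-paradox effect.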
>
>
>
> Anyway, if there was sufficiently good consensus that these classic queues
> are irrelevant, then deprecating them responsibly is a potentially viable
> way forward, but that’s not what any of the current BCPs or standards say
> today, I think.
>
> </JH2>
>
>
>
>
>
> Good catch -- soon after I sent this, I thought, "what about CWR". The
> problem with using a second codepoint is that the sender signal then
> pre-empts the feedback signal -- fine for functionally half-duplex
> connections, but not generally. I can see a few ways out of this:
>
>
>
> 1) Take another TCP header reserved bit so that we can keep CWR and the 3
> bits for ACE. (Ugh)
>
> 2) Come up with a heuristic to stop sending ECE, obviating CWR.
>
> 3) Come up with a heuristic to mix packets with CWR and ACE as needed
> depending on counter activity.
>
> Though I'll defer to Bob on the issues of re-engineering ACE -- it sounds
> like he's been through all the permutations.
>
>
>
> <JH2>
>
> My conclusion that it seemed workable to me came with the assumption that
> #3 would be pretty straightforward.
>
>
>
> You don’t need very many CWRs, and it might be possible to clean up the
> specs a bit about when exactly they should be sent (like maybe they’re only
> appropriate when there’s a MD event, and maybe they shouldn’t be sent for
> linear reductions during a fine-tuning response, now that we’re
> contemplating such a thing).
>
>
>
> So I was thinking for bidirectional traffic that has the newly suggested
> ACE signaling (with a codepoint for ECE and CWR, plus 6 count slots), you
> could just put CWR at the top priority, so the (only occasional) CWR would
> overrule either a latched ECE that’s still in progress or one of the
> fine-tuning signal slots, for the occasional packet reporting an MD event.
> (The logic from 3168 still holds here: if CWR is on a data packet and the
> data packet is lost, you’ll still get an MD response, so it doesn’t need to
> be reliable like ECE, and the counter makes it so skipping a packet doesn’t
> lose the fine-tuning signal, since it’s no worse than thinning the acks
> would have been.)
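
The priority rule described above (CWR first, then a latched ECE, then the
fine-tuning counter) can be sketched as follows. Everything concrete here is
hypothetical: the codepoint values 6 and 7, the state fields, and the function
names are invented for illustration and do not come from any ACE or AccECN
draft. The only things taken from the text are the 3-bit field with one
codepoint each for ECE and CWR plus 6 count slots, and the ordering.

```python
from dataclasses import dataclass

COUNT_SLOTS = 6     # from the text: 6 of the 8 codepoints carry a counter
ECE_CODEPOINT = 6   # hypothetical value
CWR_CODEPOINT = 7   # hypothetical value


@dataclass
class AceSenderState:
    ce_count: int = 0          # fine-tuning marks seen so far
    ece_latched: bool = False  # classic echo still in progress
    cwr_pending: bool = False  # an MD event still has to be reported


def next_ace_field(state: AceSenderState) -> int:
    """Pick the 3-bit field value for the next outgoing packet."""
    if state.cwr_pending:
        # Top priority, sent once: it need not be reliable, because losing
        # the data packet that carries it triggers an MD response anyway.
        state.cwr_pending = False
        return CWR_CODEPOINT
    if state.ece_latched:
        # Stays latched until the caller clears it (on the peer's CWR).
        return ECE_CODEPOINT
    # Otherwise report the running counter; a skipped packet loses nothing
    # because the next report still carries the cumulative count.
    return state.ce_count % COUNT_SLOTS


def counter_delta(prev_slot: int, new_slot: int) -> int:
    """New marks since the last report, ambiguous beyond 5 (rollover)."""
    return (new_slot - prev_slot) % COUNT_SLOTS
```

The rollover worry raised in the next paragraph shows up in counter_delta:
any gap of 6 or more marks between two counter reports is indistinguishable
from a smaller one.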
>
>
>
> I do worry a bit whether a 6-count is enough to detect rollover well, but
> like with the ESCE approach for reflecting the fine-tuning signal, losing
> some accuracy is acceptable if we have a reliable MD signal, so I think
> it’s probably ok.
>
>
>
> I wouldn’t say no to #1 either, even if only to increase the ACE count and
> still keep a 2-codepoint signal for the non-counting signals.
>
> </JH2>
>
>
>
>
>
> I guess I don't understand the tunneling problem. If we have a mark that
> demands MD, that trumps feedback about low-latency fine tuning, does it not?
>
>
>
> <JH2>
>
> As tempting as it is to just answer “yes, of course”, I’ll acknowledge
> it’s not quite so clear-cut.  (And I feel like you might not be alone in
> not understanding this facet of the problem yet, so thanks for asking about
> it.  It hasn’t been very easy to follow.)
>
>
>
> The issue as I understand it is that tunnels as they stand today can only
> export the CE signal, period.  (At best.  That’s if they’re
> RFC6040-compliant.  If we want to carry 2 signals, we’d need to obsolete
> RFC 6040 with a decapsulation algorithm that carries 2 signals instead, and
> update a lot of tunnel endpoints, hence a decade-long rollout before most
> tunnels have it.)
>
>
>
> The L4S proponents have argued persistently that it’s not necessary to
> respond to a classic CE signal with MD, and that the value gained from the
> fine-tuning signal is key to the L4S value proposition, and therefore that
> the fine tuning should be carried in CE, regardless of the interactions
> with classic queues.
>
>
>
> And it’s also fair to note that RFC 8311 introduced some significant
> question as to whether a mark from a classic queue now demands an MD
> response, so now it’s sort of a judgement call about whether we can get
> “effective congestion control” without honoring such a signal in the MD way
> that the classic queues are expecting.
>
>
>
> So I think it’s an open question at this point about what constitutes a
> reasonable outcome when MD is not the response to a classic queue’s CE
> signal.
>
>
>
> Some of that depends on whether you think the classic queue detection can
> be made successful, since it provides an alternative MD response that’s not
> a direct response to a CE signal from the classic queue.  (There are
> currently some differing opinions on how likely this is to work in
> practice, just in case it wasn’t complicated enough yet.)
>
>
>
> I of course have my opinions, but I don’t know of an established consensus
> here, and to me it seems like the main crux of the debate over safety
> properties of L4S, at this point.
>
> </JH2>
>
>
>