Re: [tsvwg] Neal Cardwell's rationale for supporting ECT(1) as an input/L4S signal

Neal Cardwell <ncardwell@google.com> Tue, 12 May 2020 20:16 UTC

References: <CADVnQy=7f79Mj_GQBU-UsodTRORjB2U6rCPPQ+1Zck_gxr-rww@mail.gmail.com> <4b86712a-bc1e-cf40-12de-86032666ba1c@kit.edu>
In-Reply-To: <4b86712a-bc1e-cf40-12de-86032666ba1c@kit.edu>
From: Neal Cardwell <ncardwell@google.com>
Date: Tue, 12 May 2020 16:16:29 -0400
Message-ID: <CADVnQy=TR1D1ecb7tsnLTWjAfyaYRXOuhPvjTX6zPAZExj6phQ@mail.gmail.com>
To: Roland Bless <roland.bless@kit.edu>
Cc: tsvwg IETF list <tsvwg@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/IWZNcqQCAwjX7AQT-QGJpLAqeEQ>
Subject: Re: [tsvwg] Neal Cardwell's rationale for supporting ECT(1) as an input/L4S signal

Hi Roland,

Please see some thoughts in-line below...

On Fri, May 8, 2020 at 1:12 PM Roland Bless <roland.bless@kit.edu> wrote:
>
> Hi Neal and all,
>
> see inline.
>
> On 08.05.20 at 17:19 Neal Cardwell wrote:
> > As part of the discussion around the consensus call on ECT(1), I
> > wanted to post my personal rationale for supporting ECT(1) as an
> > input/classification signal, along the lines of the L4S proposals.
> >
> > This is my personal opinion; I'm not speaking for any company or
> > institution.
> >
> > Many of these points will echo the excellent points that were made at
> > the TSVWG interim on April 27, by Greg White, Bob Briscoe, Jana
> > Iyengar, Stuart Cheshire, and Andrew McGregor; I agree with all of the
> > points they made. But for the sake of discussion and debate I wanted
> > to amplify what they said and enumerate the particular considerations
> > that are most compelling to me.
>
> I had the feeling that this was more a discussion around L4S vs. SCE
> proposals and not about the
> ECT(1) codepoint in question. Several arguments were along the lines of
> "L4S is more advanced
> and needs to be finished as it already took too long". I don't think
> that this is a good argument as
> the L4S proposal could also use ECT(1) as output and use a DSCP as
> classifier. See below.

I think it's overly optimistic to think that DSCP could be used as an
end-to-end classifier for a new congestion control class across the
public Internet. Rewriting DSCP bits at ingress into an AS is
ubiquitous, so getting a DSCP to survive end-to-end would require
hundreds or thousands of ASes to rearrange their custom internal
DSCP/QoS architectures in compatible ways. There is also a
combinatorial explosion: an AS would need each of its existing DSCP/QoS
levels to support both the new congestion control class and classic
congestion controls. This doesn't seem like a realistic path to me.
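
To make the DSCP concern concrete, here is a minimal Python sketch. It
is not taken from any real router configuration: remark_at_ingress is
a hypothetical ingress policy that overwrites the DSCP with a locally
chosen value, and 0x2A is an arbitrary DSCP value picked for
illustration. The only facts relied on are that the DSCP is the upper
6 bits and the ECN field the lower 2 bits of the same header byte.

```python
DSCP_SHIFT = 2    # DSCP: upper 6 bits of the IPv4 TOS / IPv6 Traffic Class byte
ECN_MASK = 0b11   # ECN: lower 2 bits of the same byte
ECT1 = 0b01       # ECT(1) codepoint (RFC 3168)

# Assumption: an arbitrary DSCP value standing in for a "new class" marker.
HYPOTHETICAL_LOW_LATENCY_DSCP = 0x2A


def make_tos(dscp, ecn):
    """Pack a DSCP and an ECN codepoint into one header byte."""
    return ((dscp & 0x3F) << DSCP_SHIFT) | (ecn & ECN_MASK)


def dscp_of(tos):
    return tos >> DSCP_SHIFT


def ecn_of(tos):
    return tos & ECN_MASK


def remark_at_ingress(tos, local_dscp=0):
    """Hypothetical AS ingress policy: overwrite DSCP, leave ECN untouched."""
    return make_tos(local_dscp, ecn_of(tos))


sent = make_tos(HYPOTHETICAL_LOW_LATENCY_DSCP, ECT1)
received = sent
for _ in range(3):  # the packet crosses three AS boundaries
    received = remark_at_ingress(received)

print("DSCP sent/received:", dscp_of(sent), dscp_of(received))
print("ECN  sent/received:", ecn_of(sent), ecn_of(received))
```

Under these assumptions the DSCP classifier is gone after the first AS
boundary, while the ECN codepoint arrives intact.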

> > - Much of the installed base of switch hardware around the world can
> > already be configured to signal shallow-threshold congestion with
> > DCTCP/L4S-style CE marks. Many have CE-marking mechanisms baked into
> > hardware, and would need to be replaced in order to deploy SCE.
> The question is: could they easily support using ECT(1) instead of CE
> for the shallow-threshold marking?

I suspect that in most high-speed switches and routers, the
functionality to use CE for ECN markings is etched into ASIC hardware,
and could not be changed.

> Supporting L4S's dual queue and classification by ECT(1) would IMHO
> require even more extensive hardware changes.

Yes. But dual-queue is not a required component of L4S inside a
datacenter. So the existing installed base of datacenter switches
would not need to be changed at all to participate in L4S. Those
switches are designed to mark any ECT(0) or ECT(1) packets with CE
when the queue passes a shallow threshold. That works for their
current use case, and it works for L4S. No changes would be needed, to
hardware or firmware.
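
As a rough illustration of the marking behavior described above, here
is a minimal Python sketch of a step-threshold marker. The function
name, the packet-count units, and the threshold value are assumptions
for illustration only; real switch ASICs differ in how they measure
queue depth and configure thresholds. The codepoint values are the
RFC 3168 ones.

```python
# RFC 3168 ECN codepoints (lower 2 bits of the TOS / Traffic Class byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def mark_on_enqueue(ecn, queue_depth_pkts, threshold_pkts):
    """Return the ECN codepoint a packet leaves the switch with.

    Simplified model: once the queue is deeper than a shallow threshold,
    any ECN-capable packet, ECT(0) or ECT(1) alike, is marked CE.
    Not-ECT packets are returned unchanged here; this sketch does not
    model drops.
    """
    if queue_depth_pkts > threshold_pkts and ecn in (ECT0, ECT1):
        return CE
    return ecn


# Hypothetical shallow threshold of 10 packets.
for name, cp in (("Not-ECT", NOT_ECT), ("ECT(0)", ECT0), ("ECT(1)", ECT1)):
    print(name, "->", format(mark_on_enqueue(cp, 20, 10), "02b"))
```

The point of the sketch is that the marker does not distinguish ECT(0)
from ECT(1), which is why the same behavior serves both the current
use case and L4S senders.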

> > - Encapsulation/decapsulation is a widely prevalent and important
> > technology today in production networks. With the installed base of
> > encap/decap mechanisms, it is likely that for many implementations any
> > SCE marking applied to packets would just get stripped off with the
> > outer header when decapsulated.
> >
> I'm not sure about this one. L4S could be affected equally or do you
> think that CE is handled differently? I wasn't able to find a quick
> answer by looking at
> https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-encap-guidelines-13

Here my understanding was based on a description of this issue by Bob
Briscoe. I'm not sure what the original data source is for that issue.
(And apologies if I have misinterpreted the issue.)

> > - To get low latency in the public Internet, we need multiple queues.
> > With SCE AFAICT there's not yet a compelling story about how to
> > classify traffic to maintain a low-delay queue and a classic queue,
> > since there is no ECN code point to identify the low-delay traffic.
> > AFAICT with SCE you likely must either (a) have a single high-delay
> > queue with a jumble of Reno, CUBIC, classic RFC 3168, and
> > SCE-respecting RFC 3168 senders, (b) use fair queuing (tricky in
> > hardware), (c) use DSCP to classify (academically pure but unrealistic
> > across multiple ASes in the public Internet), or (d) use a dual-queue
> > system that heuristically classifies all flows into the low-delay
> > queue by default but kicks them out using a queue-protection algorithm.
>
> I agree here and queue separation would also be essential for SCE. But
> using a DSCP for classification would
> be the more extensible solution as ECT(1) wouldn't always mean to use
> the L4S queue semantics. Instead both
> solutions could use the ECT(1) mark with nearly the same congestion
> signal semantics. Moreover, both solutions have
> problems with "misbehaving" flows in the wrong queue.

As mentioned above, I think using DSCP for congestion control
classification, while theoretically clean and extensible, is not a
realistic/feasible option.
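
For contrast, the ECN-based classification that the L4S proposals rely
on can be sketched in a few lines of Python. This is an illustrative
sketch only: the queue names are hypothetical, and it follows my
understanding of the L4S dual-queue drafts, where ECT(1) packets (and
CE packets, whose original codepoint can no longer be determined) go
to the low-delay queue and everything else goes to the classic queue.

```python
# RFC 3168 ECN codepoints (lower 2 bits of the TOS / Traffic Class byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def classify(ecn):
    """Pick a queue from the ECN field alone; no DSCP is consulted."""
    if ecn in (ECT1, CE):
        return "low_delay"   # hypothetical name for the L4S queue
    return "classic"         # Not-ECT and ECT(0) traffic


for name, cp in (("Not-ECT", NOT_ECT), ("ECT(0)", ECT0),
                 ("ECT(1)", ECT1), ("CE", CE)):
    print(name, "->", classify(cp))
```

Because the classifier reads only the two ECN bits, it does not depend
on how each AS along the path rewrites DSCP.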

Best regards,
neal