Re: [tsvwg] TCP Prague's RFC 5033 guidelines status?

Pete Heist <pete@heistp.net> Tue, 12 November 2019 17:14 UTC

From: Pete Heist <pete@heistp.net>
In-Reply-To: <0D4D02AB-3FA9-42F2-BBB9-5011C11527D5@akamai.com>
Date: Tue, 12 Nov 2019 18:14:19 +0100
Cc: Wesley Eddy <wes@mti-systems.com>, "rgrimes@freebsd.org" <rgrimes@freebsd.org>, "Black, David" <David.Black@dell.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Message-Id: <E9B784CC-3E12-4689-8881-A07A2E8CC9DC@heistp.net>
References: <201911120204.xAC24jqO037009@gndrsh.dnsmgr.net> <c4a40e37-b451-8d13-f182-54cace8c759d@mti-systems.com> <0D4D02AB-3FA9-42F2-BBB9-5011C11527D5@akamai.com>
To: "Holland, Jake" <jholland@akamai.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/DlNa32eYyRalUIHi0PkoDC9_oJ8>
Subject: Re: [tsvwg] TCP Prague's RFC 5033 guidelines status?

Thanks, this looks like a good start on an application of 5033 to L4S. 

I wanted to back up and mention that what led me to RFC5033 was a reference to it in section 5 of draft-fairhurst-tsvwg-cc. I’d been looking for guidance on how to evaluate SCE. Even though it’s older, it resonated with me due to its emphasis on the need for scientific rigor when evaluating new CC proposals (thanks to the late Sally Floyd for that).

I admit I'll need to read and re-read some of the L4S literature referred to, as some of the below has likely been addressed in theory. Although there may be other theoretical questions, my interest is in ascertaining what further closed experiments, if any, are required before public experiments begin. I’ve put a few (*) next to what seems most important to me.

Comments below…

Pete

> On Nov 12, 2019, at 8:19 AM, Holland, Jake <jholland@akamai.com> wrote:
> 
> Thanks Wes, this is a helpful start.
> 
> I think the [DualQ-Test] reference from dualq also probably fits
> in #2: difficult environments, with an examination of behavior in
> the presence of fixed completely unresponsive UDP traffic at
> various levels:
> https://riteproject.files.wordpress.com/2018/07/thesis-henrste.pdf
> 
> What my request was about is that I suspect there's probably more,
> it's just hard to comb through it all to see what's missing.
> 
> Best,
> Jake
> 
> 
> On 2019-11-11, 20:49, "Wesley Eddy" <wes@mti-systems.com> wrote:
> 
> Thanks, you make good points.
> 
> I noticed that it was done very clearly for CUBIC in 
> https://tools.ietf.org/html/rfc8312 which looks like a great recent 
> example of how the 5033 topics were each addressed.
> 
> For Prague + L4S, I think we have a lot of material spread out in 
> various places, and are pretty much covering all the bases to some 
> extent, but lack a single place where it's all collected.
> 
> Here are my thoughts on what we have:
> 
> (1) Impact on Standard TCP
> 
> Parts of Appendix A and Section 4 of l4s-id draft are relevant, e.g. 
> A.1.4 "Fall back to Reno-friendly congestion control on classic ECN 
> bottlenecks"
> 
> Results w/ CUBIC + Prague in github/heistp/sce-l4s-bakeoff and 
> l4s.cablelabs.com data are relevant.
> 
> Issue 16 results from l4s.cablelabs.com are relevant.
> 
> 4.1.1 of the dualq doc is relevant.
> 
> (2) Difficult Environments
> 
> I'm not sure this is super-deeply investigated yet?
> 
> Some parts of the dualq draft touch on it, though since dualq is a 
> construction where the actual AQM algorithm(s) can vary, it's maybe 
> sufficient that those AQMs are suitable for the environments they're 
> used in?
> 
> Scenario 5 in github/heistp/sce-l4s-bakeoff and l4s.cablelabs.com tests 
> may be one data point, though I think the gist of 5033 is wider on 
> wireless access, etc.

Here’s a non-exhaustive list of difficult environments and their characteristics mentioned or implied in 5033. The two marked (*) are very common and may present challenges.

- wireless and other bursty links (*)
- asymmetric links and delays (*)
- very high and very low speed links
- high delay links (e.g. GEO sat)
- high BDP links
- intermittent link connectivity
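
To give a sense of scale for the speed, delay and BDP items above, here is a minimal sketch of the bandwidth-delay product (rate times RTT). The link profiles and their numbers are assumptions chosen only for illustration, not measurements from any of the tests discussed in this thread.

```python
def bdp_bytes(rate_bits_per_sec: float, rtt_sec: float) -> float:
    """Bandwidth-delay product in bytes: rate (bit/s) * RTT (s) / 8."""
    return rate_bits_per_sec * rtt_sec / 8


# Illustrative link profiles only (assumed numbers, not measurements).
links = {
    "low speed, short path": (1e6, 0.040),
    "GEO satellite-like": (20e6, 0.600),
    "high speed, long path": (10e9, 0.100),
}

bdp_table = {name: bdp_bytes(rate, rtt) for name, (rate, rtt) in links.items()}

for name, bdp in bdp_table.items():
    # Express the BDP as a count of full-size 1500-byte packets in flight.
    print(f"{name}: {bdp / 1500:.1f} packets of 1500 B in flight")
```

The spread between the smallest and largest profile is the point: a CC that behaves well at a few packets in flight may need separate evaluation at tens of thousands, and vice versa.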

> (3) Investigating a Range of Environments
> 
> Section 4 and A.1.6 ("Scaling down to fractional congestion windows") of 
> the l4s-id doc partly speak to this.  A.2.2 ("Faster than Additive 
> Increase") is also partly relevant.
> 
> The scenarios used for github/heistp/sce-l4s-bakeoff and 
> l4s.cablelabs.com data have some datapoints and use different base 
> delays, but I think the gist of this in 5033 is probably towards wider 
> sweeping of the underlying network rates, numbers of flows, and other 
> variables.

Some other conditions:

- bi-directional traffic (*)
- compatibility with other existing RFC3168 AQMs, e.g. RED and PIE (we have started on CoDel / fq_codel)
- many flows in a bottleneck
- packet loss, corruption and re-ordering
- misbehaving senders, receivers or routers (really (6))

> (4) Protection Against Congestion Collapse
> 
> A.1.3 ("Fall back to Reno-friendly congestion control on packet loss") 
> pretty much covers this?
> 
> (5) Fairness within the Alternate Congestion Control Algorithm
> 
> Section 4 and A.1.5 ("Reduce RTT dependence") of l4s-id are relevant.  
> Also A.2.3 ("Faster Convergence at Flow Start") is relevant.
> 
> The two-flow cases in github/heistp/sce-l4s-bakeoff and 
> l4s.cablelabs.com data are a simple case of this (2 flow cases / 
> prague-vs-prague).
> 
> (6) Performance with Misbehaving Nodes and Outside Attackers
> 
> Section 8 of l4s-arch + Section 4 of dualq cover this?
> 
> Scenario 4 results from the github/heistp/sce-l4s-bakeoff and 
> l4s.cablelabs.com tests have some relation to this also?

Scenario 4 covers the case of what happens if ECT(1) is set on all packets for a “classic” flow, and that seems to work as designed.

Could a flow instead set ECT(1) on only the “right” packets, like only so many per RTT, in order to gain some advantage?
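
To make that question concrete, here is a rough sketch. The codepoint values are the ECN field encodings from RFC 3168; the `selective_ecn` policy is purely hypothetical (the function name, the per-RTT budget and the choice of ECT(0) for the remaining packets are my assumptions), and I'm not claiming it gains anything, only that it is the kind of sender behavior that might be worth testing.

```python
# ECN field codepoints as defined in RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def ecn_of(tos: int) -> int:
    """The ECN field is the low two bits of the IPv4 TOS / IPv6 traffic class byte."""
    return tos & 0b11


def selective_ecn(send_times, rtt, per_rtt):
    """Hypothetical sender policy: set ECT(1) on at most `per_rtt` packets
    in each RTT-long window, and ECT(0) on the rest."""
    marks = []
    window_start = None
    used = 0
    for t in send_times:
        # Start a new window on the first packet or once an RTT has elapsed.
        if window_start is None or t - window_start >= rtt:
            window_start, used = t, 0
        if used < per_rtt:
            marks.append(ECT_1)
            used += 1
        else:
            marks.append(ECT_0)
    return marks


# Six packets, RTT of 10 time units, at most two ECT(1) packets per RTT.
marks = selective_ecn([0, 1, 2, 3, 10, 11], rtt=10, per_rtt=2)
print(marks)
```

A test along these lines would compare the throughput and delay of such a flow against competing flows that mark consistently.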

> (7) Responses to Sudden or Transient Events
> 
> This could be related to parameter tuning of AQMs used in the dualq 
> construction and to Prague, but I'm not sure it's applicable to the 
> higher-level L4S architecture itself.

Might include:
- sudden congestion, capacity, and RTT changes (*)
- bottleneck shifts (*)
- multipath routing and routing changes

> (8) Incremental Deployment
> 
> Section 6.3 of l4s-arch covers this?

Compatibility with existing RFC3168 AQMs due to the redefinition of CE was one of our main concerns here. Some of its impacts on fairness and TCP RTT have been explored in the “bakeoff” tests.

There’s mention of it in l4s-arch 6.3.3 and it’s addressed further in the recent paper “TCP Prague Fall-back on Detection of a Classic ECN AQM”.