Re: [tcpm] Proceeding CUBIC draft - thoughts and late follow-up

Martin Duke <martin.h.duke@gmail.com> Tue, 21 June 2022 17:38 UTC

References: <F86ABD0E-D615-4EB9-924A-05F222123964@apple.com>
In-Reply-To: <F86ABD0E-D615-4EB9-924A-05F222123964@apple.com>
From: Martin Duke <martin.h.duke@gmail.com>
Date: Tue, 21 Jun 2022 10:38:07 -0700
Message-ID: <CAM4esxThEBcYZPdZeFbAHzweH3aFDTGnpr9BEOGdmvR3F3MSrg@mail.gmail.com>
To: Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org>
Cc: Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>, tcpm-chairs <tcpm-chairs@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000ad9dee05e1f8af13"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/9nwZx-Uz7L73gZZHdMbkYA9t7gc>
Subject: Re: [tcpm] Proceeding CUBIC draft - thoughts and late follow-up

(with no hats)

Markku,

I think it's important to distinguish between algorithms that are
aggressive but reach a superior equilibrium for everyone using them, and
aggressive algorithms that don't scale if everyone is using them.

There's one scenario (A) that I think everyone would agree is acceptable:
1) Early adopters deploy a new algorithm
2) The old algorithm is not affected at all
3) As users migrate from old to new, the network converges on a
higher-utilization equilibrium

Similarly, we would all agree that Scenario (B) is unacceptable:
1) Deploy new algorithm
2) The old algorithm is starved and unusable
3) As users migrate from old to new, the network converges on a
higher-utilization equilibrium

There's a middle ground (C) where the old algorithm suffers degraded
performance, but not fatally. Reasonable people can disagree on where the
exact threshold lies, and the argument has several dimensions. It's an
eternal human argument about how much damage is acceptable in the name of
technical progress, and we won't settle it here.

In the case of Cubic, it is *extremely widely* deployed. Whether or not
doing damage to Reno connections was justified, we have already sped
through (2) and have landed on (3). Cubic is the default and users
generally have to seek out Reno to use it. So what is to be gained by
continuing to defend an inferior equilibrium against a superior one that
has already won in the market?

As for RFC 9002: this was an expedient choice; the QUIC WG needed a standard
congestion control, was not chartered to create a new one, and there was
only one on the shelf to choose from. If Cubic had been standards-track,
the WG may very well have chosen that one. In the real world, the most
important production QUIC implementations are not using Reno.

On Mon, Jun 20, 2022 at 6:08 PM Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org> wrote:

> If we are talking about RFC 9002 New Reno implementations, then that
> already modifies RFC 5681 and doesn’t comply with RFC 5033. Since it has a
> major change from 5681 for any congestion event, I wouldn’t call it closely
> following New Reno. Also, in another email, you said that you didn’t follow
> discussions in the QUIC WG for RFC 9002, so how do you know whether QUIC
> implementations are using New Reno or CUBIC congestion control?
> It would be good to stay consistent in our replies: if you agree RFC 9002
> is already non-compliant with RFC 5033, then why use it as a reference to
> cite Reno implementations?
>
> Vidhi
>
> > On Jun 20, 2022, at 5:06 PM, Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org> wrote:
> > Hi Lars,
> >
> > On Sun, 19 Jun 2022, Lars Eggert wrote:
> >
> >> Hi,
> >>
> >> sorry for misunderstanding/misrepresenting your issues.
> >>
> >>> On Jun 6, 2022, at 13:29, Markku Kojo <kojo@cs.helsinki.fi> wrote:
> >>> These issues are significant and some number of people have also said
> >>> they should not be left unaddressed. Almost all of them are related to
> >>> the behaviour of CUBIC in the TCP-friendly region where it is intended
> >>> and required to fairly compete with the current stds track congestion
> >>> control mechanisms. The evaluation whether CUBIC competes fairly
> >>> *cannot* be achieved without measuring the impact of CUBIC to the
> >>> other traffic competing with it over a shared bottleneck link. This
> >>> does not happen by deploying but requires specifically planned
> >>> measurements.
> >>
> >> So whether CUBIC competes fairly with Reno in certain regions is a
> >> completely academic question in 2022. There is almost no Reno traffic
> >> anymore on the Internet or in data centers.
> >
> > To my understanding we have quite a bit of QUIC traffic for which RFC 9002
> > has just been published and it follows Reno CC quite closely with some
> > exceptions. We also have some SCTP traffic that follows Reno CC very
> > closely and numerous proprietary UDP-based protocols that RFC 8085 requires
> > to follow the congestion control algos as described in RFC 2914 and RFC 5681.
> > So, are you saying RFC 2914, RFC 8085 and RFC 9002 are just academic
> > exercises?
> >
> > Moreover, my answer to why we see so little Reno CC traffic is very
> > simple: people deployed CUBIC, which is more aggressive than Reno CC, so it
> > is an inherent outcome that hardly anyone is willing to run Reno CC when
> > others are running a more aggressive CC algo that leaves little room for
> > competing Reno CC.
> >
> >> I agree that in an ideal world, the ubiquitous deployment of CUBIC
> >> should have been accompanied by A/B testing, including an investigation
> >> into impact on competing non-CUBIC traffic.
> >>
> >> But that didn’t happen, and we find ourselves in the situation we’re
> >> in. What is gained by not recognizing CUBIC as a standard?
> >
> > First, if the CUBIC draft is published as it currently is, that would
> > give an IETF stamp and 'official' start for "a spiral of increasingly
> > aggressive TCP implementations" that RFC 2914 appropriately warns about.
> > In the little time I had to follow the L4S discussions in tsvwg, people
> > already insisted on comparing L4S performance to CUBIC instead of Reno CC.
> > The fact is that we don't know how much more aggressive CUBIC is than Reno
> > CC in its TCP-friendly region. However, if I recall correctly, it was
> > considered OK that L4S is somewhat more aggressive than CUBIC. So, the
> > spiral has already started within the IETF as well as in the wild
> > (Internet).
> >
> > Second, recognizing CUBIC as a standard as it is currently written
> > would ensure that all issues that have been raised would get ignored and
> > forgotten forever.
> >
> > Third, you did not indicate which issue you are referring to. Some of
> > the issues have nothing to do with fair competition against Reno CC in
> > certain regions. E.g., issue 2 also causes self-inflicted problems for a
> > flow itself, as Neal indicated based on some traces he had seen. And there
> > is a simple, effective and safe fix to it, as I have proposed.
> >
> > As I have tried to say, I do not care too much what the status of CUBIC
> > would be when it gets published, as long as we do not hide the obvious
> > issues it has and we have a clear plan to ensure that all issues that have
> > not been resolved by the time of publishing will have a clear path and
> > incentive to get fixed. IMO that can be best achieved by publishing it as
> > Experimental and documenting all unresolved issues in the draft. That
> > approach would give all proponents an incentive to do whatever is
> > needed (measurements, algo fixes/tuning) to solve the remaining issues and
> > get it to stds track.
> >
> > But let me ask a different question: what is gained and how does the
> > community benefit from a std that is based on a flawed design that does
> > not behave as intended?
> >
> > Congestion control specifications are considered as having significant
> > operational impact on the Internet, similar to security mechanisms. Would
> > you in the IESG support publication of a security mechanism that is shown
> > to not operate as intended?
> >
> > Could we now finally focus on solving each of the remaining issues and
> > discussing the way forward separately with each of them? Issue 3 a) has
> > pretty much been solved already (thanks Neal), though some text tweaking
> > may still be needed.
> >
> > Thanks,
> >
> > /Markku
> >
> >> Thanks,
> >> Lars
> >>
> >> --
> >> Sent from a mobile device; please excuse typos.
> > _______________________________________________
> > tcpm mailing list
> > tcpm@ietf.org
> > https://www.ietf.org/mailman/listinfo/tcpm