Re: [tcpm] Proceeding CUBIC draft - thoughts and late follow-up

Lars Eggert <lars@eggert.org> Wed, 22 June 2022 06:31 UTC

From: Lars Eggert <lars@eggert.org>
In-Reply-To: <alpine.DEB.2.21.2206210130120.7292@hp8x-60.cs.helsinki.fi>
Date: Wed, 22 Jun 2022 09:30:51 +0300
Cc: Gorry Fairhust <gorry@erg.abdn.ac.uk>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>, tcpm-chairs <tcpm-chairs@ietf.org>
Message-Id: <A0821CC3-23E2-4521-86CA-E110B4B6E955@eggert.org>
References: <alpine.DEB.2.21.2206061135361.7292@hp8x-60.cs.helsinki.fi> <89E12A4E-2CDD-4DFD-9CBE-E2B669BE8C4C@eggert.org> <alpine.DEB.2.21.2206210130120.7292@hp8x-60.cs.helsinki.fi>
To: Markku Kojo <kojo@cs.helsinki.fi>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/QLJ3umsdzHpVxRzKRKtIL4dtups>

Hi,

On 2022-6-21, at 3:06, Markku Kojo <kojo@cs.helsinki.fi> wrote:
> To my understanding we have quite a bit QUIC traffic for which RFC 9002 has just been published and it follows Reno CC quite closely with some exceptions.

See Vidhi's message on the differences between Reno and RFC 9002.

Also, my understanding is that the most widely deployed QUIC stacks in production actually use CUBIC or BBR v1 or v2 and not RFC 9002.

> We have also some SCTP traffic that follows very closely Reno CC

The SCTP used for WebRTC in production (Webex, Zoom, etc.) is AFAIK not using Reno (or CUBIC, or RMCAT).

> and numerous proprietary UDP-based protocols that RFC 8085 requires to follow the congestion control algos as described in RFC 2914 and RFC 5681. So, are you saying RFC 2914, RFC 8085 and RFC 9002 are just academic exercises?

What the IETF requires in RFCs and what sees deployment are two different things. These RFCs are meant to give implementors who may not be aware of the intricacies of CC some background and a solid foundation to implement.

> Moreover, my answer to why we see so little Reno CC traffic is very simple: people deployed CUBIC that is more aggressive than Reno CC, so it is an inherent outcome that hardly anyone is willing to run Reno CC when others are running a more aggressive CC algo that leaves little room for competing Reno CC.

CUBIC might be more aggressive than Reno, but it is not problematically so. And its slight increase in aggressiveness - without any apparent major issues - results in better application performance, which is why it is seeing deployment.

> First, if the CUBIC draft is published as it currently is that would give an IETF stamp and 'official' start for "a spiral of increasingly
> aggressive TCP implementations" that RFC 2914 appropriately warns about.

RFC 2914 was written at a time when the IETF had practically no participation from the engineers who implemented and shipped CC algorithms for the major stacks, and the need for proper CC was a lot less well and widely understood than it is now.

We are in a much different situation now, where hyperscalers and other massively deployed services pay extremely close attention to how well their content pipelines operate, and their engineers are participating in this group and the broader IETF.

There is an increasing desire to optimize CC, BBR being maybe the latest example, but at the same time there is also a huge awareness of the risks of being too aggressive, maybe more so now than at any time in the past. I don't think there is a risk of a CC spiral of death.

> The little I had time to follow L4S discussions in tsvwg people already insisted to compare L4S performance to CUBIC instead of Reno CC.

Of course they would - CUBIC is what runs on the Internet. If you want to compare yourself to the current status quo, that is your baseline.

> Second, by recognizing CUBIC as a standard as it is currently written would ensure that all issues that have been raised would get ignored and forgotten forever.

I don't see this risk at all. One motivation for publishing a bis version of RFC 8312 was to document the bug fixes that have occurred in deployments since RFC 8312 was published. Publishing the bis will not stop us from publishing future improvements.

> As I have tried to say, I do not care too much what would be the status of CUBIC when it gets published as long as we do not hide the obvious issues it has and we have a clear plan to ensure that all issues that have not been resolved by the time of publishing it will have a clear path and incentive to get fixed.

I'd like to point out that I see nobody else in the WG claiming that CUBIC has "obvious issues" or is a "flawed design". It's not perfect, but nothing ever is. CUBIC has been running the majority of the Internet traffic for the last decade, and the Internet seems to be doing OK.

We'll publish additional improvements to CUBIC when they are proposed, tested and have WG consensus.

> IMO that can be best achieved by publishing it as Experimental and documenting all unresolved issues in the draft.
> That approach would involve the incentive for all proponents to do whatever is needed (measurements, algo fixes/tuning) to solve the remaining issues and get it to stds track.

Please propose a short paragraph of text that outlines these "unresolved issues"; we can then see whether the WG has consensus for adding it to the draft.

> But let me ask a different question: what is gained and how does the community benefit from a std that is based on flawed design that does not behave as intended?

So even if CUBIC were a "flawed design that does not behave as intended", it seems in practice to perform pretty well without major issues, seems to deliver QoE improvements to the applications that run above it, and is ubiquitously deployed on the Internet.

Not publishing it on the standards track sends a pretty strong message to the implementer community that the IETF community is completely out of touch with deployed realities. This risks us not being taken seriously.

> Congestion control specifications are considered as having significant operational impact on the Internet similar to security mechanisms. Would you in IESG support publication of a security mechanism that is shown to not operate as intended?

Why do you believe CUBIC does not "operate as intended"?

What matters is whether a security or congestion control mechanism is fit for purpose and without major failure cases. I believe that is the case for CUBIC.

> Could we now finally focus on solving each of the remaining issues and discussing the way forward separately with each of them? Issue 3 a) has pretty much been solved already (thanks Neal), some text tweaking may still be needed.

As editors of a WG document, we'll incorporate changes as they gain WG consensus. There was a proposal (and support) to address one of your suggestions, and we merged Neal's PR. If and when that happens for other suggestions, we'll follow suit.

Thanks,
Lars