Re: [tcpm] Proceeding CUBIC draft - thoughts and late follow-up

Ian Swett <ianswett@google.com> Wed, 22 June 2022 13:05 UTC

From: Ian Swett <ianswett@google.com>
Date: Wed, 22 Jun 2022 09:05:29 -0400
Message-ID: <CAKcm_gM64RYc2hj0ZwMLfrmW2BFt4+SrCwZ2yROYNzjYv2kSiQ@mail.gmail.com>
To: Lars Eggert <lars@eggert.org>
Cc: Markku Kojo <kojo@cs.helsinki.fi>, Gorry Fairhust <gorry@erg.abdn.ac.uk>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>, tcpm-chairs <tcpm-chairs@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/WWNx4tGwwDIdPLoGdocLfNdiJQ4>
Subject: Re: [tcpm] Proceeding CUBIC draft - thoughts and late follow-up

On Wed, Jun 22, 2022 at 2:31 AM Lars Eggert <lars@eggert.org> wrote:

> Hi,
>
> On 2022-6-21, at 3:06, Markku Kojo <kojo@cs.helsinki.fi> wrote:
> > To my understanding we have quite a bit of QUIC traffic for which RFC 9002
> > has just been published and it follows Reno CC quite closely with some
> > exceptions.
>
> See Vidhi's message on the differences between Reno and RFC 9002.
>
> Also, my understanding is that the most widely deployed QUIC stacks in
> production actually use CUBIC or BBR v1 or v2 and not RFC9002.
>

(chair hat off)
Google uses BBRv1 on most internet-facing server connections, for both QUIC
and TCP, and Cubic in QUIC clients.  We have active experiments with BBRv2,
but nothing with Reno.

As a practical note, the vast majority of these flows never operate in the
region of traditional fairness analyses where multiple flows are attempting
to fully utilize the link for many (>10) RTTs in a row.  The majority never
exit slow start and those that do are frequently app-limited.
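
To make the slow-start point concrete, here is a rough sketch rather than a
measurement. It assumes an initial window of 10 segments, a 1460-byte MSS,
a window that doubles every RTT, and no loss or application limit; the
function name and the example transfer sizes are made up for illustration.

```python
def slow_start_rtts(transfer_bytes, init_cwnd_segments=10, mss=1460):
    """Count the RTTs an idealized slow start needs to send transfer_bytes.

    Assumes cwnd doubles every RTT with no loss and no receive-window or
    application limit. The defaults are illustrative assumptions.
    """
    cwnd = init_cwnd_segments * mss  # bytes sendable in the first RTT
    sent = 0
    rtts = 0
    while sent < transfer_bytes:
        sent += cwnd   # send one full window this round trip
        cwnd *= 2      # slow start: window doubles each RTT
        rtts += 1
    return rtts


for size in (15_000, 100_000, 1_000_000):
    print(size, slow_start_rtts(size))
```

Under these assumptions even a 1 MB response finishes in 7 RTTs, so a flow
of that size ends before it could spend 10 or more RTTs competing in
congestion avoidance.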


> > We have also some SCTP traffic that follows very closely Reno CC
>
> The SCTP used for WebRTC in production (Webex, Zoom, etc.) is AFAIK not
> using Reno (or CUBIC, or RMCAT).
>
> > and numerous proprietary UDP-based protocols that RFC 8085 requires to
> > follow the congestion control algos as described in RFC 2914 and RFC 5681.
> > So, are you saying RFC 2914, RFC 8085 and RFC 9002 are just academic
> > exercises?
>
> What the IETF requires in RFCs and what sees deployment are two different
> things. These RFCs are meant to give implementors who may not be aware of
> the intricacies of CC some background and a solid foundation to implement.
>
> > Moreover, my answer to why we see so little Reno CC traffic is very
> > simple: people deployed CUBIC, which is more aggressive than Reno CC, so it
> > is an inherent outcome that hardly anyone is willing to run Reno CC when
> > others are running a more aggressive CC algo that leaves little room for
> > competing Reno CC.
>
> CUBIC might be more aggressive than Reno, but it is not problematically
> so. And its slight increase in aggressiveness - w/o any apparent major
> issues - results in better application performance, which is why it is
> seeing deployment.
>
> > First, if the CUBIC draft is published as it currently is that would
> > give an IETF stamp and 'official' start for "a spiral of increasingly
> > aggressive TCP implementations" that RFC 2914 appropriately warns about.
>
> RFC 2914 was written at a time when the IETF had practically no
> participation from the engineers who implemented and shipped CC algorithms
> for the major stacks, and the need for proper CC was a lot less well and
> widely understood than it is now.
>
> We are in a much different situation now, where hyperscaler and other
> massively deployed services pay extremely close attention to how well their
> content pipeline operates, and whose engineers are participating in this
> group and the broader IETF.
>
> There is an increasing desire to optimize CC, BBR being maybe the latest
> example, but at the same time there is also a huge awareness of the risks
> of being too aggressive, maybe more so now than at any time in the past. I
> don't think there is a risk of a CC spiral of death.
>
> > In the little time I had to follow the L4S discussions in tsvwg, people
> > already insisted on comparing L4S performance to CUBIC instead of Reno CC.
>
> Of course they would - CUBIC is what runs on the Internet. If you want to
> compare yourself to the current status quo, that is your baseline.
>
> > Second, recognizing CUBIC as a standard as it is currently written
> > would ensure that all issues that have been raised would get ignored and
> > forgotten forever.
>
> I don't see this risk at all. One motivation for publishing a bis version
> of RFC 8312 was to document the bug fixes that have occurred in deployments
> since RFC 8312 was published. Publishing the bis will not stop us from
> publishing future improvements.
>
> > As I have tried to say, I do not care too much what would be the status
> > of CUBIC when it gets published as long as we do not hide the obvious
> > issues it has and we have a clear plan to ensure that all issues that have
> > not been resolved by the time of publishing it will have a clear path and
> > incentive to get fixed.
>
> I'd like to point out that I see nobody else in the WG claiming that CUBIC
> has "obvious issues" or is a "flawed design". It's not perfect, but nothing
> ever is. CUBIC has been running the majority of the Internet traffic for
> the last decade, and the Internet seems to be doing OK.
>
> We'll publish additional improvements to CUBIC when they are proposed,
> tested and have WG consensus.
>
> > IMO that can be best achieved by publishing it as Experimental and
> > documenting all unresolved issues in the draft.
> > That approach would give all proponents the incentive to do
> > whatever is needed (measurements, algo fixes/tuning) to solve the remaining
> > issues and get it to stds track.
>
> Please propose a short paragraph of text that outlines these "unresolved
> issues"; we can then see whether the WG has consensus for adding it to
> the draft.
>
> > But let me ask a different question: what is gained and how does the
> > community benefit from a std that is based on a flawed design that does not
> > behave as intended?
>
> So even if CUBIC was a "flawed design that does not behave as intended",
> it seems in practice to perform pretty well without major issues, seems to
> deliver QoE improvements to the applications that run above it, and is
> ubiquitously deployed on the Internet.
>
> Not publishing it on the standards track sends a pretty strong message to
> the implementer community that the IETF community is completely out of
> touch with deployed realities. This risks us not being taken seriously.
>
> > Congestion control specifications are considered as having significant
> > operational impact on the Internet similar to security mechanisms. Would
> > you in IESG support publication of a security mechanism that is shown to
> > not operate as intended?
>
> Why do you believe CUBIC does not "operate as intended"?
>
> What matters is whether a security or congestion control mechanism is fit
> for purpose and without major failure cases. I believe that is the case for
> CUBIC.
>
> > Could we now finally focus on solving each of the remaining issues and
> > discussing the way forward separately with each of them? Issue 3 a) has
> > pretty much been solved already (thanks Neal), some text tweaking may still
> > be needed.
>
> As editors of a WG document, we'll incorporate changes as they gain WG
> consensus. There was a proposal (and support) to address one of your
> suggestions, and we merged Neal's PR. If and when that happens for other
> suggestions, we'll follow suit.
>
> Thanks,
> Lars
>
>