Re: [tcpm] Hystart survey of large server operators

Yuchung Cheng <ycheng@google.com> Thu, 29 July 2021 16:40 UTC

From: Yuchung Cheng <ycheng@google.com>
Date: Thu, 29 Jul 2021 09:39:40 -0700
Message-ID: <CAK6E8=drA5B+giL9uKh8ZbYwbex26k6_d3iOGj5TO5tn48+=Hw@mail.gmail.com>
To: Bob Briscoe <ietf@bobbriscoe.net>
Cc: huitema@huitema.net, "tcpm@ietf.org" <tcpm@ietf.org>, "draft-ietf-tcpm-hystartplusplus@ietf.org" <draft-ietf-tcpm-hystartplusplus@ietf.org>, Neal Cardwell <ncardwell@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/o5ZhOvxtiMwb-LfQbB5UpW8dGL0>
Subject: Re: [tcpm] Hystart survey of large server operators

On Thu, Jul 29, 2021 at 9:28 AM Bob Briscoe <ietf@bobbriscoe.net> wrote:

> Yuchung,
>
> On 29/07/2021 00:35, Yuchung Cheng wrote:
>
> Same experience as Christian.
>
> For more than half a decade Google traffic used Linux CUBIC Hystart. This
> includes google.com and YouTube public Internet traffic, as well as
> internal Google TCP traffic. When we deployed pacing around 2013 we
> disabled the Hystart ACK-train mechanism (which Hystart++ now prohibits)
> because it caused false slow-start (SS) exits. But we continued to use the
> stock Hystart Delay mechanism. Around 2014-15 we switched to BBR, which
> uses a different startup that is more robust to delay jitter.
>
>
> [BB] You or Neal (sry can't remember which) were one of those I surveyed
> who said they had disabled Hystart. So perhaps this was a misunderstanding
> over the ACK-train part.
>
Yes, this must be a miscommunication. Since that was in 2018, it was either
"we don't use hystart b/c BBR doesn't use hystart" or "we didn't use hystart
ack-train when hystart was used in Cubic before BBR".
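
To make the ack-train vs. delay distinction concrete, here is a rough sketch of how the two mechanisms are selected in stock Linux CUBIC. It assumes the tcp_cubic `hystart_detect` module parameter, a bitmask where 1 selects the ACK-train check and 2 the delay check, and it assumes the usual sysfs path for module parameters. The helper names are made up for illustration; this is not a description of any particular operator's configuration.

```python
from __future__ import annotations

from pathlib import Path

# Bit values as defined for hystart_detect in Linux net/ipv4/tcp_cubic.c.
HYSTART_ACK_TRAIN = 0x1
HYSTART_DELAY = 0x2

# Assumed sysfs location of the module parameter (only present when the
# tcp_cubic module exposes its parameters on the running kernel).
PARAM_PATH = Path("/sys/module/tcp_cubic/parameters/hystart_detect")


def decode_hystart_detect(mask: int) -> list[str]:
    """Return the names of the Hystart detection mechanisms set in mask."""
    names = []
    if mask & HYSTART_ACK_TRAIN:
        names.append("ack-train")
    if mask & HYSTART_DELAY:
        names.append("delay")
    return names


def read_hystart_detect(path: Path = PARAM_PATH) -> list[str]:
    """Read the current bitmask from sysfs and decode it (hypothetical helper)."""
    return decode_hystart_detect(int(path.read_text().strip()))
```

Under these assumptions, a value of 3 means both checks are active, and a value of 2 would leave only the delay check, i.e. Hystart without the ack-train part.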



> But IMO the simpler hystart that mitigates SS overshoot is still highly
> valuable for Cubic and other congestion controls, as Internet bandwidth
> continues to grow.
>
> Did those providers disable hystart b/c of the poor interaction between
> its ack-train mode and pacing?
>
>
> [BB] I didn't ask them why.
>
>
>
> Bob
>
> Maybe it's time to disable the ack-train approach in the upstream
> Linux when hystart++ is standardized.
>
> On Wed, Jul 28, 2021 at 9:47 AM Christian Huitema <huitema@huitema.net>
> wrote:
>
>> I certainly do not qualify as a "large operator", but when implementing
>> Cubic for QUIC I found that Hystart was critical for performance.
>> Specifically, the classical slow-start is very prone to overshooting the
>> capacity of the path, which causes large batches of errors. With web-like
>> traffic, these batches of errors cause increased latency for the first
>> transactions in a session, and thus a drop in perceived quality. Hystart
>> solves that. With Hystart, I observed that a large fraction of sessions do
>> not experience any packet loss at all.
>>
>> As an aside, we should strive for this suppression of packet losses.
>> That's actually a big reason for moving from Cubic to BBR. Hystart
>> suppresses the losses during the initial phase of the connection, but Cubic
>> still relies on periodically testing the limit of path capacity and causing
>> losses during the subsequent phase. BBR's probing for bottleneck bandwidth
>> is much more conservative and does not cause such losses. It might be possible
>> to adapt Cubic to also not cause losses, for example by ending an epoch
>> early if too many CE marks are received or if the RTT increases. That would
>> be worth trying.
>>
>> -- Christian Huitema
>> On 7/28/2021 3:19 AM, Bob Briscoe wrote:
>>
>> Yuchung,
>>
>> It was during a Mar 2018 ad hoc workshop Jana had organized at the London
>> IETF entitled 'BBR and the intersection with other work'. I can't remember
>> why I needed to know at the time, but during the break I approached the
>> major server operators individually, established whether they used Cubic,
>> and if so asked whether they used Hystart or disabled it. It's hard to
>> anonymize the results, because IMMSMC all those that used Cubic said they
>> disabled Hystart.
>>
>> If this isn't correct, then maybe people who replied at the time thought
>> they were being asked something else. Or maybe it was correct then but
>> isn't now.
>>
>>
>> Bob
>>
>>
>> On 28/07/2021 00:46, Yuchung Cheng wrote:
>>
>> Wait -- how is hystart "invariably disabled" in Linux (cubic)!?
>>
>> What data indicates that?
>>
>> On Tue, Jul 27, 2021 at 4:37 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>
>>     Any large server operators out there who are using Cubic:
>>     If you're willing to state whether or not your operations disable
>>     Hystart, pls do so in reply.
>>     Then Praveen can cite this mailing list thread in the Hystart++
>>     draft, as requested below.
>>
>>     If you are willing to reply privately, I would be willing to keep
>>     your confidences, and provide an anonymized result to the list.
>>
>>     Cheers
>>
>>
>>
>>     Bob
>>
>>     On 27/07/2021 01:50, Praveen Balasubramanian wrote:
>>
>>
>>     Although Hystart is default enabled in Linux, it is invariably
>>     disabled. So, it's misleading to just say Hystart is default
>>     enabled, which implies it's widely used, when people clearly find
>>     it has problems (which motivates Hystart++). I found this out
>>     through an informal survey I did at the Mar'18 IETF in London by
>>     asking round the implementers of the most prevalent stacks (I
>>     would name names if I could find the note I later sent to someone
>>     or to some list, but I can't find it - sry).
>>
>>
>>     I'd like some citations on this versus anecdata if possible
>>     before I add that caveat to the text. Do large deployments
>>     disable this? I haven't come across this suggestion in any
>>     Linux tuning guides to date.
>>
>>
>>     --
>> ________________________________________________________________
>>     Bob Briscoe                               http://bobbriscoe.net/
>>
>>     _______________________________________________
>>     tcpm mailing list
>>     tcpm@ietf.org
>>     https://www.ietf.org/mailman/listinfo/tcpm
>>
>>
>>
>>
>>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>
>