From nobody Tue Jul 12 15:08:37 2022
Content-type: text/plain; charset=utf-8
MIME-version: 1.0 (Mac OS X Mail 16.0 \(3726.0.9.1.22\))
From: Vidhi Goel <vidhi_goel@apple.com>
In-reply-to: <alpine.DEB.2.21.2207112144430.7292@hp8x-60.cs.helsinki.fi>
Date: Tue, 12 Jul 2022 15:08:10 -0700
Cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>, Lisong Xu <xu@unl.edu>
Content-transfer-encoding: quoted-printable
Message-id: <7CF26B3A-D6C3-48F6-AA82-424231DD95D4@apple.com>
References: <alpine.DEB.2.21.2206141500480.7292@hp8x-60.cs.helsinki.fi>
 <alpine.DEB.2.21.2207112144430.7292@hp8x-60.cs.helsinki.fi>
To: Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org>,
 Yoshifumi Nishida <nsd.ietf@gmail.com>
X-Mailer: Apple Mail (2.3726.0.9.1.22)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/VWEeP0tZLCGY1BP2nL5JP47x-hM>
Subject: Re: [tcpm] CUBIC rfc8312bis / WGLC Issue 2
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>

Hi Markku,

I discussed this with the other co-authors. We think that this change
is completely untested for Cubic and that it could be considered for a
future version of Cubic, not the current rfc8312bis.
To change Beta from 0.7 to 0.5 during slow start, we would need at least
some experience from either lab testing or deployment, since all current
deployments of Cubic for both TCP and QUIC use 0.7 as Beta during slow
start. Since many implementations currently use hystart(++) along with
Cubic, we don’t see a high risk of an overaggressive sending rate, and
that is what the current rfc8312bis suggests as well. In fact, even with
Beta changed from 0.7 to 0.5, a sender can still be aggressive if it
does not use hystart.

Thanks,
Vidhi

> On Jul 11, 2022, at 5:55 PM, Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org> wrote:
>
> Hi all,
>
> below please find proposed text to solve Issue 2 a). I will propose
> text to solve 2 b) once we have come to a conclusion on 2 a). For the
> description of and arguments for issues 2 a) and 2 b), please see the
> original issue descriptions below.
>
> Sec 4.6. Multiplicative Decrease
>
> Old:
>   The parameter Beta__cubic_ SHOULD be set to 0.7, which is different
>   from the multiplicative decrease factor used in [RFC5681] (and
>   [RFC6675]) during fast recovery.
>
>
> New:
>   If the sender is not in slow start when the congestion event is
>   detected, the parameter Beta__cubic_ SHOULD be set to 0.7, which
>   is different from the multiplicative decrease factor used in
>   [RFC5681] (and [RFC6675]).
>   This change is justified in the Reno-friendly region during
>   congestion avoidance because a CUBIC sender compensates for its
>   multiplicative decrease factor being higher than that of Reno by
>   applying a lower additive increase factor during congestion
>   avoidance.
>
>   However, if the sender is in slow start when the congestion event is
>   detected, the parameter Beta__cubic_ MUST be set to 0.5 [Jacob88].
>   This results in the sender continuing to transmit data at the
>   maximum rate that the slow start determined to be available for the
>   flow. Using Beta__cubic_ with a value larger than 0.5 when the
>   congestion event is detected in slow start would result in an
>   overaggressive send rate where the sender injects excess packets
>   into the network and each such packet is guaranteed to be dropped or
>   to force a packet from a competing flow to be dropped at a tail-drop
>   bottleneck router.
>   Furthermore, injecting such undelivered packets creates a danger of
>   congestion collapse (of some degree) "by delivering packets through
>   the network that are dropped before reaching their ultimate
>   destination." [RFC 2914]
>
>
>   [Jacob88] V. Jacobson, Congestion avoidance and control, SIGCOMM '88.
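
As a rough illustration of the rule in the proposed text above, the
following Python sketch selects the multiplicative decrease factor
depending on whether the sender is in slow start. The function and
constant names are hypothetical, the slow-start test (cwnd < ssthresh)
follows the RFC 5681 convention, the 2-segment floor on ssthresh is an
assumption, and the additive increase formula 3*(1-Beta)/(1+Beta) is
the Reno-friendly factor from RFC 8312. It is a sketch under these
assumptions, not text from the draft or from any implementation.

```python
# Hypothetical names; values taken from the proposed text quoted above.
BETA_CONGESTION_AVOIDANCE = 0.7  # proposed SHOULD outside slow start
BETA_SLOW_START = 0.5            # proposed MUST when loss hits slow start


def select_beta(cwnd: float, ssthresh: float) -> float:
    """Pick the multiplicative decrease factor for a congestion event."""
    in_slow_start = cwnd < ssthresh  # RFC 5681 convention (assumption here)
    return BETA_SLOW_START if in_slow_start else BETA_CONGESTION_AVOIDANCE


def ssthresh_after_congestion(cwnd: float, ssthresh: float) -> float:
    """Return the new ssthresh (in segments) after a congestion event."""
    beta = select_beta(cwnd, ssthresh)
    return max(cwnd * beta, 2.0)  # assumed floor of 2 segments


def reno_friendly_alpha(beta: float) -> float:
    """Additive increase per RTT that compensates for a given beta."""
    return 3.0 * (1.0 - beta) / (1.0 + beta)


if __name__ == "__main__":
    # Loss during the initial slow start (ssthresh still unbounded).
    print(ssthresh_after_congestion(200.0, float("inf")))
    # Loss during congestion avoidance (cwnd at or above ssthresh).
    print(ssthresh_after_congestion(200.0, 150.0))
    print(reno_friendly_alpha(0.7), reno_friendly_alpha(0.5))
```

With Beta = 0.7 the Reno-friendly additive increase comes out at
roughly 0.53 segments per RTT, and with Beta = 0.5 it is exactly 1,
which matches Reno.
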
>
> Thanks,
>
> /Markku
>
> On Tue, 14 Jun 2022, Markku Kojo wrote:
>
>> Hi all,
>>
>> this thread starts the discussion on issue 2: CUBIC is specified to
>> use an incorrect multiplicative-decrease factor for a congestion event
>> that occurs when operating in slow start. Applying HyStart++ does not
>> remove the problem; it only mitigates it in some percentage of cases.
>>
>> I think it is useful to discuss this in two phases: 2 a) and 2 b)
>> below.
>> For anyone commenting/arguing on part 2 b), it is important to first
>> acknowledge whether (s)he thinks the original design and logic by Van
>> Jacobson is correct. If not, one should explain why Van's design logic
>> is incorrect.
>>
>> Issue 2 a)
>> ----------
>>
>> To begin with, let's put aside a potential use of HyStart++ (also
>> assume a tail-drop router unless otherwise mentioned).
>>
>> The use of an MD factor larger than 0.5 is against the theory and
>> original design by Van Jacobson as explained in the congavoid paper
>> [Jacob88]. Any MD factor value larger than 0.5 will result in sending
>> extra packets during Fast Recovery following the congestion event
>> (drop). All extra packets will be dropped at a tail-drop bottleneck
>> (if it is the only flow).
>>
>> Note that at the time when the drop is signalled at the TCP sender,
>> the size of the cwnd is double the available network capacity that
>> slow start determined for the flow. That is, using MD=0.5 is already
>> as aggressive as possible, leaving no slack. Therefore, if MD=0.7 is
>> used, the TCP sender enters fast recovery with a cwnd that is 40%
>> larger than the determined network capacity and all excess packets are
>> guaranteed to be dropped, or even worse, the excess packets are likely
>> to force packets of any competing flows to be unfairly dropped.
>>
>> Moreover, if NewReno loss recovery is in use, a CUBIC sender will
>> operate overaggressively for a very long time. For example, if the
>> available network capacity for the flow is 100 packets, cwnd will have
>> value 200 when the congestion is signalled and the CUBIC sender enters
>> fast recovery with cwnd=140 and injects 40 excess packets for each of
>> the subsequent 100 RTTs it stays in fast recovery, forcing 4000
>> packets to be inevitably and totally unnecessarily dropped.
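
The arithmetic in the example above can be checked with a short Python
sketch. The function name and parameters are hypothetical. It assumes,
as the quoted text does, that cwnd at the time of the loss signal is
twice the available capacity and that the sender stays in fast recovery
for a given number of RTTs, so it reproduces the example rather than
modelling a real stack.

```python
def excess_packets(capacity_pkts: int, beta: float, recovery_rtts: int) -> int:
    """Packets sent beyond the path capacity over one fast recovery period."""
    cwnd_at_loss = 2 * capacity_pkts               # slow start overshoot (assumed 2x)
    cwnd_in_recovery = round(beta * cwnd_at_loss)  # window after the decrease
    excess_per_rtt = max(0, cwnd_in_recovery - capacity_pkts)
    return excess_per_rtt * recovery_rtts


if __name__ == "__main__":
    # Numbers from the example: capacity 100 packets, 100 RTTs in recovery.
    print(excess_packets(100, 0.7, 100))
    print(excess_packets(100, 0.5, 100))
```

Under these assumptions Beta = 0.7 gives 40 excess packets per RTT and
4000 in total, while Beta = 0.5 gives none.
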
>>
>> Even worse, this behaviour of sending 'undelivered packets' is against
>> the congestion control principles as it creates a danger of congestion
>> collapse (of some degree) "by delivering packets through the network
>> that are dropped before reaching their ultimate destination."
>> [RFC 2914]
>>
>> Such undelivered packets unnecessarily eat capacity from other flows
>> sharing the path before the bottleneck.
>>
>> RFC 2914 emphasises:
>>
>> "This is probably the largest unresolved danger with respect to
>> congestion collapse in the Internet today."
>>
>> It is very easy to envision a realistic network setup where this
>> creates a degree of congestion collapse where a notable portion of
>> useful network capacity is wasted due to the undelivered packets.
>>
>>
>> [Jacob88] V. Jacobson, Congestion avoidance and control, SIGCOMM '88.
>>
>>
>> Issue 2 b)
>> ----------
>>
>> The CUBIC draft suggests that HyStart++ should be used *everywhere*
>> instead of the traditional Slow Start (see section 4.10).
>>
>> Although the draft does not say it, the authors seemingly suggest
>> using HyStart++ instead of traditional Slow Start in order to avoid
>> the problem of over-aggressive behaviour discussed above. This,
>> however, has several issues.
>>
>> First, it is directly in conflict with the HyStart++ specification,
>> which says that HyStart++ should be used only for the initial Slow
>> Start. However, the overaggressive behaviour after slow start is also
>> a potential problem with slow start during an RTO recovery; in case of
>> sudden congestion that reduces the available capacity for a flow down
>> to a fraction of the currently available capacity, it is very likely
>> that an RTO occurs. In such a case the RTO recovery in slow start
>> inevitably overshoots and it is crucial for all flows not to be
>> overaggressive.
>>
>> Second, the experimental results for initial slow start in the
>> HyStart++ draft suggest that while HyStart++ achieves good results,
>> it is unable to exit slow start early and avoid overshoot in a
>> significant percentage of cases.
>>
>> Given the above issues, the CUBIC draft must require that an MD of 0.5
>> is used when the congestion event occurs while the sender is (still)
>> in slow start. The use of an MD other than 0.5 is an obvious stumble
>> in the original CUBIC and the original CUBIC authors have already
>> acknowledged this. It also seems obvious that instead of correcting
>> the actual problem (use of MD other than 0.5), HyStart and HyStart++
>> have been proposed to address the design mistake. While HyStart++ is a
>> useful method also when used with MD=0.5, when used alone it only
>> mitigates the impact of the actual problem rather than solving it.
>>
>> What should be done for the cases where HyStart++ exits slow start but
>> is not able to avoid (some level of) overshoot and dropped packets is
>> IMO an open issue. Resolving it requires additional experiments and it
>> should be resolved separately when we have more data. For now, when we
>> do not have enough data and understanding of the behaviour, we should
>> IMO follow the general IETF guideline "be conservative in what you
>> send" and specify that MD = 0.5 should be used for a congestion event
>> that occurs for a packet sent in slow start.
>>
>> Thanks,
>>
>> /Markku
>>
>
> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm

