Re: [tcpm] draft-ietf-tcpm-prr-rfc6937bis-03: set cwnd to ssthresh exiting fast recovery?

Neal Cardwell <ncardwell@google.com> Wed, 09 August 2023 15:06 UTC

From: Neal Cardwell <ncardwell@google.com>
Date: Wed, 09 Aug 2023 11:05:50 -0400
Message-ID: <CADVnQynwqhG+RCsO5mHyg-9jTRVGfhqgNpN5nqe5kx52VD5cEQ@mail.gmail.com>
To: Yoshifumi Nishida <nsd.ietf@gmail.com>
Cc: Yuchung Cheng <ycheng@google.com>, tcpm <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/-HxjVruHTN4-O1CwJ_wv0Ytt_HE>
Subject: Re: [tcpm] draft-ietf-tcpm-prr-rfc6937bis-03: set cwnd to ssthresh exiting fast recovery?

Hi Yoshifumi,

You are correct that draft-ietf-tcpm-prr-rfc6937bis-04 does not
incorporate the suggestion in this thread to have a "cwnd = ssthresh" step
at the end of fast recovery. My sense was that this was because we had not
come to a conclusion / resolution of this question in this thread. :-)

I would still argue that it's important for PRR to set cwnd = ssthresh at
the end of recovery. Without setting cwnd = ssthresh at the end of
recovery, cwnd could end recovery far below ssthresh, leading to unusably
terrible performance; performance that would be far worse than RFC 6675
recovery (which simply sets cwnd = ssthresh at the start of recovery).
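
As a minimal sketch of this concern (not code from the draft or from any
implementation), the per-ACK steps of the April 19 example quoted further
down in this thread can be replayed in Python. The names PrrState,
prr_on_ack and run_thread_example are hypothetical, units are whole
packets, and pipe is passed in by hand rather than computed from a
scoreboard.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class PrrState:
    ssthresh: int           # target window from congestion control (packets)
    recover_fs: int         # RecoverFS: flight size at the start of recovery
    prr_delivered: int = 0  # packets delivered during recovery
    prr_out: int = 0        # packets sent during recovery
    cwnd: int = 0


def prr_on_ack(s: PrrState, delivered_data: int, pipe: int, safe_ack: bool) -> int:
    """Apply one per-ACK PRR update as written out in the thread; return sndcnt."""
    s.prr_delivered += delivered_data
    if pipe > s.ssthresh:
        # Proportional Rate Reduction
        sndcnt = ceil(s.prr_delivered * s.ssthresh / s.recover_fs) - s.prr_out
    else:
        # PRR-CRB by default
        sndcnt = max(s.prr_delivered - s.prr_out, delivered_data)
        if safe_ack:
            # PRR-SSRB when recovery is making good progress
            sndcnt += 1
        sndcnt = min(s.ssthresh - pipe, sndcnt)
    s.cwnd = pipe + sndcnt
    return sndcnt


def run_thread_example() -> tuple[int, int]:
    """Replay the example: cwnd = 10, P1 lost, P2..P10 SACKed, CUBIC ssthresh = 7."""
    s = PrrState(ssthresh=7, recover_fs=10)

    # ACK 1: P10 SACKed, P1 marked lost, snd.una did not advance.
    sndcnt = prr_on_ack(s, delivered_data=1, pipe=0, safe_ack=False)
    s.prr_out += sndcnt  # retransmit P1

    # ACK 2: the retransmit of P1 plugs the hole; cumulative ACK for P1..P10.
    prr_on_ack(s, delivered_data=1, pipe=0, safe_ack=True)
    cwnd_without_exit_step = s.cwnd

    # The step discussed in this thread: cwnd = ssthresh at the end of recovery.
    s.cwnd = s.ssthresh
    return cwnd_without_exit_step, s.cwnd


print(run_thread_example())  # (2, 7)
```

Under these assumptions recovery ends with cwnd = 2 unless the explicit
step is applied, in which case cwnd = 7.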

The Linux TCP PRR has had this cwnd = ssthresh step at the end of recovery
since the original PRR implementation in 2011:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a262f0cdf1f2916ea918dc329492abb5323d9a6c

best regards,
neal




On Wed, Aug 9, 2023 at 3:16 AM Yoshifumi Nishida <nsd.ietf@gmail.com> wrote:

> Hi Yuchung,
>
> Thanks for the response.
> I would just like to check one thing.
> In my understanding, Neal's suggestion here was to adjust cwnd to ssthresh
> at the end of recovery.
> But I cannot find a statement or logic for such an adjustment. Does this
> mean we decided there's no adjustment at the end of recovery? Or am I
> missing something?
>
> Thanks,
> --
> Yoshi
>
> On Tue, Aug 8, 2023 at 2:34 PM Yuchung Cheng <ycheng@google.com> wrote:
>
>> Hi Yoshifumi,
>>
>> That part is how the "RecoverFS" state variable is calculated in the
>> draft. See the 03/04 diff of Sections 5 and 6 for the definition and
>> computation of the "RecoverFS" state variable:
>> https://author-tools.ietf.org/iddiff?url2=draft-ietf-tcpm-prr-rfc6937bis-04
>>
>> Does that make sense?
>>
>> On Tue, Aug 8, 2023 at 12:01 AM Yoshifumi Nishida <nsd.ietf@gmail.com>
>> wrote:
>>
>>> Hi Yuchung,
>>>
>>> I think you have already updated the draft on the following point from
>>> the discussions in the last WG meeting.
>>> Could you point out which part has been updated? I'm just checking.
>>> Thanks,
>>> --
>>> Yoshi
>>>
>>> On Fri, May 5, 2023 at 11:51 AM Yoshifumi Nishida <nsd.ietf@gmail.com>
>>> wrote:
>>>
>>>> Hi Neal,
>>>>
>>>> Yes, I think I understand your point.
>>>> In some ways I prefer the current logic because it is more conservative:
>>>> I think we cannot always presume that the queue has been drained at the
>>>> end of recovery.
>>>> But I also think it may be too conservative.
>>>> I hope the authors can provide some insight on this point.
>>>> --
>>>> Yoshi
>>>>
>>>>
>>>> On Tue, May 2, 2023 at 11:31 AM Neal Cardwell <ncardwell@google.com>
>>>> wrote:
>>>>
>>>>> Hi Yoshi,
>>>>>
>>>>> You are right that because PRR always sets cwnd to ssthresh at the end
>>>>> of recovery, there will be some cases where with PRR cwnd jumps up
>>>>> drastically at the end of the recovery.
>>>>>
>>>>> However, AFAIK cwnd jumping up drastically, per se, is not a problem.
>>>>> Big bursts of packets going into the network are a problem. And given the
>>>>> dynamics of the alternative loss recovery algorithms (RFC6675 and PRR),
>>>>> both can allow bursts of packets; just in different circumstances:
>>>>>
>>>>> (1) RFC6675: Because RFC6675 sets cwnd once at the start of fast
>>>>> recovery, using (4.2) from RFC6675:
>>>>>
>>>>> ssthresh = cwnd = (FlightSize / 2)
>>>>>
>>>>> ...that means RFC6675 allows big bursts at the moment any loss is
>>>>> detected: any time L packets are lost, the sender can burst L more packets.
>>>>>
>>>>> (2) PRR: PRR is specifically designed to avoid big bursts in response
>>>>> to packet losses; no matter the structure or timing of the losses, PRR only
>>>>> allows a big burst at the end of Fast Recovery after all holes have been
>>>>> plugged, and the algorithm sets cwnd to ssthresh.
>>>>>
>>>>> So in your example ("For example, many packets were lost before
>>>>> entering recovery"), AFAICT RFC6675 can allow a big burst at the beginning
>>>>> of recovery, when the lost packets are detected. AFAICT in this case PRR
>>>>> can allow a burst of packets at the end of recovery when it sets cwnd to
>>>>> ssthresh, but at least at this point the bottleneck queue has potentially
>>>>> drained somewhat.
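
A rough Python sketch of this comparison, under simplifying assumptions:
units are whole packets, pipe counts only packets that are neither SACKed
nor marked lost, no retransmissions are in flight yet, and the first ACK of
recovery is not a safeACK. The function names are hypothetical, and the
numbers reuse the example quoted further down (FlightSize = 10, P1 lost,
P2..P10 SACKed, CUBIC ssthresh = 7).

```python
from math import ceil


def rfc6675_burst_on_entry(flight_size: int, sacked: int, lost: int) -> int:
    """Packets RFC 6675 may send right after entering recovery."""
    cwnd = flight_size // 2             # step (4.2): ssthresh = cwnd = FlightSize / 2
    pipe = flight_size - sacked - lost  # simplified pipe estimate
    return max(0, cwnd - pipe)          # send while cwnd - pipe >= 1 packet


def prr_burst_on_entry(ssthresh: int, recover_fs: int,
                       delivered_data: int, pipe: int) -> int:
    """sndcnt on the first ACK of recovery (prr_delivered = DeliveredData, prr_out = 0)."""
    if pipe > ssthresh:
        # Proportional Rate Reduction
        return ceil(delivered_data * ssthresh / recover_fs)
    # PRR-CRB, capped so that pipe does not exceed ssthresh
    return min(ssthresh - pipe, delivered_data)


def burst_at_exit(ssthresh: int, pipe: int) -> int:
    """Packets allowed right after recovery if cwnd is set to ssthresh."""
    return max(0, ssthresh - pipe)


print(rfc6675_burst_on_entry(10, sacked=9, lost=1))                     # 5
print(prr_burst_on_entry(7, recover_fs=10, delivered_data=1, pipe=0))   # 1
print(burst_at_exit(7, pipe=0))                                         # 7
```

In this model RFC 6675 permits its burst when the loss is detected, while
PRR permits one packet at that moment and defers any larger burst until
recovery ends.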
>>>>>
>>>>> Please let me know if that analysis misses something important. :-)
>>>>>
>>>>> Thanks!
>>>>> neal
>>>>>
>>>>>
>>>>> On Mon, May 1, 2023 at 5:22 PM Yoshifumi Nishida <nsd.ietf@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Randall,
>>>>>>
>>>>>> I might be missing something, but here's what I was thinking.
>>>>>> If we lose many packets in an RTT, as in Figure 5 of the 6937bis
>>>>>> draft, I think the window growth during the recovery period will be
>>>>>> bounded by PRR-CRB or PRR-SSRB.
>>>>>> Hence, I think the cwnd at the end of recovery can be smaller than we
>>>>>> expect, as shown in Figure 5.
>>>>>> --
>>>>>> Yoshi
>>>>>>
>>>>>> On Mon, May 1, 2023 at 4:17 AM Randall Stewart <rrs@netflix.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Neal and Yoshi:
>>>>>>>
>>>>>>> Neal: The FreeBSD implementation in RACK, like Linux, does exactly the
>>>>>>> same thing: it sets cwnd to ssthresh at exit from recovery.
>>>>>>>
>>>>>>> Yoshi: I don’t see how this would cause cwnd to be larger, since at
>>>>>>> the entry to recovery you set ssthresh = cwnd * Beta. But maybe I am
>>>>>>> missing something; can you give an example like Neal did below?
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> R
>>>>>>>
>>>>>>> On May 1, 2023, at 5:32 AM, Yoshifumi Nishida <nsd.ietf@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Neal,
>>>>>>>
>>>>>>> If we always set cwnd to ssthresh at the end of recovery, I am
>>>>>>> guessing there will be some cases where cwnd jumps up drastically at the
>>>>>>> end of the recovery. For example, many packets were lost before entering
>>>>>>> recovery.  Or, am I missing something?
>>>>>>> --
>>>>>>> Yoshi
>>>>>>>
>>>>>>> On Wed, Apr 19, 2023 at 7:37 PM Neal Cardwell <ncardwell=
>>>>>>> 40google.com@dmarc.ietf.org> wrote:
>>>>>>>
>>>>>>>> Working through examples for the "draft-ietf-tcpm-prr-rfc6937bis-03
>>>>>>>> and RecoverFS initialization" thread this evening, I ran into another
>>>>>>>> potential issue.
>>>>>>>>
>>>>>>>> The Linux TCP implementation of PRR explicitly/directly sets cwnd
>>>>>>>> to ssthresh at the end of fast recovery (in tcp_end_cwnd_reduction()). But
>>>>>>>> this behavior is not in the algorithm in the PRR RFC or draft, at least in
>>>>>>>> the figures in section 6, Algorithms. Maybe it is in the prose somewhere
>>>>>>>> and I missed it; but in that case I'd argue strongly to put this in the
>>>>>>>> figures in section 6, Algorithms.
>>>>>>>>
>>>>>>>> AFAICT in some cases this is strictly necessary to get cwnd to grow
>>>>>>>> to reach ssthresh. Without such a direct step, cwnd could end up far below
>>>>>>>> ssthresh at the end of recovery. Here's an example to illustrate:
>>>>>>>>
>>>>>>>> CC = CUBIC
>>>>>>>>
>>>>>>>> cwnd = 10
>>>>>>>>
>>>>>>>> The reordering degree was estimated to be large, so the connection
>>>>>>>> will wait for more than 3 packets to be SACKed before entering fast
>>>>>>>> recovery.
>>>>>>>>
>>>>>>>> --- Application writes 10*MSS.
>>>>>>>>
>>>>>>>> TCP sends packets P1 .. P10.
>>>>>>>> pipe = 10 packets in flight (P1 .. P10)
>>>>>>>>
>>>>>>>> --- P2..P9 SACKed  -> do nothing
>>>>>>>>
>>>>>>>> (Because the reordering degree was previously estimated to be
>>>>>>>> large.)
>>>>>>>>
>>>>>>>> --- P10 SACKed -> mark P1 as lost and enter fast recovery
>>>>>>>>
>>>>>>>> PRR:
>>>>>>>> ssthresh = CongCtrlAlg() = 7 packets // CUBIC
>>>>>>>> prr_delivered = 0
>>>>>>>> prr_out = 0
>>>>>>>> RecoverFS = snd.nxt - snd.una = 10 packets (P1..P10)
>>>>>>>>
>>>>>>>> DeliveredData = 1  (P10 was SACKed)
>>>>>>>>
>>>>>>>> prr_delivered += DeliveredData   ==> prr_delivered = 1
>>>>>>>>
>>>>>>>> pipe =  0  (all packets are SACKed or lost; P1 is lost, rest are
>>>>>>>> SACKed)
>>>>>>>>
>>>>>>>> safeACK = false (snd.una did not advance)
>>>>>>>>
>>>>>>>> if (pipe > ssthresh) => if (0 > 7) => false
>>>>>>>> else
>>>>>>>>   // PRR-CRB by default
>>>>>>>>   sndcnt = MAX(prr_delivered - prr_out, DeliveredData)
>>>>>>>>          = MAX(1 - 0, 1)
>>>>>>>>          = 1
>>>>>>>>
>>>>>>>>   sndcnt = MIN(ssthresh - pipe, sndcnt)
>>>>>>>>          = MIN(7 - 0, 1)
>>>>>>>>          = 1
>>>>>>>>
>>>>>>>> cwnd = pipe + sndcnt
>>>>>>>>      = 0    + 1
>>>>>>>>      = 1
>>>>>>>>
>>>>>>>> retransmit P1
>>>>>>>>
>>>>>>>> prr_out += 1   ==> prr_out = 1
>>>>>>>>
>>>>>>>> --- P1 retransmit plugs hole; receive cumulative ACK for P1..P10
>>>>>>>>
>>>>>>>> DeliveredData = 1  (P1 was newly ACKed)
>>>>>>>>
>>>>>>>> prr_delivered += DeliveredData   ==> prr_delivered = 2
>>>>>>>>
>>>>>>>> pipe =  0  (all packets are cumulatively ACKed)
>>>>>>>>
>>>>>>>> safeACK = (snd.una advances and no further loss indicated)
>>>>>>>> safeACK = true
>>>>>>>>
>>>>>>>> if (pipe > ssthresh) => if (0 > 7) => false
>>>>>>>> else
>>>>>>>>   // PRR-CRB by default
>>>>>>>>   sndcnt = MAX(prr_delivered - prr_out, DeliveredData)
>>>>>>>>          = MAX(2 - 1, 1)
>>>>>>>>          = 1
>>>>>>>>   if (safeACK) => true
>>>>>>>>     // PRR-SSRB when recovery is in good progress
>>>>>>>>     sndcnt += 1   ==> sndcnt = 2
>>>>>>>>
>>>>>>>>   sndcnt = MIN(ssthresh - pipe, sndcnt)
>>>>>>>>          = MIN(7 - 0, 2)
>>>>>>>>          = 2
>>>>>>>>
>>>>>>>> cwnd = pipe + sndcnt
>>>>>>>>      = 0    + 2
>>>>>>>>      = 2
>>>>>>>>
>>>>>>>> So we exit fast recovery with cwnd=2 even though ssthresh is 7.
>>>>>>>>
>>>>>>>> As noted above, the Linux TCP implementation does not suffer this
>>>>>>>> problem because it explicitly/directly sets cwnd to ssthresh at the end of
>>>>>>>> fast recovery.
>>>>>>>>
>>>>>>>> I would recommend including this cwnd=ssthresh step at the end of
>>>>>>>> recovery in the draft, to ensure that cwnd reaches ssthresh at the end of
>>>>>>>> fast recovery, even in cases like this where there will be insufficient
>>>>>>>> delivered data in fast recovery to allow pipe to incrementally grow to
>>>>>>>> reach ssthresh using PRR-SSRB.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> neal
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> tcpm mailing list
>>>>>>>> tcpm@ietf.org
>>>>>>>> https://www.ietf.org/mailman/listinfo/tcpm
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ------
>>>>>>> Randall Stewart
>>>>>>> rrs@netflix.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>