Re: [tcpm] draft-ietf-tcpm-prr-rfc6937bis-03 and RecoverFS initialization

Yoshifumi Nishida <nsd.ietf@gmail.com> Mon, 22 May 2023 06:34 UTC

From: Yoshifumi Nishida <nsd.ietf@gmail.com>
Date: Sun, 21 May 2023 23:34:16 -0700
To: Neal Cardwell <ncardwell=40google.com@dmarc.ietf.org>
Cc: Yuchung Cheng <ycheng@google.com>, Matt Mathis <mattmathis@measurementlab.net>, tcpm <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/x8AMmLMgIEJKJKlLIBVzPws_iWQ>
Subject: Re: [tcpm] draft-ietf-tcpm-prr-rfc6937bis-03 and RecoverFS initialization

Hello,

Just in case, since this discussion has been quiet for a while:
I personally think what Neal suggests makes sense, although I'm not
entirely sure which approach is better.
I hope this point will be addressed in the updated version of the draft.
--
Yoshi

On Wed, Apr 19, 2023 at 7:20 PM Neal Cardwell
<ncardwell=40google.com@dmarc.ietf.org> wrote:

>
>
> On Tue, Apr 18, 2023 at 7:35 PM Yuchung Cheng <ycheng@google.com> wrote:
>
>>
>>
>> On Mon, Apr 17, 2023 at 2:00 PM Neal Cardwell <ncardwell@google.com>
>> wrote:
>>
>>>
>>>
>>> On Mon, Apr 17, 2023 at 4:13 PM Yuchung Cheng <ycheng@google.com> wrote:
>>>
>>>> Hi Neal,
>>>>
>>>> That's a good point, and it was considered in the early stages of PRR.
>>>> We picked FlightSize (= snd.nxt - snd.una) to ensure ssthresh/RecoverFS
>>>> faithfully reflects the proportion of the congestion control window
>>>> reduction: RFC 5681 still uses FlightSize to compute ssthresh. But some
>>>> TCP implementations or specific congestion controls may use either cwnd
>>>> (e.g. Linux cubic/reno) or pipe. How about a small paragraph:
>>>>
>>>> "If a TCP or congestion control implementation uses cwnd or pipe
>>>> instead of FlightSize to compute ssthresh, then RecoverFS should use
>>>> that same metric accordingly, i.e. cwnd right before recovery."
>>>>
>>>
>>> AFAICT that analysis is conflating two different issues:
>>>
>>> (1) How does the congestion control compute ssthresh (based on cwnd,
>>> pipe, or FlightSize)? You rightly point out that approaches vary for
>>> this part.
>>>
>>> (2) How does PRR determine what fraction of outstanding packets have
>>> been delivered (aka prr_delivered / RecoverFS)?
>>>
>>> AFAICT to get the right answer for the (2) question, RecoverFS should be
>>> initialized to "pipe", no matter what approach the CC takes for answering
>>> (1).
>>>
>>> My understanding of PRR is that if (pipe > ssthresh) is true, then the
>>> algorithm is doing Proportional Rate Reduction, and is essentially
>>> computing:
>>>
>>>   sndcnt ~= (target data sent in recovery)          - (actual data sent in recovery)
>>>   sndcnt ~= (fraction of data delivered) * ssthresh - prr_out
>>>   sndcnt ~= (prr_delivered / RecoverFS)  * ssthresh - prr_out
>>>
>>> For the (target data sent in recovery) to equal ssthresh at the end of
>>> the first round in recovery, the algorithm must reach the point where
>>> prr_delivered == RecoverFS, so that (prr_delivered / RecoverFS) is 1.
>>> Since prr_delivered can only grow as high as the value of "pipe" at the
>>> start of recovery, to be able to reach that condition we need RecoverFS
>>> == "pipe". If RecoverFS is (snd.nxt - snd.una), then RecoverFS is too
>>> big: prr_delivered won't be able to reach RecoverFS, the (target data
>>> sent in recovery) won't reach ssthresh, and the algorithm will
>>> undershoot (the cwnd won't reach the ssthresh specified by congestion
>>> control, however it calculated that).
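>>>
>>> To make that concrete, here is a rough Python sketch of just that
>>> branch (the names are mine; the pipe <= ssthresh branch, PRR-CRB/SSRB,
>>> is omitted):
>>>
>>>     import math
>>>
>>>     def prr_sndcnt(prr_delivered, prr_out, recover_fs, ssthresh, pipe):
>>>         # Only the Proportional Rate Reduction branch is sketched here.
>>>         assert pipe > ssthresh
>>>         # Target amount sent so far in recovery, scaled by the fraction
>>>         # of RecoverFS that has been delivered since recovery began.
>>>         target_sent_so_far = math.ceil(prr_delivered * ssthresh / recover_fs)
>>>         # sndcnt = target sent so far - actually sent so far (prr_out)
>>>         return max(target_sent_so_far - prr_out, 0)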
>>>
>> I still can't parse your analysis after reading it multiple times.
>>
>> "prr_delivered can only reach as high as "pipe" at the start of recovery"
>> --> prr_delivered is initiated to 0 at the start of the recovery?
>> "If RecoverFS is (snd.nxt - snd.una) then RecoverFS is too big, and
>> prr_delivered won't be able to match RecoverFS" --> why is RecoverFS too
>> big and prr_delivered won't reach it.
>>
>> I am not saying RecoverFS initiated to "pipe" is wrong. I just don't see
>> a substantial difference between FlightSize vs pipe, unless the
>> FlightSize/cwnd is small and/or limited-transmits were not used.
>>
>> maybe you can walk an example with FlightSize vs pipe...
>>
>
> Discussing a concrete example is a good idea!
>
> Here's an example, sketching the behavior with
> draft-ietf-tcpm-prr-rfc6937bis-03, AFAICT from trying to execute the
> example by hand:
>
> CC = Reno
>
> cwnd = 100 packets
>
> The application writes 100*MSS.
>
> TCP sends 100 packets.
>
> In this example, to make the effects more clear, the TCP sender has
> detected reordering with RACK-TLP or some other technique, so does not
> enter fast recovery on the third SACKed packet, but rather waits a while to
> accumulate more SACKs.
>
> From the flight of 100 packets, 1 packet is lost (P1), and 24 packets are
> SACKed (packets P2..P25).
>
> We enter fast recovery with PRR.
>
> RecoverFS = snd.nxt - snd.una = 100
>
> ssthresh = cwnd / 2 = 50  (Reno)
>
> pipe = snd.nxt - snd.una - (lost + SACKed) = 100 - (1 + 24) = 75 packets
>
> The expression (pipe > ssthresh) is true for a number of consecutive
> SACKs, so we use the PRR code path repeatedly for a while as SACKs stream
> in for P26..P100.
>
> Given the PRR code path math, in general, the target number of packets
> sent so far in recovery will be:
>
>    target_sent_so_far = CEIL(prr_delivered * ssthresh / RecoverFS)
>                       = CEIL(prr_delivered * 50 / 100)
>                       = CEIL(prr_delivered * .5)
>
> What happens: This will cause the sender to send 1 packet for every 2
> packets delivered (SACKed). Specifically, the connection will send 1 packet
> for every 2 packets SACKed for the first 50 packets SACKed of the round
> trip. This will cause pipe to fall from 75 to 75 - 50*0.5 = 75 - 25 = 50
> packets during that period, at which point (pipe > ssthresh) becomes false
> and the connection will follow the PRR-CRB path to match the sending
> process to the delivery process (packet conservation) to keep pipe matching
> ssthresh. So the sender's rate was inconsistent: for 50 SACKs it sends at 1
> packet for every 2 packets delivered (SACKed); then for 25 SACKs it sends
> at 1 packet for every 1 packet delivered (SACKed). So we don't meet the
> goal of making "pipe" transition smoothly and consistently from its initial
> value to ssthresh.
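>
> (A quick way to check those numbers is a small Python loop; this is my
> own sketch of just the PRR phase, one SACKed packet per iteration, using
> the target_sent_so_far formula above:
>
>     import math
>
>     recover_fs, ssthresh, pipe = 100, 50, 75
>     prr_delivered = prr_out = sacks = 0
>     while pipe > ssthresh:
>         prr_delivered += 1   # one more packet SACKed
>         pipe -= 1            # ...which leaves the network
>         target = math.ceil(prr_delivered * ssthresh / recover_fs)
>         sndcnt = max(target - prr_out, 0)
>         prr_out += sndcnt
>         pipe += sndcnt       # newly (re)transmitted data enters the pipe
>         sacks += 1
>     print(sacks, prr_out, pipe)   # -> 50 25 50
>
> That is, the PRR path consumes the first 50 SACKs, sends 25 packets, and
> leaves pipe pinned at ssthresh for the last 25 SACKs.)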
>
> What we want instead: the in-flight data (pipe) progressing smoothly from
> 75 to 50 over the course of the full round trip, with the 75 packets SACKed
> mapping smoothly into 50 packets transmitted: a ratio of 50 packets sent
> for 75 packets delivered, i.e. a sent/delivered ratio of 50/75, or 0.666.
>
> So what we want is: initializing with RecoverFS = pipe, so we have:
>    target_sent_so_far = CEIL(prr_delivered * ssthresh / RecoverFS)
>                       = CEIL(prr_delivered * 50 / 75)
>                       = CEIL(prr_delivered * 0.666)
>
> That should achieve the goal of sending 50 packets for 75 packets
> delivered, or a sent/delivered ratio of 50/75, or 0.666, aka sending 2
> packets for every 3 packets SACKed. In particular, at the end of the round
> trip time we'll have:
>
>    target_sent_so_far = CEIL(prr_delivered * 50 / 75)
>                       = CEIL(75 * 50 / 75)
>                       = 50
>
> Hopefully that illustrates why, for the target_sent_so_far to smoothly
> rise to ssthresh at the end of the first round in recovery, RecoverFS
> should be initialized to pipe.
>
> The difference between the current initialization (RecoverFS = snd.nxt -
> snd.una) and the proposed initialization (RecoverFS = pipe) would probably
> be small in the typical case. But in cases like this where the sender has
> detected reordering and is therefore allowing many SACKed packets before
> entering recovery, AFAICT the difference could be significant.
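>
> To quantify that difference, here is a rough simulation of this example
> (my own Python sketch of the PRR and PRR-CRB branches; one SACKed packet
> per ACK, and no further losses):
>
>     import math
>
>     def recovery_round(recover_fs, ssthresh=50, pipe=75, sacks=75):
>         prr_delivered = prr_out = 0
>         sent = []                       # sndcnt per SACK
>         for _ in range(sacks):
>             prr_delivered += 1          # DeliveredData = 1 packet
>             pipe -= 1                   # the SACKed packet leaves the pipe
>             if pipe > ssthresh:         # Proportional Rate Reduction
>                 target = math.ceil(prr_delivered * ssthresh / recover_fs)
>                 sndcnt = max(target - prr_out, 0)
>             else:                       # PRR-CRB: packet conservation
>                 sndcnt = min(ssthresh - pipe, prr_delivered - prr_out)
>             prr_out += sndcnt
>             pipe += sndcnt
>             sent.append(sndcnt)
>         return sum(sent[:50]), sum(sent[50:]), pipe
>
>     print(recovery_round(recover_fs=100))  # (25, 25, 50): 1-for-2, then 1-for-1
>     print(recovery_round(recover_fs=75))   # (34, 16, 50): ~2-for-3 throughout
>
> Both variants end the round trip with pipe == ssthresh and 50 packets
> sent, but only RecoverFS = pipe spreads those 50 packets evenly across
> the 75 SACKs.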
>
> Best regards,
> neal
>
>
>
>
>>
>>
>>>
>>> What am I missing? :-)
>>>
>>> best regards,
>>> neal
>>>
>>>
>>>
>>>
>>>> On Mon, Apr 17, 2023 at 11:23 AM Neal Cardwell <ncardwell@google.com>
>>>> wrote:
>>>>
>>>>> Regarding this line in draft-ietf-tcpm-prr-rfc6937bis-03:
>>>>>
>>>>>    RecoverFS = snd.nxt - snd.una // FlightSize right before recovery
>>>>>
>>>>> AFAICT this should be:
>>>>>
>>>>>   RecoverFS = pipe  // RFC 6675 pipe algorithm
>>>>>
>>>>> Rationale: when recovery starts, often snd.nxt - snd.una includes 1 or
>>>>> more lost packets above snd.una and 3 or more SACKed packets above that;
>>>>> those packets are not really in the pipe, and not really in the FlightSize.
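>>>>>
>>>>> As a tiny illustration (my numbers; this uses the simplified in-flight
>>>>> expression used elsewhere in this thread rather than the full RFC 6675
>>>>> pipe algorithm, since nothing has been retransmitted yet):
>>>>>
>>>>>     snd_nxt, snd_una = 100, 0   # 100 packets outstanding by sequence space
>>>>>     lost, sacked = 1, 24        # marked lost/SACKed before recovery starts
>>>>>
>>>>>     recover_fs_draft    = snd_nxt - snd_una                    # 100
>>>>>     recover_fs_proposed = snd_nxt - snd_una - (lost + sacked)  # 75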
>>>>>
>>>>> With the draft as-is, packets that were SACKed on ACKs that happened
>>>>> before entering fast recovery are incorporated in RecoverFS (snd.nxt -
>>>>> snd.una) but never in prr_delivered (since that is set to 0 upon entering
>>>>> fast recovery), so at the end of fast recovery the expression:
>>>>>
>>>>>   CEIL(prr_delivered * ssthresh / RecoverFS)
>>>>>
>>>>> can be quite far below ssthresh, for very large numbers of packets
>>>>> SACKed before entering fast recovery (e.g., if the reordering degree is
>>>>> large).
>>>>>
>>>>> AFAICT that means that at the end of recovery the cwnd could be quite
>>>>> far below ssthresh, to the same degree, resulting in the cwnd being less
>>>>> than what congestion control specified when the connection entered fast
>>>>> recovery.
>>>>>
>>>>> AFAICT switching to RecoverFS = pipe fixes this, since it means that
>>>>> RecoverFS only includes packets in the pipe when the connection enters
>>>>> fast recovery, and thus prr_delivered can eventually reach RecoverFS,
>>>>> so that the target number of packets sent, CEIL(prr_delivered *
>>>>> ssthresh / RecoverFS), can fully reach ssthresh.
>>>>>
>>>>> Apologies if I'm missing something or this has already been discussed.
>>>>>
>>>>> best regards,
>>>>> neal
>>>>>