Re: [tcpm] draft-ietf-tcpm-prr-rfc6937bis-03 and RecoverFS initialization

Yuchung Cheng <ycheng@google.com> Tue, 18 April 2023 23:35 UTC

To: Neal Cardwell <ncardwell@google.com>
Cc: tcpm <tcpm@ietf.org>, Matt Mathis <mattmathis@measurementlab.net>, Nandita Dukkipati <nanditad@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/G2U2W-W8W304NlEjp7AoOnsYHhY>

On Mon, Apr 17, 2023 at 2:00 PM Neal Cardwell <ncardwell@google.com> wrote:

>
>
> On Mon, Apr 17, 2023 at 4:13 PM Yuchung Cheng <ycheng@google.com> wrote:
>
>> Hi Neal,
>>
>> That's a good point and it was considered in the early stage of PRR. We
>> picked FlightSize (=snd.nxt - snd.una) to ensure ssthresh/RecoverFS
>> faithfully reflects the proportion of the congestion control window
>> reduction: RFC 5681 still uses FlightSize to compute ssthresh. But some TCP
>> implementations or specific CCs may use either cwnd (e.g. Linux CUBIC/Reno)
>> or pipe. How about adding a short paragraph:
>>
>> "If a TCP or congestion control implementation uses cwnd or pipe instead
>> of FlightSize to compute ssthresh, then RecoverFS should use the specific
>> metric accordingly, e.g. cwnd right before recovery."
>>
>
> AFAICT that analysis is conflating two different issues:
>
> (1) How does the congestion control compute ssthresh (based on cwnd or
> pipe or FlightSize?) You rightly point out that approaches vary for this
> part.
>
> (2) How does PRR determine what fraction of outstanding packets have been
> delivered (aka prr_delivered / RecoverFS)?
>
> AFAICT to get the right answer for question (2), RecoverFS should be
> initialized to "pipe", no matter what approach the CC takes for answering
> question (1).
>
> My understanding of PRR is that if (pipe > ssthresh) is true, then the
> algorithm is doing Proportional Rate Reduction, and is essentially
> computing:
>
>       sndcnt ~= (target data sent in recovery) - (actual data sent in recovery)
>       sndcnt ~= (fraction of data delivered) * ssthresh - prr_out
>       sndcnt ~= (prr_delivered / RecoverFS) * ssthresh - prr_out
>
> For the (target data sent in recovery) to equal ssthresh at the end of the
> first round in recovery, the algorithm must reach the point where
> prr_delivered == RecoverFS, so that (prr_delivered / RecoverFS) is 1.
> Since prr_delivered can only reach as high as the value "pipe" had at the
> start of recovery, to be able to reach that condition we need to have
> RecoverFS == "pipe". If RecoverFS is (snd.nxt - snd.una), then RecoverFS is
> too big: prr_delivered won't be able to match RecoverFS, the (target data
> sent in recovery) won't reach ssthresh, and the algorithm will undershoot
> (the cwnd won't reach the ssthresh specified by congestion control, however
> it calculated that).
>
I still can't parse your analysis after reading it multiple times.

"prr_delivered can only reach as high as "pipe" at the start of recovery"
--> Isn't prr_delivered initialized to 0 at the start of recovery?
"If RecoverFS is (snd.nxt - snd.una) then RecoverFS is too big, and
prr_delivered won't be able to match RecoverFS" --> Why is RecoverFS too
big, and why won't prr_delivered reach it?

I am not saying that initializing RecoverFS to "pipe" is wrong. I just don't
see a substantial difference between FlightSize and pipe, unless
FlightSize/cwnd is small and/or limited transmit was not used.

Maybe you can walk through an example with FlightSize vs. pipe?
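
A minimal sketch of the kind of comparison being asked for, in Python. The
numbers are hypothetical and the units are whole packets. It assumes that by
the end of the first round in recovery, prr_delivered has grown from 0 to
exactly the number of packets that were in the pipe when recovery started,
and it evaluates only the proportional branch,
CEIL(prr_delivered * ssthresh / RecoverFS). The helper name prr_target is
made up for illustration.

```python
def prr_target(prr_delivered: int, ssthresh: int, recover_fs: int) -> int:
    """CEIL(prr_delivered * ssthresh / RecoverFS) using integer arithmetic."""
    return (prr_delivered * ssthresh + recover_fs - 1) // recover_fs


# Hypothetical state right before entering fast recovery (units: packets).
flight_size = 100             # snd.nxt - snd.una
lost = 1                      # marked lost just above snd.una
sacked_before_recovery = 49   # SACKed before recovery (large reordering degree)
pipe = flight_size - lost - sacked_before_recovery  # 50 packets still in flight
ssthresh = 50                 # whatever the congestion control chose

# Assumption: by the end of the first round, every packet that was in the
# pipe at the start of recovery has been delivered, and nothing else counted.
prr_delivered = pipe

target_with_flightsize = prr_target(prr_delivered, ssthresh, flight_size)
target_with_pipe = prr_target(prr_delivered, ssthresh, pipe)

print("RecoverFS = FlightSize:", target_with_flightsize)
print("RecoverFS = pipe:      ", target_with_pipe)
```

Under these assumptions the target reaches ssthresh only when RecoverFS is
initialized to pipe. With few packets SACKed before recovery the two values
of RecoverFS are close, and the gap shrinks accordingly.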


>
> What am I missing? :-)
>
> best regards,
> neal
>
>
>
>
>> On Mon, Apr 17, 2023 at 11:23 AM Neal Cardwell <ncardwell@google.com>
>> wrote:
>>
>>> Regarding this line in draft-ietf-tcpm-prr-rfc6937bis-03:
>>>
>>>    RecoverFS = snd.nxt - snd.una // FlightSize right before recovery
>>>
>>> AFAICT this should be:
>>>
>>>   RecoverFS = pipe  // RFC 6675 pipe algorithm
>>>
>>> Rationale: when recovery starts, often snd.nxt - snd.una includes 1 or
>>> more lost packets above snd.una and 3 or more SACKed packets above that;
>>> those packets are not really in the pipe, and not really in the FlightSize.
>>>
>>> With the draft as-is, packets that were SACKed on ACKs that happened
>>> before entering fast recovery are incorporated in RecoverFS (snd.nxt -
>>> snd.una) but never in prr_delivered (since that is set to 0 upon entering
>>> fast recovery), so at the end of fast recovery the expression:
>>>
>>>   CEIL(prr_delivered * ssthresh / RecoverFS)
>>>
>>> can be quite far below ssthresh, for very large numbers of packets
>>> SACKed before entering fast recovery (e.g., if the reordering degree is
>>> large).
>>>
>>> AFAICT that means that at the end of recovery the cwnd could be quite
>>> far below ssthresh, to the same degree, resulting in the cwnd being less
>>> than what congestion control specified when the connection entered fast
>>> recovery.
>>>
>>> AFAICT switching to RecoverFS = pipe fixes this, since it means that
>>> RecoverFS only includes packets in the pipe when the connection enters fast
>>> recovery, and thus prr_delivered can eventually reach RecoverFS, so that
>>> the target number of packets sent (CEIL(prr_delivered * ssthresh /
>>> RecoverFS)) can fully reach ssthresh.
>>>
>>> Apologies if I'm missing something or this has already been discussed.
>>>
>>> best regards,
>>> neal
>>>
>>>
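
The difference between (snd.nxt - snd.una) and pipe described in the
rationale of the original message can be sketched in Python as follows. This
is a simplified, per-segment illustration, not an implementation of RFC 6675:
the Segment class, the function names, and the scoreboard contents are all
hypothetical, and the "lost" flag stands in for the RFC's IsLost() check.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One segment between snd.una and snd.nxt (hypothetical scoreboard entry)."""
    sacked: bool = False
    lost: bool = False           # stand-in for RFC 6675 IsLost()
    retransmitted: bool = False  # stand-in for "at or below HighRxt"


def count_flight_size(scoreboard: List[Segment]) -> int:
    """snd.nxt - snd.una, counted in segments."""
    return len(scoreboard)


def count_pipe(scoreboard: List[Segment]) -> int:
    """Simplified pipe estimate: skip SACKed segments, count segments not
    marked lost, and count retransmitted segments once more."""
    total = 0
    for seg in scoreboard:
        if seg.sacked:
            continue
        if not seg.lost:
            total += 1
        if seg.retransmitted:
            total += 1
    return total


# Hypothetical state on entering recovery: 1 lost segment just above
# snd.una, 3 SACKed segments above it, and 6 segments still in flight.
scoreboard = (
    [Segment(lost=True)]
    + [Segment(sacked=True) for _ in range(3)]
    + [Segment() for _ in range(6)]
)

print("snd.nxt - snd.una:", count_flight_size(scoreboard))
print("pipe:             ", count_pipe(scoreboard))
```

In this example the lost and SACKed segments are included in
(snd.nxt - snd.una) but not in pipe, which is the gap the rationale points to.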