Re: [tcpm] draft-ietf-tcpm-prr-rfc6937bis-03 and RecoverFS initialization
Yoshifumi Nishida <nsd.ietf@gmail.com> Mon, 22 May 2023 18:38 UTC
From: Yoshifumi Nishida <nsd.ietf@gmail.com>
Date: Mon, 22 May 2023 11:37:57 -0700
Message-ID: <CAAK044RNkUkAFh-2jWymQWFe2fY90F-Z-QbCSSeZEGk2=Sc7bw@mail.gmail.com>
To: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell=40google.com@dmarc.ietf.org>, Matt Mathis <mattmathis@measurementlab.net>, tcpm <tcpm@ietf.org>, Nandita Dukkipati <nanditad@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/ptv75gUt9mE-kSk9uLcVz6HPzaU>

Hi Yuchung,

Thank you so much! OK, we will wait for the updates.
--
Yoshi

On Mon, May 22, 2023 at 10:09 AM Yuchung Cheng <ycheng@google.com> wrote:

> Hi Yoshifumi,
>
> Sorry for the radio silence. Neal will help co-author and update the draft
> as he has many insights. We'll provide an update soon, hopefully we can
> move forward before next meeting in SF.
>
> On Sun, May 21, 2023 at 11:34 PM Yoshifumi Nishida <nsd.ietf@gmail.com>
> wrote:
>
>> Hello,
>>
>> Just in case, as this discussion has been quiet for a while..
>> I personally think what Neal mentions seems to make sense although I'm
>> not very sure which approach is better.
>> I hope this part will be addressed in the updated version of the draft.
>> --
>> Yoshi
>>
>> On Wed, Apr 19, 2023 at 7:20 PM Neal Cardwell
>> <ncardwell=40google.com@dmarc.ietf.org> wrote:
>>
>>> On Tue, Apr 18, 2023 at 7:35 PM Yuchung Cheng <ycheng@google.com> wrote:
>>>
>>>> On Mon, Apr 17, 2023 at 2:00 PM Neal Cardwell <ncardwell@google.com>
>>>> wrote:
>>>>
>>>>> On Mon, Apr 17, 2023 at 4:13 PM Yuchung Cheng <ycheng@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Neal,
>>>>>>
>>>>>> That's a good point and it was considered in the early stage of PRR.
>>>>>> We picked FlightSize (=snd.nxt - snd.una) to ensure ssthresh/RecoverFS
>>>>>> faithfully reflects the proportion of the congestion control window
>>>>>> reduction: RFC5681 still use FlightSize to compute ssthresh. But some TCP
>>>>>> or specific C.C.s may use either cwnd (e.g. Linux cubic/reno) or pipe. How
>>>>>> about a small graph:
>>>>>>
>>>>>> "If a TCP or congestion control implementation uses cwnd or pipe
>>>>>> instead of FlightSize to compute ssthresh, then RecoverFS should use the
>>>>>> specific metric accordingly, i.e. cwnd right before recovery"
>>>>>>
>>>>>
>>>>> AFAICT that analysis is conflating two different issues:
>>>>>
>>>>> (1) How does the congestion control compute ssthresh (based on cwnd or
>>>>> pipe or FlightSize?) You rightly point out that approaches vary for this
>>>>> part.
>>>>>
>>>>> (2) How does PRR determine what fraction of outstanding packets have
>>>>> been delivered (aka prr_delivered / RecoverFS)?
>>>>>
>>>>> AFAICT to get the right answer for the (2) question, RecoverFS should
>>>>> be initialized to "pipe", no matter what approach the CC takes for
>>>>> answering (1).
>>>>>
>>>>> My understanding of PRR is that if (pipe > ssthresh) is true, then the
>>>>> algorithm is doing Proportional Rate Reduction, and is essentially
>>>>> computing:
>>>>>
>>>>>   sndcnt ~= (target data sent in recovery) - (actual data sent in recovery)
>>>>>   sndcnt ~= (fraction of data delivered) * ssthresh - prr_out
>>>>>   sndcnt ~= (prr_delivered / RecoverFS ) * ssthresh - prr_out
>>>>>
>>>>> For the (target data sent in recovery) to equal ssthresh at the end of
>>>>> the first round in recovery the algorithm must reach the point where
>>>>> prr_delivered == RecoverFS, so that (prr_delivered / RecoverFS ) is 1.
>>>>> Since prr_delivered can only reach as high as "pipe" at the start of
>>>>> recovery, to be able to reach that condition we need to have RecoverFS ==
>>>>> "pipe". If RecoverFS is (snd.nxt - snd.una) then RecoverFS is too big, and
>>>>> prr_delivered won't be able to match RecoverFS and the (target data sent in
>>>>> recovery) won't reach ssthresh, and the algorithm will undershoot (the cwnd
>>>>> won't reach the ssthresh specified by congestion control, however it
>>>>> calculated that).
>>>>>
>>>> I still can't parse your analysis after reading it multiple times.
>>>>
>>>> "prr_delivered can only reach as high as "pipe" at the start of
>>>> recovery" --> prr_delivered is initiated to 0 at the start of the recovery?
>>>> "If RecoverFS is (snd.nxt - snd.una) then RecoverFS is too big, and
>>>> prr_delivered won't be able to match RecoverFS" --> why is RecoverFS too
>>>> big and prr_delivered won't reach it.
>>>>
>>>> I am not saying RecoverFS initiated to "pipe" is wrong. I just don't
>>>> see a substantial difference between FlightSize vs pipe, unless the
>>>> FlightSize/cwnd is small and/or limited-transmits were not used.
>>>>
>>>> maybe you can walk an example with FlightSize vs pipe...
>>>>
>>>
>>> Discussing a concrete example is a good idea!
>>>
>>> Here's an example, sketching the behavior with
>>> draft-ietf-tcpm-prr-rfc6937bis-03, AFAICT from trying to execute the
>>> example by hand:
>>>
>>> CC = Reno
>>>
>>> cwnd = 100 packets
>>>
>>> The application writes 100*MSS.
>>>
>>> TCP sends 100 packets.
>>>
>>> In this example, to make the effects more clear, the TCP sender has
>>> detected reordering with RACK-TLP or some other technique, so does not
>>> enter fast recovery on the third SACKed packet, but rather waits a while to
>>> accumulate more SACKs.
>>>
>>> From the flight of 100 packets, 1 packet is lost (P1), and 24 packets
>>> are SACKed (packets P2..P25).
>>>
>>> We enter fast recovery with PRR.
>>>
>>> RecoverFS = snd.nxt - snd.una = 100
>>>
>>> ssthresh = cwnd / 2 = 50 (Reno)
>>>
>>> pipe = snd.nxt - snd.una - (lost + SACKed) = 100 - (1 + 24) = 75 packets
>>>
>>> The expression (pipe > ssthresh) is true for a number of consecutive
>>> SACKs, so we use the PRR code path repeatedly for a while as SACKs stream
>>> in for P26..P100.
>>>
>>> Given the PRR code path math, in general, the target number of packets
>>> sent so far in recovery will be:
>>>
>>>   target_sent_so_far = CEIL(prr_delivered * ssthresh / RecoverFS)
>>>                      = CEIL(prr_delivered * 50 / 100)
>>>                      = CEIL(prr_delivered * .5)
>>>
>>> What happens: This will cause the sender to send 1 packet for every 2
>>> packets delivered (SACKed). Specifically, the connection will send 1 packet
>>> for every 2 packets SACKed for the first 50 packets SACKed of the round
>>> trip. This will cause pipe to fall from 75 to 75 - 50*0.5 = 75 - 25 = 50
>>> packets during that period, at which point (pipe > ssthresh) becomes false
>>> and the connection will follow the PRR-CRB path to match the sending
>>> process to the delivery process (packet conservation) to keep pipe matching
>>> ssthresh. So the sender's rate was inconsistent: for 50 SACKs it sends at 1
>>> packet for every 2 packets delivered (SACKed); then for 25 SACKs it sends
>>> at 1 packet for every 1 packet delivered (SACKed). So we don't meet the
>>> goal of making "pipe" transition smoothly and consistently from its initial
>>> value to ssthresh.
>>>
>>> What we want instead: the in-flight data (pipe) progressing smoothly
>>> from 75 to 50 over the course of the full round trip, with the 75 packets
>>> SACKed mapping smoothly into 50 packets transmitted, a ratio of 50 packets
>>> send for 75 packets delivered, or a sent/delivered ratio of 50/75, or 0.666.
>>>
>>> So what we want is: initializing with RecoverFS = pipe, so we have :
>>>
>>>   target_sent_so_far = CEIL(prr_delivered * ssthresh / RecoverFS)
>>>                      = CEIL(prr_delivered * 50 / 75)
>>>                      = CEIL(prr_delivered * 0.666)
>>>
>>> That should achieve the goal of sending 50 packets for 75 packets
>>> delivered, or a sent/delivered ratio of 50/75, or 0.666, aka sending 2
>>> packets for every 3 packets SACKed. In particular, at the end of the round
>>> trip time we'll have:
>>>
>>>   target_sent_so_far = CEIL(prr_delivered * 50 / 75)
>>>                      = CEIL(75 * 50 / 75)
>>>                      = 50
>>>
>>> Hopefully that illustrates why, for the target_sent_so_far to smoothly
>>> rise to ssthresh at the end of the first round in recovery, RecoverFS
>>> should be initialized to pipe.
>>>
>>> The difference between the current initialization (RecoverFS = snd.nxt -
>>> snd.una) and the proposed initialization (RecoverFS = pipe) would probably
>>> be small in the typical case. But in cases like this where the sender has
>>> detected reordering and is therefore allowing many SACKed packets before
>>> entering recovery, AFAICT the difference could be significant.
>>>
>>> Best regards,
>>> neal
>>>
>>>>>
>>>>> What am I missing? :-)
>>>>>
>>>>> best regards,
>>>>> neal
>>>>>
>>>>>> On Mon, Apr 17, 2023 at 11:23 AM Neal Cardwell <ncardwell@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Regarding this line in draft-ietf-tcpm-prr-rfc6937bis-03:
>>>>>>>
>>>>>>>   RecoverFS = snd.nxt - snd.una // FlightSize right before recovery
>>>>>>>
>>>>>>> AFAICT this should be:
>>>>>>>
>>>>>>>   RecoverFS = pipe // RFC 6675 pipe algorithm
>>>>>>>
>>>>>>> Rationale: when recovery starts, often snd.nxt - snd.una includes 1
>>>>>>> or more lost packets above snd.una and 3 or more SACKed packets above that;
>>>>>>> those packets are not really in the pipe, and not really in the FlightSize.
>>>>>>>
>>>>>>> With the draft as-is, packets that were SACKed on ACKs that happened
>>>>>>> before entering fast recovery are incorporated in RecoverFS (snd.nxt -
>>>>>>> snd.una) but never in prr_delivered (since that is set to 0 upon entering
>>>>>>> fast recovery), so at the end of fast recovery the expression:
>>>>>>>
>>>>>>>   CEIL(prr_delivered * ssthresh / RecoverFS)
>>>>>>>
>>>>>>> can be quite far below ssthresh, for very large numbers of packets
>>>>>>> SACKed before entering fast recovery (e.g., if the reordering degree is
>>>>>>> large).
>>>>>>>
>>>>>>> AFAICT that means that at the end of recovery the cwnd could be
>>>>>>> quite far below ssthresh, to the same degree, resulting in the cwnd being
>>>>>>> less than what congestion control specified when the connection entered
>>>>>>> fast recovery.
>>>>>>>
>>>>>>> AFAICT switching to RecoverFS = pipe fixes this, since it means that
>>>>>>> RecoverFS only includes packets in the pipe when the connection enters fast
>>>>>>> recovery, and thus prr_delivered can eventually reach RecoverFS, so that
>>>>>>> the target number of packets sent (CEIL(prr_delivered * ssthresh /
>>>>>>> RecoverFS)) can fully reach ssthresh.
>>>>>>>
>>>>>>> Apologies if I'm missing something or this has already been
>>>>>>> discussed.
>>>>>>>
>>>>>>> best regards,
>>>>>>> neal
>>>>>>>
>>>>>>> _______________________________________________
>>> tcpm mailing list
>>> tcpm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/tcpm
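
For readers who want to replay Neal's worked example, here is a minimal sketch in Python. It is not code from the draft or from any TCP stack. It assumes one SACK arrives per packet of the original flight (DeliveredData = 1), counts everything in whole packets, uses the PRR-CRB limit when pipe is no longer above ssthresh, and ignores ACKs for data sent during recovery. The names `simulate_first_round`, `ceil_div`, and `prr_path_acks` are made up for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b, avoiding float rounding."""
    return -(-a // b)


def simulate_first_round(recover_fs, ssthresh=50, initial_pipe=75):
    """Replay the first round of fast recovery from the example in the thread.

    Returns (prr_out, pipe, prr_path_acks): packets sent in recovery, the
    in-flight estimate after the last SACK, and how many ACKs were handled
    on the proportional (pipe > ssthresh) path.
    """
    pipe = initial_pipe
    prr_delivered = 0
    prr_out = 0
    prr_path_acks = 0

    # Assumption: each packet in flight at the start of recovery is SACKed
    # by its own ACK, so there are initial_pipe ACKs with DeliveredData = 1.
    for _ in range(initial_pipe):
        delivered = 1
        prr_delivered += delivered
        pipe -= delivered

        if pipe > ssthresh:
            # Proportional Rate Reduction path.
            sndcnt = ceil_div(prr_delivered * ssthresh, recover_fs) - prr_out
            prr_path_acks += 1
        else:
            # PRR-CRB (conservative reduction bound) path.
            limit = max(prr_delivered - prr_out, delivered)
            sndcnt = min(ssthresh - pipe, limit)

        sndcnt = max(sndcnt, 0)
        prr_out += sndcnt   # packets transmitted in response to this ACK
        pipe += sndcnt      # newly sent packets are now in flight

    return prr_out, pipe, prr_path_acks


if __name__ == "__main__":
    for label, recover_fs in (("FlightSize", 100), ("pipe", 75)):
        sent, pipe, prr_acks = simulate_first_round(recover_fs)
        print(f"RecoverFS = {label:10s} ({recover_fs}): sent={sent}, "
              f"final pipe={pipe}, ACKs on proportional path={prr_acks}")
```

Under these assumptions both initializations end the round with pipe at ssthresh, because the PRR-CRB branch makes up the shortfall. What differs is how much of the round is spent on the proportional path, which is the inconsistency in sending rate that Neal describes. The proportional target alone, CEIL(75 * 50 / RecoverFS), is 38 with RecoverFS = 100 and 50 with RecoverFS = 75.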