Re: [tcpm] draft-ietf-tcpm-prr-rfc6937bis-03 and RecoverFS initialization

Neal Cardwell <ncardwell@google.com> Mon, 17 April 2023 21:00 UTC

From: Neal Cardwell <ncardwell@google.com>
Date: Mon, 17 Apr 2023 17:00:06 -0400
Message-ID: <CADVnQyk7nxmaoTHh5qo9XvhrWojoB2R78FK0zX5CcwoZq6c=hg@mail.gmail.com>
To: Yuchung Cheng <ycheng@google.com>
Cc: tcpm <tcpm@ietf.org>, Matt Mathis <mattmathis@measurementlab.net>, Nandita Dukkipati <nanditad@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/gEBnl78vfgZ4bUKWZJ3NFol0Myw>

On Mon, Apr 17, 2023 at 4:13 PM Yuchung Cheng <ycheng@google.com> wrote:

> Hi Neal,
>
> That's a good point and it was considered in the early stage of PRR. We
> picked FlightSize (=snd.nxt - snd.una) to ensure ssthresh/RecoverFS
> faithfully reflects the proportion of the congestion control window
> reduction: RFC 5681 still uses FlightSize to compute ssthresh. But some TCPs
> or specific C.C.s may use either cwnd (e.g. Linux cubic/reno) or pipe. How
> about a small paragraph:
>
> "If a TCP or congestion control implementation uses cwnd or pipe instead
> of FlightSize to compute ssthresh, then RecoverFS should use the specific
> metric accordingly, i.e. cwnd right before recovery"
>

AFAICT that analysis is conflating two different issues:

(1) How does the congestion control compute ssthresh (based on cwnd or pipe
or FlightSize?) You rightly point out that approaches vary for this part.

(2) How does PRR determine what fraction of outstanding packets have been
delivered (aka prr_delivered / RecoverFS)?

AFAICT to get the right answer for question (2), RecoverFS should be
initialized to "pipe", no matter what approach the CC takes for answering
(1).

My understanding of PRR is that if (pipe > ssthresh) is true, then the
algorithm is doing Proportional Rate Reduction, and is essentially
computing:

      sndcnt ~= (target data sent in recovery)            - (actual data sent in recovery)
      sndcnt ~= (fraction of data delivered) * ssthresh   - prr_out
      sndcnt ~= (prr_delivered / RecoverFS ) * ssthresh   - prr_out

For the (target data sent in recovery) to equal ssthresh at the end of the
first round in recovery, the algorithm must reach the point where
prr_delivered == RecoverFS, so that (prr_delivered / RecoverFS) is 1.
Since prr_delivered can only reach as high as "pipe" at the start of
recovery, to be able to reach that condition we need to have RecoverFS ==
"pipe". If RecoverFS is (snd.nxt - snd.una), then RecoverFS is too big:
prr_delivered won't be able to match RecoverFS, the (target data sent in
recovery) won't reach ssthresh, and the algorithm will undershoot (the cwnd
won't reach the ssthresh specified by congestion control, however it
calculated that).
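
To make the undershoot concrete, here is a minimal Python sketch with
made-up numbers. The packet counts, the ssthresh value, and the helper name
prr_target are all hypothetical, and pipe is approximated as FlightSize minus
SACKed minus lost packets (ignoring retransmissions), so this is only an
illustration of the arithmetic, not an implementation of the draft:

```python
def prr_target(prr_delivered, ssthresh, recover_fs):
    """CEIL(prr_delivered * ssthresh / RecoverFS), using integer ceiling division."""
    return -(-(prr_delivered * ssthresh) // recover_fs)

# Hypothetical state on entering fast recovery, in packets.
flight_size = 100    # snd.nxt - snd.una
sacked_before = 40   # SACKed before recovery started (e.g. heavy reordering)
lost = 10            # marked lost
pipe = flight_size - sacked_before - lost  # simplified pipe estimate: 50
ssthresh = 70        # whatever the congestion control chose

# Assumption taken from the argument above: in the first round,
# prr_delivered can climb at most to the pipe at the start of recovery.
max_delivered = pipe

target_with_flightsize = prr_target(max_delivered, ssthresh, flight_size)
target_with_pipe = prr_target(max_delivered, ssthresh, pipe)

print(target_with_flightsize)  # 35, i.e. half of ssthresh
print(target_with_pipe)        # 70, i.e. exactly ssthresh
```

With these numbers, RecoverFS = (snd.nxt - snd.una) caps the target at 35
packets, while RecoverFS = pipe lets it reach the full ssthresh of 70.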

What am I missing? :-)

best regards,
neal




> On Mon, Apr 17, 2023 at 11:23 AM Neal Cardwell <ncardwell@google.com>
> wrote:
>
>> Regarding this line in draft-ietf-tcpm-prr-rfc6937bis-03:
>>
>>    RecoverFS = snd.nxt - snd.una // FlightSize right before recovery
>>
>> AFAICT this should be:
>>
>>   RecoverFS = pipe  // RFC 6675 pipe algorithm
>>
>> Rationale: when recovery starts, often snd.nxt - snd.una includes 1 or
>> more lost packets above snd.una and 3 or more SACKed packets above that;
>> those packets are not really in the pipe, and not really in the FlightSize.
>>
>> With the draft as-is, packets that were SACKed on ACKs that happened
>> before entering fast recovery are incorporated in RecoverFS (snd.nxt -
>> snd.una) but never in prr_delivered (since that is set to 0 upon entering
>> fast recovery), so at the end of fast recovery the expression:
>>
>>   CEIL(prr_delivered * ssthresh / RecoverFS)
>>
>> can be quite far below ssthresh, for very large numbers of packets SACKed
>> before entering fast recovery (e.g., if the reordering degree is large).
>>
>> AFAICT that means that at the end of recovery the cwnd could be quite far
>> below ssthresh, to the same degree, resulting in the cwnd being less than
>> what congestion control specified when the connection entered fast recovery.
>>
>> AFAICT switching to RecoverFS = pipe fixes this, since it means that
>> RecoverFS only includes packets in the pipe when the connection enters fast
>> recovery, and thus prr_delivered can eventually reach RecoverFS, so that
>> the target number of packets sent (CEIL(prr_delivered * ssthresh /
>> RecoverFS)) can fully reach ssthresh.
>>
>> Apologies if I'm missing something or this has already been discussed.
>>
>> best regards,
>> neal
>>
>>