Re: [tcpm] I-D Action: draft-ietf-tcpm-prr-rfc6937bis-06.txt

Markku Kojo <kojo@cs.helsinki.fi> Mon, 18 March 2024 05:31 UTC

Date: Mon, 18 Mar 2024 07:31:01 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: Neal Cardwell <ncardwell=40google.com@dmarc.ietf.org>
cc: tcpm@ietf.org, Matt Mathis <mattmathis@measurementlab.net>
In-Reply-To: <CADVnQy=rvCoQC0RwVq=P2XWFGPrXvGKvj2cAooj94yx+WzXz3A@mail.gmail.com>
Message-ID: <8e5f0a7-b39b-cfaa-5c38-edeb9916bef6@cs.helsinki.fi>
References: <170896098131.16189.4842811868600508870@ietfa.amsl.com> <CADVnQy=rvCoQC0RwVq=P2XWFGPrXvGKvj2cAooj94yx+WzXz3A@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=_script-20187-1710739862-0001-2"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/KpWTfcUHC3AcULHCvWu6WLU5caM>
Subject: Re: [tcpm] I-D Action: draft-ietf-tcpm-prr-rfc6937bis-06.txt

Hi, Neal, all,

I have not been able to follow the progress of this draft for a long 
while, so apologies for chiming in this late.

I took a quick look at the latest discussions on setting RecoverFS.

Setting RecoverFS = pipe seems like a neat way to get cwnd to descend
smoothly to the target in the given example. However, isn't it more
important to ensure that in all cases the sender sends at most ssthresh
packets per RTT during recovery and that at the end of recovery cwnd is
at most ssthresh?

If I am not mistaken, reordering (and ACK losses) may result in an
undesired outcome. Let's modify Neal's example a bit such that reordering
occurs but the reordering window (with RACK-TLP) is slightly too small:

CC = Reno
cwnd = 100 packets
The application writes 100*MSS.
TCP sends 100 packets.

In this example the TCP sender has detected reordering with RACK-TLP or 
some other technique, so does not enter fast recovery on the third 
SACKed packet, but rather waits a while to accumulate more SACKs.

From the flight of 100 packets, 1 packet is lost (P1), and 24 packets
are delayed (packets P2..P25) and 3 packets (P26..P28) are SACKed 
(assume P2..P25 arrive after P28, for example).

We enter fast recovery with PRR.

ssthresh = cwnd / 2 = 50  (Reno)

pipe = snd.nxt - snd.una - (lost + SACKed) = 100 - (25 + 3) = 72 packets

RecoverFS = pipe = 72  (as initialized in revision 06)

The expression (pipe > ssthresh) is true for a number of consecutive
SACKs, so we use the PRR code path repeatedly for a while as SACKs stream
in for P2..P25 and P29..P100.
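
The PRR code path mentioned above can be sketched roughly as follows. This
is only an illustration based on the proportional branch of the RFC 6937
pseudocode; the function names, the packet units, and the clamp to zero
are assumptions of this sketch, not text taken from the draft.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for positive b (avoids float rounding)."""
    return -(-a // b)


def prr_proportional_sndcnt(prr_delivered: int, prr_out: int,
                            ssthresh: int, recover_fs: int) -> int:
    """Packets the sender may transmit on this ACK while pipe > ssthresh.

    All quantities are in packets (an assumption of this sketch).
    prr_delivered: packets newly ACKed/SACKed since recovery started
    prr_out:       packets already sent during this recovery
    """
    # Cumulative number of packets allowed so far in this recovery.
    target = ceil_div(prr_delivered * ssthresh, recover_fs)
    # Only the part not yet sent may go out now; never negative.
    return max(target - prr_out, 0)


# Cumulative allowance once all 96 packets have been reported delivered,
# with ssthresh = 50 and RecoverFS = 72 as in the example:
print(prr_proportional_sndcnt(prr_delivered=96, prr_out=0,
                              ssthresh=50, recover_fs=72))
```

Because the allowance scales with prr_delivered / RecoverFS, any
prr_delivered larger than RecoverFS pushes the cumulative total above
ssthresh.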

When the SACK for P100 has been processed we have sent

Sent_so_far = CEIL(prr_delivered * ssthresh / RecoverFS)
             = CEIL(96 * 50 / 72)
             = 67

So, PRR does not exit with cwnd = 50 but with a much higher cwnd than
expected.

If CC = CUBIC (ssthresh = 0.7 * cwnd = 70)

Sent_so_far = CEIL(prr_delivered * ssthresh / RecoverFS)
             = CEIL(96 * 70 / 72)
             = 94
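
The arithmetic of both cases can be reproduced with a few lines of Python.
This is a sketch of the bookkeeping in the example only; the variable
names are illustrative, RecoverFS is taken as equal to pipe, and the CUBIC
ssthresh assumes the usual multiplicative decrease factor of 0.7.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


flight = 100                 # snd.nxt - snd.una, in packets
lost = 1 + 24                # P1 lost, P2..P25 delayed and marked lost
sacked = 3                   # P26..P28 SACKed before recovery starts
pipe = flight - (lost + sacked)
recover_fs = pipe            # RecoverFS = pipe

# Delivered during recovery: the late P2..P25 plus P29..P100.
prr_delivered = 24 + 72


def sent_so_far(ssthresh: int) -> int:
    """Cumulative packets the proportional formula allows."""
    return ceil_div(prr_delivered * ssthresh, recover_fs)


ssthresh_reno = flight // 2          # 50
ssthresh_cubic = flight * 7 // 10    # 70

for name, ssthresh in (("Reno", ssthresh_reno), ("CUBIC", ssthresh_cubic)):
    sent = sent_so_far(ssthresh)
    print(f"{name}: ssthresh={ssthresh} sent_so_far={sent} "
          f"excess={sent - ssthresh}")
```

The excess comes from prr_delivered (96) being larger than RecoverFS (72):
the packets that were marked lost but then arrived are counted as
delivered although they were never included in pipe.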


The same behavior seems to occur also if there is significant ACK loss
among P2..P25 and at least one packet gets reordered out of the
reordering window. But maybe I am missing something?


In addition, it seems that the algorithm in the latest version does not
address my WGLC comment on reducing the send rate (ssthresh) again if
RACK-TLP detects the loss of a retransmission. The sender must reduce
ssthresh again because the loss of a retransmission occurs in another RTT.
If this is not done, fast recovery keeps sending at the same rate until
the end of recovery, regardless of how many times a segment has to be
retransmitted. This sounds like very bad behaviour to me in the face of
heavy congestion that drops a lot of packets (retransmissions) while the
PRR sender does not react at all.
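
To illustrate the difference, here is a minimal sketch of one possible
reading of "reduce ssthresh again". It assumes one extra multiplicative
decrease for each RTT in which a lost retransmission is detected, applied
to the current ssthresh, with a floor of 2 packets. The function name and
these choices are hypothetical; neither the draft nor this message
specifies them.

```python
def ssthresh_after_rexmit_losses(cwnd: float, beta: float,
                                 rexmit_loss_rtts: int) -> float:
    """ssthresh (in packets) after the initial reduction plus one further
    reduction per RTT in which a lost retransmission was detected.

    beta is the multiplicative decrease factor (0.5 for Reno).
    """
    ssthresh = cwnd * beta            # reduction on entering recovery
    for _ in range(rexmit_loss_rtts):
        ssthresh *= beta              # hypothetical repeated reduction
    return max(ssthresh, 2.0)         # assumed floor of 2 packets


# Reno, cwnd = 100: no reaction versus reacting to two lost-rexmit RTTs.
print(ssthresh_after_rexmit_losses(100, 0.5, 0))
print(ssthresh_after_rexmit_losses(100, 0.5, 2))
```

Without the repeated reduction the sender keeps the first value for the
whole recovery, however many times a segment has to be retransmitted.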

In addition, there are a few other things that might be useful to
correct/clarify:

- When exactly does the PRR algorithm exit? That is, are the algorithm's
   steps executed also for the final cumulative ACK that covers
   RecoveryPoint (or recover)?

- The draft reads:

   "Although increasing the window
    during recovery seems to be ill advised, it is important to remember
    that this is actually less aggressive than permitted by RFC 5681,
    which sends the same quantity of additional data as a single burst in
    response to the ACK that triggered Fast Retransmit."

  I think it should cite RFC 6675, as the TCP Reno loss recovery specified
  in RFC 5681 does not send such a burst.


On Mon, 26 Feb 2024, Neal Cardwell wrote:

> As noted in the draft, revision 06 primarily has a single change relative to 05: it updates
> RecoverFS to be initialized as "RecoverFS = pipe" in both the prose and pseudocode.
> 
> Thanks to Richard Scheffenegger and the TCPM community for reviewing the 05 revision.
> Comments/suggestions welcome!
> 
> Thanks!
> neal
> 
> 
> On Mon, Feb 26, 2024 at 10:23 AM <internet-drafts@ietf.org> wrote:
>       Internet-Draft draft-ietf-tcpm-prr-rfc6937bis-06.txt is now available. It is a
>       work item of the TCP Maintenance and Minor Extensions (TCPM) WG of the IETF.
>
>          Title:   Proportional Rate Reduction for TCP
>          Authors: Matt Mathis
>                   Nandita Dukkipati
>                   Yuchung Cheng
>                   Neal Cardwell
>          Name:    draft-ietf-tcpm-prr-rfc6937bis-06.txt
>          Pages:   17
>          Dates:   2024-02-26
>
>       Abstract:
>
>          This document updates the experimental Proportional Rate Reduction
>          (PRR) algorithm, described RFC 6937, to standards track.  PRR
>          provides logic to regulate the amount of data sent by TCP or other
>          transport protocols during fast recovery.  PRR accurately regulates
>          the actual flight size through recovery such that at the end of
>          recovery it will be as close as possible to the slow start threshold
>          (ssthresh), as determined by the congestion control algorithm.
>
>       The IETF datatracker status page for this Internet-Draft is:
>       https://datatracker.ietf.org/doc/draft-ietf-tcpm-prr-rfc6937bis/
>
>       There is also an HTML version available at:
>       https://www.ietf.org/archive/id/draft-ietf-tcpm-prr-rfc6937bis-06.html
>
>       A diff from the previous version is available at:
>       https://author-tools.ietf.org/iddiff?url2=draft-ietf-tcpm-prr-rfc6937bis-06
>
>       Internet-Drafts are also available by rsync at:
>       rsync.ietf.org::internet-drafts