[tcpm] Re: PRR behaviour on detecting loss of a retransmission(WAS:I-D Action: draft-ietf-tcpm-prr-rfc6937bis-06.txt)
Markku Kojo <kojo@cs.helsinki.fi> Tue, 29 October 2024 14:45 UTC
Date: Tue, 29 Oct 2024 16:40:00 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: Neal Cardwell <ncardwell=40google.com@dmarc.ietf.org>
In-Reply-To: <CADVnQymFwhGuR7c9cYN5_xCdM=s1L=rjG+Tf6HsFkpyvPUmBLQ@mail.gmail.com>
Message-ID: <b81cd0c3-ba7d-127c-135f-8f74e889d4eb@cs.helsinki.fi>
References: <170896098131.16189.4842811868600508870@ietfa.amsl.com> <CADVnQy=rvCoQC0RwVq=P2XWFGPrXvGKvj2cAooj94yx+WzXz3A@mail.gmail.com> <8e5f0a7-b39b-cfaa-5c38-edeb9916bef6@cs.helsinki.fi> <CADVnQynR99fQjWmYj-rYZ4nZxYS=-O7zbfWjJLMxd5Lqcpwgcg@mail.gmail.com> <705f77a7-2f1d-905c-cd6b-e3a7463239fb@cs.helsinki.fi> <CADVnQymFwhGuR7c9cYN5_xCdM=s1L=rjG+Tf6HsFkpyvPUmBLQ@mail.gmail.com>
CC: tcpm@ietf.org, Matt Mathis <mattmathis@measurementlab.net>, Matt Mathis <ietf@mattmathis.net>
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/ixc0VisUb_d_qSEx-w6LLoB82v8>
Hi Neal, all,

I just noticed that I have missed a bunch of replies as I had been out
of email for about one month starting in mid July. Many thanks for the
replies and clarifications, and my apologies for not having had time to
track and search the mailing list until now.

For now I am replying only to this thread on lost retransmission
handling. The others are of less importance and possibly fine with the
latest adjustments. I'll go through the rest of the threads ASAP within
the next few days to see if there was anything important left.

The major point with the suggested way of reinitializing the PRR state
using the same steps as in the beginning of recovery is that it often
yields an incorrect result if one follows the current CC RFCs on how to
do multiplicative decrease. Reason: both FlightSize and cwnd are badly
off during Fast Recovery and must not be used to compute the new
ssthresh (and cwnd).

Pls see inline tagged [MK2].

On Tue, 16 Jul 2024, Neal Cardwell wrote:

>
>
> On Tue, Jun 25, 2024 at 10:25 AM Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org> wrote:
> Hi Neal, all,
>
> I changed the subject line for discussing this specific topic of PRR
> behaviour when loss of a retransmission is detected.
>
> Please see below tagged [MK].
>
> On Mon, 18 Mar 2024, Neal Cardwell wrote:
>
> > In addition, it seems that the algorithm in the latest version does not
> > address my WGLC comment on reducing send rate (ssthresh) again if
> > RACK-TLP detects loss of a retransmission. The sender must reduce
> > ssthresh again as loss of a rexmit occurs on another RTT. If it is not
> > done, the fast recovery keeps on sending at the same rate until the end
> > of recovery regardless of how many times a segment has to be
> > retransmitted. This sounds very bad behaviour to me in front of heavy
> > congestion that drops a lot of pkts (rexmits) and the PRR sender does not
> > react at all.
> >
> >
> > I would argue that the question of whether a connection should reduce
> > ssthresh when RACK-TLP detects the loss of a retransmission, while
> > important, is outside the scope of PRR. PRR is taking loss detection and
> > congestion control decisions as externally provided inputs into PRR.
> > When to mark a packet as lost is a loss detection question, and whether
> > to reduce ssthresh upon a particular packet loss is a congestion control
> > decision. PRR is focused on taking the ssthresh output from congestion
> > control, and loss detection decisions from the loss detection algorithm,
> > and deciding how to evolve the cwnd to try to smoothly and safely
> > converge the volume of in-flight data toward the given ssthresh.
>
> [MK] Yes, agreed that loss detection (including detecting the loss of
> a retransmission) is outside of scope of PRR (i.e., detecting loss of
> rexmit is currently RACK-TLP).
>
> However, PRR is a congestion control algorithm defining the congestion
> control behaviour of the sender during a fast recovery. Currently it
> borrows only the multiplicative decrease factor from other congestion
> control algos, that is, either from RFC 5681 or RFC 9438) but defines
> everything else in controlling the send rate (= congestion ctrl) during
> fast recovery.
>
> PRR does not need to define the multiplicative decrease factor to be used
> when a loss of rexmitted segment is detected. It may borrow it from
> another doc like it currently does for entering loss recovery.
> However, I don't quite see how some other document possibly could define
> how the other PRR-specific variables are reinitialized,
> e.g., RecoverFS. Maybe I am missing something but the algo seems
> not to work correctly with a lowered ssthresh after detection of
> lost rexmit unless RecoverFS (and prr_deliverd and prr_out too?) is also
> adjusted.
> Could you explain how the algo is supposed to work upon
> detecting loss of a rexmit with multiplicative decrease factor of 0.5,
> for example.
>
> Thanks,
>
> /Markku
>
>
> Hi Markku,
>
> In regards to the way PRR is intended to work after a data sender detects a lost retransmit, I have
> chatted with Matt about this, and I think Matt and I have a similar perspective on this. Let me
> offer my thoughts:
>
> + I agree it doesn't make sense for other IETF documents to try to define how PRR-specific variables
> are reinitialized. To do so would invite a combinatorial explosion of standards, as every congestion
> control algorithm doc would need to specify how PRR-enabled and non-PRR implementations should work.
> And then those documents would need to be updated every time the PRR algorithm specification
> changes.

[MK2] Yes, agreed.

> + Likewise, I would argue that it doesn't make sense for the PRR document to attempt to define (a)
> when a congestion control algorithm decides to slow down (reduce ssthresh and/or cwnd),

[MK2] Sure, the PRR document does not need to define (a). Lowering
ssthresh and cwnd on the loss of a retransmission is already a MUST in
RFC 5681. The long-standing tradition in CC RFCs has been to repeat
crucial MUSTs with a normative reference. So, I think this document
should also explicitly repeat that on detecting loss of a
retransmission, the TCP sender MUST lower ssthresh once per RTT (and
give enough details on how to do it correctly with PRR to avoid
pitfalls with FlightSize and cwnd, but without saying how much to
lower).

> or (b) what
> the exact ssthresh value is as a result of the slow-down decision. For the PRR document to attempt
> to document (a) and (b) would likewise result in a combinatorial explosion of text, and dependencies
> between documents, and document updates.

[MK2] Agreed, there is no need to define the exact ssthresh value, but
the document needs to advise how to compute it correctly (see below).

> Instead, the model we are advocating with PRR is a separation of concerns:
>
> + a congestion control algorithm decides:
> (a) when a congestion control algorithm decides to slow down (reduce ssthresh and/or cwnd)
> (b) what the exact ssthresh value is as a result of the slow-down decision
>
> + PRR decides: given (a) and (b) decisions made by a separate congestion control algorithm, how to
> set the cwnd using each ACK
>
> How should that work, in practice, after a data sender detects a lost retransmit?
>
> Each time the data sender detects a lost retransmit, the congestion control algorithm should decide
> whether or not to slow down. I would think that, ideally, a well-designed congestion control
> algorithm should slow down multiplicatively once per round trip, for every round trip in which there
> is any loss detected (whether it is a lost retransmit or a lost original retransmission). Any time
> the congestion control algorithm decides to slow down (for whatever reason), it would initiate a new
> PRR episode, and invoke the PRR initialization code. That would take care of initializing RecoverFS,
> prr_deliverd, and prr_out, to ensure that the PRR behavior over the next round trip and beyond works
> correctly.

[MK2] To ensure that multiplicative decrease gets implemented
correctly, this document should give exact advice on how to compute the
new ssthresh value. If one follows current RFCs (e.g., RFC 5681 or
RFC 9438) and computes the new value of ssthresh (and cwnd) either
using FlightSize (ssthresh = 0.5 * FlightSize or
ssthresh = 0.7 * FlightSize) or cwnd (ssthresh = 0.5 * cwnd or
ssthresh = 0.7 * cwnd), the result is often not correct (or is more or
less random). This is because FlightSize becomes inflated during fast
recovery as the TCP sender sends new data during the recovery (before
any cumulative/partial ACKs arrive).
Similarly, with PRR, cwnd reaches the target (= correctly reduced) value
only at the end of recovery, meaning that cwnd is too big during the
recovery.

A typical, simple scenario with FlightSize, for example:

The amount of outstanding data is 100 segments and a loss is detected
-> ssthresh = 50 or 70 (assume the CC algo is Reno or Cubic) and
recovery starts. Assume the fast rexmitted (1st lost) segment gets
dropped. During the first RTT the TCP sender injects 50 or 70 new data
segments -> FlightSize = 150 or 170. Soon after the first RTT, the TCP
sender detects the loss of the rexmitted segment and computes:
ssthresh = 0.5 * 150 = 75 (Reno) or ssthresh = 0.7 * 170 = 119 (Cubic).
This results in a three times higher ssthresh value than expected with
Reno (= 25) and a ~2.5 times higher value than expected with Cubic
(= 49).

A simple scenario with cwnd, for example:

The amount of outstanding data is 100 segments and a loss is detected
-> ssthresh = 50 or 70 (assume the CC algo is Reno or Cubic) and
recovery starts. Assume there is a significant number of losses in the
current window of data and the fast rexmitted (= 1st lost) segment gets
dropped. In addition, there may be significant ACK loss. That is, a
typical case with very heavy congestion, where it would be crucial to
reduce ssthresh (and cwnd) correctly. During the first RTT only little
data gets delivered (i.e., hardly any data is SACKed and few additional
lost segments are detected), keeping cwnd ~ FlightSize (= cwnd before
entering recovery). When the lost rexmit is detected after one RTT, the
TCP sender computes the new ssthresh = 0.5 * cwnd or 0.7 * cwnd, and the
result is only a minimal reduction from the ssthresh used during the
first RTT of recovery, instead of lowering it twice with the same
multiplicative decrease factor.

I hope this clarifies the issue and the need to define that ssthresh
MUST NOT be reinitialized using FlightSize or cwnd.
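
To make the arithmetic in the two scenarios above easy to check, here is
a minimal Python sketch. It is only an illustration under the scenarios'
own simplifying assumptions: the function and key names are hypothetical,
the sender is assumed to inject exactly ssthresh new segments during the
first RTT with nothing cumulatively ACKed, and in the cwnd case cwnd is
assumed to stay at its pre-recovery value. Exact fractions are used so
that 0.7 * 170 comes out as exactly 119.

```python
from fractions import Fraction

# Multiplicative decrease factors used in the scenarios above.
RENO = Fraction(1, 2)    # 0.5
CUBIC = Fraction(7, 10)  # 0.7


def second_reduction(outstanding, beta):
    """Compare ways of recomputing ssthresh (in segments) when the loss
    of a retransmission is detected one RTT into fast recovery.

    All names here are hypothetical; this is not text from any RFC.
    """
    first_ssthresh = beta * outstanding
    # FlightSize scenario: ~first_ssthresh new segments were sent during
    # the first RTT and nothing was cumulatively ACKed (assumption).
    inflated_flightsize = outstanding + first_ssthresh
    # cwnd scenario: hardly anything was delivered, so cwnd is assumed to
    # still be about its value before entering recovery.
    stale_cwnd = outstanding
    return {
        "first": first_ssthresh,
        "from_flightsize": beta * inflated_flightsize,
        "from_cwnd": beta * stale_cwnd,
        # Two successive multiplicative decreases of the original window.
        "expected": beta * first_ssthresh,
    }


if __name__ == "__main__":
    for name, beta in (("Reno", RENO), ("Cubic", CUBIC)):
        result = second_reduction(100, beta)
        print(name, {key: float(value) for key, value in result.items()})
```

With 100 outstanding segments, the "from_flightsize" and "from_cwnd"
entries can be compared against "expected" for each factor.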

Maybe on detecting loss of a rexmit, ssthresh could be reinitialized as
ssthresh = multiplicative_decrease_factor * ssthresh?

In addition, when rereading the PRR algo I also found an additional
problem with the PRR algo that uses pipe (the RFC 6675 pipe algorithm)
together with RACK-TLP loss detection. The algo uses pipe as the (quite
accurate) estimate of outstanding data. However, the definition of pipe
depends on loss detection in RFC 6675, which defines a lost segment
differently from RACK-TLP. The pipe algo depends on the RFC 6675
IsLost() function, which requires three SACKed segments above the lost
segment to declare the segment lost, while this is often not the case
with RACK-TLP. Shouldn't the "pipe" estimate in the algo be based on the
loss detection algorithm in use? Otherwise, pipe may be badly off in a
number of scenarios.

The same holds for non-SACK fast recovery, that is, NewReno. I think the
document/algo should clarify that, without SACK, the SACKed segments for
the pipe calculation are estimated in a similar way as they are
estimated for DeliveredData (i.e., one SACKed segment = one duplicate
ACK).

Hope this is helpful.

Best regards,

/Markku

> I would propose that we make that more clear in the PRR document.
>
> In section 6, "Algorithm", I would propose we change the existing text:
>
>    At the beginning of recovery, initialize the PRR state.
> to:
>
>    At the beginning of a congestion control response episode initiated
>    by the congestion control algorithm, a TCP data sender using PRR
>    MUST initialize the PRR state. The timing of the start of a
>    congestion control response episode is entirely up to the
>    congestion control algorithm, and (for example) could correspond to
>    the start of a fast recovery episode, or a once-per-round-trip
>    reduction when lost retransmits or lost original transmissions are
>    detected after fast recovery is already in progress.
>
> How does that sound to everyone?
>
> neal