Re: [tcpm] Last Call: <draft-ietf-tcpm-rack-13.txt> (The RACK-TLP loss detection algorithm for TCP) to Proposed Standard
Markku Kojo <kojo@cs.helsinki.fi> Fri, 04 December 2020 13:02 UTC
Date: Fri, 04 Dec 2020 15:01:50 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: last-call@ietf.org
cc: IETF-Announce <ietf-announce@ietf.org>, tcpm@ietf.org, draft-ietf-tcpm-rack@ietf.org, tuexen@fh-muenster.de, draft-ietf-tcpm-rack.all@ietf.org, tcpm-chairs@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/KJd-jheMK5GezpUo-p1IcoajD4M>
Hi all,

I know this is a bit late but I didn't have time earlier to take a look at this draft.

Given that this RFC-to-be is standards track and RECOMMENDED to replace current DupAck-based loss detection, it is important that the spec is clear on its advice to those implementing it. The current text seems to lack important advice w.r.t. congestion control, and even though the spec tries to decouple loss detection from congestion control and does not intend to modify existing standard congestion control, some of the examples advise incorrect congestion control actions. Therefore, I think it is worth correcting the mistakes and taking yet another look at a few implications of this specification.

Sec. 3.4 (and elsewhere when discussing recovering a dropped retransmission):

It is very useful that RACK-TLP allows for recovering dropped rexmits. However, it seems that the spec ignores the fact that loss of a retransmission is a loss in a successive window that requires reacting to congestion twice as per RFC 5681. This advice must be included in the specification because with RACK-TLP recovery of a dropped rexmit takes place during the fast recovery, which is very different from the other standard algorithms and therefore easy to miss when implementing this spec.

Sec 9.3:

In Section 9.3 it is stated that the only modification to the existing congestion control algorithms is that one outstanding loss probe can be sent even if the congestion window is fully used. This is fine, but the spec lacks the advice that if a new data segment is sent, this extra segment MUST NOT be included when calculating the new value of ssthresh as per the equation (4) of RFC 5681. Such a segment is an extra segment not allowed by cwnd, so it must be excluded from FlightSize, if the TLP probe detects loss or if there is no ack and RTO is needed to trigger loss recovery.
In these cases the temporary over-commit is not accounted for, as DupAck does not decrease FlightSize and in case of an RTO the next ACK comes too late. This is similar to the rule in RFC 5681 and RFC 6675 that prohibits including the segments transmitted via Limited Transmit in the calculation of ssthresh.

In Section 9.3 a few example scenarios are used to illustrate the intended operation of RACK-TLP. In the first example a sender has a congestion window (cwnd) of 20 segments on a SACK-enabled connection. It sends 10 data segments and all of them are lost. The text claims that without RACK-TLP the ending cwnd would be 4 segments due to congestion window validation. This is incorrect. As per RFC 7661 the sender MUST exit the non-validated phase upon an RTO. Therefore the ending cwnd would be 5 segments (or 5 1/2 segments if the TCP sender uses the equation (4) of RFC 5681).

The operation with RACK-TLP would inevitably result in congestion collapse if RACK-TLP behaves as described in the example, because it restores the previous cwnd of 10 segments after the fast recovery and would not react to congestion at all! I think this is not the intended behavior by this spec but a mistake in the example. The ssthresh calculated in the beginning of loss recovery should be 5 segments as per RFC 6675 (and RFC 5681).

Furthermore, it seems that this example with RACK-TLP refers to using PRR_SSRB, which effectively implements regular slow start in this case(?). From a congestion control point of view this is correct because the entire flight of data as well as the ack clock was lost. However, as correctly discussed in Sec 2, the congestion window must be reset to 1 MSS when an entire flight of data and the ack clock are lost. But how can an implementor know what to do if she/he is not implementing the experimental PRR algorithm?
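The two points above (excluding the extra probe segment from FlightSize, and resetting cwnd when the ack clock is lost) can be illustrated with a minimal Python sketch. It is not text from the draft or from any RFC: the names `RecoveryState`, `enter_recovery`, `probe_bytes` and `ack_clock_lost` are hypothetical, the 1000-byte SMSS is an arbitrary illustrative value, and setting cwnd to ssthresh in the non-reset case is an assumption modelled on RFC 6675.

```python
from dataclasses import dataclass

SMSS = 1000  # illustrative segment size in bytes (assumption)


@dataclass
class RecoveryState:
    cwnd: int      # congestion window in bytes
    ssthresh: int  # slow start threshold in bytes


def enter_recovery(flight_size: int, probe_bytes: int,
                   ack_clock_lost: bool, smss: int = SMSS) -> RecoveryState:
    """Hypothetical helper: pick ssthresh and cwnd at the start of loss recovery.

    ssthresh follows equation (4) of RFC 5681, max(FlightSize / 2, 2 * SMSS),
    with the probe segment sent beyond cwnd excluded from FlightSize.
    """
    counted_flight = max(flight_size - probe_bytes, 0)
    ssthresh = max(counted_flight // 2, 2 * smss)
    # Assumption: with no ack clock left, restart from 1 SMSS (slow start);
    # otherwise continue at ssthresh.
    cwnd = smss if ack_clock_lost else ssthresh
    return RecoveryState(cwnd=cwnd, ssthresh=ssthresh)


# Ten segments allowed by cwnd plus one new-data probe, everything lost.
tail_loss = enter_recovery(11 * SMSS, SMSS, ack_clock_lost=True)

# Ten segments in flight, some of them acknowledged, no probe involved.
partial_loss = enter_recovery(10 * SMSS, 0, ack_clock_lost=False)
```

In both calls ssthresh comes out as 5 segments; only the starting cwnd differs, which is the distinction an implementation without PRR would need to be told about.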
This spec presents itself as an alternative to DupAck counting, indicating that TLP is used to trigger Fast Retransmit & Fast Recovery only, not a loss recovery in slow start. This means that without additional advice an implementation of this spec would just halve the cwnd and ssthresh and send a potentially very large burst of segments in the beginning of the Fast Recovery, because there is no ack clock. So, this spec begs for advice (MUST) on when to slow start and reset cwnd and when not, or at least a discussion of this problem and some sort of advice on what to do and what to avoid. And, maybe a recommendation to implement it with PRR?

Another question relates to the use of TLP and adjusting timer(s) upon timeout. In the same example discussed above, it is clear that the PTO that fires TLP is just a more aggressive retransmit timer with an alternative data segment to (re)transmit. Therefore, as per RFC 2914 (BCP 41), Sec 9.1, when the PTO expires, it is in effect a retransmission timeout and the timer(s) must be backed off. This is not advised in this specification. Whether it is the TCP RTO or the PTO that should be backed off is an open question. Otherwise, if the congestion is persistent and further transmissions are also lost, RACK-TLP would not react to congestion properly but would keep retransmitting with a "constant" timer value because a new RTT estimate cannot be obtained. On a buffer-bloated and heavily congested bottleneck this would easily result in sending at least one unnecessary retransmission per one delivered segment, which is not advisable (e.g., when there are a huge number of applications sharing a constrained bottleneck and these applications are sending only one (or a few) segments and then waiting for a reply from the peer before sending another request).

Additional notes:

Sec 2.2:

Example 2: "Lost retransmissions cause a resort to RTO recovery, since DUPACK-counting does not detect the loss of the retransmissions.
Then the slow start after RTO recovery could cause burst losses again that severely degrades performance [POLICER16]."

RTO recovery is done in slow start. The last sentence is confusing as there is no (new) slow start after RTO recovery (or more precisely, slow start continues until cwnd > ssthresh). Do you mean: if/when slow start still continues after RTO recovery has repaired lost segments, it may cause burst losses again?

Example 3: "If the reordering degree is beyond DupThresh, the DUPACK-counting can cause a spurious fast recovery and unnecessary congestion window reduction. To mitigate the issue, [RFC4653] adjusts DupThresh to half of the inflight size to tolerate the higher degree of reordering. However if more than half of the inflight is lost, then the sender has to resort to RTO recovery."

This seems to be a somewhat incorrect description of TCP-NCR specified in RFC 4653. TCP-NCR uses Extended Limited Transmit that keeps on sending new data segments on DupAcks, which makes it likely to avoid an RTO in the given example scenario, if not too many of the new data segments triggered by Extended Limited Transmit are lost.

Sec. 3.5:

"For example, consider a simple case where one segment was sent with an RTO of 1 second, and then the application writes more data, causing a second and third segment to be sent right before the RTO of the first segment expires. Suppose only the first segment is lost. Without RACK, upon RTO expiration the sender marks all three segments as lost and retransmits the first segment. When the sender receives the ACK that selectively acknowledges the second segment, the sender spuriously retransmits the third segment."

This seems incorrect. When the sender receives the ACK that selectively acknowledges the second segment, it is a DupAck as per RFC 6675 and does not increase cwnd, so cwnd remains 1 MSS and pipe is 1 MSS. So, the rexmit of the third segment is not allowed until the cumulative ACK of the first segment arrives.
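The Sec. 3.5 argument can be traced with a small Python sketch. It is only an illustration under stated assumptions: `may_transmit` is a hypothetical helper that applies the RFC 6675 condition "cwnd - pipe >= 1 SMSS", the 1000-byte SMSS is arbitrary, and the cwnd growth on the cumulative ACK assumes ordinary slow start.

```python
SMSS = 1000  # illustrative segment size in bytes (assumption)


def may_transmit(cwnd: int, pipe: int, smss: int = SMSS) -> bool:
    """Hypothetical helper: RFC 6675 allows sending when cwnd - pipe >= 1 SMSS."""
    return cwnd - pipe >= smss


# RTO has expired: cwnd is 1 SMSS and the first segment has been
# retransmitted, so that retransmission is the only data counted in pipe.
cwnd, pipe = SMSS, SMSS

# The ACK that SACKs the second segment is a DupAck: cwnd and pipe are unchanged.
allowed_after_sack = may_transmit(cwnd, pipe)

# The cumulative ACK for the retransmitted first segment arrives:
# pipe drops to zero and slow start grows cwnd by one SMSS.
pipe = 0
cwnd += SMSS
allowed_after_cumulative_ack = may_transmit(cwnd, pipe)
```

Here `allowed_after_sack` is false and `allowed_after_cumulative_ack` is true, matching the claim that the third segment cannot be retransmitted on the SACK alone.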

Best regards,

/Markku

On Mon, 16 Nov 2020, The IESG wrote:

> The IESG has received a request from the TCP Maintenance and Minor Extensions
> WG (tcpm) to consider the following document: - 'The RACK-TLP loss detection
> algorithm for TCP'
> <draft-ietf-tcpm-rack-13.txt> as Proposed Standard
>
> The IESG plans to make a decision in the next few weeks, and solicits final
> comments on this action. Please send substantive comments to the
> last-call@ietf.org mailing lists by 2020-11-30. Exceptionally, comments may
> be sent to iesg@ietf.org instead. In either case, please retain the beginning
> of the Subject line to allow automated sorting.
>
> Abstract
>
> This document presents the RACK-TLP loss detection algorithm for TCP.
> RACK-TLP uses per-segment transmit timestamps and selective
> acknowledgements (SACK) and has two parts: RACK ("Recent
> ACKnowledgment") starts fast recovery quickly using time-based
> inferences derived from ACK feedback. TLP ("Tail Loss Probe")
> leverages RACK and sends a probe packet to trigger ACK feedback to
> avoid retransmission timeout (RTO) events. Compared to the widely
> used DUPACK threshold approach, RACK-TLP detects losses more
> efficiently when there are application-limited flights of data, lost
> retransmissions, or data packet reordering events. It is intended to
> be an alternative to the DUPACK threshold approach.
>
> The file can be obtained via
> https://datatracker.ietf.org/doc/draft-ietf-tcpm-rack/
>
> No IPR declarations have been submitted directly on this I-D.
>
> _______________________________________________
> tcpm mailing list
> tcpm@ietf.org
> https://www.ietf.org/mailman/listinfo/tcpm