Re: [tcpm] WGLC: 2581bis
Markku Kojo <kojo@cs.helsinki.fi> Sat, 22 December 2007 03:04 UTC
Date: Sat, 22 Dec 2007 05:04:19 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: tcpm@ietf.org
Subject: Re: [tcpm] WGLC: 2581bis
In-Reply-To: <20071127004720.GD3385@hut.isi.edu>
Message-ID: <Pine.LNX.4.64.0712220127420.7480@x40-26.cs.helsinki.fi>
References: <20071127004720.GD3385@hut.isi.edu>
Cc: blanton@cs.purdue.edu, Ted Faber <faber@ISI.EDU>, vern@icir.org, mallman@icir.org

Ted, Mark, all,

I'd like to see this document advance to DS. However, after reading the draft and looking at a few traces from the Linux TCP implementation, it seems that there is one issue that may need some attention first.

Looking into the Linux TCP behavior (which differs from what the draft specifies) seems to reveal an issue with the draft: FlightSize is a bad estimate of the amount of outstanding data in the *network*, and using it leads to an inappropriate adjustment of ssthresh (and cwnd). That is, replacing cwnd with FlightSize in equation 4 seems to have resulted in problems similar to those that existed earlier when cwnd was used in the equation.

When the Limited Transmit algorithm (step 1 of the fast retransmit and fast recovery algorithm in Section 3.2) is used with the current definition of FlightSize and equation 4, ssthresh and cwnd are assigned a larger value than would be appropriate. The reason is that FlightSize is increased in step 1 and is then used in steps 2 and 3 to determine the new values of ssthresh and cwnd. However, allowing Limited Transmit to send a new data segment on the arrival of the 1st and 2nd dupack rests on the assumption that a dupack indicates that a segment has left the network, so that the number of outstanding segments in the network remains unchanged (just as cwnd remains unchanged). This means that using Limited Transmit results in a smaller reduction in ssthresh and cwnd than when Limited Transmit is not in use.

As the difference in the new ssthresh (and cwnd) value is at most one MSS, this does not make TCP significantly more aggressive with large windows, but with a small window size the difference is significant. For example, with a cwnd of 4 segments and a single segment loss, a TCP sender applying Limited Transmit per the current spec continues with a cwnd of 3 segments, while a TCP sender not applying Limited Transmit halves its cwnd and continues with a cwnd of 2 segments.
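
The arithmetic behind this example can be sketched as follows. This is only an illustration, not text from the draft or code from any implementation: the SMSS value of 1460 bytes and the helper names ssthresh_eq4 and flight_size_at_third_dupack are assumptions made for the sketch, and it assumes that FlightSize equals cwnd (4 segments) at the time of the loss.

```python
SMSS = 1460  # assumed sender maximum segment size in bytes (illustrative value)


def ssthresh_eq4(flight_size: int, smss: int = SMSS) -> int:
    """Equation 4 as quoted in the message: max(FlightSize / 2, 2*SMSS)."""
    return max(flight_size // 2, 2 * smss)


def flight_size_at_third_dupack(initial_flight: int,
                                limited_transmit: bool,
                                smss: int = SMSS) -> int:
    """FlightSize when the third duplicate ACK arrives (hypothetical helper).

    Duplicate ACKs do not advance the cumulative ACK point, so FlightSize
    does not shrink here. With Limited Transmit, one new segment is sent on
    each of the first two duplicate ACKs, so FlightSize grows by 2*SMSS.
    """
    extra = 2 * smss if limited_transmit else 0
    return initial_flight + extra


if __name__ == "__main__":
    initial = 4 * SMSS  # cwnd of 4 segments, all in flight, one of them lost
    for lt in (False, True):
        fs = flight_size_at_third_dupack(initial, lt)
        print(f"limited_transmit={lt}: "
              f"FlightSize={fs // SMSS} segments, "
              f"ssthresh={ssthresh_eq4(fs) // SMSS} segments")
```

Under these assumptions the sender without Limited Transmit ends up with an ssthresh of 2 segments (FlightSize 4), and the sender with Limited Transmit ends up with 3 segments (FlightSize 6).
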
The difference may have a significant effect on a bottleneck link that is shared by a number of connections proceeding with a small window.

One simple possibility for fixing this is to redefine equation 4 as

   ssthresh = max (min (FlightSize, cwnd) / 2, 2*SMSS)

Similar problems in correctly determining a new value of ssthresh may also occur in other cases where the actual amount of outstanding data differs (significantly) from FlightSize, i.e., when the TCP sender is already in loss recovery. Linux does not experience this problem with FlightSize, as it maintains a more accurate estimate (akin to the pipe variable in RFC 3517) of the amount of outstanding data and uses it to determine the new value of ssthresh (and cwnd) when entering loss recovery.

Other comments/suggestions:

1. Section 3.1, 3rd para: It might be useful to also note that the purpose of the slow start algorithm is to (re)start the ack clock (in addition to determining the available capacity).

2. Section 3.1:

   "On the other hand, when a TCP sender detects segment loss using the retransmission timer and the given segment has already been retransmitted at least once, the value of ssthresh is held constant."

   It should be clarified that this applies only when the retransmission timer expires again for the same segment, not when the retransmission timer expires for a fast retransmitted segment.

3. Section 3.2, 1st step of the fast retransmit and fast recovery algorithm: It would be useful to note that allowing a TCP sender to send a new data segment on the 1st and 2nd dupack is in violation of the definition of cwnd in Section 2:

   "At any given time, a TCP MUST NOT send data with a sequence number higher than the sum of the highest acknowledged sequence number and the minimum of cwnd and rwnd."

4. Section 4.3:

   "Loss in two successive windows of data, or the loss of a retransmission, should be taken as two indications of congestion and, therefore, cwnd (and ssthresh) MUST be lowered twice in this case."

   Lowering ssthresh twice on the loss of a retransmission triggered by an RTO would contradict what is said in Section 3.1 (see item 2 above). The text should clarify that this is valid only for the loss of a fast retransmit (or the loss of a retransmission in fast recovery with an advanced loss recovery algorithm such as NewReno or SACK-based fast recovery).

Thanks,

/Markku
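
For what it is worth, the min(FlightSize, cwnd) variant of equation 4 suggested earlier in this message can be checked against the same 4-segment example. Again this is a sketch under assumptions: the SMSS value of 1460 bytes and the function name ssthresh_proposed are illustrative only and come from neither the draft nor any implementation.

```python
SMSS = 1460  # assumed sender maximum segment size in bytes (illustrative value)


def ssthresh_proposed(flight_size: int, cwnd: int, smss: int = SMSS) -> int:
    """Suggested variant: max(min(FlightSize, cwnd) / 2, 2*SMSS)."""
    return max(min(flight_size, cwnd) // 2, 2 * smss)


if __name__ == "__main__":
    cwnd = 4 * SMSS
    # FlightSize is 6 segments if Limited Transmit sent two new segments
    # before the third dupack, and 4 segments otherwise.
    for flight_segments in (4, 6):
        result = ssthresh_proposed(flight_segments * SMSS, cwnd)
        print(f"FlightSize={flight_segments} segments: "
              f"ssthresh={result // SMSS} segments")
```

With cwnd bounding the estimate, both cases yield an ssthresh of 2 segments in this example, so the extra segments sent by Limited Transmit no longer change the result.
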