Re: [tcpm] WGLC: 2581bis
Markku Kojo <kojo@cs.helsinki.fi> Sat, 22 December 2007 03:04 UTC
Date: Sat, 22 Dec 2007 05:04:19 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: tcpm@ietf.org
Subject: Re: [tcpm] WGLC: 2581bis
In-Reply-To: <20071127004720.GD3385@hut.isi.edu>
Message-ID: <Pine.LNX.4.64.0712220127420.7480@x40-26.cs.helsinki.fi>
References: <20071127004720.GD3385@hut.isi.edu>
Cc: blanton@cs.purdue.edu, Ted Faber <faber@ISI.EDU>, vern@icir.org, mallman@icir.org
Ted, Mark, all,
I'd like to see this document advance to DS. However,
after reading the draft and looking at a few traces from the
Linux TCP implementation, it seems that there is one issue
that may need some attention first.
Looking into the Linux TCP behavior (which differs from what the
draft specifies) seems to reveal an issue with the draft: it uses
FlightSize, which is a bad estimate of the amount of outstanding
data in the *network*, and consequently adjusts ssthresh (and cwnd)
inappropriately. That is, replacing cwnd with FlightSize in
equation 4 seems to have resulted in problems similar to those that
existed earlier when cwnd was used in the equation:
When the Limited Transmit alg (step 1 of the fast rexmit & fast
recovery alg in Section 3.2) is used with the current definition of
FlightSize and equation 4, ssthresh and cwnd will be assigned a
larger value than would be appropriate. The reason is
that FlightSize is increased in step 1 and is then used in
steps 2 and 3 to determine the new values of ssthresh and cwnd.
However, allowing Limited Transmit to send a new data segment
on the arrival of the 1st and 2nd dupack rests on the assumption
that a dupack indicates that a segment has left the network
and thereby the number of outstanding segments in the network
remains unchanged (like cwnd remains).
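The divergence described above can be traced with a small sketch. This is only an illustration in units of whole segments; the function name and the simplified bookkeeping (one segment leaves the network per dupack, one new segment is sent per dupack) are assumptions made for this example and are not taken from the draft.

```python
def limited_transmit_trace(initial_flight, dupacks=2):
    """Trace (FlightSize, in-network estimate) after each dupack.

    Hypothetical model, in segments:
    - each dupack signals that one segment has left the network;
    - Limited Transmit then sends one new segment;
    - FlightSize counts data sent but not yet cumulatively acked,
      so it only grows while dupacks arrive.
    """
    flight_size = initial_flight
    in_network = initial_flight
    trace = []
    for _ in range(dupacks):
        in_network -= 1   # dupack: a segment has left the network
        in_network += 1   # Limited Transmit: one new segment enters
        flight_size += 1  # new segment is unacked, FlightSize grows
        trace.append((flight_size, in_network))
    return trace


# With 4 segments outstanding, FlightSize reaches 6 after the 2nd
# dupack while the in-network estimate stays at 4.
print(limited_transmit_trace(4))
```

Under these assumptions FlightSize overstates the data in the network by one segment per dupack handled by Limited Transmit.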
This means that using Limited Transmit results in less reduction
in ssthresh and cwnd compared to the case where Limited Transmit
is not in use. As the difference in the new ssthresh (and cwnd)
value is only (at most) one MSS, this does not make TCP
significantly more aggressive with large windows, but with a small
window size the difference is significant. For example, with cwnd
of 4 segments and a single segment loss, a TCP sender applying
Limited Transmit per current spec continues with cwnd of 3
segments while a TCP sender not applying Limited Transmit halves
its cwnd and continues with cwnd of 2 segments. This may have
a significant effect on a bottleneck link that is shared by a
number of connections proceeding with a small window.
One simple way to fix this is to redefine equation 4 as
ssthresh = max ( min(FlightSize,cwnd) / 2, 2*SMSS)
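As a rough check of the 4-segment example, the current equation 4 and the proposed variant can be compared numerically. This is a sketch only: the SMSS value of 1460 bytes and the function names are assumptions for illustration, and FlightSize is taken to be 6 segments after Limited Transmit has sent one new segment on each of the first two dupacks.

```python
SMSS = 1460  # bytes; assumed value, for illustration only


def ssthresh_eq4(flight_size, smss=SMSS):
    """Equation 4 as currently defined: max(FlightSize / 2, 2*SMSS)."""
    return max(flight_size // 2, 2 * smss)


def ssthresh_proposed(flight_size, cwnd, smss=SMSS):
    """Proposed variant: max(min(FlightSize, cwnd) / 2, 2*SMSS)."""
    return max(min(flight_size, cwnd) // 2, 2 * smss)


cwnd = 4 * SMSS

# No Limited Transmit: FlightSize is still 4 segments at the 3rd dupack.
without_lt = ssthresh_eq4(4 * SMSS)

# Limited Transmit: FlightSize has grown to 6 segments.
with_lt = ssthresh_eq4(6 * SMSS)

# Proposed variant caps FlightSize at cwnd before halving.
fixed = ssthresh_proposed(6 * SMSS, cwnd)

print(without_lt // SMSS, with_lt // SMSS, fixed // SMSS)
```

With these inputs the current equation gives 2 segments without Limited Transmit and 3 segments with it, while the proposed variant gives 2 segments in both cases.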
Similar problems in correctly determining a new value of ssthresh
may also occur in other cases where the actual amount of outstanding
data differs (significantly) from FlightSize, i.e., when the TCP
sender is already in loss recovery.
Linux does not experience this problem with FlightSize as it
maintains a more accurate estimate (akin to the pipe variable in RFC
3517) of the amount of outstanding data, and uses it to determine
the new value of ssthresh (and cwnd) when entering loss recovery.
Other comments/suggestions:
1. Section 3.1, 3rd para:
It might be useful to also note that the purpose of the slow start
algorithm is to (re)start the ack clock (in addition to determining
the available capacity).
2. Section 3.1:
"On the other hand, when a TCP sender detects segment loss using
the retransmission timer and the given segment has already been
retransmitted at least once, the value of ssthresh is held
constant."
It should be clarified that this applies only when the retransmission
timer expires again for the same segment, not when the retransmission
timer expires for a fast-retransmitted segment.
3. Section 3.2, 1st step of the fast retransmit and fast recovery alg:
It would be useful to note that allowing a TCP sender to send a
new data segment on the 1st and 2nd dupack violates
the definition of cwnd in Section 2:
"At any given time, a TCP MUST NOT send data with a sequence
number higher than the sum of the highest acknowledged sequence
number and the minimum of cwnd and rwnd."
4. Section 4.3:
"Loss in two successive windows of data, or the loss of a
retransmission, should be taken as two indications of congestion
and, therefore, cwnd (and ssthresh) MUST be lowered twice in this
case."
Lowering ssthresh twice on the loss of a retransmission triggered
by an RTO would contradict what is said in Section 3.1
(see item 2 above). The text should clarify that this is valid only
for the loss of a fast retransmit (or the loss of a retransmission in
fast recovery with an advanced loss recovery alg such as NewReno or
SACK-based fast recovery).
Thanks,
/Markku