[tcpm] 2581 implementation report, take 2
Mark Allman <mallman@icir.org> Tue, 30 October 2007 19:58 UTC
To: tcpm@ietf.org
From: Mark Allman <mallman@icir.org>
Organization: ICSI Center for Internet Research (ICIR)
Attached is a slightly tweaked version of the 2581 implementation report. The report includes input from the Linux community noting that RFC 2581 is implemented in their stack and that they have not seen any significant problems because of it.

allman
Background:
+ RFC 2581 is a re-write of RFC 2001. RFC 2001 was a description
of TCP's congestion control algorithms that was published long
after these algorithms were in nearly ubiquitous deployment
throughout the Internet (largely triggered by the congestion
collapses of the mid-1980s).
+ While RFC 2001 was a description of the algorithms, RFC 2581 is
a more traditional specification. We stress that the RFC was
written based on running code and experience.
+ The mechanisms in RFC 3042 (Limited Transmit) and RFC 3390
(Larger Initial Congestion Window) are also rolled into the
current document. Both of these enhancements are Proposed
Standards that have gathered wide consensus within the community
based on deployment experience.
+ The traditional test of two interoperable implementations to
move a Proposed Standard to Draft Standard is less obvious in
the case of congestion control mechanisms. Congestion control
is about *when* to send a segment and not *what* that segment
looks like, how to process it, how big fields are, etc.
Therefore, it is difficult to assess "interoperability" in the
traditional sense. Below we cite several sources that show or
suggest that multiple implementations of the mechanisms exist
and seem to work as intended.
+ The new version of the document clarifies a number of small
issues that implementers have asked about over the years, but
does not make any large changes to the algorithms.
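As a rough illustration of the RFC 3390 initial window mentioned above, the upper bound can be sketched as follows. The function name and the byte-based interface are assumptions made for this example and are not taken from the report.

```python
def initial_window(smss: int) -> int:
    """Upper bound on the initial congestion window, in bytes.

    Sketch of the RFC 3390 formula: min(4*SMSS, max(2*SMSS, 4380 bytes)).
    `smss` is the sender maximum segment size in bytes (hypothetical
    parameter name chosen for this example).
    """
    return min(4 * smss, max(2 * smss, 4380))


if __name__ == "__main__":
    for smss in (536, 1460, 4000):
        iw = initial_window(smss)
        print(f"SMSS={smss:5d}  IW={iw:5d} bytes  ({iw // smss} full segments)")
```

With common Ethernet-sized segments (SMSS of 1460 bytes) this gives three segments; small segments get four and very large segments get two.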
Known Implementations:
+ [WS95] discusses the BSD implementation of the core algorithms
in RFC 2581 (slow start, congestion avoidance, fast retransmit
and fast recovery). This implementation has formed the basis of
the TCP stack in numerous operating systems (NetBSD, FreeBSD,
OpenBSD, SunOS 4.x, BSDI, etc.). While various operating
systems may have diverged in small details (some of which are
documented in RFC 2581), the basic algorithms do not seem to have
changed.
+ Linux also supports RFC 2581 and does not report any adverse
impacts. See Attachment 1 below.
(The complaint in that email is not about the document itself or
even the algorithms within RFC 2581, but rather goes to our
congestion control principles. Further, as sketched, the
behavior given in RFC 2581 is more conservative than desired and
therefore, if this RFC is in error, it is erring in the right
direction for stable operation.)
+ [Pax97] analyzes a number of implementations, finding both
correct and incorrect behavior relative to RFC 2581 across a
variety of implementations. The incorrect behavior fed into
[RFC2525].
+ [MAF05] tests for conformance along a number of angles by
probing the TCPs of over 70K web servers with specialized packet
streams that induce the stack to show how it handles various
situations. The results include:
+ The vast majority of servers reduce their congestion window
by half in response to congestion (per RFC 2581's congestion
avoidance).
+ The majority of the web servers used an initial congestion
window of 1--2 packets.
+ Limited Transmit was used in over 20% of the servers.
+ While some servers do not use fast retransmit, the
overwhelming majority implement it.
+ Many web servers use the fast recovery algorithm (with a
number using more advanced recovery such as NewReno
[RFC3782] or SACK-based loss recovery techniques
[RFC2018,RFC3517]).
(Note that [MAF05] updates some of the results of [PF01]. The
newer results confirm the older results.)
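For readers who want the window arithmetic behind the conformance results above, here is a minimal sketch of the RFC 2581 sender-side updates: slow start, congestion avoidance, and the halving of ssthresh on loss. The class and function names are invented for this example, and fast retransmit/fast recovery and Limited Transmit are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class CongestionState:
    """Hypothetical container for the sender variables (all in bytes)."""
    smss: int      # sender maximum segment size
    cwnd: int      # congestion window
    ssthresh: int  # slow start threshold


def on_new_ack(s: CongestionState) -> None:
    """Grow cwnd when an ACK acknowledges new data."""
    if s.cwnd < s.ssthresh:
        # Slow start: at most one SMSS per ACK.
        s.cwnd += s.smss
    else:
        # Congestion avoidance: roughly one SMSS per round-trip time.
        # A result of 0 is rounded up to 1 byte.
        s.cwnd += max(1, s.smss * s.smss // s.cwnd)


def on_loss_detected(s: CongestionState, flight_size: int) -> None:
    """Halve the threshold based on the data in flight, floor of 2*SMSS."""
    s.ssthresh = max(flight_size // 2, 2 * s.smss)


def on_timeout(s: CongestionState, flight_size: int) -> None:
    """Retransmission timeout: reduce ssthresh and restart from one segment."""
    on_loss_detected(s, flight_size)
    s.cwnd = s.smss


if __name__ == "__main__":
    s = CongestionState(smss=1000, cwnd=2000, ssthresh=4000)
    for _ in range(3):
        on_new_ack(s)
        print("after ACK:", s)
    on_timeout(s, flight_size=4000)
    print("after timeout:", s)
```

The halving that [MAF05] observes corresponds to the ssthresh computation in on_loss_detected; what a stack does with cwnd afterwards depends on which loss recovery variant it implements.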
References:
[MAF05] Alberto Medina, Mark Allman, Sally Floyd. Measuring the
Evolution of Transport Protocols in the Internet. ACM Computer
Communication Review, 35(2), April 2005.
[Pax97] Vern Paxson. Automated Packet Trace Analysis of TCP
Implementations. ACM SIGCOMM, September 1997.
[PF01] Jitu Padhye, Sally Floyd. Identifying the TCP Behavior of
Web Servers. ACM SIGCOMM, August 2001.
[WS95] Wright, G. and W. Stevens, "TCP/IP Illustrated, Volume 2: The
Implementation", Addison-Wesley, 1995.
Attachment 1:
Date: Mon, 24 Sep 2007 19:55:20 PDT
To: mallman@icir.org
From: Stephen Hemminger <shemminger@linux-foundation.org>
Subject: Re: rfc2581
[...]
Yes Linux implements RFC2581 and has not had any unstable or
congestion problems caused by that. In recent years, there has
been lots of refinements and alternatives added, but all the other
algorithms are more complex attempts to ensure proper and stable
response in "corner case" domains of large delay bandwidth
products and/or small router queues.
Linux also implements RFC2861 (congestion window validation) by
default which makes it less aggressive than many other
implementations. Because this caused some bursty applications to
have poor performance it was made optional.
The only real complaint against the principles of congestion
control has come from the financial community. Slow start can
cause connections to have latency, and when latency equates to
real $$ during transactions, customers get very sensitive to the
added delay. For a discussion of this see the presentation from
Credit Suisse at this 2007 Kernel
Summit. http://lwn.net/Articles/248878/ For that reason, they are
looking to alternatives to TCP/IP such as Infiniband.
--
Stephen Hemminger <shemminger@linux-foundation.org>