[tcpm] Window update algorithm differences
Andre Oppermann <andre@freebsd.org> Wed, 11 June 2008 12:50 UTC
Message-ID: <484FCA2C.2020600@freebsd.org>
Date: Wed, 11 Jun 2008 14:50:52 +0200
From: Andre Oppermann <andre@freebsd.org>
To: tcpm@ietf.org
Subject: [tcpm] Window update algorithm differences

There is considerable disagreement among operating systems about the
correctness of the original window update test. Here is an overview of
the approaches currently used by the popular open source TCP
implementations:

 RFC793: section 3.9, page 72

   SND.UNA < SEG.ACK =< SND.NXT, update window but not SND.WU.[SEQ|ACK]

   (SND.WU.SEQ < SEG.SEQ or
    (SND.WU.SEQ = SEG.SEQ and SND.WU.ACK =< SEG.ACK))
   update everything.

 Stevens Vol.2: section 29.7, page 981-983
 FreeBSD: src/sys/netinet/tcp_input.c, rev. 1.376
 OpenBSD: src/sys/netinet/tcp_input.c, rev. 1.215
 NetBSD: src/sys/netinet/tcp_input.c, rev. 1.287

   SEG.SEQ > SND.WU.SEQ or
   (SEG.SEQ = SND.WU.SEQ and
    (SEG.ACK > SND.WU.ACK or
     (SEG.ACK = SND.WU.ACK and SEG.WND > SND.WND)))
   update everything.

 OpenSolaris: src/uts/common/inet/tcp/tcp.c, @swnd_update, rev. 6707

   SEG.ACK > SND.WU.ACK or
   SEG.SEQ > SND.WU.SEQ or
   (SEG.SEQ = SND.WU.SEQ and SEG.WND > SND.WND)
   update everything.

 Linux: net/ipv4/tcp_input.c, @tcp_ack_update_window(), rel. 2.6.25

   SEG.ACK > SND.UNA or
   SEG.SEQ > SND.WU.SEQ or
   (SEG.SEQ = SND.WU.SEQ and SEG.WND > SND.WND)
   update everything.

The OpenSolaris code contains some comments claiming that its test is
better in the case of bi-directional traffic, and alleging problems
with the RFC793 method. The Linux code contains some general comments
about the incorrectness of the BSD method without further elaboration.

The obvious question is: which one is correct, or better than the
others?

Let's have a look at the basic requirement of the send window update.

 o Only segments newer than those already seen should update the send
   window, to prevent old and outdated information from being used.

 o Everything revolves around how to reliably detect newer updates.

Let's have a look at what makes a segment new:

 o When using timestamps, either the reflected TS is higher than the
   last one we got (we're sending data), or the TS from the other end
   is newer than what we currently reflect (we're receiving data or a
   window update).
   Problem: what to do when the round trip time is shorter than the
   timestamp resolution? Fall back to the SEQ and ACK checks.

     SEG.TSECR > TS_RECENT_AGE or SEG.TSVAL > TS_RECENT

 o Data we sent has been ack'ed.
   Problem: None really. Doesn't trigger on old retransmits or
   out-of-order segments.

     SEG.ACK > SND.UNA (and implicit SEG.ACK <= SND.NXT)

 o We receive new data.
   Problem: out-of-order segments going into the reassembly queue,
   retransmits of missing segments, reordering of segments. A
   retransmit contains a newer value.

     SEG.SEQ > RCV.NXT

 o No data sent or received but the window increases.
   Problem: old delayed segment. Only allow if the window increases.

     SEG.WND > SND.WND

Hence I propose the following updated acceptable window update check:

 [1] (TS and (SEG.TSECR > TS_RECENT_AGE or SEG.TSVAL > TS_RECENT)) or
 [2] SEG.ACK > SND.UNA or
 [3] (SEG.SEQ > SND.WU.SEQ and SEG.ACK >= SND.UNA) or
 [4] (SEG.SEQ = SND.WU.SEQ and SEG.ACK = SND.UNA and
      SEG.WND > SND.WND)
       SND.WND <- SEG.WND
 [5]   SEG.SEQ > SND.WU.SEQ
         SND.WU.SEQ <- SEG.SEQ
 [6]     (SND.WU.ACK <- SEG.ACK)

 [1] If either timestamp is newer than what we've already seen, this
     is a new segment and the window it contains is certain to be
     valid without any further checks.

 [2] This is a reliable indicator of a genuine window update. With the
     arrival of an ACK for new data the window has also been updated.

 [3] A higher sequence number tells us new data was received, but if
     the ACK is lower than what we've already seen it must be a
     retransmit.

 [4] This is a pure window update: the sequence number is the same,
     the ACK is not lower than what we've already seen and the
     advertised window is larger than the one we had.

 [5] Only advance the sequence number of the last segment that gave
     us a window update if the new one is higher than what we have.
     This prevents retransmitted or reordered segments without a new
     ACK from updating our window. With timestamps we can reliably
     differentiate retransmits from out-of-order segments.

 [6] Tracking the last ACK that updated the window has become
     unnecessary.
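
As an illustration only, the proposed check [1] to [5] can be sketched in
C roughly as follows. This is not code from any of the stacks listed
above. The struct layouts, field names and the function name
window_update() are hypothetical; the SEQ_GT/SEQ_GEQ macros merely follow
the usual BSD style of wrap-safe 32-bit comparison. The sketch assumes
that SEG.ACK =< SND.NXT has already been verified, that SEG.WND is
already scaled, and that the field called ts_recent_age holds whatever
value the proposal compares SEG.TSECR against.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe comparison of 32-bit sequence numbers (also used here for
 * timestamps, which are 32-bit modular values as well). */
#define SEQ_GT(a, b)   ((int32_t)((uint32_t)(a) - (uint32_t)(b)) > 0)
#define SEQ_GEQ(a, b)  ((int32_t)((uint32_t)(a) - (uint32_t)(b)) >= 0)

/* Hypothetical sender-side state; names are illustrative only. */
struct snd_state {
    uint32_t snd_una;        /* SND.UNA */
    uint32_t snd_wnd;        /* SND.WND */
    uint32_t snd_wu_seq;     /* SND.WU.SEQ (SND.WL1 in RFC793) */
    bool     ts_enabled;     /* timestamps negotiated ("TS") */
    uint32_t ts_recent;      /* TS_RECENT */
    uint32_t ts_recent_age;  /* TS_RECENT_AGE as used in the proposal */
};

/* Hypothetical view of the relevant fields of an incoming segment. */
struct seg {
    uint32_t seq, ack, wnd;  /* SEG.SEQ, SEG.ACK, SEG.WND */
    uint32_t tsval, tsecr;   /* SEG.TSVAL, SEG.TSECR */
};

/*
 * Apply the proposed window update test. Returns true if the segment
 * was accepted as a window update. Updating SND.UNA and TS_RECENT is
 * assumed to happen elsewhere in ACK processing.
 */
static bool
window_update(struct snd_state *tp, const struct seg *s)
{
    /* [1] either timestamp is newer than what we have seen */
    bool ts_newer = tp->ts_enabled &&
        (SEQ_GT(s->tsecr, tp->ts_recent_age) ||
         SEQ_GT(s->tsval, tp->ts_recent));

    bool accept =
        ts_newer ||
        /* [2] data we sent has been ack'ed */
        SEQ_GT(s->ack, tp->snd_una) ||
        /* [3] new data arrived and the ACK is not older than SND.UNA */
        (SEQ_GT(s->seq, tp->snd_wu_seq) &&
         SEQ_GEQ(s->ack, tp->snd_una)) ||
        /* [4] pure window update that opens the window */
        (s->seq == tp->snd_wu_seq && s->ack == tp->snd_una &&
         s->wnd > tp->snd_wnd);

    if (!accept)
        return false;

    tp->snd_wnd = s->wnd;

    /* [5] only ever move SND.WU.SEQ forward */
    if (SEQ_GT(s->seq, tp->snd_wu_seq))
        tp->snd_wu_seq = s->seq;

    /* [6] SND.WU.ACK is deliberately not tracked */
    return true;
}
```

A caller would invoke window_update() once per acceptable incoming
segment. Note that an old segment accepted through [1] or [2] still
changes SND.WND but leaves SND.WU.SEQ untouched because of the guard
in [5].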

SND.WU.ACK is also known as SND.WL2.

Cases:

 o In unidirectional send we always trigger on [2] when our data is
   ack'ed.

 o In unidirectional receive we trigger on [3] for in-order segments.
   Out-of-order segments do not update the window unless they advance
   SND.WU.SEQ. Retransmits are not detected unless timestamps are
   enabled. In that case [1] triggers if the RTT is larger than the
   resolution of the timestamp clock. Otherwise window updates will
   resume when all missing segments are retransmitted and new
   segments beyond SND.WU.SEQ arrive.

 o In bidirectional traffic we trigger on [3] and [2]. If the
   transfer has loss or is re-ordered in either or both directions we
   also trigger in all important cases: on [2] when new data was
   ack'ed, and on [3] when new data with an up-to-date ACK is
   received.

Above all, the timestamp check accepts all new segments no matter what
order they arrive in.

Feedback and corrections are welcome.

BTW: TAC's are gone for good!

-- 
Andre
andre@FreeBSD.org