Re: [tcpm] new version of 2988bis

"Scheffenegger, Richard" <rs@netapp.com> Thu, 30 December 2010 13:28 UTC

Date: Thu, 30 Dec 2010 13:28:17 -0000
Message-ID: <5FDC413D5FA246468C200652D63E627A0C2CA4C2@LDCMVEXC1-PRD.hq.netapp.com>
In-Reply-To: <AANLkTi==-jZx2+JTTYPbtRu=hhrPPNhDONkPQvNkQozG@mail.gmail.com>
References: <20101207033356.439642715136@lawyers.icir.org><AANLkTimTJSGMOuay10krCjJbu6pPnoGFuirz_Q3_tk0F@mail.gmail.com><5FDC413D5FA246468C200652D63E627A0C0A2FD4@LDCMVEXC1-PRD.hq.netapp.com><AANLkTim3BXjxrBr7e19y0KZ_m7j+v2O7z50h9XKzeiSw@mail.gmail.com> <AANLkTi==-jZx2+JTTYPbtRu=hhrPPNhDONkPQvNkQozG@mail.gmail.com>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: Jerry Chu <hkchu@google.com>, "Per Hurtig (work)" <per.hurtig@kau.se>
Cc: tcpm@ietf.org, mallman@icir.org
Subject: Re: [tcpm] new version of 2988bis

 

Thank you Jerry!

 

Per,

 

The sender cannot know which segment will eventually be covered by a
cumulative ACK (it could be any segment in the current window), so a
single variable in the tcpcb does not appear to be able to hold enough
information.
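
To illustrate the concern, here is a minimal sketch of the restart rule
"expire RTO seconds after the earliest outstanding segment was
transmitted". It is not taken from the draft or from any stack; the
class and method names are hypothetical, ACKs are assumed to fall on
segment boundaries, and retransmissions are ignored. It keeps one send
time per segment, because after a cumulative ACK the new earliest
outstanding segment can be any segment in the window:

```python
from collections import OrderedDict


class RestartTimerSketch:
    """Hypothetical sender state for illustration; not a real tcpcb."""

    def __init__(self, rto):
        self.rto = rto
        # Send time of every unacknowledged segment, keyed by its
        # starting sequence number, kept in transmission order.
        self.sent_at = OrderedDict()

    def on_send(self, seq, now):
        """Record the transmission time of a new segment."""
        self.sent_at[seq] = now

    def on_cumulative_ack(self, ack_seq):
        """Drop segments starting below ack_seq; return the new expiry.

        Returns None when nothing is outstanding (timer stopped).
        """
        for seq in [s for s in self.sent_at if s < ack_seq]:
            del self.sent_at[seq]
        if not self.sent_at:
            return None
        earliest_seq = next(iter(self.sent_at))
        # Expire RTO seconds after the earliest outstanding segment
        # was transmitted, not RTO seconds after "now".
        return self.sent_at[earliest_seq] + self.rto
```

With four segments sent at t = 0.00, 0.01, 0.02 and 0.03 and an RTO of
1.0, an ACK covering the first two yields an expiry of about 1.02, and an
ACK covering the first three yields about 1.03. Which of those send times
is needed is only known once the ACK arrives.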

 

Of course, if you have additional side information for each segment (as
the Linux TCP stack does) or information carried back for each segment
(such as timestamps reflecting the TCP sender clock perfectly), you can
get away with much less state in the sender. (I think the TS option
would be the only viable way for BSD-derived TCP stacks.)

 

 

Also, I am very interested in your results on the slow link, in
particular whether the variability induced by the serialization delay
would be enough to prevent undue spurious RTOs. Note that RFC1323 rules
out the use of timestamps for RTT calculation during a window containing
loss...

Thus I strongly suspect that if you are sending a burst of data and
lose one early segment (where serialization delay is still low), the
sender will not be able to adjust the RTTM (and variability) properly,
resulting in exactly the scenario described by Jerry (many more
spurious RTOs).
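
As a rough worked example of the effect, the sketch below computes the
RTT seen by each segment of a back-to-back burst on a slow link. All
numbers are assumptions picked for illustration (1 Mbit/s bottleneck,
1500-byte segments, 50 ms base RTT excluding serialization, an otherwise
empty queue, negligible ACK serialization), and the function name is
hypothetical:

```python
def burst_rtts(n_segments, link_bps=1_000_000, seg_bytes=1500, base_rtt=0.050):
    """Return the RTT (seconds) seen by each segment of a burst.

    The k-th segment (1-based) waits for the k-1 segments ahead of it
    and for its own serialization, so its RTT is base_rtt + k * ser.
    """
    ser = seg_bytes * 8 / link_bps  # serialization time of one segment
    return [base_rtt + (k + 1) * ser for k in range(n_segments)]


if __name__ == "__main__":
    for k, rtt in enumerate(burst_rtts(10), start=1):
        print(f"segment {k:2d}: RTT = {rtt * 1000:.0f} ms")
```

Under these assumptions the first segment sees about 62 ms and the tenth
about 170 ms. If the samples from the later segments are discarded
because the window contained a loss, the estimator only ever sees values
near the low end of that range.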

 

Also, even though the RFCs set minRTO to 1 sec, all available stacks
violate this by default today, afaik. Typical minRTO is between 100 and
400 msec, and there are stacks whose default minRTO is in the low tens
of msec...

 

Thus it might be more prudent to verify whether the timeliness of latency
feedback to the sender is sufficient to work with the current RTTM / RTO
tunables (sRTT + 4 * varRTT), or whether that formula needs to be adjusted...
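
For reference, a minimal sketch of the estimator behind that formula,
following the update rules of RFC 2988 (alpha = 1/8, beta = 1/4, K = 4,
initial RTO of 3 s). The class name and the min_rto parameter are
hypothetical, and the clock granularity term G is left out for brevity:

```python
class RtoEstimator:
    """Simplified RFC 2988 style RTO computation (illustration only)."""

    ALPHA = 1 / 8
    BETA = 1 / 4
    K = 4

    def __init__(self, min_rto=1.0, initial_rto=3.0):
        self.srtt = None      # smoothed RTT (sRTT)
        self.rttvar = None    # RTT variation (varRTT)
        self.min_rto = min_rto
        self.rto = initial_rto

    def on_rtt_sample(self, r):
        """Feed one RTT measurement (seconds); return the new RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        # sRTT + 4 * varRTT, clamped from below by minRTO.
        self.rto = max(self.min_rto, self.srtt + self.K * self.rttvar)
        return self.rto
```

With min_rto=0.2 and two samples of 100 ms, the RTO is about 300 ms after
the first sample and about 250 ms after the second. With the 1 sec minRTO
from the RFC, both results are clamped to 1.0.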

 

Or whether the RFC1323 guidance of *not* using RTT feedback during loss
events needs to be reviewed to allow a tighter estimation of RTT and
varRTT under all circumstances, in turn allowing tighter RTO
values...

 

Happy new year,

   Richard

 

 

From: Jerry Chu [mailto:hkchu@google.com] 
Sent: Wednesday, 29 December 2010 20:13
To: Per Hurtig (work)
Cc: Scheffenegger, Richard; tcpm@ietf.org; mallman@icir.org
Subject: Re: [tcpm] new version of 2988bis

 

On Wed, Dec 29, 2010 at 1:54 AM, Per Hurtig (work) <per.hurtig@kau.se> wrote:

	I don't see why it should be necessary to track all unacked segments
	(or maybe I'm just not seeing it...). The only difference between this
	proposal and the original approach is that you account for the sending
	time of the earliest outstanding segment. The restart only happens

 

But doesn't that require one to record the sending time of each segment?

 

BTW, Linux already does this (i.e., timestamping each skb before
sending, regardless of the timestamp option). Dunno about other OSes.

 

Jerry

 

	when a pure cumulative acknowledgment arrives. Can you exemplify when
	the approach would fail?
	
	// Per

	
	On Wed, Dec 22, 2010 at 12:41, Scheffenegger, Richard <rs@netapp.com> wrote:
	>
	> Hmm...
	>
	> How do you deal with multiple lost segments in the same window?
	> Your draft says
	>
	>  2.  Proposed Modifications
	>
	>   The document proposes an update of step 5.3 in Section 5 of [RFC2988]
	>   to (and a similar update of step R3 in section 6.3.2 of [RFC4960]):
	>
	>      When an ACK is received that acknowledges new data, restart the
	>      retransmission timer so that it will expire RTO seconds after the
	>      earliest outstanding segment was transmitted (for the current
	>      value of RTO).
	>
	>   The update requires TCP implementations to track the time elapsed
	>   since sending the last unacknowledged segment.  In practice, this
	>   could be achieved by adding one variable to the transmission control
	>   block.
	>
	> But wouldn't you really need to track the sending time of *each*
	> segment, to always be able to calculate the true T_earliest? I cannot
	> see how a single, global TCPCB variable addresses this - perhaps I'm
	> missing something?
	>
	> Wouldn't there be an opportunity to use this feature synergistically
	> with RFC1323 Timestamps also? (The ACK with new data should reflect
	> the sender's TCP clock (or a well derived value thereof) at the time
	> the segment was sent. Perhaps that information can help reduce
	> additional per-segment state in the sender?)
	>
	> Best regards,
	>   Richard Scheffenegger
	>
	>
	>> -----Original Message-----
	>> From: Per Hurtig (work) [mailto:per.hurtig@kau.se]
	>> Sent: Wednesday, 22 December 2010 10:16
	>> To: tcpm@ietf.org
	>> Cc: mallman@icir.org
	>> Subject: Re: [tcpm] new version of 2988bis
	>>
	>> Hi all,
	>>
	>> we have submitted an I-D that summarizes an alternate way to restart
	>> the TCP/SCTP RTO timer:
	>>
	>> http://www.ietf.org/id/draft-hurtig-tcpm-rtorestart-00.txt
	>>
	>> The difference between this approach and RFC2988(bis)'s approach is
	>> the way outstanding segments are considered. We're happy to receive
	>> any comments on the draft.
	>>
	>>
	>> Regards, Per H
	>>
	>>
	>>
	>>
	>> On Tue, Dec 7, 2010 at 04:33, Mark Allman <mallman@icir.org> wrote:
	>> >
	>> > Folks-
	>> >
	>> > We posted a new version of the 2988bis I-D today.  It is:
	>> >
	>> >  draft-paxson-tcpm-rfc2988bis-01.txt
	>> >
	>> > The big change is that a new set of empirical results pertaining to
	>> > dropping the initial RTO from 3sec to 1sec is now given (along with
	>> > the previous results from Google) in the appendix.  Generally
	>> > speaking, these results show it is pretty safe to drop the initial
	>> > RTO.
	>> >
	>> > Have a look.  I believe the authors are pretty happy with this
	>> > document at this point.
	>> >
	>> > allman
	>> >
	>> >
	>> >
	>> >
	>> >
	>> > _______________________________________________
	>> > tcpm mailing list
	>> > tcpm@ietf.org
	>> > https://www.ietf.org/mailman/listinfo/tcpm
	>> >
	>
	>