RE: [tcpm] revising 2581: setting ssthresh on RTOs

"Anantha Ramaiah \(ananth\)" <ananth@cisco.com> Mon, 31 July 2006 15:27 UTC

From: "Anantha Ramaiah (ananth)" <ananth@cisco.com>
To: mallman@icir.org, tcpm@ietf.org

Mark:

Here is my take:
- I am fine with making this change in RFC2581bis. Although I really don't
have a strong argument one way or the other regarding the scope, I would
vote for making this change in RFC2581.

- Option #2 is the way I would go. My argument in favour of #2 is that
the vast majority of the Internet has undergone a sea change in terms of
router processing horsepower, bandwidth, link speed, etc., and
recovery mechanisms need to adapt quickly to the transient conditions
arising in these networks. I think having a larger ssthresh than before
(after successive RTOs) is fine, and I see no major issues here.

$0.02,
-Anantha  

> -----Original Message-----
> From: Mark Allman [mailto:mallman@icir.org] 
> Sent: Monday, July 31, 2006 4:47 AM
> To: tcpm@ietf.org
> Subject: [tcpm] revising 2581: setting ssthresh on RTOs
> 
>  
> Folks-
> 
> We have talked about setting ssthresh after RTOs on this list 
> a number of times and we talked about it in Montreal.  I'd 
> like to verify what seemed the general mood in Montreal here 
> on the list.  I think this is the last bit we need to work 
> out on 2581bis.  If you have something else that you think 
> needs to be done to the document, please yell.
> 
> RFC 2581 says that on each RTO a TCP reduces ssthresh to 
> FlightSize / 2.  Consider a lost retransmit.  Say we RTO on 
> segment X, cut ssthresh to Y segments (Y > 2 - just for the 
> example), cwnd becomes 1 segment and the RTO is backed off.  
> Now, say we RTO on segment X again.  The cwnd will stay at 
> one, but the FlightSize is now 1 and so ssthresh takes its 
> minimum value of 2 segments.
> 
> The observation is that this forces linear growth for a 
> potentially long time if this loss hiccup was caused by some 
> small network issue like a handoff.  In that case, it'd be 
> nice to be able to keep ssthresh higher and use slow start 
> when packets started flowing again.
> 
> Also, as a practical matter I don't think the above scenario 
> is the way we intended things to work.  Rather, I think we 
> envisioned (but, alas, did not write) suggestion #1:
> 
> Suggestion #1: On the first RTO for some segment, set 
> ssthresh to FlightSize/2.  On each subsequent RTO for the 
> given segment, halve ssthresh (ssthresh /= 2).
> 
> Basically, this slowly degrades ssthresh as the RTO gets 
> backed off, such that the longer TCP has been transmitting 
> into a lousy network the less the TCP gets to use exponential 
> increase when packets start flowing again.
> 
> In addition, another variant has been suggested in the meantime, ...
> 
> Suggestion #2: On the first RTO for some segment, set 
> ssthresh to FlightSize/2.  On each subsequent RTO for the 
> given segment do not adjust ssthresh at all.
> 
> This variant means that a TCP always gets to re-probe with 
> slow start based on the pre-loss conditions no matter how 
> long it took to fix the loss. 
> 
> Both #1 and #2 are quite safe.  If the network is in a really 
> lousy state then the TCP is going to continue to get losses 
> even after getting out of RTO backoff without increasing the 
> congestion window all that much.  And, if that happens then 
> ssthresh will get further reduced (probably to its minimum).  
> Essentially, if the network is heavily loaded all of a 
> sudden then this additional loss isn't really going to be 
> exacerbated by the first couple of RTTs of slow start.  If the 
> backoff was caused by something other than a suddenly 
> massively congested network then this tweak lets TCP get back 
> to a reasonable operating point more rapidly.
> 
> So, a couple of questions ... and, the authors' current hits ...
> 
> (1) Is this change in-scope for 2581bis?  We said "no algorithmic
>     tweaks" and so one view is that this should be cooked elsewhere
>     and rolled in later.
> 
>     The authors' hit on this is that the behavior of slamming ssthresh
>     down on the first backed off RTO is not our intent and so tweaking
>     this seems in-scope.  
> 
> (2) Assuming folks are fine with making a change then which change
>     should we make?  Suggestion #1 or #2?
> 
>     The authors chatted and we feel like #2 is fine.  As noted above,
>     neither case really aggravates the state of the network in
>     suddenly heavily loaded situations.  So, #2 seems OK.
> 
> What do people think?  Is #2 OK?  Or, something else?
> 
> Thanks in advance for the feedback!
> 
> allman
> 
> 
> 
> 

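For anyone who wants to see the three behaviours side by side, here is a
rough Python sketch of how ssthresh evolves over successive RTOs on the
same segment under the RFC 2581 text, Suggestion #1 and Suggestion #2,
plus an idealized count of the RTTs needed to grow cwnd back afterwards.
Everything is counted in whole segments. The function names, the policy
labels, the integer division and the "double per RTT, then add one per
RTT" growth model are assumptions made for illustration only; they are
not taken from RFC 2581 or from any real stack.

```python
MIN_SSTHRESH = 2  # minimum ssthresh in segments, as in the example above


def ssthresh_after_rtos(flight_size, num_rtos, policy):
    """Return ssthresh (in segments) after each successive RTO on one segment.

    policy is one of "rfc2581", "suggestion1", "suggestion2"
    (hypothetical labels used only in this sketch).
    """
    history = []
    ssthresh = None
    for rto in range(num_rtos):
        if rto == 0:
            # First RTO: all three variants set ssthresh to FlightSize / 2.
            ssthresh = max(flight_size // 2, MIN_SSTHRESH)
        elif policy == "rfc2581":
            # cwnd was cut to 1 segment, so FlightSize is 1 on a repeated RTO
            # and ssthresh falls to its minimum.
            ssthresh = max(1 // 2, MIN_SSTHRESH)
        elif policy == "suggestion1":
            # Halve ssthresh on each subsequent RTO for the same segment.
            ssthresh = max(ssthresh // 2, MIN_SSTHRESH)
        elif policy == "suggestion2":
            # Leave ssthresh untouched on subsequent RTOs.
            pass
        else:
            raise ValueError(f"unknown policy: {policy}")
        history.append(ssthresh)
    return history


def rtts_to_reach(target_cwnd, ssthresh, cwnd=1):
    """Idealized RTT count to grow cwnd to target_cwnd with no further loss."""
    rtts = 0
    while cwnd < target_cwnd:
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start, capped at ssthresh
        else:
            cwnd += 1  # congestion avoidance: one segment per RTT
        rtts += 1
    return rtts


if __name__ == "__main__":
    for policy in ("rfc2581", "suggestion1", "suggestion2"):
        history = ssthresh_after_rtos(flight_size=40, num_rtos=3, policy=policy)
        print(policy, history, rtts_to_reach(20, history[-1]))
```

With FlightSize of 40 segments and three RTOs on the same segment, this
model leaves ssthresh at 2, 5 and 20 segments respectively, and needs
19, 18 and 5 RTTs to get cwnd back to 20 segments. That is the "linear
growth for a potentially long time" effect in the quoted message, under
the simplifying assumptions stated above.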