[tcpm] revising 2581: setting ssthresh on RTOs

Mark Allman <mallman@icir.org> Mon, 31 July 2006 11:48 UTC

To: tcpm@ietf.org
From: Mark Allman <mallman@icir.org>
Organization: ICSI Center for Internet Research (ICIR)

 
Folks-

We have talked about setting ssthresh after RTOs on this list a number
of times and we talked about it in Montreal.  I'd like to verify what
seemed the general mood in Montreal here on the list.  I think this is
the last bit we need to work out on 2581bis.  If you have something else
that you think needs to be done to the document, please yell.

RFC 2581 says that on each RTO a TCP reduces ssthresh to FlightSize /
2.  Consider a lost retransmit.  Say we RTO on segment X, cut ssthresh
to Y segments (Y > 2 - just for the example), cwnd becomes 1 segment and
the RTO is backed off.  Now, say we RTO on segment X again.  The cwnd
will stay at one, but the FlightSize is now 1 and so ssthresh takes its
minimum value of 2 segments.
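
To make the arithmetic concrete, here is a minimal sketch of the RFC 2581 rule as described above. It assumes everything is counted in whole segments and that the floor on ssthresh is 2 segments; the function names are made up for illustration and do not come from any real stack.

```python
# Hypothetical helper names; units are whole segments, not bytes.
MIN_SSTHRESH = 2  # assumed floor, standing in for RFC 2581's 2*SMSS


def ssthresh_on_rto_2581(flight_size):
    """ssthresh after one RTO: max(FlightSize / 2, 2 segments)."""
    return max(flight_size // 2, MIN_SSTHRESH)


def repeated_rto_2581(flight_size, rto_count):
    """Return the ssthresh chosen at each of rto_count back-to-back RTOs
    on the same segment."""
    history = []
    for _ in range(rto_count):
        history.append(ssthresh_on_rto_2581(flight_size))
        # cwnd is 1 segment after an RTO, so only the retransmit is in flight
        # when the next RTO fires.
        flight_size = 1
    return history
```

With 20 segments in flight before the first RTO, `repeated_rto_2581(20, 3)` gives `[10, 2, 2]`: the second RTO already pins ssthresh at its minimum.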

The observation is that this forces linear growth for a potentially long
time if this loss hiccup was caused by some small network issue like a
handoff.  In that case, it'd be nice to be able to keep ssthresh higher
and use slow start when packets started flowing again.
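
A rough way to see the cost is to count RTTs until cwnd gets back to some target window. The sketch below uses a deliberately simplified model that is my assumption, not anything from RFC 2581: cwnd starts at 1 segment, doubles once per RTT while below ssthresh (capped at ssthresh), and grows by 1 segment per RTT after that, with no further loss. `rtts_to_reach` is a hypothetical name.

```python
def rtts_to_reach(target, ssthresh):
    """Count RTTs for cwnd to grow from 1 segment to at least `target`
    segments under an idealized, loss-free model."""
    cwnd = 1
    rtts = 0
    while cwnd < target:
        if cwnd < ssthresh:
            # Slow start: double per RTT, but do not overshoot ssthresh.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: one segment per RTT.
            cwnd += 1
        rtts += 1
    return rtts
```

Under these assumptions, getting back to 16 segments takes `rtts_to_reach(16, 2) == 15` RTTs when ssthresh has collapsed to 2, versus `rtts_to_reach(16, 16) == 4` RTTs when slow start can cover the whole range.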

Also, as a practical matter I don't think the above scenario is the way
we intended things to work.  Rather, I think we envisioned (but, alas,
did not write) suggestion #1:

Suggestion #1: On the first RTO for some segment, set ssthresh to
FlightSize/2.  On each subsequent RTO for the given segment halve
ssthresh (ssthresh /= 2).

Basically, this slowly degrades ssthresh as the RTO gets backed off,
such that the longer TCP has been transmitting into a lousy network the
less the TCP gets to use exponential increase when packets start flowing
again.
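
Suggestion #1 can be sketched as follows. This is only an illustration in whole segments; it assumes the usual 2-segment floor also applies to the repeated halving, which the text above does not spell out, and the function name is hypothetical.

```python
MIN_SSTHRESH_SEGS = 2  # assumed floor on ssthresh, in segments


def ssthresh_history_suggestion_1(flight_size, rto_count):
    """Return ssthresh after each of rto_count consecutive RTOs on one
    segment under Suggestion #1."""
    history = []
    ssthresh = None
    for rto_index in range(rto_count):
        if rto_index == 0:
            # First RTO for this segment: the familiar FlightSize / 2.
            ssthresh = max(flight_size // 2, MIN_SSTHRESH_SEGS)
        else:
            # Backed-off RTO for the same segment: halve ssthresh itself.
            ssthresh = max(ssthresh // 2, MIN_SSTHRESH_SEGS)
        history.append(ssthresh)
    return history
```

Starting from 32 segments in flight, `ssthresh_history_suggestion_1(32, 5)` yields `[16, 8, 4, 2, 2]`, so ssthresh decays step by step rather than dropping to the minimum on the second RTO.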

In addition, another variant has been suggested in the meantime:

Suggestion #2: On the first RTO for some segment, set ssthresh to
FlightSize/2.  On each subsequent RTO for the given segment do not
adjust ssthresh at all.

This variant means that a TCP always gets to re-probe with slow start
based on the pre-loss conditions no matter how long it took to fix the
loss. 
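
A comparable sketch of Suggestion #2, again in whole segments with an assumed 2-segment floor on the first reduction and a hypothetical function name:

```python
SSTHRESH_FLOOR = 2  # assumed floor on ssthresh, in segments


def ssthresh_history_suggestion_2(flight_size, rto_count):
    """Return ssthresh after each of rto_count consecutive RTOs on one
    segment under Suggestion #2."""
    history = []
    ssthresh = None
    for rto_index in range(rto_count):
        if rto_index == 0:
            # Only the first RTO for this segment touches ssthresh.
            ssthresh = max(flight_size // 2, SSTHRESH_FLOOR)
        # Later RTOs for the same segment leave ssthresh alone.
        history.append(ssthresh)
    return history
```

Here `ssthresh_history_suggestion_2(32, 4)` stays at `[16, 16, 16, 16]`, however many times the RTO is backed off.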

Both #1 and #2 are quite safe.  If the network is in a really lousy
state then the TCP is going to continue to get losses even after getting
out of RTO backoff without increasing the congestion window all that
much.  And, if that happens then ssthresh will get further reduced
(probably to its minimum).  Essentially, if the network is heavily
loaded all of a sudden then this additional loss isn't really going to
be exacerbated by the first couple RTTs of slow start.  If the backoff
was caused by something other than a suddenly massively congested
network then this tweak lets TCP get back to a reasonable operating
point more rapidly.

So, a couple of questions ... and the authors' current hits ...

(1) Is this change in-scope for 2581bis?  We said "no algorithmic
    tweaks" and so one view is that this should be cooked elsewhere and
    rolled in later.

    The authors' hit on this is that the behavior of slamming ssthresh
    down on the first backed-off RTO is not our intent and so tweaking
    this seems in-scope.  

(2) Assuming folks are fine with making a change then which change
    should we make?  Suggestion #1 or #2?

    The authors chatted and we feel like #2 is fine.  As noted above,
    neither case really aggravates the state of the network in suddenly
    heavily loaded situations.  So, #2 seems OK.

What do people think?  Is #2 OK?  Or, something else?

Thanks in advance for the feedback!

allman


