[tcpm] initial RTO (was Re: Tuning TCP parameters for the 21st century)

Mark Allman <mallman@icir.org> Mon, 27 July 2009 19:28 UTC

To: Jerry Chu <hkchu@google.com>
From: Mark Allman <mallman@icir.org>
In-Reply-To: <d1c2719f0907131619t1a80997ep4080a3a721ef3627@mail.gmail.com>
Organization: International Computer Science Institute (ICSI)
Song-of-the-Day: Sweet Emotion
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="--------ma65477-1"; micalg="pgp-sha1"; protocol="application/pgp-signature"
Date: Mon, 27 Jul 2009 15:28:05 -0400
Sender: mallman@icir.org
Message-Id: <20090727192805.A142437DB08@lawyers.icir.org>
Cc: "tcpm@ietf.org" <tcpm@ietf.org>
Subject: [tcpm] initial RTO (was Re: Tuning TCP parameters for the 21st century)

Jerry-

A few random thoughts here ...

> I'll start with Lowering initRTO.
> 
> RFC1122 contains the following paragraph:
> 
> The following values SHOULD be used to initialize the
> estimation parameters for a new connection:
> 
> (a)  RTT = 0 seconds.
> 
> (b)  RTO = 3 seconds.  (The smoothed variance is to be
> initialized to the value that will result in this RTO).
> 
> The "3secs SHOULD" is reaffirmed in RFC2988.
> 
> From our own measurement of world wide RTT distribution to Google
> servers we believe 3secs is too conservative, and like to propose it
> to be reduced to 1sec.

I am not at all sure this is a good idea.  I have an easier time
believing the others on your list are perhaps reasonable.  But, this one
seems somewhat dubious to me.  A few things ...

  - The fundamental problem here is that you have *no* information.
    That is, we don't know how long the path is before we have done an
    exchange.  When you start from scratch you have nothing to go on
    except defaults.  So, it seems to me that on those grounds alone
    being conservative is fine, for the following reason.

  - If it were just an extra small packet or two that got sent out, that
    wouldn't seem like a Big Deal.  But, once you retransmit the SYN you
    can no longer take an RTT sample from the 3WHS, per Karn's algorithm.
    So, if the initial RTO is in fact too short, it isn't just going to
    strobe out an extra packet.  It is also pretty likely that the
    packets in your initial window (sent after clumsily finishing the
    3WHS) will likewise be retransmitted, because the RTO estimate is
    low and we did not get the opportunity in the 3WHS to take an actual
    RTT sample to better seed the estimator.  This is RTO Hell.

  - At first blush timestamps might help here because if used then we
    don't have to use Karn's algorithm.  But, again, since we are just
    initiating a connection how do we know if the peer is going to use
    timestamps?  If the initiator sends a timestamp option then there is
    a chance that timestamps will be in use and therefore there is a
    chance you'll avoid RTO Hell.  But, there is also a chance you
    won't.  The 3WHS responder (sender of the SYN+ACK) will know if
    timestamps will be in use and therefore could perhaps lower the
    initial RTO (basically, this is the ECNSYN trick).  That doesn't
    seem all that unreasonable to me.

  - Now, if you track information across connections as others have
    noted then, sure.  It seems perfectly acceptable to take the view
    that with high confidence you understand that 1sec (or whatever)
    will be fine for an initial RTO over some path that you have
    transmitted traffic across in the recent past and so then you can
    use that.  In this case, you are picking an initial RTO for a
    connection but not flying completely in the dark.

  - It seems that (per the discussion in today's meeting) a naive
    lowering to 1sec is going to be problematic because we have
    bandwidth-on-demand networks, deep queues in access devices are not
    rare, etc.
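
A minimal sketch of the "RTO Hell" case from the second bullet, under
stated assumptions.  It uses the RFC 2988 estimator constants (alpha =
1/8, beta = 1/4, 1 second floor) and exponential backoff, and applies
Karn's algorithm unless timestamps are in use.  The names RtoEstimator
and handshake are hypothetical, the 2.5 second path RTT is an
illustrative value and not a measurement, clock granularity is ignored,
and the backed-off RTO is assumed to carry over into the data phase
(implementations may differ on that).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ALPHA = 1 / 8    # SRTT gain (RFC 2988)
BETA = 1 / 4     # RTTVAR gain (RFC 2988)
MIN_RTO = 1.0    # RFC 2988 rounds a computed RTO up to 1 second


@dataclass
class RtoEstimator:
    """Hypothetical, simplified RFC 2988 style estimator (seconds)."""
    rto: float
    srtt: Optional[float] = None
    rttvar: Optional[float] = None

    def on_timeout(self) -> None:
        # Exponential backoff when the retransmission timer fires.
        self.rto *= 2

    def on_ack(self, sample: float, retransmitted: bool,
               timestamps: bool = False) -> bool:
        # Karn's algorithm: an ACK for a retransmitted segment is
        # ambiguous, so it yields no RTT sample unless timestamps
        # disambiguate it.
        if retransmitted and not timestamps:
            return False
        if self.srtt is None:
            # First measurement seeds the estimator.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * sample
        self.rto = max(MIN_RTO, self.srtt + 4 * self.rttvar)
        return True


def handshake(initial_rto: float, path_rtt: float,
              timestamps: bool = False) -> Tuple[RtoEstimator, int]:
    """Model the SYN exchange; return the estimator and SYN retransmit count."""
    est = RtoEstimator(rto=initial_rto)
    elapsed = 0.0
    syn_retransmits = 0
    # The SYN+ACK answering the first SYN arrives at t = path_rtt.
    # Every timer expiry before that retransmits the SYN and backs off.
    while elapsed + est.rto < path_rtt:
        elapsed += est.rto
        est.on_timeout()
        syn_retransmits += 1
    est.on_ack(path_rtt, retransmitted=syn_retransmits > 0,
               timestamps=timestamps)
    return est, syn_retransmits


if __name__ == "__main__":
    for init, ts in ((3.0, False), (1.0, False), (1.0, True)):
        est, n = handshake(init, path_rtt=2.5, timestamps=ts)
        print(f"initRTO={init} timestamps={ts} "
              f"SYN retransmits={n} RTO after 3WHS={est.rto}")
```

In this model, a 3 second initial RTO on the 2.5 second path gets a
clean sample and leaves the handshake with an RTO of 7.5 seconds.  A 1
second initial RTO retransmits the SYN once, gets no sample, and
leaves the handshake with an RTO of 2.0 seconds, which is still below
the path RTT, so the first data segments would time out as well.  With
timestamps the sample is usable and the RTO again comes out at 7.5
seconds.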

In a subsequent email you note:

> Correct so there is a fine line to walk. But if > 98% of all TCP
> connections experience RTT << 1 sec, it just seems too conservative to
> have a global initRTO == 3secs just to avoid spurious retransmission
> in the < 2% category.

I agree that it is a fine line.  But, I think your 98%-vs-2% is far too
glib.  That is, we have to look at how bad we're making it for those 2%.
If we degraded each of those 2% by "a smidge" then who cares.  But, if
we really hose those connections (see second bullet above about RTO
Hell) it doesn't seem like a good tradeoff.  It's useful to remember
that TCP was designed to be general and not optimal.  Certainly we don't
want to unduly penalize most of the traffic (/users) to dogmatically
accommodate every last esoteric situation that might happen to crop up
on the third Tuesday of the 6th month following the most recent blue
moon.  But, we can err on the other side, too.  I think simple
percentages as you have given are pretty superficial and we'd need to go
beyond that to really decide what line we wanted to walk.

Just my two bits ...

allman