Re: [tcpm] initial RTO (was Re: Tuning TCP parameters for the 21st century)

Mark Allman <mallman@icir.org> Wed, 29 July 2009 16:06 UTC

To: Jerry Chu <hkchu@google.com>
From: Mark Allman <mallman@icir.org>
In-Reply-To: <d1c2719f0907290756h6f4990afu8fe4a573c5669d79@mail.gmail.com>
Organization: International Computer Science Institute (ICSI)
Date: Wed, 29 Jul 2009 12:05:58 -0400
Message-Id: <20090729160559.122DF3884F5@lawyers.icir.org>
Cc: "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] initial RTO (was Re: Tuning TCP parameters for the 21st century)

> I can think of a number of variations. The one-shot 1-sec-initRTO idea
> you described above also came through my mind but the drawback is you
> only get one-shot even though we know statistically > 98% of
> connections have RTT < 1 sec so most likely the continuous use of
> 1-sec-initRTO will turn out to be better. (A counter argument might be
> one-shot is "good enough", benefitting > 90% of the cases
> statistically...) The advantage of it is its simplicity, restricting
> the max # of spurious retransmissions caused by the reduced initRTO to
> 1, and obviously avoiding the RTO hell problem.

Two responses to "one shot":

(1) Yes, one-shot ought to be enough.  Consider losing the SYN,
    retransmitting it using an initRTO of 1sec and then resetting the
    initRTO to 3sec.  Now, if there is actually loss in the first RTT
    of data transmission, talking about fine-grained performance (i.e.,
    what we can gain from using 1sec again instead of 3sec) doesn't
    make a lot of sense, because performance is going to suck no matter
    what.  So, why bother with anything terribly "smart" here?

(2) Using the numbers on your slides it seems to me that the fraction of
    hosts with an RTT of > 1sec is roughly the same as the SYN
    retransmit rate (at an RTT of 3sec, I assume).  To me that says that
    if you use an initRTO of 1sec and then retransmit, the reason
    for that retransmit is just as likely to be loss as it is to be a
    long path.  So, your approach of preferring more than one-shot
    assumes loss.  But, I don't see the measurements you gave as
    suggesting that is the right approach.  The notion of going back to
    3sec just sort of punts.  I.e., the notion is that we have hit a
    situation whereby we don't know what is going on and so let's not
    dogmatically try to push forward, but let's throw up our hands and
    try to do things that ultimately will figure out what is going on.
    And, further, one mistake does not propagate.
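
The arithmetic behind point (2) can be sketched roughly as follows.  The
2% figures are illustrative placeholders (loosely echoing the "> 98%"
number quoted above), not measurements from the slides, and treating
loss and long RTT as independent is an assumption.  The function name
is hypothetical.

```python
def timeout_causes(p_loss, p_slow):
    """Split the probability of a 1sec SYN timeout by cause.

    p_loss: probability the SYN exchange suffers a loss   (assumed value)
    p_slow: probability the path RTT exceeds 1sec          (assumed value)
    Loss and long RTT are treated as independent (an assumption).
    """
    # The 1sec timer fires if there was a loss OR the path is slow.
    p_timeout = 1.0 - (1.0 - p_loss) * (1.0 - p_slow)
    # Share of timeouts where a loss was involved.
    given_loss = p_loss / p_timeout
    # Share of timeouts caused purely by a long path (no loss).
    given_slow_only = (1.0 - p_loss) * p_slow / p_timeout
    return given_loss, given_slow_only


loss, slow = timeout_causes(p_loss=0.02, p_slow=0.02)
print(f"loss: {loss:.3f}  long path only: {slow:.3f}")
```

When the two rates are about equal, the two shares come out close to
one half each, which is the sense in which a timeout at 1sec does not
say much about which case we are in.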

So, for me one-shot is just about the right balance here.  Any more than
that and we're getting into the corner of a corner case, and further into
that corner the empirical evidence does not suggest a clear path.  So,
let's just do something that will allow the protocol to get a handle on
things as they are in the specific situation and not try to make guesses
that propagate further.
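
For concreteness, here is a minimal sketch of the timer values each
option would arm on successive timeouts.  The policy names, the
function name, and the choice to keep doubling after falling back to
3sec are assumptions made for illustration; nothing in this thread
specifies them.

```python
def syn_timers(policy, attempts):
    """Return the timer (in seconds) armed for each successive attempt.

    Hypothetical policies:
      "classic"    - initRTO of 3sec, doubled on each timeout
      "continuous" - initRTO of 1sec, doubled on each timeout
      "one-shot"   - 1sec for the first attempt only, then fall back
                     to 3sec and double from there (assumed backoff)
    """
    if policy == "classic":
        timers = [3.0 * 2**i for i in range(attempts)]
    elif policy == "continuous":
        timers = [1.0 * 2**i for i in range(attempts)]
    elif policy == "one-shot":
        timers = [1.0] + [3.0 * 2**i for i in range(attempts - 1)]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return timers[:attempts]


for name in ("classic", "continuous", "one-shot"):
    print(name, syn_timers(name, 3))
```

Under these assumptions one-shot risks at most one timer that is too
short for the path, after which it behaves like the classic 3sec case.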

allman