Re: [tcpm] New ID available: RFC2988bis (RTO calculation)

Mark Allman <mallman@icir.org> Tue, 02 March 2010 18:31 UTC

To: Alexander Zimmermann <alexander.zimmermann@nets.rwth-aachen.de>, hkchu@google.com, vern@icir.org, "tcpm@ietf.org Extensions WG" <tcpm@ietf.org>

[This is a re-send.  My original note did not make it to the list.  I
 made a tweak to my mail setup.  Let's see if this works. --allman]

> currently the document only suggests lowering the initial RTO.
> IMHO we should also discuss if we want to lower the minimal RTO, too.

I have two responses to this ...

  - First, I sent a technical note to the list on Feb/11 about changing
    the min.  I am appending that to this note.

  - Second, it seems to me that we should keep these two questions
    independent.  Changing the initRTO is pretty minor and well-scoped,
    whereas changing the minRTO has more unknowns.  So, I am not at all
    opposed to thinking about both changes as you suggest, but coupling
    them is likely to hinder progress on the initRTO (IMO).

Just my 2c.

allman




> Is it time to reconsider a smaller minimum RTO as well? 

One problem with the minimum is we don't know what to set it at [*].
The finding in [AP99] is that as you reduce the min you take more
spurious timeouts.  The tradeoff seemed fundamental to us.  But, given
our methodology (trace driven simulation) we could not really tell the
impact of the spurious timeouts.  So, we just sought to minimize them in
the spec.  I know some TCP implementations are being pretty aggressive
with the min these days.  But, it is not at all clear to me whether this
is a net win or a net loss.  I.e., is the win in terms of peppiness of
needed retransmits still a win when it is weighed against the
consequences of a spurious retransmit (i.e., setting cwnd = 1)?  Perhaps
there is a sweet spot that is < 1sec where this tradeoff balances
better and we should be using that.  But, it'd take a big experiment
to show that, I think.

[*] Maybe there is work that I just don't know about and if so I'd like
    to be pointed at it.
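
To make the tradeoff concrete, here is a minimal sketch of the RFC 2988
RTO calculation with a configurable minimum.  The estimator follows the
RFC's update rules (alpha = 1/8, beta = 1/4, K = 4, initial RTO of 3
seconds).  The count_spurious() helper is an assumption for illustration
only: it treats any RTT sample larger than the RTO in effect as a
spurious timeout, which is a crude stand-in for a trace-driven
simulation and not the methodology of [AP99].  The class and function
names are hypothetical.

```python
class RtoEstimator:
    """RTO estimator following the RFC 2988 update rules."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4           # variance multiplier

    def __init__(self, min_rto=1.0, init_rto=3.0, granularity=0.0):
        self.min_rto = min_rto      # floor applied to every computed RTO
        self.rto = init_rto         # RTO used before any RTT sample arrives
        self.granularity = granularity  # clock granularity G, in seconds
        self.srtt = None
        self.rttvar = None

    def on_rtt_sample(self, r):
        """Fold one RTT measurement (seconds) into the estimate."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R/2.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar \
                + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = max(self.min_rto, rto)
        return self.rto


def count_spurious(samples, min_rto):
    """Count samples that exceed the RTO in effect when they arrive.

    Simplifying assumption: such a sample means the timer fired even
    though the segment was only delayed, not lost.
    """
    est = RtoEstimator(min_rto=min_rto)
    spurious = 0
    for r in samples:
        if est.srtt is not None and r > est.rto:
            spurious += 1
        est.on_rtt_sample(r)
    return spurious
```

Feeding the same RTT trace through count_spurious() with different
min_rto values shows the direction of the effect (a lower minimum can
only add timeouts), but it says nothing about what each spurious timeout
costs, which is the open question above.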

Another thing I have wondered about from time to time .... Much like
Early Retransmit could we have some sort of early RTO when the cwnd size
is small?  When the window size is small it seems that the downside of
making a bad rexmt decision is not as bad as when the cwnd is large so
it might pay in terms of responsiveness to RTO sooner.  Also, as the
cwnd gets larger we have better loss recovery techniques that should
come into play that can repair lots of loss without involving the RTO
and so perhaps by being overly aggressive in the RTO we just hurt
ourselves.  Finally, as the cwnd gets bigger that means we are
transferring more data.  And, the longer a connection goes the less
the fine-grained responsiveness matters.  Just a random
thought ...
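
One possible shape for such an "early RTO", purely as a sketch: pick the
RTO floor based on the current cwnd.  Everything here is hypothetical
(the function name, the 4-segment threshold, and the 200 ms aggressive
floor are made-up values for illustration, not proposals).

```python
def early_min_rto(cwnd_segments, small_cwnd=4,
                  aggressive_min=0.2, standard_min=1.0):
    """Return the RTO floor (seconds) to use for the current cwnd.

    Hypothetical policy: when the window is small, a bad rexmt
    decision costs little, so allow a lower floor.  Once the window
    is large enough for other loss recovery to work, fall back to
    the standard minimum.
    """
    if cwnd_segments < small_cwnd:
        return aggressive_min
    return standard_min
```

The returned value would take the place of the fixed minimum when
clamping the computed RTO.  A hard threshold is the simplest choice; a
floor that grows gradually with cwnd would be another option.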

allman