Re: [tcpm] Tuning TCP parameters for the 21st century

Jerry Chu <hkchu@google.com> Wed, 15 July 2009 16:51 UTC

In-Reply-To: <4A5D0E8F.1040402@isi.edu>
References: <d1c2719f0907131619t1a80997ep4080a3a721ef3627@mail.gmail.com> <4A5C540E.9070104@sun.com> <4A5C9309.8030704@isi.edu> <d1c2719f0907141241p73e605adqc1d2e6f0db4eb3aa@mail.gmail.com> <4A5CE3D0.5000904@isi.edu> <d1c2719f0907141532i31d2b740hfa32209a8ccb156@mail.gmail.com> <4A5D0E8F.1040402@isi.edu>
Date: Tue, 14 Jul 2009 17:43:43 -0700
Message-ID: <d1c2719f0907141743n4952c9far54e3be36668577ed@mail.gmail.com>
From: Jerry Chu <hkchu@google.com>
To: Joe Touch <touch@isi.edu>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
X-System-Of-Record: true
Cc: "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] Tuning TCP parameters for the 21st century

On Tue, Jul 14, 2009 at 4:02 PM, Joe Touch<touch@isi.edu> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>> On Tue, Jul 14, 2009 at 1:00 PM, Joe Touch<touch@isi.edu> wrote:
> ...
>>> > Granted, our web server didn't have your traffic, but we found that the
>>> > impact of SND_CWND was small, and limited to only a small subset of
>>> > connections. Connections too short don't benefit from a large window.
>>> > Connections too long have little benefit from increasing startup behavior.
>>
>> Hmm, your conclusion seems a bit counter-intuitive - for a typical web object
>> of 5-8 pkts in size, starting with an initcwnd of 6, e.g., from the cache will
>> require only half of the round trips to complete the http transaction compared
>> to initcwnd = 3.
>
> First, you get to send 4K or 4 packets (whichever is smaller) at the
> start of a connection anyway. Second, RTT is only a fraction of the
> overall delays experienced by a message.

Sure, and we are looking at other layers (especially HTTP) for performance
optimization as well.

> Finally, connections already
> eat 2 RTTs before you do anything to get a response, due to the SYN

We're taking one more look at T/TCP as well (probably need to start a different
mail thread on T/TCP too).

> handshake followed by the request-response of HTTP. You might be eating
> a single RTT off of a 3-4 RTT exchange. Your RTTs may go down by 30-50%,
> but since they're not 100% of your delay, you end up benefitting only
> around 20% or so.
>
> Larger messages benefit as you open the window to burst the whole thing
> out. Here's a back of the envelope:
>
> # packets    was RTTs    is RTTs    benefit
>         4           2          2         0%
>        10           3          2        33%
>        19           4          2        50%
>        32           5          2        60%
>        52           6          2        67%
>        82           7          2        71%
>
> Note though that at some point your SND_CWND will be opened only as far
> as the path allows, at which point sending more packets just takes more
> RTTs, so you start to lose. 90% of our max CWNDs were below 9K, which
> means we only saw benefits in the 20-30% range (i.e., the first two rows
> above). Sure, if you see CWNDs much larger, you might benefit from
> reusing old values...

Correct. We are trying to pilot our cache code and will find out our snd_cwnd
distribution, and how much it helps (or hurts if it triggers more pkt drops).
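
For reference, Joe's back-of-the-envelope table above can be reproduced with a
small sketch. The model here is my assumption, not something stated in the
thread: one RTT for the SYN handshake, an initial window of 4 packets, and the
window growing by 1.5x per RTT (roughly what delayed ACKs give). The "is"
column assumes a cached cwnd large enough to send the whole object in one
window. The function names are made up for illustration.

```python
def rtts_slow_start(packets, init_cwnd=4.0, growth=1.5, handshake_rtts=1):
    """RTTs to deliver `packets` starting from a cold window.

    Assumed model: cwnd starts at init_cwnd and is multiplied by `growth`
    every RTT; only whole packets are sent in each round.
    """
    cwnd = init_cwnd
    sent = 0
    rtts = handshake_rtts
    while sent < packets:
        sent += int(cwnd)   # whole packets sent this round
        cwnd *= growth      # window opens for the next round
        rtts += 1
    return rtts


def rtts_cached(packets, handshake_rtts=1):
    """RTTs when a cached cwnd already covers the whole object (assumption)."""
    return handshake_rtts + 1


def benefit_percent(packets):
    """Relative reduction in RTTs, rounded to a whole percent."""
    was = rtts_slow_start(packets)
    now = rtts_cached(packets)
    return round(100 * (was - now) / was)


for n in (4, 10, 19, 32, 52, 82):
    print(n, rtts_slow_start(n), rtts_cached(n), f"{benefit_percent(n)}%")
```

Changing init_cwnd or growth shows how sensitive the benefit column is to the
assumed startup behavior.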

>
>>> > (see the tech report at http://www.isi.edu/aln)
>>> >
>>>>>> >>> > The pro is more rapid convergence to an accurate RTT; the con is that
>>>>>> >>> > you're using a potentially invalid RTT, but then that's what you do when
>>>>>> >>> > you start without knowing the RTT at all anyway.
>>>>>> >>> >
>>>>>> >>> > It has also been implemented in Linux; see RFC4614.
>>>> >>
>>>> >> Yes Linux stack maintains a "destination cache" of ssthresh and RTT on
>>>> >> a per dest IP address basis. It doesn't currently use snd_cwnd, probably
>>>> >> out of the same concern mentioned above?
>>> >
>>> > Lack of benefit is also an issue.
>>
>> See above. I thought this should be the one that will give the largest benefit
>> (in terms of latency reduction) for the web traffic, if done (i.e.,
>> predicted) correctly.
>
> It would if most connections' max CWND was larger than 10, and if most
> connections sent more than 20 packets, but neither was the case for us.
>
>>>> >> Also it is very conservative
>>>> >> hence won't use the cached RTT value unless the timestamp option is
>>>> >> on.
>>>> >>
>>>> >> Also the dst cache is only consulted after 3WHS. So for SYN/SYN-ACK
>>>> >> the initRTO is still set to 3secs (don't know why).
>>> >
>>> > If you spin the initRTO down too far, you end up resending the SYN
>>> > needlessly, no?
>>
>> Correct so there is a fine line to walk. But if > 98% of all TCP connections
>> experience RTT << 1 sec, it just seems too conservative to have a global
>> initRTO == 3secs just to avoid spurious retransmission in the < 2% category.
>
> It doesn't matter much if 99.99% of connections wouldn't benefit from
> having the initRTO go off earlier anyway. That's the tradeoff. Don't
> know if it helps your case, though.

The SYN/SYN-ACK retransmission rate we measured turned out to be >> 1%
in many cases (a bit surprising to us), hence the benefit of a lower initRTO.
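
To make the initRTO tradeoff concrete, here is a rough sketch of the mean
connect-time penalty caused by a lost SYN or SYN-ACK. It only counts the first
retransmission (no backoff, no repeated loss), and the loss rates and the 1 sec
alternative in the example are placeholders, not our measured numbers.

```python
def mean_syn_penalty_ms(loss_rate, init_rto_s):
    """Mean extra connect latency in ms, averaged over all connections.

    Simplifying assumption: a lost SYN or SYN-ACK costs exactly one initRTO,
    and the retransmission itself always succeeds.
    """
    return loss_rate * init_rto_s * 1000.0


# Hypothetical loss rates, compared at initRTO = 3 secs and 1 sec.
for loss in (0.01, 0.02, 0.05):
    at_3s = mean_syn_penalty_ms(loss, 3.0)
    at_1s = mean_syn_penalty_ms(loss, 1.0)
    print(f"loss={loss:.0%}  initRTO=3s: {at_3s:.0f} ms  initRTO=1s: {at_1s:.0f} ms")
```

The mean understates the effect on the affected connections, each of which
waits the full initRTO before anything else happens.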

Jerry

>
> Joe
>
>>> > Joe
>>> >
>>> >
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (MingW32)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkpdDo4ACgkQE5f5cImnZrs3fwCfaOB3Ki+FKf1rrJP3kKfdxV2X
> D68AoKxBh/3C0Bh3MVZ8AniBoen9vuX/
> =X+Od
> -----END PGP SIGNATURE-----
>