Re: HTTP/2 and TCP CWND

"Eggert, Lars" <lars@netapp.com> Mon, 22 April 2013 08:07 UTC

Date: Mon, 15 Apr 2013 21:03:53 +0000
From: "Eggert, Lars" <lars@netapp.com>
To: Gabriel Montenegro <Gabriel.Montenegro@microsoft.com>
CC: Roberto Peon <grmocg@gmail.com>, "Simpson, Robby (GE Energy Management)" <robby.simpson@ge.com>, Eliot Lear <lear@cisco.com>, Robert Collins <robertc@squid-cache.org>, Jitu Padhye <padhye@microsoft.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, "Brian Raymor (MS OPEN TECH)" <Brian.Raymor@microsoft.com>, Rob Trace <Rob.Trace@microsoft.com>, Dave Thaler <dthaler@microsoft.com>, Martin Thomson <martin.thomson@skype.net>, Martin Stiemerling <martin.stiemerling@neclab.eu>
Message-ID: <8B0AAE84-CAB8-483B-99FD-DA6A0CA13395@netapp.com>
References: <516B8824.8040904@cisco.com> <DF8F6DB7E5D58B408041AE4D927B2F48CBB88103@CINURCNA14.e2k.ad.ge.com> <CAP+FsNfeUtKfOMPKriYP7Ak_YzsjEFKvprJOAQaxYP7_BxTBsw@mail.gmail.com> <cf53405c48dc431693573a9148776c8a@BN1PR03MB072.namprd03.prod.outlook.com>
In-Reply-To: <cf53405c48dc431693573a9148776c8a@BN1PR03MB072.namprd03.prod.outlook.com>
Subject: Re: HTTP/2 and TCP CWND
Archived-At: <http://www.w3.org/mid/8B0AAE84-CAB8-483B-99FD-DA6A0CA13395@netapp.com>

Hi,

I had commented in an off-list discussion on this issue and was asked to summarize what I said to the list. So here we go.

I fully understand why the idea to bypass slow-start and instead start with the window used during the last connection instantiation sounds nice. But it has been proposed a dozen times before, and it has huge issues.

This has the potential to generate large line-rate bursts into the network, which can cause loss bursts and force TCP into timeout-based recovery, which has a huge impact on throughput. (Much more so than slow-starting with a smaller window.) That is because you normally have no idea whether the path conditions are at all comparable between when you cached that CWND and when you want to reuse it. So when you burst and create a series of losses - for yourself and other flows on the bottleneck! - they all go into a timeout of a few hundred ms at least and then slow-start.
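
To make the burst argument concrete, here is a toy model (my illustration, not something from the drafts): a sender dumps a whole window back-to-back into a bottleneck that drains more slowly than the sender's line rate. The names burst_drops, queue_limit and drain_ratio are hypothetical, and the model ignores ACK clocking, competing flows and retransmissions.

```python
def burst_drops(burst: int, queue_limit: int, drain_ratio: float) -> int:
    """Count tail drops when `burst` segments arrive back-to-back.

    queue_limit: bottleneck buffer size in segments (assumed value).
    drain_ratio: bottleneck rate divided by sender line rate, 0 < ratio <= 1
                 (assumed value; real paths vary over time).
    """
    queue = 0       # segments currently buffered at the bottleneck
    credit = 0.0    # fractional segments the bottleneck may still forward
    dropped = 0
    for _ in range(burst):
        # One segment arrives per sender transmission slot.
        if queue < queue_limit:
            queue += 1
        else:
            dropped += 1  # buffer full: tail drop
        # During the same slot the bottleneck forwards drain_ratio segments.
        credit += drain_ratio
        while credit >= 1.0 and queue > 0:
            queue -= 1
            credit -= 1.0
        if queue == 0:
            credit = 0.0  # an idle link cannot bank capacity
    return dropped


if __name__ == "__main__":
    # Small window vs. a large cached window on the same path.
    for window in (10, 100):
        print(window, burst_drops(window, queue_limit=20, drain_ratio=0.5))
```

With these assumed numbers, a 10-segment burst fits in the buffer and loses nothing, while a 100-segment burst loses 31 segments in a row-ish pattern, which is the kind of loss burst that tends to push TCP into timeout-based recovery.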

The TCP WG has been working on the Google "IW10" proposal (allowing TCP to start with an initial window of 10 segments rather than 1-3). That seems to mitigate much of the need for caching the CWND, since new connections wouldn't need to start with very small windows. A large part of the discussion around that proposal was exactly on the question of how large the initial window can be without significantly increasing the danger of line-rate bursts. There has been a pretty in-depth analysis by multiple folks into whether 10 is safe or not, and the consensus seems to be that it should be. Just caching and reusing any arbitrarily large CWND is certainly not safe.
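
As a rough illustration of why a larger initial window reduces the pressure to cache the CWND, the sketch below counts round trips under an idealized slow start (window doubles every RTT, no loss, no delayed ACKs, no receiver window limit). The function name rtts_to_send is hypothetical and the model is a simplification, not what any particular stack does.

```python
def rtts_to_send(total_segments: int, initial_window: int) -> int:
    """Round trips needed to send total_segments under idealized slow start."""
    if total_segments <= 0:
        return 0
    cwnd = initial_window  # segments allowed in flight this round trip
    sent = 0
    rtts = 0
    while sent < total_segments:
        sent += cwnd   # send a full window
        cwnd *= 2      # idealized slow start: double per RTT
        rtts += 1
    return rtts


if __name__ == "__main__":
    for size in (10, 100):
        print(size, rtts_to_send(size, 3), rtts_to_send(size, 10))
```

Under these assumptions a 10-segment response takes 3 round trips with an initial window of 3 and 1 round trip with an initial window of 10; a 100-segment response takes 6 versus 4.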

The issue itself has been thought about for much longer, cf. http://tools.ietf.org/html/draft-hughes-restart-00 from 2002, which discusses what the window should be after a connection has been idle for a while and wants to resume sending.

Another related work item in TCP is http://tools.ietf.org/html/draft-ietf-tcpm-newcwv-00, which attempts to specify what TCP should do during periods where it didn't send at a rate that used up the current window, which can also lead to bursting when traffic demands increase.

I'm mentioning this because a lot of the work of the TCP WG revolves around mitigating these bursts in order to avoid stalls due to timeout-based recovery, and having HTTP go off and define knobs that would actively counteract that work seems, ahem, counterproductive.

I'm all for making HTTP and TCP work better together. Limiting the number of parallel connections, seeing if we can increase the initial window safely, and other similar things are all great examples of what we should be doing more of. But the TCP and HTTP folks will need to work together on this - we can't afford to get this wrong. 

Lars

PS: I'm not on the WG list, so please CC me if you'd like to respond.