Re: HTTP/2 and TCP CWND

Roberto Peon <grmocg@gmail.com> Mon, 15 April 2013 22:17 UTC

In-Reply-To: <8B0AAE84-CAB8-483B-99FD-DA6A0CA13395@netapp.com>
References: <516B8824.8040904@cisco.com> <DF8F6DB7E5D58B408041AE4D927B2F48CBB88103@CINURCNA14.e2k.ad.ge.com> <CAP+FsNfeUtKfOMPKriYP7Ak_YzsjEFKvprJOAQaxYP7_BxTBsw@mail.gmail.com> <cf53405c48dc431693573a9148776c8a@BN1PR03MB072.namprd03.prod.outlook.com> <8B0AAE84-CAB8-483B-99FD-DA6A0CA13395@netapp.com>
Date: Mon, 15 Apr 2013 15:15:30 -0700
Message-ID: <CAP+FsNca6TOB2B-ntnEHvzPx3JY=6Qcp34RgF7uQsbdsLUbptQ@mail.gmail.com>
From: Roberto Peon <grmocg@gmail.com>
To: "Eggert, Lars" <lars@netapp.com>
Cc: Gabriel Montenegro <Gabriel.Montenegro@microsoft.com>, "Simpson, Robby (GE Energy Management)" <robby.simpson@ge.com>, Eliot Lear <lear@cisco.com>, Robert Collins <robertc@squid-cache.org>, Jitu Padhye <padhye@microsoft.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, "Brian Raymor (MS OPEN TECH)" <Brian.Raymor@microsoft.com>, Rob Trace <Rob.Trace@microsoft.com>, Dave Thaler <dthaler@microsoft.com>, Martin Thomson <martin.thomson@skype.net>, Martin Stiemerling <martin.stiemerling@neclab.eu>

I'll point out that it is very clear that we have (and have had) consensus
to remove this feature as it exists today.
So, we should.
It is and always has been a tool for research for the transport layer.

If it were defined as an opaque blob that the transport layer delegates to
the application layer to transmit and cache, would it seem as scary?


On Mon, Apr 15, 2013 at 2:03 PM, Eggert, Lars <lars@netapp.com> wrote:

> Hi,
>
> I had commented in an off-list discussion on this issue and was asked to
> summarize what I said to the list. So here we go.
>
> I fully understand why the idea to bypass slow-start and instead start
> with the window used during the last connection instantiation sounds nice.
> But: it has been thought of before a dozen times and has huge issues.
>
> This has the potential to generate large line-rate bursts into the
> network, which can cause loss bursts and force TCP into timeout-based
> recovery, which has a huge impact on throughput. (Much more so than
> slow-starting with a smaller window.) That is, because you normally have no
> idea if the path conditions are at all comparable between when you cached
> that CWND and when you want to reuse it. So when you burst and create a
> series of losses - for yourself and other flows on the bottleneck! - they
> all go into timeout of a few hundred ms at least and then slow-start.
>
> The TCP WG has been working on the Google "IW10" proposal (allowing TCP to
> start with an initial window of 10 segments rather than 1-3). That seems to
> mitigate much of the need for caching the CWND, since new connections
> wouldn't need to start with very small windows. A large part of the
> discussion around that proposal was exactly on the question of how large
> the initial window can be without significantly increasing the danger of
> line-rate bursts. There has been a pretty in-depth analysis by multiple
> folks into whether 10 is safe or not, and the consensus seems to be that it
> should be. Just caching and reusing any arbitrarily large CWND is certainly
> not safe.
>

I've been a part of that research-- the measurement apparatus for the
proposal was done on my machines and code!
There is still an issue of how many connections get opened. Given current
implementations, with an initcwnd of 10, you are likely to get bursts of
between 10 and 60 packets, as the browser often opens up connections in
parallel, and the server responds with static content with alacrity.
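As a rough illustration of that arithmetic (a sketch only, not a measurement):
if every parallel connection fills its initial window in the same round trip,
the aggregate burst is simply connections times initcwnd. The function name
and the "no pacing, all connections start together" worst case are assumptions
made for this example.

```python
def aggregate_initial_burst(connections: int, initcwnd: int = 10) -> int:
    """Worst-case packets sent in the first round trip.

    Assumes (for illustration) that every connection starts at the same
    time and sends a full initial window with no pacing.
    """
    if connections < 1 or initcwnd < 1:
        raise ValueError("connections and initcwnd must be positive")
    return connections * initcwnd


if __name__ == "__main__":
    # One connection versus six parallel connections, initcwnd of 10.
    for n in (1, 6):
        print(f"{n} connection(s): up to {aggregate_initial_burst(n)} packets")
```

With initcwnd of 10 this gives the 10-to-60 packet range for one to six
parallel connections.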



>
> The issue itself has been thought about for much longer, cf.
> http://tools.ietf.org/html/draft-hughes-restart-00 from 2002, which talks
> about the issue of what the window should be after a connection has been
> idle for a while and wants to resume sending.
>

Good stuff that is ignored when we have lots of (6-36) connections starting
up simultaneously or nearly so. :/


>
> Another related work item in TCP is
> http://tools.ietf.org/html/draft-ietf-tcpm-newcwv-00, which attempts to
> specify what TCP should do during periods where it didn't send at a rate
> that used up the current window, which can also lead to bursting when
> traffic demands increase.
>
> I'm mentioning this, because a lot of the work of the TCP WG revolves
> around mitigating these bursts in order to avoid stalls due to
> timeout-based recovery, and having HTTP go off and define knobs that would
> actively counteract that work seems, ahem, counterproductive.
>

I think you mistake the intent. The intent is to make it easy for transport
experimentation by giving a mechanism that can be implemented today of
storing transport-related data, and by giving that back to the transport
layer upon session resumption.
While an ugly thing (which should hopefully be a short-term band-aid for
the lack of this mechanism in the transport today), it works and does allow
for transport-level experimentation today. What would the transport folks
like stored for use by the transport layer? :)
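To make the store-and-give-back idea concrete, here is a minimal sketch of an
application-layer cache that holds an opaque transport blob per peer and
returns it on session resumption. Everything in it is hypothetical: the class
name, the method names, the peer key, and especially the max-age expiry, which
is an assumption added for this example and not part of any proposal or wire
format.

```python
import time
from typing import Dict, Optional, Tuple


class TransportBlobCache:
    """Hypothetical application-layer store for opaque transport state.

    The application never interprets the blob; it only keeps it and hands
    it back to the transport layer when a session to the same peer resumes.
    """

    def __init__(self, max_age_seconds: float = 300.0) -> None:
        # Assumption for this sketch: stale blobs are dropped, since path
        # conditions may have changed since the blob was cached.
        self._max_age = max_age_seconds
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def store(self, peer: str, blob: bytes, now: Optional[float] = None) -> None:
        """Remember the transport's opaque blob for this peer."""
        stamp = time.monotonic() if now is None else now
        self._entries[peer] = (stamp, bytes(blob))

    def resume(self, peer: str, now: Optional[float] = None) -> Optional[bytes]:
        """Return the cached blob for this peer, or None if absent or stale."""
        entry = self._entries.get(peer)
        if entry is None:
            return None
        stamp, blob = entry
        current = time.monotonic() if now is None else now
        if current - stamp > self._max_age:
            del self._entries[peer]
            return None
        return blob
```

What the transport layer does with the returned bytes is left entirely to the
transport; the application layer in this sketch only stores and returns them.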


>
> I'm all for making HTTP and TCP work better together. Limiting the number
> of parallel connections, seeing if we can increase the initial window
> safely, and other similar things are all great examples of what we should
> be doing more of. But the TCP and HTTP folks will need to work together on
> this - we can't afford to get this wrong.
>

Agreed (and I believe that I said as much earlier); however, I can't think
of a good mechanism for limiting the number of parallel connections for
HTTP/1.1.
That is really my boogeyman.

-=R