Re: HTTP/2 and TCP CWND

William Chan (陈智昌) <willchan@chromium.org> Wed, 24 April 2013 19:29 UTC

On Wed, Apr 24, 2013 at 11:52 AM, Peter Lepeska <bizzbyster@gmail.com> wrote:

>
> On Apr 24, 2013, at 12:36 PM, William Chan (陈智昌) <willchan@chromium.org>
> wrote:
>
> On Wed, Apr 24, 2013 at 8:40 AM, Peter Lepeska <bizzbyster@gmail.com> wrote:
>
>> Not sure this has been proposed before, but better than caching would be
>> dynamic initial CWND based on web server object size hinting.
>>
>> Web servers often know the size of the object that will be sent to the
>> browser. The web server therefore can help the transport make smart initial
>> CWND decisions. For instance, if an object is less than 20KB, which is true
>> for the majority of objects on web pages, the web server could tell the
>> transport to increase the CWND to a size that would allow the object to be
>> sent in the initial window.
>>
>
> In the HTTP/2 case where we often are multiplexing, this doesn't seem to
> make as much sense. Also, I'm not sure that it's a reasonable argument to
> select initcwnd in absence of any congestion information...or were you
> suggesting merely tweaking the initcwnd a little bit if that little bit
> would make a difference in terms of fitting the whole object in the
> initcwnd?
>
>
> Right. A small number of multiplexed connections transfer less of a given
> page's data in slow start so this will have less impact for those
> connections. However it's worth noting that often the first object
> requested over the multiplexed channel will be the root object alone and of
> course the number of round trips to download the root object directly
> impacts page load time.
>

We should move away from this assumption that the first request is for the
root object. I've been advising companies on how to do SPDY deployments,
and a common scenario is an origin server hosting the root doc plus a
SPDY-capable CDN for the subresources (primarily images served on the edge).
These CDNs are going to serve a burst of traffic immediately, and those
subresources often have a high impact on the above-the-fold perceived latency
(in many of today's websites, images form a big part of the initial
viewport's content, so serving these images quickly is vital). In today's
non-SPDY / HTTP2 case, they just domain shard and do 6 * [2-4] sharded
hosts, for 12-24 connections with IW10, starting out with effective
initcwnds of 120+. They are gaming initcwnd to the benefit of their users
that don't have a congested path, and to the severe detriment of users that
cannot handle such high bursts. This situation sucks.
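
To make the sharding arithmetic concrete, here is a minimal sketch. It
assumes 6 connections per hostname and IW10 (10 segments per connection), as
in the numbers above; the function name and parameters are hypothetical and
only illustrate the multiplication.

```python
# Illustrative only: aggregate initial window across domain-sharded
# HTTP/1.x connections. Assumes every connection starts at the same IW.
CONNS_PER_HOST = 6   # assumed per-hostname connection count
IW_SEGMENTS = 10     # IW10: initial congestion window of 10 segments


def effective_initcwnd(sharded_hosts, conns_per_host=CONNS_PER_HOST,
                       iw=IW_SEGMENTS):
    """Total segments that can be in flight in the first round trip."""
    connections = sharded_hosts * conns_per_host
    return connections * iw


if __name__ == "__main__":
    for hosts in (2, 3, 4):
        conns = hosts * CONNS_PER_HOST
        print(f"{hosts} hosts -> {conns} connections, "
              f"effective initcwnd {effective_initcwnd(hosts)} segments")
```

With 2 sharded hosts this gives 12 connections and 120 segments; with 4 it
gives 24 connections and 240 segments, versus 10 for a single multiplexed
connection at IW10.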


>
> Caching attempts to reuse old congestion information, although it has been
> reasonably pointed out that the validity of that information is
> questionable. It's an open research question as far as I'm concerned, and
> I'd love to see any data people had.
>
>
>>
>> For larger objects, the benefit of a large CWND is minimal so the web
>> server could tell the transport to use the default and let the connection
>> ramp slowly.
>>
>
> I'm not sure this makes sense. GMail and Google+ and I'm sure other large
> web apps have rather large scripts and stylesheets, but they still care
> about their initial page load latency. Perhaps you're making the assumption
> that large objects imply the user does not have interactivity /
> low-latency expectations? If so, that's invalid. Those roundtrips still
> matter and I can tell you our Google app teams work very hard to eliminate
> them. Or maybe your definition of large is larger than what I'm thinking.
>
>
> The threshold is tunable. My point here is if the TCP connection is going
> to be used to download a 100 MB file, or stream a video, then slow start
> has a negligible impact on overall download time for the file.
>

Sure, if you're doing non-interactive large data transfers, then the slow
start latency isn't going to matter much. I don't view that conversation as
very interesting, and no one's agitating for change there. The contentious
and more interesting discussion is how to safely yet quickly start up TCP
connections for interactive bursty traffic like web browsing. I include
video web sites like YouTube amongst that, even if their objects are large,
since the time to start viewing the video is still important.
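
As a rough illustration of why the initial window matters for small objects
and barely matters for bulk transfers, here is a sketch that counts round
trips under an idealized slow start. It assumes the window doubles every
RTT, no loss, no receive-window limit, and a 1460-byte MSS; all of those
are simplifying assumptions, and the function name is hypothetical.

```python
# Idealized slow-start model: cwnd starts at `iw` segments and doubles
# each round trip. Ignores loss, delayed ACKs, and receiver window limits.
MSS_BYTES = 1460  # assumed maximum segment size


def slow_start_rounds(size_bytes, iw=10, mss=MSS_BYTES):
    """Round trips needed to send size_bytes under the model above."""
    segments = (size_bytes + mss - 1) // mss  # ceiling division
    cwnd, sent, rounds = iw, 0, 0
    while sent < segments:
        sent += cwnd   # send one full window this round trip
        cwnd *= 2      # window doubles per RTT in slow start
        rounds += 1
    return rounds


if __name__ == "__main__":
    for label, size in (("20 KB", 20 * 1024), ("100 MB", 100 * 1024 * 1024)):
        for iw in (10, 32):
            print(f"{label} object, IW{iw}: "
                  f"{slow_start_rounds(size, iw=iw)} round trips")
```

Under these assumptions a 20 KB object takes 2 round trips at IW10 and 1 at
IW32, while a 100 MB transfer takes 13 versus 12. The bigger window halves
the round trips for the small object and saves only one of thirteen for the
bulk transfer.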


>
>
>
>> Peter
>>
>>
>>
>>
>> On Mon, Apr 15, 2013 at 8:16 PM, Roberto Peon <grmocg@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Mon, Apr 15, 2013 at 4:03 PM, Eggert, Lars <lars@netapp.com> wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>> On Apr 15, 2013, at 15:56, Roberto Peon <grmocg@gmail.com> wrote:
>>>> > The interesting thing about the client mucking with this data is that, so
>>>> > long as the server's TCP implementation is smart enough not to kill itself
>>>> > (and some simple limits accomplish that), the only one the client harms is
>>>> > itself...
>>>>
>>>> I fail to see how you'd be able to achieve this. If the server uses a
>>>> CWND that is too large, it will inject a burst of packets into the network
>>>> that will overflow a queue somewhere. Unless you use WFQ or something
>>>> similar on all bottleneck queues (not generally possible), that burst will
>>>> likely cause packet loss to other flows, and will therefore impact them.
>>>>
>>>
>>> The most obvious way is that the server doesn't use a CWND which is
>>> larger than the largest currently active window to a similar RTT. The other
>>> obvious way is to limit it to something like 32, which is about what we'd
>>> see with the opening of a mere 3 regular HTTP connections! This at least
>>> makes the one connection competitive with the circumventions that HTTP/1.X
>>> currently exhibits.
>>>
>>>
>>>> TCP is a distributed resource sharing algorithm to allocate capacity
>>>> throughout a network. Although the rates for all flows are computed in
>>>> isolation, the effect of that computation is not limited to the flow in
>>>> question, because all flows share the same queues.
>>>>
>>>
>>> Yes, that is what I've been arguing w.r.t. the many connections that the
>>> application-layer currently opens :)
>>> It becomes a question of which dragon is actually most dangerous.
>>>
>>> -=R
>>>
>>>
>>>>
>>>> Lars
>>>
>>>
>>>
>>
>
>