Re: delta encoding and state management

William Chan (陈智昌) <willchan@chromium.org> Wed, 23 January 2013 00:53 UTC

From: "William Chan (陈智昌)" <willchan@chromium.org>
To: Willy Tarreau <w@1wt.eu>
Cc: James M Snell <jasnell@gmail.com>, Nico Williams <nico@cryptonector.com>, Roberto Peon <grmocg@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Subject: Re: delta encoding and state management
Archived-At: <http://www.w3.org/mid/CAA4WUYjGiA0WP6o3ub5ZPTh-zYZ9Jth6w2GfuMT+WarT69GW-A@mail.gmail.com>

On Tue, Jan 22, 2013 at 4:00 PM, Willy Tarreau <w@1wt.eu> wrote:
> On Tue, Jan 22, 2013 at 03:08:08PM -0800, William Chan (陈智昌) wrote:
>> >> How long do you delay
>> >> the resource request in order to consolidate requests into a load
>> >> group? The same thing is even more true for response headers.
>> >
>> > I never want to delay anything, delays only do bad things when we
>> > try to reduce latency.
>>
>> One of us has the wrong mental model for how the proposal would work.
>> Let's figure this out.
>>
>> Let's say the browser requests foo.html. It receives a response packet
>> for foo.html, referencing 1.js. 5ms later, it receives packet 2 for
>> foo.html which references 2.js. 5ms later, it receives packet 3 for foo.html
>> which references 3.js. And so on. You say no delays. So does this mean
>> each "group" only includes one object each time?
>
> Ah OK I didn't understand. My assumption was that browsers do have a list
> of objects to be fetched, but with what you're explaining, it might not
> always be true. Anyway, the principle I proposed was that all subsequent
> requests remain in the same group until a new group is emitted, so that
> should cover the need for new objects that are discovered one at a time.
> However, I do think (but may be wrong) that objects are not often scheduled
> to go on the wire one at a time, but that when many objects appear in the
> contents, many of them are seen together.
>
>> And now let's ignore the 5ms delays. Consider how WebKit works. Let's
>> say WebKit has all of foo.html. It starts parsing it. It encounters
>> 1.js. It immediately sends the resource request to the network stack.
>> It hasn't parsed the full document yet, so it doesn't know if it'll
>> encounter any more resources. Each time it encounters a resource while
>> parsing the document, it will send it to the network stack (in
>> Chromium and latest versions of Safari, this is a separate process).
>
> I must say I'm a bit shocked by this behaviour which is very inefficient
> from a TCP point of view. This means you have two possibilities for sending
> your requests then:
>   - either you keep Nagle enabled and your requests wait in the kernel's stack
>     for some time (typically 40 ms) before leaving, even if the request is
>     the last one;
>
>   - or you disable Nagle to force them to leave immediately, but then each
>     request leaves with a TCP push flag, and then your TCP stack will not
>     send anything else over the same socket for a full RTT (until its pending
>     data are ACKed), which is worse.

We disable Nagle on our sockets. I must be missing something. Why
would the TCP stack only send one packet per roundtrip when you
disable Nagle? I do not believe TCP stacks only allow one packet's
worth of unack'd data.
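
(For reference, this is roughly what "disabling Nagle" means at the socket
level; a minimal sketch using the generic setsockopt() call, not the exact
Chromium code, with error handling omitted:)

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    // Disable Nagle's algorithm so small writes (e.g. a single request's
    // headers) go out immediately instead of waiting to be coalesced.
    bool DisableNagle(int fd) {
      int on = 1;
      return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0;
    }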

>
> This is why we generally try to fill packets over the wire as much as
> possible. An alternative consists of opening many connections, but that is
> not efficient either (RTTs, upstream packets).
>
> So in practice I suspect that you already send requests with Nagle enabled
> and disable it when you reach the end of the page, so that whatever can leave
> is delayed at most 40ms and never more than the time to parse the whole page.
> If this is the case, then you already have your requests delayed by as much
> as 40ms and sent as groups.

No. See https://code.google.com/searchframe#OAMlx_jo-ck/src/net/socket/tcp_client_socket_win.cc&exact_package=chromium&q=nagle&type=cs&l=117.

>
>> What is the network stack to do if, as you say, it should never delay
>> anything? If I understand correctly, each "group" would always only
>> include one object then.
>
> I did not understand you meant delay between objects while parsing, I
> thought you meant delay between groups.

Indeed, there's a delay between objects. There must be. As previously
stated, we have no means of predicting the future. We do not know that
there are more objects in the document yet to be parsed. When we
encounter an object during parsing, we request it immediately.

>
> Here you're limited by TCP. If you push too fast, you have to wait one RTT
> between requests. If you ask the kernel to disable quick ACK or if you keep
> Nagle enabled (using TCP_CORK, MSG_MORE, etc.), your requests will
> automatically leave between 40 and 200ms even if incomplete (far too much).
>
> However, considering that only incomplete packets will remain pending
> for the time it takes to parse the page and will leave anyway if it takes
> longer than that, I think it remains optimal to feed the kernel's buffers
> and let whichever of the kernel or the HTML parser acts first send incomplete
> segments. Otherwise you'd delay subsequent requests by an RTT in the TCP
> stack.
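
(For concreteness, a rough sketch of the corking approach described above,
assuming a Linux socket; TCP_CORK is Linux-specific and the helper below is
purely illustrative:)

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    // Hold partial segments in the kernel while corked, so several small
    // requests written during parsing can share one packet.
    void SetCork(int fd, bool corked) {
      int on = corked ? 1 : 0;
      setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
    }

    // Usage sketch: cork, write each request as the parser discovers it,
    // then uncork when the end of the document is reached.
    //   SetCork(fd, true);
    //   /* write() each request */
    //   SetCork(fd, false);   // releasing the cork flushes pending data
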
>
>> > In the example I proposed, the recipient receives the full headers
>> > block, then from that point, all requests reuse the same headers
>> > and can be processed immediately (just like pipelining in fact).
>> >
>> > Concerning response headers, I'd say that you emit a first response
>> > group with the headers from the first response, followed by the
>> > response. When another response comes in, you have two possibilities:
>> > either it shares the same headers and you can add a response to the
>> > existing group, or it does not and you open a new group.
>>
>> Wait, is this the critical misunderstanding? Are you maintaining state
>> across requests and responses? Isn't this a minor modification of the
>> "simple" compressor? I was assuming you were trying to be stateless.
>
> I'm having a hard time following you, I'm sorry. What state across requests
> and responses do you mean? The only "state" I'm talking about is the list
> of common headers between the current message and the previous one in fact.
> This is true both for requests and responses.

Yes, this is what I refer to as "simple" compression, as coined by
Mark (see http://www.mnot.net/blog/2013/01/04/http2_header_compression):
"""
“simple” - Omitting headers repeated from the last message, tokenising
common field names, and a few other tweaks. Otherwise, it looks like
HTTP/1.
"""

I consider this a form of stateful compression. If you are fine with
that, then great: I think we've made significant progress in our
discussions here, and it's a good starting point for discussing
connection state requirements.
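
(To make the amount of state concrete, here is a rough sketch of that kind of
"omit what was repeated from the last message" scheme. It is a hypothetical
illustration of the per-connection state involved, not Willy's proposal or any
actual delta codec, and all names in it are made up:)

    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    // Per-connection encoder state: the header set of the previous message.
    class SimpleHeaderEncoder {
     public:
      struct Delta {
        std::vector<std::pair<std::string, std::string>> changed;  // new or modified
        std::vector<std::string> removed;  // present last time, absent now
      };

      // Emit only what differs from the previous message's headers.
      Delta Encode(const std::map<std::string, std::string>& headers) {
        Delta d;
        for (const auto& kv : headers) {
          auto it = prev_.find(kv.first);
          if (it == prev_.end() || it->second != kv.second)
            d.changed.emplace_back(kv.first, kv.second);
        }
        for (const auto& kv : prev_) {
          if (headers.find(kv.first) == headers.end())
            d.removed.push_back(kv.first);
        }
        prev_ = headers;  // this map is the state carried across messages
        return d;
      }

     private:
      std::map<std::string, std::string> prev_;
    };

In this sketch the decoder would keep the same prev_ map on its side, which is
exactly the per-connection state being discussed.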

>
> Regards,
> Willy
>