Re: Proposal - Reduce HTTP2 frame length from 16 to 12 bits

William Chan (陈智昌) <willchan@chromium.org> Tue, 28 May 2013 23:26 UTC

Date: Tue, 28 May 2013 16:25:39 -0700
To: Roberto Peon <grmocg@gmail.com>
Cc: James M Snell <jasnell@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>, Patrick McManus <mcmanus@ducksong.com>
Subject: Re: Proposal - Reduce HTTP2 frame length from 16 to 12 bits
Archived-At: <http://www.w3.org/mid/CAA4WUYhOnocH7nxX=ZmzH8jyygF_JAaYzTezCWFXP1XdTUEgKg@mail.gmail.com>

Just to be clear, I don't feel too strongly here. I do want to address a
point as I feel my previous point was lost.


On Tue, May 28, 2013 at 1:12 PM, Roberto Peon <grmocg@gmail.com> wrote:

> responses inline
>
>
> On Tue, May 28, 2013 at 12:16 PM, William Chan (陈智昌) <
> willchan@chromium.org> wrote:
>
>> On Tue, May 28, 2013 at 11:50 AM, James M Snell <jasnell@gmail.com> wrote:
>>
>>> On Tue, May 28, 2013 at 11:41 AM, Roberto Peon <grmocg@gmail.com> wrote:
>>> > As a reverse proxy, I've seen properties for which 4k writes/reads were
>>> > too small and induced latency increases.
>>> >
>>>
>>> I haven't played with this part too much yet but this is my general
>>> suspicion also.
>>>
>>
>> Can you guys clarify this in more detail? Specifically, where the latency
>> comes from. I have ideas, but I'd rather have an authoritative explanation.
>>
>
> It always comes down to the cost of the context switches (i.e. syscalls)
> and the locking that must be done in the lower layers of the IO stack.
>

Thanks for the clarification, I suspected it was the write()/read() cost,
which I assume is what you mean by syscall.


>
>
>>
>>>
>>> > Admittedly, frame size doesn't have to be the same as read/write size,
>>> > but it certainly does encourage that implementation (which is, I think,
>>> > the point of smaller max frame size that you proposed).
>>>
>>
>> You're right that it does encourage that implementation. Just like a
>> larger length encourages just naively breaking up frames into that max
>> frame size and thus hurting responsiveness. Which one is likelier to cause
>> worse overall "performance" (I know this is vague, since people care about
>> different aspects of perf)? What we want to do is have the most reasonable
>> default behavior, with the ability for performant implementations to tune
>> without unreasonable difficulty. I believe we're mostly focusing here on
>> optimizing the naive implementations, not the highly tuned implementations.
>>
>
> Remember that I'm the one who proposed the smaller max frame size in the
> first place (now a fair while ago)? :)
>

I don't believe I've said anything that would imply I forgot that :)


> My sweet-spot number was 16k, as I knew that I could saturate a 10G nic
> with 16k frames/writes and have enough CPU left over to do some actual
> work. The amount of overhead goes up more than linearly with the decrease
> in frame size thanks to contention, etc.
>

I think you miss my point. Please correct me if I'm wrong, but I think
you're saying that for your server, 16k was the right choice for write()s.
write() sizes don't need to be tied to actual frame size, but of course
that's what a naive implementation would do. And again, I think we should
pick a max frame size that results in reasonable behavior for naive
implementations/deployments. And I think the highly performant
implementations will want to write their code in a way that decouples frame
size from write() size, and will pick the optimal write() size given the
tradeoffs.
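
A minimal sketch of that decoupling, in Python. The 8-byte header layout
(16-bit length, type, flags, stream id) and the names frames() and
batched_writes() are assumptions made for illustration, not text from the
draft. The 4096-byte payload limit and the 16k write size are the figures
discussed in this thread.

```python
import struct

# Assumed 8-byte frame header: 16-bit length, 8-bit type, 8-bit flags,
# 32-bit word holding a 31-bit stream id. Illustrative layout only.
FRAME_HEADER = struct.Struct("!HBBI")

MAX_FRAME_PAYLOAD = 4096   # 12 bits worth of payload per frame
WRITE_SIZE = 16 * 1024     # upper bound on bytes handed to one write()


def frames(stream_id, payload, max_payload=MAX_FRAME_PAYLOAD):
    """Yield encoded frames, each carrying at most max_payload bytes."""
    for offset in range(0, len(payload), max_payload):
        chunk = payload[offset:offset + max_payload]
        header = FRAME_HEADER.pack(len(chunk), 0, 0, stream_id & 0x7FFFFFFF)
        yield header + chunk


def batched_writes(encoded_frames, write_size=WRITE_SIZE):
    """Coalesce whole frames into buffers of at most write_size bytes.

    A single frame larger than write_size is still emitted on its own.
    """
    buf = bytearray()
    for frame in encoded_frames:
        if buf and len(buf) + len(frame) > write_size:
            yield bytes(buf)
            buf.clear()
        buf += frame
    if buf:
        yield bytes(buf)
```

Each buffer from batched_writes() would go to one write() or sendall() call.
A real sender would also decide, at each frame boundary, which stream's frame
to append next. This sketch only shows that frame size and write size can be
tuned independently.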


>
>
>>
>>
>>> >
>>> > I propose we keep the 16 bit frame size and instead allow the (now
>>> > negotiated setting of) max frame size to default to 12 bits worth, with
>>> > that going upwards or downwards when a settings frame arrives from the
>>> > other side indicating its max receive size.
>>> >
>>>
>>> Honestly, I'd prefer to do away with frame size negotiation altogether
>>> because of the potential for path mtu style issues. Keeping the 16-bit
>>> size for now with strong encouragement (SHOULD, perhaps?) for keeping
>>> sizes around 12-bit lengths for the most common cases seems like the
>>> right approach.
>>>
>>> -- James
>>>
>>
> Unlike TCP/IP, max frame size is a point-to-point thing, as the primitive
> we mux is streams, not frames. Frames are the way we accomplish the muxing.
> Why would there be any path MTU like thing?
>
> -=R
>
>
>>
>>> > This would give the best chance that the code would be written in such a
>>> > way as to adapt with the times as they change.
>>> > -=R
>>> >
>>> > On May 28, 2013 10:01 AM, "William Chan (陈智昌)" <willchan@chromium.org>
>>> > wrote:
>>> >>
>>> >> Can you clarify what you mean by a documented performance metric for
>>> >> non-browser use cases? I don't think Patrick said anything browser
>>> >> specific. He provided some serialization latency numbers and noted that
>>> >> they are high enough to impact responsiveness. And then he provided
>>> >> numbers on overhead.
>>> >>
>>> >> I, for one, find the responsiveness argument compelling for browsers. I'm
>>> >> not completely sure 0.2% is low enough overhead for everyone, but I
>>> >> wouldn't complain about it. And in absence of complaints, I guess I'd
>>> >> support moving forward with only 12 bits for length.
>>> >>
>>> >>
>>> >> On Tue, May 28, 2013 at 9:22 AM, James M Snell <jasnell@gmail.com> wrote:
>>> >>>
>>> >>> Currently, my only challenge with this is that, so far, we have not
>>> >>> seen any documented performance metrics for non-browser based uses.
>>> >>> That said, I don't really have the time currently to put together a
>>> >>> comprehensive set of such metrics so it wouldn't be polite of me to
>>> >>> insist on them ;-) ... perhaps for now we ought to keep the 16-bit
>>> >>> size but include a recommendation about not exceeding 12-bits, then
>>> >>> see what more implementation experience does for us.
>>> >>>
>>> >>> On Tue, May 28, 2013 at 7:20 AM, Patrick McManus <mcmanus@ducksong.com>
>>> >>> wrote:
>>> >>> > Hi All,
>>> >>> >
>>> >>> > I've been looking at a lot of spdy frames lately, and I've noticed what I
>>> >>> > consider a common implementation problem that I think a good http/2 spec
>>> >>> > could help with. I'm commonly seeing frames large enough to interfere with
>>> >>> > effective prioritization. I've seen this from at least 3 different
>>> >>> > servers.
>>> >>> >
>>> >>> > The HTTP/2 draft has a max frame size of 16 bits, which is a huge
>>> >>> > improvement from spdy's 24. I propose we reduce it further to 12 (i.e.
>>> >>> > 4096 bytes).
>>> >>> >
>>> >>> > The muxxed approach of multiple streams onto one connection done in HTTP/2
>>> >>> > has great advantages, but the one downside of it is that it creates head of
>>> >>> > line blocking problems between those streams dictated by frame granularity.
>>> >>> > With small frames this is pretty manageable, with extremely large ones
>>> >>> > we've recreated the same head of line problems that HTTP/1 pipelines have.
>>> >>> > The server needs to be able to respond quickly to higher priority events
>>> >>> > (including cancellations) and once it has written a frame header to the
>>> >>> > wire it is committed to the entire frame for however long it takes to
>>> >>> > serialize it. IMO the shorter that time, the better.
>>> >>> >
>>> >>> > Our spec can help implementations do the right thing here by limiting the
>>> >>> > max frame size to 12 bits.
>>> >>> >
>>> >>> > It takes 500msec to serialize 64KB at 1Mbit/sec... 125msec at 4Mbit/sec.
>>> >>> > Those are some pretty notable task-switch times. Dropping the frame to
>>> >>> > 4096 cuts them to 32msec and 8msec. That's much more responsive, at the
>>> >>> > cost of 120 extra bytes of transfer (< 1msec at 1Mbit/sec).
>>> >>> >
>>> >>> > In general - the smaller the better as long as the overhead doesn't get to
>>> >>> > be too large. At 8 in 4096 (~.2%) I think that's acceptable. It's roughly
>>> >>> > the same overhead as a VLAN tag.
>>> >>> >
>>> >>> > Obviously this makes a continuation bit for control frames absolutely
>>> >>> > mandatory, but I think we're already in that spot with 16 bit frame
>>> >>> > lengths.
>>> >>> >
>>> >>> > -Patrick
>>> >>> >
>>> >>> >
>>> >>>
>>> >>
>>> >
>>>
>>
>>
>
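
The serialization and overhead figures in Patrick's original message can be
reproduced with a short calculation. This is a rough sketch: the function
names are made up for illustration, the 8-byte frame header is taken from the
"8 in 4096" figure in the thread, link speeds are treated as decimal megabits
per second, and 64KB is treated as a single 65536-byte frame, as the message
does. The results land close to the rounded numbers quoted above.

```python
def serialization_ms(frame_bytes, link_bits_per_sec):
    """Time to put one frame on the wire, in milliseconds."""
    return frame_bytes * 8 / link_bits_per_sec * 1000


def framing_overhead_bytes(payload_bytes, max_frame, header_bytes=8):
    """Header bytes needed to carry payload_bytes in frames of max_frame."""
    n_frames = -(-payload_bytes // max_frame)  # ceiling division
    return n_frames * header_bytes


if __name__ == "__main__":
    for frame in (64 * 1024, 4096):
        for mbit in (1, 4):
            ms = serialization_ms(frame, mbit * 1_000_000)
            print(f"{frame:6d} bytes at {mbit} Mbit/sec: {ms:.1f} msec")

    # Extra header bytes when 64KB is sent as 4096-byte frames instead of
    # one large frame.
    extra = (framing_overhead_bytes(64 * 1024, 4096)
             - framing_overhead_bytes(64 * 1024, 64 * 1024))
    print(f"extra header bytes: {extra}")
    print(f"per-frame overhead: {8 / 4096:.2%}")
```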