Re: [#153] PUSH_PROMISE headers

Amos Jeffries <squid3@treenet.co.nz> Sat, 29 June 2013 08:03 UTC

Message-ID: <51CE9415.6020900@treenet.co.nz>
Date: Sat, 29 Jun 2013 20:00:21 +1200
From: Amos Jeffries <squid3@treenet.co.nz>
To: ietf-http-wg@w3.org
References: <CABkgnnVGh9dLkfDrO2fq5TsnxwEu0Dff=LqJEJR5Odq2ibfDMg@mail.gmail.com> <CABP7RbcoSSSKJq3YbZ2ypw-xb0uOgFQcjcQP9tJdkgEjPfJVMA@mail.gmail.com> <CAP+FsNdJcZ_x6RidaVfP+VPtA3CwAbALgAqhOhAjZLzaz4tQRQ@mail.gmail.com> <CABP7Rbf_pGKU-yB-f=6fB5WoVvs087eOf6Beo4DDHGJWYX5XTQ@mail.gmail.com> <CAP+FsNd9BDHBO2YXEfHwvRuiJDDpbAEvCMR2BKLzcoaARxjDJA@mail.gmail.com> <CAP+FsNcEf6s5s7Jk=NLKdrdU8fV1AsSJ4u-8CZNT8P7YXvxkag@mail.gmail.com> <CABP7RbdoBswznxwDm+-00egSHV+h7fO7Ow+aw+mFhLm2Z=GRWg@mail.gmail.com>
In-Reply-To: <CABP7RbdoBswznxwDm+-00egSHV+h7fO7Ow+aw+mFhLm2Z=GRWg@mail.gmail.com>
Subject: Re: [#153] PUSH_PROMISE headers
Archived-At: <http://www.w3.org/mid/51CE9415.6020900@treenet.co.nz>

On 29/06/2013 5:44 p.m., James M Snell wrote:
> I just gave an example that uses content-negotiation so that argument
> doesn't really work. The server is crafting the request headers for an
> implied GET, so it gets to construct those request headers to be as
> specific as it wants them to be, without much need for actual
> content-negotiation.
>
> For example, suppose the server receives a request for an HTML page like this...
>
> GET /index.html HTTP/1.1
> Accept: text/html, image/jpeg, image/gif
>
> The server decides that it wants to push jpeg files to the client... it sends
>
> PUSH_PROMISE
>    :path = /images/f.jpg
>    :method = GET
>    :host = example.org
>    if-match: "my-etag1"
>    accept: image/jpeg
>
> That's pretty darn unambiguous. I can easily check to see if I have a
> representation of "/images/f.jpg" with etag "my-etag1" and
> content-type "image/jpeg" in my local cache, without the need to have
> *any* response headers in the PUSH_PROMISE.
>
> Perhaps you have another, more specific example in mind?

The above example is a good one.

The cache will store the pushed response under:
   hash(URL)+hash(Vary:"image/jpeg")

On a later explicit GET from this client, the cache will look for a HIT 
under:
   hash(URL)+hash(Vary:"text/html, image/jpeg, image/gif")

I assume you can spot the difference.
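
A minimal Python sketch of the two keys, assuming the response carries 
Vary: Accept and the cache derives the variant part from the raw Accept 
value of the request. The cache_key helper and the use of SHA-256 are 
illustrative assumptions, not how any particular cache computes its keys:

```python
import hashlib

def cache_key(url: str, accept: str) -> str:
    """Hypothetical variant key: hash(URL) + hash(raw Accept value)."""
    url_part = hashlib.sha256(url.encode()).hexdigest()[:16]
    vary_part = hashlib.sha256(accept.encode()).hexdigest()[:16]
    return url_part + "+" + vary_part

URL = "http://example.org/images/f.jpg"

# Key under which the pushed response is stored (Accept taken from PUSH_PROMISE).
pushed_key = cache_key(URL, "image/jpeg")

# Key the same client produces on a later explicit GET.
explicit_key = cache_key(URL, "text/html, image/jpeg, image/gif")

# Same URL part, different Vary part, so the lookup cannot match.
print(pushed_key == explicit_key)
```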

Middleware which looks up its cache on a future PUSH_PROMISE will also 
find the hash(URL)+hash(Vary:"image/jpeg") version and probably send 
RST_STREAM for the redundant data. Other clients will, however, still MISS 
on that object, and may even MISS on the second version above if, for 
example, they drop text/html from the Accept on their explicit GET for images.

... PUSH_PROMISE will only work on middleware which supports the 
proposed Key header. It will not work easily with Vary as you seem to think.
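
To illustrate the kind of matching that would be needed, here is a Python 
sketch in which the variant key depends only on whether the Accept list 
contains the stored media type, not on the raw header value. The matching 
rule and the function names are assumptions made for illustration; they 
are not the syntax or semantics of the Key proposal, and wildcards and 
q=0 are ignored:

```python
from typing import Set, Tuple

def accept_tokens(accept: str) -> Set[str]:
    """Split an Accept value into bare media types, dropping parameters such as q."""
    return {
        item.split(";")[0].strip().lower()
        for item in accept.split(",")
        if item.strip()
    }

def variant_key(url: str, accept: str, stored_type: str) -> Tuple[str, bool]:
    """Hypothetical normalised key: (URL, does Accept list stored_type?)."""
    return (url, stored_type.lower() in accept_tokens(accept))

URL = "http://example.org/images/f.jpg"

pushed = variant_key(URL, "image/jpeg", "image/jpeg")
explicit = variant_key(URL, "text/html, image/jpeg, image/gif", "image/jpeg")
other = variant_key(URL, "image/jpeg;q=0.9, image/gif", "image/jpeg")
no_jpeg = variant_key(URL, "image/gif", "image/jpeg")

print(pushed == explicit == other)  # True: all three requests share one entry
print(pushed == no_jpeg)            # False: this client does not list image/jpeg
```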

Amos

> On Fri, Jun 28, 2013 at 10:36 PM, Roberto Peon <grmocg@gmail.com> wrote:
>> ... because by then you've opened up a stream, and you're back into
>> problematic territory.
>> PUSH_PROMISE exists because we need to indicate to the browser all of the
>> information it needs to make a determination about whether or not it wants
>> the stream (and to short circuit the inlining/push mechanism when it already
>> has what it needs!)
>> -=R
>>
>>
>> On Fri, Jun 28, 2013 at 10:34 PM, Roberto Peon <grmocg@gmail.com> wrote:
>>> Any content negotiation would be an appropriate example. :)
>>>
>>> You don't want to have to wait for the HEADERS frame to indicate to the
>>> client which resource it might already have (it should have the opportunity
>>> to RST_STREAM if it has it in cache, for instance).
>>> -=R
>>>
>>>
>>> On Fri, Jun 28, 2013 at 10:25 PM, James M Snell <jasnell@gmail.com> wrote:
>>>> Have an example handy?
>>>>
>>>> Here's an example that shows that response headers in the PUSH_PROMISE
>>>> would not be necessary... Let's say I send a PUSH_PROMISE with the
>>>> following bits of info...
>>>>
>>>> PUSH_PROMISE
>>>>    :path = /images/f.jpg
>>>>    :method = GET
>>>>    :host = example.org
>>>>    :scheme = http
>>>>    accept = image/jpeg
>>>>    if-match: "my-etag1"
>>>>    cache-control: max-age=1000
>>>>
>>>> These headers are giving me everything I would need to determine if
>>>> there is a matching resource in my local cache. I have the method, I
>>>> have the etag, I have the cache-control parameters, accept... There's
>>>> no need for response headers at this point.
>>>>
>>>> Later, once I start accepting the frames for the pushed content, I
>>>> would get something like...
>>>>
>>>> HEADERS
>>>>    :status = 200
>>>>    content-type: image/jpeg
>>>>    content-length: 123
>>>>    etag: "my-etag1"
>>>>    vary: accept
>>>>    cache-control: public
>>>>
>>>> On the off chance that the PUSH_PROMISE doesn't give me what I need,
>>>> the follow on HEADERS frame will give me the rest.
>>>>
>>>>
>>>>
>>>> On Fri, Jun 28, 2013 at 9:55 PM, Roberto Peon <grmocg@gmail.com> wrote:
>>>>> Depending on how the request might have been constructed, response
>>>>> headers may be necessary to identify the resource in the cache, as
>>>>> compared
>>>>> to the resource specified in the HTML (I'm thinking about vary: stuff).
>>>>>
>>>>> -=R
>>>>>
>>>>>
>>>>> On Fri, Jun 28, 2013 at 9:44 PM, James M Snell <jasnell@gmail.com>
>>>>> wrote:
>>>>>> Let's take a step back and consider what a pushed stream is...
>>>>>>
>>>>>> A pushed stream is essentially an "Implied GET". This means that a
>>>>>> server is going to assume that the client was going to send a GET for
>>>>>> the pushed resource. This also means that the server has to make some
>>>>>> assumptions about the make up of that implied GET.
>>>>>>
>>>>>> Now, consider how HTTP caching works. When a cache receives a request
>>>>>> for a resource, how does it determine whether or not it has a
>>>>>> representation of the resource already available? Does it look at the
>>>>>> request headers or the response headers? Obviously, it looks at the
>>>>>> request headers. It uses the response headers when populating the
>>>>>> cache.
>>>>>>
>>>>>> So, if we look at the pushed resource sent by the server, what we need
>>>>>> is for A) the server to first let us know about the implied GET
>>>>>> request.. which means pushing down a set of request headers then B)
>>>>>> the server to send the actual resource, which means pushing down the
>>>>>> response headers.
>>>>>>
>>>>>> Already in our design for pushed resources, we have the server sending
>>>>>> a PUSH_PROMISE frame that contains a header block, followed by a
>>>>>> HEADERS frame that also contains a headers block. It stands to reason
>>>>>> that the PUSH_PROMISE frame would contain the set of request headers
>>>>>> that the server is assuming for the implied GET. These are delivered
>>>>>> to the client, which uses those to determine whether or not a cached
>>>>>> representation of the resource is already available (just as any cache
>>>>>> would do using the request headers). The server would then send its
>>>>>> response headers in a HEADERS frame, just as it would any response to
>>>>>> any other kind of GET.
>>>>>>
>>>>>> Two examples to show how this naturally fits... First, let's look at a
>>>>>> normal GET request sent by the client to the server...
>>>>>>
>>>>>> Client                 Server
>>>>>> ------                 ------
>>>>>>    |                        |
>>>>>>    | ---------------------> |
>>>>>>    |   HEADERS              |
>>>>>>    |     GET                |
>>>>>>    |     /images/f.jpg      |
>>>>>>    |     If-Match: etag1    |
>>>>>>    |     Accept: image/jpeg |
>>>>>>    |                        |
>>>>>>    | <--------------------- |
>>>>>>    |   HEADERS              |
>>>>>>    |     200                |
>>>>>>    |     Content-Type:      |
>>>>>>    |       image/jpeg       |
>>>>>>    |     Content-Length:    |
>>>>>>    |       123              |
>>>>>>    |                        |
>>>>>>    | <--------------------- |
>>>>>>    |   DATA....DATA....     |
>>>>>>    |                        |
>>>>>>
>>>>>> Now consider the same resource being pushed by the server using
>>>>>> PUSH_PROMISE...
>>>>>>
>>>>>> Client                 Server
>>>>>> ------                 ------
>>>>>>    |                        |
>>>>>>    | <--------------------- |
>>>>>>    |   PUSH_PROMISE         |
>>>>>>    |     GET                |
>>>>>>    |     /images/f.jpg      |
>>>>>>    |     If-Match: etag1    |
>>>>>>    |     Accept: image/jpeg |
>>>>>>    |                        |
>>>>>>    | <--------------------- |
>>>>>>    |   HEADERS              |
>>>>>>    |     200                |
>>>>>>    |     Content-Type:      |
>>>>>>    |       image/jpeg       |
>>>>>>    |     Content-Length:    |
>>>>>>    |       123              |
>>>>>>    |                        |
>>>>>>    | <--------------------- |
>>>>>>    |   DATA....DATA....     |
>>>>>>    |                        |
>>>>>>
>>>>>>
>>>>>> Note that the only difference here is the direction and type of the
>>>>>> first frame. Everything else is identical. The PUSH_PROMISE contains
>>>>>> everything the client needs to determine whether or not it already has
>>>>>> the resource in its local cache (request URI, etag, content-type...).
>>>>>>
>>>>>> There's no need to get any more complicated than this. We already
>>>>>> require two distinct header blocks for every request. We already send
>>>>>> two distinct header blocks for each pushed stream. We already indicate
>>>>>> that a pushed stream is an implied GET. To make it work, we simply
>>>>>> state that the PUSH_PROMISE contains the Request headers that the
>>>>>> server has assumed for the implied GET request, while the HEADERS
>>>>>> frame sent later contains the Response headers. If the request headers
>>>>>> in the PUSH_PROMISE end up not being adequate enough to properly
>>>>>> determine if the resource is already cached, then we treat it as just
>>>>>> another cache miss.
>>>>>>
>>>>>> On Fri, Jun 28, 2013 at 5:21 PM, Martin Thomson
>>>>>> <martin.thomson@gmail.com> wrote:
>>>>>>> https://github.com/http2/http2-spec/issues/153
>>>>>>>
>>>>>>> The current text describes PUSH_PROMISE as having a few request
>>>>>>> headers, plus some response headers, but it's quite vague.
>>>>>>>
>>>>>>> I think that if this is going to be properly workable across a wide
>>>>>>> range of uses with lots of different headers, PUSH_PROMISE needs to
>>>>>>> include two sets of headers: the ones that it overrides from the
>>>>>>> associated request (:path being foremost of those) and the ones that
>>>>>>> it provides as a "preview" of the response (e.g., ETag might allow
>>>>>>> caches to determine if they were interested in the rest of the
>>>>>>> response).
>>>>>>>
>>>