Re: draft-asilvas-http-push-assets-00 comments

Martin Thomson <martin.thomson@gmail.com> Thu, 14 July 2016 00:06 UTC

From: Martin Thomson <martin.thomson@gmail.com>
Date: Thu, 14 Jul 2016 10:01:47 +1000
Message-ID: <CABkgnnVXc04ih+StFhKEv7PYe5ygjRVDonCbSOgws2cnZYtNaw@mail.gmail.com>
To: "Aaron L. Silvas" <asilvas@godaddy.com>
Cc: Alcides Viamontes E <alcidesv@zunzun.se>, HTTP Working Group <ietf-http-wg@w3.org>
Subject: Re: draft-asilvas-http-push-assets-00 comments
Archived-At: <http://www.w3.org/mid/CABkgnnVXc04ih+StFhKEv7PYe5ygjRVDonCbSOgws2cnZYtNaw@mail.gmail.com>

Thanks for putting this (and the previous email) together.  That helps
clarify a great deal.

Since I don't have a lot of time before I travel, I'll be brief.

I'm very concerned about request size and how this scales.

There could be something good in this, but it might be a subset of
what you have defined.  If for no other reason than the one that you
identified: this might be easier to understand and implement.  I don't
see it as an advantage over cache digests, which are designed to be
hands-off from an operational perspective.  My concern is that this
could require some hands-on work from a server operator to get the most
out of it.  In particular, the bucketing you refer to would seem to
require tweaking unless you had a whole lot of very smart heuristics
in your server code.
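
To put rough numbers on the request-size concern, here is a minimal
sketch.  It assumes each entry is a hex MD5 of the URI path followed by
"=etag(...)", joined with ";", which is my reading of the example quoted
below and not necessarily the draft's actual encoding.  The asset paths
and six-character ETags are made up, and sizes are counted before any
header compression.

```python
import hashlib

def push_assets_entry(path, etag):
    # Assumed entry format: <hex md5 of path>=etag(<etag>)
    key = hashlib.md5(path.encode("utf-8")).hexdigest()
    return f"{key}=etag({etag})"

def push_assets_header(assets):
    # assets maps URI path -> ETag; entries are joined with ';'
    return ";".join(push_assets_entry(p, e) for p, e in assets.items())

def header_size(n):
    # Hypothetical site with n push-enabled assets and 6-character ETags
    assets = {f"/static/asset{i}.js": f"{i:06x}" for i in range(n)}
    return len(push_assets_header(assets))

for n in (1, 10, 100, 1000):
    print(n, header_size(n))
```

Under those assumptions each asset costs about 46 bytes, so the header
value grows linearly with the number of push-enabled assets the client
reports.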

On 14 July 2016 at 06:01, Aaron L. Silvas <asilvas@godaddy.com> wrote:
> This draft aims to make HTTP/2 Server Push ready for prime time.
>
>
> Sure, I'll take a stab at a basic example. Excuse the formatting...
>
>
> * Request - 1st visit
>
> GET /
>   Push-Assets: *
> * PUSH_PROMISE - Promises sent by server
> GET /shared.js
> GET /shared.css
> * Server Push
> GET /shared.js
>   Status: 200 OK
>   Push-Asset-Key: /shared.js
>   Push-Asset-Match: *
>   ETag: abc123
> GET /shared.css
>   Status: 200 OK
>   Push-Asset-Key: /shared.css
>   Push-Asset-Match: *
>   ETag: 123abc
>   Cache-Control: public, max-age=123456
> * Response
> GET /
>   Status: 200 OK
>
> (navigate to another page)
>
> * Request - page transition
>
> GET /page2
>   Push-Assets: md5(/shared.js)=etag(abc123);md5(/shared.css)=no-push
> * Server Push
> GET /shared.js
>   Status: 304 Not Modified
>   Push-Asset-Key: /shared.js
>   Push-Asset-Match: *
>   ETag: abc123
> * Response
> GET /page2
>   Status: 200 OK
>
> As you can see, the benefits of giving the server the client's cache state
> can be quite substantial. In the case of "blind" server push, where a server
> doesn't care about client state and just sends everything, every time, you'll
> get a lot of unpredictable waste, as the client will end up sending many
> resets (RST_STREAM) to cancel in-flight server pushes. This is forcing those
> who want to use Server Push to manage state by other means, namely cookies,
> and that's hacky at best.
>
>
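
The flow in the example above can be sketched roughly as follows.  This
is a minimal sketch of the server-side decision only, assuming keys are
hex MD5 hashes of the path and entries are separated by ";" as in the
example.  The function names are hypothetical, and ETags containing ";"
are not handled.

```python
import hashlib

def key_for(path):
    # Assumed key form: hex MD5 of the URI path
    return hashlib.md5(path.encode("utf-8")).hexdigest()

def parse_push_assets(value):
    # Returns None for '*' (client reports no state), else {key: state}
    value = value.strip()
    if value == "*":
        return None
    state = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if entry:
            key, _, val = entry.partition("=")
            state[key] = val
    return state

def plan_pushes(header_value, server_assets):
    # server_assets maps path -> current ETag; result maps path -> status
    client = parse_push_assets(header_value)
    plan = {}
    for path, etag in server_assets.items():
        known = None if client is None else client.get(key_for(path))
        if known == "no-push":
            continue                 # client asked not to receive this one
        if known == f"etag({etag})":
            plan[path] = 304         # client copy is current
        else:
            plan[path] = 200         # missing or stale: push the full body
    return plan

assets = {"/shared.js": "abc123", "/shared.css": "123abc"}
print(plan_pushes("*", assets))
```

With the "*" request everything is pushed with a 200; with the second
request from the example, only /shared.js is pushed, as a 304.
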
> Regarding (potential) tradeoffs with the other major alternative being
> proposed (Cache Digest), I'll copy/paste a response I gave to this question
> in another email:
>
> -------------
>
> Until we can test the two approaches side-by-side in real-world scenarios
> with tons of samples, it's hard (for me) to predict which is "better". But
> here are some tradeoffs:
>
> * Cache Digest uses considerably fewer bytes per cached resource -
> significance of delta unknown
> * Push-Assets is presumably simpler for client & server integrators, as it
> leverages HTTP Headers - potentially broader adoption
> * Push-Assets can be URI "bucketed" to prevent wasted bytes over the wire
> for large and complex websites - potential big benefits for
> multi-app-per-domain websites, CDNs, etc.
> * Push-Assets can be tuned by the app's developers to only enable server push
> for assets that make the most sense (for example, an app may decide that js
> & css should be pushed but images should not) - Cache Digest sends all state
> regardless of whether it is "push-enabled"
> * Push-Assets can work for non-HTML resources - potentially optimizing all
> HTTP traffic, such as APIs, or more real-world scenarios like CSS with
> imports
>
> Ultimately I believe both would be *very* positive solutions, as most
> visitors visit websites with few previously cached resources, and may
> greatly benefit from Server Push. A recent study was posted in this group
> which illustrates the point better than I can:
> https://if-report.shimmercat.com/dirhtml/
>
> -------------
>
>
>
>
> -aaron
>
> ________________________________
> From: Alcides Viamontes E <alcidesv@zunzun.se>
> Sent: Wednesday, July 13, 2016 10:16:24 AM
> To: Aaron L. Silvas
> Cc: Martin Thomson; HTTP Working Group
> Subject: Re: draft-asilvas-http-push-assets-00 comments
>
> Hello Aaron,
>
> I'm also very interested in understanding your proposal better. I have read
> your draft and then have followed the questions asked on this list and now
> your clarifications. However, neither your examples in the draft nor your
> latest email have helped me. Could you please repeat your examples with full
> request/response headers and with annotations about which requests are part
> of a PUSH_PROMISE?
>
> You mention that "browser and server adhere to a strict dependency state
> contract". Can you describe this contract?  Compared with the situation
> today, what are the major differences?
>
> Thanks in advance,
>
> ./Alcides.
>
>
>
> On Wed, Jul 13, 2016 at 6:24 PM, Aaron L. Silvas <asilvas@godaddy.com>
> wrote:
>>
>> Thanks for the feedback, Martin. I agree the document needs work to better
>> clarify things.
>>
>>
>> I'll attempt to address your comments here, in the hope of sparking further
>> conversation on the topic.
>>
>>
>> "Push-Assets" is the only request header; required only if the client
>> wishes to enable the full HTTP/2 Push-Assets flow as outlined in the draft.
>> If the server does not support/understand the header, it is benign. This
>> allows the client to inform the server of its cache state, for push-enabled
>> assets only (unlike the Cache Digest proposal for HTTP/2, which sends everything).
>> This header includes the exact state of each of these resources, as if they
>> were individually requested, and thus supports existing etag and
>> last-modified headers. Not only will the server know what resources the
>> client does and does not have, but it will also know which resources are
>> simply out of date and must still be pushed. The server won't even need to
>> send a 304 (Server Push) response for unmodified resources, as the server
>> knows the state of the client's push-enabled assets, and the client can
>> assume "no change" if Server Push is not performed on the given resources. This
>> effectively means that the server will only ever send what is missing or
>> changed, no more, no less.
>>
>>
>> Example (requests only to keep length of email to a minimum):
>>
>>
>>   GET /page1
>>   Push-Assets: *
>>
>>
>>   GET /page2
>>   Push-Assets: md5(shared-resource1.js)=etag(123456)
>>
>>
>> "Push-Asset-Key" is an optional response header. It allows the server to
>> "name" a resource, allowing it to be renamed at a later time without the
>> client having to refetch it unnecessarily. By default, the "key" of every
>> resource is the URI Path, minus any querystring parameters.
>>
>>
>> "Push-Asset-Key" is also a required PUSH_PROMISE header, which is likely
>> part of the confusion. Because a PUSH_PROMISE is essentially the server
>> issuing a request on the client's behalf, this header field informs the client
>> that this resource should be tracked as a "Push-Asset" (aka push-enabled).
>> The key itself is what uniquely identifies the resource, and will typically
>> be the URI Path of the resource, minus querystring parameters, but in MD5
>> form. The client will only ever provide client cache state of resources that
>> have responded with this header field, as they are "push-enabled". This
>> gives the server control of what state it should or should not track for the
>> purpose of Server Push resources.
>>
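
As I read the two paragraphs above, the default key could be derived
along these lines.  This is only a sketch: the hex encoding of the MD5
and the function name are my assumptions, not something the draft
states.

```python
import hashlib
from urllib.parse import urlsplit

def default_push_asset_key(uri):
    # Keep only the path component: the query string and fragment are dropped
    path = urlsplit(uri).path
    # Assumed "MD5 form": hex digest of the UTF-8 encoded path
    return hashlib.md5(path.encode("utf-8")).hexdigest()

# Two URIs that differ only in their query string share one key
print(default_push_asset_key("/shared.js?v=2") == default_push_asset_key("/shared.js"))
```
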
>>
>> "Push-Asset-Match" is an optional response header. This effectively allows
>> the server to inform the client that a given resource is only used within
>> specific "buckets" of matching URIs. This is especially useful for large or
>> complex domains, such as CDNs, or other multi-app-per-domain scenarios.
>>
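
A sketch of how a client might apply these buckets when deciding which
cached state to report for a request.  The only match value shown in
this thread is "*"; treating other values as shell-style glob patterns
is purely my assumption, as are the keys, patterns and ETags below.

```python
from fnmatch import fnmatchcase

def assets_for_request(request_path, cached):
    # cached maps key -> (match_pattern, etag); keep entries whose
    # bucket pattern matches the path being requested
    return {
        key: etag
        for key, (pattern, etag) in cached.items()
        if fnmatchcase(request_path, pattern)
    }

cached = {
    "shared": ("*", "abc123"),      # used everywhere
    "shop":   ("/shop/*", "s1"),    # only under /shop/
    "blog":   ("/blog/*", "b1"),    # only under /blog/
}
print(assets_for_request("/shop/cart", cached))
```

Only the entries whose bucket matches the request are reported, which
is where the operator has to choose the patterns well.
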
>>
>>
>> I'll continue to collect feedback, and especially suggestions, and update
>> the next draft accordingly. Thanks again for the interest.
>>
>>
>>
>>
>> -aaron
>>
>> ________________________________
>> From: Martin Thomson <martin.thomson@gmail.com>
>> Sent: Tuesday, July 12, 2016 5:52:49 PM
>> To: HTTP Working Group; Aaron L. Silvas
>> Subject: draft-asilvas-http-push-assets-00 comments
>>
>> First, I think that there is an interesting idea hidden in here.  It
>> could be that it's complementary to the more generic digests idea.
>>
>> However, I found it impossible to determine how this document is
>> claiming to achieve its stated goals.  None of the examples include
>> header fields, which would have gone a long way to explaining this.
>> The new header fields don't really say what each is used for.  That
>> leaves me guessing about how this fits together.
>>
>> Here's my best guess, though I have to confess that I can't connect
>> this to what Section 4 says:
>>
>> On request N.  A server provides a new header field with responses
>> that create a secondary identifier for resources.  I'm really guessing
>> here, but I assume that unlike etag, this header field includes a
>> value that is the same for a group of resources.
>>
>> On request >N. Clients include a new header field with requests that
>> controls what is pushed.  If it includes '*', then everything is
>> pushed.  If it includes 'no-push', then nothing is pushed.  If it
>> includes a list of these new push-asset-keys, then anything matching
>> those keys is not pushed.
>>
>> Based on this, I'm fairly certain that I don't understand the
>> proposal, because this design doesn't require both Push-Asset-Key and
>> Push-Asset-Match header fields.  I'm clearly missing something.
>>
>> I did start to look at the code, but without a better overview of what
>> it aims to achieve, I'm afraid that I'm not going to get much from it.
>
>
>
>
> --
> Alcides Viamontes
> www.shimmercat.com