Re: More SPDY Related Questions..

Jonathan Ballard <dzonatas@gmail.com> Sat, 21 July 2012 23:11 UTC

In-Reply-To: <CABP7RbcT31WXTDYp5r6Wqj7isGYY+Eie1YjMepAFLWjigZ9uXg@mail.gmail.com>
References: <CABP7RbdsCKq3f0in5cMCGCGqYFMW5LBf-47pN1HZ3+uO=4UpXA@mail.gmail.com> <CAP+FsNen8pnp0Ph8HnEgudtgbavLM88TtweBacurFoNx2v5o=g@mail.gmail.com> <CABP7RbcT31WXTDYp5r6Wqj7isGYY+Eie1YjMepAFLWjigZ9uXg@mail.gmail.com>
Date: Sat, 21 Jul 2012 16:10:44 -0700
Message-ID: <CAAPAK-5o_m+0fEi7FRMmVzVMkz7n+Y-fVpptp7bS2MGbBTe=-Q@mail.gmail.com>
From: Jonathan Ballard <dzonatas@gmail.com>
To: James M Snell <jasnell@gmail.com>
Cc: Roberto Peon <grmocg@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>

Okay, without digression, let's assume that the push happens when the
cache may have been tainted by a previous authentication, especially if the
origin server does not restrict itself to HTTP/1.1. Under that
assumption, I can list the flow:

(A|B)                     RP                       O
  +                        |                       |
  |                        |                       |
  |==SPDY(1):/index.html==>|                       |
  |                        |=== GET /index.html ==>|
  |                        |                       |
  |                        |<==  200:/index.html ==|
  |<=SPDY(1)-O(nA=9,nB=4)=>|                       |
  |                        |                       |


That flow, untainted, is the same for /foo.css, /images/a.jpg and
/video/a.mpg. That is how I resolved the push you showed.


On Saturday, July 21, 2012, James M Snell wrote:

>
> On Jul 21, 2012 12:41 PM, "Roberto Peon" <grmocg@gmail.com>
> wrote:
> >
> >
> > On Jul 21, 2012 12:10 PM, "James M Snell" <jasnell@gmail.com>
> wrote:
> > >
> > > Continuing my review of the SPDY draft... have a few questions
> relating to SPDY and load balancers / reverse proxy set ups... The intent
> is not to poke holes but to understand what the SPDY authors had in mind
> for these scenarios...
> >
> > Poke away!
> >
> > >
> > > 1. Imagine two client applications (A and B) accessing an Origin (D)
> via a Reverse Proxy (C). When a client accesses /index.html on Origin D,
> the Origin automatically pushes static resources /foo.css, /images/a.jpg
> and /video/a.mpg to the client.
> > >
> > > Basic flow looks something like...
> > >
> > > A                  RP                 O
> > > |                   |                 |
> > > |                   |                 |
> > > |==================>|                 |
> > > | 1)SYN             |                 |
> > > |<==================|                 |
> > > | 2)SYN_ACK         |                 |
> > > |==================>|                 |
> > > | 3)ACK             |                 |
> > > |==================>|                 |
> > > | 4)SYN_STREAM (1)  |                 |
> > > |                   |================>|
> > > |                   | 5) SYN          |
> > > |                   |<================|
> > > |                   | 6) SYN_ACK      |
> > > |                   |================>|
> > > |                   | 7) ACK          |
> > > |                   |================>|
> > > |                   | 8) SYN_STREAM(1)|
> > > |                   |<================|--
> > > |                   | 9) SYN_STREAM(2)| |
> > > |                   |  uni=true       | |
> > > |<==================|                 | |
> > > | 10) SYN_STREAM(2) |                 | |
> > > |  uni=true         |                 | | Content Push
> > > |                   |<================| |
> > > |                   | 11) SYN_REPLY(1)| |
> > > |<==================|                 | |
> > > | 12) SYN_REPLY(1)  |                 | |
> > > |                   |                 | |
> > > |                   |<================| |
> > > |<==================| 13) DATA (2,fin)|--
> > > | 14) DATA (2,fin)  |                 |
> > > |                   |                 |
> > > |                   |                 |
> > >
> > > My question is: what does this picture look like if Clients A and B
> concurrently request /index.html?
> > >
> > > With HTTP/1.1, static resources can be pushed off to CDNs, stored in
> caches, distributed around any number of places in order to improve overall
> performance. Suppose /index.html is cached at the RP. Is the RP expected to
> also cache the pushed content?
> >
> > It could if the pushed content indicated it was cacheable using HTTP/1.1
> caching semantics/headers.
> >
> > > Is the RP expected to keep track of the fact that /foo.css,
> /images/a.jpg and /video/a.mpg were pushed before and push those
> automatically from its own cache when it returns the cached instance of
> /index.html?
> >
> > No, but a smart implementation would, cache headers allowing...
> >
>
> Ok... so what if the cache headers don't allow? e.g. /index.html is
> cacheable, /images/foo.jpg is not... /index.html would be served from the
> cache and images/foo.jpg would never be pushed unless the cache is smart
> enough to know that it has to go grab it and push it to the client.  Not
> saying that's a bad thing, just acknowledging that for push to work, caches
> and proxies would need to become much more aware of the content that's
> being passed around.
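
A minimal sketch of the bookkeeping described above, assuming a reverse proxy that remembers which resources were pushed alongside a page. The names `PushAwareCache`, `record_push`, and `plan_push` are hypothetical and do not come from the SPDY draft; the `cacheable` flag is assumed to have been derived from the response's cache headers by the caller.

```python
from dataclasses import dataclass, field


@dataclass
class CachedResponse:
    body: bytes
    cacheable: bool  # assumption: derived from Cache-Control by the caller


@dataclass
class PushAwareCache:
    entries: dict = field(default_factory=dict)      # url -> CachedResponse
    pushed_with: dict = field(default_factory=dict)  # page url -> pushed urls

    def record_push(self, page_url, pushed_url, response):
        """Remember that pushed_url accompanied page_url; store it only if cacheable."""
        urls = self.pushed_with.setdefault(page_url, [])
        if pushed_url not in urls:
            urls.append(pushed_url)
        if response.cacheable:
            self.entries[pushed_url] = response

    def plan_push(self, page_url):
        """Split associated resources into cache hits and ones to fetch from the origin."""
        from_cache, must_fetch = [], []
        for url in self.pushed_with.get(page_url, []):
            (from_cache if url in self.entries else must_fetch).append(url)
        return from_cache, must_fetch
```

With /index.html served from cache, `plan_push("/index.html")` would tell the proxy which associated resources it can push directly and which (such as a non-cacheable /images/foo.jpg) it would first have to fetch.
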
>
> > > If not, when the caching proxy returns index.html from its cache, A
> and B will be forced to issue GETs for the static resources defeating the
> purpose of pushing those resources in the first place.
> > >
> > > In theory, we could introduce new Link rels in the same spirit as
> http://tools.ietf.org/html/draft-nottingham-linked-cache-inv-03 that tell
> caches when to push cached content... e.g.
> >
> > Yup
> >
> > >
> > > SYN_STREAM
> > >   id=2
> > >   unidirectional=true
> > >   Content-Location: http://example.org/images/foo.jpg
> > >   Content-Type: image/jpeg
> > >   Cache-Control: public
> > >   Link: </index.html>; rel="cache-push-with"
> > >
> > > What does cache validation look like for pushed content? E.g. what
> happens if the cached /index.html is fresh and served from the cache but
> the related pushed content also contained in the cache is stale?
> >
> > Same as any normal cache expiry.
> >
>
> Just want to make sure I'm clear on it... Cache wants to push
> /images/foo.jpg to the client, notices that the cached version is stale, it
> has to send a GET to the origin to revalidate, it gets back the updated
> version then pushes that version back to the client.
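
The sequence just described can be sketched roughly as follows. This is an illustration only: `Entry`, `resource_to_push`, and the `fetch_from_origin` callback are hypothetical names, freshness is reduced to a simple max-age check, and a full refetch stands in for a real conditional GET.

```python
import time
from dataclasses import dataclass


@dataclass
class Entry:
    body: bytes
    stored_at: float  # seconds, on the same clock as `now`
    max_age: float    # freshness lifetime in seconds


def resource_to_push(url, cache, fetch_from_origin, now=None):
    """Return the body to push, refreshing the cache entry first if it is stale.

    `cache` is a plain dict of url -> Entry; `fetch_from_origin(url)` is
    assumed to return a (body, max_age) tuple.
    """
    now = time.time() if now is None else now
    entry = cache.get(url)
    if entry is not None and now - entry.stored_at < entry.max_age:
        return entry.body  # still fresh: push straight from the cache
    # Stale or missing: go back to the origin before pushing.
    body, max_age = fetch_from_origin(url)
    cache[url] = Entry(body, now, max_age)
    return body
```
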
>
> > >
> > > I'm sure I can come up with many more questions, but it would appear
> to me that server push in SPDY is, at least currently, fundamentally
> incompatible with existing intermediate HTTP caches and RPs, which is
> definitely a major concern.
> >
> > Given that the headers on the pushed objects indicate cacheability I'm
> either misunderstanding or missing something :)
> >
> > >
> > > As a side note, however, it does open up the possibility for a new
> type of proxy that can be configured to automatically push static content
> on the Origin's behalf... e.g. A SPDY Proxy that talks to a backend
> HTTP/1.1 server and learns that /images/foo.jpg is always served with
> /index.html so automatically pushes it to the client. Such services would
> be beneficial in general, but the apparent incompatibility with existing
> deployed infrastructure is likely to significantly delay adoption. Unless,
> of course, I'm missing something fundamental :-)
> >
> > Jetty has done this already publicly, btw.
> >
>
> Good to know, thank you. I'd missed that. I'll dig into their
> implementation shortly to see what they've done.
>
> > >
> > > 2. While we're on the subject of Reverse Proxies... the SPDY spec
> currently states:
> > >
> > >    When a SYN_STREAM and HEADERS frame which contains an
> > >    Associated-To-Stream-ID is received, the client must
> > >    not issue GET requests for the resource in the pushed
> > >    stream, and instead wait for the pushed stream to arrive.
> > >
> > >    Question is: Does this restriction apply to intermediaries like
> Reverse Proxies?
> >
> > Yup. RPs are both servers and clients (depending on the side one is on)
> and the respective requirements are still in force.
> >
> > > For instance, suppose the server is currently pushing a rather large
> resource to client A and Client B comes along and sends a GET request for
> that specific resource. Assume that the RP ends up routing both requests to
> the same backend Origin server. A strict reading of the above requirement
> means that the RP is required to block Client B's GET request until the
> push to Client A is completed.
> >
> > Only per connection. The intention and purpose of this requirement is to
> eliminate the potential race whereby the server is pushing an object, but
> the client is requesting it as well (because it doesn't yet know the object
> is already on its way).
> >
>
> Ok, yeah I understand the reasoning behind it. The spec should clarify
> that the restriction is only per connection.
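
One way to picture the per-connection scoping, as a sketch under the assumption that each connection tracks its own in-flight pushes. The `Connection` class and its method names are hypothetical, not taken from the spec.

```python
class Connection:
    """Tracks which URLs are currently being pushed on this connection only."""

    def __init__(self, name):
        self.name = name
        self.pushing = set()  # URLs with a pushed stream still in flight

    def push_started(self, url):
        self.pushing.add(url)

    def push_finished(self, url):
        self.pushing.discard(url)

    def may_request(self, url):
        # The race only exists on the connection carrying the push,
        # so other connections are never blocked by it.
        return url not in self.pushing
```

Here a push of /video/a.mpg on Client A's connection would hold back only A's own GET for that URL; Client B's connection keeps a separate set and is unaffected.
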
>
> Even still, I can see a number of ways where this blocking could
> potentially trip developers up if they don't understand or aren't fully
> aware of how it works. Namely, developers would need to avoid pushing
> relatively large and/or non-cacheable resources via a single URI unless
> those resources are ALWAYS pushed and not intended to be retrieved directly
> by GET.
>
> Quite interestingly... server push opens up the possibility of push-only
> resources that do not allow any direct requests. That is, I could set up a
> server to always push /images/foo.jpg while blocking all methods including
> GET.  Obviously that would have "interesting" cache validation issues....
> I'm not saying this would be a good idea, just that it would be possible.
> hmmm.....
>
> > > Further, the spec is not clear if this restriction only applies for
> requests sent over the same TCP connection. Meaning, a strict reading of
> this requirement means that even if the RP opens a second connection to the
> Origin server, it is still forbidden to forward Client B's GET request
> until Client A's push has been completed.
> >
> > We should clarify that the invariant here exists to prevent the race
> mentioned above and any behavior is allowed so long as that race is
> prevented.
> >
>
> +1
>
> Overall, the spec could stand a lot more clarity with regards to
> intermediaries and caching. There is quite a bit that is implied and after
> a bit of noodling makes sense but it would be great if some of those things
> were spelled out a bit more explicitly.
>
> > -=R
> >
> > >
> > >
> > > - James
> > >
> > >
> >