Re: dont-revalidate Cache-Control header

"Roy T. Fielding" <fielding@gbiv.com> Thu, 16 July 2015 22:50 UTC

From: "Roy T. Fielding" <fielding@gbiv.com>
In-Reply-To: <CAKRe7JHd4v=6DtYY=PL7c8kuV=KcqmG4sRK6L2o+EJi8gYi2rQ@mail.gmail.com>
Date: Thu, 16 Jul 2015 15:46:22 -0700
Cc: Ben Maurer <ben.maurer@gmail.com>, Mark Nottingham <mnot@mnot.net>, Amos Jeffries <squid3@treenet.co.nz>, HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <805C006D-7755-4817-AA29-17995AC9B871@gbiv.com>
References: <CABgOVaLHBb4zcgvO4NUUmAzUjNkocBGYY3atFA9iuYyoLaLQsA@mail.gmail.com> <559F9E90.4020801@treenet.co.nz> <CABgOVaLG6QZyjqk2AGYupShST_u3ty9BpxUcPX+_yMEC1hyHAQ@mail.gmail.com> <961203FE-7E54-410F-923E-71C04914CD2E@mnot.net> <CABgOVaJxntEyT0v4GvWm0Qi9jbUPEnzxJgg4KyQSM1T_gN1mjQ@mail.gmail.com> <16407353-5C34-42E8-81A6-E0027EC3A0D0@mnot.net> <CABgOVa+C48yYp-ZkawY+Ho6pXONa_UfB0MVt_2+d0ejyESu2Pw@mail.gmail.com> <CAKRe7JFKEsUMaG40=yt5p=3hdXUBf-dVGaUV6fcA2N1wBwGkfw@mail.gmail.com> <A71CA75A-B614-4612-8C7F-9687B1204EFE@gbiv.com> <CAKRe7JHd4v=6DtYY=PL7c8kuV=KcqmG4sRK6L2o+EJi8gYi2rQ@mail.gmail.com>
To: Ilya Grigorik <igrigorik@gmail.com>
Subject: Re: dont-revalidate Cache-Control header
Archived-At: <http://www.w3.org/mid/805C006D-7755-4817-AA29-17995AC9B871@gbiv.com>

> On Jul 14, 2015, at 7:53 PM, Ilya Grigorik <igrigorik@gmail.com> wrote:
> On Tue, Jul 14, 2015 at 5:19 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
> On Jul 14, 2015, at 3:33 PM, Ilya Grigorik <igrigorik@gmail.com> wrote:
> 
>> On Tue, Jul 14, 2015 at 3:03 AM, Ben Maurer <ben.maurer@gmail.com> wrote:
>> That said, this doesn't feel like a great thing for us to promote as a web performance best practice. "If you use long cache lifetimes for your static content, the dont-revalidate cache control header will reduce the cost of client reloads" seems like a piece of advice folks might take, as would "Use the <meta> tag 'dont-reload-non-expired-resources' to avoid browsers revalidating your content when the user presses reload". On the other hand "you should find every image, script, stylesheet, etc and set the fetch option on each to say force-cached" feels more tedious and unlikely to be used.
>> 
>> To this point, the HTTP mechanism is something that FEO / optimization proxies can do on your behalf - e.g. rewrite and/or bundle resources, add version fingerprint, append the HTTP header we're discussing here. By comparison, rewriting markup (HTML, CSS, JS) is significantly harder and very expensive. Which is to say.. +1 for HTTP directive over markup.
> 
> No, it would be managed in the CMS along with all of the other decisions that led to a static version. Sane folks don't manage their content in an optimization proxy.
> 
> Most every CDN has an FEO product that performs resource optimization (minification, obfuscation, bundling, fingerprinting + cache extension, and more). PageSpeed modules [1] alone, which I'm most familiar with myself, power many hundreds of thousands of sites. Which is to say, "sane folks" do deploy such tools and with great success.
> 
> [1] https://developers.google.com/speed/pagespeed/module/

Umm, I don't consider mod_pagespeed to be an optimization proxy, but I guess it can be configured that way
in combination with mod_proxy. Managing configuration of CDNs is a common feature for a CMS.

In any case, the place where an output filter like pagespeed should be adding the static indicator is the same
place it is doing the content modification to modify the URL reference by adding a content hash: the HTML
element attributes. It is making a decision to force the reference to have a static representation and we want that
decision to have an impact on the page rendering algorithm of a browser, so the static association belongs with the
reference so that it can be retained regardless of how many other protocols might be used to deliver, cache,
or otherwise distribute that modified content.
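
As a rough illustration of keeping the static association with the reference, the sketch below shows an output filter that rewrites a URL reference to carry a content hash and marks the same element in the same step. The attribute name `static`, the function name, and the hashed file-name layout are all assumptions made for this example; none of them is a defined HTML attribute or actual pagespeed behavior.

```python
import hashlib
import posixpath
import re


def fingerprint_reference(html: str, path: str, body: bytes) -> str:
    """Rewrite src/href references to `path` so the URL carries a content
    hash, and mark the element with a HYPOTHETICAL `static` attribute."""
    # Short content hash derived from the representation being referenced.
    digest = hashlib.sha256(body).hexdigest()[:12]
    # Insert the hash before the file extension, e.g. app.js -> app.<hash>.js
    stem, ext = posixpath.splitext(path)
    hashed = f"{stem}.{digest}{ext}"
    pattern = re.compile(r'(src|href)="' + re.escape(path) + r'"')
    # The marker is added where the URL is modified, so the association
    # travels with the markup rather than with any one transfer protocol.
    return pattern.sub(lambda m: f'{m.group(1)}="{hashed}" static', html)
```

The point of the sketch is only that the decision and the marker are made in one place, by whatever component already rewrites the reference.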

OTOH, if we want to have metadata that indicates a given resource has one and only one representation
for all time, regardless of the use context, that metadata would not belong in Cache-Control either (because
it isn't about controlling a cache -- it is asserting some knowledge about the resource that can be used
by any recipient, regardless of cache behavior). That could be defined as a new header field or as a
relation for Link (where the link could point to the original resource that isn't static).

We would then be left with the question of when does the page rendering process have sufficient
confidence that it has the right representation in order to avoid making a conditional request for a
given static resource.  I think having either some sort of Content-Hash (or similar) [also orthogonal
to cache-control], or properly marking incomplete responses as suggested by HTTP/1, would be
sufficient for browsers to make their own decision to optimize away that request.
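
One way a recipient might make that decision can be sketched as follows. This is a sketch under stated assumptions, not a description of any browser: the `CachedEntry` type, the `is_static` flag, and the `sha-256=<base64>` value format for a Content-Hash-like field are hypothetical, since no such field is defined in this thread.

```python
import base64
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedEntry:
    body: bytes
    is_static: bool              # assumed: the reference or resource was marked static
    content_hash: Optional[str]  # assumed format: "sha-256=<base64 digest>"


def can_skip_revalidation(entry: CachedEntry) -> bool:
    """Return True only when the stored body provably matches the hash
    the sender advertised, so a conditional request adds nothing."""
    if not entry.is_static or entry.content_hash is None:
        return False
    algo, sep, expected = entry.content_hash.partition("=")
    if sep == "" or algo.lower() != "sha-256":
        return False  # unknown or malformed value: fall back to revalidating
    actual = base64.b64encode(hashlib.sha256(entry.body).digest()).decode("ascii")
    # A truncated (incomplete) body yields a different digest and fails here.
    return actual == expected
```

A truncated body fails the comparison, which covers the incomplete-response case without involving Cache-Control at all.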

....Roy