Re: Why should caches and intermediaries ignore If-Match?

"Roy T. Fielding" <fielding@gbiv.com> Sat, 04 March 2017 00:14 UTC

From: "Roy T. Fielding" <fielding@gbiv.com>
In-Reply-To: <CA+3+x5GdrhsaE=X2qDXOaGvrLeR45X3LU8fRO761waxsBLtB3g@mail.gmail.com>
Date: Fri, 03 Mar 2017 16:11:19 -0800
Cc: Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <838CF191-53A4-4CE6-A36F-8D0FED8A3069@gbiv.com>
References: <CA+3+x5FgdfAQ4Nos9VTGe35RiH8Z+3zZiUGH_bKXHz+VO+UAbQ@mail.gmail.com> <aaedcb18-2a19-9b77-95d9-0559e21407c2@measurement-factory.com> <CA+3+x5E_HPycm4axSLtO0jGmjDBS3=kVfhaJzKR+7n7S_yMgkg@mail.gmail.com> <DEF639B3-6A6D-4030-93D0-B7473D2A14F6@mnot.net> <9DA41CBA-673C-416B-A9CF-AD9A108C2440@gbiv.com> <CA+3+x5HZJsJ+903UKx+2O0b2dtc7o48Ks8CGrGfs=E_dXyWx-g@mail.gmail.com> <629EE31B-7235-4EFB-9C4C-CA4010165B2F@gbiv.com> <3963F360-005D-41F9-BF51-EB3EBD9C6F7F@mnot.net> <CA+3+x5GdrhsaE=X2qDXOaGvrLeR45X3LU8fRO761waxsBLtB3g@mail.gmail.com>
To: Tom Bergan <tombergan@chromium.org>
Subject: Re: Why should caches and intermediaries ignore If-Match?
Archived-At: <http://www.w3.org/mid/838CF191-53A4-4CE6-A36F-8D0FED8A3069@gbiv.com>

On Mar 3, 2017, at 3:24 PM, Tom Bergan <tombergan@chromium.org> wrote:
> On Fri, Mar 3, 2017 at 2:58 PM, Mark Nottingham <mnot@mnot.net> wrote:
> > On 4 Mar 2017, at 9:30 am, Roy T. Fielding <fielding@gbiv.com> wrote:
> >
> > On Mar 1, 2017, at 5:49 PM, Tom Bergan <tombergan@chromium.org> wrote:
> >>
> >> Here is the use case:
> >>
> >> We have a content-optimization (compression) proxy sitting between the browser and origin server. Among other things, the proxy can compress videos. When the browser starts playing a video, it makes an initial HTTP request to fetch (part of) the video, then builds an in-memory representation of the video and uses additional HTTP range requests as needed to fetch the rest of the video. For example, range requests are used to implement seeking.
> >>
> >> The challenge is that we now have multiple representations of every video: the original representation (from the origin server) and one or more compressed representations served by the proxy. When the browser makes an initial request for a video, it gets one of these representations. When it makes a subsequent range request, we want to ensure that it receives the *same* representation that it received on the initial request. Otherwise the browser cannot combine the second response with the first response and video playback will fail.
> >>
> >> An additional challenge is that the browser and proxy both have a cache. In theory, we control the entire connection and could add custom code to the browser, proxy, and caches to implement any protocol that we invent. In practice, both caches are intended to be HTTP-compliant caches and we'd rather not add custom hacks for use cases like this if we can avoid it.
> >>
> >> The browser needs to label each range request with the ETag it expects to receive. If-Match originally seemed like the perfect solution: The browser adds `If-Match: ETag` to every range request. If a cache has a copy of the video with a *different* ETag, the cache forwards the request to the next server in the chain rather than returning its cached copy (as would happen if we used If-Range instead of If-Match). Similarly, the proxy knows if the browser is requesting a compressed video or the original video, so it can respond accordingly. However, as discussed previously in this thread, If-Match doesn't work like this.
> >>
> >> Note that I agree it doesn't make sense for a cache to return 412 and we don't need that behavior. The semantics I'm looking for is: "Send me this representation if you have it, otherwise forward to the next server. A 4xx means that this representation is not current in the origin or in any intermediate cache or proxy."
> >>
> >> Hope that makes sense.
> >
> > You have several choices:
> >
> > 1) implement this using transfer encodings because they don't change range offsets;
> >     presumably, these would be added/removed by the protocol handlers before
> >     the caches ever see them.
> 
> Ew.
> 
> Can you expand what you mean by this? I'm not sure I followed.
> 
> In case I wasn't clear, the proxy actually produces a completely different transcode of the original video, possibly in a different container format or codec. The "compressed" video is actually a completely different file than the original video; this is not just compression via Content-Encoding.

If the encoding is reversible (lossless), transfer encoding is a better idea.  Of course,
this has zero chance of being implemented already -- it would be custom code.
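
To illustrate why a transfer coding leaves range offsets alone, here is a minimal
sketch. It assumes a lossless coding (gzip) applied per message after the range has
been selected from the representation; the function names are hypothetical and not
taken from any existing proxy.

```python
import gzip


def send_range(representation: bytes, first: int, last: int) -> bytes:
    """Select bytes first..last (inclusive) of the representation, then
    apply a lossless transfer coding to just that message body."""
    part = representation[first:last + 1]
    return gzip.compress(part)


def receive_range(wire_body: bytes) -> bytes:
    """Remove the transfer coding before anything else (such as a cache)
    looks at the payload."""
    return gzip.decompress(wire_body)


if __name__ == "__main__":
    video = bytes(range(256)) * 4
    decoded = receive_range(send_range(video, 100, 199))
    # The offsets still refer to the original representation.
    print(decoded == video[100:200])
```

Because the coding is stripped by the protocol handler, the cache only ever stores
and indexes bytes of the original representation.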

> > 2) use If-Range and configure your proxy to forward the request when no match;
> >     yes, that's legitimate HTTP (a server is free to ignore partial requests and a proxy
> >     can forward any request it likes).
> 
> Nod.
> 
> This doesn't help with the caches, which return 200 when there is no match on the If-Range etag rather than forwarding the request. If we didn't have any HTTP caches in the middle, we would have already done this :)

You control those HTTP caches, right? Change that behavior.  We are talking about a trivial
configuration change (or a one-line source code change), as opposed to a change to HTTP
semantics which, even if we agreed to it, wouldn't be deployed for another five years.
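
A minimal sketch of that change, assuming a hypothetical cache decision function
(the names and the forward_on_mismatch switch are invented for illustration, and
If-Range is assumed to carry an entity-tag rather than a date):

```python
from typing import Optional


def is_strong(etag: str) -> bool:
    # Weak validators carry a W/ prefix and never satisfy If-Range.
    return not etag.startswith("W/")


def decide_range_request(
    stored_etag: Optional[str],
    if_range: Optional[str],
    forward_on_mismatch: bool = True,  # hypothetical configuration switch
) -> str:
    """Return "serve_partial", "serve_full", or "forward"."""
    if stored_etag is None:
        return "forward"  # nothing cached for this URL
    if if_range is None:
        return "serve_partial"  # unconditional Range request
    if is_strong(if_range) and if_range == stored_etag:
        return "serve_partial"  # same representation: 206 from cache
    # Stock behavior ignores Range here and sends the whole stored
    # response with a 200; the changed behavior forwards instead.
    return "forward" if forward_on_mismatch else "serve_full"
```

With the switch on, a range request whose If-Range names a different representation
goes to the next server instead of being answered from the cached copy.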

OTOH, you can just do the sensible thing and use a different URL for the compressed stream.
Then the proxy can redirect initial (normal) requests to the compressed stream when it already
has the beginning of that stream in cache.
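
As a rough sketch of the separate-URL approach (the path mapping, the cache lookup,
and the choice of a 302 status are all assumptions made for illustration):

```python
from typing import Dict, Optional, Set, Tuple


def redirect_for(
    path: str,
    is_range_request: bool,
    compressed_paths: Dict[str, str],  # original path -> compressed path
    cached_starts: Set[str],           # paths whose beginning is in cache
) -> Optional[Tuple[int, str]]:
    """Return (status, location) to redirect, or None to handle normally."""
    if is_range_request:
        # Range requests already name the URL the browser was given,
        # so they are never redirected.
        return None
    target = compressed_paths.get(path)
    if target is not None and target in cached_starts:
        return (302, target)
    return None
```

Since each stream has its own URL, every later range request identifies the
representation by URL alone and ordinary caches keep the two apart.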

> > 3) use If-Match and deal with the extra round-trip after a 412.
> 
> Why doesn't the logic in #2 apply here as well? Intermediary servers aren't required to 412.

They are required to either not implement it or not perform the method.  Either way, the
response isn't going to be what you want (a 2xx status) because that would change the
semantics of the field. Your use case fits If-Range's purpose, not that of If-Match.
To be clear regarding the subject, the RFC doesn't say caches and intermediaries always
ignore If-Match; it says they may.  Deployed practice will just as often respond with a 412
when an unmatched etag is received in If-Match.
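
For comparison, a minimal sketch of how a server that does implement If-Match might
evaluate it. The function name is hypothetical, and splitting the field value on
commas is a simplification that ignores commas inside quoted entity-tags.

```python
from typing import Optional


def if_match_status(current_etag: Optional[str], if_match: str) -> Optional[int]:
    """Return 412 when the precondition fails, or None when the method
    may proceed."""
    value = if_match.strip()
    if value == "*":
        # "*" matches any current representation of the resource.
        return None if current_etag is not None else 412
    if current_etag is None or current_etag.startswith("W/"):
        # Strong comparison: a weak or missing validator never matches.
        return 412
    candidates = [tag.strip() for tag in value.split(",")]
    return None if current_etag in candidates else 412
```

There is no branch here that yields a different representation with a 2xx status:
a mismatch means the method is not performed.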

....Roy