Re: Why should caches and intermediaries ignore If-Match?

Tom Bergan <tombergan@chromium.org> Sat, 04 March 2017 00:52 UTC

In-Reply-To: <838CF191-53A4-4CE6-A36F-8D0FED8A3069@gbiv.com>
References: <CA+3+x5FgdfAQ4Nos9VTGe35RiH8Z+3zZiUGH_bKXHz+VO+UAbQ@mail.gmail.com> <aaedcb18-2a19-9b77-95d9-0559e21407c2@measurement-factory.com> <CA+3+x5E_HPycm4axSLtO0jGmjDBS3=kVfhaJzKR+7n7S_yMgkg@mail.gmail.com> <DEF639B3-6A6D-4030-93D0-B7473D2A14F6@mnot.net> <9DA41CBA-673C-416B-A9CF-AD9A108C2440@gbiv.com> <CA+3+x5HZJsJ+903UKx+2O0b2dtc7o48Ks8CGrGfs=E_dXyWx-g@mail.gmail.com> <629EE31B-7235-4EFB-9C4C-CA4010165B2F@gbiv.com> <3963F360-005D-41F9-BF51-EB3EBD9C6F7F@mnot.net> <CA+3+x5GdrhsaE=X2qDXOaGvrLeR45X3LU8fRO761waxsBLtB3g@mail.gmail.com> <838CF191-53A4-4CE6-A36F-8D0FED8A3069@gbiv.com>
From: Tom Bergan <tombergan@chromium.org>
Date: Fri, 03 Mar 2017 16:24:10 -0800
X-Gmail-Original-Message-ID: <CA+3+x5FgcQ91a2TFnUVOO8WXX+DxpXJ9YEFH69mGhB8iF6gMMw@mail.gmail.com>
Message-ID: <CA+3+x5FgcQ91a2TFnUVOO8WXX+DxpXJ9YEFH69mGhB8iF6gMMw@mail.gmail.com>
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Subject: Re: Why should caches and intermediaries ignore If-Match?
Archived-At: <http://www.w3.org/mid/CA+3+x5FgcQ91a2TFnUVOO8WXX+DxpXJ9YEFH69mGhB8iF6gMMw@mail.gmail.com>

On Fri, Mar 3, 2017 at 4:11 PM, Roy T. Fielding <fielding@gbiv.com> wrote:

> On Mar 3, 2017, at 3:24 PM, Tom Bergan <tombergan@chromium.org> wrote:
>
> On Fri, Mar 3, 2017 at 2:58 PM, Mark Nottingham <mnot@mnot.net> wrote:
>
>> > On 4 Mar 2017, at 9:30 am, Roy T. Fielding <fielding@gbiv.com> wrote:
>> >
>> > On Mar 1, 2017, at 5:49 PM, Tom Bergan <tombergan@chromium.org> wrote:
>> >>
>> >> Here is the use case:
>> >>
>> >> We have a content-optimization (compression) proxy sitting between the
>> browser and origin server. Among other things, the proxy can compress
>> videos. When the browser starts playing a video, it makes an initial HTTP
>> request to fetch (part of) the video, then builds an in-memory
>> representation of the video and uses additional HTTP range requests as
>> needed to fetch the rest of the video. For example, range requests are used
>> to implement seeking.
>> >>
>> >> The challenge is that we now have multiple representations of every
>> video: the original representation (from the origin server) and one or more
>> compressed representations served by the proxy. When the browser makes an
>> initial request for a video, it gets one of these representations. When it
>> makes a subsequent range request, we want to ensure that it receives the
>> *same* representation that it received on the initial request. Otherwise
>> the browser cannot combine the second response with the first response and
>> video playback will fail.
>> >>
>> >> An additional challenge is that the browser and proxy both have a
>> cache. In theory, we control the entire connection and could add custom
>> code to the browser, proxy, and caches to implement any protocol that we
>> invent. In practice, both caches are intended to be HTTP-compliant caches
>> and we'd rather not add custom hacks for use cases like this if we can
>> avoid it.
>> >>
>> >> The browser needs to label each range request with the ETag it expects
>> to receive. If-Match originally seemed like the perfect solution: The
>> browser adds `If-Match: ETag` to every range request. If a cache has a copy
>> of the video with a *different* ETag, the cache forwards the request to the
>> next server in the chain rather than returning its cached copy (as would
>> happen if we used If-Range instead of If-Match). Similarly, the proxy knows
>> if the browser is requesting a compressed video or the original video, so
>> it can respond accordingly. However, as discussed previously in this
>> thread, If-Match doesn't work like this.
>> >>
>> >> Note that I agree it doesn't make sense for a cache to return 412 and
>> we don't need that behavior. The semantics I'm looking for is: "Send me
>> this representation if you have it, otherwise forward to the next server. A
>> 4xx means that this representation is not current in the origin or in any
>> intermediate cache or proxy."
>> >>
>> >> Hope that makes sense.
>> >
>> > You have several choices:
>> >
>> > 1) implement this using transfer encodings because they don't change
>> range offsets;
>> >     presumably, these would be added/removed by the protocol handlers
>> before
>> >     the caches ever see them.
>>
>> Ew.
>
>
> Can you expand on what you mean by this? I'm not sure I followed.
>
> In case I wasn't clear, the proxy actually produces a completely different
> transcode of the original video, possibly in a different container format
> or codec. The "compressed" video is actually a completely different file
> than the original video; this is not just compression via Content-Encoding.
>
>
> If the encoding is reversible (lossless), transfer encoding is a better
> idea.  Of course,
> this has zero chance of being implemented already -- it would be custom
> code.
>
>> > 2) use If-Range and configure your proxy to forward the request when no
>> match;
>> >     yes, that's legitimate HTTP (a server is free to ignore partial
>> requests and a proxy
>> >     can forward any request it likes).
>>
>> Nod.
>
>
> This doesn't help with the caches, which return 200 when there is no match
> on the If-Range etag rather than forwarding the request. If we didn't have
> any HTTP caches in the middle, we would have already done this :)
>
>
> You control those HTTP caches, right? Change that behavior.  We are
> talking about a trivial
> configuration change (or a one-line source code change), as opposed to a
> change to HTTP
> semantics which, even if we agreed to it, wouldn't be deployed for another
> five years.
>

I think you're understating the maintenance complexity of adding
non-standard behavior to caches that are generally expected to have
standard HTTP behavior. But I understand the reluctance to change HTTP
semantics, so we'll look into a non-standard or custom solution.
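
To make the difference concrete, here is a minimal sketch of the two cache
behaviors under discussion for a range request that carries If-Range. It is
not taken from any real cache: the names `CachedEntry`, `cache_action`, and
`forward_on_mismatch` are hypothetical, and it assumes that If-Range carries
an entity-tag (not a date) and that the stored response is fresh.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedEntry:
    """Hypothetical stored response; etag includes its surrounding quotes."""
    etag: str
    body: bytes


def cache_action(entry: Optional[CachedEntry],
                 if_range: Optional[str],
                 forward_on_mismatch: bool) -> str:
    """Return what the cache does with a range request: 'forward', '200', or '206'."""
    if entry is None:
        # Nothing stored for this URL: the request goes upstream.
        return "forward"
    if if_range is None:
        # No precondition: serve the requested range from the stored copy.
        return "206"
    # If-Range uses strong comparison, so a weak tag (W/"...") never matches.
    strong_match = not if_range.startswith("W/") and if_range == entry.etag
    if strong_match:
        return "206"
    # Mismatch: the standard behavior ignores Range and returns the full
    # stored representation; the suggested non-standard behavior forwards.
    return "forward" if forward_on_mismatch else "200"
```

With a stored entry tagged `"orig"` and a request carrying
`If-Range: "compressed"`, the standard setting yields a 200 with the wrong
representation, while `forward_on_mismatch=True` sends the request to the
next server in the chain.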

> OTOH, you can just do the sensible thing and use a different URL for the
> compressed stream.
> Then the proxy can redirect initial (normal) requests to the compressed
> stream when it already
> has the beginning of that stream in cache.
>

This doesn't solve the problem completely either. The proxy still needs to
distinguish initial requests (where it can return a redirect) from
subsequent requests (where it cannot). Obviously this is easy to signal
with any custom protocol, but I was trying to figure out if there's a
purely standards-compliant solution to this problem. It sounds like no, so
I'll do something custom.

>> > 3) use If-Match and deal with the extra round-trip after a 412.
>>
>> Why doesn't the logic in #2 apply here as well? Intermediary servers
>> aren't required to 412.
>>
>
> They are required to either not implement it or not perform the method.
> Either way, the
> response isn't going to be what you want (a 2xx status) because that would
> change the
> semantics of the field. Your use case fits If-Range's purpose, not that of
> If-Match.
> To be clear regarding the subject, the RFC doesn't say caches and
> intermediaries always
> ignore If-Match; it says they may.  Deployed practice will just as often
> respond with a 412
> when an unmatched etag is received in If-Match.
>
> ....Roy
>
>
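
For reference, a rough sketch of how a server that does evaluate If-Match
decides between performing the method and answering 412. The helper names
`if_match_passes` and `status_for_range_get` are hypothetical, and the header
parsing is simplified: it splits on commas and so assumes no entity-tag
contains a comma.

```python
from typing import Optional


def if_match_passes(header_value: str, current_etag: Optional[str]) -> bool:
    """Evaluate an If-Match header against the current representation's etag."""
    if current_etag is None:
        # No current representation exists, so nothing can match.
        return False
    if header_value.strip() == "*":
        # "*" matches any current representation.
        return True
    candidates = [tag.strip() for tag in header_value.split(",")]
    # If-Match uses strong comparison: weak tags (W/"...") never match.
    return any(not tag.startswith("W/") and tag == current_etag
               for tag in candidates)


def status_for_range_get(header_value: str, current_etag: Optional[str]) -> int:
    """Status for a range GET with If-Match: 206 on a match, otherwise 412."""
    return 206 if if_match_passes(header_value, current_etag) else 412
```

A recipient that chooses to ignore If-Match skips this check entirely, which
is why the field cannot be relied on to select a representation along the
whole chain.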