Re: Why should caches and intermediaries ignore If-Match?

Tom Bergan <tombergan@chromium.org> Thu, 02 March 2017 02:25 UTC

From: Tom Bergan <tombergan@chromium.org>
Date: Wed, 01 Mar 2017 17:49:59 -0800
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: Mark Nottingham <mnot@mnot.net>, Alex Rousskov <rousskov@measurement-factory.com>, HTTP Working Group <ietf-http-wg@w3.org>
Subject: Re: Why should caches and intermediaries ignore If-Match?
Archived-At: <http://www.w3.org/mid/CA+3+x5HZJsJ+903UKx+2O0b2dtc7o48Ks8CGrGfs=E_dXyWx-g@mail.gmail.com>

Here is the use case:

We have a content-optimization (compression) proxy sitting between the
browser and origin server. Among other things, the proxy can compress
videos. When the browser starts playing a video, it makes an initial HTTP
request to fetch (part of) the video, then builds an in-memory
representation of the video and uses additional HTTP range requests as
needed to fetch the rest of the video. For example, range requests are used
to implement seeking.

The challenge is that we now have multiple representations of every video:
the original representation (from the origin server) and one or more
compressed representations served by the proxy. When the browser makes an
initial request for a video, it gets one of these representations. When it
makes a subsequent range request, we want to ensure that it receives the
*same* representation that it received on the initial request. Otherwise
the browser cannot combine the second response with the first response and
video playback will fail.

An additional challenge is that the browser and proxy both have a cache. In
theory, we control the entire connection and could add custom code to the
browser, proxy, and caches to implement any protocol that we invent. In
practice, both caches are intended to be HTTP-compliant caches and we'd
rather not add custom hacks for use cases like this if we can avoid it.

The browser needs to label each range request with the ETag it expects to
receive. If-Match originally seemed like the perfect solution: The browser
adds `If-Match: ETag` to every range request. If a cache has a copy of the
video with a *different* ETag, the cache forwards the request to the next
server in the chain rather than returning its cached copy (as would happen
if we used If-Range instead of If-Match). Similarly, the proxy knows if the
browser is requesting a compressed video or the original video, so it can
respond accordingly. However, as discussed previously in this thread,
If-Match doesn't work like this.
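
For contrast, here is a rough Python sketch of how a cache evaluates If-Range, as I read RFC 7233 Section 3.2: when the validator does not match, the Range header is ignored and the whole cached representation comes back as a 200. The function name and the tuple return shape are made up for illustration, and range validation is left out.

```python
from typing import Tuple


def apply_if_range(
    cached_etag: str,
    cached_body: bytes,
    if_range_etag: str,
    byte_range: Tuple[int, int],
) -> Tuple[int, bytes]:
    """Return (status, body) for a range request carrying If-Range.

    Hypothetical helper: assumes the byte range is valid and inclusive.
    """
    start, end = byte_range
    # If-Range uses strong comparison, so weak validators never match.
    is_strong = not cached_etag.startswith("W/")
    if is_strong and cached_etag == if_range_etag:
        # Validator matches: honor the Range header with a 206.
        return 206, cached_body[start : end + 1]
    # Validator does not match: ignore Range, send the full cached copy.
    return 200, cached_body
```

In the mismatch case the browser gets a complete copy of a *different* representation from the cache, which is exactly what we want to avoid here.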

Note that I agree it doesn't make sense for a cache to return 412 and we
don't need that behavior. The semantics I'm looking for are: "Send me this
representation if you have it, otherwise forward to the next server. A 4xx
means that this representation is not current in the origin or in any
intermediate cache or proxy."
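
As a minimal sketch of those semantics (not how any existing cache behaves), each hop could be modeled like this in Python. All names here (`Stored`, `Response`, `cache_hop`, `origin_hop`) are hypothetical, the 412 is only a placeholder for "some 4xx", and range slicing is omitted to keep the forwarding logic visible.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Stored:
    etag: str
    body: bytes


@dataclass
class Response:
    status: int
    etag: Optional[str] = None
    body: bytes = b""


# A hop takes the ETag the client wants and returns a Response.
NextHop = Callable[[str], Response]


def cache_hop(stored: Optional[Stored], next_hop: NextHop) -> NextHop:
    """Serve the stored copy only if its ETag is the one asked for."""

    def handle(wanted_etag: str) -> Response:
        if stored is not None and stored.etag == wanted_etag:
            return Response(200, stored.etag, stored.body)
        # Different (or no) representation here: forward, never answer 4xx.
        return next_hop(wanted_etag)

    return handle


def origin_hop(current: Stored) -> NextHop:
    """Last hop: a 4xx from here means no hop had the representation."""

    def handle(wanted_etag: str) -> Response:
        if current.etag == wanted_etag:
            return Response(200, current.etag, current.body)
        return Response(412)  # placeholder status, still under discussion

    return handle
```

With a browser cache holding the original video and the proxy holding a compressed one, a request pinned to the compressed ETag skips the browser cache and is answered by the proxy:

```python
origin = origin_hop(Stored('"orig"', b"original"))
proxy = cache_hop(Stored('"small"', b"compressed"), origin)
browser_cache = cache_hop(Stored('"orig"', b"original"), proxy)
response = browser_cache('"small"')  # served by the proxy hop
```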

Hope that makes sense.

On Mon, Feb 27, 2017 at 5:03 PM, Roy T. Fielding <fielding@gbiv.com> wrote:

> > On Feb 26, 2017, at 3:49 PM, Mark Nottingham <mnot@mnot.net> wrote:
> >
> > I think the best way to characterise the situation currently is that
> > HTTP doesn't define any requirements for If-Match on non-origin servers;
> > the only requirements in 7232 Section 3.1 apply to origin servers.
> >
> > AFAIK current intermediaries ignore If-Match, so if you wanted to define
> > some guidelines here, they'd need to be completely optional. E.g., "An
> > intermediary MAY process If-Match based upon the contents of its cache,
> > replying with 4xx when..." (note that that's just rapid hand-waving, not
> > suggested spec text).
> >
> > If we did that, we'd have a header whose handling by origin servers was
> > mandatory for some methods, and handling by intermediary servers was
> > optional for other methods. Not sure how much that would confuse people,
> > but properly spec'd, it'd probably be OK.
> >
> > We'd also have to have a discussion about whether 412 was the right
> > status code.
> >
> > Roy, any thoughts?
> >
> > Tom, can you say any more about your use case?
>
> My thoughts would probably depend on the use case.
> Note that the HTTP spec is only defining rules for communication
> between independent components.  Although the internal architecture of
> a user agent might include something like an HTTP cache, HTTP's rules
> do not limit communication between the UA and its own internal cache.
> As far as HTTP is concerned, they are both part of the user agent.
>
> Thus, the sentence in 3.1:
>
>    It can also be used with safe
>    methods to abort a request if the selected representation does not
>    match one already stored (or partially stored) from a prior request.
>
> is referring to one already stored on the user agent from a prior
> request by that user agent.
>
> Originally, If-Match was defined to be answerable by intermediaries
> for GET/HEAD requests.  However, 412 was considered by the WG to be an
> undesirable response in those cases, so If-Range was created to
> replace that function. My guess is that's the use case here. OTOH,
> a 412 might be preferred for safe methods other than GET and HEAD.
>
> AFAICR, limiting If-Match requirements to origin servers in RFC7232
> was due to lack of implementation by clients (aside from the unsafe
> methods) and a desire for semantic consistency for the field.
>
> For unsafe methods, the client's field value is referring to the
> current selected representation on the origin server, which is something
> that can only be tested by the origin server. Having a special-case for
> safe methods meant that both the meaning of the field changed per
> method and the need to implement it changed per method, which is quite
> a bit of complexity for a feature that nobody ever used.
>
> BTW, Apache httpd implements If-Match in the default resource handler
> and anywhere that calls ap_meets_conditions().  That will result in a
> 412 response to an otherwise successful request if the etag given
> doesn't match the selected representation, regardless of the method.
> [I haven't tested it to see if that gets called by default when the
> server is installed as an intermediary.]
>
> Cheers,
>
> ....Roy