Re: Submitted new I-D: Cache Digests for HTTP/2

Alex Rousskov <rousskov@measurement-factory.com> Sat, 09 January 2016 07:32 UTC

References: <CANatvzyLsrbY4d1Vnq3tSSvt_Tf44sYx0gM-dAWw4d97pz3Mgw@mail.gmail.com> <56900101.1050506@measurement-factory.com> <CANatvzzywypKYN_T0mxYNzFs+AniwUt_gWV6WXEJ4oiuYWb1OQ@mail.gmail.com>
Cc: Kazuho Oku <kazuhooku@gmail.com>
From: Alex Rousskov <rousskov@measurement-factory.com>
To: ietf-http-wg@w3.org
Message-ID: <5690B662.4070006@measurement-factory.com>
Date: Sat, 09 Jan 2016 00:27:30 -0700
In-Reply-To: <CANatvzzywypKYN_T0mxYNzFs+AniwUt_gWV6WXEJ4oiuYWb1OQ@mail.gmail.com>
Subject: Re: Submitted new I-D: Cache Digests for HTTP/2
Archived-At: <http://www.w3.org/mid/5690B662.4070006@measurement-factory.com>

On 01/08/2016 11:27 PM, Kazuho Oku wrote:

> If we are to generalize the proposal to support other purposes such as
> exchanging cache states between proxies, I think we should also
> consider of defining a way for sending a digest divided into multiple
> HTTP/2 frames in case the size of the digest exceeds 16KB, in addition
> to providing space to define which encoding is being used.

This is an important caveat that I had missed! Squid Cache Digests are
often many megabytes in size... Perhaps the Draft should be renamed to
"Small Cache Digests for HTTP/2" to emphasize that the proposed
mechanism is not applicable to large caches?
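
For a rough sense of scale (the object count and per-entry cost below
are illustrative assumptions on my part, not numbers from the Draft),
a Golomb-coded set costs on the order of ten bits per cached object,
and the HTTP/2 default SETTINGS_MAX_FRAME_SIZE is 16,384 octets:

    # Back-of-the-envelope: can a digest for a large shared cache fit
    # in one default-size HTTP/2 frame?  Numbers are illustrative.

    DEFAULT_MAX_FRAME_SIZE = 16_384   # HTTP/2 default (octets)

    cached_objects = 5_000_000        # a plausible count for a busy Squid
    bits_per_entry = 10               # rough Golomb-coded set cost at
                                      # a ~1/256 false-positive rate

    digest_bytes = cached_objects * bits_per_entry // 8
    print("digest ~= %.1f MB" % (digest_bytes / 1_000_000.0))   # ~6.2 MB
    frames = -(-digest_bytes // DEFAULT_MAX_FRAME_SIZE)         # ceiling
    print("frames needed ~= %d" % frames)                       # ~382 frames

So a digest for such a cache would span hundreds of frames rather than
fitting into a single one.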


> Or if there is no immediate demand to use an encoding other than
> Golomb-coded sets for sending a small-sized digest, then we can add a
> sentence stating that:
> 
> * sender of a CACHE_DIGEST frame must set its flags to zero
> * receiver of the frame must ignore if its flags are not set to zero
> 
> , and if such demand arises, define new flags to extend the semantics.

It feels wrong to use frame flags to specify digest _encoding_, but
perhaps that is appropriate in the HTTP/2 context.


> Also, Golomb-coded sets
> will be the only practical choice, the size of the digest will become
> significantly larger if Bloom filter was chosen (in case false
> positive rate is set to 1/256, it will be about 8x as large).

I would not limit the possibilities to Bloom filters and Golomb-coded
sets. For example, I can imagine a client talking to a server with a
small set of *known-a-priori* objects and using a small 1:1 bitmap to
reliably represent the current client cache digest.

You only need to "waste" an octet to open up support for other digest
formats without changing the overall semantics of the "small cache
digest" feature...


>>>    servers ought not
>>>    expect frequent updates; instead, if they wish to continue to utilise
>>>    the digest, they will need update it with responses sent to that
>>>    client on the connection.

>> Perhaps I am missing some important HTTP/2 caveats here, but how would
>> an origin server identify "that client" when the "connection" is coming
>> from a proxy and multiplexes responses to many user agents?

> Proxies understanding the frame can simply transfer it to the upstream
> server

Yes, but how would an origin server identify "that client" when the
"connection" is coming from a CACHE_DIGEST-aware proxy and multiplexes
responses to many user agents served by that proxy? AFAICT, the server
cannot know whether the responses sent "on the connection" are going to
be cached by the proxy (and, hence, should not be pushed again) or are
going to be forwarded to the user agent without proxy caching (and,
hence, should be pushed again in case other user agents need them).

IIRC, from a terminology point of view, the proxy is the "client" in
this context, so there is no problem with the current Draft wording if
that is what you meant. There may be a problem if, by "that client",
you meant "that user agent" instead.

Please note that I am _not_ saying that there is a protocol bug here. I
am just noting that it is not clear what should happen when proxies
multiplex streams from different user agents to the same origin server,
and whether there are some specific strategies that caching and
non-caching proxies should deploy to maximize the savings. There seem
to be at least three cases to consider: CACHE_DIGEST-unaware proxies,
aware caching proxies, and aware non-caching proxies.


Thank you,

Alex.