Re: New Version Notification for draft-vkrasnov-h2-compression-dictionaries-01.txt

Vlad Krasnov <vlad@cloudflare.com> Thu, 03 November 2016 05:48 UTC

From: Vlad Krasnov <vlad@cloudflare.com>
Message-Id: <568945A1-75B6-4E9D-9021-38479FC55580@cloudflare.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_854E3C41-61D2-444D-BD36-4AFAFB0BA991"
Mime-Version: 1.0 (Mac OS X Mail 10.1 \(3251\))
Date: Wed, 02 Nov 2016 22:43:57 -0700
In-Reply-To: <CACweHND+E7D0oKR+_2sKVOqrAwx_hQW9Z=MAmDFGfbqEzR4xGQ@mail.gmail.com>
Cc: Martin Thomson <martin.thomson@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
To: Matthew Kerwin <matthew@kerwin.net.au>
References: <147793576451.32369.14134057573457350871.idtracker@ietfa.amsl.com> <3669167D-26AC-4B78-8175-99B0028B6891@cloudflare.com> <CABkgnnXqHP6RNpHBcFStO5TWz8Sq6Uqs7KMWFof88RjxhoW-Qg@mail.gmail.com> <06396a0d-a0c1-19fc-85d5-6ddfb9bcf39f@gmx.de> <CABkgnnWFds=rYHc-ufCynXg701ekQ6MJTrbXXZrV0ozRod6HzA@mail.gmail.com> <D8E74F06-A6CC-4EA9-9D7C-EFD043F72624@cloudflare.com> <CANatvzzZOvPWrdQqNfV4VSiZ4cb2zt36f1-mKTrxTS8kW6eSuw@mail.gmail.com> <CACweHND+E7D0oKR+_2sKVOqrAwx_hQW9Z=MAmDFGfbqEzR4xGQ@mail.gmail.com>
Subject: Re: New Version Notification for draft-vkrasnov-h2-compression-dictionaries-01.txt
Archived-At: <http://www.w3.org/mid/568945A1-75B6-4E9D-9021-38479FC55580@cloudflare.com>

> On 2 Nov 2016, at 19:55, Matthew Kerwin <matthew@kerwin.net.au> wrote:
> 
> Just chiming in without necessarily attaching to a particular thread of discussion: I'm quite probably being thick here, but isn't there a problem (of the abstraction/encapsulation flavour) with making a content-encoding dependent on values sent at the transport layer? I think I'm just reiterating what Martin was saying, but in a more vague and incoherent way.
> 
> If we're discussing compression parameters/algorithms/dictionaries/etc. at the transport layer, shouldn't the entirety of the compression happen at the transport layer? Thus making it like HTTP/2's new version of TE.
> 
> And if so, isn't transport layer compression a Bad Thing™? Because – thanks to the wonder of abstraction – the transport machinery doesn't (necessarily) know the provenance of the bytes it's compressing (thus potentially allowing sensitive and attacker-controlled data to be compressed in the same context – i.e. BREACH.)

Indeed, dumb compression at the transport layer is not my aim here:
A) It is indeed less safe.
B) The compression benefits are much smaller when the protocol is unaware of the type of data being transported.

Again, I am looking at this from the PoV of the nginx architecture that we use. There is no clear distinction between the layers there, and introspection into the application level is easy to do. From what I have seen in Apache, it is not that difficult there either.

Certainly Server Push is a form of application/protocol level fusion.

Another approach could be client hints. Since we already have client hints in the form of priorities, we cannot deny that HTTP/2 is somewhat connected to the application level too.

> So we bounce it up the stack to the application, which has a much better chance of knowing who authored what bytes. And thus we end up back at content-encoding.

Doing it at the application level is not as good either. Because streams can get canceled and reprioritized, you are in danger of fatal failures (such as deadlocks) if you try to control and optimize the process in the application.

> 
> If it's tied to content-encoding, it should be *entirely* contained in the semantic layer – headers and payload entities. Isn't that what SDCH is?

This is indeed not unlike SDCH and “quasi dictionaries”, except that the dictionary is defined by a stream rather than by a separate URL. In fact it can be used with SDCH just as well.
My proposal tries to be as algorithm agnostic as possible.
However, brotli compresses much better in that case. In fact, from what I have seen, brotli+“quasi” beats sdch+“quasi”+brotli (but maybe sdch+“quasi”+brotli+“quasi” will do even better?).
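
To make the "dictionary defined by a stream" idea concrete, here is a rough sketch. It is not the mechanism from the draft: it uses deflate preset dictionaries from Python's zlib module as a stand-in for brotli, and the page() helper and both responses are made up for illustration. The body of an earlier response is simply handed to the compressor as the dictionary for a later, similar response.

```python
import zlib


def page(title: bytes, body: bytes) -> bytes:
    """Hypothetical HTML template shared by two responses on one connection."""
    return (
        b'<!DOCTYPE html><html><head><meta charset="utf-8">'
        b'<link rel="stylesheet" href="/static/site.css">'
        b'<script src="/static/app.js" defer></script>'
        b"<title>" + title + b"</title></head><body>"
        b'<nav><a href="/">Home</a><a href="/products">Products</a>'
        b'<a href="/contact">Contact</a></nav>'
        b"<main><h1>" + title + b"</h1><p>" + body + b"</p></main>"
        b"<footer>Example Corp, all rights reserved.</footer></body></html>"
    )


def compress(data: bytes, dictionary: bytes = b"") -> bytes:
    """Deflate `data`, optionally priming the window with `dictionary`."""
    if dictionary:
        c = zlib.compressobj(level=9, zdict=dictionary)
    else:
        c = zlib.compressobj(level=9)
    return c.compress(data) + c.flush()


def decompress(blob: bytes, dictionary: bytes = b"") -> bytes:
    """Inverse of compress(); the receiver must hold the same dictionary."""
    d = zlib.decompressobj(zdict=dictionary) if dictionary else zlib.decompressobj()
    return d.decompress(blob) + d.flush()


# Assumption: the earlier stream was fully received, so both peers hold its bytes.
earlier_response = page(b"Products", b"Our full catalogue of widgets.")
later_response = page(b"Contact", b"Write to us at the address below.")

plain = compress(later_response)
with_dict = compress(later_response, dictionary=earlier_response)

print(len(later_response), len(plain), len(with_dict))
```

The receiver can only decode with_dict if it still holds exactly the same earlier_response bytes, which is why the choice of dictionary has to be coordinated between the peers rather than left to either side alone.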

The point is you just can’t get that level of control purely at the application level or the protocol level. We should try and find a middle ground.

> 
> If it's pushed down to the transport layer, isn't it just an even less safe version of draft-kerwin-http2-encoded-data? (I said no shared compression context between different frames, this is about sharing contexts between completely different streams!)

Again, blindly compressing everything at the transport layer is not my suggestion. However, even that is OK for the majority of websites. I for one am not aware of a major effort to mitigate BREACH.
However, the methods used to mitigate BREACH are valid for this proposal as well. In fact, the greatest danger here is making BREACH even faster, but isn’t the point moot when you can execute it in 30 seconds already?

> 
> I'm not entirely sure what new thing this particular proposal brings to the table.

Improved compression? 

Cheers,
Vlad