Re: Dictionary Compression for HTTP (at Facebook)

Mark Nottingham <mnot@mnot.net> Thu, 23 August 2018 06:16 UTC

From: Mark Nottingham <mnot@mnot.net>
In-Reply-To: <CAPapA7RLncAsHH5pr5RJSYjvPiNk8JvgBJ8T-tKebnC1C5ptHw@mail.gmail.com>
Date: Thu, 23 Aug 2018 16:13:11 +1000
Cc: Felix Handte <felixh@fb.com>, Charles McCathie-Neville <chaals@yandex-team.ru>, Evgenii Kliuchnikov <eustas@google.com>, Vlad Krasnov <vlad@cloudflare.com>, Nick Terrell <terrelln@fb.com>, Yann Collet <cyan@fb.com>, HTTP Working Group <ietf-http-wg@w3.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <ED51E194-503A-4339-B564-A6543F42D0A1@mnot.net>
References: <18eb0343-640c-8b95-1cc2-273bc72ec134@fb.com> <CAPapA7RLncAsHH5pr5RJSYjvPiNk8JvgBJ8T-tKebnC1C5ptHw@mail.gmail.com>
To: Jyrki Alakuijala <jyrki@google.com>
Subject: Re: Dictionary Compression for HTTP (at Facebook)
Archived-At: <https://www.w3.org/mid/ED51E194-503A-4339-B564-A6543F42D0A1@mnot.net>

Hello Felix and Jyrki,

Shared dictionary compression in various forms has been discussed in the Working Group for a fair amount of time. What's blocking progress is an agreed-to description of its security properties, issues therein, and acceptable mitigations for them.

Some people have expressed interest in attempting to document that in an Internet-Draft, but to date we haven't seen any progress publicly. If you'd like, I can try to put you in touch with them to see if they need help, etc.

Regards,


P.S. Felix, your e-mail didn't make it to me, or into the archives. Are you subscribed to the list?



> On 22 Aug 2018, at 6:23 pm, Jyrki Alakuijala <jyrki@google.com> wrote:
> 
> Fully agree! Sharing dictionaries is an amazing opportunity to make the internet faster and cheaper. SDCH never exploited that opportunity fully, and it is great that we are all giving this another go.
> 
> I presume Zstd dictionaries are simple: 
> 	• fill lz77 buffer with bytes
> ... and I know that Shared brotli dictionaries are relatively complex: 
> 	• fill lz77 buffer with bytes, or, 
> 	• add special meaning for unique (distance, length) pairs (2 % more density than filling lz77 buffer with bytes), or,
> 	• perform a binary diff on patch data (makes bsdiff obsolete by compressing 5–10 % more than bsdiff+brotli, can be 95+ % denser than a traditional lz77 dictionary for patching).
> 	• when distance overflows for unique (distance, length) pairs, a customized word transform is applied (gives 2 % more density)
> 	• context modeling: dictionary ordering of interpretation of (distance, length) pairs may depend on the last two bytes (unknown gains, I anticipate 1 %)
> For data like the Google search result pages we can see a reduction of ~50 % in data when we go from "br" Brotli to Shared Brotli, and naturally very significant latency wins. Having binary diffing within shared dictionary infrastructure can allow patches for web packaging, Android apps, fonts, or other complex structured data to be efficiently compressed with shared dictionary by just using the previous version of that data as a dictionary.
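
[Editorial illustration, not part of the original thread.] The simplest dictionary mode listed above, filling the LZ77 buffer with known bytes, can be sketched with the preset-dictionary support in Python's standard zlib module. This uses DEFLATE, not Zstd or Shared Brotli, so it only illustrates the general idea; the helper names and the sample payloads are hypothetical. The previous version of a document is used as the dictionary for the new version, as suggested for patching.

```python
import zlib


def compress_with_dictionary(data: bytes, dictionary: bytes) -> bytes:
    """Compress data with the LZ77 window primed by a shared dictionary."""
    # zdict preloads the DEFLATE window (at most 32 KiB), so matches can
    # point back into the dictionary from the very first byte of data.
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(data) + comp.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress a blob; the receiver must hold the same dictionary."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


# Hypothetical payloads: two versions of a small JSON document.
previous_version = (
    b'{"user": "alice", "theme": "dark", "notifications": true, '
    b'"language": "en", "timezone": "Europe/Zurich", "beta": false}'
)
new_version = (
    b'{"user": "alice", "theme": "light", "notifications": true, '
    b'"language": "en", "timezone": "Europe/Zurich", "beta": true}'
)

plain = zlib.compress(new_version, 9)
shared = compress_with_dictionary(new_version, previous_version)

# Round trip only works when both sides use the identical dictionary.
restored = decompress_with_dictionary(shared, previous_version)

print(len(new_version), len(plain), len(shared))
```

Because nearly all of the new version already appears in the dictionary, the dictionary-primed output should be much smaller than the plain one. Both ends have to agree on the exact dictionary bytes, which is the kind of negotiation the proposals in this thread aim to standardize.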
> 
> 
> 
> On Wed, Aug 22, 2018 at 2:30 AM, Felix Handte <felixh@fb.com> wrote:
> Hello all,
> 
> Quick introduction: I'm an engineer on the Data Compression team at Facebook. While we partner with other teams here to apply compression internally at Facebook, we primarily maintain the open source Zstandard[1][2] and LZ4[3] libraries.
> 
> We've seen enormous success leveraging dictionary-based compression with Zstd internally, and I'm starting to look at how we can apply the same toolkit/approach to compressing our public web traffic. As we're thinking about how to do this, both as a significant origin of HTTP traffic and as maintainers of open source compression tools, we want very much to pursue a course of action that is constructive for the broader community.
> 
> There are, by my count, three competing proposals for how this sort of thing might work (SDCH[4], Compression Dictionaries for HTTP/2[5], and Shared Brotli Dictionaries[6]+[7]). With no public consensus around how to do this well, it's tempting for us to simply build on the tooling we've built internally, and apply it to our traffic between our webservers and our mobile apps (where we control both ends of the connection and can do anything we want). However, it would be pretty tragic for Facebook to gin up its own spec and implementation in this space, roll it out, and end up with something mutually incompatible with anyone else's efforts, further fragmenting the community and driving consensus further off.
> 
> So I wanted to first resurface this topic with you all. In short, is there anyone still interested in pursuing a standard covering these topics? If so, I would like to work with you and help build something in this space that can actually see adoption.
> 
> Thanks,
> Felix
> 
> [1] https://github.com/facebook/zstd
> [2] https://tools.ietf.org/html/draft-kucherawy-dispatch-zstd-03
> [3] https://github.com/lz4/lz4
> [4] https://tools.ietf.org/html/draft-lee-sdch-spec-00
> [5] https://tools.ietf.org/html/draft-vkrasnov-h2-compression-dictionaries-03
> [6] https://tools.ietf.org/html/draft-vandevenne-shared-brotli-format-01
> [7] https://github.com/google/brotli/wiki/Fetch-Specification
> 
> 

--
Mark Nottingham   https://www.mnot.net/