Inter-Stream Compression and Delta Encodings

Patrick McManus <mcmanus@ducksong.com> Tue, 25 April 2017 01:11 UTC

From: Patrick McManus <mcmanus@ducksong.com>
Date: Mon, 24 Apr 2017 21:07:44 -0400
To: HTTP Working Group <ietf-http-wg@w3.org>
Subject: Inter-Stream Compression and Delta Encodings
Archived-At: <http://www.w3.org/mid/CAOdDvNpCFdXjx3O3FbpVVhcXcOxhEueePjv+DgpaiTyS=r6CDg@mail.gmail.com>

Hi WG!

Right before we met in Chicago, Vlad updated his compression draft. I'll
highlight it here in case you lost it in the shuffle of the meeting (as I
did originally):
https://datatracker.ietf.org/doc/draft-vkrasnov-h2-compression-dictionaries/?include_text=1
He presented on this in Seoul, so it is in the meetecho video archive if
you missed it:
http://oldrecs.conf.meetecho.com/Playout/watch.jsp?recording=IETF97_HTTPBIS_II&chapter=chapter_1

I would like to start a discussion on whether the working group has an
interest in adopting a work item in this general area (which may or may
not be this draft depending on consensus). The general topic is recurring,
and I know of interest from at least three parties - though other than Vlad
(who has submitted a draft!) they should speak for themselves in this
thread. But the chairs would need to hear a wider array of viewpoints
before asking the group to take on the burden.

To summarize the top-level pros and cons:

Pro: saves lots of bytes and serialization time (Vlad had data on that,
which you can find in the IETF 97 meeting materials). It arguably also
fixes a regression from h1: h2 discourages inlining, which results in less
efficient content-encodings.
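
To make the "saves lots of bytes" point concrete, here is a minimal sketch
of dictionary-primed compression. It is not the encoding from Vlad's draft:
it uses plain zlib with a preset dictionary as a stand-in, and the two
payloads are hypothetical responses invented for illustration. The only
thing it shows is that a response compressed against an earlier, similar
response comes out much smaller than the same response compressed alone.

```python
import zlib

# Hypothetical payloads: two responses from one origin sharing most markup.
FIRST_RESPONSE = (
    b'<html><head><link rel="stylesheet" href="/static/site.css"></head>'
    b'<body><nav class="menu"><a href="/home">Home</a>'
    b'<a href="/about">About</a></nav><p>Page one</p></body></html>'
)
SECOND_RESPONSE = FIRST_RESPONSE.replace(b"Page one", b"Page two")


def compress_alone(data: bytes) -> bytes:
    """Compress one response with no knowledge of any other stream."""
    return zlib.compress(data, 9)


def compress_with_dictionary(data: bytes, dictionary: bytes) -> bytes:
    """Compress a response using an earlier response as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """The receiver needs the same dictionary bytes to decode."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


alone = compress_alone(SECOND_RESPONSE)
primed = compress_with_dictionary(SECOND_RESPONSE, FIRST_RESPONSE)
print("raw:", len(SECOND_RESPONSE), "alone:", len(alone), "primed:", len(primed))
```

Small, separately fetched resources benefit the most here, which is the
inlining regression mentioned above: each one is too short to compress well
by itself.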

Con: mixing compression and encryption is a scary business, a la CRIME.
Vlad's draft attempts to address this by creating different sets of
compression contexts and letting the clients determine the sets and the
servers determine whether or not they will compress individual resources
within those sets.
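
As a rough illustration of the CRIME-style concern, and of why separate
compression contexts help, consider the sketch below. It assumes plain zlib
as a stand-in for whatever encoding is actually used, and the secret and
guesses are made-up values. When attacker-influenced bytes and a secret
share one context, the compressed size of the secret depends on how well
the attacker guessed; when they are in separate contexts, it does not.

```python
import zlib

# Hypothetical secret and attacker guesses, for illustration only.
SECRET = b"session=7f3a9c51e0b2d48a6c19f07e35ab8d42"
RIGHT_GUESS = SECRET
WRONG_GUESS = b"session=" + b"Z" * 32


def secret_size_shared(secret: bytes, attacker_data: bytes) -> int:
    """Bytes emitted for the secret when it shares a context with attacker data."""
    ctx = zlib.compressobj()
    ctx.compress(attacker_data)
    ctx.flush(zlib.Z_SYNC_FLUSH)  # byte-align output, keep the history window
    return len(ctx.compress(secret) + ctx.flush(zlib.Z_SYNC_FLUSH))


def secret_size_isolated(secret: bytes, attacker_data: bytes) -> int:
    """Bytes emitted for the secret when attacker data lives in another context."""
    attacker_ctx = zlib.compressobj()
    attacker_ctx.compress(attacker_data)
    attacker_ctx.flush(zlib.Z_SYNC_FLUSH)
    secret_ctx = zlib.compressobj()  # fresh history, nothing shared
    return len(secret_ctx.compress(secret) + secret_ctx.flush(zlib.Z_SYNC_FLUSH))


for name, guess in (("right", RIGHT_GUESS), ("wrong", WRONG_GUESS)):
    print(name,
          "shared:", secret_size_shared(SECRET, guess),
          "isolated:", secret_size_isolated(SECRET, guess))
```

An observer who sees only ciphertext lengths can tell the right guess from
the wrong one in the shared case. Isolation removes that signal only if the
contexts are partitioned correctly, which is the part the question below is
really about.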

Does the working group think that is a mechanism that can be effectively
used in a safe way? Thanks for your comments.

-Patrick

-- here are a few drive-by review comments from a first take on the text --

I guess the settings go c->s with encodings flowing s->c. We often manage
to make these things symmetrical; can this operate in the other direction
too? We've long lamented that content-encoding: gzip does not work well
for h1 POST, for example.
what's the interaction with push?

set_compression_context default should probably be 254 to be conservative
rather than 0.

set_dictionary can't be "set on any stream" - subject to opt in

 - "If not enough DATA was
   sent, the Dictionary for the given ID is considered uninitialized"
   vs
   "If Size is greater than the length of the
   transmitted data, then all of the data will be used."

having a definition of a context very early in the document would help,
maybe "a context is a non-overlapping set of response streams and
dictionaries"

h1 bindings in a document with h2 in its title are weird. I would just
get rid of the h1 definitions completely - h1 has a much richer tradition
of transaction independence on a persistent connection, and things that have
tried to bypass that (e.g. connect auth) have a checkered history.

"In addition when binary data is expected on the stream, the clinet SHOULD
hint to the server by sending a SET_COMPRESSION_CONTEXT with the special
value of 255." This is really better advice to the server than anything a
client should be guessing. A request can have N possible response
representations, each potentially with its own MIME type.

"If a USE_DICTIONARY frame arrives for an uninitialized dictionary, this is
considered as stream error of type COMPRESSION_ERROR." Given the cross
stream nature of this - that's probably a protocol error. Any decoding
error is probably a protocol error too.

what does it mean for the extension to be disabled by default? Is that
something different than the SETTINGS frame, or are you informing how the
server config switches need to work?

probably not enough bits for contexts (and maybe dictionaries).