Re: delta encoding and state management

Mark Nottingham <mnot@mnot.net> Sat, 19 January 2013 23:51 UTC

Indeed. You can see this in the results for the simple compressor, which just keeps the previous set of headers on the connection as state. 

It's not as efficient as delta or gzip, but the numbers aren't bad (actually much better than I reported in my blog post, because of a bug in the test runner that is now fixed in my refactor branch), and the amount of state (and complexity!) is bounded.
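
To make the idea concrete, here is a rough sketch of a compressor whose only state is the previous set of headers on the connection. It is an illustration under stated assumptions, not the code from the test runner: the class name PreviousHeadersCodec, the (name, value) delta format, and the use of None to mean "header removed" are all hypothetical, and repeated header names are ignored for simplicity.

```python
from typing import Dict, List, Optional, Tuple

Headers = Dict[str, str]
# Hypothetical wire format: (name, value) pairs; a value of None means
# "this header was present in the previous message and is now gone".
Delta = List[Tuple[str, Optional[str]]]


class PreviousHeadersCodec:
    """Keeps exactly one header set (the previous message's) as state."""

    def __init__(self) -> None:
        self.previous: Headers = {}

    def encode(self, headers: Headers) -> Delta:
        delta: Delta = []
        # Send only headers that are new or whose value changed.
        for name, value in headers.items():
            if self.previous.get(name) != value:
                delta.append((name, value))
        # Signal headers that disappeared since the previous message.
        for name in self.previous:
            if name not in headers:
                delta.append((name, None))
        # The state is replaced, never accumulated, so it stays bounded.
        self.previous = dict(headers)
        return delta

    def decode(self, delta: Delta) -> Headers:
        # Start from the previous header set and apply the changes.
        headers = dict(self.previous)
        for name, value in delta:
            if value is None:
                headers.pop(name, None)
            else:
                headers[name] = value
        self.previous = headers
        return headers


if __name__ == "__main__":
    sender, receiver = PreviousHeadersCodec(), PreviousHeadersCodec()
    first = {"accept": "text/html", "user-agent": "demo", "cookie": "a=1"}
    second = {"accept": "image/png", "user-agent": "demo"}
    for message in (first, second):
        wire = sender.encode(message)
        print(wire)
        assert receiver.decode(wire) == message
```

Each side holds at most one header set per direction, which is why the state is bounded regardless of how many messages are sent on the connection.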


On 19/01/2013, at 12:50 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr> wrote:

> I agree that finding optimized binary encodings for headers will help us reduce the size of the data transmitted.
> 
> At the same time, stateful information is also very useful when transmitting a set of successive messages. It allows encoding a header as a reference to another header present in a previous message.
> 
> In my experiments, I tried to devise a binary encoding for the Accept header. However, I found that I was not able to reach the compression ratio obtained by using references to previous messages. Currently, in a set of requests to get a full web page, the Accept header will take 4 or 5 different values. This makes stateful compression very efficient.
> 
> The drawback of stateful compression is that this state must be stored. I understand that this can be a critical problem for intermediaries. I think that we should work on minimizing the amount of state an intermediary has to store for each connection. I was also wondering if anyone had a rough figure for what would be acceptable to an intermediary.
> 
> Hervé.
> 
>> -----Original Message-----
>> From: James M Snell [mailto:jasnell@gmail.com]
>> Sent: vendredi 18 janvier 2013 00:32
>> To: Nico Williams
>> Cc: Roberto Peon; ietf-http-wg@w3.org
>> Subject: Re: delta encoding and state management
>> 
>> Agreed on all points. At this point I'm turning my attention towards
>> identifying all of the specific headers we can safely and successfully provide
>> optimized binary encodings for. The rest will be left as is. The existing bohe
>> draft defines an encoding structure for the list of headers themselves; I will
>> likely drop that and focus solely on the representation of the header values
>> for now. My goal is to have an updated draft done in time for the upcoming
>> interim meeting.
>> 
>> 
>> On Thu, Jan 17, 2013 at 2:16 PM, Nico Williams <nico@cryptonector.com> wrote:
>> 
>> 
>> 	On Thu, Jan 17, 2013 at 3:44 PM, James M Snell <jasnell@gmail.com> wrote:
>> 
>> 	> We certainly cannot come up with optimized binary encodings for everything
>> 	> but we can get a good ways down the road optimizing the parts we do know
>> 	> about. We've already seen, for instance, that date headers can be optimized
>> 	> significantly; and the separation of individual cookie crumbs allows us to
>> 	> keep from having to resend the entire cookie whenever just one small part
>> 	> changes. I'm certain there are other optimizations we can make without
>> 	> feeling like we have to find encodings for everything.
>> 
>> 
>> 	The only way cookie compression can work is by having connection
>> 	state.  But often the whole point of cookies is to not store state on
>> 	the server but on the client.
>> 
>> 	The more state we associate with connections the more pressure there
>> 	will be to close connections sooner and then we'll have to establish
>> 	new connections, build new compression state, and then have it torn
>> 	down again.  Fast TCP can help w.r.t. reconnect overhead, but that's
>> 	about it.
>> 
>> 	We need to do more than measure compression ratios.  We need to
>> 	measure state size and performance impact on fully-loaded middleboxes.
>> 	We need to measure the full impact of compression on the user
>> 	experience.  A fabulous compression ratio might nonetheless spell doom
>> 	for the user experience and thence the protocol.  If we take the wrong
>> 	measures we risk failure for the new protocol, and we may not try
>> 	again for a long time.
>> 
>> 	Also, with respect to some of those items we cannot encode minimally
>> 	(cookies, URIs, ...): their size is really in the hands of the
>> 	entities that create them -- let *them* worry about compression.  That
>> 	might cause some pressure to create shorter, less-meaningful URIs,
>> 	but... we're already there anyways.
>> 
>> 	Nico
>> 	--
>> 
>> 
> 

--
Mark Nottingham   http://www.mnot.net/