bohe and delta experimentation...
James M Snell <jasnell@gmail.com> Wed, 16 January 2013 22:10 UTC
From: James M Snell <jasnell@gmail.com>
Date: Wed, 16 Jan 2013 14:07:23 -0800
Message-ID: <CABP7RbeNFm3ZHdtDBUJb3idJjFj0q+fxDPzxKZBhSJqXw8zWaQ@mail.gmail.com>
To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Archived-At: <http://www.w3.org/mid/CABP7RbeNFm3ZHdtDBUJb3idJjFj0q+fxDPzxKZBhSJqXw8zWaQ@mail.gmail.com>
After going through a number of scenarios with bohe using a variety of stream-compression approaches, it's painfully obvious that there is really no way around the CRIME issue when using stream compression. So with that, I'm turning my attention to the use of Roberto's delta encoding and exploring whether or not binary-optimized values can make a significant difference (as opposed to simply dropping in Huffman-encoded text everywhere).

I'm starting with dates first... Right now, dates in http/1 requests are rather inefficient. The existing date-time format wastes a significant amount of space, albeit across only a relatively small number of headers. On the plus side, these tend to compress well, but given that the dates change frequently from request to request, they will be short-lived in the delta context.

Given this, I decided to run a test scenario for compressing RFC3339 dates as text vs. using a compact binary encoding. I generated a sample of 100k random RFC3339 timestamps that variably include milliseconds and timezone offsets, used that to generate a date-time-specific symbol map, and applied a static Huffman coding. Then, given a sample of 100k more randomly generated timestamps, the average compressed size was 12-13 bytes for the date value (the average length of the uncompressed timestamp is 24 bytes). So, pretty good compression using a symbol tree specifically optimized for date-times.

By comparison, I devised a simple binary coding for dates using the following format:

  +-+---+---+-------------------+
  |M|TZH|TZM| year (16-bit)     |
  +-+---+---+-----+-------------+
  | month (4-bit) | day (5-bit) |
  +---------------+-------------+
  | hour (5-bit)  | minute (6)  |
  +---------------+-------------+
  | second (6 bit)| millis (31) |
  +---------------+-------------+
  |d|tz hrs (5 bit)| tz min (6) |
  +-----------------------------+

M, TZH and TZM are single-bit flags. When M is set, the value includes a 31-bit millisecond field.
When TZH is set, the value includes timezone offset hours, and when TZM is set, it includes timezone offset minutes. The d field (last row) is a single bit indicating a positive or negative timezone offset.

The minimum possible binary encoding is 6 bytes, which includes the three flag bits, year, month, day, hour, minute and second. The maximum possible encoding is 11 bytes, which includes the full timezone offset and milliseconds. This gives an average encoding of 8 bytes over any sample of randomly generated timestamps.

While the binary encoding is certainly more efficient, I'm not yet certain whether those 4 bytes are worth the effort, but it does improve the overall compression ratio for the message as a whole. Either way, regardless of whether we Huffman-code or binary-code the date values, we should require that RFC3339/ISO8601 timestamps be used for all date headers within the http/2 header encoding, as those are going to compress much better than the current http/1 date format.

Entity tags are another area where binary values may be useful. Currently, ETag values generally tend to be hex- or base64-encoded binary data. By simply allowing the ETag to be dropped in as a set of bytes in the encoded header, we can cut the transmitted size of hex-encoded tags in half. The format I'm considering for these is:

  +-+------+-----------+
  |W|len(7)| octets... |
  +-+------+-----------+

Here W is a single-bit flag indicating whether the tag is weak, and len is the number of encoded octets for the entity tag. (I'm wondering, though, whether or not we could get away with dropping the entire concept of a "weak entity tag".)

By optimizing dates and entity tags this way, we end up with optimized encodings for a good number of commonly used headers (date, last-modified, expires, etag, if-none-match, if-match, if-modified-since, etc.), and we can eliminate the need for doing any compression on those values at all.

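A minimal sketch of the date and entity-tag layouts described above, in Python. The function names are hypothetical, and several details are assumptions made only for illustration since the text does not pin them down: a flag is set only when its field is non-zero, the d bit is present whenever TZH or TZM is set, fields are packed most-significant bit first, and the result is zero-padded to a byte boundary.

```python
from datetime import datetime, timedelta, timezone


def encode_date(dt):
    """Pack a datetime into the 6-to-11 byte layout (assumed bit order: MSB first)."""
    offset = dt.utcoffset() or timedelta(0)
    total_min = int(offset.total_seconds()) // 60
    negative = total_min < 0
    tz_h, tz_m = divmod(abs(total_min), 60)
    millis = dt.microsecond // 1000

    # (value, bit width) pairs in the order shown in the diagram.
    fields = [
        (1 if millis else 0, 1),   # M flag
        (1 if tz_h else 0, 1),     # TZH flag
        (1 if tz_m else 0, 1),     # TZM flag
        (dt.year, 16), (dt.month, 4), (dt.day, 5),
        (dt.hour, 5), (dt.minute, 6), (dt.second, 6),
    ]
    if millis:
        fields.append((millis, 31))
    if tz_h or tz_m:
        # Assumption: the sign bit is only sent when an offset field follows.
        fields.append((1 if negative else 0, 1))
    if tz_h:
        fields.append((tz_h, 5))
    if tz_m:
        fields.append((tz_m, 6))

    value = nbits = 0
    for field_value, width in fields:
        value = (value << width) | field_value
        nbits += width
    pad = -nbits % 8  # zero bits needed to reach a whole number of bytes
    return (value << pad).to_bytes((nbits + pad) // 8, "big")


def encode_etag(raw, weak=False):
    """Prefix raw tag octets with one byte: W flag in the top bit, 7-bit length."""
    if len(raw) > 127:
        raise ValueError("entity tag too long for a 7-bit length")
    return bytes([(0x80 if weak else 0) | len(raw)]) + raw


if __name__ == "__main__":
    utc = datetime(2013, 1, 16, 22, 10, 14, tzinfo=timezone.utc)
    full = datetime(2013, 1, 16, 22, 10, 14, 500000,
                    tzinfo=timezone(timedelta(hours=5, minutes=30)))
    print(len(encode_date(utc)), len(encode_date(full)))
    print(encode_etag(bytes.fromhex("deadbeef")).hex())
```

With these assumptions, a UTC timestamp without milliseconds takes 45 bits and so fits the 6-byte minimum, while a timestamp with milliseconds and a full hour-and-minute offset takes 88 bits, the 11-byte maximum.
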
Another set of headers we can optimize within delta are those with numeric values: Content-Length, :status, Expires, etc. Rather than encoding those as ASCII strings, we would simply encode them as their numeric value.

I will be turning my attention to cookie values next. I'm considering whether or not we should produce a code tree that is specific to cookie headers and/or allow for purely binary values.

- James