Cache-Cache and Binary Encoding

James M Snell <jasnell@gmail.com> Fri, 18 January 2013 20:00 UTC

From: James M Snell <jasnell@gmail.com>
Date: Fri, 18 Jan 2013 11:58:56 -0800
Message-ID: <CABP7Rbep0CjJV5OTc9-toJLbEHwc7g=by-3JUTtuoBWYu=L+eA@mail.gmail.com>
To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Subject: Cache-Cache and Binary Encoding
Archived-At: <http://www.w3.org/mid/CABP7Rbep0CjJV5OTc9-toJLbEHwc7g=by-3JUTtuoBWYu=L+eA@mail.gmail.com>

Just one more for the day... Looking at Cache-Control. Currently the
Cache-Control header consists of a list of named directives that optionally
have associated values. The format is extensible, which is great, but that
makes things a bit more difficult to optimize. Let's look at a few random
examples...

  Cache-Control: public (6 bytes)
  Cache-Control: public, max-age=1600 (20 bytes)
  Cache-Control: no-store, no-transform, must-revalidate (39 bytes)

Let's see if we can do better.

First off, let's assume that Cache-Control on requests can have a different
encoding than Cache-Control on responses. For requests, let's make it:

 +----------+----------+---------------------+
 | no-cache | no-store |   no-transform      |
 +----------+-----+----+---------+-----------+
 | only-if-cached |xxxx| max-age | max-stale |
 +-----------+----+----+---------+-----------+
 | min-fresh | num-ext | repeating ext block |
 +-----------+-----+---+---------+-----------+

   no-cache       = 1 bit
   no-store       = 1 bit
   no-transform   = 1 bit
   only-if-cached = 1 bit
   xxxx           = 4 reserved bits

   max-age        = uintvar
   max-stale      = uintvar
   min-fresh      = uintvar
   num-ext        = 1 byte
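
As a rough illustration of the fixed part of the request layout, here is a minimal Python sketch. Nothing in this message defines uintvar, so the sketch assumes a big-endian base-128 integer with the high bit of every byte except the last used as a continuation bit. It also assumes that the flag bits are packed most-significant-bit first in diagram order, that an absent numeric directive is sent as 0, and that num-ext is 0. The function names are made up for the example.

```python
# Request flags in the order they appear in the diagram (assumed MSB first).
REQUEST_FLAGS = ("no-cache", "no-store", "no-transform", "only-if-cached")


def encode_uintvar(n):
    """Assumed uintvar: 7 bits per byte, most-significant group first,
    high bit set on every byte except the last."""
    if n < 0:
        raise ValueError("uintvar cannot encode negative numbers")
    groups = [n & 0x7F]              # final byte, continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))


def encode_request_cache_control(flags=(), max_age=0, max_stale=0, min_fresh=0):
    """Encode the fixed request fields, with no extension blocks."""
    unknown = set(flags) - set(REQUEST_FLAGS)
    if unknown:
        raise ValueError(f"unknown request directives: {sorted(unknown)}")
    flag_byte = 0
    for i, name in enumerate(REQUEST_FLAGS):
        if name in flags:
            flag_byte |= 0x80 >> i   # low 4 bits stay 0 (reserved)
    out = bytes([flag_byte])
    # Assumption: a value of 0 means the directive is absent.
    out += encode_uintvar(max_age)
    out += encode_uintvar(max_stale)
    out += encode_uintvar(min_fresh)
    out += bytes([0])                # num-ext = 0
    return out


print(encode_request_cache_control({"no-cache"}).hex())
# 8000000000
print(encode_request_cache_control({"only-if-cached"}, max_age=1600).hex())
# 108c40000000
```

Under these assumptions a lone no-cache comes to five bytes and only-if-cached with max-age=1600 comes to six. The exact count for values above 127 depends on the real uintvar definition.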

 repeating ext block =

 +---------------------------+
 |TYP|XXXXXX|len(key)|key|val|
 +---------------------------+

   TYP = 2 bit type code
         00 = Boolean, no val
         01 = Numeric, val is uintvar
         10 = Text, val is encoded text
         11 = Reserved
   XXXXXX = Reserved Bits

   If TYP is 00, then val is omitted. The idea is that this is a boolean
flag, like no-cache, no-store, etc. The key is a text label that identifies
the flag.

   If TYP is 01, then val is a uintvar.

   If TYP is 10, then val is a 2-byte length followed by the encoded text.
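
A single extension block could be serialized along these lines. This is only a sketch under assumptions that the message leaves open: TYP occupies the top two bits of the first byte, len(key) is one byte, keys are ASCII, the 2-byte text length is big-endian, text is UTF-8, and uintvar is a big-endian base-128 integer with a continuation bit. The helper names and the directive names in the usage lines are illustrative only.

```python
TYPE_BOOLEAN, TYPE_NUMERIC, TYPE_TEXT = 0b00, 0b01, 0b10


def encode_uintvar(n):
    """Assumed uintvar: 7 bits per byte, most-significant group first,
    high bit set on every byte except the last."""
    if n < 0:
        raise ValueError("uintvar cannot encode negative numbers")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))


def encode_ext_block(key, value=None):
    """Encode one |TYP|XXXXXX|len(key)|key|val| block.

    value=None -> boolean flag (no val)
    int        -> numeric (val is a uintvar)
    str        -> text (val is a 2-byte length plus the encoded text)
    """
    key_bytes = key.encode("ascii")
    if len(key_bytes) > 255:
        raise ValueError("key does not fit the assumed 1-byte length")
    if value is None:
        typ, val = TYPE_BOOLEAN, b""
    elif isinstance(value, int) and not isinstance(value, bool):
        typ, val = TYPE_NUMERIC, encode_uintvar(value)
    elif isinstance(value, str):
        text = value.encode("utf-8")
        typ, val = TYPE_TEXT, len(text).to_bytes(2, "big") + text
    else:
        raise TypeError("value must be None, an int, or a str")
    # TYP in the top two bits; the six reserved bits are left as 0.
    return bytes([typ << 6, len(key_bytes)]) + key_bytes + val


print(encode_ext_block("immutable"))             # boolean flag
print(encode_ext_block("stale-if-error", 1600))  # numeric value
print(encode_ext_block("community", "UCI"))      # text value
```

Note that every extension carries its key as text, so an extension directive saves little or nothing over the HTTP/1.1 form; the savings come from the fixed fields.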

So, looking at a couple of examples:

  Cache-Control: no-cache encodes as five bytes
  Cache-Control: only-if-cached, max-age=1600 encodes as seven bytes
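
Going the other way, a decoder can turn the compact request form back into an HTTP/1.1 header value. The sketch below rests on the same unstated assumptions as any encoder for this layout would: flag bits packed most-significant-bit first, a big-endian base-128 uintvar with a continuation bit, 0 meaning "directive absent", a 1-byte key length, and a big-endian 2-byte text length. The function names are hypothetical.

```python
REQUEST_FLAGS = ("no-cache", "no-store", "no-transform", "only-if-cached")


def decode_uintvar(buf, pos):
    """Read an assumed big-endian base-128 uintvar; return (value, new_pos)."""
    value = 0
    while True:
        byte = buf[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:          # continuation bit clear: last byte
            return value, pos


def decode_request_cache_control(buf):
    """Rebuild the textual Cache-Control value from the binary request form."""
    directives = [name for i, name in enumerate(REQUEST_FLAGS)
                  if buf[0] & (0x80 >> i)]
    pos = 1
    for name in ("max-age", "max-stale", "min-fresh"):
        value, pos = decode_uintvar(buf, pos)
        # Assumption: 0 means absent, so an explicit "max-age=0" is lost here.
        if value:
            directives.append(f"{name}={value}")
    num_ext = buf[pos]
    pos += 1
    for _ in range(num_ext):
        typ, key_len = buf[pos] >> 6, buf[pos + 1]
        pos += 2
        key = buf[pos:pos + key_len].decode("ascii")
        pos += key_len
        if typ == 0b00:              # boolean flag, no val
            directives.append(key)
        elif typ == 0b01:            # numeric, val is a uintvar
            value, pos = decode_uintvar(buf, pos)
            directives.append(f"{key}={value}")
        elif typ == 0b10:            # text, 2-byte length then the text
            text_len = int.from_bytes(buf[pos:pos + 2], "big")
            pos += 2
            text = buf[pos:pos + text_len].decode("utf-8")
            pos += text_len
            directives.append(f'{key}="{text}"')
        else:
            raise ValueError("TYP 11 is reserved")
    return ", ".join(directives)


print(decode_request_cache_control(bytes.fromhex("8000000000")))
# no-cache
print(decode_request_cache_control(bytes.fromhex("108c40000000")))
# only-if-cached, max-age=1600
```

The directive order in the output follows the binary layout, not the order the sender originally wrote, which should not matter for Cache-Control semantics.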

For the Cache-Control header on responses, we can do:

 +--------+---------+----------+-------------+
 | public | private | no-cache | no-transform|
 +--------+-+-------+----------+-----------+-+
 | no-store | must-revalidate  |proxy-reval|X|
 +----------+----------+-------+-----------+-+
 | max-age  | s-maxage | num-no-cache-headers|
 +----------+-------+--+---------------------+
 | no-cache-headers | num-private-headers    |
 +------------------+------------------------+
 |private-headers|num-ext|repeating ext block|
 +---------------+-------+-------------------+

Same idea,

  public               = 1 bit
  private              = 1 bit
  no-cache             = 1 bit
  no-transform         = 1 bit
  no-store             = 1 bit
  must-revalidate      = 1 bit
  proxy-reval          = 1 bit
  X                    = 1 reserved bit
  max-age              = uintvar
  s-maxage             = uintvar
  num-no-cache-headers = 1 byte
  no-cache-headers     = null-byte separated list of header names
  num-private-headers  = 1 byte
  private-headers      = null-byte separated list of header names
  num-ext, ext block   = as for requests
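
The response layout can be sketched the same way. The assumptions, none of which are spelled out in this message: flag bits are packed most-significant-bit first in diagram order, uintvar is a big-endian base-128 integer with a continuation bit, an absent max-age or s-maxage is sent as 0, header names are ASCII, and num-ext is 0. The function names are invented for the example.

```python
RESPONSE_FLAGS = (
    "public", "private", "no-cache", "no-transform",
    "no-store", "must-revalidate", "proxy-revalidate",
)


def encode_uintvar(n):
    """Assumed uintvar: 7 bits per byte, most-significant group first,
    high bit set on every byte except the last."""
    if n < 0:
        raise ValueError("uintvar cannot encode negative numbers")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))


def encode_response_cache_control(flags=(), max_age=0, s_maxage=0,
                                  no_cache_headers=(), private_headers=()):
    """Encode the response fields, with no extension blocks."""
    unknown = set(flags) - set(RESPONSE_FLAGS)
    if unknown:
        raise ValueError(f"unknown response directives: {sorted(unknown)}")
    flag_byte = 0
    for i, name in enumerate(RESPONSE_FLAGS):
        if name in flags:
            flag_byte |= 0x80 >> i   # lowest bit stays 0 (reserved)
    out = bytearray([flag_byte])
    out += encode_uintvar(max_age)
    out += encode_uintvar(s_maxage)
    # Each header list: a 1-byte count, then the names joined by null bytes.
    for names in (no_cache_headers, private_headers):
        out.append(len(names))
        out += b"\x00".join(name.encode("ascii") for name in names)
    out.append(0)                    # num-ext = 0
    return bytes(out)


print(encode_response_cache_control({"public"}).hex())
# 800000000000
print(encode_response_cache_control(
    {"no-store", "no-transform", "must-revalidate"}).hex())
# 1c0000000000
print(len(encode_response_cache_control({"public"}, max_age=1600)))
# 7
```

With this uintvar, any max-age above 127 needs a second byte, so the size of a response carrying max-age=1600 depends on how uintvar is really defined. A decoder would also need a rule for where the last name in a header list ends, which the layout as described does not give.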

Examples...

  Cache-Control: public (encodes as 6 bytes)
  Cache-Control: public, max-age=1600 (encodes as 6 bytes, saving 14 bytes)
  Cache-Control: no-store, no-transform, must-revalidate (encodes as 6
bytes, saving 33 bytes)

So, looking at these examples, it is definitely possible to save a lot of
space, but at the cost of quite a bit of encoding complexity. I'm sure we
could do better, but this provides a good starting point, and it's
bidirectionally compatible with HTTP/1.1. Whether or not it's worth the
effort is a different question entirely.

- James