Re: bohe and delta experimentation...

James M Snell <jasnell@gmail.com> Wed, 16 January 2013 23:12 UTC

From: James M Snell <jasnell@gmail.com>
Date: Wed, 16 Jan 2013 15:10:55 -0800
Message-ID: <CABP7Rbeuk1DX+dKam=AHeUgfLAoa4XybOLFA+C1t1oQuQswdeA@mail.gmail.com>
To: Nico Williams <nico@cryptonector.com>
Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>

On Wed, Jan 16, 2013 at 2:47 PM, Nico Williams <nico@cryptonector.com> wrote:

> > [snip]
> > +-+---+---+-------------------+
> > |M|TZH|TZM|   year (16-bit)   |
> > +-+---+---+-----+-------------+
> > | month (4-bit) | day (5-bit) |
> > +---------------+-------------+
> > | hour (5-bit)  | minute (6)  |
> > +---------------+-------------+
> > | second (6 bit)| millis (31) |
> > +---------------+-------------+
> > |d|tz hrs (5 bit)| tz min (6) |
> > +-----------------------------+
> >
> > M, TZH and TZM are single bit flags. When M is set, the value includes a
> > 31-bit millisecond field. When TZH is set, it includes timezone offset
> > hours, and when TZM is set, it includes timezone offset minutes. The d
> > field (last row) is a single bit indicating positive or negative timezone
> > offset.
>
> You don't need 31 bits for milliseconds; 10 will do!  But sure, it's
> nice to be able to get to microseconds, in which case 20 bits should
> suffice, or nanoseconds, in which case 30 bits should suffice.  In no
> case do we need 31 bits for fractions of seconds.  But at best we save
> 21 bits -- two bytes, or, if we're lucky, three.
>
>
Yes, 31 bits was intentional overkill just for the strawman. I'm not
convinced we would ever need more than millisecond precision, and a 10-bit
millisecond field would bring the maximum encoding down to 9 bytes.
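
To make the size arithmetic concrete, here is a rough sketch in Python. The
field widths come from the strawman diagram quoted above; the helper names
are made up for illustration, and the sketch assumes the sign bit and both
timezone fields are always sent together.

```python
import math

# Field widths in bits, taken from the strawman diagram quoted above.
BASE_BITS = 3 + 16 + 4 + 5 + 5 + 6 + 6  # flags, year, month, day, hour, minute, second
TZ_BITS = 1 + 5 + 6                      # sign bit (d), tz hours, tz minutes


def bits_for(units_per_second):
    """Bits needed to count 0 .. units_per_second - 1 (hypothetical helper)."""
    return (units_per_second - 1).bit_length()


def encoded_bytes(fraction_bits=0, with_tz=False):
    """Whole bytes needed for one timestamp (hypothetical helper).

    Assumes the timezone sign, hours and minutes are sent as one group.
    """
    bits = BASE_BITS + fraction_bits + (TZ_BITS if with_tz else 0)
    return math.ceil(bits / 8)


print(bits_for(1_000), bits_for(1_000_000), bits_for(1_000_000_000))  # 10 20 30
print(encoded_bytes())                                # 6  (no fraction, no tz)
print(encoded_bytes(fraction_bits=31, with_tz=True))  # 11 (strawman maximum)
print(encoded_bytes(fraction_bits=10, with_tz=True))  # 9  (10-bit milliseconds)
```

The base fields come to 45 bits, so the minimum rounds up to 6 bytes; adding
a 10-bit fraction and the 12 timezone bits gives 67 bits, hence 9 bytes.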


> > The minimum possible binary encoding is 6-bytes, which includes the first
> > three flag bits, year, month, day, hour, minute and second. The maximum
> > possible encoding is 11-bytes which includes full timezone offset and
> > milliseconds. Giving an average encoding of 8-bytes over any sample size
> > of randomly generated timestamps.
>
> But if everyone chooses to send the max then it's 11 vs. the 12 you
> got with date string compression.  Too trivial a gain?
>
> Of course, an encoding that uses, say, 44 bits for twos-complement (do
> we need negative dates for this?) seconds since the Unix epoch + 20
> for microseconds would always be 8 bytes, but we'd get no TZ
> information, and TZ info would require at least two more bytes so...
> we're back to about 10-12 bytes.  If we could do with just 34 bits for
> seconds w/o negative dates we're getting closer to always 8 bytes.
> And if we could do with just 33 bits for seconds ... we'd get to
> exactly 8 bytes but at the price of a 2,242 year problem.
>
>
One of the nice things about the strawman encoding I used is that it is a
field-for-field representation of the RFC3339 timestamp. It encodes exactly
the same information and can represent the full range of dates supported by
the date-time construct. Other variations may shave off one or two
additional bytes, but they either lose information or are far more limited
in the values they can express. If we adjust the millisecond field to 10
bits, as you suggest, we have a worst case of 9 bytes and a best case of 6.
Seems like a reasonable compromise to me.
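
For illustration only, here is one way the field-for-field packing could be
sketched in Python with a 10-bit millisecond field. The function names, the
padding to a byte boundary, and the choice to always set TZH and TZM
together are my assumptions for this sketch, not part of any draft.

```python
def pack_timestamp(year, month, day, hour, minute, second,
                   millis=None, tz=None, millis_bits=10):
    """Pack fields in diagram order. tz is None or (negative, hours, minutes)."""
    fields = [
        (int(millis is not None), 1),  # M flag
        (int(tz is not None), 1),      # TZH flag (assumed set together with TZM)
        (int(tz is not None), 1),      # TZM flag
        (year, 16), (month, 4), (day, 5),
        (hour, 5), (minute, 6), (second, 6),
    ]
    if millis is not None:
        fields.append((millis, millis_bits))
    if tz is not None:
        negative, tz_hours, tz_minutes = tz
        fields += [(int(negative), 1), (tz_hours, 5), (tz_minutes, 6)]

    value = nbits = 0
    for field, width in fields:
        if not 0 <= field < (1 << width):
            raise ValueError(f"{field} does not fit in {width} bits")
        value = (value << width) | field
        nbits += width
    pad = -nbits % 8  # zero bits appended to reach a byte boundary (assumption)
    return (value << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_timestamp(data, millis_bits=10):
    """Inverse of pack_timestamp; returns a dict of the decoded fields."""
    value = int.from_bytes(data, "big")
    remaining = len(data) * 8

    def take(width):
        # Read the next `width` bits, most significant first.
        nonlocal remaining
        remaining -= width
        return (value >> remaining) & ((1 << width) - 1)

    has_millis, has_tzh, has_tzm = take(1), take(1), take(1)
    out = {}
    for name, width in [("year", 16), ("month", 4), ("day", 5),
                        ("hour", 5), ("minute", 6), ("second", 6)]:
        out[name] = take(width)
    if has_millis:
        out["millis"] = take(millis_bits)
    if has_tzh or has_tzm:
        negative = bool(take(1))
        hours = take(5) if has_tzh else 0
        minutes = take(6) if has_tzm else 0
        out["tz"] = (negative, hours, minutes)
    return out


short = pack_timestamp(2013, 1, 16, 15, 10, 55)
full = pack_timestamp(2013, 1, 16, 15, 10, 55, millis=250, tz=(True, 8, 0))
print(len(short), len(full))  # 6 9
```

Every RFC3339 field survives the round trip through `unpack_timestamp`,
which is the property I care about here.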


> What if we use julian day?  Then we'd need 31 bits for days (which
> allows us to go 1000 years into the future), 16 bits for seconds and
> milliseconds, and now we're at 6 bytes + two more for TZ data.  And if
> we encode TZ offset in terms of 15 minute increments then we get down
> to just 7 bytes for the whole thing.  Seven bytes is pretty good, but
> is it good enough to bother with this?
>
> We can do slightly better if we don't allow dates in the past, set a
> new epoch, and limit how far into the future our dates will go (we can
> always allow for encoding far-future dates with many more bytes).  I
> think we can probably get down to 6 bytes for dates, including TZ
> information and milliseconds for the next few decades then go up to 7
> bytes and so on.
>
> > Will be turning my attention to cookie values next. I'm considering
> > whether or not we should produce a code-tree that is specific to cookie
> > headers and/or allow for purely binary values.
>
> Where cookies bear encrypted session state you won't be able to
> compress them at all.  And it's not like the server can't do the
> effort to set maximally-compressed cookies -- it should!  IMO: leave
> cookies alone.
>

Yeah, that's what I suspect also. Allowing binary cookie values would let
us avoid some extra bits on the wire, but compression typically doesn't
help with these values at all, regardless of how optimized our code tree
is.
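
A quick way to see the effect, as a sketch only: the snippet below uses
Python's zlib as a stand-in for whatever header compressor we end up with
(it is not the Huffman code tree discussed in this thread), and seeded
pseudo-random bytes as a stand-in for an encrypted session cookie. Both
cookie values are invented for the example.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the example is repeatable

# Stand-in for encrypted session state: 64 high-entropy bytes (assumption).
opaque_cookie = bytes(rng.getrandbits(8) for _ in range(64))

# Invented, repetitive text cookie for contrast.
texty_cookie = b"theme=dark; lang=en-US; theme=dark; lang=en-US; theme=dark;"


def compressed_size(data):
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


for label, cookie in [("opaque", opaque_cookie), ("texty", texty_cookie)]:
    print(label, len(cookie), "->", compressed_size(cookie))
```

The opaque value does not shrink (the container overhead makes it slightly
larger), while the repetitive text value does.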


>
> Nico
> --
>