RE: bohe and delta experimentation...

Roberto Peon <grmocg@gmail.com> Fri, 18 January 2013 17:24 UTC

Return-Path: <ietf-http-wg-request@listhub.w3.org>
X-Original-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Delivered-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1ECC621F86D2 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Fri, 18 Jan 2013 09:24:38 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -10.404
X-Spam-Level:
X-Spam-Status: No, score=-10.404 tagged_above=-999 required=5 tests=[AWL=0.194, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HZSpTzpUIkUt for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Fri, 18 Jan 2013 09:24:36 -0800 (PST)
Received: from frink.w3.org (frink.w3.org [128.30.52.56]) by ietfa.amsl.com (Postfix) with ESMTP id 8AB2D21F86A9 for <httpbisa-archive-bis2Juki@lists.ietf.org>; Fri, 18 Jan 2013 09:24:36 -0800 (PST)
Received: from lists by frink.w3.org with local (Exim 4.72) (envelope-from <ietf-http-wg-request@listhub.w3.org>) id 1TwFeP-00077V-LC for ietf-http-wg-dist@listhub.w3.org; Fri, 18 Jan 2013 17:22:45 +0000
Resent-Date: Fri, 18 Jan 2013 17:22:45 +0000
Resent-Message-Id: <E1TwFeP-00077V-LC@frink.w3.org>
Received: from maggie.w3.org ([128.30.52.39]) by frink.w3.org with esmtp (Exim 4.72) (envelope-from <grmocg@gmail.com>) id 1TwFeK-00076l-7g for ietf-http-wg@listhub.w3.org; Fri, 18 Jan 2013 17:22:40 +0000
Received: from mail-lb0-f176.google.com ([209.85.217.176]) by maggie.w3.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.72) (envelope-from <grmocg@gmail.com>) id 1TwFeI-0003Y3-OU for ietf-http-wg@w3.org; Fri, 18 Jan 2013 17:22:40 +0000
Received: by mail-lb0-f176.google.com with SMTP id k6so2849280lbo.21 for <ietf-http-wg@w3.org>; Fri, 18 Jan 2013 09:22:12 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=M+0Lpg0k2wPDpF2D5NudqLldJin5vqvS3uQkuSUUg6Q=; b=KYSjaXugNxKrv4++rgzw4NKaICMkZz2y7+LmON6OmqL7IKXsgNut/dYrgB0LTwJjz/ E58U80qJFF8CRYlvXfaEEfhxptsUwJr+5PEdjgAfVp5g7g5dKYl0jNJZhxSXMZMsIQCJ q+tsopspte0/JLYd2Dchy8JYHKNh3M0XF6RkiDS+/Ln2V4HC0KyB/sa15mgAbO7wu4TA Fp2XBEM1iVp+fc0hee8IzqyVTO6J79r5ErZXHxTzAORFpYZRhZKMsznak2b80nZPJWXu SWI/7UpbBPklzfBavLekYhrIb50mXh0/JTajLp1vLuBJTsSrSihQXht23XpBUcjW3b1/ wYNA==
MIME-Version: 1.0
X-Received: by 10.112.44.164 with SMTP id f4mr3996851lbm.111.1358529731939; Fri, 18 Jan 2013 09:22:11 -0800 (PST)
Received: by 10.112.81.5 with HTTP; Fri, 18 Jan 2013 09:22:11 -0800 (PST)
In-Reply-To: <6C71876BDCCD01488E70A2399529D5E52E13E4@ADELE.crf.canon.fr>
References: <CABP7RbeNFm3ZHdtDBUJb3idJjFj0q+fxDPzxKZBhSJqXw8zWaQ@mail.gmail.com> <2FD0BBE1-59C6-4E49-ACCE-60C1A895FB7D@mnot.net> <CABP7RbdXh1mb_P-HQucksiHc1So0ggVxH5v8y7vk13g+CcWe-Q@mail.gmail.com> <DD2EFC9F-5201-4829-9E6F-BD9CF0307BB0@mnot.net> <CAK3OfOj1O82WqO0L0rNpq2qeKJoT9E0ZQrV6Y=ULETtACpYMag@mail.gmail.com> <CAK3OfOgOGFNbve_QrTrCesqrrAQRH5qWgvebBxAhoMD7_MjhjQ@mail.gmail.com> <0A36AEB6-09B9-462F-B2E8-90B67FE69980@mnot.net> <CAK3OfOhewuVdjxu7UUp49g8B33YZNJ_N-PkASkHLP213+8gquA@mail.gmail.com> <CAP+FsNdi4=Am7pZdKySHZESp79BzRzPaR3UGQM2dsOM-yAxBOA@mail.gmail.com> <CAP+FsNf++RVVAyqweCsGG45wWQyjRrT7LEyWbv+QOd7Z2XdXwg@mail.gmail.com> <50F8F44E.9040401@it.aoyama.ac.jp> <0D1ABADB-E17F-46D3-9B6F-5CDC99FC06B9@mnot.net> <6C71876BDCCD01488E70A2399529D5E52E13E4@ADELE.crf.canon.fr>
Date: Fri, 18 Jan 2013 09:22:11 -0800
Message-ID: <CAP+FsNeJG_RWNbZDvuNBn3RuYEy3bA-wKdoCVfD476kdkC+xow@mail.gmail.com>
From: Roberto Peon <grmocg@gmail.com>
To: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Cc: Nico Williams <nico@cryptonector.com>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Mark Nottingham <mnot@mnot.net>, James M Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Content-Type: multipart/alternative; boundary="f46d0401229f16cce904d3935a39"
Received-SPF: pass client-ip=209.85.217.176; envelope-from=grmocg@gmail.com; helo=mail-lb0-f176.google.com
X-W3C-Hub-Spam-Status: No, score=-4.4
X-W3C-Hub-Spam-Report: AWL=-1.725, BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001
X-W3C-Scan-Sig: maggie.w3.org 1TwFeI-0003Y3-OU 53c27ac0b6ac6b421ce869027cb6d11b
X-Original-To: ietf-http-wg@w3.org
Subject: RE: bohe and delta experimentation...
Archived-At: <http://www.w3.org/mid/CAP+FsNeJG_RWNbZDvuNBn3RuYEy3bA-wKdoCVfD476kdkC+xow@mail.gmail.com>
Resent-From: ietf-http-wg@w3.org
X-Mailing-List: <ietf-http-wg@w3.org> archive/latest/16007
X-Loop: ietf-http-wg@w3.org
Resent-Sender: ietf-http-wg-request@w3.org
Precedence: list
List-Id: <ietf-http-wg.w3.org>
List-Help: <http://www.w3.org/Mail/>
List-Post: <mailto:ietf-http-wg@w3.org>
List-Unsubscribe: <mailto:ietf-http-wg-request@w3.org?subject=unsubscribe>

This makes URLs vulnerable to the CRIME attack, and URLs definitely do often
contain sensitive information :(

This is true for anything which allows partial matches (I just can't figure
out how a date could be sensitive, but if it could, even the encoding I
suggested earlier would be dangerous).

I dropped exactly this (prefix match) functionality from delta early on
because of this.
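
To make the concern concrete, here is a minimal sketch of the leak, not the
actual delta code. It assumes the attacker can choose the URL used as the
reference and can observe the encoded size of the victim's URL. The names
delta_size, recover and SECRET_URL, and the one-byte cost for the prefix
length, are made up for illustration.

```python
def shared_prefix_len(a, b):
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def delta_size(prev, url):
    """Assumed wire size of url as a delta from prev: 1 length byte + new chars."""
    return 1 + len(url) - shared_prefix_len(prev, url)


def recover(oracle, known_prefix, alphabet, length):
    """Extend known_prefix one character at a time using only observed sizes."""
    guess = known_prefix
    for _ in range(length):
        # The correct next character shares one more prefix character with
        # the secret, so it yields the smallest encoded size.
        guess += min(alphabet, key=lambda c: oracle(guess + c))
    return guess


# Hypothetical secret URL; the attacker only ever sees sizes via the oracle.
SECRET_URL = "https://example.com/account?token=s3cr3t"
oracle = lambda attacker_url: delta_size(attacker_url, SECRET_URL)

recovered = recover(
    oracle,
    "https://example.com/account?token=",
    "abcdefghijklmnopqrstuvwxyz0123456789",
    6,
)
print(recovered == SECRET_URL)
```

Each character costs at most one probe per alphabet symbol, which is why any
partial-match scheme over sensitive values is a problem.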
-=R
On Jan 18, 2013 5:58 AM, "RUELLAN Herve" <Herve.Ruellan@crf.canon.fr> wrote:

> I'll take a shot at the URLs. Experimental data show that URLs often share
> the same beginning: for requests targeting a web site, the URLs will
> usually start with the same scheme and host and possibly port. The
> beginning of the path is also usually shared by several URLs.
>
> Therefore an efficient encoding for a URL is as a delta from a previous
> URL: the number of shared characters at the beginning, and the new
> characters. To reduce the state that needs to be stored, it is possible to
> use only the previous URL as a reference.
>
> Regards,
>
> Hervé.
>
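
The encoding described above (shared-prefix length plus new characters, with
only the previous URL kept as state) can be sketched roughly as follows. This
is an illustration under assumptions, not code from any proposal: the class
name PrefixDeltaCodec and the example URLs are hypothetical, and no wire
format is defined here.

```python
import os


class PrefixDeltaCodec:
    """Keeps only the previous URL as state, for either encoding or decoding."""

    def __init__(self):
        self.prev = ""

    def encode(self, url):
        # os.path.commonprefix compares character by character.
        n = len(os.path.commonprefix([self.prev, url]))
        self.prev = url
        return n, url[n:]

    def decode(self, n, suffix):
        url = self.prev[:n] + suffix
        self.prev = url
        return url


urls = [
    "http://example.com/img/a.png",
    "http://example.com/img/b.png",
    "http://example.com/css/site.css",
]

encoder, decoder = PrefixDeltaCodec(), PrefixDeltaCodec()
encoded = [encoder.encode(u) for u in urls]
decoded = [decoder.decode(n, suffix) for n, suffix in encoded]
print(encoded)
print(decoded == urls)
```

Both sides must process the URLs in the same order, since each delta is only
meaningful relative to the URL that came immediately before it.
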
> > -----Original Message-----
> > From: Mark Nottingham [mailto:mnot@mnot.net]
> > Sent: vendredi 18 janvier 2013 08:09
> > To: Martin J. Dürst
> > Cc: Roberto Peon; Nico Williams; James M Snell; ietf-http-wg@w3.org
> > Subject: Re: bohe and delta experimentation...
> >
> > I feel like we're starting to focus a bit too closely on dates here (not
> > just you, Martin!).
> >
> > Let's look at the bigger picture, and other headers, before getting too
> > deep here; we're talking about saving a handful of bytes at this point,
> > and we haven't yet looked at URLs, etc.
> >
> > Cheers,
> >
> >
> > On 18/01/2013, at 6:05 PM, Martin J. Dürst <duerst@it.aoyama.ac.jp> wrote:
> >
> > > On 2013/01/17 8:49, Roberto Peon wrote:
> > >> Er, by which I mean that dates can be relative to the time stamped by
> > >> something and kept for the connection duration. That would reduce the
> > >> number of bits needed by a fair margin, assuming that is desirable.
> > >> -=R
> > >
> > > I was thinking about something similar, but on a bigger scale. If we
> > > have an encoding that can cover about 80 years (this is a simplification
> > > of what Unix time does, which is 1970-2037 with 31 bits), then if we
> > > assume every server around the globe understands that we are currently
> > > somewhere between 2010 and 2020, we could just use that as a very rough
> > > base point. In that case, we can't use a strict offset, because that
> > > would make dates move around every time we move to a new decade. But
> > > what we can do is to just rotate around. For this rotation to work, we
> > > have to leave some empty space. Below is a very, very rough table of how
> > > something like this could work.
> > >
> > > Assume we have three bits in a prefix to label 8 different decades. Then
> > > in each decade as indicated below on the left side, the prefixes would
> > > be used with the meaning as indicated at the top of the table.
> > >
> > >           1970 1980 1990 2000 2010 2020 2030 2040 2050 2060 2070 2080
> > >            -    -    -    -    -    -    -    -    -    -    -    -
> > >           1980 1990 2000 2010 2020 2030 2040 2050 2060 2070 2080 2090
> > >
> > > 1970-1980    0    1    2    3    x    x    x    x    x    x    x    x
> > > 1980-1990    0    1    2    3    4    x    x    x    x    x    x    x
> > > 1990-2000    0    1    2    3    4    5    x    x    x    x    x    x
> > > 2000-2010    x    1    2    3    4    5    6    x    x    x    x    x
> > > 2010-2020    x    x    2    3    4    5    6    7    x    x    x    x
> > > 2020-2030    x    x    x    3    4    5    6    7    0    x    x    x
> > > 2030-2040    x    x    x    x    4    5    6    7    0    1    x    x
> > > 2040-2050    x    x    x    x    x    5    6    7    0    1    2    x
> > > 2050-2060    x    x    x    x    x    x    6    7    0    1    2    3
> > >
> > > So as an example, in our current decade, we would use prefix 2 to
> > > indicate dates between 1990 and 2000, prefix 4 to indicate dates in our
> > > decade, and prefix 7 to indicate dates between 2040 and 2050. Prefixes 0
> > > and 1 are on purpose currently out of service to avoid any
> > > misunderstandings (does prefix 0 refer to 1970-80 or to 2050-60?). This
> > > way we avoid problems at the start/end of a decade, when some servers
> > > might think they are still in the old decade, while some others already
> > > think they are in the new decade.
> > >
> > > This is just a very rough sketch; the decades should be non-overlapping
> > > (1991-2000), it shouldn't be exactly decades, but some other intervals
> > > that we can cover with an exact number of bits. And maybe the
> > > past/future balance isn't ideal (currently 2 past and 3 future decades,
> > > maybe just 1 future and 4 past is better, or so).
> > >
> > > Anyway, I hope you can see the basic principles of the system: Use a
> > > rotating scheme with a very rough current anchoring and a wide-enough
> > > period of slack to avoid ambiguities.
> > >
> > > Regards,    Martin.
> > >
> > >
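
The rotating-prefix table above can be approximated with modular arithmetic.
The following is only a sketch under the rough parameters from the message
(10-year intervals starting at 1970, a 3-bit prefix, 2 past and 3 future
decades); the function names and constants are illustrative and not part of
any proposal.

```python
BASE_YEAR = 1970      # start of the first interval in the table
SPAN = 10             # interval width in years (the sketch uses decades)
PREFIXES = 8          # three prefix bits
PAST, FUTURE = 2, 3   # decades in service before and after the current one


def decade_index(year):
    return (year - BASE_YEAR) // SPAN


def window(now_year):
    """Decade indices in service for a party that believes it is now_year."""
    c = decade_index(now_year)
    return range(max(0, c - PAST), c + FUTURE + 1)


def encode_prefix(year, now_year):
    d = decade_index(year)
    if d not in window(now_year):
        raise ValueError("date is outside the window currently in service")
    return d % PREFIXES


def decode_prefix(prefix, now_year):
    """Return the first year of the decade that prefix refers to."""
    for d in window(now_year):
        if d % PREFIXES == prefix:
            return BASE_YEAR + d * SPAN
    raise ValueError("prefix is out of service in this decade")


# The 2010-2020 row of the table: prefixes 2, 4 and 7.
print(encode_prefix(1995, 2013), encode_prefix(2013, 2013), encode_prefix(2045, 2013))
```

Because the window holds at most 6 of the 8 prefix values, a receiver whose
clock is one decade off either decodes a prefix to the same decade as the
sender or rejects it; it never silently maps it to a different decade.
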
> > >> On Wed, Jan 16, 2013 at 3:48 PM, Roberto Peon <grmocg@gmail.com> wrote:
> > >>
> > >>> How about setting epoch as the first request in the connection? :)
> > >>> -=R
> > >>>
> > >>>
> > >>> On Wed, Jan 16, 2013 at 3:45 PM, Nico Williams <nico@cryptonector.com> wrote:
> > >>>
> > >>>> On Wed, Jan 16, 2013 at 5:39 PM, Mark Nottingham <mnot@mnot.net> wrote:
> > >>>>> On 17/01/2013, at 10:35 AM, Nico Williams <nico@cryptonector.com> wrote:
> > >>>>> Yep, but you either need to make the epoch start at least a few
> > >>>>> years ago (old Last-Modified times are important for heuristic
> > >>>>> freshness), OR keep it signed (losing a bit).
> > >>>>>
> > >>>>> And I think you need more than 12 bits for seconds in a day...
> > >>>>
> > >>>> Oops, for some reason I thought of seconds in an hour.  So 5 more
> > >>>> bits, and we're about even with seconds since epoch.  Either way
> > >>>> getting from 24 bytes to 4 is pretty good, and no compression
> > >>>> scheme will do better.
> > >>>>
> > >>>>
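
The bit counts in the exchange above can be checked with a few lines. This is
just the arithmetic, with a hypothetical helper name (bits_needed); it does
not define any encoding.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def bits_needed(n_values):
    """Smallest number of bits that can represent n_values distinct values."""
    return max(1, (n_values - 1).bit_length())


hour_bits = bits_needed(SECONDS_PER_HOUR)
day_bits = bits_needed(SECONDS_PER_DAY)
print(hour_bits, day_bits, day_bits - hour_bits)  # prints: 12 17 5
```

So 12 bits cover the seconds in an hour, and the seconds in a day need 5
more, which matches the correction above.
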
> > >>>
> > >>
> >
> > --
> > Mark Nottingham   http://www.mnot.net/
> >
> >
> >
>
>