Calculation of the hash width in cache-digests

Martin Thomson <martin.thomson@gmail.com> Mon, 18 July 2016 13:58 UTC

Return-Path: <ietf-http-wg-request+bounce-httpbisa-archive-bis2juki=lists.ie@listhub.w3.org>
X-Original-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Delivered-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 92B3912E035 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Mon, 18 Jul 2016 06:58:32 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -8.308
X-Spam-Level:
X-Spam-Status: No, score=-8.308 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.001, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-1.287, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LKaS1r8XmRc0 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Mon, 18 Jul 2016 06:58:27 -0700 (PDT)
Received: from frink.w3.org (frink.w3.org [128.30.52.56]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 8078B12DD70 for <httpbisa-archive-bis2Juki@lists.ietf.org>; Mon, 18 Jul 2016 06:32:47 -0700 (PDT)
Received: from lists by frink.w3.org with local (Exim 4.80) (envelope-from <ietf-http-wg-request@listhub.w3.org>) id 1bP8b8-00062Y-Bv for ietf-http-wg-dist@listhub.w3.org; Mon, 18 Jul 2016 13:28:38 +0000
Resent-Date: Mon, 18 Jul 2016 13:28:38 +0000
Resent-Message-Id: <E1bP8b8-00062Y-Bv@frink.w3.org>
Received: from maggie.w3.org ([128.30.52.39]) by frink.w3.org with esmtps (TLS1.2:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.80) (envelope-from <martin.thomson@gmail.com>) id 1bP8b3-00061H-T3 for ietf-http-wg@listhub.w3.org; Mon, 18 Jul 2016 13:28:33 +0000
Received: from mail-qk0-f172.google.com ([209.85.220.172]) by maggie.w3.org with esmtps (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.80) (envelope-from <martin.thomson@gmail.com>) id 1bP8ar-0006kF-68 for ietf-http-wg@w3.org; Mon, 18 Jul 2016 13:28:32 +0000
Received: by mail-qk0-f172.google.com with SMTP id p74so156578528qka.0 for <ietf-http-wg@w3.org>; Mon, 18 Jul 2016 06:28:00 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to; bh=aL5m6xo89bWnwSshG92vGD/7NXFv126IAon06OkmJ/o=; b=nxLVNgK428CwVajGSllGc4bokKcKv4bC+TnkXQf8EBS7vC9K0bYzhea3ilivQq985v XGgEesC0Ce2Nt9OIFQ6cJk6yFpJwliTCO+IaGB7BgR5sX4IkpFUY7UkrrRM6PXet+zks P02Z5fexzs4aXhiVcKTU7Tl5kfCwIQzkAX8uZNujgXiBxeUBCxZDaCPtIaGkMfAeprQB XfHHk+IZKr4ecSxRcwWfcjvjlkSemc5TYN9HkbI8Njtbp81QUJfF4KFswTGCWH8f1vCQ UQ7Q2PupHomuhCflzyqydPPtIPdmV/zDGbgR6cjwhdm6gDhSnKZSAJ4GgB6EUbapZ+v/ 8eYg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=aL5m6xo89bWnwSshG92vGD/7NXFv126IAon06OkmJ/o=; b=H3gsX4OkH/7ZQkfYiaZPNlYnBhxjauXIGwRyQCuRMI1vjWGz7rtq6ugfiTRyhZRBWB A14n2bJe+X0tzMJbfuxpOhoYDcb0MzlApMgyKT10xiJz4BFXg3tMQSFf8UkUVspSqqmX vTIBTgNDLcxnon43f+e4TBM/WV848zWcrc8AU1HXsHmiJ02KLduK0UA/b9u/tJZ33Igq mLHL8DGdWEqYvAE9iM+rlbn9/6aZv5f120DyC94WsKwBEoi81ZB7FZLxUnCbyg5mT1QY eK8nWECtAvxyCUvRv8IXddy4UlVb+BUZP7OtB/Au+87Tl7U+Qy0UeldXGVaTRyNVORw9 XXrA==
X-Gm-Message-State: ALyK8tI7AnUaKxFRuj7sqDUFSGd9JGrGQZTlzjunS3smWzdY4zXkXcqS60elJOPs5D7W4D5q1FZvZSmzlHivww==
X-Received: by 10.55.214.77 with SMTP id t74mr44354438qki.80.1468848474690; Mon, 18 Jul 2016 06:27:54 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.140.22.146 with HTTP; Mon, 18 Jul 2016 06:27:54 -0700 (PDT)
From: Martin Thomson <martin.thomson@gmail.com>
Date: Mon, 18 Jul 2016 15:27:54 +0200
Message-ID: <CABkgnnVTpDLAb8O_gttjVWEWfTRy=UpbM2hgkeSMxdcPsdJvpg@mail.gmail.com>
To: HTTP Working Group <ietf-http-wg@w3.org>
Content-Type: text/plain; charset="UTF-8"
Received-SPF: pass client-ip=209.85.220.172; envelope-from=martin.thomson@gmail.com; helo=mail-qk0-f172.google.com
X-W3C-Hub-Spam-Status: No, score=-7.9
X-W3C-Hub-Spam-Report: AWL=1.831, BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, W3C_AA=-1, W3C_DB=-1, W3C_IRA=-1, W3C_IRR=-3, W3C_WL=-1
X-W3C-Scan-Sig: maggie.w3.org 1bP8ar-0006kF-68 b7d41c6fcd7dc4c925dd256e2bb5a829
X-Original-To: ietf-http-wg@w3.org
Subject: Calculation of the hash width in cache-digests
Archived-At: <http://www.w3.org/mid/CABkgnnVTpDLAb8O_gttjVWEWfTRy=UpbM2hgkeSMxdcPsdJvpg@mail.gmail.com>
Resent-From: ietf-http-wg@w3.org
X-Mailing-List: <ietf-http-wg@w3.org> archive/latest/32006
X-Loop: ietf-http-wg@w3.org
Resent-Sender: ietf-http-wg-request@w3.org
Precedence: list
List-Id: <ietf-http-wg.w3.org>
List-Help: <http://www.w3.org/Mail/>
List-Post: <mailto:ietf-http-wg@w3.org>
List-Unsubscribe: <mailto:ietf-http-wg-request@w3.org?subject=unsubscribe>

The draft describes how to select P to achieve a particular false
positive probability.  The net effect of this is that you get
approximately the probability.

I think that this would be better if there was a description of how to
decide on an upper bound for the false positive probability.

The calculation is a little less clean:

n is the number of resources
p is the lower bound of the probability of false positive

P is the number of fixed bits allocated to each entry
N+P is the width of the hash output

All of these are integers, except p, which is a number between 0 and 1.

The current draft chooses:

P = ??(0 - log2(p)) - it's not clear, but I imagine that this uses floor()
N = round(log2(n))

I propose that we instead do:

P = round(0 - log2(p))
N+P = ceil(log2(n)-log2(p))

This is, I think, a much better calculation.  We would still benefit
from encoding N and P separately.