Re: [Trans] What's the load on a CT log?

Daniel Kahn Gillmor <dkg@fifthhorseman.net> Thu, 13 March 2014 16:31 UTC


On 03/13/2014 12:06 PM, Ben Laurie wrote:
> So, total average load is 3 * b * w / l ~ 20,000 web fetches per
> second. 

This part i follow (you're switching temporal units between months,
years, and seconds, but i get roughly the same final figures).

> If we optimise the API we can get that down to 7,000 qps. Each
> query (in the optimised case) would be around 3 kB, 

And i agree this seems like a win.  Why was the API originally split
into three parts instead of returning the complete proof in one query?
What (other than conceptual cleanliness) might we lose by creating the
optimized API?

> which gives a bandwidth of around 150 kb/s.

This looks off by a few orders of magnitude to me.  7 kqps at 3 kB per
query gives me 7000*3000*8 bits per second, which is 168 Mbps.  Am i
missing something?
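
For anyone who wants to check the arithmetic, here is a minimal sketch
in Python.  The function and variable names are just illustrative, and
it assumes 1 kB = 1000 bytes and that "kb/s" in the quoted figure means
kilobits per second:

```python
# Sanity check of the bandwidth figures quoted in this thread.
# Assumptions: 1 kB = 1000 bytes; "150 kb/s" means 150 kilobits/second.
BITS_PER_BYTE = 8


def bandwidth_bits_per_second(queries_per_second, bytes_per_query):
    """Aggregate bandwidth in bits/second for a given query load."""
    return queries_per_second * bytes_per_query * BITS_PER_BYTE


optimised_qps = 7_000      # "7,000 qps" with the optimised API
bytes_per_query = 3_000    # "around 3 kB" per query

bps = bandwidth_bits_per_second(optimised_qps, bytes_per_query)
quoted_bps = 150_000       # the "150 kb/s" figure from the quoted mail
ratio = bps / quoted_bps

print(f"{bps / 1e6:.0f} Mbit/s")              # 168 Mbit/s
print(f"{ratio:.0f}x the quoted 150 kb/s")    # 1120x
```

Under those assumptions the gap is a factor of about 1120, i.e. roughly
three orders of magnitude.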

Should we be considering swarm-based distribution of this kind of data,
or hierarchical proxying for load distribution?

	--dkg