Re: [Cfrg] Publicly verifiable benchmarks

"D. J. Bernstein" <djb@cr.yp.to> Mon, 13 October 2014 11:36 UTC

More notes on how SUPERCOP documents various things.

Andrey Jivsov writes:
> Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz, Ivy Bridge (not Haswell).

http://bench.cr.yp.to/computers.html lists four "IB+AES" machines
(Ivy Bridge is actually two different microarchitectures: Core i3 CPUs
don't have AES instructions), namely hydra8, ares, khazaddum, and h9ivy.

The last column of the table shows that the latest reports contributed
from ares and khazaddum are from 2013 and 2012, which is why they're
marked in gray; hydra8 and h9ivy are reasonably up to date. If you then
check, for example,

   http://bench.cr.yp.to/results-dh.html#amd64-hydra8

you see that crypto_dh/curve25519 takes 182544 cycles (quartiles: 182424
and 182664) on hydra8. As I mentioned, this X25519 code was actually
written years ago (optimized for Nehalem), so you can also find the same
speed in the 2013 report from ares, but in general it's good to focus on
the up-to-date speed reports.
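
For readers who want to summarize their own measurements in the same
form, here is a minimal sketch that reduces a list of cycle counts to a
lower quartile, a median, and an upper quartile. The function name, the
sample values, and the "inclusive" quartile rule are assumptions made
for illustration; this is not SUPERCOP's code, and SUPERCOP's exact
quartile rule may differ.

```python
import statistics

def summarize_cycles(samples):
    """Return (lower quartile, median, upper quartile) of cycle counts."""
    if len(samples) < 2:
        raise ValueError("need at least two measurements")
    # method="inclusive" treats the samples as the whole population.
    # This choice is an assumption, not SUPERCOP's documented rule.
    q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return q1, median, q3

# Hypothetical measurements, made up for illustration only.
samples = [1010, 1000, 1030, 1020, 1040]
print(summarize_cycles(samples))
```

Reporting the two quartiles next to the median gives a quick sense of
how stable a measurement is without assuming any particular noise
distribution.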

Parkinson, Sean writes:
> Is there a REST API to the benchmark data?

The underlying compressed database is hundreds of gigabytes, creating
serious performance problems for most standard tools. We're working on
supporting more client-side manipulation of data, but for the moment the
best way to get additional reports onto the web pages is to talk to us.

David Jacobson writes:
> It would be nice if you tagged implementations (of algorithms where it
> matters) according to leakage resistance.

Yes. Some implementors advertise constant-time software, but right now
SUPERCOP doesn't provide a structured mechanism for this advertisement.
One of the difficulties here is that some people are more stringent than
others in what they mean by "constant time"; I'll write a separate
message about this.

Michael Hamburg writes:
> some of the machines on bench.cr.yp.to have quirks
> (turboboost, mismatched cycle counter frequency, etc) which can make
> the data difficult to interpret and reproduce.

Turbo Boost is noted as "boost" in red (as is Turbo Core), and is also
easy to spot from the unusually large gaps between the quartiles. See,
e.g., http://bench.cr.yp.to/results-dh.html#amd64-hydra3.
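
One rough way to quantify "unusually large gaps" is to express the
distance between the quartiles as a fraction of the middle figure. The
sketch below does that for the hydra8 numbers quoted earlier in this
message, treating 182544 as the median (an assumption). The function
names and the 5% cutoff are hypothetical, chosen only for illustration;
SUPERCOP does not define such a threshold.

```python
def relative_quartile_gap(q1, median, q3):
    """Gap between the quartiles as a fraction of the median."""
    if median <= 0:
        raise ValueError("median must be positive")
    return (q3 - q1) / median

def looks_noisy(q1, median, q3, threshold=0.05):
    """Flag a measurement whose quartile gap exceeds the cutoff.

    The default threshold is a hypothetical value, not a SUPERCOP rule.
    """
    return relative_quartile_gap(q1, median, q3) > threshold

# hydra8, crypto_dh/curve25519: quartiles 182424 and 182664 around 182544.
gap = relative_quartile_gap(182424, 182544, 182664)
print(f"relative gap: {gap:.4%}")
print("noisy" if looks_noisy(182424, 182544, 182664) else "stable")
```

A gap of this size (240 cycles out of 182544) is far below any
plausible cutoff; a machine with frequency scaling active would be
expected to show a much wider spread.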

The SUPERCOP documentation has a "Reducing randomness in benchmarks"
section that tells people how to turn off hyperthreading and Turbo
Boost. Eventually we'd like to measure what the actual Turbo Boost
speedup is, but this isn't as easy as it sounds.

Some CPUs don't give the OS full control over clock frequencies. Of
course, clock frequency makes far less of a difference in cycle counts
than it makes in other metrics such as operations per second, but it
does sometimes make a noticeable difference, especially for code that
doesn't fit into L1 cache. Clicking on machine names shows pages such as

   http://bench.cr.yp.to/web-impl/armeabi-h7green-crypto_dh.html

with a "CPU cycles/second" line showing the range of frequencies
observed by SUPERCOP (highly variable for h7green---this particular
Cortex-A9 CPU seems quite hard to control), along with information about
which cycle counter SUPERCOP is using (in this case cpucycles/cortex.c).
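
To illustrate why cycle counts are less sensitive to clock frequency
than operations per second are, here is a small sketch of the
conversion: operations per second = clock frequency / cycles per
operation. The cycle count is the hydra8 figure quoted above; the two
clock frequencies are hypothetical and are not taken from any
bench.cr.yp.to machine page.

```python
def operations_per_second(cycles_per_op, clock_hz):
    """Convert a cycle count into throughput at a given clock frequency."""
    if cycles_per_op <= 0 or clock_hz <= 0:
        raise ValueError("cycles_per_op and clock_hz must be positive")
    return clock_hz / cycles_per_op

CYCLES = 182544  # crypto_dh/curve25519 on hydra8, from this message

# Hypothetical clock frequencies, for illustration only.
for ghz in (1.6, 3.3):
    rate = operations_per_second(CYCLES, ghz * 1e9)
    print(f"{ghz} GHz: about {rate:.0f} operations/second")
```

With the cycle count held fixed, throughput scales directly with the
clock. In practice the cycle count can itself shift somewhat with
frequency, for example when code doesn't fit into L1 cache.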

> You should also make sure to install the compilers DJB uses (GCC 4.8.1
> and Clang 3.2 on titan0, or GCC 4.6.3 and Clang 3.0 on h6sandy, for
> example) to make sure that your system compiles and runs passably well
> using those compilers.

The compiler versions are noted parenthetically in the same machine
pages mentioned above. But they do change every now and then when
systems are upgraded (for example, titan0 just switched to gcc 4.8.2),
and when people contribute more machines they usually have different
compilers. The real-world situation is that people use many different
compilers, and it isn't safe to assume that those compilers all behave
the same way. Of course, asm code produces much less variability.

---Dan