Re: [Cfrg] E-521 vs. numsp512t1

Andy Lutomirski <luto@amacapital.net> Thu, 23 October 2014 17:53 UTC

In-Reply-To: <5449359C.10105@shiftleft.org>
References: <20141022213447.20218.qmail@cr.yp.to> <CAA7UWsXmo_H4vYVzfPdjP3xzgyHvCcwvQfP==OZi1P5Wvn-Qvw@mail.gmail.com> <20141022234258.GA29823@LK-Perkele-VII> <5449359C.10105@shiftleft.org>
From: Andy Lutomirski <luto@amacapital.net>
Date: Thu, 23 Oct 2014 10:51:30 -0700
Message-ID: <CALCETrVRC5scuKPZeqSj9nCZ+tC3DrnWzTWpe6r9O+X1QvgYWw@mail.gmail.com>
To: Mike Hamburg <mike@shiftleft.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: http://mailarchive.ietf.org/arch/msg/cfrg/qBulmmKVmS31f8SZnXF2IvS7f6s
Cc: "cfrg@irtf.org" <cfrg@irtf.org>, "D. J. Bernstein" <djb@cr.yp.to>
Subject: Re: [Cfrg] E-521 vs. numsp512t1
List-Id: Crypto Forum Research Group <cfrg.irtf.org>

On Thu, Oct 23, 2014 at 10:06 AM, Mike Hamburg <mike@shiftleft.org> wrote:
>
>
> On 10/22/2014 04:42 PM, Ilari Liusvaara wrote:
>>
>> On Wed, Oct 22, 2014 at 07:19:44PM -0400, David Leon Gil wrote:
>>>
>>> On Wed, Oct 22, 2014 at 5:34 PM, D. J. Bernstein <djb@cr.yp.to> wrote:
>>>>
>>>> Rob Granger and Mike Scott have posted a new paper "Faster ECC over
>>>> \F_{2^521-1}" (https://eprint.iacr.org/2014/852) reporting ECC speeds
>>>> mod 2^521-1, and in particular the first (as far as I know) serious
>>>> implementation of E-521.
>>>
>>> The implementation djb mentions is available on their website:
>>>
>>> http://indigo.ie/~mscott/{ed521,ws521}.cpp
>>>
>>
>> Watch out (from ed521.cpp):
>>
>>
>> void mul(int *w,ECp *P)
>> {
>>         ECp W[33],Q;
>>         precomp(P,W);
>>
>>         copy(&W[w[86]],P);
>>         for (int i=85;i>=0;i--)
>>         {
>>                 if (w[i]>=0) copy(&W[w[i]],&Q);
>>                 else         neg(&W[-w[i]],&Q);
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>                 window(&Q,P);
>>         }
>>         norm(P);
>> }
>>
>>
>> That does not look constant-time...
>>
>>
>> -Ilari
>>
> Also, on curves at moderncrypto dot org, Samuel Neves is reporting (and I
> can confirm) that the performance numbers do not account for TurboBoost.  He
> measured a slower but still impressive ~884kcy.
>
> That said, the Granger-Scott implementation does not take advantage of
> vectorization or assembly optimizations, even to the degree that Goldilocks
> does (asm wide multiply and accumulate, mostly there to constrain the
> scheduler and register allocator).  It would be interesting to check its
> performance with more optimization.
>
> On a related note, one of the numbers I reported for Goldilocks (~480k
> Haswell cycles) also did not account for TurboBoost.  DJB's Titan0 SUPERCOP
> measurement of 529kcy on Haswell is accurate.  It turns out that on Ubuntu
> at least (Linux 3.13.0-29), disabling HyperThreading re-enables TurboBoost,
> so be careful what order you do it in.
>

I haven't tried it for these types of tests, but something like:

$ perf stat ./whatever_benchmark

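will count cycles directly.  The hardware cycle counter ticks with the
core clock, so the count should not depend on whether TurboBoost is
active.  As long as the benchmark isn't too memory-heavy and runs
enough loops to wash out the startup time, this might be a much
simpler way to do this kind of benchmarking.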

--Andy