[Cfrg] Primes vs. hardware side channels

"D. J. Bernstein" <djb@cr.yp.to> Fri, 17 October 2014 00:55 UTC

Archived-At: http://mailarchive.ietf.org/arch/msg/cfrg/2VNzcCSEIHgJPLLJ4EIsHRDES9k

The "special primes are harder to secure than general primes" claims are
almost completely backwards.

The correct question is what level of security can be achieved within
the user's performance budget. Whether one counts hardware transistors,
FPGA slices, AVR cycles, ARM cycles, or Intel cycles, the simple fact
is that special primes are about half the cost of general primes (they
almost eliminate reduction cost, which is about half the cost of mulmod
and more than half the cost of a separate sqmod), leaving far more room
for adding defenses against side-channel attacks.
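
As an illustration of why reduction nearly disappears for a special
prime, here is a minimal Python sketch for p = 2^255 - 19 (the
Curve25519 prime). The function name is hypothetical, and the code shows
only the arithmetic: Python integers are not constant-time, so this is
not a side-channel-safe implementation.

```python
import random

P25519 = 2**255 - 19  # special prime: 2**255 is congruent to 19 mod p

def reduce_p25519(x: int) -> int:
    """Reduce 0 <= x < 2**510 modulo 2**255 - 19 by folding.

    Illustrative only: shows the arithmetic, not a constant-time routine.
    """
    mask = (1 << 255) - 1
    # Fold the high part down: hi * 2**255 is congruent to hi * 19.
    x = (x & mask) + 19 * (x >> 255)   # now x < 20 * 2**255
    x = (x & mask) + 19 * (x >> 255)   # now x < 2**255 + 19 * 20
    # One conditional subtraction finishes the job.
    if x >= P25519:
        x -= P25519
    return x

# Compare against generic reduction on a random product.
a, b = random.randrange(P25519), random.randrange(P25519)
assert reduce_p25519(a * b) == (a * b) % P25519
```

The whole reduction is two shift-and-add folds with a multiplication by
the small constant 19. A general prime instead needs something like
Montgomery or Barrett reduction, which costs on the order of another
full-size multiplication.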

In the case of Brainpool's 256-bit curve, the defense under discussion
consists of obscuring the bits of a 256-bit scalar s by adding a small
multiple of the 256-bit group order to s, typically a 32-bit multiple.
This seems safe if the bits are further obscured by a considerable level
of physical noise. However, at smaller but still plausible noise levels
it is breakable, as illustrated by the recent paper by Schindler and
Wiemers.
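
The blinding step described above can be sketched as follows. To stay
self-contained, the sketch uses a toy group (the order-1019 subgroup of
the integers mod 2039) as a stand-in for an elliptic-curve group; the
names P, Q, G, scalar_mult and blind are all assumptions made for this
illustration.

```python
import secrets

# Toy stand-in for a curve group: the subgroup of order Q in (Z/P)^*.
# P = 2*Q + 1 with P and Q prime; G = 4 is a square, so it has order Q.
P, Q, G = 2039, 1019, 4

def scalar_mult(k: int) -> int:
    """Analogue of computing k*B on a curve: here, G**k mod P."""
    return pow(G, k, P)

def blind(s: int, order: int, blind_bits: int) -> int:
    """Return s + r*order for a fresh random r of at most blind_bits bits."""
    r = secrets.randbits(blind_bits)
    return s + r * order

s = 777
s_blinded = blind(s, Q, 32)
# The blinded scalar is a different integer on each call, but it is
# congruent to s modulo the group order, so the result is unchanged.
assert scalar_mult(s_blinded) == scalar_mult(s)
```

With a 256-bit scalar and a 32-bit multiple of a 256-bit order, the
blinded scalar has up to 288 bits, which is where the extra cost of the
defense comes from.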

This not-very-secure 288-bit (256 + 32) Brainpool scalar multiplication
is roughly as expensive as a 576-bit (256 + 320) scalar multiplication
on NIST P-256, using a 320-bit multiple instead of a 32-bit multiple.
Nobody knows how to break a 320-bit multiple at the same plausible noise
levels: the attacker needs the noise levels to be considerably lower.
Less expensive curves,
such as Curve25519, make room for even stronger side-channel defenses.
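
The comparison above is simple arithmetic, sketched here under two
stated assumptions: the cost of a scalar multiplication grows linearly
with the blinded scalar length, and a special prime halves the cost per
scalar bit (the "about half" figure from earlier in this message). The
function name and the unit of cost are hypothetical.

```python
def scalar_mult_cost(scalar_bits: int, blind_bits: int,
                     cost_per_bit: float) -> float:
    """Rough cost model: (scalar length + blinding length) * per-bit cost."""
    return (scalar_bits + blind_bits) * cost_per_bit

# General prime (Brainpool): unit cost per bit, 32-bit blinding.
brainpool_cost = scalar_mult_cost(256, 32, 1.0)
# Special prime (NIST P-256): assumed half the per-bit cost, 320-bit blinding.
p256_cost = scalar_mult_cost(256, 320, 0.5)

# Same budget, ten times as many blinding bits.
assert brainpool_cost == p256_cost == 288.0
```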

It's easy to see, and well known, that for NIST P-256 etc. there are
some scalar bits that wouldn't be obscured in this way if the multiple
were smaller than about 128 bits. But such small multiples are at levels
of performance that can't be achieved _at all_ by the Brainpool curves,
so talking about this as a Brainpool security advantage makes no sense.
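
The unobscured bits can be seen directly from the shape of the P-256
group order n, which is 2^256 - 2^224 + 2^192 - e for some e < 2^127.
The following sketch checks one consequence for a 32-bit multiple r:
bits 160..191 of r*n are always all ones, so in that window the blinded
scalar differs from s by at most a carry. The helper name window is an
assumption of this sketch, and the choice of this particular window is
just one convenient example.

```python
import secrets

# Order of the NIST P-256 base point.
N_P256 = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

def window(x: int) -> int:
    """Extract bits 160..191 of x as a 32-bit integer."""
    return (x >> 160) & 0xFFFFFFFF

for _ in range(1000):
    r = secrets.randbits(32) | 1      # nonzero 32-bit multiple
    s = secrets.randbelow(N_P256)     # secret scalar
    # The mask contributed by r*n in this window does not depend on r.
    assert window(r * N_P256) == 0xFFFFFFFF
    # Adding all ones is subtracting 1; a carry from below may add 1 back.
    w = window(s + r * N_P256)
    assert w in (window(s), (window(s) - 1) % 2**32)
```

A group order without this structure, as on the Brainpool curves, gives
no such constant window, but as noted above that only matters at
blinding sizes far below what special primes can afford.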

The only reason I said "almost completely backwards" rather than
"completely backwards" is that there are a few unusual platforms that
have trouble seeing the cost advantages of special primes. For example,
if someone built typical RSA-acceleration hardware years ago and is
trying to resell it as ECC-acceleration hardware, he'll get

   * bad speeds out of Brainpool,
   * similarly bad speeds out of special primes,
   * worse speeds out of side-channel-protected Brainpool, and
   * even worse speeds out of side-channel-protected special primes.

He won't see that competently designed special-prime hardware meets the
user's performance goals with much stronger side-channel protection at
vastly lower cost. The picture here is not that special primes are
harder to secure than general primes; the picture is that special primes
are easier to secure than general primes, except on obsolete niche
hardware.

The new Brainpool paper http://eprint.iacr.org/2014/832 incorrectly
states that a "special shape of the prime does not improve performance
in hardware" if the hardware is designed to "support arbitrary primes".
In fact, anyone who can afford the hardware area for "arbitrary primes"
is very close to the hardware area for arbitrary primes _plus_ several
special primes. Mike Hamburg already gave an example last month:

   I have some limited experience in crypto hardware design, in
   particular working with a hardware modulo multiplier.  If I recall
   correctly, adding support for a handful of the NIST primes (192, 224,
   256 and 384 maybe?) cost about 10% area overhead in our lightweight
   design, for double the performance.

The huge speed advantage over Brainpool again turns into an advantage in
side-channel protection for any performance target, again contradicting
the "special primes are harder to secure than general primes" claim.

Of course, for ECC hardware designers who have more serious performance
constraints, supporting "arbitrary primes" is out of the question, and
choosing brainpoolP256t1 is much less pleasant than choosing a special
prime, particularly with a Montgomery curve. The special prime again
ends up with better side-channel protection for any performance target.

To summarize, the strongest hardware designs provide the maximum
side-channel protection by taking advantage of the speed of special
primes. The core Brainpool objection to these stronger hardware designs
is really that they don't match Brainpool's old designs. This is a
transition-cost objection, raising questions such as

   * which hardware we're actually talking about,
   * how much the hardware is used in IETF protocols today,
   * whether the hardware would really have trouble with new curves,
   * how expensive new hardware for the new curves would be,
   * whether this is outweighed by the costs of sticking to old curves,

etc. This shouldn't be misrepresented as a security objection.

---Dan