[Cfrg] 2^389-21 vs 2^414-17 vs 2^448-2^224-1

Watson Ladd <watsonbladd@gmail.com> Tue, 16 December 2014 02:56 UTC

Return-Path: <watsonbladd@gmail.com>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1A4B41A028A for <cfrg@ietfa.amsl.com>; Mon, 15 Dec 2014 18:56:25 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.7
X-Spam-Level:
X-Spam-Status: No, score=0.7 tagged_above=-999 required=5 tests=[BAYES_50=0.8, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id pD6tDCZwOPN1 for <cfrg@ietfa.amsl.com>; Mon, 15 Dec 2014 18:56:22 -0800 (PST)
Received: from mail-yh0-x231.google.com (mail-yh0-x231.google.com [IPv6:2607:f8b0:4002:c01::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A9D621AC3E6 for <cfrg@irtf.org>; Mon, 15 Dec 2014 18:56:22 -0800 (PST)
Received: by mail-yh0-f49.google.com with SMTP id f10so5770822yha.36 for <cfrg@irtf.org>; Mon, 15 Dec 2014 18:56:21 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=4ZYvi11+Efi0SGSa6SpejRrXKiixBz32mUo1jvwOz5s=; b=WrBXj4Qrqd8dDDCDNwBu+RkxqHOqeE8v3f2vX91MbRoLQPrzQpcY0BCgBJ4fn+i00F 3gNsqP/lfD2mROcsYXi2BXDlec8WZzX68pp7yd3XZTWd05n5xMkw75a60toFJiE0H3i3 nFUv/v4Nogx+/t+xDZXWlyfw/dvuYwKVrpGU9aq56pQsNM0N5sKWRc5GygTCRskTYNDm zO1sErVxiDFlvRIWhqQ7hcA5qatuSEfb0hRPNAusB5IBYyhE058Rt/d4gAxK2OVR6squ BgQKN0X/4kR0vf0uqxCyVbxkCDrdSxZTQ1Y8reYpE1nUSw8jNvFOaYEWlIk+5CcKq8ud OJWg==
MIME-Version: 1.0
X-Received: by 10.170.185.145 with SMTP id b139mr27885479yke.58.1418698581805; Mon, 15 Dec 2014 18:56:21 -0800 (PST)
Received: by 10.170.195.21 with HTTP; Mon, 15 Dec 2014 18:56:21 -0800 (PST)
Date: Mon, 15 Dec 2014 18:56:21 -0800
Message-ID: <CACsn0cm=f3pJQXybLsMVT5cozrZkJv9N4GFT1m2uychzPvsAeg@mail.gmail.com>
From: Watson Ladd <watsonbladd@gmail.com>
To: "cfrg@irtf.org" <cfrg@irtf.org>
Content-Type: text/plain; charset=UTF-8
Archived-At: http://mailarchive.ietf.org/arch/msg/cfrg/hdY11VA15imF4EnOm6DKa0wCfag
Subject: [Cfrg] 2^389-21 vs 2^414-17 vs 2^448-2^224-1
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Dec 2014 02:56:25 -0000

Dear all,

I've come up with some back of the envelope sketches of the
performance differences for these primes. Note that there aren't hard
measurements yet: I'm only counting limbs for various choices of
radix, and these radix choices might not be themselves optimal. These
counts don't include the cost of reduction, which is at worst an
additional multiplication of each limb by a constant plus an addition
with carry. They don't include additions. In these calculations I've
permitted that addition to overflow, but not the preceding
multiplications. I've also not used signed arithmetic, which can make
things slightly better.

On a 32 bit machine with 32x32->64 multiplier, but no 64 bit adder:

2^389-21 can use 15 26 bit words. The product can be computed without
carries, but the final reduction requires a carry chain. Schoolboy
costs 225 multiplications. Karatsuba will cost 140 mults. The best I
can do without using a hardware carry is a much worse 18 23 bit words,
which will cost 198 multiplies with two uses of Karatsuba. There may
be a better way, and I didn't investigate overflow/addition tradeoff.

2^414-17 as described in http://eprint.iacr.org/2014/526.pdf uses 16
franctional radix words, and two stages of Karatsuba for 144 mults.

2^448-2^224-1 uses a special method for multiplication which I don't
have multiplication counts for. I believe 16 28 bit words is used in
the current implementation, and so 144 multiplications.

On a 64 bit machine with 64x64->128 multiplier:

2^389-21 can use 7 56 bit limbs. Here there is plenty of headroom, and
so no carrying in the reduction is required.

2^414-17 could use 8 56 bit limbs, with no carrying in the reduction
required. It could use 7 60 bit limbs, if the hardware carry is used
in the reduction.

2^448-2^224-1 is clearly using 8 56 bit limbs: it matches exactly.

At these sizes schoolboy is the right choice: 49 vs 64 multiplications.

Please double-check this. I'll try to verify this in more detail,
including implementing a few strategies. I know that the results I've
listed for some 2^414-17 and 2^448-2^224-1 are not optimal, and would
appreciate correction.

The tentative conclusion is that the performance advantages of
2^389-21 vs 2^448-2^224-1 are likely to be minimal on 32-bit machines,
and about 10% on 64 bit machines. As always, the believable results
require implementations, which I don't think I will make in time for
our deadline.

Sincerely,
Watson Ladd