Re: [Cfrg] Point format endian (was: Adoption of draft-ladd-spake2 as a RG document)

"D. J. Bernstein" <> Wed, 28 January 2015 02:29 UTC

Peter Gutmann writes:
> Looking at BN_bn2bin()/BN_bin2bn() from one
> widely-used library that powers half the Internet (and mobile phone market),
> it only supports the big-endian format.

To evaluate this "support" claim, let's try writing actual code using
BN_bn2bin() to print a 256-bit x-coordinate as a 32-byte binary string:

   BIGNUM *x = 0;
   unsigned char xstr[32];
   /* ... computation of x omitted ... */
   BN_bn2bin(x,xstr);
   fwrite(xstr,1,sizeof xstr,stdout);

Let's assume that the (omitted) code computing x guarantees that x is
between 0 and 2^256-1 so that there isn't a buffer overflow here; assume
that the fwrite() doesn't fail; etc.

This code will pass typical test vectors. However, if you run thousands
of random tests under valgrind etc., you'll see that it is simply
wrong: for integers x below 2^248 it fails to write anything to the
last byte of xstr. The number of bytes touched by BN_bn2bin() is the
minimum number of bytes _needed_ for x, whereas what we want is exactly
32 bytes.
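The failure mode can be reproduced without OpenSSL. Here is a minimal
sketch (plain C, not the BN API, with a toy 64-bit value) of what a
correct fixed-width big-endian encoder has to do that a variable-size
bn2bin-style routine does not: zero-fill the buffer and anchor the
value at the _end_:

```c
#include <stdint.h>
#include <string.h>

/* Sketch: fixed-width big-endian encoding of a (toy, 64-bit) value
   into a 32-byte buffer. A variable-size encoder in the style of
   BN_bn2bin() would write only the bytes the value needs, starting
   at the front, and leave the rest of the buffer untouched. */
static void u64_to_be32(uint64_t v, unsigned char out[32])
{
    memset(out, 0, 32);       /* the zero-fill that BN_bn2bin() skips */
    for (int i = 0; i < 8; ++i)
        out[31 - i] = (unsigned char)(v >> (8 * i));  /* value at the end */
}
```

For v = 0x0102 this writes 0x01 to out[30] and 0x02 to out[31], with
the remaining 30 bytes explicit zeros; a minimal encoder would instead
have written just two bytes at the front of the buffer.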

Second try. Let's clear xstr before calling BN_bn2bin():

   BIGNUM *x = 0;
   unsigned char xstr[32];
   /* ... computation of x omitted ... */
   memset(xstr,0,sizeof xstr);
   BN_bn2bin(x,xstr);
   fwrite(xstr,1,sizeof xstr,stdout);

This now passes memory tests---but it's still wrong, as sufficiently
comprehensive correctness tests will show. The problem is again with
the occasional integers x below 2^248: there's a difference between

   * the variable-size "big-endian" BN_bn2bin() format, aligning the
     first byte of output with the most significant byte used in x; and

   * the intended constant-size "big-endian" format, aligning 32 bytes
     of output with 32 bytes in x, no matter which bytes of x are 0.

Notice that this discrepancy is very much tied to the use of big-endian:
if the BN functions and target format had been little-endian then this
code _would_ have worked.
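A quick way to see this: in a little-endian layout the minimal encoding
and the fixed-width encoding start at the same address, so the offset
arithmetic simply disappears. A sketch (again plain C, not the BN API,
with a toy 64-bit value):

```c
#include <stdint.h>
#include <string.h>

/* Sketch: little-endian encoding into a 32-byte buffer. The minimal
   bytes of v land at out[0], out[1], ... -- exactly where the
   fixed-width format wants them -- so small values need no special
   handling beyond zeroing the buffer first. */
static void u64_to_le32(uint64_t v, unsigned char out[32])
{
    memset(out, 0, 32);
    int i = 0;
    while (v) {
        out[i++] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}
```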

If you're thinking "Wait a minute, isn't there a symmetry between
little-endian and big-endian?": Yes, there is, but this is tied to the
symmetry between the convention of pointing to arrays via their _first_
element and the convention of pointing to arrays via their _last_
element. In the real world, machine languages and assembly languages and
C and so on have all converged upon the first convention by default, and
trying to fight this is silly.
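Concretely, the symmetry looks like this (a sketch; buf and last are
illustrative names): a big-endian buffer indexed backwards from a
pointer to its last element reads exactly like a little-endian buffer
indexed forwards from a pointer to its first element:

```c
/* Sketch of the symmetry: byte i (the coefficient of 256^i) of a
   value stored big-endian in buf[0..31] is last[-i], where last
   points at the final element -- the mirror image of the usual
   little-endian buf[i] with a pointer to the first element. */
static unsigned char byte_of_be32(const unsigned char buf[32], int i)
{
    const unsigned char *last = buf + 31;   /* last-element convention */
    return last[-i];
}
```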

Third try, now that we understand how severe the mismatch is between the
OpenSSL API and the target. Let's call an auxiliary OpenSSL function so
that we can figure out where we're supposed to put the first byte of x:

   BIGNUM *x = 0;
   unsigned char xstr[32];
   /* ... computation of x omitted ... */
   memset(xstr,0,sizeof xstr);
   BN_bn2bin(x,xstr + 32 - BN_num_bytes(x));
   fwrite(xstr,1,sizeof xstr,stdout);

This always works (I think---I'm not an OpenSSL expert), and it seems to
be what Peter is alluding to when he tells us that OpenSSL "supports the
big-endian format". For comparison, here's code that uses exactly the
same OpenSSL functions but reverses the order of output bytes:

   BIGNUM *x = 0;
   unsigned char xstr[32];
   int i;
   /* ... computation of x omitted ... */
   memset(xstr,0,sizeof xstr);
   BN_bn2bin(x,xstr + 32 - BN_num_bytes(x));
   for (i = 31;i >= 0;--i) putchar(xstr[i]);

For some reason Peter doesn't call this "support" for little-endian. I
suppose he could argue that this code is marginally more complex than
the previous code; but calling this a lack of "support"---and claiming
that it's a serious argument for choosing one format over another---is
quite a severe exaggeration of a tiny code change.

The difference between little-endian and big-endian is much more
noticeable for people doing serious size optimization of crypto code
(e.g., complete crypto libraries in <20KB of compiled code). These
people would never tolerate the ludicrously bloated code shown above:
it's crazy to use three functions to create a 32-byte copy of something
that should already be sitting in memory in the first place. These
people are happiest when the CPU, via the ABI, has easy instructions to
handle the I/O format---and at the CPU+ABI levels it's clear that the
religious war is over:

   * The big-endian CPUs have been killed by the little-endian CPUs and
     the agnostic CPUs.

   * For agnostic CPUs the big-endian ABIs are rapidly being replaced by
     little-endian ABIs.
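To make the size argument concrete: on a little-endian CPU+ABI, a field
element stored as little-endian limbs needs no conversion code at all.
A sketch (the fe256 layout here is hypothetical, not OpenSSL's, and
assumes a little-endian ABI for the memcpy to coincide with the wire
format):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical field-element layout: four 64-bit limbs, least
   significant limb first. On a little-endian CPU+ABI the in-memory
   bytes of the limbs already *are* the 32-byte little-endian wire
   format, so serialization is a single memcpy. */
typedef struct { uint64_t limb[4]; } fe256;

static void fe256_to_le32(const fe256 *x, unsigned char out[32])
{
    memcpy(out, x->limb, 32);   /* no per-byte shuffling, no BN calls */
}
```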

I'm not saying that the resulting code-size benefit is very large; I'm
just saying that it outweighs the minuscule benefits of what some people
here are calling "tradition".