Re: [Cfrg] Point format endian

Michael Clark <michael@metaparadigm.com> Mon, 26 January 2015 09:37 UTC

Message-ID: <54C60AC7.6000301@metaparadigm.com>
Date: Mon, 26 Jan 2015 17:37:11 +0800
From: Michael Clark <michael@metaparadigm.com>
Organization: Metaparadigm
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: Alyssa Rowan <akr@akr.io>, "cfrg@irtf.org" <cfrg@irtf.org>
References: <BF9DADF6-003F-454D-8E96-4A28A060CA72@isode.com> <B31EEDDDB8ED7E4A93FDF12A4EECD30D40DF8FE3@GLKXM0002V.GREENLNK.net> <04A0462F-0A20-42F3-A404-FDA6A3E5A17A@akr.io> <0bee84ff19938a1a02dca5c422602215.squirrel@www.trepanning.net> <50d4436f6a004409b297e1d8c7e72787@usma1ex-dag1mb2.msg.corp.akamai.com> <54C55C2A.2090104@akr.io>
In-Reply-To: <54C55C2A.2090104@akr.io>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/-PfJvR-WczY_Dhdawonb00YNeLk>
Subject: Re: [Cfrg] Point format endian

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 26/1/15 5:12 am, Alyssa Rowan wrote:
> On 24/01/2015 16:18, Salz, Rich wrote:
> 
>>> [DH] So a long-standing tradition of the on-the-wire format is 
>>> changed because of the way the first curve25519 library was 
>>> written? That's a weak justification.
> 
> So all the running code in every X25519 library out there (there
> are several) should change its on-the-wire format because of the
> way some other earlier wire protocols (with explicitly no relation
> to this one, as Rich mentions) were written? ¬_¬
> 
>> [RS] The IETF should follow its tradition and drop its wire
>> format tradition.
> 
> Agreed. To the extent it matters, there's no justification for 
> changing the X25519 wire format to big-endian at all; it'd just be 
> gratuitous incompatibility with existing libraries for no reason if
> we did. I might not put it quite as sarcastically as djb, but I am
> not a fan of that. Running code wins: and that's little-endian.

Can't we just define point formats with an '_le' suffix?

software still needs to consider endianness, as these libraries are
likely to be ported to big-endian systems (< power8, < sparc9, etc).
nowadays big-endian systems should perhaps pay the bswap price (sadly,
the older systems are also the slower ones). "network byte order"
loses some meaning, so format names should become endian-explicit.

the bignum libraries would typically have limbs in host-endianness,
and it makes sense for scalar growth that they store vectors of limbs
little end first, but they could be either big or little endian host
words packed little word first in memory. it is host dependent.

(example: the value 128)

32-bit limbs on a little-endian system (bytes in memory order):

  (limb0_le32), (limb1_le32), (limb2_le32), (limb3_le32)
   80 00 00 00,  00 00 00 00,  00 00 00 00,  00 00 00 00

32-bit limbs on a big-endian system (bytes in memory order):

  (limb0_be32), (limb1_be32), (limb2_be32), (limb3_be32)
   00 00 00 80,  00 00 00 00,  00 00 00 00,  00 00 00 00
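
To check the two layouts above on a given machine, here is a minimal
sketch. The function names are made up for illustration and are not
from any library. It copies the in-memory bytes of one 32-bit limb,
lowest address first, and probes the host byte order at run time.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise (run-time probe). */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* lowest-addressed byte of the word */
    return first == 1;
}

/* Copies the in-memory bytes of one 32-bit limb, lowest address first. */
static void limb_memory_bytes(uint32_t limb, unsigned char out[4])
{
    memcpy(out, &limb, 4);
}
```

For the limb value 128, the 0x80 byte should land in out[0] on a
little-endian host and in out[3] on a big-endian host.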

the natural composition of little-endian limbs and little limb first
(2nd order bignum endianness) on a little-endian system ends up being
byte-level little-endian bignums, so yes, little-endian representation
is more efficient on modern systems.

on the other hand, with fixed-size bignums (for timing resistance)
where you know the size in advance, it would be possible to use big
word/limb first, but the memory access pattern might be non-optimal.
i suspect most bignum code is designed with little word/limb first for
scalar growth (and fixed size is essentially a subset of variable size).

this however is beside the point. as people indicate, these are
internal implementation details, and an implementation can serialize
and deserialize numbers before and after any computation.
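
As a rough sketch of that serialize/deserialize step, assuming a
256-bit value held as eight 32-bit limbs, least significant limb
first. The limb count and the function names are illustrative
assumptions, not taken from any X25519 library. Using shifts instead
of memcpy keeps the code independent of host byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_LIMBS 8   /* 8 x 32 bits = 256 bits; an assumption for this sketch */

/* Decode 32 little-endian wire bytes into limbs, least significant first. */
static void limbs_from_le_bytes(uint32_t limbs[NUM_LIMBS],
                                const unsigned char in[32])
{
    for (size_t i = 0; i < NUM_LIMBS; i++) {
        limbs[i] = (uint32_t)in[4 * i]
                 | ((uint32_t)in[4 * i + 1] << 8)
                 | ((uint32_t)in[4 * i + 2] << 16)
                 | ((uint32_t)in[4 * i + 3] << 24);
    }
}

/* Encode limbs back into 32 little-endian wire bytes. */
static void limbs_to_le_bytes(unsigned char out[32],
                              const uint32_t limbs[NUM_LIMBS])
{
    for (size_t i = 0; i < NUM_LIMBS; i++) {
        out[4 * i]     = (unsigned char)(limbs[i] & 0xffu);
        out[4 * i + 1] = (unsigned char)((limbs[i] >> 8) & 0xffu);
        out[4 * i + 2] = (unsigned char)((limbs[i] >> 16) & 0xffu);
        out[4 * i + 3] = (unsigned char)((limbs[i] >> 24) & 0xffu);
    }
}
```

On a little-endian host a compiler may well turn each loop body into a
plain 32-bit load or store, which is the "no price" case.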

big-endian still exists; however, it makes a reasonable amount of sense
for a new wire-format to make the (now less common) big-endian systems
pay the bswap price and use little endian on the network.

the other thing to remember is there is also base 2 little endian, or
bit reflection, versus what we call little endian (byte-level little
endian). see {Bit Reflection Peculiarity of GCM}. However in the case
of GCM, this is opaque so is not an issue. Just want to point out
there is another endianness. In bit reflection (or base 2 le) the
above example would be 1, not 128.
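
A small sketch of what bit reflection does to one byte. The function
name reflect8 is hypothetical. It reverses the bit order within the
byte, so the byte 0x80 from the example above reads as 0x01.

```c
#include <stdint.h>

/* Reverse the bit order within one byte: bit 7 <-> bit 0, and so on. */
static uint8_t reflect8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4)); /* swap nibbles */
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2)); /* swap bit pairs */
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1)); /* swap neighbours */
    return b;
}
```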

the swizzle code is already there; it just doesn't have a conditional
on it. If there are _le point formats then all the bswaps will evaluate
to nothing on the systems that are the most common today.
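
A hedged sketch of what such conditional swaps could look like. The
macro and function names (le32_to_host, host_to_le32, swap32) are
illustrative only. __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ are
predefined by GCC and Clang; treating every other compiler as
little-endian is an assumption of this sketch, not something to rely
on.

```c
#include <stdint.h>

/* Portable 32-bit byte swap; optimizing compilers often emit one bswap. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Assumption: if the compiler does not define __BYTE_ORDER__, this
 * sketch falls through to the little-endian case. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define le32_to_host(x) swap32(x)          /* big-endian host pays the swap */
#define host_to_le32(x) swap32(x)
#else
#define le32_to_host(x) ((uint32_t)(x))    /* little-endian host: no-op */
#define host_to_le32(x) ((uint32_t)(x))
#endif
```

The call sites stay identical on both kinds of host; only the macro
body changes.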

X25519 will likely be ported to big-endian systems?

if it isn't "network byte order" then endianness should be explicit in
a point format name i.e. add "_le" suffix. rationale for old names :-D

as developers, we will still need the bswaps in there either way, even
in the little-endian case where the macros will evaluate to nothing,
assuming the code gets ported to < power8, < sparc9, etc.

a shrewd strategy is to send the endianness of your platform and have an
le point format. You pay no price for this strategy unless the other end
is big endian. So make little endian point formats.

endianness is a pain. We need a <stdendian.h> in ISO C17, and
<cstdendian> in C++17. WIP. This still needs to emit BSWAP asm on MSC.

  http://austingroupbugs.net/view.php?id=162
  https://gist.github.com/michaeljclark/3b4fd912f6fa8bb598b3

endianness will need to be dealt with for some time to come,
especially in network code where everything is still "network byte order"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlTGCscACgkQa/HXs1fvPk8vAwCgpKjRExM/R3MmPBTvH/CUJAH7
iCEAoIZQhNum36Wn/Bnglrl2CQCGzWOk
=bc3q
-----END PGP SIGNATURE-----