Re: [Cfrg] Point format endian (was: Adoption of draft-ladd-spake2 as a RG document)

Watson Ladd <watsonbladd@gmail.com> Wed, 28 January 2015 06:02 UTC


On Tue, Jan 27, 2015 at 8:47 PM, Dan Harkins <dharkins@lounge.org> wrote:
>
>
> On Tue, January 27, 2015 5:49 pm, Watson Ladd wrote:
>> On Jan 27, 2015 9:30 AM, "Dan Harkins" <dharkins@lounge.org> wrote:
>>>
>>> On Tue, January 27, 2015 9:07 am, Watson Ladd wrote:
>>> >
>>> > My SPAKE2 draft contains specified M&N, generated by a C program
>>> > Nathan McCullum sent me. I've been unable to determine what that
>>> > program does in anything more than the vaguest terms because OpenSSL
>>> > internals are opaque, but users do not need to generate their own
>>> > points.
>>>
>>>   Yes, it's opaque to you, the application writer. That's the point! But
>>> to have curve25519 as a special case then you'd have to pry into the
>>> opacity of the code and figure out what's going on.
>>>
>>>   So to allow Nathan McCullum to send you a chunk of code that
>>> should work with any curve supported by OpenSSL, and have it just
>>> work without you knowing any opaque internals, it requires a
>>> canonical conversion of bitstring to integer to field element and
>>> back.
>>>
>>>   If, as you say, users do not need to generate their own points then
>>> they're gonna have to have a registry of M and N for all curves. And
>>> to import a bitstring (for instance from the appendix of your draft)
>>> into code requires converting that bitstring into an element. And
>>> without the canonical conversion then you need to know about the
>>> opaque internals.
>>
>> What makes you think I wouldn't write the point appropriately? More
>> importantly, if this encoding was different for each curve, how would I
>> notice, except when I generate the table? Every point has a representation
>> either way.
>
>   You don't understand. It's not that you don't write them appropriately.
> It's that you are writing them big endian because you are interpreting the
> output of the appropriate hash function as a big endian number.
>
>   Your draft claims "[t]he points are presented in hexidecimal SEC1 format."
> If someone wants to implement your draft he or she would have to convert
> the bit strings in table 3 into points on the elliptic curve.
>
>   So check out how one converts an octet string into an elliptic curve
> point in SEC1. It's section 2.3.4 and that will take you eventually to
> section 2.3.8 (octet string to integer conversion) and that is quite
> obviously a big endian conversion.
>
>> I can always do the calculation in PARI and reverse and pad the result. In
>> fact, PARI only uses decimal IO so I have to convert anyway. I'm not
>> insisting on decimal format to make my tools work.
>>
>> There is an encoding of x coordinates as 32 byte strings. The intent is
>> that this encoding is used for everything having to do with Curve25519.
>> Why is this such a large problem? At least with the Weierstrass
>> coordinate on the wire proposal there was a clear rationale for the
>> change in terms of compatibility. But this has nothing to do with
>> compatibility: you will need to write new code anyway.
>
>  The point is, your draft lists M and N for p256, p384, and p521 and it uses
> big endian format exclusively. If one were to try to use curve25519 to
> compute M and N it would have to interpret the output of the hash
> function differently. So therefore someone who implements your draft
> has to know about the special case of curve25519. He has to know not
> to do what he'd do for every other curve.
>
>> Of course, it's not the case that there is a single universal key format.
>> SEC1 serializes bignums as fixed-width in keys and variable width in
>> signatures. PGP uses a different length encoding from TLS. Nor do all
>> algorithms on words take the words the same way: MD5 uses a little-endian
>> length encoding, and encodes the low byte first. SHA1 uses a big-endian
>> length encoding and big-endian words.
>
>   No there is, and it's big endian. MD5 and SHA1 are hash algorithms and
> their output is a digest, not a number. The important thing to realize
> though is that if one were to use MD5 or SHA1 to generate a bit string
> in order to convert the digest into a point on an elliptic curve according
> to SEC1 (or X9.62 or ISO/IEC 15946 or RFC 6090) that output would be
> interpreted as big endian in both instances, the internals of the hash
> algorithm notwithstanding.
>
>> Furthermore, forcing applications to carry out encoding and decoding,
>> instead of having the library do it has several bad effects. It makes it
>> more complicated to use Curve25519. It prevents codesharing among
>> libraries, unless they have identical bignum APIs. It makes applications
>> non portable across libraries.
>
>   That is exactly my point! Applications shouldn't have to carry out
> different encodings and decodings. If curve25519 does not follow the
> canonical convention, though, they will.
>
>> Defining Curve25519 as a function of bigintegers was considered and
>> rejected by multiple people due to interoperability problems historically
>> caused by lack of canonical byte encodings: TLS DHE uses two distinct
>> encodings at some point.
>>
>> What's wrong with the API everyone is already using?
>
>   Exactly, and everybody is using big endian.

You're confusing the job of the person who writes a library that
provides a function called X25519 with the job of the person who
writes an application that uses X25519 to calculate keys, and then
conflating that with the work required to do calculations of points on
the curve in some other application. I'll let slide the fact that
SPAKE2 will not work with X coordinate only arithmetic.

If you look in tweetnacl.c you will see a function crypto_scalarmult
that takes two const char * arguments, and places a result in a third
char *. There is no encoding or decoding anywhere in sight in an
application that uses this function. Clearly, an application calling
this function isn't affected by the endian choice, provided we all
agree. It's clear we can write down X coordinates as arguments to this
function: 64 hex nibbles as 32 bytes in the order we all know and
love: 0f10 is the byte array with first byte 15, second one 16, etc.
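
A minimal sketch of that hex convention, in Python purely for
illustration (the helper name hex_to_bytes and the filler value are
assumptions, not anything taken from tweetnacl.c):

```python
# Illustrative only: how a hex string written on paper maps to the
# byte array an application hands to a byte-oriented function.
def hex_to_bytes(text: str) -> bytes:
    """Read two hex nibbles per byte, left to right."""
    return bytes.fromhex(text)

example = hex_to_bytes("0f10")
first, second = example[0], example[1]  # first byte 15, second byte 16

# A 32-byte X coordinate is 64 nibbles read the same way.
# The value below is arbitrary filler, not a meaningful curve point.
x_coordinate = hex_to_bytes("0f10" + "00" * 30)
```

The application never interprets these bytes as a number; it only
stores them and passes them along.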

Now let's consider the library writer. They potentially have to write
a swap routine, if their preferred bignum library doesn't have a little
endian load. It's likely, as DJB points out, that they have to do some
work to get the padding correct even if we pick big endian.
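
The amount of work involved for the library writer can be sketched
roughly as follows. This uses Python integers as a stand-in for a
bignum library; the function names are hypothetical and the 32-byte
width is assumed from the Curve25519 encoding discussed here:

```python
def load_little_endian(data: bytes) -> int:
    # Direct little-endian load: byte 0 is the least significant.
    return int.from_bytes(data, "little")

def load_via_swap(data: bytes) -> int:
    # Hypothetical fallback for a bignum library that only offers a
    # big-endian load: reverse the bytes first, then load.
    return int.from_bytes(data[::-1], "big")

def store_little_endian(value: int, width: int = 32) -> bytes:
    # Fixed-width store: to_bytes pads with zero bytes up to `width`,
    # which is the padding work needed under either byte order.
    return value.to_bytes(width, "little")
```

Both loaders return the same integer for the same input, so the swap
is the whole cost of a library lacking a little-endian load.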

But what if I want to do a calculation in the field F_p? Well, you can
use SAGE or PARI or whatever you like: the format those tools use has
nothing to do with the format here. If I'm going to take 32 bytes and
interpret them as a field element for purposes of picking random field
elements, why does it matter whether the byte order I use matches the
order used by the crypto_scalarmult function above? And why does the
absolute order that I use when doing the calculation matter between one
field and the next, so long as I know what it is?
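
As a rough illustration of that point, the sketch below reads the same
32 bytes as a field element under each byte order. The prime is the
Curve25519 field prime 2^255 - 19; the function names and the sample
bytes are assumptions made for the example:

```python
P = 2**255 - 19  # the Curve25519 field prime

def to_field_element(data: bytes, byteorder: str) -> int:
    # Interpret the bytes as an integer in the stated order, reduce mod P.
    return int.from_bytes(data, byteorder) % P

def from_field_element(value: int, byteorder: str) -> bytes:
    # Write a field element back out as 32 bytes in the stated order.
    return value.to_bytes(32, byteorder)

sample = bytes.fromhex("0f10" + "00" * 30)  # arbitrary example bytes
as_little = to_field_element(sample, "little")
as_big = to_field_element(sample, "big")

# Each convention round-trips on its own; they only disagree with
# each other about which integer the bytes denote.
round_trip_little = from_field_element(as_little, "little")
round_trip_big = from_field_element(as_big, "big")
```

Either reading yields a valid element of F_p, and arithmetic on the
resulting integers does not depend on which one was chosen.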

So what actually is your argument? It can't be that applications that
link against a library containing crypto_scalarmult_curve25519 have to
deal with an inconsistent API if we switch endianness: they don't. It
can't be that authors of libraries providing that function can't do it
if the endianness is little-endian: they clearly can. And it can't be
that we can't do calculations on the curve itself if we switch
endianness: we can.

Of course, you might well ask why we don't use big-endian. But the
fact is that switching has costs: at minimum we need to rename the
primitive. Every existing implementation of that crypto_scalarmult
function stops working. All existing documentation has to be
carefully checked to ensure that it's dealing with the correct
variant. And the old version isn't going away, so working groups end
up having to pick one.

Is this the only way to structure the interface between the library
and the application? No. One could instead have the library separate
out the necessary steps of masking the private key, converting x to a
bignum, carrying out the scalar multiplication, and turning the result
back into a sequence of bytes. Now, there actually is something to
complain about if you decide to expose the endianness to the
application developer. But this is entirely self-inflicted damage:
the API was already terrible by exposing details that didn't need to
be exposed.
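
For concreteness, here is one way those four steps could be separated.
This is a sketch only: the function names are hypothetical, the ladder
is the usual Montgomery ladder for this curve, and the Python branches
are not constant time, so it is unsuitable for real keys:

```python
P = 2**255 - 19   # field prime
A24 = 121665      # (486662 - 2) / 4 for the Montgomery curve

def mask_private_key(k: bytes) -> int:
    # Step 1: clear the low 3 bits and the top bit, set bit 254.
    b = bytearray(k)
    b[0] &= 248
    b[31] &= 127
    b[31] |= 64
    return int.from_bytes(b, "little")

def decode_x(u: bytes) -> int:
    # Step 2: 32 little-endian bytes to a bignum, top bit ignored.
    b = bytearray(u)
    b[31] &= 127
    return int.from_bytes(b, "little") % P

def scalar_mult(k: int, u: int) -> int:
    # Step 3: X-coordinate-only Montgomery ladder on plain integers.
    x1, x2, z2, x3, z3, swap = u, 1, 0, u, 1, 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:  # conditional swap (NOT constant time here)
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return x2 * pow(z2, P - 2, P) % P

def encode_x(x: int) -> bytes:
    # Step 4: bignum back to 32 little-endian bytes.
    return x.to_bytes(32, "little")

def x25519(k: bytes, u: bytes) -> bytes:
    # The single bytes-in, bytes-out call an application would see.
    return encode_x(scalar_mult(mask_private_key(k), decode_x(u)))
```

Only decode_x and encode_x know the byte order. An application that
calls x25519 alone never sees it; one that is handed the four pieces
separately does.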

(It's also unclear how you manage to do this: A SEC1 encoded point and
a SEC1 signature don't have the same padding. You would have to leave
almost everything about the packing and unpacking to the calling
application unless you had a per-curve encoding and decoding function,
in which case endianness can be hidden inside it.  Why would leaving
this up to the caller be a good idea?)
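
The padding difference can be sketched as below, assuming the usual
arrangement in which the coordinates of a SEC1 point are fixed-width
big-endian octet strings while the integers in a signature are carried
as ASN.1 DER INTEGERs. The function names are hypothetical and only
non-negative values are handled:

```python
def encode_fixed_width(value: int, field_bytes: int) -> bytes:
    # Fixed-width big-endian, left-padded with zero bytes, as for a
    # coordinate inside an encoded point.
    return value.to_bytes(field_bytes, "big")

def encode_der_integer_body(value: int) -> bytes:
    # Content octets of a DER INTEGER for a non-negative value:
    # minimal big-endian, with a leading 0x00 when the top bit is set.
    length = max(1, (value.bit_length() + 8) // 8)
    return value.to_bytes(length, "big")
```

The same integer therefore serializes to different lengths depending
on context, which is why a single generic "bignum to bytes" step in
the calling application does not cover both cases.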

Sincerely,
Watson Ladd

>
>   Dan.
>
>
>



-- 
"Those who would give up Essential Liberty to purchase a little
Temporary Safety deserve neither  Liberty nor Safety."
-- Benjamin Franklin