Re: [MLS] KDF instead of hashing up the tree

Konrad Kohbrok <konrad.kohbrok@datashrine.de> Fri, 08 February 2019 12:17 UTC

Return-Path: <konrad.kohbrok@datashrine.de>
X-Original-To: mls@ietfa.amsl.com
Delivered-To: mls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 8FE4C12EB11 for <mls@ietfa.amsl.com>; Fri, 8 Feb 2019 04:17:16 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.6
X-Spam-Level:
X-Spam-Status: No, score=-2.6 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id v9nPo6bEh4EH for <mls@ietfa.amsl.com>; Fri, 8 Feb 2019 04:17:13 -0800 (PST)
Received: from mx2.mailbox.org (mx2a.mailbox.org [IPv6:2001:67c:2050:104:0:2:25:2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 9FBA212867A for <mls@ietf.org>; Fri, 8 Feb 2019 04:17:11 -0800 (PST)
Received: from smtp2.mailbox.org (smtp2.mailbox.org [IPv6:2001:67c:2050:105:465:1:2:0]) (using TLSv1.2 with cipher ECDHE-RSA-CHACHA20-POLY1305 (256/256 bits)) (No client certificate requested) by mx2.mailbox.org (Postfix) with ESMTPS id 898DAA113A for <mls@ietf.org>; Fri, 8 Feb 2019 13:17:09 +0100 (CET)
X-Virus-Scanned: amavisd-new at heinlein-support.de
Received: from smtp2.mailbox.org ([80.241.60.241]) by spamfilter05.heinlein-hosting.de (spamfilter05.heinlein-hosting.de [80.241.56.123]) (amavisd-new, port 10030) with ESMTP id x3frv5WDco_T for <mls@ietf.org>; Fri, 8 Feb 2019 13:17:06 +0100 (CET)
To: mls@ietf.org
References: <dc702cea-d780-216b-ab8e-1eba99a2bace@datashrine.de> <CAL02cgTdx7_=t9jfZj2iFULFK4x-RSL+J5LrqRN=3co1nSKS7A@mail.gmail.com> <DB120F33-B500-42F2-8117-8883B396B278@gmail.com> <7507c820-d574-a570-6aba-c469366cc9c5@datashrine.de> <CAL02cgSsoi5JiEpLf4PCP0MufS2qAJQugW7WOVFVkH0ffLURfA@mail.gmail.com> <321d21c2-22ab-def4-7014-8948eeaa0dea@datashrine.de> <1AF700AD-243F-4BDD-AC44-F57CAF115E80@inria.fr> <05a6578c-2cc9-5338-e5df-3f13b9eb216f@datashrine.de>
From: Konrad Kohbrok <konrad.kohbrok@datashrine.de>
Message-ID: <dac4e3a8-daae-5ff3-b2c5-1307b7c47a3c@datashrine.de>
Date: Fri, 08 Feb 2019 14:17:02 +0200
MIME-Version: 1.0
In-Reply-To: <05a6578c-2cc9-5338-e5df-3f13b9eb216f@datashrine.de>
Content-Type: text/plain; charset="utf-8"
Content-Language: de-DE
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/mls/kOgMQ8Rsr59I6EbbXfgSP7IaYdE>
Subject: Re: [MLS] KDF instead of hashing up the tree
X-BeenThere: mls@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Messaging Layer Security <mls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mls>, <mailto:mls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/mls/>
List-Post: <mailto:mls@ietf.org>
List-Help: <mailto:mls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mls>, <mailto:mls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Feb 2019 12:17:17 -0000

Next question: Is it worth editing the RFC beyond the "Cryptographic Objects"
headline? There is a lot of DH-specific stuff that seems to be somewhat outdated
anyway if we aim for a KEM-agnostic design.

Konrad

 07/02/2019 09:37, Konrad Kohbrok wrote:
> Hey Benjamin,
> 
> sure, that sounds good. I was just wondering if we want to keep the notion of a
> "Derive-Key-Pair" function, e.g., for a case where a private key is not
> necessarily a (pseudo-)random string you get out of a KDF.
> 
> But if everyone is ok with that assumption and the additional notion of a
> "Derive-Public-Key" function, then I'm absolutely happy with that.
> 
> Konrad
> 
> On 06/02/2019 13:37, Benjamin Beurdouche wrote:
>> Hi Konrad,
>>
>> I think I got the idea, but let me try to reformulate with the following tree:
>>
>>    AB
>>   /     \
>> A      B
>>
>> What we currently want to do:
>>
>> For A:
>> x_a <-$- {0,1}^n // This is A's new secret octet string
>> a <-- KDF(x_a,"kem_key") // This is A's private KEM-key
>> A <-- Derive-Public-Key(a) // This is A's public KEM-key
>>
>> For AB:
>> OLD = x_ab <-- Hash(x_a) // This is AB's new secret octet string
>> NEW = x_ab <-- KDF(x_a,”parent_key") // This is AB's new secret octet string
>> ab <-- KDF(x_ab,"kem_key") // This is AB's new private KEM-key
>> AB <-- Derive-Public-Key(ab) // This is AB's public KEM-key
>>
>> What we need for a computational security proof is a (“technical") key separation
>> between A and AB’s secrets (a & ab) hence moving the draft towards using a KDF.
>> (I think I remember the TLS story which was expand one and split versus
>> expand twice). I completely agree with that and that doesn’t cost anything : )
>>
>> Regarding the multiple options: to me option 1 and 3 are really the same as I
>> consider the following relationship:
>> (y,Y) <-- Derive-Key-Pair(X) 
>> :=
>> y <-- KDF(X,”kem_key”);
>> Y <-- Derive-Public-Key(y);
>>
>> Am I wrong in saying that I believe option 1 already gives us everything we need ?
>> I really like that design actually which fits the derivation scheme for the message protection.
>>
>> Best,
>> B.
>>
>>> On Feb 6, 2019, at 11:07 AM, Konrad Kohbrok <konrad.kohbrok@datashrine.de> wrote:
>>>
>>> A question I ran into while working on the PR: Currently, there is the notion of
>>> "Derive-Key-Pair" function that produces a key pair from an octet string.
>>>
>>> Do we want to keep that function for modularity/composability purposes at the
>>> possible price of an additional KDF expand operation per key derivation? This is
>>> probably just a formal detail, but I wasn't quite sure what the best way forward
>>> would be.
>>>
>>> The options as I see them are as follows:
>>>
>>> Consider a node A that wants to update its secret.
>>>
>>> Option 1, derive the private KEM key directly, no Derive-Key-Pair function.
>>> Instead a Derive-Public-Key function:
>>>
>>> x_a <-$- {0,1}^n // This is A's new secret octet string
>>> a <-- KDF(x_a,"kem_key") // This is A's private KEM-key
>>> A <-- Derive-Public-Key(a) // This is A's public KEM-key
>>> x_b <-- KDF(x_a,"parent_secret") // This is B's new secret octet string
>>> b <-- KDF(x_b,"kem_key") // This is B's new private KEM-key
>>> B <-- Derive-Public-Key(b) // This is B's public KEM-key
>>>
>>>
>>> Option 2, stick with a Derive-Key-Pair function, but add additional KDF expand
>>> to ensure key separation:
>>>
>>> x_a <-$- {0,1}^n // This is B's new seed octet string
>>> x_a' <-- KDF(x_a,"secret_seed") // This is A's new secret octet string
>>> (a,A) <-- Derive-Key-Pair(x_a') // This is A's new private/public key pair
>>> x_b <-- KDF(x_a,"parent_secret") // This is B's new seed octet string
>>> x_b' <-- KDF(x_b,"secret_seed") // This is B's new secret octet string
>>> (b,A) <-- Derive-Key-Pair(x_b) // This is B's new private/public key pair
>>>
>>>
>>> Option 3, use Derive-Key-Pair function directly on seed. This would add the
>>> requirement to the Derive-Key-Pair notion, that it (internally) uses a KDF to
>>> derive the private key.
>>>
>>> x_a <-$- {0,1}^n
>>> (a,A) <-- Derive-Key-Pair(x_a) // This is A's new private/public key pair
>>> x_b <-- KDF(x_a,"parent_secret") // This is B's new secret octet string
>>> (b,B) <-- Derive-Key-Pair(x_b) // This is B's new private/public key pair
>>>
>>>
>>> My guess is that in most cases the Derive-Key-Pair function will in the end call
>>> a KDF to get the private key anyway, which means that we have one superfluous
>>> KDF expand operation.
>>>
>>> Konrad
>>>
>>>
>>>
>>> On 23/01/2019 17:47, Richard Barnes wrote:
>>>> Ah, ok, I get it.  I misunderstood "new secret" as "fresh entropy".
>>>>
>>>> In that case, this falls into the "sure, if it makes the analysis better"
>>>> bucket.  We've been treating hashes/HKDF invocations as basically free.  At some
>>>> point we might need to worry about that, but I suspect that today is not that day.
>>>>
>>>> Want to make a PR?
>>>>
>>>> --Richard
>>>>
>>>>
>>>> On Wed, Jan 23, 2019 at 10:19 AM Konrad Kohbrok <konrad.kohbrok@datashrine.de
>>>> <mailto:konrad.kohbrok@datashrine.de>> wrote:
>>>>
>>>>    Exactly, thanks Karthik!
>>>>
>>>>    Say we have the same tree as in the example in 5.4:
>>>>
>>>>             G
>>>>           /   \
>>>>          /     \
>>>>         E       _
>>>>        / \     / \
>>>>       A   B   C   D
>>>>
>>>>    Then A generates a fresh secret X_1secret and derives the following new secrets:
>>>>    X_1kemkey=HKDF(X_1secret,"kemkey")
>>>>    X_2secret=HKDF(X_1secret,"parent")
>>>>
>>>>    X_2kemkey=HKDF(X_2secret,"kemkey")
>>>>    X_3secret=HKDF(X_2secret,"parent")
>>>>
>>>>    X_3kemkey=HKDF(X_3secret,"kemkey")
>>>>
>>>>    A then sends E(pk(B), X_2secret) to B, E(pk(C),X_3secret) to C and
>>>>    E(pk(D),X_3secret) to D.
>>>>
>>>>    Hopefully that makes the idea a little clearer. Sorry for the terrible notation.
>>>>
>>>>    Konrad
>>>>
>>>>    On 23/01/2019 16:59, Karthikeyan Bhargavan wrote:
>>>>> If I understand correctly, Chris and Konrad are not suggesting changing
>>>>    the secrets.
>>>>> Instead, they are suggesting that H(.) be implemented as something like:
>>>>>
>>>>> H(x) = HKDF(x,label=”parent”)
>>>>>
>>>>> where x is the tree secret for the current node.
>>>>>
>>>>> Similarly, when generating the KEM private key for a node, we use
>>>>>
>>>>> KGEN(x) = HKDF(x,label=“kem key”)
>>>>>
>>>>> This would be a good way of making sure that each key in the protocol is
>>>>> independent, but at no additional cost.
>>>>> Am I understanding this correctly, Konrad?
>>>>>
>>>>>> On 23 Jan 2019, at 15:29, Richard Barnes <rlb@ipv.sx <mailto:rlb@ipv.sx
>>>>    <mailto:rlb@ipv.sx>>> wrote:
>>>>>>
>>>>>> Note this is a little bit expensive in terms of message size; it changes the
>>>>>> size of an update from log(N) to log(N)^2.  It does not change the number of
>>>>>> DH operations
>>>>>>
>>>>>> This is because you have to send the fresh secret for each intermediate node
>>>>>> in the tree to all its descendants.  Comparing the secrets you generate in
>>>>>> each case, from leaf to root, along a path of depth 3 with S0 at the leaf:
>>>>>>
>>>>>> Current: S0, S1 = H(S0), S2 = H^2(S0), S3 = H^3(S0)
>>>>>> Proposed: S0, S1 = KDF(T0, S0), S2 = KDF(T1, S1), S3 = KDF(T2, S2)
>>>>>>
>>>>>> Where T* are the fresh secrets called for here.  This doesn't change to whom
>>>>>> you encrypt things, but changes what you encrypt to each copath node:
>>>>>>
>>>>>> Current -> Proposed
>>>>>> S1 -> S1, T1, T2
>>>>>> S2 -> S2, T2
>>>>>> S3 -> S3
>>>>>>
>>>>>> (This is of course because you need to enable each recipient to compute
>>>>    up the
>>>>>> tree.)  So there's your log->log^2.
>>>>>>
>>>>>> This discussion is not to say that I'm opposed to this idea.  It just looks
>>>>>> like it has some non-negligible cost, so we should make sure we know what
>>>>>> we're getting for that cost.
>>>>>>
>>>>>> --Ricahrd
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 23, 2019 at 8:58 AM Konrad Kohbrok <konrad.kohbrok@datashrine..de
>>>>>> <mailto:konrad.kohbrok@datashrine.de
>>>>    <mailto:konrad.kohbrok@datashrine.de>>> wrote:
>>>>>>
>>>>>>      Hey everyone,
>>>>>>
>>>>>>      I just discussed the current draft with my advisor Chris Brzuska and he
>>>>>>      came up
>>>>>>      with a suggestion that I thought I'd just quickly relay here. As I
>>>>    have only
>>>>>>      started following the discussion recently, I apologize if this was
>>>>    already
>>>>>>      brought up in the past.
>>>>>>
>>>>>>      In terms of key separation, wouldn't it make for a cleaner design, if we
>>>>>>      used a
>>>>>>      KDF instead of a hash function? Instead of  generating a new
>>>>    leaf-node secret
>>>>>>      and then hashing it to compute the new secret for the parent node, it
>>>>    would be
>>>>>>      better to generate a new secret and then from that secret
>>>>    independently (i.e.
>>>>>>      with different labels) compute the new leaf secret and the new secret
>>>>    for the
>>>>>>      parent node. This key independence would also make the proof easier. In
>>>>>>      terms of
>>>>>>      overhead, this would mean two KDF operations instead of one hashing
>>>>    operation.
>>>>>>
>>>>>>      Cheers,
>>>>>>      Konrad
>>>>>>
>>>>>>      _______________________________________________
>>>>>>      MLS mailing list
>>>>>>      MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
>>>>    <mailto:MLS@ietf.org>>
>>>>>>      https://www.ietf.org/mailman/listinfo/mls
>>>>>>
>>>>>> _______________________________________________
>>>>>> MLS mailing list
>>>>>> MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
>>>>    <mailto:MLS@ietf.org>>
>>>>>> https://www.ietf.org/mailman/listinfo/mls
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> MLS mailing list
>>>> MLS@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/mls
>>>>
>>>
>>> _______________________________________________
>>> MLS mailing list
>>> MLS@ietf.org
>>> https://www.ietf.org/mailman/listinfo/mls
>>
> 
> _______________________________________________
> MLS mailing list
> MLS@ietf.org
> https://www.ietf.org/mailman/listinfo/mls
>