Re: [MLS] KDF instead of hashing up the tree

Hey Benjamin,

sure, that sounds good. I was just wondering if we want to keep the notion of a
"Derive-Key-Pair" function, e.g., for a case where a private key is not
necessarily a (pseudo-)random string you get out of a KDF.

But if everyone is ok with that assumption and the additional notion of a
"Derive-Public-Key" function, then I'm absolutely happy with that.

Konrad

On 06/02/2019 13:37, Benjamin Beurdouche wrote:
> Hi Konrad,
> 
> I think I got the idea, but let me try to reformulate with the following tree:
> 
>    AB
>   /     \
> A      B
> 
> What we currently want to do:
> 
> For A:
> x_a <-$- {0,1}^n // This is A's new secret octet string
> a <-- KDF(x_a,"kem_key") // This is A's private KEM-key
> A <-- Derive-Public-Key(a) // This is A's public KEM-key
> 
> For AB:
> OLD = x_ab <-- Hash(x_a) // This is AB's new secret octet string
> NEW = x_ab <-- KDF(x_a,”parent_key") // This is AB's new secret octet string
> ab <-- KDF(x_ab,"kem_key") // This is AB's new private KEM-key
> AB <-- Derive-Public-Key(ab) // This is AB's public KEM-key
> 
> What we need for a computational security proof is a (“technical") key separation
> between A and AB’s secrets (a & ab) hence moving the draft towards using a KDF.
> (I think I remember the TLS story which was expand one and split versus
> expand twice). I completely agree with that and that doesn’t cost anything : )
> 
> Regarding the multiple options: to me option 1 and 3 are really the same as I
> consider the following relationship:
> (y,Y) <-- Derive-Key-Pair(X) 
> :=
> y <-- KDF(X,”kem_key”);
> Y <-- Derive-Public-Key(y);
> 
> Am I wrong in saying that I believe option 1 already gives us everything we need ?
> I really like that design actually which fits the derivation scheme for the message protection.
> 
> Best,
> B.
> 
>> On Feb 6, 2019, at 11:07 AM, Konrad Kohbrok <konrad.kohbrok@datashrine.de> wrote:
>>
>> A question I ran into while working on the PR: Currently, there is the notion of
>> "Derive-Key-Pair" function that produces a key pair from an octet string.
>>
>> Do we want to keep that function for modularity/composability purposes at the
>> possible price of an additional KDF expand operation per key derivation? This is
>> probably just a formal detail, but I wasn't quite sure what the best way forward
>> would be.
>>
>> The options as I see them are as follows:
>>
>> Consider a node A that wants to update its secret.
>>
>> Option 1, derive the private KEM key directly, no Derive-Key-Pair function.
>> Instead a Derive-Public-Key function:
>>
>> x_a <-$- {0,1}^n // This is A's new secret octet string
>> a <-- KDF(x_a,"kem_key") // This is A's private KEM-key
>> A <-- Derive-Public-Key(a) // This is A's public KEM-key
>> x_b <-- KDF(x_a,"parent_secret") // This is B's new secret octet string
>> b <-- KDF(x_b,"kem_key") // This is B's new private KEM-key
>> B <-- Derive-Public-Key(b) // This is B's public KEM-key
>>
>>
>> Option 2, stick with a Derive-Key-Pair function, but add additional KDF expand
>> to ensure key separation:
>>
>> x_a <-$- {0,1}^n // This is B's new seed octet string
>> x_a' <-- KDF(x_a,"secret_seed") // This is A's new secret octet string
>> (a,A) <-- Derive-Key-Pair(x_a') // This is A's new private/public key pair
>> x_b <-- KDF(x_a,"parent_secret") // This is B's new seed octet string
>> x_b' <-- KDF(x_b,"secret_seed") // This is B's new secret octet string
>> (b,A) <-- Derive-Key-Pair(x_b) // This is B's new private/public key pair
>>
>>
>> Option 3, use Derive-Key-Pair function directly on seed. This would add the
>> requirement to the Derive-Key-Pair notion, that it (internally) uses a KDF to
>> derive the private key.
>>
>> x_a <-$- {0,1}^n
>> (a,A) <-- Derive-Key-Pair(x_a) // This is A's new private/public key pair
>> x_b <-- KDF(x_a,"parent_secret") // This is B's new secret octet string
>> (b,B) <-- Derive-Key-Pair(x_b) // This is B's new private/public key pair
>>
>>
>> My guess is that in most cases the Derive-Key-Pair function will in the end call
>> a KDF to get the private key anyway, which means that we have one superfluous
>> KDF expand operation.
>>
>> Konrad
>>
>>
>>
>> On 23/01/2019 17:47, Richard Barnes wrote:
>>> Ah, ok, I get it.  I misunderstood "new secret" as "fresh entropy".
>>>
>>> In that case, this falls into the "sure, if it makes the analysis better"
>>> bucket.  We've been treating hashes/HKDF invocations as basically free.  At some
>>> point we might need to worry about that, but I suspect that today is not that day.
>>>
>>> Want to make a PR?
>>>
>>> --Richard
>>>
>>>
>>> On Wed, Jan 23, 2019 at 10:19 AM Konrad Kohbrok <konrad.kohbrok@datashrine.de
>>> <mailto:konrad.kohbrok@datashrine.de>> wrote:
>>>
>>>    Exactly, thanks Karthik!
>>>
>>>    Say we have the same tree as in the example in 5.4:
>>>
>>>             G
>>>           /   \
>>>          /     \
>>>         E       _
>>>        / \     / \
>>>       A   B   C   D
>>>
>>>    Then A generates a fresh secret X_1secret and derives the following new secrets:
>>>    X_1kemkey=HKDF(X_1secret,"kemkey")
>>>    X_2secret=HKDF(X_1secret,"parent")
>>>
>>>    X_2kemkey=HKDF(X_2secret,"kemkey")
>>>    X_3secret=HKDF(X_2secret,"parent")
>>>
>>>    X_3kemkey=HKDF(X_3secret,"kemkey")
>>>
>>>    A then sends E(pk(B), X_2secret) to B, E(pk(C),X_3secret) to C and
>>>    E(pk(D),X_3secret) to D.
>>>
>>>    Hopefully that makes the idea a little clearer. Sorry for the terrible notation.
>>>
>>>    Konrad
>>>
>>>    On 23/01/2019 16:59, Karthikeyan Bhargavan wrote:
>>>> If I understand correctly, Chris and Konrad are not suggesting changing
>>>    the secrets.
>>>> Instead, they are suggesting that H(.) be implemented as something like:
>>>>
>>>> H(x) = HKDF(x,label=”parent”)
>>>>
>>>> where x is the tree secret for the current node.
>>>>
>>>> Similarly, when generating the KEM private key for a node, we use
>>>>
>>>> KGEN(x) = HKDF(x,label=“kem key”)
>>>>
>>>> This would be a good way of making sure that each key in the protocol is
>>>> independent, but at no additional cost.
>>>> Am I understanding this correctly, Konrad?
>>>>
>>>>> On 23 Jan 2019, at 15:29, Richard Barnes <rlb@ipv.sx <mailto:rlb@ipv.sx
>>>    <mailto:rlb@ipv.sx>>> wrote:
>>>>>
>>>>> Note this is a little bit expensive in terms of message size; it changes the
>>>>> size of an update from log(N) to log(N)^2.  It does not change the number of
>>>>> DH operations
>>>>>
>>>>> This is because you have to send the fresh secret for each intermediate node
>>>>> in the tree to all its descendants.  Comparing the secrets you generate in
>>>>> each case, from leaf to root, along a path of depth 3 with S0 at the leaf:
>>>>>
>>>>> Current: S0, S1 = H(S0), S2 = H^2(S0), S3 = H^3(S0)
>>>>> Proposed: S0, S1 = KDF(T0, S0), S2 = KDF(T1, S1), S3 = KDF(T2, S2)
>>>>>
>>>>> Where T* are the fresh secrets called for here.  This doesn't change to whom
>>>>> you encrypt things, but changes what you encrypt to each copath node:
>>>>>
>>>>> Current -> Proposed
>>>>> S1 -> S1, T1, T2
>>>>> S2 -> S2, T2
>>>>> S3 -> S3
>>>>>
>>>>> (This is of course because you need to enable each recipient to compute
>>>    up the
>>>>> tree.)  So there's your log->log^2.
>>>>>
>>>>> This discussion is not to say that I'm opposed to this idea.  It just looks
>>>>> like it has some non-negligible cost, so we should make sure we know what
>>>>> we're getting for that cost.
>>>>>
>>>>> --Ricahrd
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jan 23, 2019 at 8:58 AM Konrad Kohbrok <konrad.kohbrok@datashrine..de
>>>>> <mailto:konrad.kohbrok@datashrine.de
>>>    <mailto:konrad.kohbrok@datashrine.de>>> wrote:
>>>>>
>>>>>      Hey everyone,
>>>>>
>>>>>      I just discussed the current draft with my advisor Chris Brzuska and he
>>>>>      came up
>>>>>      with a suggestion that I thought I'd just quickly relay here. As I
>>>    have only
>>>>>      started following the discussion recently, I apologize if this was
>>>    already
>>>>>      brought up in the past.
>>>>>
>>>>>      In terms of key separation, wouldn't it make for a cleaner design, if we
>>>>>      used a
>>>>>      KDF instead of a hash function? Instead of  generating a new
>>>    leaf-node secret
>>>>>      and then hashing it to compute the new secret for the parent node, it
>>>    would be
>>>>>      better to generate a new secret and then from that secret
>>>    independently (i.e.
>>>>>      with different labels) compute the new leaf secret and the new secret
>>>    for the
>>>>>      parent node. This key independence would also make the proof easier. In
>>>>>      terms of
>>>>>      overhead, this would mean two KDF operations instead of one hashing
>>>    operation.
>>>>>
>>>>>      Cheers,
>>>>>      Konrad
>>>>>
>>>>>      _______________________________________________
>>>>>      MLS mailing list
>>>>>      MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
>>>    <mailto:MLS@ietf.org>>
>>>>>      https://www.ietf.org/mailman/listinfo/mls
>>>>>
>>>>> _______________________________________________
>>>>> MLS mailing list
>>>>> MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
>>>    <mailto:MLS@ietf.org>>
>>>>> https://www.ietf.org/mailman/listinfo/mls
>>>>
>>>
>>>
>>> _______________________________________________
>>> MLS mailing list
>>> MLS@ietf.org
>>> https://www.ietf.org/mailman/listinfo/mls
>>>
>>
>> _______________________________________________
>> MLS mailing list
>> MLS@ietf.org
>> https://www.ietf.org/mailman/listinfo/mls
>