Re: [MLS] KDF instead of hashing up the tree

A question I ran into while working on the PR: Currently, there is the notion of
"Derive-Key-Pair" function that produces a key pair from an octet string.

Do we want to keep that function for modularity/composability purposes at the
possible price of an additional KDF expand operation per key derivation? This is
probably just a formal detail, but I wasn't quite sure what the best way forward
would be.

The options as I see them are as follows:

Consider a node A that wants to update its secret.

Option 1, derive the private KEM key directly, no Derive-Key-Pair function.
Instead a Derive-Public-Key function:

x_a <-$- {0,1}^n // This is A's new secret octet string
a <-- KDF(x_a,"kem_key") // This is A's private KEM-key
A <-- Derive-Public-Key(a) // This is A's public KEM-key
x_b <-- KDF(x_a,"parent_secret") // This is B's new secret octet string
b <-- KDF(x_b,"kem_key") // This is B's new private KEM-key
B <-- Derive-Public-Key(b) // This is B's public KEM-key

Option 2, stick with a Derive-Key-Pair function, but add additional KDF expand
to ensure key separation:

x_a <-$- {0,1}^n // This is B's new seed octet string
x_a' <-- KDF(x_a,"secret_seed") // This is A's new secret octet string
(a,A) <-- Derive-Key-Pair(x_a') // This is A's new private/public key pair
x_b <-- KDF(x_a,"parent_secret") // This is B's new seed octet string
x_b' <-- KDF(x_b,"secret_seed") // This is B's new secret octet string
(b,A) <-- Derive-Key-Pair(x_b) // This is B's new private/public key pair

Option 3, use Derive-Key-Pair function directly on seed. This would add the
requirement to the Derive-Key-Pair notion, that it (internally) uses a KDF to
derive the private key.

x_a <-$- {0,1}^n
(a,A) <-- Derive-Key-Pair(x_a) // This is A's new private/public key pair
x_b <-- KDF(x_a,"parent_secret") // This is B's new secret octet string
(b,B) <-- Derive-Key-Pair(x_b) // This is B's new private/public key pair

My guess is that in most cases the Derive-Key-Pair function will in the end call
a KDF to get the private key anyway, which means that we have one superfluous
KDF expand operation.

Konrad

On 23/01/2019 17:47, Richard Barnes wrote:
> Ah, ok, I get it.  I misunderstood "new secret" as "fresh entropy".
> 
> In that case, this falls into the "sure, if it makes the analysis better"
> bucket.  We've been treating hashes/HKDF invocations as basically free.  At some
> point we might need to worry about that, but I suspect that today is not that day.
> 
> Want to make a PR?
> 
> --Richard
> 
> 
> On Wed, Jan 23, 2019 at 10:19 AM Konrad Kohbrok <konrad.kohbrok@datashrine.de
> <mailto:konrad.kohbrok@datashrine.de>> wrote:
> 
>     Exactly, thanks Karthik!
> 
>     Say we have the same tree as in the example in 5.4:
> 
>              G
>            /   \
>           /     \
>          E       _
>         / \     / \
>        A   B   C   D
> 
>     Then A generates a fresh secret X_1secret and derives the following new secrets:
>     X_1kemkey=HKDF(X_1secret,"kemkey")
>     X_2secret=HKDF(X_1secret,"parent")
> 
>     X_2kemkey=HKDF(X_2secret,"kemkey")
>     X_3secret=HKDF(X_2secret,"parent")
> 
>     X_3kemkey=HKDF(X_3secret,"kemkey")
> 
>     A then sends E(pk(B), X_2secret) to B, E(pk(C),X_3secret) to C and
>     E(pk(D),X_3secret) to D.
> 
>     Hopefully that makes the idea a little clearer. Sorry for the terrible notation.
> 
>     Konrad
> 
>     On 23/01/2019 16:59, Karthikeyan Bhargavan wrote:
>     > If I understand correctly, Chris and Konrad are not suggesting changing
>     the secrets.
>     > Instead, they are suggesting that H(.) be implemented as something like:
>     >
>     > H(x) = HKDF(x,label=”parent”)
>     >
>     > where x is the tree secret for the current node.
>     >
>     > Similarly, when generating the KEM private key for a node, we use
>     >
>     > KGEN(x) = HKDF(x,label=“kem key”)
>     >
>     > This would be a good way of making sure that each key in the protocol is
>     > independent, but at no additional cost.
>     > Am I understanding this correctly, Konrad?
>     >
>     >> On 23 Jan 2019, at 15:29, Richard Barnes <rlb@ipv.sx <mailto:rlb@ipv.sx
>     <mailto:rlb@ipv.sx>>> wrote:
>     >>
>     >> Note this is a little bit expensive in terms of message size; it changes the
>     >> size of an update from log(N) to log(N)^2.  It does not change the number of
>     >> DH operations
>     >>
>     >> This is because you have to send the fresh secret for each intermediate node
>     >> in the tree to all its descendants.  Comparing the secrets you generate in
>     >> each case, from leaf to root, along a path of depth 3 with S0 at the leaf:
>     >>
>     >> Current: S0, S1 = H(S0), S2 = H^2(S0), S3 = H^3(S0)
>     >> Proposed: S0, S1 = KDF(T0, S0), S2 = KDF(T1, S1), S3 = KDF(T2, S2)
>     >>
>     >> Where T* are the fresh secrets called for here.  This doesn't change to whom
>     >> you encrypt things, but changes what you encrypt to each copath node:
>     >>
>     >> Current -> Proposed
>     >> S1 -> S1, T1, T2
>     >> S2 -> S2, T2
>     >> S3 -> S3
>     >>
>     >> (This is of course because you need to enable each recipient to compute
>     up the
>     >> tree.)  So there's your log->log^2.
>     >>
>     >> This discussion is not to say that I'm opposed to this idea.  It just looks
>     >> like it has some non-negligible cost, so we should make sure we know what
>     >> we're getting for that cost.
>     >>
>     >> --Ricahrd
>     >>
>     >>
>     >>
>     >> On Wed, Jan 23, 2019 at 8:58 AM Konrad Kohbrok <konrad.kohbrok@datashrine..de
>     >> <mailto:konrad.kohbrok@datashrine.de
>     <mailto:konrad.kohbrok@datashrine.de>>> wrote:
>     >>
>     >>     Hey everyone,
>     >>
>     >>     I just discussed the current draft with my advisor Chris Brzuska and he
>     >>     came up
>     >>     with a suggestion that I thought I'd just quickly relay here. As I
>     have only
>     >>     started following the discussion recently, I apologize if this was
>     already
>     >>     brought up in the past.
>     >>
>     >>     In terms of key separation, wouldn't it make for a cleaner design, if we
>     >>     used a
>     >>     KDF instead of a hash function? Instead of  generating a new
>     leaf-node secret
>     >>     and then hashing it to compute the new secret for the parent node, it
>     would be
>     >>     better to generate a new secret and then from that secret
>     independently (i.e.
>     >>     with different labels) compute the new leaf secret and the new secret
>     for the
>     >>     parent node. This key independence would also make the proof easier. In
>     >>     terms of
>     >>     overhead, this would mean two KDF operations instead of one hashing
>     operation.
>     >>
>     >>     Cheers,
>     >>     Konrad
>     >>
>     >>     _______________________________________________
>     >>     MLS mailing list
>     >>     MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
>     <mailto:MLS@ietf.org>>
>     >>     https://www.ietf.org/mailman/listinfo/mls
>     >>
>     >> _______________________________________________
>     >> MLS mailing list
>     >> MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
>     <mailto:MLS@ietf.org>>
>     >> https://www.ietf.org/mailman/listinfo/mls
>     >
> 
> 
> _______________________________________________
> MLS mailing list
> MLS@ietf.org
> https://www.ietf.org/mailman/listinfo/mls
>