Re: [MLS] New Parent Hash

Raphael Robert <> Sat, 02 January 2021 17:03 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id 78C803A0C4F for <>; Sat, 2 Jan 2021 09:03:25 -0800 (PST)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -1.898
X-Spam-Status: No, score=-1.898 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: (amavisd-new); dkim=pass (2048-bit key)
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id XsDRLZiYsHFV for <>; Sat, 2 Jan 2021 09:03:22 -0800 (PST)
Received: from ( [IPv6:2a00:1450:4864:20::62b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by (Postfix) with ESMTPS id 518B83A0C49 for <>; Sat, 2 Jan 2021 09:03:21 -0800 (PST)
Received: by with SMTP id d17so30836418ejy.9 for <>; Sat, 02 Jan 2021 09:03:21 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20150623; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=lPgbQGqPPQw+VVFZslO3+BAg9UmnfxDZKHdBxE23NvA=; b=AIHGYyzaNtevsiCgvtGnzP37qq+ugjLNgpm69Qb3D9/6AaK7H7fjWpbKWN2jwRQoUo J98sMiIxgLPkEjKeGTFWUlstXrV/UvuTYQfPT1ECs37HqOLqBixti3565HpXFG/uAdVy PgRr+aCjXaCdnYygOodopxPBqHYBG9OIlTprG24chbWiKmAqfGVPd7iJzLVk5dP6Vvu1 +5OzzNt6RjsXYTnNxsbnpCtjYibAIiD90uvB4VTg8HhJ+/Mv5XfeOyzGBECe8z+zDm0K eGHGP2iqwbWS6k/SyHQfiomN1zMjximQFzaccOTnO13vqXooDmNJ62VZbejcNQODwLwA H1EQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=lPgbQGqPPQw+VVFZslO3+BAg9UmnfxDZKHdBxE23NvA=; b=OjEhY8T0yqVTuVGgyDZAff1Js5PgMMTHqT9c6hTbVgMo1zAuw384roccruvZsG01ex 7OwdjVqJj27IBCmgJoDyL6KQePq4mD1WWONj+E47WHp2tsRQeYe2r0pAnuIDHCFFN8qB p+Q/Y+7vemJ2OxwJCyJUJp1IB1fFjUXFXv2RqMIJBpsFfu1J59Dn0B/MAPyQ88WIzNx8 atVzH4NxI3z8S484P6mEBozuo1uTA3RWBONE9djlBgv9kNDgIm8Hl6PFgqbhtA2IArC0 g3V9GmElZmFGWD5ZTJb1Rc0D6q2mXYCo/f2ihSU0C7iTZNVmcZF2gmz8Gfp2isGb/5iA zF5g==
X-Gm-Message-State: AOAM53362atszK3+9LKaun57V3W0+IUJydpMEaHgWZJXAbP4U+ikbGtS m/afMY4B9enrdobZZeLDh8wYMg==
X-Google-Smtp-Source: ABdhPJzqOgV2fZIqSaa1aJt1OKY9mUDY+75etb2MQ96CIEXgba6mNbr2LcdM6fOYtLHEettoOUOZkQ==
X-Received: by 2002:a17:906:710b:: with SMTP id x11mr63594494ejj.433.1609606999350; Sat, 02 Jan 2021 09:03:19 -0800 (PST)
Received: from [] ([]) by with ESMTPSA id ck27sm41683038edb.13.2021. (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sat, 02 Jan 2021 09:03:18 -0800 (PST)
From: Raphael Robert <>
Message-Id: <>
Content-Type: multipart/alternative; boundary="Apple-Mail=_F6418678-4B60-4861-9A61-36A84FB4689F"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.\))
Date: Sat, 2 Jan 2021 18:02:45 +0100
In-Reply-To: <>
Cc: Mularczyk Marta <>, "" <>
To: Joel Alwen <>
References: <> <> <> <> <> <>
X-Mailer: Apple Mail (2.3654.
Archived-At: <>
Subject: Re: [MLS] New Parent Hash
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Messaging Layer Security <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Sat, 02 Jan 2021 17:03:26 -0000

Thanks Marta and Joël for the explanation! That was the delta that was missing for my understanding and it’s definitely good to have that on list.

> On 1. Jan 2021, at 23:10, Joel Alwen <> wrote:
> Servus Raphael,
>> Happy New Year
> Same to you! Here's hoping the new one will be a tad happier than the last. ;-)
> As for why "resolutions": The impetus to look for a new approach after our
> initial proposal came from a comment in your email on the Fri, 23 October 2020
> 18:23 UTC in the thread "Weak vs. Strong Tree Authentication". There you wrote:
>> Going for fully hashing the tree nodes as described locks down the whole
>> tree structure, making any reshuffling impossible. Maybe there is a more 
>> lightweight approach that would yield the same result without “provid[ing] 
>> irrefutable evidence that the signing key's owner was in a group with (at 
>> least some) of the other members in the group”, but I can’t think of
>> anything right now.
> That was an intriguing challenge and got us thinking. The result is the current
> Parent Hash definition using resolutions. We realized that to buy wiggle room in
> not locking down as much of the tree (especially, not locking down other members
> identities into the hash) we could pay a "price" that actually doesn't harm the
> security we've been aiming for. Namely, we can allow the adversary some more
> tree manipulation but still only benign ones that dont actually allow her to
> break the security property we've been aiming for.

Just for context, what I had in mind was something that was not as detrimental to the membership deniability. I think in that respect both the “resolution” and the “descendants” approach are equivalent.
That being said, this looks quite elegant!

> In retrospect, a second reason to use resolutions is efficiency (though to fair,
> that was really a coincidence rather than an explicit goal of the redesign).
> Initially, we had been thinking of simply having parent hash include tree hash
> as in the Benjamin-Karthik (BK) proposal of days yore. But we realized that
> since then MLS had introduced Unmerged Leaves which meant that the tree hash of
> some nodes can change between when they are assigned their HPKE key and later
> when a new member wants to verify the parent hash of the node. In other words
> the BK approach was broken by unmerged leaves. So, to make the "include the
> whole subtree in parent hash" approach work, it seems new members would, for
> each node being verified, traverse the nodes full sub-tree so as to weed out
> those nodes that were added after the to-be-verified node got its HPKE key. (I
> believe this is the O(n) time that Hubert is referring too?) In contrast, the
> new version of parent hash (using resolutions) has the potential to be much
> faster; it requires hashing less data but also visiting less of the tree. E.g.
> when verifying near the root the difference can be as much as visiting order n
> nodes vs. visiting 1 node in the new construction.
> The more blanks & unmerged leaves in the tree the more the new construction
> collapses to the old one. But only in the most extreme case do they actually
> reach parity and the new approach is never slower than the old.
> Needless to say, we didn't implement this stuff though, let alone benchmark the
> 2 approaches so any hard data elucidating what the actual difference is would be
> very welcome. Still, the new approach can only be as fast or faster (but never
> slower) than the new one. Nor do I think it's any more complicated to define or
> implement. And from a security perspective they both let MLS meet our security
> definition. So regardless of how small the speed up is (and ignoring the locking
> in the tree discussion above) it's still not clear to me why we'd want to switch
> back.

> A meaningful attack that only works on the new but not old construction would be
> one very quick way of changing my mind though! :-) But, I believe such an attack
> would require either new adversarial capabilities and/or break a new security
> property we didn't consider. (For the adversaries & security we considered,
> there is no security loss with the new approach compared to the old one.)

I think we are now fully aligned on the definitions and the performance implications and now that we have better documentation we can continue with the interim effort.

I’d however like to leave it as an open question whether we shouldn’t reconsider locking down entire subtrees as a security-in-depth measure if we can convince ourselves the performance toll is not too high. 
For example, I’m wondering if there isn’t a clever way to combine the tree hash and the parent hash. The tree hash has to be calculated for every epoch change anyway and it’s always in O(n).



> - Joël
> On 01/01/2021 17:30, Raphael Robert wrote:
>> Thanks for the lengthy explanation!
>> I’m glad we are on the same page regarding the definition of the resolution. 
>> I think I understand where the misunderstanding is coming from.
>> Initially you proposed a “strong hash” solution on the ML (on Oct 23 2020) 
>> that was the following:
>> "strong parent_hash" : Define parent_hash to include tree_hash (and
>> tree_hash to *not* include parent_hash). The tree_hash of a node covers all
>> values in the subtree rooted at that node  (except the parent_hashes and
>> signatures). So basically something like this: - leaf.tree_hash :=
>> H( - node.tree_hash := H(, node.lchild.tree_hash,
>> node.rchild.tree_hash) - root.parent_hash := 0 - node.parent_hash :=
>> H(node.parent.tree_hash, node.parent.parent_hash).
>> This definition did include the whole subtree and therefore I was assuming 
>> that this would also end up in the spec change. As far as I now understand, 
>> you went for a more lightweight approach where you say that just using the 
>> resolution is enough, because pinning all nodes in the subtree is not 
>> necessary and would only prevent “benign” attacks. I guess I missed the part 
>> where you switched from one concept to the other.
>> What I discovered during implementation is that when you only use the 
>> resolution approach and the tree has no blanks, you are not always able to 
>> detect if the attacker has swapped two leaves. This might indeed just be a 
>> benign attack in the sense that the tree invariant is not violated.
>> At this point I don’t have strong arguments for or against using the 
>> resolution approach, I would have to think about it some more. Just as a 
>> reminder for context, the original concept by Benjamin and Karthik was 
>> “parent hash covers the tree hash”. That’s pretty much what you initially 
>> proposed, except that it also covered the KeyPackage at the leaf level, not 
>> just the HPKE public key.
>> My question right now would be: What is the benefit of the lightweight 
>> resolution-based approach? If it’s just efficiency, we should look at whether
>> there’s really that much to gain.
>> On efficiency: The parent hashes are verified in two instances: a) when a
>> new member joins the group and validates the public tree it received and b)
>> when a member receives a full Commit from another member.
>> Re a): The spec now says "Moreover, when joining a group, new members MUST 
>> authenticate each non-blank parent node P. A parent node P is authenticated”.
>> This means the effort is linear to start with. Depending on how you define
>> the operation, it is even worse than linear. Say the tree has n leaves and
>> therefore 2n-1 nodes. For every node, we need to iterate over at least a part
>> of the subtree and collect public keys that we then concatenate. We end up
>> only hashing once per node, but with a variable size payload. I think the
>> cost of navigating the subtree and collecting the keys is O(log n) for the
>> “resolution approach and O(n) for the “descendants" approach and the cost of
>> hashing is linearly proportional to the size of the payload. If we assume
>> that hashing is the most expensive operation, the total cost would be O(n *
>> log n) for “resolution” and O(n ^ 2) for “descendants".
>> Re b): It is very similar to a), except that we only have to iterate over
>> log n nodes of the direct path, not the whole tree. That means that the
>> total effort is O(log n * log n) for “resolution” and O(n * log n) for 
>> “descendants”.
>> There’s clearly a difference in terms of efficiency between the two 
>> approaches. That being said, we should keep in mind that for small payloads, 
>> hashing is a lot faster than other cryptographic operations like e.g. 
>> producing and verifying signatures. Since a new joiner needs to also verify
>> n signatures when joining a group, it could well be that the effort of 
>> verifying the parent hashes is negligible compared to the cost of verifying 
>> all signatures.
>> In summary: I still don’t have strong arguments against the “resolution” 
>> approach, but I also have a feeling that the efficiency differences are not 
>> dramatic either.
>> Happy New Year and thanks!
>> Raphael
>>> On 1. Jan 2021, at 15:48, Mularczyk Marta < 
>>> < <>>> wrote:
>>> Hi Raphael,
>>> Thanks for explaining! Using "descendants" is definitely better for 
>>> security, but also, as Hubert mentioned, less efficient.
>>> Also we did understand “resolution” the same way as you when we proposed 
>>> the strong parent hash (sorry for the confusion).
>>> But we obviously really want to understand if/why using resolution is 
>>> insecure. Ultimately, the security property we care about is that privacy 
>>> and authenticity of application messages holds. The Parent Hash mechanism 
>>> helps us ensure this holds also for epochs in adversarially generated 
>>> groups, as long as the adversary does not control any leaves in this epoch 
>>> (i.e. does not know the signing key of the leaf). So to really see if your 
>>> examples break security, it would be good to see if they let us violate 
>>> this security goal.
>>> To see why we believe the new parent hash notion (i.e. with resolutions) 
>>> achieves this, let us first clarify what technical property about (even 
>>> adversarially produced) ratchet trees we want from verifying parent hashes.
>>> It’s not that the adversary can’t produce manipulated trees at all. It’s
>>> that the only manipulations she is able to do will not let her produce a
>>> tree that violates the tree invariant. That’s because (in our model) when 
>>> the tree invariant holds then we can show that the epoch is secure.
>>> So some manipulations *are* possible. They are just “benign”. For example,
>>> suppose adversary A controls a leaf L with HPKE key pk in tree T (i.e. she
>>> knows the signing key of L’s owner). Then A can construct a new tree T’ 
>>> identical to T but where leaf L has pk’ != pk (say, by performing a commit 
>>> at leaf L to obtain T’). But for this, in T’ leaf L must still be 
>>> controlled by A. We call this manipulation “benign” because even though A 
>>> might now break privacy of application messages in epoch with tree T’ it 
>>> doesnt violate our security goal because privacy is conditioned on A *not* 
>>> controlling any leaves.
>>> Conversely, one thing A can *not* do is the following. Say T’ has a node V
>>> with HPKE public key pk0 such that A knows corresponding sk0. Then for T’ 
>>> to pass verification, it must contain a leaf controlled by A. This holds 
>>> since for T’ to pass verification there must be a leaf in V’s subtree with 
>>> a signature covering a hash that includes pk0. I.e. some leaf must
>>> “attest” to having inserted pk0 into T’. (Recall that in our model the
>>> adversary can’t just leak sk0 alone. They must leak a full state of some
>>> client which also means leaking the client's signing key.)
>>> So our question to you is the following. (Does this make sense? And) we’re 
>>> not exactly sure what you have in mind when you say “swapping out two leaf 
>>> nodes”. But we think it would only be a problem if it is not benign. Is 
>>> this true for your attack? Could you expand a bit on what tree
>>> manipulation you have in mind?
>>> Best, Joël& Marta
>>> --------------------------------------------------------------------------------
> *From:* MLS < <> < <>>> on behalf of
>>> Hubert Chathi < <> < <>>> *Sent:* 
>>> Thursday, December 31, 2020 5:36:53 PM *To:* <> 
>>> < <>> *Subject:* Re: [MLS] New Parent Hash
>>> I think that if you use "descendants" as you define it, then calculating 
>>> the parent_hash will become O(n) all the time, rather than O(log n) in the
>>> best-case, since the size of the descendants list will be O(n).  I'm not 
>>> sure if there's a good way to maintain O(log n) and still provide the 
>>> desired security.
>>> On Thu, 31 Dec 2020, at 06:47, Raphael Robert wrote:
>>>> Hi Joël, Daniel and Marta,
>>>> I just implemented the stronger parent hash scheme you proposed in the 
>>>> way it is described in the spec now. The good news is that it is 
>>>> implementable, the bad news is that is does not solve the problem you 
>>>> initially describe. Specifically, swapping out two leaf nodes still goes 
>>>> undetected.
>>>> I think it is due to a terminology conflict, but it would be great if
>>>> you could validate that assumption. In 7.4 you introduce the notion of 
>>>> “original child resolution” as " the array of HPKEPublicKey values of
>>>> the nodes in the resolution of S”. The term “resolution” is defined in
>>>> 5.2. It is essentially a way of dealing with blank nodes by substituting
>>>> them with the first non-blank descendants you find when descending the
>>>> subtree of a node. Here is my assumption: You probably thought the
>>>> resolution would cover all non-blank descendants of a node, not just the
>>>> first non-blank descendant. In that case we would need another function
>>>> that does that (let’s call it “descendants”).
>>>> If we look at the example of 5.2:
>>>>  - The resolution of node 3 is the list [A, CD, C] - The descendants
>>>> of node 3 is the list [A, CD, C, D]
>>>>  - The resolution of node 5 is the list [CD] - The descendants of node 
>>>> 5 is the list [CD, C, D] (we just need to additionally filter out C, 
>>>> because it is an unmerged leaf of node 5)
>>>> Here is a code snippet of what the “descendants” function would look
>>>> like (sorry that it’s Rust, not Python):
>>>> fn descendants(x: NodeIndex, size: LeafIndex) -> Vec<NodeIndex> { if 
>>>> level(x) == 0 { vec![x] } else { let left_child = left(x); let 
>>>> right_child = right(x, size); [ descendants(left_child, size), vec![x], 
>>>> descendants(right_child, size), ] .concat() } }
>>>> There’s a more efficient implementation of course, this is just to 
>>>> illustrate.
>>>> In other words, the new “original child resolution” of S would be the 
>>>> subtree under S, but with filtered-out blank nodes and unmerged leaves
>>>> of S.
>>>> If my assumption is correct, I’ll follow up quickly with a PR since this 
>>>> is currently a blocker for interop based on draft-11.
>>>> Thanks
>>>> Raphael
>>>>> On 12. Nov 2020, at 22:14, Joel Alwen < <> 
>>>>> < <>>> wrote:
>>>>> Hey people,
>>>>> Just a quick heads up that Daniel, Marta and I put in a PR with a new 
>>>>> version of strong parent hash to prevent the attacks from the earlier 
>>>>> thread. We tried to make it as minimal as we could to help with 
>>>>> deniability while still preventing the attacks permitted by weak
>>>>> parent hashes.
>>>>> In a nutshell, when computing parent_hash at node v with parent p and 
>>>>> sibling w we include - p's HPKE pubkey - p's parent_hash value and - 
>>>>> HPKE pub keys in the resolution of w except for those belonging to 
>>>>> leaves unmerged at p.
>>>>> As a sanity check, notice that as long as p's keys remain the same one 
>>>>> can always recompute the same parent_hash value at v as was initially 
>>>>> computed by the member that set p's keys. (In other words, new members 
>>>>> can verify that the stored parent_hash values match whats in the
>>>>> tree.) In particular, that's coz as long as p's keys are unchanged so
>>>>> is the resolution of w. The only exception are leaves being added as
>>>>> unmerged at w. But those leaves are also added as unmerged at p so they
>>>>> are left out of the hash.
>>>>> As for deniability, at least parent_hash only binds HPKE keys and 
>>>>> nothing else (like, say, credentials or signatures). Its the best we 
>>>>> were able to come up with for now...
>>>>> - joël
>>>>> _______________________________________________ MLS mailing list 
>>>>> <> < <>> 
>>>>> <>
>>> < <>>
>>>> _______________________________________________ MLS mailing list 
>>>> <> < <>> 
>>>> <>
>>> < <>>
>>> _______________________________________________ MLS mailing list 
>>> <> < <>> 
>>> <> 
>>> < <>>