Re: [MLS] Optimizing Unmerged Leaves (or: bringing the MLS tree size back to O(n))

Raphael Robert <> Sat, 22 May 2021 10:14 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id 891C03A1F01 for <>; Sat, 22 May 2021 03:14:19 -0700 (PDT)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -2.098
X-Spam-Status: No, score=-2.098 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: (amavisd-new); dkim=pass (1024-bit key)
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id EklKKE9POVsP for <>; Sat, 22 May 2021 03:14:14 -0700 (PDT)
Received: from ( [IPv6:2a00:1450:4864:20::532]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by (Postfix) with ESMTPS id E8AEF3A1F00 for <>; Sat, 22 May 2021 03:14:13 -0700 (PDT)
Received: by with SMTP id j10so8576760edw.8 for <>; Sat, 22 May 2021 03:14:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=rr; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=sRRozJW+uQcyZPse8rlxxhqiSzzEoPmkTVsSHp354MM=; b=lI9zmt7fzoPSYBQJb+H4pI1A6kPcUgiQGevDME+C3WLDOD1B74ZvM5FDZEfRSYLTIK /cY4TcoIP4RfJrCrg3lkj0rSAce8NjbYU08OJUDoLyjnwIHOj6+EGC9RMyuXhYWZCELl RXsUyYuvmo2BwKiiNvfF45F/C9QhGEeAuJjIk=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=sRRozJW+uQcyZPse8rlxxhqiSzzEoPmkTVsSHp354MM=; b=Vn66xanMxzm9gNtvxvFn7o7sf9VQ9gCdMGtHZJ19B476bBEf9Fw/939jCZjNQfcryC eTPJhV5Vh/nC19LAp7lsi/+toYclDVKYotoWAOxMVxkzy3oq4cE0zQOH2kEAFxCfz62V 4b6SLVHgBI+X0nZgOUa9xuNLgYzOUqou8rhXNUBiWCGQhk2zMdWyer5VkYVCJyk2fqF3 dtjnnXIRznpE+Umne9w7Yic6pQrGu+h9+GIdns3Wwjs88pBa+EKK2r3c6wlVQD770OAK cco1xfYMWMeQ93ak3Qssa6LearXD7BU8HuF5csPf3xLCT/w2hr7Ja/5upoAj+AfcHjFl Pmkw==
X-Gm-Message-State: AOAM532SuGHNnSO0apoyXRrxk38baYExmUxn9gQ7yajgZnJ7fFTDVerM 6sh5eA9as1du8NqKgAI1mexM9VFeW2um2ZngcRI=
X-Google-Smtp-Source: ABdhPJydtvS2sZugXULEWA0NX3CEZ705pY9M4RnM0hQWtRrZbkVi+tArr2B3J61krIuesbav5JKPOQ==
X-Received: by 2002:a05:6402:190e:: with SMTP id e14mr15755596edz.146.1621678451910; Sat, 22 May 2021 03:14:11 -0700 (PDT)
Received: from ([]) by with ESMTPSA id e22sm6434265edu.35.2021. (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sat, 22 May 2021 03:14:11 -0700 (PDT)
From: Raphael Robert <>
Message-Id: <>
Content-Type: multipart/alternative; boundary="Apple-Mail=_FFA2C19F-4129-4B33-ABDD-345BF4ABFC04"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.\))
Date: Sat, 22 May 2021 12:14:10 +0200
In-Reply-To: <>
Cc: MLS List <>
To: Karthik Bhargavan <>
References: <> <> <> <> <>
X-Mailer: Apple Mail (2.3654.
Archived-At: <>
Subject: Re: [MLS] Optimizing Unmerged Leaves (or: bringing the MLS tree size back to O(n))
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Messaging Layer Security <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Sat, 22 May 2021 10:14:20 -0000

Hi Karthik,

Thanks for sharing your findings!

Just for context (for those who were not at the Paris interim meeting in 2018), the fact that “unmerged leaves” exist is because it gives us a mechanism to do Adds in O(1) instead of O(log n). This benefit does however come at the cost having to do this “book keeping” (storage cost) and more KEM operations during subsequent updates (computation cost). It also delays PCS for the adder.

I think the question is worth investigating and I am commenting inline:

> On 21. May 2021, at 16:59, Karthik Bhargavan < <>> wrote:
> Hello All,
> Théophile, Benjamin, and I have been writing a formal specification of MLS in F* and we’re happy to report a compleger interoperable spec that (with Franziskus’ help) shares test vectors with OpenMLS.
> In the course of writing this specification, we observed several potential improvements in the way MLS does things, and this email is about the first of these. Subsequent emails will have other questions/comments.
> The way we store unmerged leaves in the ratchet tree is sub-optimal since it stores a lot of redundant information.
> In the worst case, a tree can have n/2 unmerged leaves  (read the end of this email to see how we obtained this number).

I agree with the extreme case, but I think it is very unlikely to happen frequently “in the wild”. Trees that are quite sparse are problematic in general, doing frequent updates is still the recommended way to go both for PCS and efficiency.

> Consequently, the total size of the ratchet tree can be as big as 2*n*log(n) + 4*n bytes (for n participants), and is never lower than 4*n bytes.
> This is unfortunate, since the unmerged nodes optimization has now bloated our tree to O(n logn) from O(n).
> Consequently, the welcome message/initial tree and local storage is now O(n log n).

I’d like to add the these considerations in big O notation that an index is 4 bytes long, while the public keys in the nodes are at least 32 bytes long. So there is an order of magnitude of difference at least in lower intermediary nodes of the tree.

> For more precision, in the rest of this email, we won't say that "a leaf  l is unmerged", but "a leaf l is unmerged at a node n", meaning that `n` is an ancestor of the leaf `l` and the member at the leaf `l` doesn't know the node secret at `n`.
> Then, a leaf `l` is "unmerged” (in the sense of the MLS spec) iff. there exists a node `n` such that this leaf `l` is unmerged for `n`.
> In the current MLS spec, we store the index of a leaf `l` at each ancestor node `n` such that `l` is unmerged for `n`.
> To see why this results in quite a bit of storage redundancy, observe the following property:  If a leaf `l` is unmerged for the parent of a node `n`, then `l` is also unmerged for `n`.
> Given this observation, we can optimize the unmerged leaves design to bring the tree size back to O(n).
> Solution 1
> —————
> One possibility, close to the current MLS spec would be to not store the unmerged leaf at a node if it is already stored at the parent.
> This way, every leaf is present at most once in some `unmerged_leaves` array.
> In the worst case, this uses 4*n + 4*(n/2) = 6*n bytes (4*n for the `unmerged_leaves` array size, 4*(n/2) because at most n/2 leaves are stored in the `unmerged_leaves` arrays).
> In the best case, this uses 4*n bytes.
> The tree resolution algorithm will now have to do a little more calculation to re-compute the unmerged leaves at each node.

This sounds rather straight forward on a logical level. It clearly reduces the stored/transmitted payload size, but it introduces a new cost: previously only O(log n) nodes needed to be modified during an Update/Remove. Now we might have to “push down” the book keeping information from parents to children, in the worst case this affects n/2 nodes, so the cost of modifying nodes goes up to O(n).
If we consider a very large group, there would be a measurable difference when clients load/store nodes from/to disk.
My current intuition is that currently I/O operations can be done in O(log n) with clever parent hash caching. The worst case here would be the following: A client is a member of a large group and has been offline for a while (which is not a far fetched scenario). The client now comes online and has to process a large number of handshake messages. For robustness, the client might be inclined to frequently persist the intermediary state. This is where the I/O cost kicks in.

> Solution 2
> —————
> An alternative, which we prefer, is to get rid of the unmerged leaves array entirely and replace them by epoch numbers:
>  - for each leaf, store the epoch of its creation in the tree (when the leaf was added),
>  - for each parent node, store the epoch at which it was last updated.
> Then, a leaf `l` is unmerged for node `n` iff. n.last_update_epoch < l.creation_epoch.
> Finding the list of unmerged leaves at a node then becomes part of tree resolution, and a member can pre-compute and store this array if it likes.
> In this design, we will store a 64-bit epoch number at each leaf and each parent node, thus it uses 8*2*n = 16*n bytes.
> We believe that having an epoch number documenting the last update time of each node is in itself a useful feature for debugging and for security analysis.
> In our model, it makes stating secrecy and authentication invariants easier.

I like this approach for being more explicit and thus maybe increasing the security/level of agreement. It also helps members determining when they should evict a stale member for being inactive for too long without any extra book keeping.

Some counter arguments:
 - Since public trees might actually be really public in some scenarios, this might give attackers better indications about which client is most interesting to compromise.
 - I think the payload is only smaller for groups with more than 2 ^ 5 = 32 members. For smaller groups it actually makes the stored payload bigger (when we consider exactly one unmerged leaf, more unmerged leaves shift the break even point down).

> Solution 2+
> —————
> However, if we wish to compress further, we can do so as well, using two facts:
> - if a node `n` has two children `l` and `r`, then n.last_update_epoch is equal to either l.last_update_epoch or r.last_update_epoch,
> - if `l` is unmerged for some node then l.creation_epoch = l.last_update_epoch (because in this case, the leaf never generated an UpdatePath) (the converse is not true, for example in the case when its sibling generates an UpdatePath).
> So at each node, we can store the direction from which the node was last updated (for example with a boolean flag), then given any node `n` we can "follow the path" down the tree to obtain a leaf `l`, and we have n.last_update_epoch = l.last_update_epoch.
> Furthermore, we don't need to store the creation epoch for leaves:
> - if `l` is unmerged for `n`, then n.last_update_epoch < l.creation_epoch = l.last_update_epoch,
> - if `l` is not unmerged for `n`, then n.last_update_epoch >= l.last_update_epoch (because when `l` is updated, it updates all the nodes above it).
> This uses 8*n + 1*n = 9*n bytes for the whole tree.

I didn’t check all the ramifications here tbh. I think it is similar to solution 1 in terms of increased I/O cost.

> Comments are welcome, and do please correct us if we misunderstood something.
> In our opinion, it would be great to simplify the tree data structure and compress its size back to O(n) at the same time.

I think all 3 solutions are generally valid. But it also becomes clear that saving in one place introduces cost (or security/privacy degradation) at another. That’s always been the case with all changes we made in MLS though :)

From an implementor’s perspective I wouldn’t say that the current suboptimal payload poses a big problem. But I’d like to hear from other on that as well.



> Best regards,
> Théophile, Benjamin, and Karthik
> Appendix : How to create a tree with n/2 unmerged leaves?
> First, add n participants to the tree.
> Then, remove the participants with odd leaf index.
> Then, generate an update path from every participant with even leaf index (this ensures that there are no blank nodes)
> Then, add new participants in leaves with odd leaf index, without regenerating the group secret.
> Every leaf with odd leaf index will be present in the `unmerged_leaves` array on all the nodes above it, so at each level of the tree there are n/2 unmerged leaves stored, so the whole tree will store 4*(n/2)*log(n) = 2*n*log(n) bytes of information.
> We can’t have a tree with more unmerged leaves, because a node at level 1 can't have both its children is its `unmerged_leaves` array, otherwise this node would be blank.
>> On 21 May 2021, at 16:05, Sean Turner < <>> wrote:
>> Just another FYI!
>>> On May 12, 2021, at 08:26, Sean Turner < <>> wrote:
>>> Just an FYI that we have a meeting scheduled for next week.
>>> spt
>>>> On May 7, 2021, at 13:33, Sean Turner < <>> wrote:
>>>> Meeting Scheduled!
>>>> More information can also be found here:
>>>> <>
>>>> spt
>>>>> On May 7, 2021, at 12:39, IESG Secretary < <>> wrote:
>>>>> The Messaging Layer Security (mls) WG will hold
>>>>> a virtual interim meeting on 2021-05-26 from 10:00 to 12:00 America/New_York (14:00 to 16:00 UTC).
>>>>> Agenda:
>>>>> GitHub issues and pull requests related to -protocol and -architecture.
>>>>> Information about remote participation:
>>>>> <>
>>>>> _______________________________________________
>>>>> IETF-Announce mailing list
>> _______________________________________________
>> MLS mailing list
>> <>
>> <>
> _______________________________________________
> MLS mailing list
> <>